From patchwork Fri Apr 28 11:47:10 2017
X-Patchwork-Submitter: Mark Rutland
X-Patchwork-Id: 98331
From: Mark Rutland
To: linux-kernel@vger.kernel.org
Cc: acme@kernel.org, alexander.shishkin@linux.intel.com, ganapatrao.kulkarni@cavium.com, mark.rutland@arm.com
Subject: [PATCH 2/2] perf stat: balance opening/closing of events
Date: Fri, 28 Apr 2017 12:47:10 +0100
Message-Id: <1493380030-4683-3-git-send-email-mark.rutland@arm.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1493380030-4683-1-git-send-email-mark.rutland@arm.com>
References: <1493380030-4683-1-git-send-email-mark.rutland@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

While perf-stat has its own create_perf_stat_counter() helper to open
events, dependent on the target configuration, it uses the generic
perf_evlist__close() to close events.

The common perf_evlist__{open,close}() helpers don't consider the
target configuration, and always use evsel->cpus, even where
create_perf_stat_counter() would have used an empty_cpu_map.

On some systems, the CPU PMU describes its CPUs under sysfs, and
evsel->cpus may be set even when we open per-thread events. For
per-thread events, we don't use evsel->cpus in
create_perf_stat_counter(), though perf_evlist__close() will. Thus we
try to close more events than we allocated and opened, resulting in
segfaults when we go out-of-bounds:

$ perf stat -e armv8_pmuv3/cpu_cycles/ true
*** Error in `./old-perf': free(): invalid next size (fast): 0x00000000263a55c0 ***
Aborted

This problem was introduced by commit:

  3df33eff2ba96be4 ("perf stat: Avoid skew when reading events")

... prior to that commit, we closed events as we read them, using the
correct number of CPUs. This patch fixes the problem by adding a new
close_counters() routine that mirrors create_perf_stat_counter(),
ensuring that we always correctly balance open/close.
Fixes: 3df33eff2ba96be4 ("perf stat: Avoid skew when reading events")
Signed-off-by: Mark Rutland
Tested-by: Ganapatrao Kulkarni
Cc: Arnaldo Carvalho de Melo
Cc: Alexander Shishkin
Cc: linux-kernel@vger.kernel.org
---
 tools/perf/builtin-stat.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

-- 
1.9.1

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index a935b50..ceedc0a 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -364,6 +364,28 @@ static void read_counters(void)
 	}
 }
 
+/*
+ * Close all event FDs we open in __run_perf_stat() and
+ * create_perf_stat_counter(), taking care to match the number of threads
+ * and CPUs.
+ *
+ * Note that perf_evlist__close(evsel_list) is not equivalent, as it doesn't
+ * take the target into account.
+ */
+static void close_counters(void)
+{
+	bool per_cpu = target__has_cpu(&target);
+	struct perf_evsel *evsel;
+
+	evlist__for_each_entry(evsel_list, evsel) {
+		if (per_cpu)
+			perf_evsel__close_per_cpu(evsel,
+						  perf_evsel__cpus(evsel));
+		else
+			perf_evsel__close_per_thread(evsel,
+						     evsel_list->threads);
+	}
+}
+
 static void process_interval(void)
 {
 	struct timespec ts, rs;
@@ -704,7 +726,7 @@ static int __run_perf_stat(int argc, const char **argv)
 	 * group leaders.
 	 */
 	read_counters();
-	perf_evlist__close(evsel_list);
+	close_counters();
 
 	return WEXITSTATUS(status);
 }