[RFCv2,4/4] perf: util: support sysfs supported_cpumask file

Message ID 1468577293-19667-5-git-send-email-mark.rutland@arm.com
State Superseded

Commit Message

Mark Rutland July 15, 2016, 10:08 a.m. UTC
For system PMUs, the perf tools have long expected a cpumask file under
sysfs, describing the single CPU which they support events being
opened/handled on. Prior patches in this series have reworked this
support to support multiple CPUs in a mask, as is required to handle
heterogeneous CPU PMUs.

Unfortunately, adding a cpumask file to CPU PMUs would break existing
userspace. Prior to this series, perf record will refuse to open events,
and perf stat may unexpectedly block at exit time. In the absence of a
cpumask, perf stat is functional.

To address this, this patch adds support for a new file,
supported_cpumask, which can be used to describe heterogeneous CPUs,
without the risk of breaking existing userspace binaries.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>

---
 tools/perf/util/pmu.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

-- 
1.9.1

Comments

Mark Rutland July 18, 2016, 3 p.m. UTC | #1
On Mon, Jul 18, 2016 at 04:30:18PM +0200, Jiri Olsa wrote:
> On Fri, Jul 15, 2016 at 11:08:13AM +0100, Mark Rutland wrote:
> > For system PMUs, the perf tools have long expected a cpumask file under
> > sysfs, describing the single CPU which they support events being
>
> single cpu? it's cpumask..

Indeed.

The issue is that in practice, due to an internal inconsistency the
perf tools only work when a single CPU is described in the mask.
More details below (and in patch 1).

> > opened/handled on. Prior patches in this series have reworked this
> > support to support multiple CPUs in a mask, as is required to handle
> > heterogeneous CPU PMUs.
> >
> > Unfortunately, adding a cpumask file to CPU PMUs would break existing
> > userspace. Prior to this series, perf record will refuse to open events,
>
> I'm lost.. we already have 'cpumask' file under pmu..

Sorry, I should spell out the problem more concretely:

When manipulating events, the tools sometimes use evsel->cpus, and other
times evlist->cpus. Sometimes, the two are used inconsistently, which
only works if they are the same size and/or describe the same CPUs.
Patch 1 fixes an instance of this, where the inconsistency results in
treating uninitialised memory as perf event FDs.

In the absence of a PMU cpumask file, the evsel's cpumask is initialised
to that of the evlist, so things line up.

Currently the only PMUs which happen to expose a cpumask are uncore
PMUs, which in practice only describe a single CPU.

When recording system-wide, various parts of the perf tools assume a
single CPU, regardless of evlist->cpus, for the purpose of manipulating
events. This happens to make uncore PMUs work, avoiding the
inconsistency.

Were we to just add a 'cpumask' file to our CPU PMUs, we would break
existing userspace (e.g. hitting the issue fixed in patch 1).

The difference in naming allows new userspace to do the right thing
while not breaking existing userspace, though I agree it's somewhat
clunky.
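
The general hazard described above can be sketched as a toy model. This is
an illustration only, not the code path fixed in patch 1: the struct and
function names below are hypothetical stand-ins, and perf's real evsel and
evlist structures differ. The sketch assumes the per-event FD table is sized
by the event's own CPU map, so a loop bounded by a larger map would read
past the end of the table.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; perf's real structures differ. */
struct cpu_map_model {
	int nr;		/* number of CPUs in the map */
};

struct evsel_model {
	const struct cpu_map_model *cpus;	/* CPUs the event is opened on */
	int *fd;				/* one slot per CPU in 'cpus' */
};

/* Allocate one FD slot per CPU in the event's own map, all marked unopened. */
static int evsel_model_init(struct evsel_model *evsel,
			    const struct cpu_map_model *cpus)
{
	int i;

	assert(evsel && cpus && cpus->nr > 0);

	evsel->cpus = cpus;
	evsel->fd = malloc((size_t)cpus->nr * sizeof(*evsel->fd));
	if (!evsel->fd)
		return -1;

	for (i = 0; i < cpus->nr; i++)
		evsel->fd[i] = -1;

	return 0;
}

/*
 * How many slots a loop bounded by 'walk' would touch beyond the end of
 * the table. Non-zero means the loop would treat unrelated memory as FDs.
 */
static int evsel_model_overrun(const struct evsel_model *evsel,
			       const struct cpu_map_model *walk)
{
	return walk->nr > evsel->cpus->nr ? walk->nr - evsel->cpus->nr : 0;
}
```

With a two-CPU PMU mask and a four-CPU evlist map, walking the table with
the evlist's count would overrun by two slots, while walking it with the
event's own map stays in bounds. When both maps are the same object, as in
the no-cpumask case, the overrun is zero.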

> > and perf stat may unexpectedly block at exit time. In the absence of a
> > cpumask, perf stat is functional.
> >
> > To address this, this patch adds support for a new file,
> > supported_cpumask, which can be used to describe heterogeneous CPUs,
> > without the risk of breaking existing userspace binaries.
>
> is there kernel patch adding supported_cpumask support?

Modulo the naming, a patch exists [1]. We were holding off adding that
until we'd figured out how to address breaking existing userspace [2].

Thanks,
Mark.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-June/438239.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-July/441953.html

Mark Rutland July 18, 2016, 5:13 p.m. UTC | #2
On Mon, Jul 18, 2016 at 05:38:16PM +0100, Suzuki K Poulose wrote:
> On 15/07/16 11:08, Mark Rutland wrote:
> >For system PMUs, the perf tools have long expected a cpumask file under
> >sysfs, describing the single CPU which they support events being
> >opened/handled on. Prior patches in this series have reworked this
> >support to support multiple CPUs in a mask, as is required to handle
> >heterogeneous CPU PMUs.
> >
> >Unfortunately, adding a cpumask file to CPU PMUs would break existing
> >userspace. Prior to this series, perf record will refuse to open events,
> >and perf stat may unexpectedly block at exit time. In the absence of a
> >cpumask, perf stat is functional.
> >
> >To address this, this patch adds support for a new file,
> >supported_cpumask, which can be used to describe heterogeneous CPUs,
> >without the risk of breaking existing userspace binaries.
> >
> >Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> >---
> > tools/perf/util/pmu.c | 15 ++++++++++++---
> > 1 file changed, 12 insertions(+), 3 deletions(-)
> >
> >diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> >index ddb0261..06c985c 100644
> >--- a/tools/perf/util/pmu.c
> >+++ b/tools/perf/util/pmu.c
> >@@ -445,14 +445,23 @@ static struct cpu_map *pmu_cpumask(const char *name)
> > 	FILE *file;
> > 	struct cpu_map *cpus;
> > 	const char *sysfs = sysfs__mountpoint();
> >+	const char *path_template[] = {
> >+		 "%s/bus/event_source/devices/%s/cpumask",
> >+		 "%s/bus/event_source/devices/%s/supported_cpumask",
> >+		 NULL
> >+	};
> >+	unsigned int i;
> >
> > 	if (!sysfs)
> > 		return NULL;
> >
> >-	snprintf(path, PATH_MAX,
> >-		 "%s/bus/event_source/devices/%s/cpumask", sysfs, name);
> >+	for (i = 0; i < ARRAY_SIZE(path_template); i++) {
>
> The check could be "path_template[i]" to avoid an iteration with NULL
> template.

True. I'd reworked this loop a few times to try to avoid duplicated
checks, but evidently I'd messed that up. I'll clean this up.

>
> >+		snprintf(path, PATH_MAX, *path_template, sysfs, name);
>
> Btw, did you mean to use path_template[i] here instead of *path_template ?
>
> >+		if (stat(path, &st) == 0)
> >+			break;
> >+	}
> >
> >-	if (stat(path, &st) < 0)
> >+	if (!*path_template)
>
> Same here ?

Yes. On both counts I meant to use the current iteration's entry.

Thanks for spotting that. I'll fix that up.

Thanks,
Mark.

Mark Rutland July 21, 2016, 9:49 a.m. UTC | #3
On Thu, Jul 21, 2016 at 10:10:35AM +0200, Jiri Olsa wrote:
> On Mon, Jul 18, 2016 at 04:00:45PM +0100, Mark Rutland wrote:
> > On Mon, Jul 18, 2016 at 04:30:18PM +0200, Jiri Olsa wrote:
> > > On Fri, Jul 15, 2016 at 11:08:13AM +0100, Mark Rutland wrote:
> > > > For system PMUs, the perf tools have long expected a cpumask file under
> > > > sysfs, describing the single CPU which they support events being
> > >
> > > single cpu? it's cpumask..
> >
> > Indeed.
> >
> > The issue is that in practice, due to an internal inconsistency the
> > perf tools only work when a single CPU is described in the mask.
> > More details below (and in patch 1).
> >
> > > > opened/handled on. Prior patches in this series have reworked this
> > > > support to support multiple CPUs in a mask, as is required to handle
> > > > heterogeneous CPU PMUs.
> > > >
> > > > Unfortunately, adding a cpumask file to CPU PMUs would break existing
> > > > userspace. Prior to this series, perf record will refuse to open events,
> > >
> > > I'm lost.. we already have 'cpumask' file under pmu..
> >
> > Sorry, I should spell out the problem more concretely:
> >
> > When manipulating events, the tools sometimes use evsel->cpus, and other
> > times evlist->cpus. Sometimes, the two are used inconsistently, which
> > only works if they are the same size and/or describe the same CPUs.
> > Patch 1 fixes an instance of this, where the inconsistency results in
> > treating uninitialised memory as perf event FDs.
> >
> > In the absence of a PMU cpumask file, the evsel's cpumask is initialised
> > to that of the evlist, so things line up.
> >
> > Currently the only PMUs which happen to expose a cpumask are uncore
> > PMUs, which in practice only describe a single CPU.
> >
> > When recording system-wide, various parts of the perf tools assume a
> > single CPU, regardless of evlist->cpus, for the purpose of manipulating
> > events. This happens to make uncore PMUs work, avoiding the
> > inconsistency.
> >
> > Were we to just add a 'cpumask' file to our CPU PMUs, we would break
> > existing userspace (e.g. hitting the issue fixed in patch 1).
>
> so you're saying that perf is broken once pmu's cpumask
> contains more than single cpu, is that right?

Yes.

> we should fix that, not make workarounds.. I'll go check,
> I might be still missing something ;-)

I certainly agree that this should be fixed in the perf tool; hence
patches 1-3. ;)

The problem the workaround is trying to solve is kernel compatibility
with existing binaries, for which (prior to this series):

- perf record doesn't work by default in heterogeneous systems in the
  *absence* of a cpumask.

- perf stat doesn't work by default in heterogeneous systems in the
  *presence* of a cpumask.

The kernel doesn't *currently* expose a cpumask for the ARM CPU PMUs, so
we'd need to add one. While new userspace should work as of these
patches, I can't add a file called 'cpumask' kernel-side without
breaking existing perf binaries (in the case of perf stat).

If it's possible to solve this without exposing a cpumask file at all,
that would be ideal, but so far I haven't been able to make that work.
Any ideas welcome!

> would be great to have some automated test for this stuff

Good point. I will take a look into that.

Thanks,
Mark.

[1] http://lkml.kernel.org/r/1468577293-19667-1-git-send-email-mark.rutland@arm.com

Patch

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ddb0261..06c985c 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -445,14 +445,23 @@  static struct cpu_map *pmu_cpumask(const char *name)
 	FILE *file;
 	struct cpu_map *cpus;
 	const char *sysfs = sysfs__mountpoint();
+	const char *path_template[] = {
+		 "%s/bus/event_source/devices/%s/cpumask",
+		 "%s/bus/event_source/devices/%s/supported_cpumask",
+		 NULL
+	};
+	unsigned int i;
 
 	if (!sysfs)
 		return NULL;
 
-	snprintf(path, PATH_MAX,
-		 "%s/bus/event_source/devices/%s/cpumask", sysfs, name);
+	for (i = 0; i < ARRAY_SIZE(path_template); i++) {
+		snprintf(path, PATH_MAX, *path_template, sysfs, name);
+		if (stat(path, &st) == 0)
+			break;
+	}
 
-	if (stat(path, &st) < 0)
+	if (!*path_template)
 		return NULL;
 
 	file = fopen(path, "r");
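
For reference, the fix-ups agreed in the review above (index the template
array with the loop counter, and do not iterate over a NULL sentinel) could
look roughly like the following standalone sketch. It is not the follow-up
patch: find_cpumask_path() and cpumask_templates are hypothetical names, and
the sysfs root is passed in as a parameter instead of coming from
sysfs__mountpoint() so that the snippet builds outside the perf tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Candidate sysfs files, probed in order. No NULL sentinel is needed
 * because the loop below is bounded by the array length.
 */
static const char *const cpumask_templates[] = {
	"%s/bus/event_source/devices/%s/cpumask",
	"%s/bus/event_source/devices/%s/supported_cpumask",
};

#define N_TEMPLATES (sizeof(cpumask_templates) / sizeof(cpumask_templates[0]))

/*
 * Hypothetical helper (not in the patch): write the first existing
 * candidate path into 'path' and return 0, or return -1 if none exists.
 */
static int find_cpumask_path(const char *sysfs, const char *name,
			     char *path, size_t len)
{
	struct stat st;
	size_t i;

	assert(sysfs && name && path);

	for (i = 0; i < N_TEMPLATES; i++) {
		/* Use this iteration's template, not always the first one. */
		int n = snprintf(path, len, cpumask_templates[i], sysfs, name);

		if (n < 0 || (size_t)n >= len)
			continue;	/* formatting failed or path truncated */
		if (stat(path, &st) == 0)
			return 0;
	}

	return -1;
}
```

Returning a status from the helper also removes the need for the
post-loop "if (!*path_template)" test questioned in the review: the caller
only has to check the return value before calling fopen() on the path.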