Message ID | e3c4f253-e1ed-32f6-c252-e8657968fc42@huawei.com |
---|---|
State | New |
Headers | show |
Series | Issue of metrics for multiple uncore PMUs (was Re: [RFC PATCH v2 23/23] perf metricgroup: remove duped metric group events) | expand |
On Fri, Oct 2, 2020 at 5:00 AM John Garry <john.garry@huawei.com> wrote: > > On 07/05/2020 15:08, Ian Rogers wrote: > > Hi Ian, > > I was wondering if you ever tested commit 2440689d62e9 ("perf > metricgroup: Remove duped metric group events") for when we have a > metric which aliases multiple instances of the same uncore PMU in the > system? Sorry for this, I hadn't tested such a metric and wasn't aware of how the aliasing worked. I sent a fix for this issue here: https://lore.kernel.org/lkml/20200917201807.4090224-1-irogers@google.com/ Could you see if this addresses the issue for you? I don't see the change in Arnaldo's trees yet. Thanks, Ian > I have been rebasing some of my arm64 perf work to v5.9-rc7, and find an > issue where find_evsel_group() fails for the uncore metrics under the > condition mentioned above. > > Unfortunately I don't have an x86 machine to which this test applies. > However, as an experiment, I added a test metric to my broadwell JSON: > > diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > index 8cdc7c13dc2a..fc6d9adf996a 100644 > --- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > +++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > @@ -348,5 +348,11 @@ > "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100", > "MetricGroup": "Power", > "MetricName": "C7_Pkg_Residency" > + }, > + { > + "BriefDescription": "test metric", > + "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE * > UNC_CBO_XSNP_RESPONSE.MISS_EVICTION", > + "MetricGroup": "Test", > + "MetricName": "test_metric_inc" > } > ] > > > And get this: > > john@localhost:~/linux/tools/perf> sudo ./perf stat -v -M > test_metric_inc sleep 1 > Using CPUID GenuineIntel-6-3D-4 > metric expr unc_cbo_xsnp_response.miss_xcore * > unc_cbo_xsnp_response.miss_eviction for test_metric_inc > found event unc_cbo_xsnp_response.miss_eviction > found event unc_cbo_xsnp_response.miss_xcore > adding > {unc_cbo_xsnp_response.miss_eviction,unc_cbo_xsnp_response.miss_xcore}:W > unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/ > unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/ > unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/ > unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/ > Cannot resolve test_metric_inc: unc_cbo_xsnp_response.miss_xcore * > unc_cbo_xsnp_response.miss_eviction > task-clock: 688876 688876 688876 > context-switches: 2 688876 688876 > cpu-migrations: 0 688876 688876 > page-faults: 69 688876 688876 > cycles: 2101719 695690 695690 > instructions: 1180534 695690 695690 > branches: 249450 695690 695690 > branch-misses: 10815 695690 695690 > > Performance counter stats for 'sleep 1': > > 0.69 msec task-clock # 0.001 CPUs > utilized > 2 context-switches # 0.003 M/sec > > 0 cpu-migrations # 0.000 K/sec > > 69 page-faults # 0.100 M/sec > > 2,101,719 cycles # 3.051 GHz > > 1,180,534 instructions # 0.56 insn per > cycle > 249,450 branches # 362.112 M/sec > > 10,815 branch-misses # 4.34% of all > branches > > 1.001177693 seconds time elapsed > > 0.001149000 seconds user > 0.000000000 seconds sys > > > john@localhost:~/linux/tools/perf> > > > Any idea what is going wrong here, before I have to dive in? The issue > seems to be this named commit. > > Thanks, > John > > > A metric group contains multiple metrics. These metrics may use the same > > events. If metrics use separate events then it leads to more > > multiplexing and overall metric counts fail to sum to 100%. > > Modify how metrics are associated with events so that if the events in > > an earlier group satisfy the current metric, the same events are used. > > A record of used events is kept and at the end of processing unnecessary > > events are eliminated. > > > > Before:
On 02/10/2020 21:46, Ian Rogers wrote: > On Fri, Oct 2, 2020 at 5:00 AM John Garry <john.garry@huawei.com> wrote: >> >> On 07/05/2020 15:08, Ian Rogers wrote: >> >> Hi Ian, >> >> I was wondering if you ever tested commit 2440689d62e9 ("perf >> metricgroup: Remove duped metric group events") for when we have a >> metric which aliases multiple instances of the same uncore PMU in the >> system? > > Sorry for this, I hadn't tested such a metric and wasn't aware of how > the aliasing worked. I sent a fix for this issue here: > https://lore.kernel.org/lkml/20200917201807.4090224-1-irogers@google.com/ > Could you see if this addresses the issue for you? I don't see the > change in Arnaldo's trees yet. Unfortunately this does not seem to fix my issue. So for that patch, you say you fix metric expression for DRAM_BW_Use, which is: { "BriefDescription": "Average external Memory Bandwidth Use for reads and writes [GB / sec]", "MetricExpr": "( 64 * ( uncore_imc@cas_count_read@ + uncore_imc@cas_count_write@ ) / 1000000000 ) / duration_time", "MetricGroup": "Memory_BW", "MetricName": "DRAM_BW_Use" }, But this metric expression does not include any alias events; rather I think it is just cas_count_write + cas_count_read event count for PMU uncore_imc / duration_time. When I say alias, I mean - as an example, we have event: { "BriefDescription": "write requests to memory controller. Derived from unc_m_cas_count.wr", "Counter": "0,1,2,3", "EventCode": "0x4", "EventName": "LLC_MISSES.MEM_WRITE", "PerPkg": "1", "ScaleUnit": "64Bytes", "UMask": "0xC", "Unit": "iMC" }, And then reference LLC_MISSES.MEM_WRITE in a metric expression: "MetricExpr": "LLC_MISSES.MEM_WRITE / duration_time", This is what seems to be broken for when the alias matches > 1 PMU. Please check this. Thanks, John > > Thanks, > Ian > >> I have been rebasing some of my arm64 perf work to v5.9-rc7, and find an >> issue where find_evsel_group() fails for the uncore metrics under the >> condition mentioned above. >> >> Unfortunately I don't have an x86 machine to which this test applies. >> However, as an experiment, I added a test metric to my broadwell JSON: >> >> diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json >> b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json >> index 8cdc7c13dc2a..fc6d9adf996a 100644 >> --- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json >> +++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json >> @@ -348,5 +348,11 @@ >> "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100", >> "MetricGroup": "Power", >> "MetricName": "C7_Pkg_Residency" >> + }, >> + { >> + "BriefDescription": "test metric", >> + "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE * >> UNC_CBO_XSNP_RESPONSE.MISS_EVICTION", >> + "MetricGroup": "Test", >> + "MetricName": "test_metric_inc" >> } >> ] >> >> >> And get this: >> >> john@localhost:~/linux/tools/perf> sudo ./perf stat -v -M >> test_metric_inc sleep 1 >> Using CPUID GenuineIntel-6-3D-4 >> metric expr unc_cbo_xsnp_response.miss_xcore * >> unc_cbo_xsnp_response.miss_eviction for test_metric_inc >> found event unc_cbo_xsnp_response.miss_eviction >> found event unc_cbo_xsnp_response.miss_xcore >> adding >> {unc_cbo_xsnp_response.miss_eviction,unc_cbo_xsnp_response.miss_xcore}:W >> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/ >> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/ >> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/ >> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/ >> Cannot resolve test_metric_inc: unc_cbo_xsnp_response.miss_xcore * >> unc_cbo_xsnp_response.miss_eviction >> task-clock: 688876 688876 688876 >> context-switches: 2 688876 688876 >> cpu-migrations: 0 688876 688876 >> page-faults: 69 688876 688876 >> cycles: 2101719 695690 695690 >> instructions: 1180534 695690 695690 >> branches: 249450 695690 695690 >> branch-misses: 10815 695690 695690 >> >> Performance counter stats for 'sleep 1': >> >> 0.69 msec task-clock # 0.001 CPUs >> utilized >> 2 context-switches # 0.003 M/sec >> >> 0 cpu-migrations # 0.000 K/sec >> >> 69 page-faults # 0.100 M/sec >> >> 2,101,719 cycles # 3.051 GHz >> >> 1,180,534 instructions # 0.56 insn per >> cycle >> 249,450 branches # 362.112 M/sec >> >> 10,815 branch-misses # 4.34% of all >> branches >> >> 1.001177693 seconds time elapsed >> >> 0.001149000 seconds user >> 0.000000000 seconds sys >> >> >> john@localhost:~/linux/tools/perf> >> >> >> Any idea what is going wrong here, before I have to dive in? The issue >> seems to be this named commit. >> >> Thanks, >> John >> >>> A metric group contains multiple metrics. These metrics may use the same >>> events. If metrics use separate events then it leads to more >>> multiplexing and overall metric counts fail to sum to 100%. >>> Modify how metrics are associated with events so that if the events in >>> an earlier group satisfy the current metric, the same events are used. >>> A record of used events is kept and at the end of processing unnecessary >>> events are eliminated. >>> >>> Before: > . >
On Mon, Oct 5, 2020 at 3:06 AM John Garry <john.garry@huawei.com> wrote: > > On 02/10/2020 21:46, Ian Rogers wrote: > > On Fri, Oct 2, 2020 at 5:00 AM John Garry <john.garry@huawei.com> wrote: > >> > >> On 07/05/2020 15:08, Ian Rogers wrote: > >> > >> Hi Ian, > >> > >> I was wondering if you ever tested commit 2440689d62e9 ("perf > >> metricgroup: Remove duped metric group events") for when we have a > >> metric which aliases multiple instances of the same uncore PMU in the > >> system? > > > > Sorry for this, I hadn't tested such a metric and wasn't aware of how > > the aliasing worked. I sent a fix for this issue here: > > https://lore.kernel.org/lkml/20200917201807.4090224-1-irogers@google.com/ > > Could you see if this addresses the issue for you? I don't see the > > change in Arnaldo's trees yet. > > Unfortunately this does not seem to fix my issue. > > So for that patch, you say you fix metric expression for DRAM_BW_Use, > which is: > > { > "BriefDescription": "Average external Memory Bandwidth Use for reads > and writes [GB / sec]", > "MetricExpr": "( 64 * ( uncore_imc@cas_count_read@ + > uncore_imc@cas_count_write@ ) / 1000000000 ) / duration_time", > "MetricGroup": "Memory_BW", > "MetricName": "DRAM_BW_Use" > }, > > But this metric expression does not include any alias events; rather I > think it is just cas_count_write + cas_count_read event count for PMU > uncore_imc / duration_time. > > When I say alias, I mean - as an example, we have event: > > { > "BriefDescription": "write requests to memory controller. > Derived from unc_m_cas_count.wr", > "Counter": "0,1,2,3", > "EventCode": "0x4", > "EventName": "LLC_MISSES.MEM_WRITE", > "PerPkg": "1", > "ScaleUnit": "64Bytes", > "UMask": "0xC", > "Unit": "iMC" > }, > > And then reference LLC_MISSES.MEM_WRITE in a metric expression: > > "MetricExpr": "LLC_MISSES.MEM_WRITE / duration_time", > > This is what seems to be broken for when the alias matches > 1 PMU. > > Please check this. Happy to check. Can you provide a reproduction? Looking on broadwell this metric doesn't exist. Thanks, Ian > Thanks, > John > > > > > Thanks, > > Ian > > > >> I have been rebasing some of my arm64 perf work to v5.9-rc7, and find an > >> issue where find_evsel_group() fails for the uncore metrics under the > >> condition mentioned above. > >> > >> Unfortunately I don't have an x86 machine to which this test applies. > >> However, as an experiment, I added a test metric to my broadwell JSON: > >> > >> diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > >> b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > >> index 8cdc7c13dc2a..fc6d9adf996a 100644 > >> --- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > >> +++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > >> @@ -348,5 +348,11 @@ > >> "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100", > >> "MetricGroup": "Power", > >> "MetricName": "C7_Pkg_Residency" > >> + }, > >> + { > >> + "BriefDescription": "test metric", > >> + "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE * > >> UNC_CBO_XSNP_RESPONSE.MISS_EVICTION", > >> + "MetricGroup": "Test", > >> + "MetricName": "test_metric_inc" > >> } > >> ] > >> > >> > >> And get this: > >> > >> john@localhost:~/linux/tools/perf> sudo ./perf stat -v -M > >> test_metric_inc sleep 1 > >> Using CPUID GenuineIntel-6-3D-4 > >> metric expr unc_cbo_xsnp_response.miss_xcore * > >> unc_cbo_xsnp_response.miss_eviction for test_metric_inc > >> found event unc_cbo_xsnp_response.miss_eviction > >> found event unc_cbo_xsnp_response.miss_xcore > >> adding > >> {unc_cbo_xsnp_response.miss_eviction,unc_cbo_xsnp_response.miss_xcore}:W > >> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/ > >> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/ > >> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/ > >> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/ > >> Cannot resolve test_metric_inc: unc_cbo_xsnp_response.miss_xcore * > >> unc_cbo_xsnp_response.miss_eviction > >> task-clock: 688876 688876 688876 > >> context-switches: 2 688876 688876 > >> cpu-migrations: 0 688876 688876 > >> page-faults: 69 688876 688876 > >> cycles: 2101719 695690 695690 > >> instructions: 1180534 695690 695690 > >> branches: 249450 695690 695690 > >> branch-misses: 10815 695690 695690 > >> > >> Performance counter stats for 'sleep 1': > >> > >> 0.69 msec task-clock # 0.001 CPUs > >> utilized > >> 2 context-switches # 0.003 M/sec > >> > >> 0 cpu-migrations # 0.000 K/sec > >> > >> 69 page-faults # 0.100 M/sec > >> > >> 2,101,719 cycles # 3.051 GHz > >> > >> 1,180,534 instructions # 0.56 insn per > >> cycle > >> 249,450 branches # 362.112 M/sec > >> > >> 10,815 branch-misses # 4.34% of all > >> branches > >> > >> 1.001177693 seconds time elapsed > >> > >> 0.001149000 seconds user > >> 0.000000000 seconds sys > >> > >> > >> john@localhost:~/linux/tools/perf> > >> > >> > >> Any idea what is going wrong here, before I have to dive in? The issue > >> seems to be this named commit. > >> > >> Thanks, > >> John > >> > >>> A metric group contains multiple metrics. These metrics may use the same > >>> events. If metrics use separate events then it leads to more > >>> multiplexing and overall metric counts fail to sum to 100%. > >>> Modify how metrics are associated with events so that if the events in > >>> an earlier group satisfy the current metric, the same events are used. > >>> A record of used events is kept and at the end of processing unnecessary > >>> events are eliminated. > >>> > >>> Before: > > . > > >
On 05/10/2020 17:28, Ian Rogers wrote: > On Mon, Oct 5, 2020 at 3:06 AM John Garry <john.garry@huawei.com> wrote: >> >> On 02/10/2020 21:46, Ian Rogers wrote: >>> On Fri, Oct 2, 2020 at 5:00 AM John Garry <john.garry@huawei.com> wrote: >>>> >>>> On 07/05/2020 15:08, Ian Rogers wrote: >>>> >>>> Hi Ian, >>>> >>>> I was wondering if you ever tested commit 2440689d62e9 ("perf >>>> metricgroup: Remove duped metric group events") for when we have a >>>> metric which aliases multiple instances of the same uncore PMU in the >>>> system? >>> >>> Sorry for this, I hadn't tested such a metric and wasn't aware of how >>> the aliasing worked. I sent a fix for this issue here: >>> https://lore.kernel.org/lkml/20200917201807.4090224-1-irogers@google.com/ >>> Could you see if this addresses the issue for you? I don't see the >>> change in Arnaldo's trees yet. >> >> Unfortunately this does not seem to fix my issue. >> >> So for that patch, you say you fix metric expression for DRAM_BW_Use, >> which is: >> >> { >> "BriefDescription": "Average external Memory Bandwidth Use for reads >> and writes [GB / sec]", >> "MetricExpr": "( 64 * ( uncore_imc@cas_count_read@ + >> uncore_imc@cas_count_write@ ) / 1000000000 ) / duration_time", >> "MetricGroup": "Memory_BW", >> "MetricName": "DRAM_BW_Use" >> }, >> >> But this metric expression does not include any alias events; rather I >> think it is just cas_count_write + cas_count_read event count for PMU >> uncore_imc / duration_time. >> >> When I say alias, I mean - as an example, we have event: >> >> { >> "BriefDescription": "write requests to memory controller. >> Derived from unc_m_cas_count.wr", >> "Counter": "0,1,2,3", >> "EventCode": "0x4", >> "EventName": "LLC_MISSES.MEM_WRITE", >> "PerPkg": "1", >> "ScaleUnit": "64Bytes", >> "UMask": "0xC", >> "Unit": "iMC" >> }, >> >> And then reference LLC_MISSES.MEM_WRITE in a metric expression: >> >> "MetricExpr": "LLC_MISSES.MEM_WRITE / duration_time", >> >> This is what seems to be broken for when the alias matches > 1 PMU. >> >> Please check this. > Hi Ian, > Happy to check. So I am, but the code is a little complicated :) > Can you provide a reproduction? Looking on broadwell > this metric doesn't exist. Right, I just added this test metric as my 2x x86 platform has no examples which I can find: diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json index 8cdc7c13dc2a..fc6d9adf996a 100644 --- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json +++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json @@ -348,5 +348,11 @@ "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100", "MetricGroup": "Power", "MetricName": "C7_Pkg_Residency" + }, + { + "BriefDescription": "test metric", + "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE * UNC_CBO_XSNP_RESPONSE.MISS_EVICTION", + "MetricGroup": "Test", + "MetricName": "test_metric_inc" } ] I'll try to find a better mainline example, though, but I'm not hopeful ... Thanks, John
On 05/10/2020 19:05, John Garry wrote: >> Can you provide a reproduction? Looking on broadwell >> this metric doesn't exist. > > Right, I just added this test metric as my 2x x86 platform has no > examples which I can find: > > diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > index 8cdc7c13dc2a..fc6d9adf996a 100644 > --- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > +++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > @@ -348,5 +348,11 @@ > "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100", > "MetricGroup": "Power", > "MetricName": "C7_Pkg_Residency" > + }, > + { > + "BriefDescription": "test metric", > + "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE * > UNC_CBO_XSNP_RESPONSE.MISS_EVICTION", > + "MetricGroup": "Test", > + "MetricName": "test_metric_inc" > } > ] > It seems that the code in find_evsel_group() does not properly handle the scenario of event alias matching different PMUs (as I already said). So I got it working on top of "perf metricgroup: Fix uncore metric expressions" with the following change: diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c index d948a7f910cf..6293378c019c 100644 --- a/tools/perf/util/metricgroup.c +++ b/tools/perf/util/metricgroup.c @@ -213,7 +213,8 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist, /* Ignore event if already used and merging is disabled. */ if (metric_no_merge && test_bit(ev->idx, evlist_used)) continue; - if (!has_constraint && ev->leader != current_leader) { + if (!has_constraint && (!current_leader || strcmp(current_leader->name, ev->leader->name))) { /* * Start of a new group, discard the whole match and * start again. @@ -279,7 +280,8 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist, * when then group is left. */ if (!has_constraint && - ev->leader != metric_events[i]->leader) + strcmp(ev->leader->name, metric_events[i]->leader->name)) break; if (!strcmp(metric_events[i]->name, ev->name)) { set_bit(ev->idx, evlist_used); which gives for my test metric: ./perf stat -v -M test_metric_inc sleep 1 Using CPUID GenuineIntel-6-3D-4 metric expr unc_cbo_xsnp_response.miss_xcore / unc_cbo_xsnp_response.miss_eviction for test_metric_inc found event unc_cbo_xsnp_response.miss_eviction found event unc_cbo_xsnp_response.miss_xcore adding {unc_cbo_xsnp_response.miss_eviction,unc_cbo_xsnp_response.miss_xcore}:W unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/ unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/ unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/ unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/ Control descriptor is not initialized unc_cbo_xsnp_response.miss_eviction: 595175 1001021311 1001021311 unc_cbo_xsnp_response.miss_eviction: 592516 1001020037 1001020037 unc_cbo_xsnp_response.miss_xcore: 39139 1001021311 1001021311 unc_cbo_xsnp_response.miss_xcore: 38718 1001020037 1001020037 Performance counter stats for 'system wide': 1,187,691 unc_cbo_xsnp_response.miss_eviction # 0.07 test_metric_inc 77,857 unc_cbo_xsnp_response.miss_xcore 1.001068918 seconds time elapsed John
On Tue, Oct 6, 2020 at 7:22 AM John Garry <john.garry@huawei.com> wrote: > > On 05/10/2020 19:05, John Garry wrote: > >> Can you provide a reproduction? Looking on broadwell > >> this metric doesn't exist. > > > > Right, I just added this test metric as my 2x x86 platform has no > > examples which I can find: > > > > diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > > b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > > index 8cdc7c13dc2a..fc6d9adf996a 100644 > > --- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > > +++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json > > @@ -348,5 +348,11 @@ > > "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100", > > "MetricGroup": "Power", > > "MetricName": "C7_Pkg_Residency" > > + }, > > + { > > + "BriefDescription": "test metric", > > + "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE * > > UNC_CBO_XSNP_RESPONSE.MISS_EVICTION", > > + "MetricGroup": "Test", > > + "MetricName": "test_metric_inc" > > } > > ] > > > > It seems that the code in find_evsel_group() does not properly handle > the scenario of event alias matching different PMUs (as I already said). > > So I got it working on top of "perf metricgroup: Fix uncore metric > expressions" with the following change: > > diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c > index d948a7f910cf..6293378c019c 100644 > --- a/tools/perf/util/metricgroup.c > +++ b/tools/perf/util/metricgroup.c > @@ -213,7 +213,8 @@ static struct evsel *find_evsel_group(struct evlist > *perf_evlist, > /* Ignore event if already used and merging is disabled. */ > if (metric_no_merge && test_bit(ev->idx, evlist_used)) > continue; > - if (!has_constraint && ev->leader != current_leader) { > + if (!has_constraint && (!current_leader || > strcmp(current_leader->name, ev->leader->name))) { > /* > * Start of a new group, discard the whole match and > * start again. > @@ -279,7 +280,8 @@ static struct evsel *find_evsel_group(struct evlist > *perf_evlist, > * when then group is left. > */ > if (!has_constraint && > - ev->leader != metric_events[i]->leader) > + strcmp(ev->leader->name, metric_events[i]->leader->name)) > break; > if (!strcmp(metric_events[i]->name, ev->name)) { > set_bit(ev->idx, evlist_used); > > which gives for my test metric: > > ./perf stat -v -M test_metric_inc sleep 1 > Using CPUID GenuineIntel-6-3D-4 > metric expr unc_cbo_xsnp_response.miss_xcore / > unc_cbo_xsnp_response.miss_eviction for test_metric_inc > found event unc_cbo_xsnp_response.miss_eviction > found event unc_cbo_xsnp_response.miss_xcore > adding > {unc_cbo_xsnp_response.miss_eviction,unc_cbo_xsnp_response.miss_xcore}:W > unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/ > unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/ > unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/ > unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/ > Control descriptor is not initialized > unc_cbo_xsnp_response.miss_eviction: 595175 1001021311 1001021311 > unc_cbo_xsnp_response.miss_eviction: 592516 1001020037 1001020037 > unc_cbo_xsnp_response.miss_xcore: 39139 1001021311 1001021311 > unc_cbo_xsnp_response.miss_xcore: 38718 1001020037 1001020037 > > Performance counter stats for 'system wide': > > 1,187,691 unc_cbo_xsnp_response.miss_eviction # 0.07 > test_metric_inc > 77,857 unc_cbo_xsnp_response.miss_xcore > > > 1.001068918 seconds time elapsed > > John Thanks John! I was able to repro the problem, let me investigate what is happening here as it seems there may be something wrong with the grouping logic. Ian
diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json index 8cdc7c13dc2a..fc6d9adf996a 100644 --- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json +++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json @@ -348,5 +348,11 @@ "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100", "MetricGroup": "Power", "MetricName": "C7_Pkg_Residency" + }, + { + "BriefDescription": "test metric", + "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE * UNC_CBO_XSNP_RESPONSE.MISS_EVICTION", + "MetricGroup": "Test", + "MetricName": "test_metric_inc" }