[v2,1/2] arm: perf: Prevent wraparound during overflow

Message ID: 1416587067-3220-2-git-send-email-daniel.thompson@linaro.org
State: Accepted
Commit: 2d9ed7406fd24987b75f78fea8290202d8108f34

Commit Message

Daniel Thompson Nov. 21, 2014, 4:24 p.m. UTC
If the overflow threshold for a counter is set above or near the
0xffffffff boundary then the kernel may lose track of the overflow,
causing only events that occur *after* the overflow to be recorded.
Specifically the problem occurs when the value of the performance counter
overtakes its original programmed value due to wrap around.
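
To make the failure mode concrete, here is a minimal user-space sketch of it. It is not the kernel code: it assumes a 32-bit up-counter that is programmed with the two's complement of the period and whose event count is later recovered as a modular difference of two raw readings. All names (COUNTER_MASK, programmed_value, counter_after, recovered_delta, lost_events) are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define COUNTER_MASK 0xffffffffULL /* assumed 32-bit counter */

/* Start value that makes the counter overflow after `left` events. */
static uint32_t programmed_value(uint64_t left)
{
	assert(left > 0 && left <= COUNTER_MASK);
	return (uint32_t)((0 - left) & COUNTER_MASK);
}

/* Raw 32-bit counter reading after `events` events, starting from `start`. */
static uint32_t counter_after(uint32_t start, uint64_t events)
{
	return (uint32_t)(start + events);
}

/* Event count recovered from two raw readings with 32-bit modular arithmetic. */
static uint64_t recovered_delta(uint32_t prev, uint32_t now)
{
	return ((uint64_t)now - prev) & COUNTER_MASK;
}

/* Events that go unaccounted when the handler runs `latency` events late. */
static uint64_t lost_events(uint64_t left, uint64_t latency)
{
	uint32_t prev = programmed_value(left);
	uint64_t actual = left + latency;
	uint32_t now = counter_after(prev, actual);

	return actual - recovered_delta(prev, now);
}
```

Under these assumptions, lost_events(0xffffffff, 5) is 0x100000000: the counter starts at 1, wraps, and has already passed 1 again when it is read, so a full 2^32 events vanish from the delta. With the halved period, lost_events(0x7fffffff, 5) is 0, because the counter would need roughly another 2^31 events after the overflow before it could overtake its start value.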

Typical solutions to this problem are either to avoid programming in
values likely to be overtaken or to treat the overflow bit as the 33rd
bit of the counter.

It's somewhat fiddly to refactor the code to correctly handle the 33rd bit
during irqsave sections (context switches for example) so instead we take
the simpler approach of avoiding values likely to be overtaken.

We set the limit to half of max_period because this matches the limit
imposed in __hw_perf_event_init(). This causes a doubling of the interrupt
rate for large threshold values; however, even with a very fast counter
ticking at 4GHz the interrupt rate would only be ~1Hz.
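
The interrupt-rate estimate can be checked with a few lines of arithmetic. The sketch below assumes a 32-bit counter (max_period of 0xffffffff) that ticks once per cycle at 4GHz; overflow_rate_hz is a name invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Overflow interrupts per second for a counter ticking at tick_hz. */
static double overflow_rate_hz(double tick_hz, uint64_t period)
{
	assert(period > 0);
	return tick_hz / (double)period;
}
```

With these assumptions, overflow_rate_hz(4e9, 0xffffffff) is about 0.93Hz for the full period and overflow_rate_hz(4e9, 0xffffffff >> 1) is about 1.86Hz for the halved one, which is the doubling described above and the same order of magnitude as the quoted figure.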

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---
 arch/arm/kernel/perf_event.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Comments

Daniel Thompson Dec. 4, 2014, 1:58 p.m. UTC | #1
On 04/12/14 10:26, Will Deacon wrote:
> On Fri, Nov 21, 2014 at 04:24:26PM +0000, Daniel Thompson wrote:
>> If the overflow threshold for a counter is set above or near the
>> 0xffffffff boundary then the kernel may lose track of the overflow
>> causing only events that occur *after* the overflow to be recorded.
>> Specifically the problem occurs when the value of the performance counter
>> overtakes its original programmed value due to wrap around.
>>
>> Typical solutions to this problem are either to avoid programming in
>> values likely to be overtaken or to treat the overflow bit as the 33rd
>> bit of the counter.
>>
>> It's somewhat fiddly to refactor the code to correctly handle the 33rd bit
>> during irqsave sections (context switches for example) so instead we take
>> the simpler approach of avoiding values likely to be overtaken.
>>
>> We set the limit to half of max_period because this matches the limit
>> imposed in __hw_perf_event_init(). This causes a doubling of the interrupt
>> rate for large threshold values, however even with a very fast counter
>> ticking at 4GHz the interrupt rate would only be ~1Hz.
>>
>> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
> 
>   Acked-by: Will Deacon <will.deacon@arm.com>
> 
> You'll probably need to refresh this at -rc1 as there are a bunch of
> changes queued for this file already. Then you can stick it into rmk's
> patch system.

I'll do that. Thanks.


> 
> Cheers,
> 
> Will
> 
>> ---
>>  arch/arm/kernel/perf_event.c | 10 ++++++++--
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
>> index 266cba46db3e..ab68833c1e31 100644
>> --- a/arch/arm/kernel/perf_event.c
>> +++ b/arch/arm/kernel/perf_event.c
>> @@ -115,8 +115,14 @@ int armpmu_event_set_period(struct perf_event *event)
>>  		ret = 1;
>>  	}
>>  
>> -	if (left > (s64)armpmu->max_period)
>> -		left = armpmu->max_period;
>> +	/*
>> +	 * Limit the maximum period to prevent the counter value
>> +	 * from overtaking the one we are about to program. In
>> +	 * effect we are reducing max_period to account for
>> +	 * interrupt latency (and we are being very conservative).
>> +	 */
>> +	if (left > (armpmu->max_period >> 1))
>> +		left = armpmu->max_period >> 1;
>>  
>>  	local64_set(&hwc->prev_count, (u64)-left);
>>  
>> -- 
>> 1.9.3
>>
>>
Daniel Thompson Jan. 5, 2015, 7:31 p.m. UTC | #2
On Mon, Jan 05, 2015 at 03:57:39PM +0100, Peter Zijlstra wrote:
> On Fri, Nov 21, 2014 at 04:24:26PM +0000, Daniel Thompson wrote:
> > diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> > index 266cba46db3e..ab68833c1e31 100644
> > --- a/arch/arm/kernel/perf_event.c
> > +++ b/arch/arm/kernel/perf_event.c
> > @@ -115,8 +115,14 @@ int armpmu_event_set_period(struct perf_event *event)
> >  		ret = 1;
> >  	}
> >  
> > -	if (left > (s64)armpmu->max_period)
> > -		left = armpmu->max_period;
> > +	/*
> > +	 * Limit the maximum period to prevent the counter value
> > +	 * from overtaking the one we are about to program. In
> > +	 * effect we are reducing max_period to account for
> > +	 * interrupt latency (and we are being very conservative).
> > +	 */
> > +	if (left > (armpmu->max_period >> 1))
> > +		left = armpmu->max_period >> 1;
> 
> On x86 we simply halve max_period, why did you choose to do differently?

In truth because I didn't look at the x86 code... there is an existing
halving of max_period in the arm code and that was enough to satisfy me
that halving max_period was reasonable.

Predividing max_period looks to me like it would work for ARM too although I
don't think we could blame hardware insanity for doing so ;-).

Will: Do you want me to update this?
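
For readers comparing the two options discussed here, the following sketch contrasts clamping at use (what this patch does) with predividing max_period once at init time. It is only an illustration: struct toy_pmu and every function name below are hypothetical and are not taken from the ARM or x86 perf code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical PMU descriptor for illustration; not the kernel's struct arm_pmu. */
struct toy_pmu {
	uint64_t counter_mask; /* full counter width, e.g. 0xffffffff */
	uint64_t max_period;   /* largest period the driver will program */
};

/* Clamp at use: max_period is full width and is halved on every call. */
static uint64_t clamp_at_use(const struct toy_pmu *pmu, uint64_t left)
{
	uint64_t limit = pmu->max_period >> 1;

	return left > limit ? limit : left;
}

/* Predivide: halve once at init and keep the full-width mask separately. */
static struct toy_pmu toy_pmu_predivided(uint64_t counter_mask)
{
	struct toy_pmu pmu;

	assert(counter_mask > 1);
	pmu.counter_mask = counter_mask;
	pmu.max_period = counter_mask >> 1;
	return pmu;
}

/* With a predivided max_period the clamp needs no shift. */
static uint64_t clamp_predivided(const struct toy_pmu *pmu, uint64_t left)
{
	return left > pmu->max_period ? pmu->max_period : left;
}
```

Both variants clamp a requested period of 0xffffffff to 0x7fffffff. The sketch keeps a separate counter_mask because, if a driver also uses max_period as the counter-width mask when computing deltas, predividing it would change that mask as well; whether that applies to a given driver has to be checked against its source.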

Patch

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 266cba46db3e..ab68833c1e31 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -115,8 +115,14 @@  int armpmu_event_set_period(struct perf_event *event)
 		ret = 1;
 	}
 
-	if (left > (s64)armpmu->max_period)
-		left = armpmu->max_period;
+	/*
+	 * Limit the maximum period to prevent the counter value
+	 * from overtaking the one we are about to program. In
+	 * effect we are reducing max_period to account for
+	 * interrupt latency (and we are being very conservative).
+	 */
+	if (left > (armpmu->max_period >> 1))
+		left = armpmu->max_period >> 1;
 
 	local64_set(&hwc->prev_count, (u64)-left);
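
The effect of the new clamp and of the (u64)-left expression can be tried out in isolation. The sketch below is a user-space approximation, assuming left is a signed 64-bit value that is already positive at this point and max_period is an unsigned 64-bit mask (as the removed (s64) cast suggests); clamp_left and start_count are names invented here.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a positive remaining period to half of max_period. */
static int64_t clamp_left(int64_t left, uint64_t max_period)
{
	assert(left > 0); /* a negative left would compare as a huge unsigned value */
	if ((uint64_t)left > (max_period >> 1))
		left = (int64_t)(max_period >> 1);
	return left;
}

/* Two's-complement start value: counting up `left` events wraps it to zero. */
static uint64_t start_count(int64_t left)
{
	return (uint64_t)-left;
}
```

For a 32-bit max_period of 0xffffffff, clamp_left(0x100000000, 0xffffffff) returns 0x7fffffff, and start_count(0x7fffffff) is 0xffffffff80000001, whose low 32 bits (0x80000001) are what a 32-bit counter would hold. Periods at or below the limit pass through unchanged.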