diff mbox series

[2/2] clocksource: arm_arch_timer: set arch_mem_timer cpumask to cpu_possible_mask

Message ID 1531151136-18297-2-git-send-email-sudeep.holla@arm.com
State Accepted
Commit 5e18e412973d6bb1804de1d4d30a891c774b006e
Headers show
Series [1/2] Revert "tick: Prefer a lower rating device only if it's CPU local device" | expand

Commit Message

Sudeep Holla July 9, 2018, 3:45 p.m. UTC
Currently, arch_mem_timer cpumask is set to cpu_all_mask which should be
fine. However, cpu_possible_mask is more accurate and if there are other
clockevent source in the system which are set to cpu_possible_mask, then
having cpu_all_mask may result in issue.

E.g. on a platform with arm,sp804 timer with rating 300 and
cpu_possible_mask and this arch_mem_timer timer with rating 400 and
cpu_all_mask, tick_check_preferred may choose both preferred as the
cpumasks are not equal though they must be.

This issue was root caused incorrectly initially and a fix was merged as
commit 1332a9055801 ("tick: Prefer a lower rating device only if it's CPU
local device").

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>

---
 drivers/clocksource/arm_arch_timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Hi,

There are few more drivers that set cpu_all_mask, should they also be
changed ? It should be fine as long as it's not compared against
someother non per-CPU timer with shorter cpumask.

Regards,
Sudeep

--
2.7.4

Comments

Thomas Gleixner July 9, 2018, 10:09 p.m. UTC | #1
On Mon, 9 Jul 2018, Sudeep Holla wrote:

> Currently, arch_mem_timer cpumask is set to cpu_all_mask which should be

> fine. However, cpu_possible_mask is more accurate and if there are other

> clockevent source in the system which are set to cpu_possible_mask, then

> having cpu_all_mask may result in issue.

> 

> E.g. on a platform with arm,sp804 timer with rating 300 and

> cpu_possible_mask and this arch_mem_timer timer with rating 400 and

> cpu_all_mask, tick_check_preferred may choose both preferred as the

> cpumasks are not equal though they must be.

> 

> This issue was root caused incorrectly initially and a fix was merged as

> commit 1332a9055801 ("tick: Prefer a lower rating device only if it's CPU

> local device").


To avoid that in the future we really should fix the decision logic to mask
out the non possible CPUs from the supplied masks.

Thanks,

	tgkx
Sudeep Holla July 10, 2018, 10:29 a.m. UTC | #2
On 09/07/18 23:09, Thomas Gleixner wrote:
> On Mon, 9 Jul 2018, Sudeep Holla wrote:

> 

>> Currently, arch_mem_timer cpumask is set to cpu_all_mask which should be

>> fine. However, cpu_possible_mask is more accurate and if there are other

>> clockevent source in the system which are set to cpu_possible_mask, then

>> having cpu_all_mask may result in issue.

>>

>> E.g. on a platform with arm,sp804 timer with rating 300 and

>> cpu_possible_mask and this arch_mem_timer timer with rating 400 and

>> cpu_all_mask, tick_check_preferred may choose both preferred as the

>> cpumasks are not equal though they must be.

>>

>> This issue was root caused incorrectly initially and a fix was merged as

>> commit 1332a9055801 ("tick: Prefer a lower rating device only if it's CPU

>> local device").

> 

> To avoid that in the future we really should fix the decision logic to mask

> out the non possible CPUs from the supplied masks.


Sure, I can do that. But do you want that as a fix for v4.18 ?

I think we may need this check at another place in tick_setup_device

213         /*
214          * When the device is not per cpu, pin the interrupt to the
215          * current cpu:
216          */
217         if (!cpumask_equal(newdev->cpumask, cpumask))
218                 irq_set_affinity(newdev->irq, cpumask);

Does it make sense trim dev->cpumask when registering the clockevents
device itself instead of adding check at place where this cpumask
can be used ? So that any future user of those masks need not have to
take care of that.

Also only few ARM clocksource drivers use cpu_all_mask which could be
result of copy-paste, we can even fix them too.

arm_arch_timer.c:	clk->cpumask = cpu_all_mask;
tegra20_timer.c:	tegra_clockevent.cpumask = cpu_all_mask;
timer-atcpit100.c:	.cpumask = cpu_all_mask,
timer-keystone.c:	event_dev->cpumask = cpu_all_mask;
zevio-timer.c:		timer->clkevt.cpumask	= cpu_all_mask;

-- 
Regards,
Sudeep
Thomas Gleixner July 10, 2018, 12:21 p.m. UTC | #3
On Tue, 10 Jul 2018, Sudeep Holla wrote:

> 

> 

> On 09/07/18 23:09, Thomas Gleixner wrote:

> > On Mon, 9 Jul 2018, Sudeep Holla wrote:

> > 

> >> Currently, arch_mem_timer cpumask is set to cpu_all_mask which should be

> >> fine. However, cpu_possible_mask is more accurate and if there are other

> >> clockevent source in the system which are set to cpu_possible_mask, then

> >> having cpu_all_mask may result in issue.

> >>

> >> E.g. on a platform with arm,sp804 timer with rating 300 and

> >> cpu_possible_mask and this arch_mem_timer timer with rating 400 and

> >> cpu_all_mask, tick_check_preferred may choose both preferred as the

> >> cpumasks are not equal though they must be.

> >>

> >> This issue was root caused incorrectly initially and a fix was merged as

> >> commit 1332a9055801 ("tick: Prefer a lower rating device only if it's CPU

> >> local device").

> > 

> > To avoid that in the future we really should fix the decision logic to mask

> > out the non possible CPUs from the supplied masks.

> 

> Sure, I can do that. But do you want that as a fix for v4.18 ?


No, that's for 4.19

> I think we may need this check at another place in tick_setup_device

> 

> 213         /*

> 214          * When the device is not per cpu, pin the interrupt to the

> 215          * current cpu:

> 216          */

> 217         if (!cpumask_equal(newdev->cpumask, cpumask))

> 218                 irq_set_affinity(newdev->irq, cpumask);

> 

> Does it make sense trim dev->cpumask when registering the clockevents

> device itself instead of adding check at place where this cpumask

> can be used ? So that any future user of those masks need not have to

> take care of that.


The problem is you cannot trim it because cpu_all_mask is global and const.

> Also only few ARM clocksource drivers use cpu_all_mask which could be

> result of copy-paste, we can even fix them too.

> 

> arm_arch_timer.c:	clk->cpumask = cpu_all_mask;

> tegra20_timer.c:	tegra_clockevent.cpumask = cpu_all_mask;

> timer-atcpit100.c:	.cpumask = cpu_all_mask,

> timer-keystone.c:	event_dev->cpumask = cpu_all_mask;

> zevio-timer.c:		timer->clkevt.cpumask	= cpu_all_mask;


Yes, that makes sense. What we could do is warn, when cpu_all_mask is set
at registration time and replace the pointer with cpu_possible_mask.

Thanks,

	tglx
Sudeep Holla July 10, 2018, 1:25 p.m. UTC | #4
On 10/07/18 13:21, Thomas Gleixner wrote:
> On Tue, 10 Jul 2018, Sudeep Holla wrote:

> 

>>

>>

>> On 09/07/18 23:09, Thomas Gleixner wrote:

>>> On Mon, 9 Jul 2018, Sudeep Holla wrote:

>>>

>>>> Currently, arch_mem_timer cpumask is set to cpu_all_mask which should be

>>>> fine. However, cpu_possible_mask is more accurate and if there are other

>>>> clockevent source in the system which are set to cpu_possible_mask, then

>>>> having cpu_all_mask may result in issue.

>>>>

>>>> E.g. on a platform with arm,sp804 timer with rating 300 and

>>>> cpu_possible_mask and this arch_mem_timer timer with rating 400 and

>>>> cpu_all_mask, tick_check_preferred may choose both preferred as the

>>>> cpumasks are not equal though they must be.

>>>>

>>>> This issue was root caused incorrectly initially and a fix was merged as

>>>> commit 1332a9055801 ("tick: Prefer a lower rating device only if it's CPU

>>>> local device").

>>>

>>> To avoid that in the future we really should fix the decision logic to mask

>>> out the non possible CPUs from the supplied masks.

>>

>> Sure, I can do that. But do you want that as a fix for v4.18 ?

> 

> No, that's for 4.19

> 


Ah OK, I assume you are fine with this patch as fix for v4.18.
>> I think we may need this check at another place in tick_setup_device

>>

>> 213         /*

>> 214          * When the device is not per cpu, pin the interrupt to the

>> 215          * current cpu:

>> 216          */

>> 217         if (!cpumask_equal(newdev->cpumask, cpumask))

>> 218                 irq_set_affinity(newdev->irq, cpumask);

>>

>> Does it make sense trim dev->cpumask when registering the clockevents

>> device itself instead of adding check at place where this cpumask

>> can be used ? So that any future user of those masks need not have to

>> take care of that.

> 

> The problem is you cannot trim it because cpu_all_mask is global and const.

> 


Indeed, again forgot it just a pointer to the global one, duh!

>> Also only few ARM clocksource drivers use cpu_all_mask which could be

>> result of copy-paste, we can even fix them too.

>>

>> arm_arch_timer.c:	clk->cpumask = cpu_all_mask;

>> tegra20_timer.c:	tegra_clockevent.cpumask = cpu_all_mask;

>> timer-atcpit100.c:	.cpumask = cpu_all_mask,

>> timer-keystone.c:	event_dev->cpumask = cpu_all_mask;

>> zevio-timer.c:		timer->clkevt.cpumask	= cpu_all_mask;

> 

> Yes, that makes sense. What we could do is warn, when cpu_all_mask is set

> at registration time and replace the pointer with cpu_possible_mask.

> 


I like this approach than having to bitwise and with cpu_possible_mask
at all the necessary place. I will cook up a patch.
-- 
Regards,
Sudeep
Thomas Gleixner July 10, 2018, 8:21 p.m. UTC | #5
On Tue, 10 Jul 2018, Sudeep Holla wrote:
> On 10/07/18 13:21, Thomas Gleixner wrote:

> >> Also only few ARM clocksource drivers use cpu_all_mask which could be

> >> result of copy-paste, we can even fix them too.

> >>

> >> arm_arch_timer.c:	clk->cpumask = cpu_all_mask;

> >> tegra20_timer.c:	tegra_clockevent.cpumask = cpu_all_mask;

> >> timer-atcpit100.c:	.cpumask = cpu_all_mask,

> >> timer-keystone.c:	event_dev->cpumask = cpu_all_mask;

> >> zevio-timer.c:		timer->clkevt.cpumask	= cpu_all_mask;

> > 

> > Yes, that makes sense. What we could do is warn, when cpu_all_mask is set

> > at registration time and replace the pointer with cpu_possible_mask.

> > 

> I like this approach than having to bitwise and with cpu_possible_mask

> at all the necessary place. I will cook up a patch.


Appreciated. I'm inclined to take it for 4.18 even. Please send it along
with the fixes for the above obvious failure spots.

Thanks,

	tglx
diff mbox series

Patch

diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 57cb2f00fc07..d8c7f5750cdb 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -735,7 +735,7 @@  static void __arch_timer_setup(unsigned type,
 		clk->features |= CLOCK_EVT_FEAT_DYNIRQ;
 		clk->name = "arch_mem_timer";
 		clk->rating = 400;
-		clk->cpumask = cpu_all_mask;
+		clk->cpumask = cpu_possible_mask;
 		if (arch_timer_mem_use_virtual) {
 			clk->set_state_shutdown = arch_timer_shutdown_virt_mem;
 			clk->set_state_oneshot_stopped = arch_timer_shutdown_virt_mem;