ARM: Don't ever downscale loops_per_jiffy in SMP systems

Message ID alpine.LFD.2.11.1405082058130.980@knanqh.ubzr
State New
Headers show

Commit Message

Nicolas Pitre May 9, 2014, 1:37 a.m.
On Thu, 8 May 2014, Russell King - ARM Linux wrote:

> If you're in a preempt or SMP environment, provide a timer for udelay().
> IF you're in an environment with IRQs which can take a long time, use
> a timer for udelay().  If you're in an environment where the CPU clock
> can change unexpectedly, use a timer for udelay().

Longer delays are normally not a problem.  If they are, then simply 
disabling IRQs may solve it if absolutely required.  With much shorter 
delays than expected this is another story.

What about the following:



Nicolas

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Comments

Douglas Anderson May 9, 2014, 4:43 a.m. | #1
Nicolas,

On Thu, May 8, 2014 at 6:37 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> On Thu, 8 May 2014, Russell King - ARM Linux wrote:
>
>> If you're in a preempt or SMP environment, provide a timer for udelay().
>> IF you're in an environment with IRQs which can take a long time, use
>> a timer for udelay().  If you're in an environment where the CPU clock
>> can change unexpectedly, use a timer for udelay().
>
> Longer delays are normally not a problem.  If they are, then simply
> disabling IRQs may solve it if absolutely required.  With much shorter
> delays than expected this is another story.
>
> What about the following:
>
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 7c4fada440..10030cc5a0 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -682,6 +682,15 @@ static int cpufreq_callback(struct notifier_block *nb,
>                         cpufreq_scale(per_cpu(l_p_j_ref, cpu),
>                                         per_cpu(l_p_j_ref_freq, cpu),
>                                         freq->new);
> +               /*
> +                * Another CPU might have called udelay() just before LPJ
> +                * and a shared CPU clock is increased.  That other CPU still
> +                * looping on the old LPJ value would return significantly
> +                * sooner than expected.  The actual fix is to provide a
> +                * timer based udelay() implementation instead.
> +                */
> +               if (freq->old < freq->new)
> +                       pr_warn_once("*** udelay() on SMP is racy and may be much shorter than expected ***\n");

I would be OK with that.  At least someone would have a clue what to do.

-Doug
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Russell King - ARM Linux May 9, 2014, 9:18 a.m. | #2
On Thu, May 08, 2014 at 09:37:15PM -0400, Nicolas Pitre wrote:
> On Thu, 8 May 2014, Russell King - ARM Linux wrote:
> 
> > If you're in a preempt or SMP environment, provide a timer for udelay().
> > IF you're in an environment with IRQs which can take a long time, use
> > a timer for udelay().  If you're in an environment where the CPU clock
> > can change unexpectedly, use a timer for udelay().
> 
> Longer delays are normally not a problem.  If they are, then simply 
> disabling IRQs may solve it if absolutely required.  With much shorter 
> delays than expected this is another story.
> 
> What about the following:
> 
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 7c4fada440..10030cc5a0 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -682,6 +682,15 @@ static int cpufreq_callback(struct notifier_block *nb,
>  			cpufreq_scale(per_cpu(l_p_j_ref, cpu),
>  					per_cpu(l_p_j_ref_freq, cpu),
>  					freq->new);
> +		/*
> +		 * Another CPU might have called udelay() just before LPJ
> +		 * and a shared CPU clock is increased.  That other CPU still
> +		 * looping on the old LPJ value would return significantly
> +		 * sooner than expected.  The actual fix is to provide a
> +		 * timer based udelay() implementation instead.
> +		 */
> +		if (freq->old < freq->new)
> +			pr_warn_once("*** udelay() on SMP is racy and may be much shorter than expected ***\n");
>  	}
>  	return NOTIFY_OK;
>  }

No, because you're assuming this is just a SMP problem.  What about
preempt, where you could preempt away from a udelay loop to change
the CPU frequency, and then back again, possibly resulting in the
CPU clock rate increasing and maybe a shorter delay if the switch
from-change-clock-and-back is fast enough?  Remember that udelay()
can be used for up to 2ms delays.
Russell King - ARM Linux May 9, 2014, 6:22 p.m. | #3
On Fri, May 09, 2014 at 02:00:54PM -0400, Nicolas Pitre wrote:
> On Fri, 9 May 2014, Russell King - ARM Linux wrote:
> 
> > On Thu, May 08, 2014 at 09:37:15PM -0400, Nicolas Pitre wrote:
> > > On Thu, 8 May 2014, Russell King - ARM Linux wrote:
> > > 
> > > > If you're in a preempt or SMP environment, provide a timer for udelay().
> > > > IF you're in an environment with IRQs which can take a long time, use
> > > > a timer for udelay().  If you're in an environment where the CPU clock
> > > > can change unexpectedly, use a timer for udelay().
> > > 
> > > Longer delays are normally not a problem.  If they are, then simply 
> > > disabling IRQs may solve it if absolutely required.  With much shorter 
> > > delays than expected this is another story.
> > > 
> > > What about the following:
> > > 
> > > diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> > > index 7c4fada440..10030cc5a0 100644
> > > --- a/arch/arm/kernel/smp.c
> > > +++ b/arch/arm/kernel/smp.c
> > > @@ -682,6 +682,15 @@ static int cpufreq_callback(struct notifier_block *nb,
> > >  			cpufreq_scale(per_cpu(l_p_j_ref, cpu),
> > >  					per_cpu(l_p_j_ref_freq, cpu),
> > >  					freq->new);
> > > +		/*
> > > +		 * Another CPU might have called udelay() just before LPJ
> > > +		 * and a shared CPU clock is increased.  That other CPU still
> > > +		 * looping on the old LPJ value would return significantly
> > > +		 * sooner than expected.  The actual fix is to provide a
> > > +		 * timer based udelay() implementation instead.
> > > +		 */
> > > +		if (freq->old < freq->new)
> > > +			pr_warn_once("*** udelay() on SMP is racy and may be much shorter than expected ***\n");
> > >  	}
> > >  	return NOTIFY_OK;
> > >  }
> > 
> > No, because you're assuming this is just a SMP problem.  What about
> > preempt, where you could preempt away from a udelay loop to change
> > the CPU frequency, and then back again, possibly resulting in the
> > CPU clock rate increasing and maybe a shorter delay if the switch
> > from-change-clock-and-back is fast enough?  Remember that udelay()
> > can be used for up to 2ms delays.
> 
> Well... that would be somewhat less likely but still possible yes.
> 
> So the only way to "solve" this might look similar in spirit to what 
> Doug alluded to earlier i.e. increase a sequence number on 
> CPUFREQ_PRECHANGE and increase it again on CPUFREQ_POSTCHANGE, and have 
> udelay() compare the count sampled before reading lpj and after 
> returning from the loop code.  When the sequence count doesn't match 
> then suffice to perform some arbitrarily large extra loops.

I'd much prefer just printing a warning at kernel boot time to report
that the kernel is running with features which would make udelay() less
than accurate.

Remember, it should be usable for _short_ delays on slow machines as
well as other stuff, and if we're going to start throwing stuff like
the above at it, it's going to become very inefficient.

And... I go back to what I've been saying all along: use a timer in
this situation, don't rely on the loops-based udelay if you have
preempt, USB interrupts, SMP etc.

Patch

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 7c4fada440..10030cc5a0 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -682,6 +682,15 @@  static int cpufreq_callback(struct notifier_block *nb,
 			cpufreq_scale(per_cpu(l_p_j_ref, cpu),
 					per_cpu(l_p_j_ref_freq, cpu),
 					freq->new);
+		/*
+		 * Another CPU might have called udelay() just before LPJ
+		 * and a shared CPU clock is increased.  That other CPU still
+		 * looping on the old LPJ value would return significantly
+		 * sooner than expected.  The actual fix is to provide a
+		 * timer based udelay() implementation instead.
+		 */
+		if (freq->old < freq->new)
+			pr_warn_once("*** udelay() on SMP is racy and may be much shorter than expected ***\n");
 	}
 	return NOTIFY_OK;
 }