[3/3] hwrng: st: Use real-world device timings for timeout

Message ID 1444142640-14721-3-git-send-email-lee.jones@linaro.org
State New

Commit Message

Lee Jones Oct. 6, 2015, 2:44 p.m.
Samples are documented to be available every 0.667us, so in theory
the 8 sample deep FIFO should take 5.336us to fill.  However, during
thorough testing, it became apparent that filling the FIFO actually
takes closer to 12us.

Signed-off-by: Lee Jones <lee.jones@linaro.org>
---
 drivers/char/hw_random/st-rng.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)
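
The arithmetic in the commit message can be checked with a small sketch. The sample period, FIFO depth and measured fill time come from the commit message; the macro and function names are hypothetical and are not taken from st-rng.c.

```c
#include <assert.h>
#include <stdint.h>

/* Values from the commit message; the names are hypothetical. */
#define SAMPLE_PERIOD_NS	667u	/* documented: one sample every 0.667us */
#define FIFO_DEPTH		8u	/* samples needed to fill the FIFO */
#define MEASURED_FILL_NS	12000u	/* observed: "closer to 12us" */

/* Theoretical time to fill a FIFO of 'depth' samples, in nanoseconds. */
static uint32_t fifo_fill_ns(uint32_t depth, uint32_t period_ns)
{
	assert(depth > 0 && period_ns > 0);
	return depth * period_ns;
}

/* Round nanoseconds up to whole microseconds; a timeout should not round down. */
static uint32_t ns_to_us_round_up(uint32_t ns)
{
	return (ns + 999u) / 1000u;
}
```

With these values, fifo_fill_ns(FIFO_DEPTH, SAMPLE_PERIOD_NS) is 5336ns, which rounds up to 6us. Rounding each sample period to 1us first, as the comment removed by this patch does, gives 8us. Both figures are well below the measured 12us.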

Comments

Russell King - ARM Linux Oct. 6, 2015, 7:37 p.m. | #1
On Tue, Oct 06, 2015 at 03:44:00PM +0100, Lee Jones wrote:
> Samples are documented to be available every 0.667us, so in theory
> the 8 sample deep FIFO should take 5.336us to fill.  However, during
> thorough testing, it became apparent that filling the FIFO actually
> takes closer to 12us.

Is that measured?

> +/*
> + * Samples are documented to be available every 0.667us, so in theory
> + * the 8 sample deep FIFO should take 5.336us to fill.  However, during
> + * thorough testing, it became apparent that filling the FIFO actually
> + * takes closer to 12us.
> + */
> +#define ST_RNG_FILL_FIFO_TIMEOUT	12

I hope you're not using such a precise figure with udelay().  udelay()
is not guaranteed to give exactly (or even at least) the delay you
request.  It's defined to give an approximate delay.

Many people have a problem understanding that, so I won't explain why
it is that way, just accept that it is and move on... it's not going
to magically get "fixed" because someone has just learnt about this. :)

Lee Jones Oct. 6, 2015, 8:51 p.m. | #2
On Tue, 06 Oct 2015, Russell King - ARM Linux wrote:
> On Tue, Oct 06, 2015 at 03:44:00PM +0100, Lee Jones wrote:
> > Samples are documented to be available every 0.667us, so in theory
> > the 8 sample deep FIFO should take 5.336us to fill.  However, during
> > thorough testing, it became apparent that filling the FIFO actually
> > takes closer to 12us.
> 
> Is that measured?

I measured it using ktime.  Hopefully that was adequate.

> > +/*
> > + * Samples are documented to be available every 0.667us, so in theory
> > + * the 8 sample deep FIFO should take 5.336us to fill.  However, during
> > + * thorough testing, it became apparent that filling the FIFO actually
> > + * takes closer to 12us.
> > + */
> > +#define ST_RNG_FILL_FIFO_TIMEOUT	12
> 
> I hope you're not using such a precise figure with udelay().  udelay()
> is not guaranteed to give exactly (or even at least) the delay you
> request.  It's defined to give an approximate delay.
> 
> Many people have a problem understanding that, so I won't explain why
> it is that way, just accept that it is and move on... it's not going
> to magically get "fixed" because someone has just learnt about this. :)

Thanks for the info.  I did do testing, again using ktime, to make
sure and on our platform (is it platform specific?) I measured
udelay(1) to be ~1100ns.  After moving to a 12us timeout and reading
many MBs of randomness I am yet to receive any more timeouts.

Russell King - ARM Linux Oct. 6, 2015, 8:56 p.m. | #3
On Tue, Oct 06, 2015 at 09:51:22PM +0100, Lee Jones wrote:
> On Tue, 06 Oct 2015, Russell King - ARM Linux wrote:
> > On Tue, Oct 06, 2015 at 03:44:00PM +0100, Lee Jones wrote:
> > > Samples are documented to be available every 0.667us, so in theory
> > > the 8 sample deep FIFO should take 5.336us to fill.  However, during
> > > thorough testing, it became apparent that filling the FIFO actually
> > > takes closer to 12us.
> > 
> > Is that measured?
> 
> I measured it using ktime.  Hopefully that was adequate.
> 
> > > +/*
> > > + * Samples are documented to be available every 0.667us, so in theory
> > > + * the 8 sample deep FIFO should take 5.336us to fill.  However, during
> > > + * thorough testing, it became apparent that filling the FIFO actually
> > > + * takes closer to 12us.
> > > + */
> > > +#define ST_RNG_FILL_FIFO_TIMEOUT	12
> > 
> > I hope you're not using such a precise figure with udelay().  udelay()
> > is not guaranteed to give exactly (or even at least) the delay you
> > request.  It's defined to give an approximate delay.
> > 
> > Many people have a problem understanding that, so I won't explain why
> > it is that way, just accept that it is and move on... it's not going
> > to magically get "fixed" because someone has just learnt about this. :)
> 
> Thanks for the info.  I did do testing, again using ktime, to make
> sure and on our platform (is it platform specific?) I measured
> udelay(1) to be ~1100ns.  After moving to a 12us timeout and reading
> many MBs of randomness I am yet to receive any more timeouts.

If you happen to fall back to the software timing loop, udelay(1) will not
be >=1us anymore, but will be slightly shorter.

That's because the loops_per_jiffy value is calculated as the number of
loops between each timer interrupt - so the period being measured is the
timer period, minus the time it takes for the timer interrupt to run.
The latter is indeterminate.  Consequently, the loops_per_jiffy estimate
is always slightly under the real number of loops-per-jiffy, so delays
generated by udelay() and friends will always be slightly short.

The faster your HZ value, the bigger the error.  The longer the interrupt
handler takes, the bigger the error.

IIRC, Linus recommends a x2 factor on delays, especially timeouts generated
by these functions.
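
The shortfall described above can be sketched with a simplified linear model. This is an illustration only, not kernel code: the function name is hypothetical, and the interrupt handler cost used in the usage note is an assumed figure, not a measurement.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified model, not kernel code: calibration counts loops for the
 * timer period minus the time spent in the timer interrupt, so the
 * loops_per_jiffy estimate is scaled by (period - irq) / period, and
 * every software-loop delay is scaled by the same factor.
 */
static uint64_t modelled_delay_ns(uint64_t requested_ns, uint64_t hz,
				  uint64_t irq_ns)
{
	uint64_t period_ns;

	assert(hz > 0);
	period_ns = 1000000000ull / hz;		/* one jiffy in nanoseconds */
	assert(irq_ns < period_ns);

	return requested_ns * (period_ns - irq_ns) / period_ns;
}
```

Assuming a 10us interrupt handler, a requested 1000ns delay comes out as 999ns at HZ=100 and 990ns at HZ=1000, so a faster HZ gives a bigger error. Doubling the handler cost to 20us at HZ=1000 gives 980ns, so a longer handler also gives a bigger error.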

Lee Jones Oct. 7, 2015, 7:53 a.m. | #4
On Tue, 06 Oct 2015, Russell King - ARM Linux wrote:

> On Tue, Oct 06, 2015 at 09:51:22PM +0100, Lee Jones wrote:
> > On Tue, 06 Oct 2015, Russell King - ARM Linux wrote:
> > > On Tue, Oct 06, 2015 at 03:44:00PM +0100, Lee Jones wrote:
> > > > Samples are documented to be available every 0.667us, so in theory
> > > > the 8 sample deep FIFO should take 5.336us to fill.  However, during
> > > > thorough testing, it became apparent that filling the FIFO actually
> > > > takes closer to 12us.
> > > 
> > > Is that measured?
> > 
> > I measured it using ktime.  Hopefully that was adequate.
> > 
> > > > +/*
> > > > + * Samples are documented to be available every 0.667us, so in theory
> > > > + * the 8 sample deep FIFO should take 5.336us to fill.  However, during
> > > > + * thorough testing, it became apparent that filling the FIFO actually
> > > > + * takes closer to 12us.
> > > > + */
> > > > +#define ST_RNG_FILL_FIFO_TIMEOUT	12
> > > 
> > > I hope you're not using such a precise figure with udelay().  udelay()
> > > is not guaranteed to give exactly (or even at least) the delay you
> > > request.  It's defined to give an approximate delay.
> > > 
> > > Many people have a problem understanding that, so I won't explain why
> > > it is that way, just accept that it is and move on... it's not going
> > > to magically get "fixed" because someone has just learnt about this. :)
> > 
> > Thanks for the info.  I did do testing, again using ktime, to make
> > sure and on our platform (is it platform specific?) I measured
> > udelay(1) to be ~1100ns.  After moving to a 12us timeout and reading
> > many MBs of randomness I am yet to receive any more timeouts.
> 
> If you happen to fall back to the software timing loop, udelay(1) will not
> be >=1us anymore, but will be slightly shorter.
> 
> That's because the loops_per_jiffy value is calculated as the number of
> loops between each timer interrupt - so the period being measured is the
> timer period, minus the time it takes for the timer interrupt to run.
> The latter is indeterminate.  Consequently, the loops_per_jiffy estimate
> is always slightly under the real number of loops-per-jiffy, so delays
> generated by udelay() and friends will always be slightly short.
> 
> The faster your HZ value, the bigger the error.  The longer the interrupt
> handler takes, the bigger the error.

Thanks for taking the time to explain.

> IIRC, Linus recommends a x2 factor on delays, especially timeouts generated
> by these functions.

In this implementation it shouldn't matter too much either way.  Even
when the timeouts were prolific, bandwidth was not reduced due to the
quick turn-round of the subsystem.  I don't foresee any impact on
bandwidth if we were to raise the timeout either; in fact, I doubt
we'd ever see a timeout again.
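
How the figures in this thread interact can be sketched with a small model of a poll loop. It assumes the driver checks the FIFO status once per iteration and then calls udelay(1), with the timeout value used as the iteration count. That structure is an assumption, the function name is hypothetical, and the cost of the register read is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of a poll loop: check the FIFO once, then delay,
 * up to 'max_polls' times.  Returns true if the FIFO, which becomes
 * full 'fill_ns' after the start, is seen full before the budget ends.
 */
static bool fifo_seen_full(uint32_t max_polls, uint32_t delay_ns,
			   uint32_t fill_ns)
{
	uint32_t elapsed_ns = 0;

	assert(delay_ns > 0);
	for (uint32_t i = 0; i < max_polls; i++) {
		if (elapsed_ns >= fill_ns)
			return true;	/* FIFO full: no timeout */
		elapsed_ns += delay_ns;	/* models one udelay(1) call */
	}
	return false;			/* budget exhausted: timeout */
}
```

Taking the fill time as exactly 12000ns and udelay(1) as the measured 1100ns, 8 polls time out and 12 polls succeed, which matches the behaviour reported above. If udelay(1) ran short at 990ns, 12 polls would time out in this model, while 24 polls (the x2 factor) would succeed.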

Patch

diff --git a/drivers/char/hw_random/st-rng.c b/drivers/char/hw_random/st-rng.c
index 44480fe..3b1432c 100644
--- a/drivers/char/hw_random/st-rng.c
+++ b/drivers/char/hw_random/st-rng.c
@@ -33,8 +33,13 @@ 
 #define ST_RNG_FIFO_DEPTH		8
 #define ST_RNG_FIFO_SIZE		(ST_RNG_FIFO_DEPTH * ST_RNG_SAMPLE_SIZE)
 
-/* Samples are available every 0.667us, which we round to 1us */
-#define ST_RNG_FILL_FIFO_TIMEOUT   (1 * (ST_RNG_FIFO_SIZE / ST_RNG_SAMPLE_SIZE))
+/*
+ * Samples are documented to be available every 0.667us, so in theory
+ * the 8 sample deep FIFO should take 5.336us to fill.  However, during
+ * thorough testing, it became apparent that filling the FIFO actually
+ * takes closer to 12us.
+ */
+#define ST_RNG_FILL_FIFO_TIMEOUT	12
 
 struct st_rng_data {
 	void __iomem	*base;