[1/2] random: schedule jitter credit for next jiffy, not in two jiffies

Message ID 20220930231050.749824-1-Jason@zx2c4.com
State New
Series [1/2] random: schedule jitter credit for next jiffy, not in two jiffies

Commit Message

Jason A. Donenfeld Sept. 30, 2022, 11:10 p.m. UTC
Counterintuitively, mod_timer(..., jiffies + 1) will cause the timer to
fire not in the next jiffy, but in two jiffies. The way to cause
the timer to fire in the next jiffy is with mod_timer(..., jiffies).
Doing so then lets us bump the upper bound back up again.

Fixes: 50ee7529ec45 ("random: try to actively add entropy rather than passively wait for it")
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 drivers/char/random.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Jason A. Donenfeld Oct. 1, 2022, 9:21 a.m. UTC | #1
On Sat, Oct 01, 2022 at 01:10:50AM +0200, Jason A. Donenfeld wrote:
> Rather than merely hoping that the callback gets called on another CPU,
> arrange for that to actually happen, by round robining which CPU the
> timer fires on. This way, on multiprocessor machines, we exacerbate
> jitter by touching the same memory from multiple different cores.
> 
> Cc: Dominik Brodowski <linux@dominikbrodowski.net>
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Cc: Sultan Alsawaf <sultan@kerneltoast.com>
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> ---
>  drivers/char/random.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index fdf15f5c87dd..74627b53179a 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -1209,6 +1209,7 @@ static void __cold try_to_generate_entropy(void)
>  	struct entropy_timer_state stack;
>  	unsigned int i, num_different = 0;
>  	unsigned long last = random_get_entropy();
> +	int cpu = -1;
>  
>  	for (i = 0; i < NUM_TRIAL_SAMPLES - 1; ++i) {
>  		stack.entropy = random_get_entropy();
> @@ -1223,8 +1224,17 @@ static void __cold try_to_generate_entropy(void)
>  	stack.samples = 0;
>  	timer_setup_on_stack(&stack.timer, entropy_timer, 0);
>  	while (!crng_ready() && !signal_pending(current)) {
> -		if (!timer_pending(&stack.timer))
> -			mod_timer(&stack.timer, jiffies);
> +		if (!timer_pending(&stack.timer)) {
> +			preempt_disable();
> +			do {
> +				cpu = cpumask_next(cpu, cpu_online_mask);
> +				if (cpu == nr_cpumask_bits)
> +					cpu = cpumask_first(cpu_online_mask);
> +			} while (cpu == smp_processor_id() && cpumask_weight(cpu_online_mask) > 1);
> +			stack.timer.expires = jiffies;
> +			add_timer_on(&stack.timer, cpu);

Sultan points out that timer_pending() returns false before the function
has actually run, while add_timer_on() adds directly to the timer base,
which means del_timer_sync() might fail to notice a pending timer, which
means UaF. This seems like a somewhat hard problem to solve. So I think
I'll just drop this patch 2/2 here until a better idea comes around.

Jason

Sebastian Andrzej Siewior Oct. 5, 2022, 5:26 p.m. UTC | #2
On 2022-10-01 11:21:30 [+0200], Jason A. Donenfeld wrote:
> Sultan points out that timer_pending() returns false before the function
> has actually run, while add_timer_on() adds directly to the timer base,
> which means del_timer_sync() might fail to notice a pending timer, which
> means UaF. This seems like a somewhat hard problem to solve. So I think
> I'll just drop this patch 2/2 here until a better idea comes around.

I don't know exactly what you intend, but this:

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 79d7d4e4e5828..18d785f5969e5 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1195,6 +1195,7 @@ static void __cold try_to_generate_entropy(void)
 	struct entropy_timer_state stack;
 	unsigned int i, num_different = 0;
 	unsigned long last = random_get_entropy();
+	unsigned int cpu = raw_smp_processor_id();
 
 	for (i = 0; i < NUM_TRIAL_SAMPLES - 1; ++i) {
 		stack.entropy = random_get_entropy();
@@ -1207,10 +1208,17 @@ static void __cold try_to_generate_entropy(void)
 		return;
 
 	stack.samples = 0;
-	timer_setup_on_stack(&stack.timer, entropy_timer, 0);
+	timer_setup_on_stack(&stack.timer, entropy_timer, TIMER_PINNED);
 	while (!crng_ready() && !signal_pending(current)) {
-		if (!timer_pending(&stack.timer))
-			mod_timer(&stack.timer, jiffies + 1);
+
+		if (!timer_pending(&stack.timer)) {
+			cpu = cpumask_next(cpu, cpu_online_mask);
+			if (cpu == nr_cpumask_bits)
+				cpu = cpumask_first(cpu_online_mask);
+
+			stack.timer.expires = jiffies;
+			add_timer_on(&stack.timer, cpu);
+		}
 		mix_pool_bytes(&stack.entropy, sizeof(stack.entropy));
 		schedule();
 		stack.entropy = random_get_entropy();

will enqueue a timer once none is pending. That is on first invocation
_or_ as soon as the callback is about to be invoked. So basically the
timer is about to be called and you enqueue it right away.
With "expires = jiffies" the timer will be invoked on every tick while
"jiffies + 1" will invoke it on every other tick.

You will start the timer on "this-CPU + 1" and iterate it in a round
robin fashion through all CPUs. It seems this is important. I don't
think that you need to ensure that the CPU running
try_to_generate_entropy() will not fire the timer since it won't happen
most of the time (due to the round-robin thingy). This is (of course)
different between a busy system and an idle one.

That del_timer_sync() at the end is what you want. If the timer is
pending (as in enqueued in the timer wheel) then it will be removed
before it is invoked. If the timer's callback is invoked then it will
spin until the callback is done.

I *think* you are aware that schedule() here is kind of pointless,
because if there is not much going on (this is the only task in the
system), then you leave schedule() right away and continue. Assuming
random_get_entropy() returns the current clock (which is either rdtsc
on x86 or random_get_entropy_fallback() elsewhere), you get little
noise.

With some additional trace prints:

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 79d7d4e4e5828..802e0d9254611 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1195,6 +1195,8 @@ static void __cold try_to_generate_entropy(void)
 	struct entropy_timer_state stack;
 	unsigned int i, num_different = 0;
 	unsigned long last = random_get_entropy();
+	unsigned int cpu = raw_smp_processor_id();
+	unsigned long v1, v2;
 
 	for (i = 0; i < NUM_TRIAL_SAMPLES - 1; ++i) {
 		stack.entropy = random_get_entropy();
@@ -1207,15 +1209,26 @@ static void __cold try_to_generate_entropy(void)
 		return;
 
 	stack.samples = 0;
-	timer_setup_on_stack(&stack.timer, entropy_timer, 0);
+	timer_setup_on_stack(&stack.timer, entropy_timer, TIMER_PINNED);
+	v1 = v2 = 0;
 	while (!crng_ready() && !signal_pending(current)) {
-		if (!timer_pending(&stack.timer))
-			mod_timer(&stack.timer, jiffies + 1);
+
+		if (!timer_pending(&stack.timer)) {
+			cpu = cpumask_next(cpu, cpu_online_mask);
+			if (cpu == nr_cpumask_bits)
+				cpu = cpumask_first(cpu_online_mask);
+
+			stack.timer.expires = jiffies;
+			add_timer_on(&stack.timer, cpu);
+		}
 		mix_pool_bytes(&stack.entropy, sizeof(stack.entropy));
 		schedule();
-		stack.entropy = random_get_entropy();
+		v1 = random_get_entropy();
+		stack.entropy = v1;
+		trace_printk("%lx | %lx\n", v1, v1 - v2);
+		v2 = v1;
 	}
-
+	tracing_off();
 	del_timer_sync(&stack.timer);
 	destroy_timer_on_stack(&stack.timer);
 	mix_pool_bytes(&stack.entropy, sizeof(stack.entropy));

I get:

|       swapper/0-1       [002] .....     2.570083: try_to_generate_entropy: 275e8a56d | 2e4
|       swapper/0-1       [002] .....     2.570084: try_to_generate_entropy: 275e8a82c | 2bf
|       swapper/0-1       [002] .....     2.570084: try_to_generate_entropy: 275e8ab10 | 2e4
|       swapper/0-1       [002] .....     2.570084: try_to_generate_entropy: 275e8adcf | 2bf
|       swapper/0-1       [002] .....     2.570084: try_to_generate_entropy: 275e8b0b3 | 2e4
|       swapper/0-1       [002] .....     2.570084: try_to_generate_entropy: 275e8b372 | 2bf
|       swapper/0-1       [002] .....     2.570085: try_to_generate_entropy: 275e8b85c | 4ea
|       swapper/0-1       [002] .....     2.570085: try_to_generate_entropy: 275e8bb1b | 2bf
|       swapper/0-1       [002] .....     2.570085: try_to_generate_entropy: 275e8be49 | 32e
|       swapper/0-1       [002] .....     2.570085: try_to_generate_entropy: 275e8c12d | 2e4
|       swapper/0-1       [002] .....     2.570087: try_to_generate_entropy: 275e8de15 | 1ce8
|       swapper/0-1       [002] .....     2.570088: try_to_generate_entropy: 275e8e168 | 353
|       swapper/0-1       [002] .....     2.570088: try_to_generate_entropy: 275e8e471 | 309
|       swapper/0-1       [002] .....     2.570088: try_to_generate_entropy: 275e8e833 | 3c2
|       swapper/0-1       [002] .....     2.570088: try_to_generate_entropy: 275e8edd6 | 5a3

So with sizeof(entropy) = 8 bytes, you mix in 8 bytes of which only the
lower bits change a little.
This is maybe where you say that I don't need to worry, because it is a
very good hash function and the timer credits only one bit of entropy
every jiffy.

> Jason

Sebastian

Jason A. Donenfeld Oct. 5, 2022, 9:08 p.m. UTC | #3
Hi Sebastian,

On Wed, Oct 05, 2022 at 07:26:42PM +0200, Sebastian Andrzej Siewior wrote:
> That del_timer_sync() at the end is what you want. If the timer is
> pending (as in enqueued in the timer wheel) then it will be removed
> before it is invoked. If the timer's callback is invoked then it will
> spin until the callback is done.

del_timer_sync() is not guaranteed to succeed with add_timer_on() being
used in conjunction with timer_pending() though. That's why I've
abandoned this.

Jason

Sebastian Andrzej Siewior Oct. 6, 2022, 6:46 a.m. UTC | #4
On 2022-10-05 23:08:19 [+0200], Jason A. Donenfeld wrote:
> Hi Sebastian,
Hi Jason,

> On Wed, Oct 05, 2022 at 07:26:42PM +0200, Sebastian Andrzej Siewior wrote:
> > That del_timer_sync() at the end is what you want. If the timer is
> > pending (as in enqueued in the timer wheel) then it will be removed
> > before it is invoked. If the timer's callback is invoked then it will
> > spin until the callback is done.
> 
> del_timer_sync() is not guaranteed to succeed with add_timer_on() being
> used in conjunction with timer_pending() though. That's why I've
> abandoned this.

But why? The timer is added to a timer-base on a different CPU. Should
work.

> Jason

Sebastian

Patch

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 64ee16ffb8b7..fdf15f5c87dd 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1205,7 +1205,7 @@  static void __cold entropy_timer(struct timer_list *timer)
  */
 static void __cold try_to_generate_entropy(void)
 {
-	enum { NUM_TRIAL_SAMPLES = 8192, MAX_SAMPLES_PER_BIT = HZ / 30 };
+	enum { NUM_TRIAL_SAMPLES = 8192, MAX_SAMPLES_PER_BIT = HZ / 15 };
 	struct entropy_timer_state stack;
 	unsigned int i, num_different = 0;
 	unsigned long last = random_get_entropy();
@@ -1224,7 +1224,7 @@  static void __cold try_to_generate_entropy(void)
 	timer_setup_on_stack(&stack.timer, entropy_timer, 0);
 	while (!crng_ready() && !signal_pending(current)) {
 		if (!timer_pending(&stack.timer))
-			mod_timer(&stack.timer, jiffies + 1);
+			mod_timer(&stack.timer, jiffies);
 		mix_pool_bytes(&stack.entropy, sizeof(stack.entropy));
 		schedule();
 		stack.entropy = random_get_entropy();