Message ID | 20220405140906.222350-1-Jason@zx2c4.com |
---|---|
State | New |
Headers | show |
Series | random: opportunistically initialize on /dev/urandom reads | expand |
On Tue, Apr 5, 2022 at 7:10 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > Practically speaking, this means that at least on x86, /dev/urandom > becomes safe. Probably other architectures with working cycle counters > will also become safe. And architectures with slow or broken cycle > counters at least won't be affected at all by this change. I think this is a good change, as it's a bit pointless to warn about uninitialized random data if we can just initialize it. I do wonder if it wouldn't be better to perhaps move this all into wait_for_random_bytes(), though, and add an argument to that function for "no delay". Because I think we should at the same time also add a warning to wait_for_random_bytes() for the "uhhhuh, it timed out". Right now wait_for_random_bytes() returns an error that most people then just ignore. Including drivers/net/wireguard/cookie.c. So instead of returning an error that nobody can do much about, how about we move the warning code into wait_for_random_bytes()? And make that urandom_read() call the same wait_for_random_bytes() that random_read() calls, just with GRND_NONBLOCK as an argument? Not a big deal. Your patch is fine by me too. Linus
Hi Linus, On Tue, Apr 5, 2022 at 7:37 PM Linus Torvalds <torvalds@linux-foundation.org> wrote: > Right now wait_for_random_bytes() returns an error that most people > then just ignore. Including drivers/net/wireguard/cookie.c. > > So instead of returning an error that nobody can do much about, how > about we move the warning code into wait_for_random_bytes()? > I think this is a good change, as it's a bit pointless to warn about > uninitialized random data if we can just initialize it. WireGuard's usage of these APIs breaks down to: 1) in receive.c, rng_is_initialized() is checked, and incoming handshake & cookie packets are dropped if the RNG isn't initialized, so that an attacker can't queue up tons of work to do before it can be done. 2) in noise.c, wait_for_random_bytes() is called before taking locks, because later curve25519_generate_secret() uses get_random_bytes_wait() internally. This happens in a worker, so wait_for_random_bytes() can't fail, since there's no default-enabled signal delivery (I thinkā½ That's been my assumption anyhow.) This actually is just out of an abundance of caution, because step (1) means we'll never hit this uninitialized. 3) in cookie.c, get_random_bytes_wait() is called so that we don't leak premature randomness via the rather large nonce parameter. But the same caveats as (2) apply: worker, so no signals, and protected by (1) still. If my assumption about signal delivery is wrong, I'll need to revisit this. But anyway I think that's what explains why some of those cases check the return value and others don't, and why get_random_bytes_wait() isn't a __must_check. > I do wonder if it wouldn't be better to perhaps move this all into > wait_for_random_bytes(), though, and add an argument to that function > for "no delay". > > Because I think we should at the same time also add a warning to > wait_for_random_bytes() for the "uhhhuh, it timed out". > > So instead of returning an error that nobody can do much about, how > about we move the warning code into wait_for_random_bytes()? Just so we're on the same page here, wait_for_random_bytes() does this now: while (!crng_ready()) { int ret; try_to_generate_entropy(); ret = wait_event_interruptible_timeout(crng_init_wait, crng_ready(), HZ); if (ret) return ret > 0 ? 0 : ret; } So it either eventually returns 0, or it gets interrupted by a signal. It never times out without trying again. It sounds like your suggestion would be to make that: while (!crng_ready()) { int ret; try_to_generate_entropy(); if (nodelay && !crng_ready()) { warn(...); return -EBUSY; } ret = wait_event_interruptible_timeout(crng_init_wait, crng_ready(), HZ); if (ret) return ret > 0 ? 0 : ret; } or maybe you want to always wait at least a second, a la: while (!crng_ready()) { int ret; try_to_generate_entropy(); ret = wait_event_interruptible_timeout(crng_init_wait, crng_ready(), HZ); if (ret) return ret > 0 ? 0 : ret; if (nodelay && !ret) { warn(...); return -EBUSY; } } I guess we could do one of these, though IMHO it's a bit awkward, making for a sort of, "wait, but don't actually" circumstance. Though, I can see the appeal of having only one caller of try_to_generate_entropy(), tied to one circumstance, and fit all the things through that circumstance. Six of one... Jason
Does this help? I have some code I'm not yet ready to submit as patches. Here's some of it that could be used to initialise the pool (& I think also the counter for chacha). The paper linked in the comments suggests that adding a bit reversal would improve diffusion, but I have not done that yet. /************************************************************************** * Load a 64-bit word with data from whatever source we have * * arch_get_random_long() * hardware RNG * emulated HWRNG in a VM * * When there are two sources, alternate. * * If you have no better source, or if one fails, * or if the argument 'fast' is set, then fall back * to random_get_entropy(). * * Also use random_get_entropy() sometimes even * if we have a good source, to avoid trusting * the source completely ***************************************************************************/ static int load_count = 0; static spinlock_t source_lock; static unsigned long source_value __latent_entropy ; #define rotl64(x,n) ((x>>(n)) | (x<<(n))) static int get_hw_long(unsigned long *x) { int ret ; unsigned s = sizeof(unsigned long) ; ret = get_random_bytes_arch((u8 *) x, s) ; return (ret == s) ? 1 : 0 ; } /* This should be a Mersenne number, (2^x)-1 */ #define MIX_MASK 15 #define GOT_A IS_ENABLED(CONFIG_ARCH_RANDOM) #define GOT_H IS_ENABLED(CONFIG_HW_RANDOM) static unsigned long get_64(int fast) { int ret = 0 ; unsigned long x, flags ; if (!fast && (GOT_A||GOT_H) && (load_count&MIX_MASK)) { if (GOT_A && GOT_H) { if (load_count & 1) ret = arch_get_random_long(&x) ; else ret = get_hw_long(&x) ; /* * if the chosen source failed * then try the other */ if (!ret) if (load_count & 1) ret = get_hw_long(&x) ; else ret = arch_get_random_long(&x) ; } if (GOT_A && !GOT_H) ret = arch_get_random_long(&x) ; if (GOT_H && !GOT_A) ret = get_hw_long(&x) ; } /* * fast is nonzero, so not trying expensive methods * * or no source configured, neither GOT_A nor GOT_H set * or configured one(s) failed, ret is still zero * * or it is just time for a different source * (load_count&MIX_MASK) == 0 */ if (!ret) x = random_get_entropy() ; /* * use 19-bit rotation, based on * https://eprint.iacr.org/2021/523.pdf */ spin_lock_irqsave(&source_lock, flags); source_value = rotl64(source_value, 19) ^ x ; load_count++ ; spin_unlock_irqrestore(&source_lock, flags); memzero_explicit(x, sizeof(x)) ; return(source_value) ; }
diff --git a/drivers/char/random.c b/drivers/char/random.c index ee3ad2ba0942..388025d6d38d 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -1534,6 +1534,13 @@ static ssize_t urandom_read(struct file *file, char __user *buf, size_t nbytes, { static int maxwarn = 10; + /* + * Opportunistically attempt to initialize the RNG on platforms that + * have fast cycle counters, but don't (for now) require it to succeed. + */ + if (!crng_ready()) + try_to_generate_entropy(); + if (!crng_ready() && maxwarn > 0) { maxwarn--; if (__ratelimit(&urandom_warning))
In 6f98a4bfee72 ("random: block in /dev/urandom"), we tried to make a successful try_to_generate_entropy() call *required* if the RNG was not already initialized. Unfortunately, weird architectures and old userspaces combined in TCG test harnesses, making that change still not realistic, so it was reverted in 0313bc278dac ("Revert "random: block in /dev/urandom""). However, rather than making a successful try_to_generate_entropy() call *required*, we can instead make it *best-effort*. If try_to_generate_entropy() fails, it fails, and nothing changes from the current behavior. If it succeeds, then /dev/urandom becomes safe to use for free. This way, we don't risk the regression potential that led to us reverting the required-try_to_generate_entropy() call before. Practically speaking, this means that at least on x86, /dev/urandom becomes safe. Probably other architectures with working cycle counters will also become safe. And architectures with slow or broken cycle counters at least won't be affected at all by this change. So it may not be the glorious "all things are unified!" change we were hoping for initially, but practically speaking, it makes a positive impact. Cc: Theodore Ts'o <tytso@mit.edu> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> --- drivers/char/random.c | 7 +++++++ 1 file changed, 7 insertions(+)