Message ID | ZvEFQAWVgWNd9j7e@gondor.apana.org.au |
---|---|
State | Accepted |
Commit | fb10c7a84661471cdcc8998d63703211b873c126 |
Headers | show |
Series | hwrng: core - Add WARN_ON for buggy read return values | expand |
On Mon Sep 23, 2024 at 5:31 PM EEST, Jarkko Sakkinen wrote: > Greg, did we have something under Documentation/ that would fully > address the use of WARN? ... personally I think that even if there was a separate document, this should be somehow addressed already in SubmittingPatches so that it can't be possibly missed by anyone. BR, Jarkko
On Mon, Sep 23, 2024 at 04:48:27PM +0200, Greg KH wrote: > > Please see: > https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on > which describes that. We should make it more explicit that any WARN() > or WARN_ON() calls that can be hit by user interactions somehow, will > end up getting a CVE id when we fix it up to not do so. If the aformentioned WARN_ON hits, then the driver has probabaly already done a buffer overrun so it's a CVE anyway. Cheers,
On Tue Sep 24, 2024 at 1:32 AM EEST, Herbert Xu wrote: > On Mon, Sep 23, 2024 at 04:48:27PM +0200, Greg KH wrote: > > > > Please see: > > https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on > > which describes that. We should make it more explicit that any WARN() > > or WARN_ON() calls that can be hit by user interactions somehow, will > > end up getting a CVE id when we fix it up to not do so. > > If the aformentioned WARN_ON hits, then the driver has probabaly > already done a buffer overrun so it's a CVE anyway. We'll see I finally got into testing this. Sorry for latencies, I'm switching jobs and unfortunately German Post Office lost my priority mail containing contracts (sent them from Finland to Berlin) so have been signing, scanning etc. the whole day :-) My last week in the current job, and next week is the first in the new job, so this week is a bit bumpy. BR, Jarkko
On Tue, Sep 24, 2024 at 08:43:00PM +0300, Jarkko Sakkinen wrote: > > Without any traces that would provide more information I don't see > the smoking gun. I haven't confirmed that it's definitely the tpm2 driver, it's just based on the backtrace. Hopefully my patch will confirm it one way or the other. Here is the backtrace: [ 100.784159] vmd 0000:c2:00.5: Bound to PCI domain 10002 [ 100.786209] Monitor-Mwait will be used to enter C-1 state [ 100.786225] Monitor-Mwait will be used to enter C-2 state [ 100.786244] ACPI: \_SB_.SCK0.C000: Found 2 idle states [ 100.823093] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0 [ 100.823636] ACPI: button: Power Button [PWRF] [ 100.905756] ERST: Error Record Serialization Table (ERST) support is initialized. [ 100.905858] pstore: Using crash dump compression: deflate [ 100.905861] pstore: Registered erst as persistent store backend [ 100.907044] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled [ 100.908305] 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A [ 100.926608] 00:04: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 16550A [ 100.942953] Non-volatile memory driver v1.3 [ 100.947908] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0x1B, rev-id 22) [ 101.226913] ACPI: bus type drm_connector registered [ 101.229708] alg: ecdh-nist-p256 (ecdh-nist-p256-generic) is disabled due to FIPS [ 101.229745] tpm tpm0: crypto ecdh allocation failed [ 101.236311] tpm tpm0: A TPM error (708) occurred start auth session [ 101.238797] ================================================================== [ 101.238800] BUG: KASAN: slab-out-of-bounds in blake2s_update+0x135/0x2b0 [ 101.238808] Read of size 44 at addr ff11000167334d98 by task hwrng/318 [ 101.238811] [ 101.238813] CPU: 26 UID: 0 PID: 318 Comm: hwrng Not tainted 6.11.0-0.rc5.22.el10.x86_64+debug #1 [ 101.238818] Hardware name: Supermicro SSG-110P-NTR10-EI018/X12SPO-NTF, BIOS 1.3 05/20/2022 [ 101.238820] Call Trace: [ 101.238823] <TASK> [ 101.238826] dump_stack_lvl+0x6f/0xb0 [ 101.238833] ? blake2s_update+0x135/0x2b0 [ 101.238836] print_address_description.constprop.0+0x88/0x330 [ 101.238843] ? blake2s_update+0x135/0x2b0 [ 101.238847] print_report+0x108/0x209 [ 101.238851] ? blake2s_update+0x135/0x2b0 [ 101.238855] ? __virt_addr_valid+0x20b/0x440 [ 101.238859] ? blake2s_update+0x135/0x2b0 [ 101.238863] kasan_report+0xa8/0xe0 [ 101.238868] ? blake2s_update+0x135/0x2b0 [ 101.238874] kasan_check_range+0x10f/0x1f0 [ 101.238879] __asan_memcpy+0x23/0x60 [ 101.238883] blake2s_update+0x135/0x2b0 [ 101.238887] add_hwgenerator_randomness+0x3d/0xe0 [ 101.238895] hwrng_fillfn+0x144/0x270 [ 101.238900] ? __pfx_hwrng_fillfn+0x10/0x10 [ 101.238904] kthread+0x2d2/0x3a0 [ 101.238908] ? __pfx_kthread+0x10/0x10 [ 101.238912] ret_from_fork+0x31/0x70 [ 101.238917] ? __pfx_kthread+0x10/0x10 [ 101.238920] ret_from_fork_asm+0x1a/0x30 [ 101.238929] </TASK> [ 101.238931] [ 101.238932] Allocated by task 1: [ 101.238934] kasan_save_stack+0x30/0x50 [ 101.238937] kasan_save_track+0x14/0x30 [ 101.238940] __kasan_kmalloc+0x8f/0xa0 [ 101.238942] __kmalloc_noprof+0x1fe/0x410 [ 101.238947] kobj_map+0x7e/0x6d0 [ 101.238951] cdev_add+0x92/0x180 [ 101.238954] tty_cdev_add+0x17a/0x280 [ 101.238957] tty_register_device_attr+0x401/0x740 [ 101.238960] tty_register_driver+0x381/0x6f0 [ 101.238963] vty_init+0x2c1/0x2f0 [ 101.238967] tty_init+0x13b/0x150 [ 101.238970] do_one_initcall+0x11c/0x5c0 [ 101.238975] do_initcalls+0x1b4/0x1f0 [ 101.238980] kernel_init_freeable+0x4ae/0x520 [ 101.238984] kernel_init+0x1c/0x150 [ 101.238988] ret_from_fork+0x31/0x70 [ 101.238991] ret_from_fork_asm+0x1a/0x30 [ 101.238994] [ 101.238995] The buggy address belongs to the object at ff11000167334d80 [ 101.238995] which belongs to the cache kmalloc-64 of size 64 [ 101.238998] The buggy address is located 24 bytes inside of [ 101.238998] allocated 56-byte region [ff11000167334d80, ff11000167334db8) [ 101.239002] [ 101.239003] The buggy address belongs to the physical page: [ 101.239004] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x167334 [ 101.239008] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) [ 101.239012] page_type: 0xfdffffff(slab) [ 101.239016] raw: 0017ffffc0000000 ff1100010003c8c0 dead000000000122 0000000000000000 [ 101.239019] raw: 0000000000000000 0000000000200020 00000001fdffffff 0000000000000000 [ 101.239021] page dumped because: kasan: bad access detected [ 101.239023] [ 101.239024] Memory state around the buggy address: [ 101.239025] ff11000167334c80: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc [ 101.239028] ff11000167334d00: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc [ 101.239030] >ff11000167334d80: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc [ 101.239031] ^ [ 101.239033] ff11000167334e00: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc [ 101.239035] ff11000167334e80: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc [ 101.239037] ================================================================== [ 101.383067] rdac: device handler registered [ 101.383412] hp_sw: device handler registered [ 101.383415] emc: device handler registered [ 101.383879] alua: device handler registered [ 101.391255] xhci_hcd 0000:00:14.0: xHCI Host Controller [ 101.391892] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 1 [ 101.393706] xhci_hcd 0000:00:14.0: hcc params 0x200077c1 hci version 0x100 quirks 0x0000000000009810 [ 101.399646] xhci_hcd 0000:00:14.0: xHCI Host Controller [ 101.400136] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 2 [ 101.400163] xhci_hcd 0000:00:14.0: Host supports USB 3.0 SuperSpeed [ 101.400818] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 6.11 [ 101.400823] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [ 101.400826] usb usb1: Product: xHCI Host Controller [ 101.400829] usb usb1: Manufacturer: Linux 6.11.0-0.rc5.22.el10.x86_64+debug xhci-hcd [ 101.400832] usb usb1: SerialNumber: 0000:00:14.0 [ 101.403055] hub 1-0:1.0: USB hub found [ 101.403222] hub 1-0:1.0: 16 ports detected [ 101.657974] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 6.11 [ 101.657982] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [ 101.657986] usb usb2: Product: xHCI Host Controller [ 101.657990] usb usb2: Manufacturer: Linux 6.11.0-0.rc5.22.el10.x86_64+debug xhci-hcd [ 101.657993] usb usb2: SerialNumber: 0000:00:14.0 [ 101.660659] hub 2-0:1.0: USB hub found [ 101.660882] hub 2-0:1.0: 10 ports detected {code} Thanks,
On Fri, 2024-09-27 at 08:42 +0800, Herbert Xu wrote: > On Tue, Sep 24, 2024 at 08:43:00PM +0300, Jarkko Sakkinen wrote: > > > > Without any traces that would provide more information I don't see > > the smoking gun. > > I haven't confirmed that it's definitely the tpm2 driver, it's just > based on the backtrace. Hopefully my patch will confirm it one way > or the other. Here is the backtrace: Agreed. > > [ 100.784159] vmd 0000:c2:00.5: Bound to PCI domain 10002 > [ 100.786209] Monitor-Mwait will be used to enter C-1 state > [ 100.786225] Monitor-Mwait will be used to enter C-2 state > [ 100.786244] ACPI: \_SB_.SCK0.C000: Found 2 idle states > [ 100.823093] input: Power Button as > /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0 > [ 100.823636] ACPI: button: Power Button [PWRF] > [ 100.905756] ERST: Error Record Serialization Table (ERST) support > is initialized. > [ 100.905858] pstore: Using crash dump compression: deflate > [ 100.905861] pstore: Registered erst as persistent store backend > [ 100.907044] Serial: 8250/16550 driver, 4 ports, IRQ sharing > enabled > [ 100.908305] 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = > 115200) is a 16550A > [ 100.926608] 00:04: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = > 115200) is a 16550A > [ 100.942953] Non-volatile memory driver v1.3 > [ 100.947908] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0x1B, rev-id > 22) > [ 101.226913] ACPI: bus type drm_connector registered > [ 101.229708] alg: ecdh-nist-p256 (ecdh-nist-p256-generic) is > disabled due to FIPS > [ 101.229745] tpm tpm0: crypto ecdh allocation failed > [ 101.236311] tpm tpm0: A TPM error (708) occurred start auth I guess it is TPM2_StartAuthSession returning TPM_RC_VALUE. Probably James should look into this as the bus encryption code is clearly tripping here. I'm on second week on a new job so cannot promise any bandwidth yet this week. Earliest next week... BR, Jarkko
diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c index 57c51efa5613..018316f54621 100644 --- a/drivers/char/hw_random/core.c +++ b/drivers/char/hw_random/core.c @@ -181,8 +181,15 @@ static inline int rng_get_data(struct hwrng *rng, u8 *buffer, size_t size, int present; BUG_ON(!mutex_is_locked(&reading_mutex)); - if (rng->read) - return rng->read(rng, (void *)buffer, size, wait); + if (rng->read) { + int err; + + err = rng->read(rng, buffer, size, wait); + if (WARN_ON_ONCE(err > 0 && err > size)) + err = size; + + return err; + } if (rng->data_present) present = rng->data_present(rng, wait);
Dear TPM maintainers: Please have a look at the tpm hwrng driver because it appears to be returning a length longer than the buffer length that we gave it. In particular, tpm2 appears to be the culprit (though I didn't really check tpm1 at all so it could also be buggy). The following patch hopefully should confirm that this is indeed caused by TPM and not some other HWRNG driver. ---8<--- If a buggy driver returns a length that is longer than the size of the buffer provided to it, then this may lead to a buffer overread in the caller. Stop this by adding a check for it in the hwrng core. Reported-by: Guangwu Zhang <guazhang@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>