Message ID | 20210913203201.1844253-1-ndesaulniers@google.com |
---|---|
State | New |
Headers | show |
Series | [5.10] overflow.h: use new generic division helpers to avoid / operator | expand |
On 13/09/2021 22.32, Nick Desaulniers wrote: > commit fad7cd3310db ("nbd: add the check to prevent overflow in > __nbd_ioctl()") raised an issue from the fallback helpers added in > commit f0907827a8a9 ("compiler.h: enable builtin overflow checkers and > add fallback code") > > ERROR: modpost: "__divdi3" [drivers/block/nbd.ko] undefined! > > As Stephen Rothwell notes: > The added check_mul_overflow() call is being passed 64 bit values. > COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW is not set for this build (see > include/linux/overflow.h). > > Specifically, the helpers for checking whether the results of a > multiplication overflowed (__unsigned_mul_overflow, > __signed_add_overflow) use the division operator when > !COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW. This is problematic for 64b > operands on 32b hosts. > > This was fixed upstream by > commit 76ae847497bc ("Documentation: raise minimum supported version of > GCC to 5.1") > which is not suitable to be backported to stable; I didn't have this > patch ready in time. > > Cc: stable@vger.kernel.org > Cc: Arnd Bergmann <arnd@kernel.org> > Cc: Kees Cook <keescook@chromium.org> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> > Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> > Reported-by: Nathan Chancellor <nathan@kernel.org> > Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> > Suggested-by: Pavel Machek <pavel@ucw.cz> > Link: https://github.com/ClangBuiltLinux/linux/issues/1438 > Link: https://lore.kernel.org/all/20210909182525.372ee687@canb.auug.org.au/ > Link: https://lore.kernel.org/lkml/20210910234047.1019925-1-ndesaulniers@google.com/ > Fixes: f0907827a8a9 ("compiler.h: enable builtin overflow checkers and > add fallback code") > Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> > --- > This kind of generic meta-programming in C and my lack of experience in > doing so makes me very uncomfortable (and slightly ashamed) to send > this. I would appreciate careful review and feedback. I would appreciate > if Naresh can test this with GCC 4.9, which I don't have handy. > > Linus also suggested I look into the use of _Generic; I haven't > evaluated that approach yet, but I felt like posting this early for > feedback. > > include/linux/math64.h | 35 +++++++++++++++++++++++++++++++++++ > include/linux/overflow.h | 8 ++++---- > 2 files changed, 39 insertions(+), 4 deletions(-) > > diff --git a/include/linux/math64.h b/include/linux/math64.h > index 66deb1fdc2ef..bc9c12c168d0 100644 > --- a/include/linux/math64.h > +++ b/include/linux/math64.h > @@ -10,6 +10,9 @@ > > #define div64_long(x, y) div64_s64((x), (y)) > #define div64_ul(x, y) div64_u64((x), (y)) > +#ifndef is_signed_type > +#define is_signed_type(type) (((type)(-1)) < (type)1) > +#endif > > /** > * div_u64_rem - unsigned 64bit divide with 32bit divisor with remainder > @@ -111,6 +114,15 @@ extern s64 div64_s64(s64 dividend, s64 divisor); > > #endif /* BITS_PER_LONG */ Some comments on when and how to use this would be nice (not just build bugs when used wrong). > +#define div64_x64(dividend, divisor) ({ \ > + BUILD_BUG_ON_MSG(sizeof(dividend) < sizeof(u64),\ > + "prefer div_x64"); \ > + __builtin_choose_expr( \ > + is_signed_type(typeof(dividend)), \ > + div64_s64(dividend, divisor), \ > + div64_u64(dividend, divisor)); \ > +}) > + > /** > * div_u64 - unsigned 64bit divide with 32bit divisor > * @dividend: unsigned 64bit dividend > @@ -141,6 +153,29 @@ static inline s64 div_s64(s64 dividend, s32 divisor) > } > #endif > > +#define div_x64(dividend, divisor) ({ \ > + BUILD_BUG_ON_MSG(sizeof(dividend) > sizeof(u32),\ > + "prefer div64_x64"); \ > + __builtin_choose_expr( \ > + is_signed_type(typeof(dividend)), \ > + div_s64(dividend, divisor), \ > + div_u64(dividend, divisor)); \ > +}) > + > +#define div_64(dividend, divisor) ({ \ > + BUILD_BUG_ON_MSG(sizeof(dividend) > sizeof(u64), \ > + "128b div unsupported"); \ > + __builtin_choose_expr( \ > + __builtin_types_compatible_p(typeof(dividend), s64) || \ > + __builtin_types_compatible_p(typeof(dividend), u64), \ You can save a bit of typing using __same_type(dividend, s64) - it's a nice property of typeof() that it's idempotent when applied to a type name. _Generic would probably also do, but I don't think it would save that much, if anything, here. > u32 iter_div_u64_rem(u64 dividend, u32 divisor, u64 *remainder); > > #ifndef mul_u32_u32 > diff --git a/include/linux/overflow.h b/include/linux/overflow.h > index ef74051d5cfe..2ebdf220c184 100644 > --- a/include/linux/overflow.h > +++ b/include/linux/overflow.h > @@ -123,8 +123,8 @@ static inline bool __must_check __must_check_overflow(bool overflow) > (void) (&__a == __d); \ > *__d = __a * __b; \ > __builtin_constant_p(__b) ? \ > - __b > 0 && __a > type_max(typeof(__a)) / __b : \ > - __a > 0 && __b > type_max(typeof(__b)) / __a; \ > + __b > 0 && __a > div_64(type_max(typeof(__a)), __b) : \ > + __a > 0 && __b > div_64(type_max(typeof(__b)), __a); \ > }) > > /* > @@ -195,8 +195,8 @@ static inline bool __must_check __must_check_overflow(bool overflow) > (void) (&__a == &__b); \ > (void) (&__a == __d); \ > *__d = (u64)__a * (u64)__b; \ > - (__b > 0 && (__a > __tmax/__b || __a < __tmin/__b)) || \ > - (__b < (typeof(__b))-1 && (__a > __tmin/__b || __a < __tmax/__b)) || \ > + (__b > 0 && (__a > div_64(__tmax, __b) || __a < div_64(__tmin, __b))) || \ > + (__b < (typeof(__b))-1 && (__a > div_64(__tmin, __b) || __a < div_64(__tmax, __b))) || \ > (__b == (typeof(__b))-1 && __a == __tmin); \ > }) I had actually forgotten what horrors lay hidden in these fallback macros, I just knew they needed a wooden stake sooner or later. Now you made me look at this right before bedtime :( So, I'd sleep a little better if we got the 64 bit tests commented back in in test_overflow.c, and [assuming that the above would actually make that file build with gcc 4.9] that patch also backported to 5.10, so we had some confidence that the whole house of cards is actually solid. Rasmus
On Mon, Sep 13, 2021 at 11:05:02PM +0200, Rasmus Villemoes wrote: > So, I'd sleep a little better if we got the 64 bit tests commented back > in in test_overflow.c, and [assuming that the above would actually make > that file build with gcc 4.9] that patch also backported to 5.10, so we > had some confidence that the whole house of cards is actually solid. Yeah, I'm all for that too. -- Kees Cook
On Tue, Sep 14, 2021 at 10:32 AM Kees Cook <keescook@chromium.org> wrote: > > On Mon, Sep 13, 2021 at 11:05:02PM +0200, Rasmus Villemoes wrote: > > So, I'd sleep a little better if we got the 64 bit tests commented back > > in in test_overflow.c, and [assuming that the above would actually make > > that file build with gcc 4.9] that patch also backported to 5.10, so we > > had some confidence that the whole house of cards is actually solid. > > Yeah, I'm all for that too. Hmm. Another thing that might be worth doing is to just say "don't do that". IOW, don't do silly overflow checks with 64-bit things on 32-bit architectures. Apparently the first and only use case of this is the ndb code, and the fact is, that ndb code is just being truly horrendously stupid. That 'blksize' thing shouldn't be a blocksize, it should be a blockshift instead. The code to set blksize already expressly verifies that it's a power-of-2, and less or equial to PAGE_SIZE so this: loff_t blksize; is just plain silly. Doubly silly. Why is it a 'loff_t'? It should never have been a loff_t in the first place, and honestly, it really should have been a shift count that fits in a few bits. All this pain could have been trivially avoided with just people writing better code, knowing that multiplies and divides are expensive, and that shift counts are small and cheap. Linus
On Tue, Sep 14, 2021 at 11:14 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > All this pain could have been trivially avoided with just people > writing better code, knowing that multiplies and divides are > expensive, and that shift counts are small and cheap. IOW, maybe the fix is just this attached trivial patch. I didn't bother to change the order of the 'struct ndb_config' structure. It would pack better if you put the (now 32-bit) blksize_bits field next to the 'atomic_t' field, I think. But I wanted to just see how a minimal patch looks. I did make the debugfs interface reflect the change to blocksize_bits, so this is visible in user space. But it's debugfs. If people care, it could be made to use a DEFINE_SHOW_ATTRIBUTE() function the way it already does for 'flags', so that's not a fundamental issue, I just didn't bother. Hmm? Btw, I really think more of the block layer should perhaps think about use bit shifts more, not expanded values. Can things like the queue 'discard_alignment' really be non-powers-of-two? Linus drivers/block/nbd.c | 30 ++++++++++++++++++------------ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 5170a630778d..9e12edf892d7 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -97,13 +97,18 @@ struct nbd_config { atomic_t recv_threads; wait_queue_head_t recv_wq; - loff_t blksize; + unsigned blksize_bits; loff_t bytesize; #if IS_ENABLED(CONFIG_DEBUG_FS) struct dentry *dbg_dir; #endif }; +static inline unsigned int nbd_blksize(struct nbd_config *config) +{ + return 1u << config->blksize_bits; +} + struct nbd_device { struct blk_mq_tag_set tag_set; @@ -146,7 +151,7 @@ static struct dentry *nbd_dbg_dir; #define NBD_MAGIC 0x68797548 -#define NBD_DEF_BLKSIZE 1024 +#define NBD_DEF_BLKSIZE_BITS 10 static unsigned int nbds_max = 16; static int max_part = 16; @@ -317,12 +322,12 @@ static int nbd_set_size(struct nbd_device *nbd, loff_t bytesize, loff_t blksize) { if (!blksize) - blksize = NBD_DEF_BLKSIZE; + blksize = 1u << NBD_DEF_BLKSIZE_BITS; if (blksize < 512 || blksize > PAGE_SIZE || !is_power_of_2(blksize)) return -EINVAL; nbd->config->bytesize = bytesize; - nbd->config->blksize = blksize; + nbd->config->blksize_bits = __ffs(blksize); if (!nbd->task_recv) return 0; @@ -1337,7 +1342,7 @@ static int nbd_start_device(struct nbd_device *nbd) args->index = i; queue_work(nbd->recv_workq, &args->work); } - return nbd_set_size(nbd, config->bytesize, config->blksize); + return nbd_set_size(nbd, config->bytesize, nbd_blksize(config)); } static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *bdev) @@ -1406,11 +1411,12 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd, case NBD_SET_BLKSIZE: return nbd_set_size(nbd, config->bytesize, arg); case NBD_SET_SIZE: - return nbd_set_size(nbd, arg, config->blksize); + return nbd_set_size(nbd, arg, nbd_blksize(config)); case NBD_SET_SIZE_BLOCKS: - if (check_mul_overflow((loff_t)arg, config->blksize, &bytesize)) + bytesize = arg << config->blksize_bits; + if (bytesize >> config->blksize_bits != arg) return -EINVAL; - return nbd_set_size(nbd, bytesize, config->blksize); + return nbd_set_size(nbd, bytesize, nbd_blksize(config)); case NBD_SET_TIMEOUT: nbd_set_cmd_timeout(nbd, arg); return 0; @@ -1476,7 +1482,7 @@ static struct nbd_config *nbd_alloc_config(void) atomic_set(&config->recv_threads, 0); init_waitqueue_head(&config->recv_wq); init_waitqueue_head(&config->conn_wait); - config->blksize = NBD_DEF_BLKSIZE; + config->blksize_bits = NBD_DEF_BLKSIZE_BITS; atomic_set(&config->live_connections, 0); try_module_get(THIS_MODULE); return config; @@ -1604,7 +1610,7 @@ static int nbd_dev_dbg_init(struct nbd_device *nbd) debugfs_create_file("tasks", 0444, dir, nbd, &nbd_dbg_tasks_fops); debugfs_create_u64("size_bytes", 0444, dir, &config->bytesize); debugfs_create_u32("timeout", 0444, dir, &nbd->tag_set.timeout); - debugfs_create_u64("blocksize", 0444, dir, &config->blksize); + debugfs_create_u32("blocksize_bits", 0444, dir, &config->blksize_bits); debugfs_create_file("flags", 0444, dir, nbd, &nbd_dbg_flags_fops); return 0; @@ -1826,7 +1832,7 @@ nbd_device_policy[NBD_DEVICE_ATTR_MAX + 1] = { static int nbd_genl_size_set(struct genl_info *info, struct nbd_device *nbd) { struct nbd_config *config = nbd->config; - u64 bsize = config->blksize; + u64 bsize = nbd_blksize(config); u64 bytes = config->bytesize; if (info->attrs[NBD_ATTR_SIZE_BYTES]) @@ -1835,7 +1841,7 @@ static int nbd_genl_size_set(struct genl_info *info, struct nbd_device *nbd) if (info->attrs[NBD_ATTR_BLOCK_SIZE_BYTES]) bsize = nla_get_u64(info->attrs[NBD_ATTR_BLOCK_SIZE_BYTES]); - if (bytes != config->bytesize || bsize != config->blksize) + if (bytes != config->bytesize || bsize != nbd_blksize(config)) return nbd_set_size(nbd, bytes, bsize); return 0; }
On Tue, Sep 14, 2021 at 11:30 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Tue, Sep 14, 2021 at 11:14 AM Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > All this pain could have been trivially avoided with just people > > writing better code, knowing that multiplies and divides are > > expensive, and that shift counts are small and cheap. > > IOW, maybe the fix is just this attached trivial patch. > > I didn't bother to change the order of the 'struct ndb_config' > structure. It would pack better if you put the (now 32-bit) > blksize_bits field next to the 'atomic_t' field, I think. But I wanted > to just see how a minimal patch looks. > > I did make the debugfs interface reflect the change to blocksize_bits, > so this is visible in user space. But it's debugfs. > > If people care, it could be made to use a DEFINE_SHOW_ATTRIBUTE() > function the way it already does for 'flags', so that's not a > fundamental issue, I just didn't bother. > > Hmm? > > Btw, I really think more of the block layer should perhaps think about > use bit shifts more, not expanded values. Can things like the queue > 'discard_alignment' really be non-powers-of-two? > > Linus Any issues passing an loff_t (aka long long) to __ffs which expects an unsigned long for ilp32 targets? (I hate the whole family of ffs()...why did ffs() ever accept just an int?!) Any issues modifying the sysfs interface? Perhaps something in userspace relies on parsing those strings? Other than that LGTM, and I like your new overflow check. :) -- Thanks, ~Nick Desaulniers
On Tue, Sep 14, 2021 at 11:30:03AM -0700, Linus Torvalds wrote: > case NBD_SET_SIZE_BLOCKS: > - if (check_mul_overflow((loff_t)arg, config->blksize, &bytesize)) > + bytesize = arg << config->blksize_bits; > + if (bytesize >> config->blksize_bits != arg) > return -EINVAL; FWIW, it's probably better to avoid open-coding the check. There are helpers for shift-left too. :) + if (check_shl_overflow(arg, config->blksize_bits, &bytesize)) -- Kees Cook
On Tue, Sep 14, 2021 at 11:45 AM Nick Desaulniers <ndesaulniers@google.com> wrote: > > Any issues passing an loff_t (aka long long) to __ffs which expects an > unsigned long for ilp32 targets? No. We literally _just_ checked that the value is a power-of-two, and that it's in the range [1024, PAGE_SIZE]. There was never anything "loff_t" about bitmask at any point. > Any issues modifying the sysfs interface? Perhaps something in > userspace relies on parsing those strings? See my comment about how it could use DEFINE_SHOW_ATTRIBUTE() to always show the bits as a value. But it's not even sysfs. It's debugfs. So nobody _should_ have relied on any of this anyway. Of course, "should have" is just a dream world - but the point is that if somebody complains, it's very fixable. Linus
On Tue, Sep 14, 2021 at 11:48 AM Kees Cook <keescook@chromium.org> wrote: > > FWIW, it's probably better to avoid open-coding the check. There are > helpers for shift-left too. :) I actually looked for them. But I only did so with a grep for "check_shift_overflow". Which didn't find anything. I didn't think anybody would call a shift overflow function "shl", since a right-shift by definition cannot overflow. But no complaints about using the oddly named overflow function, though - it makes it more obvious that the patch is purely about changing 'blksize' to use a bit count. Btw, these kinds of issues is exactly why I've been hardnosed about 64-bit divides for decades. 64-bit divides on 32-bit machines are *expensive*. It's why I don't like saying "just use '/' and we'll pick up the routines from libgcc". In almost all real-life cases - at least in a kernel - the full divide is unnecessary. It's almost always people being silly and lazy, and the very expensive operation can be avoided entirely (or at least minimized to something like 64/32). Linus
On Tue, Sep 14, 2021 at 11:48 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > There was never anything "loff_t" about bitmask at any point. Not 'bitmask'. It's 'blksize'. Hopefully that was obvious in context. Linus
On Tue, Sep 14, 2021 at 11:56 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Tue, Sep 14, 2021 at 11:48 AM Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > There was never anything "loff_t" about bitmask at any point. > > Not 'bitmask'. It's 'blksize'. Hopefully that was obvious in context. Isn't the parameter `blksize` of `nbd_set_size` declared as `loff_t`? -- Thanks, ~Nick Desaulniers
On Tue, Sep 14, 2021 at 11:55 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > Btw, these kinds of issues is exactly why I've been hardnosed about > 64-bit divides for decades. 64-bit divides on 32-bit machines are > *expensive*. It's why I don't like saying "just use '/' and we'll pick > up the routines from libgcc". I was going to ask about the history there; not to derail the thread further, but this is a question whose answer is important to me. Are the helpers from libgcc insufficient? Working through https://github.com/ClangBuiltLinux/linux/issues/1438 which all came about because LLVM's equivalent of libgcc, "compiler-rt," had a nice helper for builtin multiply with overflow check that libgcc does not. As such, llvm cannot assume compiler-rt is being linked against, so llvm must expand these inline every time. And the code in line is HUGE: https://godbolt.org/z/MM4hPGxTE. IMO we could do a much much better job on code size (and thus probably I$ performance improvements) had we just linked against the compiler runtime. Perhaps the concern is of the quality of implementations of the compiler runtime routines; that we may have arch specific implementations that are better? 64b division on 32b targets is expensive either way; I'd rather have the compiler generate a libcall than try to expand these inline. I'm not sure if it's the case, but I can't help but wonder if there are other optimization decisions being based on whether the compiler runtime is being linked against or not; it's hard for the compiler to know what will happen at link time. Vaguely reminiscent of the issues we face against using -ffreestanding. Switching that now (so that we did link in the compiler runtimes) would be a massive yak shave, for sure. > In almost all real-life cases - at least in a kernel - the full divide > is unnecessary. It's almost always people being silly and lazy, and > the very expensive operation can be avoided entirely (or at least > minimized to something like 64/32). At least when dealing in powers of two, sure. -- Thanks, ~Nick Desaulniers
On Tue, Sep 14, 2021 at 12:10 PM Nick Desaulniers <ndesaulniers@google.com> wrote: > > Isn't the parameter `blksize` of `nbd_set_size` declared as `loff_t`? So? I'm not seeing your point. We've checked the range of it - in loff_t. So the *value* is already checked, and most definitely fits in 'unsigned long'. So __ffs() is perfectly fine. It will truncate that loff_t to a sane type. What is the problem you're trying to solve? Linus
On Tue, Sep 14, 2021 at 12:46 PM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Tue, Sep 14, 2021 at 12:10 PM Nick Desaulniers > <ndesaulniers@google.com> wrote: > > > > Isn't the parameter `blksize` of `nbd_set_size` declared as `loff_t`? > > So? > > I'm not seeing your point. > > We've checked the range of it - in loff_t. > > So the *value* is already checked, and most definitely fits in 'unsigned long'. > > So __ffs() is perfectly fine. It will truncate that loff_t to a sane type. > > What is the problem you're trying to solve? Just making sure __ffs() works as expected should blksize > LONG_MAX on 32b targets. I don't see the range check you're referring to. loff_t is a long long, yeah? -- Thanks, ~Nick Desaulniers
On Tue, Sep 14, 2021 at 12:12 PM Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Tue, Sep 14, 2021 at 11:55 AM Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > Btw, these kinds of issues is exactly why I've been hardnosed about > > 64-bit divides for decades. 64-bit divides on 32-bit machines are > > *expensive*. It's why I don't like saying "just use '/' and we'll pick > > up the routines from libgcc". > > I was going to ask about the history there; not to derail the thread > further, but this is a question whose answer is important to me. > > Are the helpers from libgcc insufficient? W No. The helpers from libgcc are *overly* sufficient. The reason we strive to avoid libgcc on any relevant architectures is not because it's not sufficient, it's because it hides problems. libgcc has all these helpers for things that the kernel simply shouldn't do. So _not_ linking against it is the thing that traditionally helps us find problems, because you get things like undefined symbol '__udivdi3' or whatever. In other words, avoiding libgcc is what protects us from people doing (some) stupid things. Linus
On Tue, Sep 14, 2021 at 12:50 PM Nick Desaulniers <ndesaulniers@google.com> wrote: > > Just making sure __ffs() works as expected should blksize > LONG_MAX > on 32b targets. I don't see the range check you're referring to. > loff_t is a long long, yeah? Christ Nick. Stop wasting my time. Read the patch: if (!blksize) - blksize = NBD_DEF_BLKSIZE; + blksize = 1u << NBD_DEF_BLKSIZE_BITS; if (blksize < 512 || blksize > PAGE_SIZE || !is_power_of_2(blksize)) return -EINVAL; nbd->config->bytesize = bytesize; - nbd->config->blksize = blksize; + nbd->config->blksize_bits = __ffs(blksize); See that range check? Seriously, I've now replied several times to you just because you were too damn lazy to just look three lines up from the __ffs() that you reacted to, when I explicitly mentioned the range check several times, including in the original submission, and when it was RIGHT THERE IN THE PATCH IN THE ONLY PLACE THAT DID THAT __FFS. (Ok, so the lower check of range was 512, not 1024, sue me). It was all there in the diff all the time. Don't email me again about this. At least not without spending the FIVE SECONDS to look at what the hell you are emailing me about. Linus
diff --git a/include/linux/math64.h b/include/linux/math64.h index 66deb1fdc2ef..bc9c12c168d0 100644 --- a/include/linux/math64.h +++ b/include/linux/math64.h @@ -10,6 +10,9 @@ #define div64_long(x, y) div64_s64((x), (y)) #define div64_ul(x, y) div64_u64((x), (y)) +#ifndef is_signed_type +#define is_signed_type(type) (((type)(-1)) < (type)1) +#endif /** * div_u64_rem - unsigned 64bit divide with 32bit divisor with remainder @@ -111,6 +114,15 @@ extern s64 div64_s64(s64 dividend, s64 divisor); #endif /* BITS_PER_LONG */ +#define div64_x64(dividend, divisor) ({ \ + BUILD_BUG_ON_MSG(sizeof(dividend) < sizeof(u64),\ + "prefer div_x64"); \ + __builtin_choose_expr( \ + is_signed_type(typeof(dividend)), \ + div64_s64(dividend, divisor), \ + div64_u64(dividend, divisor)); \ +}) + /** * div_u64 - unsigned 64bit divide with 32bit divisor * @dividend: unsigned 64bit dividend @@ -141,6 +153,29 @@ static inline s64 div_s64(s64 dividend, s32 divisor) } #endif +#define div_x64(dividend, divisor) ({ \ + BUILD_BUG_ON_MSG(sizeof(dividend) > sizeof(u32),\ + "prefer div64_x64"); \ + __builtin_choose_expr( \ + is_signed_type(typeof(dividend)), \ + div_s64(dividend, divisor), \ + div_u64(dividend, divisor)); \ +}) + +#define div_64(dividend, divisor) ({ \ + BUILD_BUG_ON_MSG(sizeof(dividend) > sizeof(u64), \ + "128b div unsupported"); \ + __builtin_choose_expr( \ + __builtin_types_compatible_p(typeof(dividend), s64) || \ + __builtin_types_compatible_p(typeof(dividend), u64), \ + __builtin_choose_expr( \ + __builtin_types_compatible_p(typeof(divisor), s64) || \ + __builtin_types_compatible_p(typeof(divisor), u64), \ + div64_x64(dividend, divisor), \ + div_x64(dividend, divisor)), \ + dividend / divisor); \ +}) + u32 iter_div_u64_rem(u64 dividend, u32 divisor, u64 *remainder); #ifndef mul_u32_u32 diff --git a/include/linux/overflow.h b/include/linux/overflow.h index ef74051d5cfe..2ebdf220c184 100644 --- a/include/linux/overflow.h +++ b/include/linux/overflow.h @@ -123,8 +123,8 @@ static inline bool __must_check __must_check_overflow(bool overflow) (void) (&__a == __d); \ *__d = __a * __b; \ __builtin_constant_p(__b) ? \ - __b > 0 && __a > type_max(typeof(__a)) / __b : \ - __a > 0 && __b > type_max(typeof(__b)) / __a; \ + __b > 0 && __a > div_64(type_max(typeof(__a)), __b) : \ + __a > 0 && __b > div_64(type_max(typeof(__b)), __a); \ }) /* @@ -195,8 +195,8 @@ static inline bool __must_check __must_check_overflow(bool overflow) (void) (&__a == &__b); \ (void) (&__a == __d); \ *__d = (u64)__a * (u64)__b; \ - (__b > 0 && (__a > __tmax/__b || __a < __tmin/__b)) || \ - (__b < (typeof(__b))-1 && (__a > __tmin/__b || __a < __tmax/__b)) || \ + (__b > 0 && (__a > div_64(__tmax, __b) || __a < div_64(__tmin, __b))) || \ + (__b < (typeof(__b))-1 && (__a > div_64(__tmin, __b) || __a < div_64(__tmax, __b))) || \ (__b == (typeof(__b))-1 && __a == __tmin); \ })
commit fad7cd3310db ("nbd: add the check to prevent overflow in __nbd_ioctl()") raised an issue from the fallback helpers added in commit f0907827a8a9 ("compiler.h: enable builtin overflow checkers and add fallback code") ERROR: modpost: "__divdi3" [drivers/block/nbd.ko] undefined! As Stephen Rothwell notes: The added check_mul_overflow() call is being passed 64 bit values. COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW is not set for this build (see include/linux/overflow.h). Specifically, the helpers for checking whether the results of a multiplication overflowed (__unsigned_mul_overflow, __signed_add_overflow) use the division operator when !COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW. This is problematic for 64b operands on 32b hosts. This was fixed upstream by commit 76ae847497bc ("Documentation: raise minimum supported version of GCC to 5.1") which is not suitable to be backported to stable; I didn't have this patch ready in time. Cc: stable@vger.kernel.org Cc: Arnd Bergmann <arnd@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reported-by: Nathan Chancellor <nathan@kernel.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Suggested-by: Pavel Machek <pavel@ucw.cz> Link: https://github.com/ClangBuiltLinux/linux/issues/1438 Link: https://lore.kernel.org/all/20210909182525.372ee687@canb.auug.org.au/ Link: https://lore.kernel.org/lkml/20210910234047.1019925-1-ndesaulniers@google.com/ Fixes: f0907827a8a9 ("compiler.h: enable builtin overflow checkers and add fallback code") Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> --- This kind of generic meta-programming in C and my lack of experience in doing so makes me very uncomfortable (and slightly ashamed) to send this. I would appreciate careful review and feedback. I would appreciate if Naresh can test this with GCC 4.9, which I don't have handy. Linus also suggested I look into the use of _Generic; I haven't evaluated that approach yet, but I felt like posting this early for feedback. include/linux/math64.h | 35 +++++++++++++++++++++++++++++++++++ include/linux/overflow.h | 8 ++++---- 2 files changed, 39 insertions(+), 4 deletions(-) -- 2.33.0.309.g3052b89438-goog