diff mbox series

[5.10] overflow.h: use new generic division helpers to avoid / operator

Message ID 20210913203201.1844253-1-ndesaulniers@google.com
State New
Headers show
Series [5.10] overflow.h: use new generic division helpers to avoid / operator | expand

Commit Message

Nick Desaulniers Sept. 13, 2021, 8:32 p.m. UTC
commit fad7cd3310db ("nbd: add the check to prevent overflow in
__nbd_ioctl()") raised an issue from the fallback helpers added in
commit f0907827a8a9 ("compiler.h: enable builtin overflow checkers and
add fallback code")

ERROR: modpost: "__divdi3" [drivers/block/nbd.ko] undefined!

As Stephen Rothwell notes:
  The added check_mul_overflow() call is being passed 64 bit values.
  COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW is not set for this build (see
  include/linux/overflow.h).

Specifically, the helpers for checking whether the results of a
multiplication overflowed (__unsigned_mul_overflow,
__signed_add_overflow) use the division operator when
!COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW.  This is problematic for 64b
operands on 32b hosts.

This was fixed upstream by
commit 76ae847497bc ("Documentation: raise minimum supported version of
GCC to 5.1")
which is not suitable to be backported to stable; I didn't have this
patch ready in time.

Cc: stable@vger.kernel.org
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Suggested-by: Pavel Machek <pavel@ucw.cz>
Link: https://github.com/ClangBuiltLinux/linux/issues/1438
Link: https://lore.kernel.org/all/20210909182525.372ee687@canb.auug.org.au/
Link: https://lore.kernel.org/lkml/20210910234047.1019925-1-ndesaulniers@google.com/
Fixes: f0907827a8a9 ("compiler.h: enable builtin overflow checkers and
add fallback code")
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

---
This kind of generic meta-programming in C and my lack of experience in
doing so makes me very uncomfortable (and slightly ashamed) to send
this. I would appreciate careful review and feedback. I would appreciate
if Naresh can test this with GCC 4.9, which I don't have handy.

Linus also suggested I look into the use of _Generic; I haven't
evaluated that approach yet, but I felt like posting this early for
feedback.

 include/linux/math64.h   | 35 +++++++++++++++++++++++++++++++++++
 include/linux/overflow.h |  8 ++++----
 2 files changed, 39 insertions(+), 4 deletions(-)

-- 
2.33.0.309.g3052b89438-goog

Comments

Rasmus Villemoes Sept. 13, 2021, 9:05 p.m. UTC | #1
On 13/09/2021 22.32, Nick Desaulniers wrote:
> commit fad7cd3310db ("nbd: add the check to prevent overflow in

> __nbd_ioctl()") raised an issue from the fallback helpers added in

> commit f0907827a8a9 ("compiler.h: enable builtin overflow checkers and

> add fallback code")

> 

> ERROR: modpost: "__divdi3" [drivers/block/nbd.ko] undefined!

> 

> As Stephen Rothwell notes:

>   The added check_mul_overflow() call is being passed 64 bit values.

>   COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW is not set for this build (see

>   include/linux/overflow.h).

> 

> Specifically, the helpers for checking whether the results of a

> multiplication overflowed (__unsigned_mul_overflow,

> __signed_add_overflow) use the division operator when

> !COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW.  This is problematic for 64b

> operands on 32b hosts.

> 

> This was fixed upstream by

> commit 76ae847497bc ("Documentation: raise minimum supported version of

> GCC to 5.1")

> which is not suitable to be backported to stable; I didn't have this

> patch ready in time.

> 

> Cc: stable@vger.kernel.org

> Cc: Arnd Bergmann <arnd@kernel.org>

> Cc: Kees Cook <keescook@chromium.org>

> Cc: Linus Torvalds <torvalds@linux-foundation.org>

> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>

> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>

> Reported-by: Nathan Chancellor <nathan@kernel.org>

> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>

> Suggested-by: Pavel Machek <pavel@ucw.cz>

> Link: https://github.com/ClangBuiltLinux/linux/issues/1438

> Link: https://lore.kernel.org/all/20210909182525.372ee687@canb.auug.org.au/

> Link: https://lore.kernel.org/lkml/20210910234047.1019925-1-ndesaulniers@google.com/

> Fixes: f0907827a8a9 ("compiler.h: enable builtin overflow checkers and

> add fallback code")

> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

> ---

> This kind of generic meta-programming in C and my lack of experience in

> doing so makes me very uncomfortable (and slightly ashamed) to send

> this. I would appreciate careful review and feedback. I would appreciate

> if Naresh can test this with GCC 4.9, which I don't have handy.

> 

> Linus also suggested I look into the use of _Generic; I haven't

> evaluated that approach yet, but I felt like posting this early for

> feedback.

> 

>  include/linux/math64.h   | 35 +++++++++++++++++++++++++++++++++++

>  include/linux/overflow.h |  8 ++++----

>  2 files changed, 39 insertions(+), 4 deletions(-)

> 

> diff --git a/include/linux/math64.h b/include/linux/math64.h

> index 66deb1fdc2ef..bc9c12c168d0 100644

> --- a/include/linux/math64.h

> +++ b/include/linux/math64.h

> @@ -10,6 +10,9 @@

>  

>  #define div64_long(x, y) div64_s64((x), (y))

>  #define div64_ul(x, y)   div64_u64((x), (y))

> +#ifndef is_signed_type

> +#define is_signed_type(type)       (((type)(-1)) < (type)1)

> +#endif

>  

>  /**

>   * div_u64_rem - unsigned 64bit divide with 32bit divisor with remainder

> @@ -111,6 +114,15 @@ extern s64 div64_s64(s64 dividend, s64 divisor);

>  

>  #endif /* BITS_PER_LONG */


Some comments on when and how to use this would be nice (not just build
bugs when used wrong).

> +#define div64_x64(dividend, divisor) ({			\

> +	BUILD_BUG_ON_MSG(sizeof(dividend) < sizeof(u64),\

> +	                 "prefer div_x64");		\

> +	__builtin_choose_expr(				\

> +		is_signed_type(typeof(dividend)),	\

> +		div64_s64(dividend, divisor),		\

> +		div64_u64(dividend, divisor));		\

> +})

> +

>  /**

>   * div_u64 - unsigned 64bit divide with 32bit divisor

>   * @dividend: unsigned 64bit dividend

> @@ -141,6 +153,29 @@ static inline s64 div_s64(s64 dividend, s32 divisor)

>  }

>  #endif

>  

> +#define div_x64(dividend, divisor) ({			\

> +	BUILD_BUG_ON_MSG(sizeof(dividend) > sizeof(u32),\

> +	                 "prefer div64_x64");		\

> +	__builtin_choose_expr(				\

> +		is_signed_type(typeof(dividend)),	\

> +		div_s64(dividend, divisor),		\

> +		div_u64(dividend, divisor));		\

> +})

> +

> +#define div_64(dividend, divisor) ({						\

> +	BUILD_BUG_ON_MSG(sizeof(dividend) > sizeof(u64),			\

> +	                 "128b div unsupported");				\

> +	__builtin_choose_expr(							\

> +		__builtin_types_compatible_p(typeof(dividend), s64) ||		\

> +		__builtin_types_compatible_p(typeof(dividend), u64),		\


You can save a bit of typing using __same_type(dividend, s64) - it's a
nice property of typeof() that it's idempotent when applied to a type
name. _Generic would probably also do, but I don't think it would save
that much, if anything, here.

>  u32 iter_div_u64_rem(u64 dividend, u32 divisor, u64 *remainder);

>  

>  #ifndef mul_u32_u32

> diff --git a/include/linux/overflow.h b/include/linux/overflow.h

> index ef74051d5cfe..2ebdf220c184 100644

> --- a/include/linux/overflow.h

> +++ b/include/linux/overflow.h

> @@ -123,8 +123,8 @@ static inline bool __must_check __must_check_overflow(bool overflow)

>  	(void) (&__a == __d);				\

>  	*__d = __a * __b;				\

>  	__builtin_constant_p(__b) ?			\

> -	  __b > 0 && __a > type_max(typeof(__a)) / __b : \

> -	  __a > 0 && __b > type_max(typeof(__b)) / __a;	 \

> +	  __b > 0 && __a > div_64(type_max(typeof(__a)), __b) :	\

> +	  __a > 0 && __b > div_64(type_max(typeof(__b)), __a);	\

>  })

>  

>  /*

> @@ -195,8 +195,8 @@ static inline bool __must_check __must_check_overflow(bool overflow)

>  	(void) (&__a == &__b);						\

>  	(void) (&__a == __d);						\

>  	*__d = (u64)__a * (u64)__b;					\

> -	(__b > 0   && (__a > __tmax/__b || __a < __tmin/__b)) ||	\

> -	(__b < (typeof(__b))-1  && (__a > __tmin/__b || __a < __tmax/__b)) || \

> +	(__b > 0 && (__a > div_64(__tmax, __b) || __a < div_64(__tmin, __b))) ||		\

> +	(__b < (typeof(__b))-1 && (__a > div_64(__tmin, __b) || __a < div_64(__tmax, __b))) ||	\

>  	(__b == (typeof(__b))-1 && __a == __tmin);			\

>  })


I had actually forgotten what horrors lay hidden in these fallback
macros, I just knew they needed a wooden stake sooner or later. Now you
made me look at this right before bedtime :(

So, I'd sleep a little better if we got the 64 bit tests commented back
in in test_overflow.c, and [assuming that the above would actually make
that file build with gcc 4.9] that patch also backported to 5.10, so we
had some confidence that the whole house of cards is actually solid.

Rasmus
Kees Cook Sept. 14, 2021, 5:32 p.m. UTC | #2
On Mon, Sep 13, 2021 at 11:05:02PM +0200, Rasmus Villemoes wrote:
> So, I'd sleep a little better if we got the 64 bit tests commented back

> in in test_overflow.c, and [assuming that the above would actually make

> that file build with gcc 4.9] that patch also backported to 5.10, so we

> had some confidence that the whole house of cards is actually solid.


Yeah, I'm all for that too.

-- 
Kees Cook
Linus Torvalds Sept. 14, 2021, 6:14 p.m. UTC | #3
On Tue, Sep 14, 2021 at 10:32 AM Kees Cook <keescook@chromium.org> wrote:
>

> On Mon, Sep 13, 2021 at 11:05:02PM +0200, Rasmus Villemoes wrote:

> > So, I'd sleep a little better if we got the 64 bit tests commented back

> > in in test_overflow.c, and [assuming that the above would actually make

> > that file build with gcc 4.9] that patch also backported to 5.10, so we

> > had some confidence that the whole house of cards is actually solid.

>

> Yeah, I'm all for that too.


Hmm.

Another thing that might be worth doing is to just say "don't do that".

IOW, don't do silly overflow checks with 64-bit things on 32-bit architectures.

Apparently the first and only use case of this is the ndb code, and
the fact is, that ndb code is just being truly horrendously stupid.

That 'blksize' thing shouldn't be a blocksize, it should be a
blockshift instead. The code to set blksize already expressly verifies
that it's a power-of-2, and less or equial to PAGE_SIZE so this:

        loff_t blksize;

is just plain silly. Doubly silly. Why is it a 'loff_t'? It should
never have been a loff_t in the first place, and honestly, it really
should have been a shift count that fits in a few bits.

All this pain could have been trivially avoided with just people
writing better code, knowing that multiplies and divides are
expensive, and that shift counts are small and cheap.

               Linus
Linus Torvalds Sept. 14, 2021, 6:30 p.m. UTC | #4
On Tue, Sep 14, 2021 at 11:14 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>

> All this pain could have been trivially avoided with just people

> writing better code, knowing that multiplies and divides are

> expensive, and that shift counts are small and cheap.


IOW, maybe the fix is just this attached trivial patch.

I didn't bother to change the order of the 'struct ndb_config'
structure. It would pack better if you put the (now 32-bit)
blksize_bits field next to the 'atomic_t' field, I think. But I wanted
to just see how a minimal patch looks.

I did make the debugfs interface reflect the change to blocksize_bits,
so this is visible in user space. But it's debugfs.

If people care, it could be made to use a DEFINE_SHOW_ATTRIBUTE()
function the way it already does for 'flags', so that's not a
fundamental issue, I just didn't bother.

Hmm?

Btw, I really think more of the block layer should perhaps think about
use bit shifts more, not expanded values.  Can things like the queue
'discard_alignment' really be non-powers-of-two?

          Linus
drivers/block/nbd.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 5170a630778d..9e12edf892d7 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -97,13 +97,18 @@ struct nbd_config {
 
 	atomic_t recv_threads;
 	wait_queue_head_t recv_wq;
-	loff_t blksize;
+	unsigned blksize_bits;
 	loff_t bytesize;
 #if IS_ENABLED(CONFIG_DEBUG_FS)
 	struct dentry *dbg_dir;
 #endif
 };
 
+static inline unsigned int nbd_blksize(struct nbd_config *config)
+{
+	return 1u << config->blksize_bits;
+}
+
 struct nbd_device {
 	struct blk_mq_tag_set tag_set;
 
@@ -146,7 +151,7 @@ static struct dentry *nbd_dbg_dir;
 
 #define NBD_MAGIC 0x68797548
 
-#define NBD_DEF_BLKSIZE 1024
+#define NBD_DEF_BLKSIZE_BITS 10
 
 static unsigned int nbds_max = 16;
 static int max_part = 16;
@@ -317,12 +322,12 @@ static int nbd_set_size(struct nbd_device *nbd, loff_t bytesize,
 		loff_t blksize)
 {
 	if (!blksize)
-		blksize = NBD_DEF_BLKSIZE;
+		blksize = 1u << NBD_DEF_BLKSIZE_BITS;
 	if (blksize < 512 || blksize > PAGE_SIZE || !is_power_of_2(blksize))
 		return -EINVAL;
 
 	nbd->config->bytesize = bytesize;
-	nbd->config->blksize = blksize;
+	nbd->config->blksize_bits = __ffs(blksize);
 
 	if (!nbd->task_recv)
 		return 0;
@@ -1337,7 +1342,7 @@ static int nbd_start_device(struct nbd_device *nbd)
 		args->index = i;
 		queue_work(nbd->recv_workq, &args->work);
 	}
-	return nbd_set_size(nbd, config->bytesize, config->blksize);
+	return nbd_set_size(nbd, config->bytesize, nbd_blksize(config));
 }
 
 static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *bdev)
@@ -1406,11 +1411,12 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
 	case NBD_SET_BLKSIZE:
 		return nbd_set_size(nbd, config->bytesize, arg);
 	case NBD_SET_SIZE:
-		return nbd_set_size(nbd, arg, config->blksize);
+		return nbd_set_size(nbd, arg, nbd_blksize(config));
 	case NBD_SET_SIZE_BLOCKS:
-		if (check_mul_overflow((loff_t)arg, config->blksize, &bytesize))
+		bytesize = arg << config->blksize_bits;
+		if (bytesize >> config->blksize_bits != arg)
 			return -EINVAL;
-		return nbd_set_size(nbd, bytesize, config->blksize);
+		return nbd_set_size(nbd, bytesize, nbd_blksize(config));
 	case NBD_SET_TIMEOUT:
 		nbd_set_cmd_timeout(nbd, arg);
 		return 0;
@@ -1476,7 +1482,7 @@ static struct nbd_config *nbd_alloc_config(void)
 	atomic_set(&config->recv_threads, 0);
 	init_waitqueue_head(&config->recv_wq);
 	init_waitqueue_head(&config->conn_wait);
-	config->blksize = NBD_DEF_BLKSIZE;
+	config->blksize_bits = NBD_DEF_BLKSIZE_BITS;
 	atomic_set(&config->live_connections, 0);
 	try_module_get(THIS_MODULE);
 	return config;
@@ -1604,7 +1610,7 @@ static int nbd_dev_dbg_init(struct nbd_device *nbd)
 	debugfs_create_file("tasks", 0444, dir, nbd, &nbd_dbg_tasks_fops);
 	debugfs_create_u64("size_bytes", 0444, dir, &config->bytesize);
 	debugfs_create_u32("timeout", 0444, dir, &nbd->tag_set.timeout);
-	debugfs_create_u64("blocksize", 0444, dir, &config->blksize);
+	debugfs_create_u32("blocksize_bits", 0444, dir, &config->blksize_bits);
 	debugfs_create_file("flags", 0444, dir, nbd, &nbd_dbg_flags_fops);
 
 	return 0;
@@ -1826,7 +1832,7 @@ nbd_device_policy[NBD_DEVICE_ATTR_MAX + 1] = {
 static int nbd_genl_size_set(struct genl_info *info, struct nbd_device *nbd)
 {
 	struct nbd_config *config = nbd->config;
-	u64 bsize = config->blksize;
+	u64 bsize = nbd_blksize(config);
 	u64 bytes = config->bytesize;
 
 	if (info->attrs[NBD_ATTR_SIZE_BYTES])
@@ -1835,7 +1841,7 @@ static int nbd_genl_size_set(struct genl_info *info, struct nbd_device *nbd)
 	if (info->attrs[NBD_ATTR_BLOCK_SIZE_BYTES])
 		bsize = nla_get_u64(info->attrs[NBD_ATTR_BLOCK_SIZE_BYTES]);
 
-	if (bytes != config->bytesize || bsize != config->blksize)
+	if (bytes != config->bytesize || bsize != nbd_blksize(config))
 		return nbd_set_size(nbd, bytes, bsize);
 	return 0;
 }
Nick Desaulniers Sept. 14, 2021, 6:44 p.m. UTC | #5
On Tue, Sep 14, 2021 at 11:30 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>

> On Tue, Sep 14, 2021 at 11:14 AM Linus Torvalds

> <torvalds@linux-foundation.org> wrote:

> >

> > All this pain could have been trivially avoided with just people

> > writing better code, knowing that multiplies and divides are

> > expensive, and that shift counts are small and cheap.

>

> IOW, maybe the fix is just this attached trivial patch.

>

> I didn't bother to change the order of the 'struct ndb_config'

> structure. It would pack better if you put the (now 32-bit)

> blksize_bits field next to the 'atomic_t' field, I think. But I wanted

> to just see how a minimal patch looks.

>

> I did make the debugfs interface reflect the change to blocksize_bits,

> so this is visible in user space. But it's debugfs.

>

> If people care, it could be made to use a DEFINE_SHOW_ATTRIBUTE()

> function the way it already does for 'flags', so that's not a

> fundamental issue, I just didn't bother.

>

> Hmm?

>

> Btw, I really think more of the block layer should perhaps think about

> use bit shifts more, not expanded values.  Can things like the queue

> 'discard_alignment' really be non-powers-of-two?

>

>           Linus


Any issues passing an loff_t (aka long long) to __ffs which expects an
unsigned long for ilp32 targets? (I hate the whole family of
ffs()...why did ffs() ever accept just an int?!)

Any issues modifying the sysfs interface? Perhaps something in
userspace relies on parsing those strings?

Other than that LGTM, and I like your new overflow check. :)
-- 
Thanks,
~Nick Desaulniers
Kees Cook Sept. 14, 2021, 6:47 p.m. UTC | #6
On Tue, Sep 14, 2021 at 11:30:03AM -0700, Linus Torvalds wrote:
>  	case NBD_SET_SIZE_BLOCKS:

> -		if (check_mul_overflow((loff_t)arg, config->blksize, &bytesize))

> +		bytesize = arg << config->blksize_bits;

> +		if (bytesize >> config->blksize_bits != arg)

>  			return -EINVAL;


FWIW, it's probably better to avoid open-coding the check. There are
helpers for shift-left too. :)

+		if (check_shl_overflow(arg, config->blksize_bits, &bytesize))

-- 
Kees Cook
Linus Torvalds Sept. 14, 2021, 6:48 p.m. UTC | #7
On Tue, Sep 14, 2021 at 11:45 AM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>

> Any issues passing an loff_t (aka long long) to __ffs which expects an

> unsigned long for ilp32 targets?


No.

We literally _just_ checked that the value is a power-of-two, and that
it's in the range [1024, PAGE_SIZE].

There was never anything "loff_t" about bitmask at any point.

> Any issues modifying the sysfs interface? Perhaps something in

> userspace relies on parsing those strings?


See my comment about how it could use DEFINE_SHOW_ATTRIBUTE() to
always show the bits as a value.

But it's not even sysfs. It's debugfs. So nobody _should_ have relied
on any of this anyway.

Of course, "should have" is just a dream world - but the point is that
if somebody complains, it's very fixable.

           Linus
Linus Torvalds Sept. 14, 2021, 6:54 p.m. UTC | #8
On Tue, Sep 14, 2021 at 11:48 AM Kees Cook <keescook@chromium.org> wrote:
>

> FWIW, it's probably better to avoid open-coding the check. There are

> helpers for shift-left too. :)


I actually looked for them.

But I only did so with a grep for "check_shift_overflow".

Which didn't find anything.

I didn't think anybody would call a shift overflow function "shl",
since a right-shift by definition cannot overflow.

But no complaints about using the oddly named overflow function,
though - it makes it more obvious that the patch is purely about
changing 'blksize' to use a bit count.

Btw, these kinds of issues is exactly why I've been hardnosed about
64-bit divides for decades. 64-bit divides on 32-bit machines are
*expensive*. It's why I don't like saying "just use '/' and we'll pick
up the routines from libgcc".

In almost all real-life cases - at least in a kernel - the full divide
is unnecessary. It's almost always people being silly and lazy, and
the very expensive operation can be avoided entirely (or at least
minimized to something like 64/32).

           Linus
Linus Torvalds Sept. 14, 2021, 6:56 p.m. UTC | #9
On Tue, Sep 14, 2021 at 11:48 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>

> There was never anything "loff_t" about bitmask at any point.


Not 'bitmask'. It's 'blksize'. Hopefully that was obvious in context.

               Linus
Nick Desaulniers Sept. 14, 2021, 7:10 p.m. UTC | #10
On Tue, Sep 14, 2021 at 11:56 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>

> On Tue, Sep 14, 2021 at 11:48 AM Linus Torvalds

> <torvalds@linux-foundation.org> wrote:

> >

> > There was never anything "loff_t" about bitmask at any point.

>

> Not 'bitmask'. It's 'blksize'. Hopefully that was obvious in context.


Isn't the parameter `blksize` of `nbd_set_size` declared as `loff_t`?
-- 
Thanks,
~Nick Desaulniers
Nick Desaulniers Sept. 14, 2021, 7:12 p.m. UTC | #11
On Tue, Sep 14, 2021 at 11:55 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>

> Btw, these kinds of issues is exactly why I've been hardnosed about

> 64-bit divides for decades. 64-bit divides on 32-bit machines are

> *expensive*. It's why I don't like saying "just use '/' and we'll pick

> up the routines from libgcc".


I was going to ask about the history there; not to derail the thread
further, but this is a question whose answer is important to me.

Are the helpers from libgcc insufficient?  Working through
https://github.com/ClangBuiltLinux/linux/issues/1438 which all came
about because LLVM's equivalent of libgcc, "compiler-rt," had a nice
helper for builtin multiply with overflow check that libgcc does not.
As such, llvm cannot assume compiler-rt is being linked against, so
llvm must expand these inline every time.  And the code in line is
HUGE: https://godbolt.org/z/MM4hPGxTE.  IMO we could do a much much
better job on code size (and thus probably I$ performance
improvements) had we just linked against the compiler runtime.

Perhaps the concern is of the quality of implementations of the
compiler runtime routines; that we may have arch specific
implementations that are better? 64b division on 32b targets is
expensive either way; I'd rather have the compiler generate a libcall
than try to expand these inline.  I'm not sure if it's the case, but I
can't help but wonder if there are other optimization decisions being
based on whether the compiler runtime is being linked against or not;
it's hard for the compiler to know what will happen at link time.
Vaguely reminiscent of the issues we face against using
-ffreestanding.

Switching that now (so that we did link in the compiler runtimes)
would be a massive yak shave, for sure.

> In almost all real-life cases - at least in a kernel - the full divide

> is unnecessary. It's almost always people being silly and lazy, and

> the very expensive operation can be avoided entirely (or at least

> minimized to something like 64/32).


At least when dealing in powers of two, sure.
-- 
Thanks,
~Nick Desaulniers
Linus Torvalds Sept. 14, 2021, 7:45 p.m. UTC | #12
On Tue, Sep 14, 2021 at 12:10 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>

> Isn't the parameter `blksize` of `nbd_set_size` declared as `loff_t`?


So?

I'm not seeing your point.

We've checked the range of it - in loff_t.

So the *value* is already checked, and most definitely fits in 'unsigned long'.

So __ffs() is perfectly fine. It will truncate that loff_t to a sane type.

What is the problem you're trying to solve?

         Linus
Nick Desaulniers Sept. 14, 2021, 7:50 p.m. UTC | #13
On Tue, Sep 14, 2021 at 12:46 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>

> On Tue, Sep 14, 2021 at 12:10 PM Nick Desaulniers

> <ndesaulniers@google.com> wrote:

> >

> > Isn't the parameter `blksize` of `nbd_set_size` declared as `loff_t`?

>

> So?

>

> I'm not seeing your point.

>

> We've checked the range of it - in loff_t.

>

> So the *value* is already checked, and most definitely fits in 'unsigned long'.

>

> So __ffs() is perfectly fine. It will truncate that loff_t to a sane type.

>

> What is the problem you're trying to solve?


Just making sure __ffs() works as expected should blksize > LONG_MAX
on 32b targets.  I don't see the range check you're referring to.
loff_t is a long long, yeah?
-- 
Thanks,
~Nick Desaulniers
Linus Torvalds Sept. 14, 2021, 7:52 p.m. UTC | #14
On Tue, Sep 14, 2021 at 12:12 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>

> On Tue, Sep 14, 2021 at 11:55 AM Linus Torvalds

> <torvalds@linux-foundation.org> wrote:

> >

> > Btw, these kinds of issues is exactly why I've been hardnosed about

> > 64-bit divides for decades. 64-bit divides on 32-bit machines are

> > *expensive*. It's why I don't like saying "just use '/' and we'll pick

> > up the routines from libgcc".

>

> I was going to ask about the history there; not to derail the thread

> further, but this is a question whose answer is important to me.

>

> Are the helpers from libgcc insufficient?  W


No. The helpers from libgcc are *overly* sufficient.

The reason we strive to avoid libgcc on any relevant architectures is
not because it's not sufficient, it's because it hides problems.

libgcc has all these helpers for things that the kernel simply shouldn't do.

So _not_ linking against it is the thing that traditionally helps us
find problems, because you get things like

  undefined symbol '__udivdi3'

or whatever.

In other words, avoiding libgcc is what protects us from people doing
(some) stupid things.

          Linus
Linus Torvalds Sept. 14, 2021, 7:57 p.m. UTC | #15
On Tue, Sep 14, 2021 at 12:50 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>

> Just making sure __ffs() works as expected should blksize > LONG_MAX

> on 32b targets.  I don't see the range check you're referring to.

> loff_t is a long long, yeah?


Christ Nick.

Stop wasting my time.

Read the patch:

        if (!blksize)
-               blksize = NBD_DEF_BLKSIZE;
+               blksize = 1u << NBD_DEF_BLKSIZE_BITS;
        if (blksize < 512 || blksize > PAGE_SIZE || !is_power_of_2(blksize))
                return -EINVAL;

        nbd->config->bytesize = bytesize;
-       nbd->config->blksize = blksize;
+       nbd->config->blksize_bits = __ffs(blksize);

See that range check?

Seriously, I've now replied several times to you just because you were
too damn lazy to just look three lines up from the __ffs() that you
reacted to, when I explicitly mentioned the range check several times,
including in the original submission, and when it was RIGHT THERE IN
THE PATCH IN THE ONLY PLACE THAT DID THAT __FFS.

(Ok, so the lower check of range was 512, not 1024, sue me).

It was all there in the diff all the time.

Don't email me again about this. At least not without spending the
FIVE SECONDS to look at what the hell you are emailing me about.

           Linus
diff mbox series

Patch

diff --git a/include/linux/math64.h b/include/linux/math64.h
index 66deb1fdc2ef..bc9c12c168d0 100644
--- a/include/linux/math64.h
+++ b/include/linux/math64.h
@@ -10,6 +10,9 @@ 
 
 #define div64_long(x, y) div64_s64((x), (y))
 #define div64_ul(x, y)   div64_u64((x), (y))
+#ifndef is_signed_type
+#define is_signed_type(type)       (((type)(-1)) < (type)1)
+#endif
 
 /**
  * div_u64_rem - unsigned 64bit divide with 32bit divisor with remainder
@@ -111,6 +114,15 @@  extern s64 div64_s64(s64 dividend, s64 divisor);
 
 #endif /* BITS_PER_LONG */
 
+#define div64_x64(dividend, divisor) ({			\
+	BUILD_BUG_ON_MSG(sizeof(dividend) < sizeof(u64),\
+	                 "prefer div_x64");		\
+	__builtin_choose_expr(				\
+		is_signed_type(typeof(dividend)),	\
+		div64_s64(dividend, divisor),		\
+		div64_u64(dividend, divisor));		\
+})
+
 /**
  * div_u64 - unsigned 64bit divide with 32bit divisor
  * @dividend: unsigned 64bit dividend
@@ -141,6 +153,29 @@  static inline s64 div_s64(s64 dividend, s32 divisor)
 }
 #endif
 
+#define div_x64(dividend, divisor) ({			\
+	BUILD_BUG_ON_MSG(sizeof(dividend) > sizeof(u32),\
+	                 "prefer div64_x64");		\
+	__builtin_choose_expr(				\
+		is_signed_type(typeof(dividend)),	\
+		div_s64(dividend, divisor),		\
+		div_u64(dividend, divisor));		\
+})
+
+#define div_64(dividend, divisor) ({						\
+	BUILD_BUG_ON_MSG(sizeof(dividend) > sizeof(u64),			\
+	                 "128b div unsupported");				\
+	__builtin_choose_expr(							\
+		__builtin_types_compatible_p(typeof(dividend), s64) ||		\
+		__builtin_types_compatible_p(typeof(dividend), u64),		\
+		__builtin_choose_expr(						\
+			__builtin_types_compatible_p(typeof(divisor), s64) ||	\
+			__builtin_types_compatible_p(typeof(divisor), u64),	\
+			div64_x64(dividend, divisor),				\
+			div_x64(dividend, divisor)),				\
+		dividend / divisor);						\
+})
+
 u32 iter_div_u64_rem(u64 dividend, u32 divisor, u64 *remainder);
 
 #ifndef mul_u32_u32
diff --git a/include/linux/overflow.h b/include/linux/overflow.h
index ef74051d5cfe..2ebdf220c184 100644
--- a/include/linux/overflow.h
+++ b/include/linux/overflow.h
@@ -123,8 +123,8 @@  static inline bool __must_check __must_check_overflow(bool overflow)
 	(void) (&__a == __d);				\
 	*__d = __a * __b;				\
 	__builtin_constant_p(__b) ?			\
-	  __b > 0 && __a > type_max(typeof(__a)) / __b : \
-	  __a > 0 && __b > type_max(typeof(__b)) / __a;	 \
+	  __b > 0 && __a > div_64(type_max(typeof(__a)), __b) :	\
+	  __a > 0 && __b > div_64(type_max(typeof(__b)), __a);	\
 })
 
 /*
@@ -195,8 +195,8 @@  static inline bool __must_check __must_check_overflow(bool overflow)
 	(void) (&__a == &__b);						\
 	(void) (&__a == __d);						\
 	*__d = (u64)__a * (u64)__b;					\
-	(__b > 0   && (__a > __tmax/__b || __a < __tmin/__b)) ||	\
-	(__b < (typeof(__b))-1  && (__a > __tmin/__b || __a < __tmax/__b)) || \
+	(__b > 0 && (__a > div_64(__tmax, __b) || __a < div_64(__tmin, __b))) ||		\
+	(__b < (typeof(__b))-1 && (__a > div_64(__tmin, __b) || __a < div_64(__tmax, __b))) ||	\
 	(__b == (typeof(__b))-1 && __a == __tmin);			\
 })