diff mbox series

[5/5] cpumask: fix comment of cpumask_xxx

Message ID 20230306160651.2016767-6-vernon2gm@gmail.com
State New
Headers show
Series fix call cpumask_next() if no further cpus set | expand

Commit Message

Vernon Yang March 6, 2023, 4:06 p.m. UTC
After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
optimizations"), the cpumask size is divided into three different case,
so fix comment of cpumask_xxx correctly.

Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
---
 include/linux/cpumask.h | 46 ++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

Comments

Yury Norov March 6, 2023, 4:39 p.m. UTC | #1
On Tue, Mar 07, 2023 at 12:06:51AM +0800, Vernon Yang wrote:
> After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
> optimizations"), the cpumask size is divided into three different case,
> so fix comment of cpumask_xxx correctly.
> 
> Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
> ---
>  include/linux/cpumask.h | 46 ++++++++++++++++++++---------------------
>  1 file changed, 23 insertions(+), 23 deletions(-)
> 
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index 8fbe76607965..248bdb1c50dc 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -155,7 +155,7 @@ static __always_inline unsigned int cpumask_check(unsigned int cpu)
>   * cpumask_first - get the first cpu in a cpumask
>   * @srcp: the cpumask pointer
>   *
> - * Returns >= nr_cpu_ids if no cpus set.
> + * Returns >= small_cpumask_bits if no cpus set.

There's no such thing like small_cpumask_bits. Here and everywhere,
nr_cpu_ids must be used.

Actually, before 596ff4a09b89 nr_cpumask_bits was deprecated, and it
must be like that for all users even now.

nr_cpumask_bits must be considered as internal cpumask parameter and
never referenced outside of cpumask code.

Thansk,
Yury
Yury Norov March 6, 2023, 4:54 p.m. UTC | #2
On Mon, Mar 06, 2023 at 05:44:41PM +0100, Jason A. Donenfeld wrote:
> On Mon, Mar 6, 2023 at 5:39 PM Yury Norov <yury.norov@gmail.com> wrote:
> >
> > On Tue, Mar 07, 2023 at 12:06:51AM +0800, Vernon Yang wrote:
> > > After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
> > > optimizations"), the cpumask size is divided into three different case,
> > > so fix comment of cpumask_xxx correctly.
> > >
> > > Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
> > > ---
> > >  include/linux/cpumask.h | 46 ++++++++++++++++++++---------------------
> > >  1 file changed, 23 insertions(+), 23 deletions(-)
> > >
> > > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > > index 8fbe76607965..248bdb1c50dc 100644
> > > --- a/include/linux/cpumask.h
> > > +++ b/include/linux/cpumask.h
> > > @@ -155,7 +155,7 @@ static __always_inline unsigned int cpumask_check(unsigned int cpu)
> > >   * cpumask_first - get the first cpu in a cpumask
> > >   * @srcp: the cpumask pointer
> > >   *
> > > - * Returns >= nr_cpu_ids if no cpus set.
> > > + * Returns >= small_cpumask_bits if no cpus set.
> >
> > There's no such thing like small_cpumask_bits. Here and everywhere,
> > nr_cpu_ids must be used.
> >
> > Actually, before 596ff4a09b89 nr_cpumask_bits was deprecated, and it
> > must be like that for all users even now.
> >
> > nr_cpumask_bits must be considered as internal cpumask parameter and
> > never referenced outside of cpumask code.
> 
> What's the right thing I should do, then, for wireguard's usage and
> for random.c's usage? It sounds like you object to this patchset, but
> if the problem is real, it sounds like I should at least fix the two
> cases I maintain. What's the right check?

Everywhere outside of cpumasks internals use (cpu < nr_cpu_ids) to
check if the cpu is in a valid range, like:

cpu = cpumask_first(cpus);
if (cpu >= nr_cpu_ids)
        pr_err("There's no cpus");
Jason A. Donenfeld March 6, 2023, 5:04 p.m. UTC | #3
On Mon, Mar 6, 2023 at 5:54 PM Yury Norov <yury.norov@gmail.com> wrote:
>
> On Mon, Mar 06, 2023 at 05:44:41PM +0100, Jason A. Donenfeld wrote:
> > On Mon, Mar 6, 2023 at 5:39 PM Yury Norov <yury.norov@gmail.com> wrote:
> > >
> > > On Tue, Mar 07, 2023 at 12:06:51AM +0800, Vernon Yang wrote:
> > > > After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
> > > > optimizations"), the cpumask size is divided into three different case,
> > > > so fix comment of cpumask_xxx correctly.
> > > >
> > > > Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
> > > > ---
> > > >  include/linux/cpumask.h | 46 ++++++++++++++++++++---------------------
> > > >  1 file changed, 23 insertions(+), 23 deletions(-)
> > > >
> > > > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > > > index 8fbe76607965..248bdb1c50dc 100644
> > > > --- a/include/linux/cpumask.h
> > > > +++ b/include/linux/cpumask.h
> > > > @@ -155,7 +155,7 @@ static __always_inline unsigned int cpumask_check(unsigned int cpu)
> > > >   * cpumask_first - get the first cpu in a cpumask
> > > >   * @srcp: the cpumask pointer
> > > >   *
> > > > - * Returns >= nr_cpu_ids if no cpus set.
> > > > + * Returns >= small_cpumask_bits if no cpus set.
> > >
> > > There's no such thing like small_cpumask_bits. Here and everywhere,
> > > nr_cpu_ids must be used.
> > >
> > > Actually, before 596ff4a09b89 nr_cpumask_bits was deprecated, and it
> > > must be like that for all users even now.
> > >
> > > nr_cpumask_bits must be considered as internal cpumask parameter and
> > > never referenced outside of cpumask code.
> >
> > What's the right thing I should do, then, for wireguard's usage and
> > for random.c's usage? It sounds like you object to this patchset, but
> > if the problem is real, it sounds like I should at least fix the two
> > cases I maintain. What's the right check?
>
> Everywhere outside of cpumasks internals use (cpu < nr_cpu_ids) to
> check if the cpu is in a valid range, like:
>
> cpu = cpumask_first(cpus);
> if (cpu >= nr_cpu_ids)
>         pr_err("There's no cpus");

Oh, okay, so the ones for wireguard and random.c in this series are
correct then? If so, could you give a Reviewed-by:, and then I'll
queue those up in my respective trees.

Jason
Linus Torvalds March 6, 2023, 5:29 p.m. UTC | #4
On Mon, Mar 6, 2023 at 8:07 AM Vernon Yang <vernon2gm@gmail.com> wrote:
>
> After commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask
> optimizations"), the cpumask size is divided into three different case,
> so fix comment of cpumask_xxx correctly.

No no.

Those three cases are meant to be entirely internal optimizations.
They are literally just "preferred sizes".

The correct thing to do is always that

   * Returns >= nr_cpu_ids if no cpus set.

because nr_cpu_ids is always the *smallest* of the access sizes.

That's exactly why it's a ">=". The CPU mask stuff has always
historically potentially used a different size than the actual
nr_cpu_ids, in that it could do word-sized scans even when the machine
might only have a smaller set of CPUs.

So the whole "small" vs "large" should be seen entirely internal to
cpumask.h. We should not expose it outside (sadly, that already
happened with "nr_cpumask_size", which also was that kind of thing.

So no, this patch is wrong. If anything, the comments should be strengthened.

Of course, right now Guenter seems to be reporting a problem with that
optimization, so unless I figure out what is going on I'll just need
to revert it anyway.

                Linus

                Linus
Linus Torvalds March 6, 2023, 5:47 p.m. UTC | #5
On Mon, Mar 6, 2023 at 9:29 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> The correct thing to do is always that
>
>    * Returns >= nr_cpu_ids if no cpus set.
>
> because nr_cpu_ids is always the *smallest* of the access sizes.
>
> Of course, right now Guenter seems to be reporting a problem with that
> optimization, so unless I figure out what is going on I'll just need
> to revert it anyway.

Ahh. And the reason is exactly that people do *not* follow that
"Returns >= nr_cpu_ids" rule.

The drivers/char/random.c code is very wrong, and does

             if (cpu == nr_cpumask_bits)
                             cpu = cpumask_first(&timer_cpus);

which fails miserably exactly because it doesn't use ">=".

Oh well.

I'll have to look for more of this pattern, but basically all those
"xyz_cpumask_bits" things were supposed to always be just internal to
that header file implementation, which is *exactly* why you have to
check the result for ">= nr_cpu_ids".

       Linus
Linus Torvalds March 6, 2023, 6:02 p.m. UTC | #6
On Mon, Mar 6, 2023 at 9:47 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> The drivers/char/random.c code is very wrong, and does
>
>              if (cpu == nr_cpumask_bits)
>                              cpu = cpumask_first(&timer_cpus);
>
> which fails miserably exactly because it doesn't use ">=".

Turns out this "cpu == nr_cpumask_bits" pattern exists in a couple of
other places too.

It was always wrong, but it always just happened to work. The lpfc
SCSI driver in particular seems to *love* this pattern:

        start_cpu = cpumask_next(new_cpu, cpu_present_mask);
        if (start_cpu == nr_cpumask_bits)
                start_cpu = first_cpu;

and has repeated it multiple times, all incorrect.

We do have "cpumask_next_wrap()", and that *seems* to be what the lpcf
driver actually wants to do.

.. and then we have kernel/sched/fair.c, which is actually not buggy,
just odd. It uses nr_cpumask_bits too, but it uses it purely for its
own internal nefarious reasons - it's not actually related to the
cpumask functions at all, its just used as a "not valid CPU number".

I think that scheduler use is still very *wrong*, but it doesn't look
actively buggy.

The other cases all look very buggy indeed, but yes, they happened to
work, and now they don't. So commit 596ff4a09b89 ("cpumask:
re-introduce constant-sized cpumask optimizations") did break them.

I'd rather fix these bad users than revert, but there does seem to be
an alarming number of these things, which worries me:

     git grep '== nr_cpumask_bits'

and that's just checking for this *exact* thing.

                Linus
diff mbox series

Patch

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 8fbe76607965..248bdb1c50dc 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -155,7 +155,7 @@  static __always_inline unsigned int cpumask_check(unsigned int cpu)
  * cpumask_first - get the first cpu in a cpumask
  * @srcp: the cpumask pointer
  *
- * Returns >= nr_cpu_ids if no cpus set.
+ * Returns >= small_cpumask_bits if no cpus set.
  */
 static inline unsigned int cpumask_first(const struct cpumask *srcp)
 {
@@ -166,7 +166,7 @@  static inline unsigned int cpumask_first(const struct cpumask *srcp)
  * cpumask_first_zero - get the first unset cpu in a cpumask
  * @srcp: the cpumask pointer
  *
- * Returns >= nr_cpu_ids if all cpus are set.
+ * Returns >= small_cpumask_bits if all cpus are set.
  */
 static inline unsigned int cpumask_first_zero(const struct cpumask *srcp)
 {
@@ -178,7 +178,7 @@  static inline unsigned int cpumask_first_zero(const struct cpumask *srcp)
  * @src1p: the first input
  * @src2p: the second input
  *
- * Returns >= nr_cpu_ids if no cpus set in both.  See also cpumask_next_and().
+ * Returns >= small_cpumask_bits if no cpus set in both.  See also cpumask_next_and().
  */
 static inline
 unsigned int cpumask_first_and(const struct cpumask *srcp1, const struct cpumask *srcp2)
@@ -190,7 +190,7 @@  unsigned int cpumask_first_and(const struct cpumask *srcp1, const struct cpumask
  * cpumask_last - get the last CPU in a cpumask
  * @srcp:	- the cpumask pointer
  *
- * Returns	>= nr_cpumask_bits if no CPUs set.
+ * Returns	>= small_cpumask_bits if no CPUs set.
  */
 static inline unsigned int cpumask_last(const struct cpumask *srcp)
 {
@@ -202,7 +202,7 @@  static inline unsigned int cpumask_last(const struct cpumask *srcp)
  * @n: the cpu prior to the place to search (ie. return will be > @n)
  * @srcp: the cpumask pointer
  *
- * Returns >= nr_cpu_ids if no further cpus set.
+ * Returns >= small_cpumask_bits if no further cpus set.
  */
 static inline
 unsigned int cpumask_next(int n, const struct cpumask *srcp)
@@ -218,7 +218,7 @@  unsigned int cpumask_next(int n, const struct cpumask *srcp)
  * @n: the cpu prior to the place to search (ie. return will be > @n)
  * @srcp: the cpumask pointer
  *
- * Returns >= nr_cpu_ids if no further cpus unset.
+ * Returns >= small_cpumask_bits if no further cpus unset.
  */
 static inline unsigned int cpumask_next_zero(int n, const struct cpumask *srcp)
 {
@@ -258,7 +258,7 @@  unsigned int cpumask_any_distribute(const struct cpumask *srcp);
  * @src1p: the first cpumask pointer
  * @src2p: the second cpumask pointer
  *
- * Returns >= nr_cpu_ids if no further cpus set in both.
+ * Returns >= small_cpumask_bits if no further cpus set in both.
  */
 static inline
 unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
@@ -276,7 +276,7 @@  unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
  * @cpu: the (optionally unsigned) integer iterator
  * @mask: the cpumask pointer
  *
- * After the loop, cpu is >= nr_cpu_ids.
+ * After the loop, cpu is >= small_cpumask_bits.
  */
 #define for_each_cpu(cpu, mask)				\
 	for_each_set_bit(cpu, cpumask_bits(mask), small_cpumask_bits)
@@ -310,7 +310,7 @@  unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
  *
  * The implementation does not assume any bit in @mask is set (including @start).
  *
- * After the loop, cpu is >= nr_cpu_ids.
+ * After the loop, cpu is >= small_cpumask_bits.
  */
 #define for_each_cpu_wrap(cpu, mask, start)				\
 	for_each_set_bit_wrap(cpu, cpumask_bits(mask), small_cpumask_bits, start)
@@ -327,7 +327,7 @@  unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
  *	for_each_cpu(cpu, &tmp)
  *		...
  *
- * After the loop, cpu is >= nr_cpu_ids.
+ * After the loop, cpu is >= small_cpumask_bits.
  */
 #define for_each_cpu_and(cpu, mask1, mask2)				\
 	for_each_and_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
@@ -345,7 +345,7 @@  unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
  *	for_each_cpu(cpu, &tmp)
  *		...
  *
- * After the loop, cpu is >= nr_cpu_ids.
+ * After the loop, cpu is >= small_cpumask_bits.
  */
 #define for_each_cpu_andnot(cpu, mask1, mask2)				\
 	for_each_andnot_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpumask_bits)
@@ -375,7 +375,7 @@  unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
  * @srcp: the cpumask pointer
  * @cpu: the N'th cpu to find, starting from 0
  *
- * Returns >= nr_cpu_ids if such cpu doesn't exist.
+ * Returns >= small_cpumask_bits if such cpu doesn't exist.
  */
 static inline unsigned int cpumask_nth(unsigned int cpu, const struct cpumask *srcp)
 {
@@ -388,7 +388,7 @@  static inline unsigned int cpumask_nth(unsigned int cpu, const struct cpumask *s
  * @srcp2: the cpumask pointer
  * @cpu: the N'th cpu to find, starting from 0
  *
- * Returns >= nr_cpu_ids if such cpu doesn't exist.
+ * Returns >= small_cpumask_bits if such cpu doesn't exist.
  */
 static inline
 unsigned int cpumask_nth_and(unsigned int cpu, const struct cpumask *srcp1,
@@ -404,7 +404,7 @@  unsigned int cpumask_nth_and(unsigned int cpu, const struct cpumask *srcp1,
  * @srcp2: the cpumask pointer
  * @cpu: the N'th cpu to find, starting from 0
  *
- * Returns >= nr_cpu_ids if such cpu doesn't exist.
+ * Returns >= small_cpumask_bits if such cpu doesn't exist.
  */
 static inline
 unsigned int cpumask_nth_andnot(unsigned int cpu, const struct cpumask *srcp1,
@@ -421,7 +421,7 @@  unsigned int cpumask_nth_andnot(unsigned int cpu, const struct cpumask *srcp1,
  * @srcp3: the cpumask pointer
  * @cpu: the N'th cpu to find, starting from 0
  *
- * Returns >= nr_cpu_ids if such cpu doesn't exist.
+ * Returns >= small_cpumask_bits if such cpu doesn't exist.
  */
 static __always_inline
 unsigned int cpumask_nth_and_andnot(unsigned int cpu, const struct cpumask *srcp1,
@@ -529,7 +529,7 @@  static inline void cpumask_setall(struct cpumask *dstp)
 }
 
 /**
- * cpumask_clear - clear all cpus (< nr_cpu_ids) in a cpumask
+ * cpumask_clear - clear all cpus (< large_cpumask_bits) in a cpumask
  * @dstp: the cpumask pointer
  */
 static inline void cpumask_clear(struct cpumask *dstp)
@@ -650,7 +650,7 @@  static inline bool cpumask_subset(const struct cpumask *src1p,
 
 /**
  * cpumask_empty - *srcp == 0
- * @srcp: the cpumask to that all cpus < nr_cpu_ids are clear.
+ * @srcp: the cpumask to that all cpus < small_cpumask_bits are clear.
  */
 static inline bool cpumask_empty(const struct cpumask *srcp)
 {
@@ -659,7 +659,7 @@  static inline bool cpumask_empty(const struct cpumask *srcp)
 
 /**
  * cpumask_full - *srcp == 0xFFFFFFFF...
- * @srcp: the cpumask to that all cpus < nr_cpu_ids are set.
+ * @srcp: the cpumask to that all cpus < nr_cpumask_bits are set.
  */
 static inline bool cpumask_full(const struct cpumask *srcp)
 {
@@ -668,7 +668,7 @@  static inline bool cpumask_full(const struct cpumask *srcp)
 
 /**
  * cpumask_weight - Count of bits in *srcp
- * @srcp: the cpumask to count bits (< nr_cpu_ids) in.
+ * @srcp: the cpumask to count bits (< small_cpumask_bits) in.
  */
 static inline unsigned int cpumask_weight(const struct cpumask *srcp)
 {
@@ -677,8 +677,8 @@  static inline unsigned int cpumask_weight(const struct cpumask *srcp)
 
 /**
  * cpumask_weight_and - Count of bits in (*srcp1 & *srcp2)
- * @srcp1: the cpumask to count bits (< nr_cpu_ids) in.
- * @srcp2: the cpumask to count bits (< nr_cpu_ids) in.
+ * @srcp1: the cpumask to count bits (< small_cpumask_bits) in.
+ * @srcp2: the cpumask to count bits (< small_cpumask_bits) in.
  */
 static inline unsigned int cpumask_weight_and(const struct cpumask *srcp1,
 						const struct cpumask *srcp2)
@@ -727,7 +727,7 @@  static inline void cpumask_copy(struct cpumask *dstp,
  * cpumask_any - pick a "random" cpu from *srcp
  * @srcp: the input cpumask
  *
- * Returns >= nr_cpu_ids if no cpus set.
+ * Returns >= small_cpumask_bits if no cpus set.
  */
 #define cpumask_any(srcp) cpumask_first(srcp)
 
@@ -736,7 +736,7 @@  static inline void cpumask_copy(struct cpumask *dstp,
  * @mask1: the first input cpumask
  * @mask2: the second input cpumask
  *
- * Returns >= nr_cpu_ids if no cpus set.
+ * Returns >= small_cpumask_bits if no cpus set.
  */
 #define cpumask_any_and(mask1, mask2) cpumask_first_and((mask1), (mask2))