diff mbox series

rcu: Make rcu_normal_after_boot writable on RT

Message ID 20210805080123.16320-1-juri.lelli@redhat.com
State New
Headers show
Series rcu: Make rcu_normal_after_boot writable on RT | expand

Commit Message

Juri Lelli Aug. 5, 2021, 8:01 a.m. UTC
Certain configurations (e.g., systems that make heavy use of netns)
need to use synchronize_rcu_expedited() to service RCU grace periods
even after boot.

Even though synchronize_rcu_expedited() has been traditionally
considered harmful for RT for the heavy use of IPIs, it is perfectly
usable under certain conditions (e.g. nohz_full).

Make rcupdate.rcu_normal_after_boot= again writeable on RT, but keep
its default value to 1 (enabled) to avoid regressions. Users who need
synchronize_rcu_expedited() will boot with rcupdate.rcu_normal_after_
boot=0 in the kernel cmdline.

Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
---
 kernel/rcu/update.c | 2 --
 1 file changed, 2 deletions(-)

Comments

Paul E. McKenney Aug. 5, 2021, 4:03 p.m. UTC | #1
On Thu, Aug 05, 2021 at 10:01:23AM +0200, Juri Lelli wrote:
> Certain configurations (e.g., systems that make heavy use of netns)
> need to use synchronize_rcu_expedited() to service RCU grace periods
> even after boot.
> 
> Even though synchronize_rcu_expedited() has been traditionally
> considered harmful for RT for the heavy use of IPIs, it is perfectly
> usable under certain conditions (e.g. nohz_full).
> 
> Make rcupdate.rcu_normal_after_boot= again writeable on RT, but keep
> its default value to 1 (enabled) to avoid regressions. Users who need
> synchronize_rcu_expedited() will boot with rcupdate.rcu_normal_after_
> boot=0 in the kernel cmdline.
> 
> Signed-off-by: Juri Lelli <juri.lelli@redhat.com>

Makes sense to me!

But would another of the -rt people be willing to give an Acked-by?
For example, maybe they would prefer this kernel boot parameter to be
exposed only if (!PREEMPT_RT || NO_HZ_FULL).  Or are there !NO_HZ_FULL
situations where rcu_normal_after_boot makes sense?

							Thanx, Paul

> ---
>  kernel/rcu/update.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> index c21b38cc25e9..0fdbf937edac 100644
> --- a/kernel/rcu/update.c
> +++ b/kernel/rcu/update.c
> @@ -57,9 +57,7 @@
>  module_param(rcu_expedited, int, 0);
>  module_param(rcu_normal, int, 0);
>  static int rcu_normal_after_boot = IS_ENABLED(CONFIG_PREEMPT_RT);
> -#ifndef CONFIG_PREEMPT_RT
>  module_param(rcu_normal_after_boot, int, 0);
> -#endif
>  #endif /* #ifndef CONFIG_TINY_RCU */
>  
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
> -- 
> 2.31.1
>
Juri Lelli Aug. 6, 2021, 7:41 a.m. UTC | #2
Hi,

Thanks for reviewing Paul.

On 05/08/21 14:08, Paul E. McKenney wrote:
> On Thu, Aug 05, 2021 at 09:03:37AM -0700, Paul E. McKenney wrote:

> > On Thu, Aug 05, 2021 at 10:01:23AM +0200, Juri Lelli wrote:

> > > Certain configurations (e.g., systems that make heavy use of netns)

> > > need to use synchronize_rcu_expedited() to service RCU grace periods

> > > even after boot.

> > > 

> > > Even though synchronize_rcu_expedited() has been traditionally

> > > considered harmful for RT for the heavy use of IPIs, it is perfectly

> > > usable under certain conditions (e.g. nohz_full).

> > > 

> > > Make rcupdate.rcu_normal_after_boot= again writeable on RT, but keep

> > > its default value to 1 (enabled) to avoid regressions. Users who need

> > > synchronize_rcu_expedited() will boot with rcupdate.rcu_normal_after_

> > > boot=0 in the kernel cmdline.

> > > 

> > > Signed-off-by: Juri Lelli <juri.lelli@redhat.com>

> > 

> > Makes sense to me!

> > 

> > But would another of the -rt people be willing to give an Acked-by?

> > For example, maybe they would prefer this kernel boot parameter to be

> > exposed only if (!PREEMPT_RT || NO_HZ_FULL).  Or are there !NO_HZ_FULL

> > situations where rcu_normal_after_boot makes sense?

> 

> Ah, and this will also need to be reflected in the WARN_ON_ONCE()

> in synchronize_rcu_expedited_wait() in kernel/rcu/tree_exp.h.


Indeed. Will add the change as soon as I receive indication about your
first point.

Best,
Juri
Sebastian Andrzej Siewior Aug. 6, 2021, 8:04 a.m. UTC | #3
On 2021-08-05 09:03:37 [-0700], Paul E. McKenney wrote:
> Makes sense to me!

> 

> But would another of the -rt people be willing to give an Acked-by?

> For example, maybe they would prefer this kernel boot parameter to be

> exposed only if (!PREEMPT_RT || NO_HZ_FULL).  Or are there !NO_HZ_FULL

> situations where rcu_normal_after_boot makes sense?


Julia crafted that "rcu_normal_after_boot = 1" for RT after we had more
and more synchronize_rcu_expedited() users popping up. I would like to
keep that part (default value) since it good to have for most users.

I don't mind removing CONFIG_PREEMPT_RT part here if there are legitimate
use cases for using "rcu_normal_after_boot = 0".
Paul suggested initially to restrict that option for PREEMPT_RT and I
would follow here Paul's guidance to either remove it or restrict it to
NO_HZ_FULL in RT's case (as suggested).

> 							Thanx, Paul


Sebastian
Paul E. McKenney Aug. 6, 2021, 5:44 p.m. UTC | #4
On Fri, Aug 06, 2021 at 10:04:55AM +0200, Sebastian Andrzej Siewior wrote:
> On 2021-08-05 09:03:37 [-0700], Paul E. McKenney wrote:

> > Makes sense to me!

> > 

> > But would another of the -rt people be willing to give an Acked-by?

> > For example, maybe they would prefer this kernel boot parameter to be

> > exposed only if (!PREEMPT_RT || NO_HZ_FULL).  Or are there !NO_HZ_FULL

> > situations where rcu_normal_after_boot makes sense?

> 

> Julia crafted that "rcu_normal_after_boot = 1" for RT after we had more

> and more synchronize_rcu_expedited() users popping up. I would like to

> keep that part (default value) since it good to have for most users.

> 

> I don't mind removing CONFIG_PREEMPT_RT part here if there are legitimate

> use cases for using "rcu_normal_after_boot = 0".

> Paul suggested initially to restrict that option for PREEMPT_RT and I

> would follow here Paul's guidance to either remove it or restrict it to

> NO_HZ_FULL in RT's case (as suggested).


Given what I know now, I suggest the following:

o	Restrict the option to !PREEMPT_RT unless NO_HZ_FULL.
	Maybe "!defined(CONFIG_PREEMPT_RT) || defined(CONFIG_NO_HZ_FULL)".

	If there is some non-NO_HZ_FULL PREEMPT_RT configuration that
	tolerates expedited grace periods, this would need to change.

o	Change the permissions from "0" to "0444", if desired.	If you
	would rather not, I can do this in a follow-up patch.  (No idea
	why I let such an ugly serviceability issue through, but the
	previous pair of module_param() instances have the same problem.)

Anything I am missing?

							Thanx, Paul
Juri Lelli Aug. 9, 2021, 8:37 a.m. UTC | #5
On 06/08/21 10:44, Paul E. McKenney wrote:
> On Fri, Aug 06, 2021 at 10:04:55AM +0200, Sebastian Andrzej Siewior wrote:

> > On 2021-08-05 09:03:37 [-0700], Paul E. McKenney wrote:

> > > Makes sense to me!

> > > 

> > > But would another of the -rt people be willing to give an Acked-by?

> > > For example, maybe they would prefer this kernel boot parameter to be

> > > exposed only if (!PREEMPT_RT || NO_HZ_FULL).  Or are there !NO_HZ_FULL

> > > situations where rcu_normal_after_boot makes sense?

> > 

> > Julia crafted that "rcu_normal_after_boot = 1" for RT after we had more

> > and more synchronize_rcu_expedited() users popping up. I would like to

> > keep that part (default value) since it good to have for most users.

> > 

> > I don't mind removing CONFIG_PREEMPT_RT part here if there are legitimate

> > use cases for using "rcu_normal_after_boot = 0".

> > Paul suggested initially to restrict that option for PREEMPT_RT and I

> > would follow here Paul's guidance to either remove it or restrict it to

> > NO_HZ_FULL in RT's case (as suggested).

> 

> Given what I know now, I suggest the following:

> 

> o	Restrict the option to !PREEMPT_RT unless NO_HZ_FULL.

> 	Maybe "!defined(CONFIG_PREEMPT_RT) || defined(CONFIG_NO_HZ_FULL)".

> 

> 	If there is some non-NO_HZ_FULL PREEMPT_RT configuration that

> 	tolerates expedited grace periods, this would need to change.

> 

> o	Change the permissions from "0" to "0444", if desired.	If you

> 	would rather not, I can do this in a follow-up patch.  (No idea

> 	why I let such an ugly serviceability issue through, but the

> 	previous pair of module_param() instances have the same problem.)

> 

> Anything I am missing?


Not that I can think of right now. :)

Will implement your suggestions and submit v2 soon. Thank again to you
and Sebastian for the review!

Best,
Juri
diff mbox series

Patch

diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index c21b38cc25e9..0fdbf937edac 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -57,9 +57,7 @@ 
 module_param(rcu_expedited, int, 0);
 module_param(rcu_normal, int, 0);
 static int rcu_normal_after_boot = IS_ENABLED(CONFIG_PREEMPT_RT);
-#ifndef CONFIG_PREEMPT_RT
 module_param(rcu_normal_after_boot, int, 0);
-#endif
 #endif /* #ifndef CONFIG_TINY_RCU */
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC