Message ID | 20220713154729.80789-5-ldufour@linux.ibm.com |
---|---|
State | New |
Series | Extending NMI watchdog during LPM |
Hi Laurent,

On 7/13/22 08:47, Laurent Dufour wrote:
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index ddccd1077462..d73faa619c15 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -592,6 +592,18 @@ to the guest kernel command line (see
>  Documentation/admin-guide/kernel-parameters.rst).
>
>
> +nmi_wd_lpm_factor (PPC only)
> +============================
> +
> +Factor apply to the NMI watchdog timeout (only when ``nmi_watchdog`` is

Factor to apply to

> +set to 1). This factor represents the percentage added to
> +``watchdog_thresh`` when calculating the NMI watchdog timeout during an
> +LPM. The soft lockup timeout is not impacted.
> +
> +A value of 0 means no change. The default value is 200 meaning the NMI
> +watchdog is set to 30s (based on ``watchdog_thresh`` equal to 10).
On 13/07/2022 at 22:17, Randy Dunlap wrote:
> Hi Laurent,
>
> On 7/13/22 08:47, Laurent Dufour wrote:
>> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
>> index ddccd1077462..d73faa619c15 100644
>> --- a/Documentation/admin-guide/sysctl/kernel.rst
>> +++ b/Documentation/admin-guide/sysctl/kernel.rst
>> @@ -592,6 +592,18 @@ to the guest kernel command line (see
>>  Documentation/admin-guide/kernel-parameters.rst).
>>
>>
>> +nmi_wd_lpm_factor (PPC only)
>> +============================
>> +
>> +Factor apply to the NMI watchdog timeout (only when ``nmi_watchdog`` is
>
> Factor to apply to

Thanks, Randy.

Michael, could you fix that when applying the series?

Cheers,
Laurent

>
>> +set to 1). This factor represents the percentage added to
>> +``watchdog_thresh`` when calculating the NMI watchdog timeout during an
>> +LPM. The soft lockup timeout is not impacted.
>> +
>> +A value of 0 means no change. The default value is 200 meaning the NMI
>> +watchdog is set to 30s (based on ``watchdog_thresh`` equal to 10).
>
Laurent Dufour <ldufour@linux.ibm.com> writes:
> On 13/07/2022 at 22:17, Randy Dunlap wrote:
>> Hi Laurent,
>>
>> On 7/13/22 08:47, Laurent Dufour wrote:
>>> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
>>> index ddccd1077462..d73faa619c15 100644
>>> --- a/Documentation/admin-guide/sysctl/kernel.rst
>>> +++ b/Documentation/admin-guide/sysctl/kernel.rst
>>> @@ -592,6 +592,18 @@ to the guest kernel command line (see
>>>  Documentation/admin-guide/kernel-parameters.rst).
>>>
>>>
>>> +nmi_wd_lpm_factor (PPC only)
>>> +============================
>>> +
>>> +Factor apply to the NMI watchdog timeout (only when ``nmi_watchdog`` is
>>
>> Factor to apply to
>
> Thanks, Randy.
>
> Michael, could you fix that when applying the series?

Yes, I did.

cheers
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index ddccd1077462..d73faa619c15 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -592,6 +592,18 @@ to the guest kernel command line (see
 Documentation/admin-guide/kernel-parameters.rst).
 
 
+nmi_wd_lpm_factor (PPC only)
+============================
+
+Factor apply to the NMI watchdog timeout (only when ``nmi_watchdog`` is
+set to 1). This factor represents the percentage added to
+``watchdog_thresh`` when calculating the NMI watchdog timeout during an
+LPM. The soft lockup timeout is not impacted.
+
+A value of 0 means no change. The default value is 200 meaning the NMI
+watchdog is set to 30s (based on ``watchdog_thresh`` equal to 10).
+
+
 numa_balancing
 ==============
 
diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
index 6297467072e6..3d36a8955eaf 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -48,6 +48,39 @@ struct update_props_workarea {
 #define MIGRATION_SCOPE (1)
 #define PRRN_SCOPE -2
 
+#ifdef CONFIG_PPC_WATCHDOG
+static unsigned int nmi_wd_lpm_factor = 200;
+
+#ifdef CONFIG_SYSCTL
+static struct ctl_table nmi_wd_lpm_factor_ctl_table[] = {
+	{
+		.procname	= "nmi_wd_lpm_factor",
+		.data		= &nmi_wd_lpm_factor,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_douintvec_minmax,
+	},
+	{}
+};
+static struct ctl_table nmi_wd_lpm_factor_sysctl_root[] = {
+	{
+		.procname	= "kernel",
+		.mode		= 0555,
+		.child		= nmi_wd_lpm_factor_ctl_table,
+	},
+	{}
+};
+
+static int __init register_nmi_wd_lpm_factor_sysctl(void)
+{
+	register_sysctl_table(nmi_wd_lpm_factor_sysctl_root);
+
+	return 0;
+}
+device_initcall(register_nmi_wd_lpm_factor_sysctl);
+#endif /* CONFIG_SYSCTL */
+#endif /* CONFIG_PPC_WATCHDOG */
+
 static int mobility_rtas_call(int token, char *buf, s32 scope)
 {
 	int rc;
@@ -702,13 +735,20 @@ static int pseries_suspend(u64 handle)
 static int pseries_migrate_partition(u64 handle)
 {
 	int ret;
+	unsigned int factor = 0;
 
+#ifdef CONFIG_PPC_WATCHDOG
+	factor = nmi_wd_lpm_factor;
+#endif
 	ret = wait_for_vasi_session_suspending(handle);
 	if (ret)
 		return ret;
 
 	vas_migration_handler(VAS_SUSPEND);
 
+	if (factor)
+		watchdog_nmi_set_timeout_pct(factor);
+
 	ret = pseries_suspend(handle);
 	if (ret == 0) {
 		post_mobility_fixup();
@@ -722,6 +762,9 @@ static int pseries_migrate_partition(u64 handle)
 	} else
 		pseries_cancel_migration(handle, ret);
 
+	if (factor)
+		watchdog_nmi_set_timeout_pct(0);
+
 	vas_migration_handler(VAS_RESUME);
 
 	return ret;
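
The documentation hunk describes the factor as a percentage added to ``watchdog_thresh``, with 200 giving 30s when ``watchdog_thresh`` is 10. A minimal sketch of that arithmetic follows. The helper name nmi_wd_effective_timeout() is hypothetical and does not appear in the patch, and the formula is an assumption read from the documentation text rather than from the body of watchdog_nmi_set_timeout_pct(), which this patch does not show.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: effective NMI watchdog
 * timeout in seconds during an LPM. factor_pct is the percentage added
 * to thresh (the value of watchdog_thresh), as the documentation hunk
 * describes it.
 */
static inline uint64_t nmi_wd_effective_timeout(uint64_t thresh,
						uint64_t factor_pct)
{
	/* A factor of 0 adds nothing, so the threshold is unchanged. */
	return thresh + thresh * factor_pct / 100;
}
```

With ``watchdog_thresh`` equal to 10, a factor of 200 yields 10 + 20 = 30 seconds and a factor of 0 yields 10 seconds, matching the two cases the documentation states.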