diff mbox series

[1/2] watchdog: introduce watchdog_dev_suspend/resume

Message ID 20210615123904.2568052-2-grzegorz.jaszczyk@linaro.org
State New
Headers show
Series introduce watchdog_dev_suspend/resume | expand

Commit Message

Grzegorz Jaszczyk June 15, 2021, 12:39 p.m. UTC
The watchdog drivers often disable wdog clock during suspend and then
enable it again during resume. Nevertheless the ping worker is still
running and can issue low-level ping while the wdog clock is disabled
causing the system hang. To prevent such condition introduce
watchdog_dev_suspend/resume which can be used by any wdog driver and
actually cancel ping worker during suspend and restore it back, if
needed, during resume.

Signed-off-by: Grzegorz Jaszczyk <grzegorz.jaszczyk@linaro.org>

---
 drivers/watchdog/watchdog_dev.c | 49 +++++++++++++++++++++++++++++++++
 include/linux/watchdog.h        |  2 ++
 2 files changed, 51 insertions(+)

-- 
2.29.0

Comments

Guenter Roeck June 15, 2021, 2:18 p.m. UTC | #1
On Tue, Jun 15, 2021 at 02:39:03PM +0200, Grzegorz Jaszczyk wrote:
> The watchdog drivers often disable wdog clock during suspend and then

> enable it again during resume. Nevertheless the ping worker is still

> running and can issue low-level ping while the wdog clock is disabled

> causing the system hang. To prevent such condition introduce

> watchdog_dev_suspend/resume which can be used by any wdog driver and

> actually cancel ping worker during suspend and restore it back, if

> needed, during resume.

> 


I'll have to look into this further, but I don't think this is the correct
solution. Most likely the watchdog core needs to have its own independent
suspend/resule functions and suspend the high resolution timer on
suspend and restore it on resume. This may require an additional flag
to be set by drivers to indicate that the timer should be stopped on
suspend.

> Signed-off-by: Grzegorz Jaszczyk <grzegorz.jaszczyk@linaro.org>

> ---

>  drivers/watchdog/watchdog_dev.c | 49 +++++++++++++++++++++++++++++++++

>  include/linux/watchdog.h        |  2 ++

>  2 files changed, 51 insertions(+)

> 

> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c

> index 2946f3a63110..3feca1567281 100644

> --- a/drivers/watchdog/watchdog_dev.c

> +++ b/drivers/watchdog/watchdog_dev.c

> @@ -1219,6 +1219,55 @@ void __exit watchdog_dev_exit(void)

>  	kthread_destroy_worker(watchdog_kworker);

>  }

>  

> +int watchdog_dev_suspend(struct watchdog_device *wdd)

> +{

> +	struct watchdog_core_data *wd_data = wdd->wd_data;

> +	int ret;

> +

> +	if (!wdd->wd_data)

> +		return -ENODEV;

> +

> +	/* ping for the last time before suspend */

> +	mutex_lock(&wd_data->lock);

> +	if (watchdog_worker_should_ping(wd_data))

> +		ret = __watchdog_ping(wd_data->wdd);

> +	mutex_unlock(&wd_data->lock);

> +

> +	if (ret)

> +		return ret;

> +

> +	/*

> +	 * make sure that watchdog worker will not kick in when the wdog is

> +	 * suspended

> +	 */

> +	hrtimer_cancel(&wd_data->timer);

> +	kthread_cancel_work_sync(&wd_data->work);

> +

> +	return 0;

> +}

> +EXPORT_SYMBOL_GPL(watchdog_dev_suspend);

> +

> +int watchdog_dev_resume(struct watchdog_device *wdd)

> +{

> +	struct watchdog_core_data *wd_data = wdd->wd_data;

> +	int ret;

> +

> +	if (!wdd->wd_data)

> +		return -ENODEV;

> +

> +	/*

> +	 * __watchdog_ping will also retrigger hrtimer and therefore restore the

> +	 * ping worker if needed.

> +	 */

> +	mutex_lock(&wd_data->lock);

> +	if (watchdog_worker_should_ping(wd_data))

> +		ret = __watchdog_ping(wd_data->wdd);

> +	mutex_unlock(&wd_data->lock);

> +

> +	return ret;

> +}

> +EXPORT_SYMBOL_GPL(watchdog_dev_resume);

> +

>  module_param(handle_boot_enabled, bool, 0444);

>  MODULE_PARM_DESC(handle_boot_enabled,

>  	"Watchdog core auto-updates boot enabled watchdogs before userspace takes over (default="

> diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h

> index 9b19e6bb68b5..febfde3b1ff6 100644

> --- a/include/linux/watchdog.h

> +++ b/include/linux/watchdog.h

> @@ -209,6 +209,8 @@ extern int watchdog_init_timeout(struct watchdog_device *wdd,

>  				  unsigned int timeout_parm, struct device *dev);

>  extern int watchdog_register_device(struct watchdog_device *);

>  extern void watchdog_unregister_device(struct watchdog_device *);

> +int watchdog_dev_suspend(struct watchdog_device *wdd);

> +int watchdog_dev_resume(struct watchdog_device *wdd);

>  

>  int watchdog_set_last_hw_keepalive(struct watchdog_device *, unsigned int);

>  

> -- 

> 2.29.0

>
Grzegorz Jaszczyk June 16, 2021, 1:59 p.m. UTC | #2
On Tue, 15 Jun 2021 at 16:18, Guenter Roeck <linux@roeck-us.net> wrote:
>

> On Tue, Jun 15, 2021 at 02:39:03PM +0200, Grzegorz Jaszczyk wrote:

> > The watchdog drivers often disable wdog clock during suspend and then

> > enable it again during resume. Nevertheless the ping worker is still

> > running and can issue low-level ping while the wdog clock is disabled

> > causing the system hang. To prevent such condition introduce

> > watchdog_dev_suspend/resume which can be used by any wdog driver and

> > actually cancel ping worker during suspend and restore it back, if

> > needed, during resume.

> >

>

> I'll have to look into this further, but I don't think this is the correct

> solution. Most likely the watchdog core needs to have its own independent

> suspend/resule functions and suspend the high resolution timer on

> suspend and restore it on resume. This may require an additional flag

> to be set by drivers to indicate that the timer should be stopped on

> suspend.


That makes sense - thank you for your suggestion. I think I could
register a pm notifier in the watchdog core when the new e.g.
WDOG_STOP_PING_ON_SUSPEND status flag will be set by the driver and
actually call watchdog_dev_suspend/resume from the notifier callback.
Please let me know if you see any other issue with this solution, if
not I will post v2.

Thank you in advance,
Grzegorz




>

> > Signed-off-by: Grzegorz Jaszczyk <grzegorz.jaszczyk@linaro.org>

> > ---

> >  drivers/watchdog/watchdog_dev.c | 49 +++++++++++++++++++++++++++++++++

> >  include/linux/watchdog.h        |  2 ++

> >  2 files changed, 51 insertions(+)

> >

> > diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c

> > index 2946f3a63110..3feca1567281 100644

> > --- a/drivers/watchdog/watchdog_dev.c

> > +++ b/drivers/watchdog/watchdog_dev.c

> > @@ -1219,6 +1219,55 @@ void __exit watchdog_dev_exit(void)

> >       kthread_destroy_worker(watchdog_kworker);

> >  }

> >

> > +int watchdog_dev_suspend(struct watchdog_device *wdd)

> > +{

> > +     struct watchdog_core_data *wd_data = wdd->wd_data;

> > +     int ret;

> > +

> > +     if (!wdd->wd_data)

> > +             return -ENODEV;

> > +

> > +     /* ping for the last time before suspend */

> > +     mutex_lock(&wd_data->lock);

> > +     if (watchdog_worker_should_ping(wd_data))

> > +             ret = __watchdog_ping(wd_data->wdd);

> > +     mutex_unlock(&wd_data->lock);

> > +

> > +     if (ret)

> > +             return ret;

> > +

> > +     /*

> > +      * make sure that watchdog worker will not kick in when the wdog is

> > +      * suspended

> > +      */

> > +     hrtimer_cancel(&wd_data->timer);

> > +     kthread_cancel_work_sync(&wd_data->work);

> > +

> > +     return 0;

> > +}

> > +EXPORT_SYMBOL_GPL(watchdog_dev_suspend);

> > +

> > +int watchdog_dev_resume(struct watchdog_device *wdd)

> > +{

> > +     struct watchdog_core_data *wd_data = wdd->wd_data;

> > +     int ret;

> > +

> > +     if (!wdd->wd_data)

> > +             return -ENODEV;

> > +

> > +     /*

> > +      * __watchdog_ping will also retrigger hrtimer and therefore restore the

> > +      * ping worker if needed.

> > +      */

> > +     mutex_lock(&wd_data->lock);

> > +     if (watchdog_worker_should_ping(wd_data))

> > +             ret = __watchdog_ping(wd_data->wdd);

> > +     mutex_unlock(&wd_data->lock);

> > +

> > +     return ret;

> > +}

> > +EXPORT_SYMBOL_GPL(watchdog_dev_resume);

> > +

> >  module_param(handle_boot_enabled, bool, 0444);

> >  MODULE_PARM_DESC(handle_boot_enabled,

> >       "Watchdog core auto-updates boot enabled watchdogs before userspace takes over (default="

> > diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h

> > index 9b19e6bb68b5..febfde3b1ff6 100644

> > --- a/include/linux/watchdog.h

> > +++ b/include/linux/watchdog.h

> > @@ -209,6 +209,8 @@ extern int watchdog_init_timeout(struct watchdog_device *wdd,

> >                                 unsigned int timeout_parm, struct device *dev);

> >  extern int watchdog_register_device(struct watchdog_device *);

> >  extern void watchdog_unregister_device(struct watchdog_device *);

> > +int watchdog_dev_suspend(struct watchdog_device *wdd);

> > +int watchdog_dev_resume(struct watchdog_device *wdd);

> >

> >  int watchdog_set_last_hw_keepalive(struct watchdog_device *, unsigned int);

> >

> > --

> > 2.29.0

> >
Guenter Roeck June 16, 2021, 5:57 p.m. UTC | #3
On Wed, Jun 16, 2021 at 03:59:23PM +0200, Grzegorz Jaszczyk wrote:
> On Tue, 15 Jun 2021 at 16:18, Guenter Roeck <linux@roeck-us.net> wrote:

> >

> > On Tue, Jun 15, 2021 at 02:39:03PM +0200, Grzegorz Jaszczyk wrote:

> > > The watchdog drivers often disable wdog clock during suspend and then

> > > enable it again during resume. Nevertheless the ping worker is still

> > > running and can issue low-level ping while the wdog clock is disabled

> > > causing the system hang. To prevent such condition introduce

> > > watchdog_dev_suspend/resume which can be used by any wdog driver and

> > > actually cancel ping worker during suspend and restore it back, if

> > > needed, during resume.

> > >

> >

> > I'll have to look into this further, but I don't think this is the correct

> > solution. Most likely the watchdog core needs to have its own independent

> > suspend/resule functions and suspend the high resolution timer on

> > suspend and restore it on resume. This may require an additional flag

> > to be set by drivers to indicate that the timer should be stopped on

> > suspend.

> 

> That makes sense - thank you for your suggestion. I think I could

> register a pm notifier in the watchdog core when the new e.g.

> WDOG_STOP_PING_ON_SUSPEND status flag will be set by the driver and

> actually call watchdog_dev_suspend/resume from the notifier callback.

> Please let me know if you see any other issue with this solution, if

> not I will post v2.

> 

Go for it.

Thanks,
Guenter
kernel test robot June 16, 2021, 11:21 p.m. UTC | #4
Hi Grzegorz,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on hwmon/hwmon-next]
[also build test WARNING on soc/for-next linus/master v5.13-rc6 next-20210616]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Grzegorz-Jaszczyk/introduce-watchdog_dev_suspend-resume/20210616-201447
base:   https://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git hwmon-next
config: arm64-randconfig-r022-20210615 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 64720f57bea6a6bf033feef4a5751ab9c0c3b401)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm64 cross compiling tool for clang build
        # apt-get install binutils-aarch64-linux-gnu
        # https://github.com/0day-ci/linux/commit/71dadcad8da1862c1205b19ddf4279d494e57545
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Grzegorz-Jaszczyk/introduce-watchdog_dev_suspend-resume/20210616-201447
        git checkout 71dadcad8da1862c1205b19ddf4279d494e57545
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/watchdog/watchdog_dev.c:1232:2: warning: variable 'ret' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]

           if (watchdog_worker_should_ping(wd_data))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:56:28: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:58:30: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/watchdog/watchdog_dev.c:1236:6: note: uninitialized use occurs here
           if (ret)
               ^~~
   include/linux/compiler.h:56:47: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                                                 ^~~~
   include/linux/compiler.h:58:52: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                                      ^~~~
   drivers/watchdog/watchdog_dev.c:1232:2: note: remove the 'if' if its condition is always true
           if (watchdog_worker_should_ping(wd_data))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:56:23: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                         ^
   drivers/watchdog/watchdog_dev.c:1225:9: note: initialize the variable 'ret' to silence this warning
           int ret;
                  ^
                   = 0
   drivers/watchdog/watchdog_dev.c:1263:2: warning: variable 'ret' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
           if (watchdog_worker_should_ping(wd_data))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:56:28: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:58:30: note: expanded from macro '__trace_if_var'
   #define __trace_if_var(cond) (__builtin_constant_p(cond) ? (cond) : __trace_if_value(cond))
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/watchdog/watchdog_dev.c:1267:9: note: uninitialized use occurs here
           return ret;
                  ^~~
   drivers/watchdog/watchdog_dev.c:1263:2: note: remove the 'if' if its condition is always true
           if (watchdog_worker_should_ping(wd_data))
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/compiler.h:56:23: note: expanded from macro 'if'
   #define if(cond, ...) if ( __trace_if_var( !!(cond , ## __VA_ARGS__) ) )
                         ^
   drivers/watchdog/watchdog_dev.c:1253:9: note: initialize the variable 'ret' to silence this warning
           int ret;
                  ^
                   = 0
   2 warnings generated.


vim +1232 drivers/watchdog/watchdog_dev.c

  1221	
  1222	int watchdog_dev_suspend(struct watchdog_device *wdd)
  1223	{
  1224		struct watchdog_core_data *wd_data = wdd->wd_data;
  1225		int ret;
  1226	
  1227		if (!wdd->wd_data)
  1228			return -ENODEV;
  1229	
  1230		/* ping for the last time before suspend */
  1231		mutex_lock(&wd_data->lock);
> 1232		if (watchdog_worker_should_ping(wd_data))

  1233			ret = __watchdog_ping(wd_data->wdd);
  1234		mutex_unlock(&wd_data->lock);
  1235	
  1236		if (ret)
  1237			return ret;
  1238	
  1239		/*
  1240		 * make sure that watchdog worker will not kick in when the wdog is
  1241		 * suspended
  1242		 */
  1243		hrtimer_cancel(&wd_data->timer);
  1244		kthread_cancel_work_sync(&wd_data->work);
  1245	
  1246		return 0;
  1247	}
  1248	EXPORT_SYMBOL_GPL(watchdog_dev_suspend);
  1249	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
diff mbox series

Patch

diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 2946f3a63110..3feca1567281 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -1219,6 +1219,55 @@  void __exit watchdog_dev_exit(void)
 	kthread_destroy_worker(watchdog_kworker);
 }
 
+int watchdog_dev_suspend(struct watchdog_device *wdd)
+{
+	struct watchdog_core_data *wd_data = wdd->wd_data;
+	int ret;
+
+	if (!wdd->wd_data)
+		return -ENODEV;
+
+	/* ping for the last time before suspend */
+	mutex_lock(&wd_data->lock);
+	if (watchdog_worker_should_ping(wd_data))
+		ret = __watchdog_ping(wd_data->wdd);
+	mutex_unlock(&wd_data->lock);
+
+	if (ret)
+		return ret;
+
+	/*
+	 * make sure that watchdog worker will not kick in when the wdog is
+	 * suspended
+	 */
+	hrtimer_cancel(&wd_data->timer);
+	kthread_cancel_work_sync(&wd_data->work);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(watchdog_dev_suspend);
+
+int watchdog_dev_resume(struct watchdog_device *wdd)
+{
+	struct watchdog_core_data *wd_data = wdd->wd_data;
+	int ret;
+
+	if (!wdd->wd_data)
+		return -ENODEV;
+
+	/*
+	 * __watchdog_ping will also retrigger hrtimer and therefore restore the
+	 * ping worker if needed.
+	 */
+	mutex_lock(&wd_data->lock);
+	if (watchdog_worker_should_ping(wd_data))
+		ret = __watchdog_ping(wd_data->wdd);
+	mutex_unlock(&wd_data->lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(watchdog_dev_resume);
+
 module_param(handle_boot_enabled, bool, 0444);
 MODULE_PARM_DESC(handle_boot_enabled,
 	"Watchdog core auto-updates boot enabled watchdogs before userspace takes over (default="
diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
index 9b19e6bb68b5..febfde3b1ff6 100644
--- a/include/linux/watchdog.h
+++ b/include/linux/watchdog.h
@@ -209,6 +209,8 @@  extern int watchdog_init_timeout(struct watchdog_device *wdd,
 				  unsigned int timeout_parm, struct device *dev);
 extern int watchdog_register_device(struct watchdog_device *);
 extern void watchdog_unregister_device(struct watchdog_device *);
+int watchdog_dev_suspend(struct watchdog_device *wdd);
+int watchdog_dev_resume(struct watchdog_device *wdd);
 
 int watchdog_set_last_hw_keepalive(struct watchdog_device *, unsigned int);