[v2] watchdog: wdat_wdt: Set the min and max timeout values properly

Message ID 20220806081524.5636461a@endymion.delvare
State New

Commit Message

Jean Delvare Aug. 6, 2022, 6:15 a.m. UTC
The wdat_wdt driver is misusing the min_hw_heartbeat_ms field. This
field should only be used when the hardware watchdog device should not
be pinged more frequently than a specific period. The ACPI WDAT
"Minimum Count" field, on the other hand, specifies the minimum
timeout value that can be set. This corresponds to the min_timeout
field in Linux's watchdog infrastructure.

Setting min_hw_heartbeat_ms instead can cause pings to the hardware
to be delayed when there is no reason for that, eventually leading to
unexpected firing of the watchdog timer (and thus unexpected reboot).
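
For illustration, the deferral described above can be modeled in plain C. This is a userspace sketch, not the watchdog core's code: the function name ping_delay_ms() and the plain millisecond timestamps are assumptions made for the example, and the real core uses kernel time types and a timer instead.

```c
#include <stdint.h>

/*
 * Hypothetical model: how long a ping would be held back when
 * min_hw_heartbeat_ms is set. All values are in milliseconds.
 */
static inline uint64_t ping_delay_ms(uint64_t now_ms,
				     uint64_t last_hw_ping_ms,
				     uint32_t min_hw_heartbeat_ms)
{
	/* Earliest point in time at which the hardware may be pinged again. */
	uint64_t earliest_ms = last_hw_ping_ms + min_hw_heartbeat_ms;

	/* No delay once that point has passed, otherwise wait for it. */
	return earliest_ms > now_ms ? earliest_ms - now_ms : 0;
}
```

With a made-up min_hw_heartbeat_ms of 600, a ping requested 100 ms after the previous hardware ping would be held back for another 500 ms in this model.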

I'm also changing max_hw_heartbeat_ms to max_timeout for symmetry,
although the use of this one isn't fundamentally wrong, but there is
also no reason to enable the software-driven ping mechanism for the
wdat_wdt driver.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: 058dfc767008 ("ACPI / watchdog: Add support for WDAT hardware watchdog")
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
Changes since v1:
 * Fix a stupid typo which broke the build. Apparently I shouldn't be
   sending out patches after midnight, sorry.

 drivers/watchdog/wdat_wdt.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
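
As a rough illustration of the unit conversion this patch performs, the sketch below turns a WDAT timer period (in milliseconds) and a count into whole seconds. The helper names are hypothetical and exist only for this example; the driver itself uses DIV_ROUND_UP() for the minimum and a plain integer division for the maximum. The minimum is rounded up and the maximum rounded down so that both stay within what the table allows.

```c
#include <stdint.h>

/* Hypothetical helper: smallest whole-second timeout >= period_ms * min_count. */
static inline uint64_t wdat_min_timeout_s(uint32_t period_ms, uint32_t min_count)
{
	uint64_t ms = (uint64_t)period_ms * min_count;

	/* Round up, in the spirit of the kernel's DIV_ROUND_UP(). */
	return (ms + 999) / 1000;
}

/* Hypothetical helper: largest whole-second timeout <= period_ms * max_count. */
static inline uint64_t wdat_max_timeout_s(uint32_t period_ms, uint32_t max_count)
{
	uint64_t ms = (uint64_t)period_ms * max_count;

	/* Integer division truncates, i.e. rounds down. */
	return ms / 1000;
}
```

For a made-up table with a 600 ms period, a minimum count of 1 and a maximum count of 1023, this gives a minimum timeout of 1 second and a maximum timeout of 613 seconds.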

Comments

Guenter Roeck Aug. 8, 2022, 11:42 a.m. UTC | #1
On 8/5/22 23:15, Jean Delvare wrote:
> The wdat_wdt driver is misusing the min_hw_heartbeat_ms field. This
> field should only be used when the hardware watchdog device should not
> be pinged more frequently than a specific period. The ACPI WDAT
> "Minimum Count" field, on the other hand, specifies the minimum
> timeout value that can be set. This corresponds to the min_timeout
> field in Linux's watchdog infrastructure.
> 
> Setting min_hw_heartbeat_ms instead can cause pings to the hardware
> to be delayed when there is no reason for that, eventually leading to
> unexpected firing of the watchdog timer (and thus unexpected reboot).
> 
> I'm also changing max_hw_heartbeat_ms to max_timeout for symmetry,
> although the use of this one isn't fundamentally wrong, but there is
> also no reason to enable the software-driven ping mechanism for the
> wdat_wdt driver.
> 

I dislike this part because it changes behavior and is unrelated to
the problem at hand, but I assume Mika knows the actual hardware limits
and understands the implications (ie that there is indeed no need to
enable the software-driven ping mechanism).

> Signed-off-by: Jean Delvare <jdelvare@suse.de>
> Fixes: 058dfc767008 ("ACPI / watchdog: Add support for WDAT hardware watchdog")
> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Reviewed-by: Guenter Roeck <linux@roeck-us.net>

> ---
> Changes since v1:
>   * Fix a stupid typo which broke the build. Apparently I shouldn't be
>     sending out patches after midnight, sorry.
> 
>   drivers/watchdog/wdat_wdt.c |    8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> --- linux-5.18.orig/drivers/watchdog/wdat_wdt.c	2022-07-27 07:32:33.336928967 +0200
> +++ linux-5.18/drivers/watchdog/wdat_wdt.c	2022-08-06 08:09:49.235935543 +0200
> @@ -342,8 +342,8 @@ static int wdat_wdt_probe(struct platfor
>   		return -EINVAL;
>   
>   	wdat->period = tbl->timer_period;
> -	wdat->wdd.min_hw_heartbeat_ms = wdat->period * tbl->min_count;
> -	wdat->wdd.max_hw_heartbeat_ms = wdat->period * tbl->max_count;
> +	wdat->wdd.min_timeout = DIV_ROUND_UP(wdat->period * tbl->min_count, 1000);
> +	wdat->wdd.max_timeout = wdat->period * tbl->max_count / 1000;
>   	wdat->stopped_in_sleep = tbl->flags & ACPI_WDAT_STOPPED;
>   	wdat->wdd.info = &wdat_wdt_info;
>   	wdat->wdd.ops = &wdat_wdt_ops;
> @@ -450,8 +450,8 @@ static int wdat_wdt_probe(struct platfor
>   	 * watchdog properly after it has opened the device. In some cases
>   	 * the BIOS default is too short and causes immediate reboot.
>   	 */
> -	if (timeout * 1000 < wdat->wdd.min_hw_heartbeat_ms ||
> -	    timeout * 1000 > wdat->wdd.max_hw_heartbeat_ms) {
> +	if (timeout < wdat->wdd.min_timeout ||
> +	    timeout > wdat->wdd.max_timeout) {
>   		dev_warn(dev, "Invalid timeout %d given, using %d\n",
>   			 timeout, WDAT_DEFAULT_TIMEOUT);
>   		timeout = WDAT_DEFAULT_TIMEOUT;
> 
>
Jean Delvare Aug. 23, 2022, 1 p.m. UTC | #2
Hi all,

On Sat, 6 Aug 2022 08:15:24 +0200, Jean Delvare wrote:
> The wdat_wdt driver is misusing the min_hw_heartbeat_ms field. This
> field should only be used when the hardware watchdog device should not
> be pinged more frequently than a specific period. The ACPI WDAT
> "Minimum Count" field, on the other hand, specifies the minimum
> timeout value that can be set. This corresponds to the min_timeout
> field in Linux's watchdog infrastructure.
> 
> Setting min_hw_heartbeat_ms instead can cause pings to the hardware
> to be delayed when there is no reason for that, eventually leading to
> unexpected firing of the watchdog timer (and thus unexpected reboot).
> (...)

This patch no longer applies as it conflicts with:

commit 6d72c7ac9fbe26a77800676507da980436b40b2f
Author: Liu Xinpeng
Date:   Tue Apr 26 22:53:28 2022 +0800

which made it into kernel v5.19.

Having reviewed the commit mentioned above, I must say I'm skeptical. I
can't see how setting min_timeout to 1 arbitrarily has been considered
a good thing. This allows setting timeout values lower than the ACPI
WDAT "Minimum Count" field, while presumably the hardware does not
support such short timeouts.

Furthermore, calling watchdog_timeout_invalid() to validate the timeout
value is a good idea in principle. However, given that min_timeout is
now 1 and max_hw_heartbeat_ms is defined, the function is no longer
checking much.

My understanding is that the original code was checking the right
limits (from the WDAT table's perspective) using the wrong fields (from
the watchdog core's perspective). This fix from Liu is not really fixing
the problem (min_hw_heartbeat_ms and max_hw_heartbeat_ms are still set,
which enables watchdog core facilities that the driver doesn't need
IMHO) and is adding a new problem (the timeout limits defined in the
ACPI WDAT table are no longer checked).

I will rebase my patch on top and address both problems.
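
The point about watchdog_timeout_invalid() made above can be checked with a small userspace model. This is a sketch under assumptions: struct wdd_model is a made-up stand-in for the few struct watchdog_device fields involved, and timeout_invalid() mirrors the helper's logic as found in include/linux/watchdog.h at the time of writing, so it should be compared against the actual tree before relying on it.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant struct watchdog_device fields. */
struct wdd_model {
	unsigned int min_timeout;		/* seconds */
	unsigned int max_timeout;		/* seconds, 0 if unset */
	unsigned int max_hw_heartbeat_ms;	/* milliseconds, 0 if unset */
};

/*
 * Model of the check: max_timeout is only enforced when
 * max_hw_heartbeat_ms is not set.
 */
static inline bool timeout_invalid(const struct wdd_model *wdd, unsigned int t)
{
	return t > UINT_MAX / 1000 || t < wdd->min_timeout ||
	       (!wdd->max_hw_heartbeat_ms && wdd->max_timeout &&
		t > wdd->max_timeout);
}
```

In this model, a device with min_timeout set to 1 and max_hw_heartbeat_ms set only rejects a timeout of 0 and values above UINT_MAX / 1000, whereas a device with min_timeout and max_timeout derived from the WDAT table (and no max_hw_heartbeat_ms) rejects anything outside that range.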

Patch

--- linux-5.18.orig/drivers/watchdog/wdat_wdt.c	2022-07-27 07:32:33.336928967 +0200
+++ linux-5.18/drivers/watchdog/wdat_wdt.c	2022-08-06 08:09:49.235935543 +0200
@@ -342,8 +342,8 @@  static int wdat_wdt_probe(struct platfor
 		return -EINVAL;
 
 	wdat->period = tbl->timer_period;
-	wdat->wdd.min_hw_heartbeat_ms = wdat->period * tbl->min_count;
-	wdat->wdd.max_hw_heartbeat_ms = wdat->period * tbl->max_count;
+	wdat->wdd.min_timeout = DIV_ROUND_UP(wdat->period * tbl->min_count, 1000);
+	wdat->wdd.max_timeout = wdat->period * tbl->max_count / 1000;
 	wdat->stopped_in_sleep = tbl->flags & ACPI_WDAT_STOPPED;
 	wdat->wdd.info = &wdat_wdt_info;
 	wdat->wdd.ops = &wdat_wdt_ops;
@@ -450,8 +450,8 @@  static int wdat_wdt_probe(struct platfor
 	 * watchdog properly after it has opened the device. In some cases
 	 * the BIOS default is too short and causes immediate reboot.
 	 */
-	if (timeout * 1000 < wdat->wdd.min_hw_heartbeat_ms ||
-	    timeout * 1000 > wdat->wdd.max_hw_heartbeat_ms) {
+	if (timeout < wdat->wdd.min_timeout ||
+	    timeout > wdat->wdd.max_timeout) {
 		dev_warn(dev, "Invalid timeout %d given, using %d\n",
 			 timeout, WDAT_DEFAULT_TIMEOUT);
 		timeout = WDAT_DEFAULT_TIMEOUT;