diff mbox series

Bluetooth: hci_h5: Disable the hci_suspend_notifier for btrtl devices

Message ID 20210127215834.9367-1-hdegoede@redhat.com
State New
Headers show
Series Bluetooth: hci_h5: Disable the hci_suspend_notifier for btrtl devices | expand

Commit Message

Hans de Goede Jan. 27, 2021, 9:58 p.m. UTC
The hci_suspend_notifier which was introduced last year, is causing
problems for uart attached btrtl devices. These devices may loose their
firmware and their baudrate setting over a suspend/resume.

Since we don't even know the baudrate after a suspend/resume recovering
from this is tricky. The driver solves this by treating these devices
the same as USB BT HCIs which drop of the bus during suspend.

Specifically the driver:
1. Simply unconditionally turns the device fully off during
   system-suspend to save maximum power.
2. Calls device_reprobe() from a workqueue to fully re-init the device
   from scratch on system-resume (unregistering the old HCI and
   registering a new HCI).

This means that these devices do not benefit from the suspend / resume
handling work done by the hci_suspend_notifier. At best this unnecessarily
adds some time to the suspend/resume time.

But in practice this is actually causing problems:

1. These btrtl devices seem to not like the HCI_OP_WRITE_SCAN_ENABLE(
SCAN_DISABLED) request being send to them when entering the
BT_SUSPEND_CONFIGURE_WAKE state. The same request send on
BT_SUSPEND_DISCONNECT works fine, but the second one send (unnecessarily?)
from the BT_SUSPEND_CONFIGURE_WAKE transition causes the device to hang:

[  573.497754] PM: suspend entry (s2idle)
[  573.554615] Filesystems sync: 0.056 seconds
[  575.837753] Bluetooth: hci0: Timed out waiting for suspend events
[  575.837801] Bluetooth: hci0: Suspend timeout bit: 4
[  575.837925] Bluetooth: hci0: Suspend notifier action (3) failed: -110

2. The PM_POST_SUSPEND / BT_RUNNING transition races with the
driver-unbinding done by the device_reprobe() work.
If the hci_suspend_notifier wins the race it is talking to a dead
device leading to the following errors being logged:

[  598.686060] Bluetooth: hci0: Timed out waiting for suspend events
[  598.686124] Bluetooth: hci0: Suspend timeout bit: 5
[  598.686237] Bluetooth: hci0: Suspend notifier action (4) failed: -110

In both cases things still work, but the suspend-notifier is causing
these ugly errors getting logged and ut increase both the suspend- and
the resume-time by 2 seconds.

This commit works around these problems by disabling (unregistering)
the hci_suspend_notifier.

Note that any eventual hci_unregister_dev() will call
unregister_pm_notifier() a second time, this is fine it will
simply fail with -ENOENT and hci_unregister_dev() ignores the
return value.

Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Cc: Vasily Khoruzhick <anarsoul@gmail.com>
Cc: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 drivers/bluetooth/hci_h5.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Abhishek Pandit-Subedi Jan. 27, 2021, 11:58 p.m. UTC | #1
Hi Hans,

You can also implement hdev->prevent_wake to prevent configure wake
from running. hci_unregister_dev will also clear any tasks during
resume, unregister the pm notifier and cancel the suspend work so if
reprobe takes <2s (default suspend work timeout), you shouldn't see a
delay or any errors.

I'm not 100% confident that the reprobe worker and the pm notifier
won't race so I'm ok with this change as well. I would prefer if the
driver only set a quirk and the actual register/unregister was handled
entirely inside hci_core though.

Thanks,
Abhishek

On Wed, Jan 27, 2021 at 1:58 PM Hans de Goede <hdegoede@redhat.com> wrote:
>

> The hci_suspend_notifier which was introduced last year, is causing

> problems for uart attached btrtl devices. These devices may loose their

> firmware and their baudrate setting over a suspend/resume.

>

> Since we don't even know the baudrate after a suspend/resume recovering

> from this is tricky. The driver solves this by treating these devices

> the same as USB BT HCIs which drop of the bus during suspend.

>

> Specifically the driver:

> 1. Simply unconditionally turns the device fully off during

>    system-suspend to save maximum power.

> 2. Calls device_reprobe() from a workqueue to fully re-init the device

>    from scratch on system-resume (unregistering the old HCI and

>    registering a new HCI).

>

> This means that these devices do not benefit from the suspend / resume

> handling work done by the hci_suspend_notifier. At best this unnecessarily

> adds some time to the suspend/resume time.

>

> But in practice this is actually causing problems:

>

> 1. These btrtl devices seem to not like the HCI_OP_WRITE_SCAN_ENABLE(

> SCAN_DISABLED) request being send to them when entering the

> BT_SUSPEND_CONFIGURE_WAKE state. The same request send on

> BT_SUSPEND_DISCONNECT works fine, but the second one send (unnecessarily?)

> from the BT_SUSPEND_CONFIGURE_WAKE transition causes the device to hang:

>

> [  573.497754] PM: suspend entry (s2idle)

> [  573.554615] Filesystems sync: 0.056 seconds

> [  575.837753] Bluetooth: hci0: Timed out waiting for suspend events

> [  575.837801] Bluetooth: hci0: Suspend timeout bit: 4

> [  575.837925] Bluetooth: hci0: Suspend notifier action (3) failed: -110

>

> 2. The PM_POST_SUSPEND / BT_RUNNING transition races with the

> driver-unbinding done by the device_reprobe() work.

> If the hci_suspend_notifier wins the race it is talking to a dead

> device leading to the following errors being logged:

>

> [  598.686060] Bluetooth: hci0: Timed out waiting for suspend events

> [  598.686124] Bluetooth: hci0: Suspend timeout bit: 5

> [  598.686237] Bluetooth: hci0: Suspend notifier action (4) failed: -110

>

> In both cases things still work, but the suspend-notifier is causing

> these ugly errors getting logged and ut increase both the suspend- and

> the resume-time by 2 seconds.

>

> This commit works around these problems by disabling (unregistering)

> the hci_suspend_notifier.

>

> Note that any eventual hci_unregister_dev() will call

> unregister_pm_notifier() a second time, this is fine it will

> simply fail with -ENOENT and hci_unregister_dev() ignores the

> return value.

>

> Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>

> Cc: Vasily Khoruzhick <anarsoul@gmail.com>

> Cc: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>

> Signed-off-by: Hans de Goede <hdegoede@redhat.com>

> ---

>  drivers/bluetooth/hci_h5.c | 8 ++++++++

>  1 file changed, 8 insertions(+)

>

> diff --git a/drivers/bluetooth/hci_h5.c b/drivers/bluetooth/hci_h5.c

> index 7be16a7f653b..acbcc676d6c2 100644

> --- a/drivers/bluetooth/hci_h5.c

> +++ b/drivers/bluetooth/hci_h5.c

> @@ -14,6 +14,7 @@

>  #include <linux/of_device.h>

>  #include <linux/serdev.h>

>  #include <linux/skbuff.h>

> +#include <linux/suspend.h>

>

>  #include <net/bluetooth/bluetooth.h>

>  #include <net/bluetooth/hci_core.h>

> @@ -876,6 +877,13 @@ static int h5_btrtl_setup(struct h5 *h5)

>         bool flow_control;

>         int err;

>

> +       /*

> +        * Since h5_btrtl_resume() does a device_reprobe() the suspend handling

> +        * done by the hci_suspend_notifier is not necessary; it actually causes

> +        * delays and a bunch of errors to get logged, so disable it.

> +        */

> +       unregister_pm_notifier(&h5->hu->hdev->suspend_notifier);

> +

>         btrtl_dev = btrtl_initialize(h5->hu->hdev, h5->id);

>         if (IS_ERR(btrtl_dev))

>                 return PTR_ERR(btrtl_dev);

> --

> 2.29.2

>
Hans de Goede Jan. 28, 2021, 1:22 p.m. UTC | #2
Hi,

On 1/28/21 12:58 AM, Abhishek Pandit-Subedi wrote:
> Hi Hans,

> 

> You can also implement hdev->prevent_wake to prevent configure wake

> from running.


Yes I did consider that, but there still is an issue sometimes on
resume too and the BT_SUSPEND_DISCONNECT handling takes 200 ms which
is nice to get rid of too. So I opted to just disable the
hci_suspend_notifier().

> hci_unregister_dev will also clear any tasks during

> resume, unregister the pm notifier and cancel the suspend work so if

> reprobe takes <2s (default suspend work timeout), you shouldn't see a

> delay or any errors.


The RTL8723bs is the cheapest of cheap wifi/bt combo chip and is
often found on cheap/slow devices. I did see this:

[  598.686060] Bluetooth: hci0: Timed out waiting for suspend events
[  598.686124] Bluetooth: hci0: Suspend timeout bit: 5
[  598.686237] Bluetooth: hci0: Suspend notifier action (4) failed: -110

At least once, it might just be that the 2 seconds elapsed before
the reprobe ran (or there might still be a race somewhere).

> I'm not 100% confident that the reprobe worker and the pm notifier

> won't race so I'm ok with this change as well. I would prefer if the> driver only set a quirk and the actual register/unregister was handled

> entirely inside hci_core though.


Ok, I did consider using a quirk for this myself, but I was not sure
if this warranted a core change since this is a somewhat unique
situation.

I'll respin the patch into a patch-set adding a quirk, thank you
for your feedback.

Regards,

Hans





> On Wed, Jan 27, 2021 at 1:58 PM Hans de Goede <hdegoede@redhat.com> wrote:

>>

>> The hci_suspend_notifier which was introduced last year, is causing

>> problems for uart attached btrtl devices. These devices may loose their

>> firmware and their baudrate setting over a suspend/resume.

>>

>> Since we don't even know the baudrate after a suspend/resume recovering

>> from this is tricky. The driver solves this by treating these devices

>> the same as USB BT HCIs which drop of the bus during suspend.

>>

>> Specifically the driver:

>> 1. Simply unconditionally turns the device fully off during

>>    system-suspend to save maximum power.

>> 2. Calls device_reprobe() from a workqueue to fully re-init the device

>>    from scratch on system-resume (unregistering the old HCI and

>>    registering a new HCI).

>>

>> This means that these devices do not benefit from the suspend / resume

>> handling work done by the hci_suspend_notifier. At best this unnecessarily

>> adds some time to the suspend/resume time.

>>

>> But in practice this is actually causing problems:

>>

>> 1. These btrtl devices seem to not like the HCI_OP_WRITE_SCAN_ENABLE(

>> SCAN_DISABLED) request being send to them when entering the

>> BT_SUSPEND_CONFIGURE_WAKE state. The same request send on

>> BT_SUSPEND_DISCONNECT works fine, but the second one send (unnecessarily?)

>> from the BT_SUSPEND_CONFIGURE_WAKE transition causes the device to hang:

>>

>> [  573.497754] PM: suspend entry (s2idle)

>> [  573.554615] Filesystems sync: 0.056 seconds

>> [  575.837753] Bluetooth: hci0: Timed out waiting for suspend events

>> [  575.837801] Bluetooth: hci0: Suspend timeout bit: 4

>> [  575.837925] Bluetooth: hci0: Suspend notifier action (3) failed: -110

>>

>> 2. The PM_POST_SUSPEND / BT_RUNNING transition races with the

>> driver-unbinding done by the device_reprobe() work.

>> If the hci_suspend_notifier wins the race it is talking to a dead

>> device leading to the following errors being logged:

>>

>> [  598.686060] Bluetooth: hci0: Timed out waiting for suspend events

>> [  598.686124] Bluetooth: hci0: Suspend timeout bit: 5

>> [  598.686237] Bluetooth: hci0: Suspend notifier action (4) failed: -110

>>

>> In both cases things still work, but the suspend-notifier is causing

>> these ugly errors getting logged and ut increase both the suspend- and

>> the resume-time by 2 seconds.

>>

>> This commit works around these problems by disabling (unregistering)

>> the hci_suspend_notifier.

>>

>> Note that any eventual hci_unregister_dev() will call

>> unregister_pm_notifier() a second time, this is fine it will

>> simply fail with -ENOENT and hci_unregister_dev() ignores the

>> return value.

>>

>> Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>

>> Cc: Vasily Khoruzhick <anarsoul@gmail.com>

>> Cc: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>

>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>

>> ---

>>  drivers/bluetooth/hci_h5.c | 8 ++++++++

>>  1 file changed, 8 insertions(+)

>>

>> diff --git a/drivers/bluetooth/hci_h5.c b/drivers/bluetooth/hci_h5.c

>> index 7be16a7f653b..acbcc676d6c2 100644

>> --- a/drivers/bluetooth/hci_h5.c

>> +++ b/drivers/bluetooth/hci_h5.c

>> @@ -14,6 +14,7 @@

>>  #include <linux/of_device.h>

>>  #include <linux/serdev.h>

>>  #include <linux/skbuff.h>

>> +#include <linux/suspend.h>

>>

>>  #include <net/bluetooth/bluetooth.h>

>>  #include <net/bluetooth/hci_core.h>

>> @@ -876,6 +877,13 @@ static int h5_btrtl_setup(struct h5 *h5)

>>         bool flow_control;

>>         int err;

>>

>> +       /*

>> +        * Since h5_btrtl_resume() does a device_reprobe() the suspend handling

>> +        * done by the hci_suspend_notifier is not necessary; it actually causes

>> +        * delays and a bunch of errors to get logged, so disable it.

>> +        */

>> +       unregister_pm_notifier(&h5->hu->hdev->suspend_notifier);

>> +

>>         btrtl_dev = btrtl_initialize(h5->hu->hdev, h5->id);

>>         if (IS_ERR(btrtl_dev))

>>                 return PTR_ERR(btrtl_dev);

>> --

>> 2.29.2

>>

>
diff mbox series

Patch

diff --git a/drivers/bluetooth/hci_h5.c b/drivers/bluetooth/hci_h5.c
index 7be16a7f653b..acbcc676d6c2 100644
--- a/drivers/bluetooth/hci_h5.c
+++ b/drivers/bluetooth/hci_h5.c
@@ -14,6 +14,7 @@ 
 #include <linux/of_device.h>
 #include <linux/serdev.h>
 #include <linux/skbuff.h>
+#include <linux/suspend.h>
 
 #include <net/bluetooth/bluetooth.h>
 #include <net/bluetooth/hci_core.h>
@@ -876,6 +877,13 @@  static int h5_btrtl_setup(struct h5 *h5)
 	bool flow_control;
 	int err;
 
+	/*
+	 * Since h5_btrtl_resume() does a device_reprobe() the suspend handling
+	 * done by the hci_suspend_notifier is not necessary; it actually causes
+	 * delays and a bunch of errors to get logged, so disable it.
+	 */
+	unregister_pm_notifier(&h5->hu->hdev->suspend_notifier);
+
 	btrtl_dev = btrtl_initialize(h5->hu->hdev, h5->id);
 	if (IS_ERR(btrtl_dev))
 		return PTR_ERR(btrtl_dev);