mbox series

[BlueZ,0/1] Fixes busy loop when disabling

Message ID 20240403205252.71978-1-dimitris.on.linux@gmail.com
Headers show
Series Fixes busy loop when disabling | expand

Message

Dimitris April 3, 2024, 8:52 p.m. UTC
Reverts a commit which causes bluetoothd to enter a busy loop if
BT is disabled (eg. rfkill block bluetooth) when one or more
devices are connected.

Steps to reproduce are described in these reports:

https://github.com/bluez/bluez/issues/785
https://bugzilla.redhat.com/show_bug.cgi?id=2269516

Dimitris (1):
  Revert "device: Consider service state on device_is_connected"

 src/device.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

Comments

Paul Menzel April 4, 2024, 6:24 a.m. UTC | #1
Dear Dimitris,


Am 03.04.24 um 22:52 schrieb Dimitris:
> This reverts commit 44d3f67277f83983e1e9697eda7b9aeb40ca231d.

Please document the reason for the revert.

> ---
>   src/device.c | 6 +-----
>   1 file changed, 1 insertion(+), 5 deletions(-)

[…]


Kind regards,

Paul
Dimitris April 4, 2024, 6:35 a.m. UTC | #2
Hi Paul,

> Please document the reason for the revert.

I've since done a narrower change; instead of the revert I'm only 
factoring out the state check that avoids the busy loop at "rfkill block 
bluetooth" time.

Updated cover letter and patch under the "V2" part of the thread:

https://lore.kernel.org/linux-bluetooth/20240404024521.120349-1-dimitris.on.linux@gmail.com/T/#u

Issue steps to reproduce from e.g. 
https://github.com/bluez/bluez/issues/785 :

- Connect at least one device. Tried this with either one of: Logitech 
MX Master 3, Google Pixel Buds Pro.
- Run rfkill block bluetooth
- bluetoothd takes a whole core, GNOME quick settings and status still 
shows bluetooth as "active". I have to kill -9 the process to get the 
block to really complete.
- Turning off the connected device does not break the loop.

Busy loop stack looks like:

> #0  adapter_remove_connection (adapter=0x55a17e6889d0, device=0x55a17e698d30, bdaddr_type=2 '\002') at src/adapter.c:7476
> #1  0x000055a17e55979f in adapter_stop (adapter=0x55a17e6889d0) at src/adapter.c:7527
> #2  settings_changed (settings=<optimized out>, adapter=0x55a17e6889d0) at src/adapter.c:622
> #3  new_settings_callback (index=<optimized out>, length=<optimized out>, param=<optimized out>, user_data=0x55a17e6889d0) at src/adapter.c:705
> #4  0x000055a17e59981e in queue_foreach (user_data=0x7ffe49a7ef20, function=0x55a17e591e60 <notify_handler>, queue=0x55a17e683280) at src/shared/queue.c:207
> #5  queue_foreach (user_data=0x7ffe49a7ef20, function=0x55a17e591e60 <notify_handler>, queue=0x55a17e683280) at src/shared/queue.c:190
> #6  process_notify (param=<optimized out>, length=<optimized out>, index=<optimized out>, event=<optimized out>, mgmt=0x55a17e682f30) at src/shared/mgmt.c:349
> #7  can_read_data (io=<optimized out>, user_data=0x55a17e682f30) at src/shared/mgmt.c:409
> #8  0x000055a17e5bb679 in watch_callback (channel=<optimized out>, cond=<optimized out>, user_data=<optimized out>) at src/shared/io-glib.c:157
> #9  0x00007f876edd4e5c in g_main_context_dispatch_unlocked.lto_priv () from target:/lib64/libglib-2.0.so.0
> #10 0x00007f876ee2ff18 in g_main_context_iterate_unlocked.isra () from target:/lib64/libglib-2.0.so.0
> #11 0x00007f876edd6447 in g_main_loop_run () from target:/lib64/libglib-2.0.so.0
> #12 0x000055a17e4f1d64 in mainloop_run () at src/shared/mainloop-glib.c:66
> #13 mainloop_run_with_signal (func=0x55a17e544740 <signal_callback>, user_data=0x0) at src/shared/mainloop-notify.c:188
> #14 main (argc=<optimized out>, argv=<optimized out>) at src/main.c:1455

First time submitter here, should I resubmit/start a new thread for this 
"V2" approach?

Thanks

D.
Dimitris April 4, 2024, 3:52 p.m. UTC | #3
On 4/4/24 07:59, Luiz Augusto von Dentz wrote:

>> diff --git a/src/adapter.c b/src/adapter.c
>> index 4bcc464de..0b7aab4b5 100644
>> --- a/src/adapter.c
>> +++ b/src/adapter.c
>> @@ -7486,7 +7486,7 @@ static void adapter_remove_connection(struct btd_adapter *adapter,
>>                  device_cancel_authentication(device, TRUE);
>>
>>          /* If another bearer is still connected */
>> -       if (btd_device_is_connected(device))
>> +       if (btd_device_state_is_connected(device))
> 
> Perhaps btd_device_bearer_is_connected would be a better name.

Thanks, I'll rename.

> I guess the problem is that some service is indicating it is still
> connected though?

Newbie question for sure, but: As this is happening in the code path for 
"disabling bluetooth", shouldn't services already be disconnected here?

D.
Luiz Augusto von Dentz April 4, 2024, 4:16 p.m. UTC | #4
Hi Dimitris,

On Thu, Apr 4, 2024 at 11:52 AM Dimitris <dimitris.on.linux@gmail.com> wrote:
>
> On 4/4/24 07:59, Luiz Augusto von Dentz wrote:
>
> >> diff --git a/src/adapter.c b/src/adapter.c
> >> index 4bcc464de..0b7aab4b5 100644
> >> --- a/src/adapter.c
> >> +++ b/src/adapter.c
> >> @@ -7486,7 +7486,7 @@ static void adapter_remove_connection(struct btd_adapter *adapter,
> >>                  device_cancel_authentication(device, TRUE);
> >>
> >>          /* If another bearer is still connected */
> >> -       if (btd_device_is_connected(device))
> >> +       if (btd_device_state_is_connected(device))
> >
> > Perhaps btd_device_bearer_is_connected would be a better name.
>
> Thanks, I'll rename.
>
> > I guess the problem is that some service is indicating it is still
> > connected though?
>
> Newbie question for sure, but: As this is happening in the code path for
> "disabling bluetooth", shouldn't services already be disconnected here?

That is exactly what I would like to know, why is there a service
still indicating it is connected if the controller is rfkilled, so
while this should break it back to the old behavior we still need to
fix the service that is causing the problem so perhaps we need to
print its profile/drive name and figure out what is the driver that is
causing it.

> D.
Dimitris April 4, 2024, 6:25 p.m. UTC | #5
Hi Luiz,

On 4/4/24 09:16, Luiz Augusto von Dentz wrote:

>>> I guess the problem is that some service is indicating it is still
>>> connected though?
>>
>> Newbie question for sure, but: As this is happening in the code path for
>> "disabling bluetooth", shouldn't services already be disconnected here?
> 
> That is exactly what I would like to know, why is there a service
> still indicating it is connected if the controller is rfkilled, so
> while this should break it back to the old behavior we still need to
> fix the service that is causing the problem so perhaps we need to
> print its profile/drive name and figure out what is the driver that is
> causing it.
> 

I added a debug kludge:


> diff --git a/src/device.c b/src/device.c
> index 74dd67a09..c461a6a3a 100644
> --- a/src/device.c
> +++ b/src/device.c
> @@ -344,7 +344,15 @@ static GSList *find_service_with_state(GSList *list,
>                 struct btd_service *service = l->data;
>  
>                 if (btd_service_get_state(service) == state)
> +               {
> +                       char name[256];
> +                       device_get_name(btd_service_get_device(service), name, 256);
> +                       info("Found service: err: %d, device: %s",
> +                               btd_service_get_error(service),
> +                               name
> +                       );
>                         return l;
> +               }
>         }
>  
>         return NULL;
> @@ -3282,7 +3290,12 @@ bool btd_device_is_connected(struct btd_device *dev)
>  
>  bool btd_device_bearer_is_connected(struct btd_device *dev)
>  {
> -       return dev->bredr_state.connected || dev->le_state.connected;
> +       if(dev->bredr_state.connected || dev->le_state.connected)
> +               return true;
> +       else {
> +               find_service_with_state(dev->services, BTD_SERVICE_STATE_CONNECTED);
> +               return false;
> +       };
>  }
>  
>  static void clear_temporary_timer(struct btd_device *dev)

And it seems that every device I have available triggers this:  MX 
Master 3 mouse, Google Pixel Buds Pro, Google Pixel 7.  Ran experiments 
with each one of the devices being connected when running rfkill block:

> Apr 04 11:06:31 bluetoothd[331222]: Found service: err: 0, device: MX Master 3
> Apr 04 11:06:46 bluetoothd[331222]: Found service: err: 0, device: MX Master 3
> Apr 04 11:06:58 bluetoothd[331222]: Found service: err: 0, device: MX Master 3
> Apr 04 11:07:28 bluetoothd[331222]: Found service: err: 0, device: coral buds
> Apr 04 11:07:29 bluetoothd[331222]: Found service: err: 0, device: coral buds
> Apr 04 11:07:29 bluetoothd[331222]: Found service: err: 0, device: coral buds
> Apr 04 11:07:29 bluetoothd[331222]: src/profile.c:ext_io_disconnected() Unable to get io data for Hands-Free Voice gateway: getpeername: Transport endpoint is not connected (107)
> Apr 04 11:08:01 bluetoothd[331222]: Found service: err: 0, device: coral buds
> Apr 04 11:08:01 bluetoothd[331222]: src/profile.c:ext_io_disconnected() Unable to get io data for Hands-Free Voice gateway: getpeername: Transport endpoint is not connected (107)
> Apr 04 11:08:40 bluetoothd[331222]: Found service: err: 0, device: coral buds
> Apr 04 11:08:40 bluetoothd[331222]: src/profile.c:ext_io_disconnected() Unable to get io data for Hands-Free Voice gateway: getpeername: Transport endpoint is not connected (107)
> Apr 04 11:09:47 bluetoothd[331222]: Found service: err: 0, device: p7
> Apr 04 11:09:47 bluetoothd[331222]: src/profile.c:ext_io_disconnected() Unable to get io data for Hands-Free unit: getpeername: Transport endpoint is not connected (107)

The BT adapter is a Mediatek MT7922 WiFi/BT M2 adapter, seems to be 
using the btmtk driver.

In parallel, I've sent a V3 of the "bring back previous behavior" patch 
with the new function named btd_device_bearer_is_connected.

D.