diff mbox series

[2/4] wifi: mt76: mt7921: fix potential command timeout issues when suspend

Message ID 20231220092138.12830-3-mingyen.hsieh@mediatek.com
State New
Headers show
Series wifi: mt76: mt7921: fix potential issues for mt7921 | expand

Commit Message

Mingyen Hsieh Dec. 20, 2023, 9:21 a.m. UTC
From: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>

An ongoing command may be interrupted if suspended, leading to a
command timeout. Add a lock to the suspend function in order to
protect the ongoing command from being interrupted.

Signed-off-by: Leon Yen <leon.yen@mediatek.com>
Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
---
 drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Sean Wang Dec. 20, 2023, 10:11 p.m. UTC | #1
Hi Mingyen,

On Wed, Dec 20, 2023 at 3:22 AM Mingyen Hsieh
<mingyen.hsieh@mediatek.com> wrote:
>
> From: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
>
> An ongoing command may be interrupted if suspended, leading to a
> command timeout. Add a lock to the suspend function in order to
> protect the ongoing command from being interrupted.
>
> Signed-off-by: Leon Yen <leon.yen@mediatek.com>
> Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
> ---
>  drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> index 57903c6e4f11..e1c53f18abdc 100644
> --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> @@ -409,7 +409,9 @@ static int mt7921_pci_suspend(struct device *device)
>         if (err < 0)
>                 goto restore_suspend;
>
> +       mt792x_mutex_acquire(dev);
>         err = mt76_connac_mcu_set_hif_suspend(mdev, true);
> +       mt792x_mutex_release(dev);

I had assumed that all commands would finish up before the suspend
handler is called, given that processes are frozen before the
suspension can progress. Could you provide more details on the
specific conflict between these two commands and how it results in the
command timeout?

Moreover, while I understand you've identified some potential issues,
I suggest we work towards a comprehensive solution to address timeouts
caused by potential command conflicts, not just for the specific
command. For instance, we should take into account other MCU commands
during suspension and extend this consideration to similar scenarios
in SDIO suspension.

>         if (err)
>                 goto restore_suspend;
>
> --
> 2.18.0
>
>
Mingyen Hsieh Dec. 21, 2023, 2:26 a.m. UTC | #2
On Wed, 2023-12-20 at 16:11 -0600, Sean Wang wrote:
>  	 
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
>  Hi Mingyen,
> 
> On Wed, Dec 20, 2023 at 3:22 AM Mingyen Hsieh
> <mingyen.hsieh@mediatek.com> wrote:
> >
> > From: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
> >
> > An ongoing command may be interrupted if suspended, leading to a
> > command timeout. Add a lock to the suspend function in order to
> > protect the ongoing command from being interrupted.
> >
> > Signed-off-by: Leon Yen <leon.yen@mediatek.com>
> > Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
> > ---
> >  drivers/net/wireless/mediatek/mt76/mt7921/pci.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> > index 57903c6e4f11..e1c53f18abdc 100644
> > --- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> > +++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
> > @@ -409,7 +409,9 @@ static int mt7921_pci_suspend(struct device
> *device)
> >         if (err < 0)
> >                 goto restore_suspend;
> >
> > +       mt792x_mutex_acquire(dev);
> >         err = mt76_connac_mcu_set_hif_suspend(mdev, true);
> > +       mt792x_mutex_release(dev);
> 
> I had assumed that all commands would finish up before the suspend
> handler is called, given that processes are frozen before the
> suspension can progress. Could you provide more details on the
> specific conflict between these two commands and how it results in
> the
> command timeout?
> 
> Moreover, while I understand you've identified some potential issues,
> I suggest we work towards a comprehensive solution to address
> timeouts
> caused by potential command conflicts, not just for the specific
> command. For instance, we should take into account other MCU commands
> during suspension and extend this consideration to similar scenarios
> in SDIO suspension.
> 
> >         if (err)
> >                 goto restore_suspend;
> >
> > --
> > 2.18.0
> >
> >

Hi Sean,

This case occurs when entering suspend mode during a connection. As the
system transitions from a connected state to a disconnected state upon
entering suspend, it calls the .regd_notifier() function. Consequently,
the mt7921 executes the CLC cmd. Meanwhile, because the suspend cmd is
not protected by a lock, the system proceeds to suspend, resulting in
the CLC cmd not completing and causing a command timeout.
diff mbox series

Patch

diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
index 57903c6e4f11..e1c53f18abdc 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/pci.c
@@ -409,7 +409,9 @@  static int mt7921_pci_suspend(struct device *device)
 	if (err < 0)
 		goto restore_suspend;
 
+	mt792x_mutex_acquire(dev);
 	err = mt76_connac_mcu_set_hif_suspend(mdev, true);
+	mt792x_mutex_release(dev);
 	if (err)
 		goto restore_suspend;