diff mbox series

[net-next,1/6] net: ipa: don't suspend/resume modem if not up

Message ID 20210804153626.1549001-2-elder@linaro.org
State New
Series net: ipa: more work toward runtime PM

Commit Message

Alex Elder Aug. 4, 2021, 3:36 p.m. UTC
The modem network device is set up by ipa_modem_start().  But its
TX queue is not actually started and endpoints enabled until it is
opened.

So avoid stopping the modem network device TX queue and disabling
endpoints on suspend or stop unless the netdev is marked UP.  And
skip attempting to resume unless it is UP.

Signed-off-by: Alex Elder <elder@linaro.org>

---
 drivers/net/ipa/ipa_modem.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

-- 
2.27.0

Comments

Jakub Kicinski Aug. 6, 2021, 1:26 a.m. UTC | #1
On Wed,  4 Aug 2021 10:36:21 -0500 Alex Elder wrote:
> The modem network device is set up by ipa_modem_start().  But its
> TX queue is not actually started and endpoints enabled until it is
> opened.
>
> So avoid stopping the modem network device TX queue and disabling
> endpoints on suspend or stop unless the netdev is marked UP.  And
> skip attempting to resume unless it is UP.
>
> Signed-off-by: Alex Elder <elder@linaro.org>

You said in the cover letter that in practice this fix doesn't matter.
It seems trivial to test so perhaps it doesn't and we should leave the
code be? Looking at dev->flags without holding rtnl_lock() seems
suspicious, drivers commonly put the relevant portion of suspend/resume
routines under rtnl_lock()/rtnl_unlock() (although to be completely
frank IDK if it's actually possible for concurrent suspend +
open/close to happen).

Are there any callers of ipa_modem_stop() which don't hold rtnl_lock()? 

> diff --git a/drivers/net/ipa/ipa_modem.c b/drivers/net/ipa/ipa_modem.c
> index 4ea8287e9d237..663a610979e70 100644
> --- a/drivers/net/ipa/ipa_modem.c
> +++ b/drivers/net/ipa/ipa_modem.c
> @@ -178,6 +178,9 @@ void ipa_modem_suspend(struct net_device *netdev)
>  	struct ipa_priv *priv = netdev_priv(netdev);
>  	struct ipa *ipa = priv->ipa;
>  
> +	if (!(netdev->flags & IFF_UP))
> +		return;
> +
>  	netif_stop_queue(netdev);
>  
>  	ipa_endpoint_suspend_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]);
> @@ -194,6 +197,9 @@ void ipa_modem_resume(struct net_device *netdev)
>  	struct ipa_priv *priv = netdev_priv(netdev);
>  	struct ipa *ipa = priv->ipa;
>  
> +	if (!(netdev->flags & IFF_UP))
> +		return;
> +
>  	ipa_endpoint_resume_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_TX]);
>  	ipa_endpoint_resume_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]);
>  
> @@ -265,9 +271,11 @@ int ipa_modem_stop(struct ipa *ipa)
>  	/* Prevent the modem from triggering a call to ipa_setup() */
>  	ipa_smp2p_disable(ipa);
>  
> -	/* Stop the queue and disable the endpoints if it's open */
> +	/* Clean up the netdev and endpoints if it was started */
>  	if (netdev) {
> -		(void)ipa_stop(netdev);
> +		/* If it was opened, stop it first */
> +		if (netdev->flags & IFF_UP)
> +			(void)ipa_stop(netdev);
>  		ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]->netdev = NULL;
>  		ipa->name_map[IPA_ENDPOINT_AP_MODEM_TX]->netdev = NULL;
>  		ipa->modem_netdev = NULL;
Alex Elder Aug. 6, 2021, 11:39 a.m. UTC | #2
On 8/5/21 8:26 PM, Jakub Kicinski wrote:
> On Wed,  4 Aug 2021 10:36:21 -0500 Alex Elder wrote:
>> The modem network device is set up by ipa_modem_start().  But its
>> TX queue is not actually started and endpoints enabled until it is
>> opened.
>>
>> So avoid stopping the modem network device TX queue and disabling
>> endpoints on suspend or stop unless the netdev is marked UP.  And
>> skip attempting to resume unless it is UP.
>>
>> Signed-off-by: Alex Elder <elder@linaro.org>
>
> You said in the cover letter that in practice this fix doesn't matter.

I don't think we've seen this problem with system suspend, but
with runtime suspend we could get a forced suspend request at
any time (and frequently), so if there is a problem, it will be
much more likely to occur.

For suspend, I don't think it's actually a "problem".  Disabling
the TX queue if it wasn't open is harmless--it just sets the
DRV_XOFF bit in the TX queue state field.  And we have a
separate "enabled endpoints" mask that prevents stopping or
suspending the endpoint if it wasn't opened.

But for resume, waking the queue schedules it.  I'm not sure
what exactly ensues in that case, but it's not correct if the
network device hasn't been opened.  For endpoints, again, they
won't be resumed if they weren't enabled, so that part's OK.
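
To make the asymmetry described above concrete, here is a small userspace model. It is not kernel code. Every `model_` name, the endpoint IDs, the call order and the counters are assumptions made for illustration. It only mirrors the behaviour described in this message: stopping the queue sets an XOFF bit, waking it schedules the queue, and the endpoint helpers do nothing unless the endpoint was enabled.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_DRV_XOFF	(1u << 0)	/* stands in for the DRV_XOFF queue state bit */
#define MODEL_EP_TX	0u		/* hypothetical endpoint IDs */
#define MODEL_EP_RX	1u

struct model_ipa {
	uint32_t enabled;	/* one bit per endpoint enabled when the netdev is opened */
	uint32_t txq_state;	/* TX queue state bits */
	int ep_suspended;	/* endpoint suspend operations actually performed */
	int ep_resumed;		/* endpoint resume operations actually performed */
	int txq_scheduled;	/* times a wake ended up scheduling the TX queue */
};

/* Endpoint helpers are no-ops unless the endpoint was enabled */
static void model_endpoint_suspend_one(struct model_ipa *ipa, unsigned int id)
{
	assert(ipa && id < 32);
	if (ipa->enabled & (1u << id))
		ipa->ep_suspended++;
}

static void model_endpoint_resume_one(struct model_ipa *ipa, unsigned int id)
{
	assert(ipa && id < 32);
	if (ipa->enabled & (1u << id))
		ipa->ep_resumed++;
}

/* Stopping the queue only sets the XOFF bit */
static void model_stop_queue(struct model_ipa *ipa)
{
	assert(ipa);
	ipa->txq_state |= MODEL_DRV_XOFF;
}

/* Waking clears the bit and, if it had been set, schedules the queue */
static void model_wake_queue(struct model_ipa *ipa)
{
	assert(ipa);
	if (ipa->txq_state & MODEL_DRV_XOFF) {
		ipa->txq_state &= ~MODEL_DRV_XOFF;
		ipa->txq_scheduled++;
	}
}

/* Suspend and resume without any IFF_UP check, as before the patch */
static void model_unguarded_suspend(struct model_ipa *ipa)
{
	model_stop_queue(ipa);
	model_endpoint_suspend_one(ipa, MODEL_EP_RX);
	model_endpoint_suspend_one(ipa, MODEL_EP_TX);
}

static void model_unguarded_resume(struct model_ipa *ipa)
{
	model_endpoint_resume_one(ipa, MODEL_EP_TX);
	model_endpoint_resume_one(ipa, MODEL_EP_RX);
	model_wake_queue(ipa);
}
```

Running the unguarded pair on a zero-initialized `struct model_ipa` (a device that was never opened) leaves both endpoint counters at zero but bumps `txq_scheduled` once, which is the part of the resume path called "not correct" above.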

> It seems trivial to test so perhaps it doesn't and we should leave the
> code be? Looking at dev->flags without holding rtnl_lock() seems
> suspicious, drivers commonly put the relevant portion of suspend/resume
> routines under rtnl_lock()/rtnl_unlock() (although to be completely

I don't use rtnl_lock()/rtnl_unlock() *anywhere* in the driver.
It has no netlink interface (yet), and therefore I didn't even
think about using rtnl_lock().  Do I need it?

> frank IDK if it's actually possible for concurrent suspend +
> open/close to happen).

I think it isn't possible, but I'm less than 100% sure.  I've
been thinking a lot about exactly this sort of question lately...

> Are there any callers of ipa_modem_stop() which don't hold rtnl_lock()?

None of them take that lock.  It is called in the driver ->remove
callback, and is called during cleanup if the modem crashes.

I think this fix is good, but as I said in the cover letter I'm
not aware of ever having hit it to date.

Thank you very much for your review and comments.

					-Alex

>> diff --git a/drivers/net/ipa/ipa_modem.c b/drivers/net/ipa/ipa_modem.c
>> index 4ea8287e9d237..663a610979e70 100644
>> --- a/drivers/net/ipa/ipa_modem.c
>> +++ b/drivers/net/ipa/ipa_modem.c
>> @@ -178,6 +178,9 @@ void ipa_modem_suspend(struct net_device *netdev)
>>   	struct ipa_priv *priv = netdev_priv(netdev);
>>   	struct ipa *ipa = priv->ipa;
>>   
>> +	if (!(netdev->flags & IFF_UP))
>> +		return;
>> +
>>   	netif_stop_queue(netdev);
>>   
>>   	ipa_endpoint_suspend_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]);
>> @@ -194,6 +197,9 @@ void ipa_modem_resume(struct net_device *netdev)
>>   	struct ipa_priv *priv = netdev_priv(netdev);
>>   	struct ipa *ipa = priv->ipa;
>>   
>> +	if (!(netdev->flags & IFF_UP))
>> +		return;
>> +
>>   	ipa_endpoint_resume_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_TX]);
>>   	ipa_endpoint_resume_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]);
>>   
>> @@ -265,9 +271,11 @@ int ipa_modem_stop(struct ipa *ipa)
>>   	/* Prevent the modem from triggering a call to ipa_setup() */
>>   	ipa_smp2p_disable(ipa);
>>   
>> -	/* Stop the queue and disable the endpoints if it's open */
>> +	/* Clean up the netdev and endpoints if it was started */
>>   	if (netdev) {
>> -		(void)ipa_stop(netdev);
>> +		/* If it was opened, stop it first */
>> +		if (netdev->flags & IFF_UP)
>> +			(void)ipa_stop(netdev);
>>   		ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]->netdev = NULL;
>>   		ipa->name_map[IPA_ENDPOINT_AP_MODEM_TX]->netdev = NULL;
>>   		ipa->modem_netdev = NULL;
>
Jakub Kicinski Aug. 6, 2021, 12:59 p.m. UTC | #3
On Fri, 6 Aug 2021 06:39:46 -0500 Alex Elder wrote:
> On 8/5/21 8:26 PM, Jakub Kicinski wrote:
> > On Wed,  4 Aug 2021 10:36:21 -0500 Alex Elder wrote:
> >> The modem network device is set up by ipa_modem_start().  But its
> >> TX queue is not actually started and endpoints enabled until it is
> >> opened.
> >>
> >> So avoid stopping the modem network device TX queue and disabling
> >> endpoints on suspend or stop unless the netdev is marked UP.  And
> >> skip attempting to resume unless it is UP.
> >>
> >> Signed-off-by: Alex Elder <elder@linaro.org>
> >
> > You said in the cover letter that in practice this fix doesn't matter.
>
> I don't think we've seen this problem with system suspend, but
> with runtime suspend we could get a forced suspend request at
> any time (and frequently), so if there is a problem, it will be
> much more likely to occur.
>
> For suspend, I don't think it's actually a "problem".  Disabling
> the TX queue if it wasn't open is harmless--it just sets the
> DRV_XOFF bit in the TX queue state field.  And we have a
> separate "enabled endpoints" mask that prevents stopping or
> suspending the endpoint if it wasn't opened.
>
> But for resume, waking the queue schedules it.  I'm not sure
> what exactly ensues in that case, but it's not correct if the
> network device hasn't been opened.  For endpoints, again, they
> won't be resumed if they weren't enabled, so that part's OK.
>
> > It seems trivial to test so perhaps it doesn't and we should leave the
> > code be? Looking at dev->flags without holding rtnl_lock() seems
> > suspicious, drivers commonly put the relevant portion of suspend/resume
> > routines under rtnl_lock()/rtnl_unlock() (although to be completely
>
> I don't use rtnl_lock()/rtnl_unlock() *anywhere* in the driver.
> It has no netlink interface (yet), and therefore I didn't even
> think about using rtnl_lock().  Do I need it?

Runtime PM interactions with rtnl_lock get really tricky: if there are
callers which will wake the device up while holding rtnl, then taking
rtnl in .resume will cause an obvious deadlock, right?

I'm starting to feel like a driver's RPM-related code has to be under its
own lock, and interrogating a higher layer's (e.g. the network stack's)
state from RPM code should be avoided...

Long story short, I don't think we have a good handle on this (I
certainly don't), so maybe let's leave your code be for now.
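
One possible reading of the "own lock" idea above, sketched as a userspace model rather than kernel code: the driver keeps its own `opened` flag under a private mutex, and its suspend/resume paths consult only that flag instead of `netdev->flags`. All names here (`model_modem`, `pm_lock`, `model_open`, and so on) are hypothetical, and a pthread mutex stands in for whatever lock a real driver would choose. The thread does not settle on this design.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct model_modem {
	pthread_mutex_t pm_lock;	/* hypothetical driver-private lock */
	bool opened;			/* driver's own record of open/stop */
	bool suspended;			/* true between suspend and resume */
	int queue_wakes;		/* times resume woke the TX queue */
};

#define MODEL_MODEM_INIT { PTHREAD_MUTEX_INITIALIZER, false, false, 0 }

/* Called from the open path: record the state under the private lock */
static void model_open(struct model_modem *m)
{
	assert(m);
	pthread_mutex_lock(&m->pm_lock);
	m->opened = true;
	pthread_mutex_unlock(&m->pm_lock);
}

/* Called from the stop path */
static void model_stop(struct model_modem *m)
{
	assert(m);
	pthread_mutex_lock(&m->pm_lock);
	m->opened = false;
	m->suspended = false;
	pthread_mutex_unlock(&m->pm_lock);
}

/* Suspend looks only at driver-owned state, never at stack state */
static void model_suspend(struct model_modem *m)
{
	assert(m);
	pthread_mutex_lock(&m->pm_lock);
	if (m->opened)
		m->suspended = true;
	pthread_mutex_unlock(&m->pm_lock);
}

/* Resume wakes the queue only if suspend actually stopped it */
static void model_resume(struct model_modem *m)
{
	assert(m);
	pthread_mutex_lock(&m->pm_lock);
	if (m->opened && m->suspended) {
		m->suspended = false;
		m->queue_wakes++;
	}
	pthread_mutex_unlock(&m->pm_lock);
}
```

Because the private lock is never held while calling into the network stack, this model has no lock ordering against rtnl to get wrong. Whether a real driver can keep `opened` accurate without taking rtnl somewhere is exactly the open question in this message.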

> > frank IDK if it's actually possible for concurrent suspend +
> > open/close to happen).
>
> I think it isn't possible, but I'm less than 100% sure.  I've
> been thinking a lot about exactly this sort of question lately...
>
> > Are there any callers of ipa_modem_stop() which don't hold rtnl_lock()?
>
> None of them take that lock.  It is called in the driver ->remove
> callback, and is called during cleanup if the modem crashes.
>
> I think this fix is good, but as I said in the cover letter I'm
> not aware of ever having hit it to date.
>
> Thank you very much for your review and comments.

Patch

diff --git a/drivers/net/ipa/ipa_modem.c b/drivers/net/ipa/ipa_modem.c
index 4ea8287e9d237..663a610979e70 100644
--- a/drivers/net/ipa/ipa_modem.c
+++ b/drivers/net/ipa/ipa_modem.c
@@ -178,6 +178,9 @@  void ipa_modem_suspend(struct net_device *netdev)
 	struct ipa_priv *priv = netdev_priv(netdev);
 	struct ipa *ipa = priv->ipa;
 
+	if (!(netdev->flags & IFF_UP))
+		return;
+
 	netif_stop_queue(netdev);
 
 	ipa_endpoint_suspend_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]);
@@ -194,6 +197,9 @@  void ipa_modem_resume(struct net_device *netdev)
 	struct ipa_priv *priv = netdev_priv(netdev);
 	struct ipa *ipa = priv->ipa;
 
+	if (!(netdev->flags & IFF_UP))
+		return;
+
 	ipa_endpoint_resume_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_TX]);
 	ipa_endpoint_resume_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]);
 
@@ -265,9 +271,11 @@  int ipa_modem_stop(struct ipa *ipa)
 	/* Prevent the modem from triggering a call to ipa_setup() */
 	ipa_smp2p_disable(ipa);
 
-	/* Stop the queue and disable the endpoints if it's open */
+	/* Clean up the netdev and endpoints if it was started */
 	if (netdev) {
-		(void)ipa_stop(netdev);
+		/* If it was opened, stop it first */
+		if (netdev->flags & IFF_UP)
+			(void)ipa_stop(netdev);
 		ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]->netdev = NULL;
 		ipa->name_map[IPA_ENDPOINT_AP_MODEM_TX]->netdev = NULL;
 		ipa->modem_netdev = NULL;
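
The effect of the added IFF_UP checks can be tried out in a small userspace model. This is only a sketch: `MODEL_IFF_UP`, `struct model_netdev` and the `model_` functions are stand-ins invented here, and the counters replace the real endpoint and queue calls in the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_IFF_UP	0x1u	/* stands in for the kernel's IFF_UP flag */

struct model_netdev {
	unsigned int flags;	/* MODEL_IFF_UP is set while the device is open */
	bool queue_stopped;	/* models the TX queue being stopped */
	int suspend_calls;	/* times the endpoints were asked to suspend */
	int resume_calls;	/* times the endpoints were asked to resume */
};

/* Opening marks the device UP and starts its TX queue */
static void model_open(struct model_netdev *dev)
{
	assert(dev);
	dev->flags |= MODEL_IFF_UP;
	dev->queue_stopped = false;
}

/* Mirrors the patched ipa_modem_suspend(): bail out unless UP */
static void model_modem_suspend(struct model_netdev *dev)
{
	assert(dev);
	if (!(dev->flags & MODEL_IFF_UP))
		return;

	dev->queue_stopped = true;
	dev->suspend_calls++;
}

/* Mirrors the patched ipa_modem_resume(): bail out unless UP */
static void model_modem_resume(struct model_netdev *dev)
{
	assert(dev);
	if (!(dev->flags & MODEL_IFF_UP))
		return;

	dev->resume_calls++;
	dev->queue_stopped = false;
}
```

With a device that was never opened, a suspend/resume pair leaves every field untouched. After `model_open()`, the same pair stops and then restarts the queue and bumps each counter once.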