net: qrtr: mhi: synchronize qrtr and mhi preparation

Message ID 20241104-qrtr_mhi-v1-1-79adf7e3bba5@quicinc.com
State New
Series net: qrtr: mhi: synchronize qrtr and mhi preparation

Commit Message

Chris Lew Nov. 5, 2024, 1:29 a.m. UTC
From: Bhaumik Bhatt <bbhatt@codeaurora.org>

The call to qrtr_endpoint_register() was moved before
mhi_prepare_for_transfer_autoqueue() to prevent a case where a dl
callback can occur before the qrtr endpoint is registered.

Now the reverse can happen where qrtr will try to send a packet
before the channels are prepared. Add a wait in the sending path to
ensure the channels are prepared before trying to do a ul transfer.

Fixes: 68a838b84eff ("net: qrtr: start MHI channel after endpoit creation")
Reported-by: Johan Hovold <johan@kernel.org>
Closes: https://lore.kernel.org/linux-arm-msm/ZyTtVdkCCES0lkl4@hovoldconsulting.com/
Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
Signed-off-by: Chris Lew <quic_clew@quicinc.com>
---
 net/qrtr/mhi.c | 7 +++++++
 1 file changed, 7 insertions(+)


---
base-commit: 1ffec08567f426a1c593e038cadc61bdc38cb467
change-id: 20241104-qrtr_mhi-dfec353030af

Best regards,
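
Note: the wait described in the commit message can be modelled in userspace. The sketch below is not kernel code. It assumes a pthread mutex and condition variable as a stand-in for struct completion, and every name in it (completion_model, qrtr_mhi_model, completion_demo_run and the model_* helpers) is hypothetical. Unlike the patch, the wait here is not interruptible. It only shows the ordering guarantee: a sender that starts early still observes the channels as prepared by the time it proceeds.

```c
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct completion (illustration only). */
struct completion_model {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

/* Hypothetical model of the driver state touched by the patch. */
struct qrtr_mhi_model {
	struct completion_model prepared;
	bool channels_prepared;  /* set once the "prepare" step has finished */
	bool sent_after_prepare; /* what the sender observed when it ran */
};

static void model_init_completion(struct completion_model *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Like complete_all(): release every current and future waiter. */
static void model_complete_all(struct completion_model *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Like wait_for_completion(): block until done has been set. */
static void model_wait_for_completion(struct completion_model *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Sender thread: mirrors the wait added at the top of the send path. */
static void *model_sender(void *arg)
{
	struct qrtr_mhi_model *q = arg;

	model_wait_for_completion(&q->prepared);
	q->sent_after_prepare = q->channels_prepared;
	return NULL;
}

/*
 * Probe path: the sender may start as soon as the endpoint is "registered"
 * (here: as soon as the thread exists), but it only proceeds once the
 * channels are marked prepared. Returns what the sender observed.
 */
static bool completion_demo_run(void)
{
	struct qrtr_mhi_model q = { .channels_prepared = false };
	pthread_t sender;

	model_init_completion(&q.prepared);
	if (pthread_create(&sender, NULL, model_sender, &q) != 0)
		return false;

	q.channels_prepared = true; /* stands in for the MHI prepare call */
	model_complete_all(&q.prepared);

	pthread_join(sender, NULL);
	pthread_mutex_destroy(&q.prepared.lock);
	pthread_cond_destroy(&q.prepared.cond);
	return q.sent_after_prepare;
}
```

Calling completion_demo_run() returns true regardless of how the two threads are scheduled, because the flag is written before the completion is signalled and read only after the wait returns.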

Comments

Manivannan Sadhasivam Nov. 7, 2024, 11:27 a.m. UTC | #1
On Mon, Nov 04, 2024 at 05:29:37PM -0800, Chris Lew wrote:
> From: Bhaumik Bhatt <bbhatt@codeaurora.org>
> 
> The call to qrtr_endpoint_register() was moved before
> mhi_prepare_for_transfer_autoqueue() to prevent a case where a dl
> callback can occur before the qrtr endpoint is registered.
> 
> Now the reverse can happen where qrtr will try to send a packet
> before the channels are prepared. Add a wait in the sending path to
> ensure the channels are prepared before trying to do a ul transfer.
> 
> Fixes: 68a838b84eff ("net: qrtr: start MHI channel after endpoit creation")
> Reported-by: Johan Hovold <johan@kernel.org>
> Closes: https://lore.kernel.org/linux-arm-msm/ZyTtVdkCCES0lkl4@hovoldconsulting.com/
> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
> Signed-off-by: Chris Lew <quic_clew@quicinc.com>

I think we need to have the check in 'mhi_queue()' instead of waiting for the
channels in client drivers. Would it be a problem if qrtr returns -EAGAIN from
qcom_mhi_qrtr_send() instead of waiting for the channel?

- Mani

> ---
>  net/qrtr/mhi.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/net/qrtr/mhi.c b/net/qrtr/mhi.c
> index 69f53625a049..5b7268868bbd 100644
> --- a/net/qrtr/mhi.c
> +++ b/net/qrtr/mhi.c
> @@ -15,6 +15,7 @@ struct qrtr_mhi_dev {
>  	struct qrtr_endpoint ep;
>  	struct mhi_device *mhi_dev;
>  	struct device *dev;
> +	struct completion prepared;
>  };
>  
>  /* From MHI to QRTR */
> @@ -53,6 +54,10 @@ static int qcom_mhi_qrtr_send(struct qrtr_endpoint *ep, struct sk_buff *skb)
>  	if (skb->sk)
>  		sock_hold(skb->sk);
>  
> +	rc = wait_for_completion_interruptible(&qdev->prepared);
> +	if (rc)
> +		goto free_skb;
> +
>  	rc = skb_linearize(skb);
>  	if (rc)
>  		goto free_skb;
> @@ -85,6 +90,7 @@ static int qcom_mhi_qrtr_probe(struct mhi_device *mhi_dev,
>  	qdev->mhi_dev = mhi_dev;
>  	qdev->dev = &mhi_dev->dev;
>  	qdev->ep.xmit = qcom_mhi_qrtr_send;
> +	init_completion(&qdev->prepared);
>  
>  	dev_set_drvdata(&mhi_dev->dev, qdev);
>  	rc = qrtr_endpoint_register(&qdev->ep, QRTR_EP_NID_AUTO);
> @@ -97,6 +103,7 @@ static int qcom_mhi_qrtr_probe(struct mhi_device *mhi_dev,
>  		qrtr_endpoint_unregister(&qdev->ep);
>  		return rc;
>  	}
> +	complete_all(&qdev->prepared);
>  
>  	dev_dbg(qdev->dev, "Qualcomm MHI QRTR driver probed\n");
>  
> 
> ---
> base-commit: 1ffec08567f426a1c593e038cadc61bdc38cb467
> change-id: 20241104-qrtr_mhi-dfec353030af
> 
> Best regards,
> -- 
> Chris Lew <quic_clew@quicinc.com>
>
Chris Lew Nov. 7, 2024, 7:58 p.m. UTC | #2
On 11/7/2024 3:27 AM, Manivannan Sadhasivam wrote:
> On Mon, Nov 04, 2024 at 05:29:37PM -0800, Chris Lew wrote:
>> From: Bhaumik Bhatt <bbhatt@codeaurora.org>
>>
>> The call to qrtr_endpoint_register() was moved before
>> mhi_prepare_for_transfer_autoqueue() to prevent a case where a dl
>> callback can occur before the qrtr endpoint is registered.
>>
>> Now the reverse can happen where qrtr will try to send a packet
>> before the channels are prepared. Add a wait in the sending path to
>> ensure the channels are prepared before trying to do a ul transfer.
>>
>> Fixes: 68a838b84eff ("net: qrtr: start MHI channel after endpoit creation")
>> Reported-by: Johan Hovold <johan@kernel.org>
>> Closes: https://lore.kernel.org/linux-arm-msm/ZyTtVdkCCES0lkl4@hovoldconsulting.com/
>> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
>> Signed-off-by: Chris Lew <quic_clew@quicinc.com>
> 
> I think we need to have the check in 'mhi_queue()' instead of waiting for the
> channels in client drivers. Would it be a problem if qrtr returns -EAGAIN from
> qcom_mhi_qrtr_send() instead of waiting for the channel?
> 

The packet would get dropped which usually ends up causing some 
functional problem down the line.

I can add retry handling for EAGAIN in qcom_mhi_qrtr_send().

Downstream we had also seen some issue where we received EAGAIN because 
the ring buffer was full. I think we saw issues doing a dumb retry so we 
triggered the retry on getting a ul_callback().

We would need to differentiate between this kind of EAGAIN from a 
ringbuf full EAGAIN.

> - Mani
> 
>> ---
>>   net/qrtr/mhi.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/net/qrtr/mhi.c b/net/qrtr/mhi.c
>> index 69f53625a049..5b7268868bbd 100644
>> --- a/net/qrtr/mhi.c
>> +++ b/net/qrtr/mhi.c
>> @@ -15,6 +15,7 @@ struct qrtr_mhi_dev {
>>   	struct qrtr_endpoint ep;
>>   	struct mhi_device *mhi_dev;
>>   	struct device *dev;
>> +	struct completion prepared;
>>   };
>>   
>>   /* From MHI to QRTR */
>> @@ -53,6 +54,10 @@ static int qcom_mhi_qrtr_send(struct qrtr_endpoint *ep, struct sk_buff *skb)
>>   	if (skb->sk)
>>   		sock_hold(skb->sk);
>>   
>> +	rc = wait_for_completion_interruptible(&qdev->prepared);
>> +	if (rc)
>> +		goto free_skb;
>> +
>>   	rc = skb_linearize(skb);
>>   	if (rc)
>>   		goto free_skb;
>> @@ -85,6 +90,7 @@ static int qcom_mhi_qrtr_probe(struct mhi_device *mhi_dev,
>>   	qdev->mhi_dev = mhi_dev;
>>   	qdev->dev = &mhi_dev->dev;
>>   	qdev->ep.xmit = qcom_mhi_qrtr_send;
>> +	init_completion(&qdev->prepared);
>>   
>>   	dev_set_drvdata(&mhi_dev->dev, qdev);
>>   	rc = qrtr_endpoint_register(&qdev->ep, QRTR_EP_NID_AUTO);
>> @@ -97,6 +103,7 @@ static int qcom_mhi_qrtr_probe(struct mhi_device *mhi_dev,
>>   		qrtr_endpoint_unregister(&qdev->ep);
>>   		return rc;
>>   	}
>> +	complete_all(&qdev->prepared);
>>   
>>   	dev_dbg(qdev->dev, "Qualcomm MHI QRTR driver probed\n");
>>   
>>
>> ---
>> base-commit: 1ffec08567f426a1c593e038cadc61bdc38cb467
>> change-id: 20241104-qrtr_mhi-dfec353030af
>>
>> Best regards,
>> -- 
>> Chris Lew <quic_clew@quicinc.com>
>>
>
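
Note: the reply above says a "channel not prepared" -EAGAIN would have to be told apart from a "ring buffer full" -EAGAIN. The following is a minimal userspace sketch of that distinction, assuming both conditions surface as -EAGAIN from the queueing call. The types and functions (channel_model, retry_trigger, eagain_model_queue, eagain_model_classify) are hypothetical and do not exist in MHI or QRTR.

```c
#include <errno.h>
#include <stdbool.h>

/* What a sender should wait for before retrying (hypothetical). */
enum retry_trigger {
	RETRY_NONE,           /* success or a hard error: nothing to wait for */
	RETRY_ON_PREPARED,    /* channel not started yet: retry once prepared */
	RETRY_ON_UL_CALLBACK  /* ring full: retry when an uplink completion frees a slot */
};

/* Toy channel state; not the MHI channel structure. */
struct channel_model {
	bool prepared;
	unsigned int free_slots;
};

/* Both failure modes collapse into -EAGAIN, as discussed in the thread. */
static int eagain_model_queue(struct channel_model *ch)
{
	if (!ch->prepared || ch->free_slots == 0)
		return -EAGAIN;
	ch->free_slots--;
	return 0;
}

/* Recover the cause of an -EAGAIN from the channel state. */
static enum retry_trigger eagain_model_classify(const struct channel_model *ch,
						int rc)
{
	if (rc != -EAGAIN)
		return RETRY_NONE;
	return ch->prepared ? RETRY_ON_UL_CALLBACK : RETRY_ON_PREPARED;
}
```

In this model the caller needs extra state (the prepared flag) to decide which event to wait for, which is the difficulty the reply points at. A distinct error code per cause would avoid the lookup, but the thread does not settle on one.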
Johan Hovold Nov. 8, 2024, 10:32 a.m. UTC | #3
On Mon, Nov 04, 2024 at 05:29:37PM -0800, Chris Lew wrote:
> From: Bhaumik Bhatt <bbhatt@codeaurora.org>
> 
> The call to qrtr_endpoint_register() was moved before
> mhi_prepare_for_transfer_autoqueue() to prevent a case where a dl
> callback can occur before the qrtr endpoint is registered.
> 
> Now the reverse can happen where qrtr will try to send a packet
> before the channels are prepared. Add a wait in the sending path to
> ensure the channels are prepared before trying to do a ul transfer.
> 
> Fixes: 68a838b84eff ("net: qrtr: start MHI channel after endpoit creation")
> Reported-by: Johan Hovold <johan@kernel.org>
> Closes: https://lore.kernel.org/linux-arm-msm/ZyTtVdkCCES0lkl4@hovoldconsulting.com/
> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
> Signed-off-by: Chris Lew <quic_clew@quicinc.com>

> @@ -53,6 +54,10 @@ static int qcom_mhi_qrtr_send(struct qrtr_endpoint *ep, struct sk_buff *skb)
>  	if (skb->sk)
>  		sock_hold(skb->sk);
>  
> +	rc = wait_for_completion_interruptible(&qdev->prepared);
> +	if (rc)
> +		goto free_skb;
> +
>  	rc = skb_linearize(skb);
>  	if (rc)
>  		goto free_skb;
> @@ -85,6 +90,7 @@ static int qcom_mhi_qrtr_probe(struct mhi_device *mhi_dev,
>  	qdev->mhi_dev = mhi_dev;
>  	qdev->dev = &mhi_dev->dev;
>  	qdev->ep.xmit = qcom_mhi_qrtr_send;
> +	init_completion(&qdev->prepared);
>  
>  	dev_set_drvdata(&mhi_dev->dev, qdev);
>  	rc = qrtr_endpoint_register(&qdev->ep, QRTR_EP_NID_AUTO);
> @@ -97,6 +103,7 @@ static int qcom_mhi_qrtr_probe(struct mhi_device *mhi_dev,
>  		qrtr_endpoint_unregister(&qdev->ep);
>  		return rc;
>  	}
> +	complete_all(&qdev->prepared);
>  
>  	dev_dbg(qdev->dev, "Qualcomm MHI QRTR driver probed\n");

While this probably works, it still looks like a bit of a hack.

Why can't you restructure the code so that the channels are fully
initialised before you register or enable them instead?

Johan
Manivannan Sadhasivam Nov. 24, 2024, 3:04 p.m. UTC | #4
On Thu, Nov 21, 2024 at 04:28:41PM -0800, Chris Lew wrote:
> 
> 
> On 11/8/2024 2:32 AM, Johan Hovold wrote:
> > On Mon, Nov 04, 2024 at 05:29:37PM -0800, Chris Lew wrote:
> > > From: Bhaumik Bhatt <bbhatt@codeaurora.org>
> > > 
> > > The call to qrtr_endpoint_register() was moved before
> > > mhi_prepare_for_transfer_autoqueue() to prevent a case where a dl
> > > callback can occur before the qrtr endpoint is registered.
> > > 
> > > Now the reverse can happen where qrtr will try to send a packet
> > > before the channels are prepared. Add a wait in the sending path to
> > > ensure the channels are prepared before trying to do a ul transfer.
> > > 
> > > Fixes: 68a838b84eff ("net: qrtr: start MHI channel after endpoit creation")
> > > Reported-by: Johan Hovold <johan@kernel.org>
> > > Closes: https://lore.kernel.org/linux-arm-msm/ZyTtVdkCCES0lkl4@hovoldconsulting.com/
> > > Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
> > > Signed-off-by: Chris Lew <quic_clew@quicinc.com>
> > 
> > > @@ -53,6 +54,10 @@ static int qcom_mhi_qrtr_send(struct qrtr_endpoint *ep, struct sk_buff *skb)
> > >   	if (skb->sk)
> > >   		sock_hold(skb->sk);
> > > +	rc = wait_for_completion_interruptible(&qdev->prepared);
> > > +	if (rc)
> > > +		goto free_skb;
> > > +
> > >   	rc = skb_linearize(skb);
> > >   	if (rc)
> > >   		goto free_skb;
> > > @@ -85,6 +90,7 @@ static int qcom_mhi_qrtr_probe(struct mhi_device *mhi_dev,
> > >   	qdev->mhi_dev = mhi_dev;
> > >   	qdev->dev = &mhi_dev->dev;
> > >   	qdev->ep.xmit = qcom_mhi_qrtr_send;
> > > +	init_completion(&qdev->prepared);
> > >   	dev_set_drvdata(&mhi_dev->dev, qdev);
> > >   	rc = qrtr_endpoint_register(&qdev->ep, QRTR_EP_NID_AUTO);
> > > @@ -97,6 +103,7 @@ static int qcom_mhi_qrtr_probe(struct mhi_device *mhi_dev,
> > >   		qrtr_endpoint_unregister(&qdev->ep);
> > >   		return rc;
> > >   	}
> > > +	complete_all(&qdev->prepared);
> > >   	dev_dbg(qdev->dev, "Qualcomm MHI QRTR driver probed\n");
> > 
> > While this probably works, it still looks like a bit of a hack.
> > 
> > Why can't you restructure the code so that the channels are fully
> > initialised before you register or enable them instead?
> > 
> 
> Ok, I think we will have to stop using the autoqueue feature of MHI and
> change the flow to be mhi_prepare_for_transfer() -->
> qrtr_endpoint_register() --> mhi_queue_buf(DMA_FROM_DEVICE). This would make
> it so ul_transfers only happen after mhi_prepare_for_transfer() and
> dl_transfers happen after qrtr_endpoint_register().
> 
> I'll take a stab at implementing this.
> 

Hmm, I thought 'autoqueue' was used for a specific reason. So it is not valid
now?

- Mani
Chris Lew Nov. 25, 2024, 7:05 p.m. UTC | #5
On 11/24/2024 7:04 AM, Manivannan Sadhasivam wrote:
> On Thu, Nov 21, 2024 at 04:28:41PM -0800, Chris Lew wrote:
>>
>>
>> On 11/8/2024 2:32 AM, Johan Hovold wrote:
>>> On Mon, Nov 04, 2024 at 05:29:37PM -0800, Chris Lew wrote:
>>>> From: Bhaumik Bhatt <bbhatt@codeaurora.org>
>>>>
>>>> The call to qrtr_endpoint_register() was moved before
>>>> mhi_prepare_for_transfer_autoqueue() to prevent a case where a dl
>>>> callback can occur before the qrtr endpoint is registered.
>>>>
>>>> Now the reverse can happen where qrtr will try to send a packet
>>>> before the channels are prepared. Add a wait in the sending path to
>>>> ensure the channels are prepared before trying to do a ul transfer.
>>>>
>>>> Fixes: 68a838b84eff ("net: qrtr: start MHI channel after endpoit creation")
>>>> Reported-by: Johan Hovold <johan@kernel.org>
>>>> Closes: https://lore.kernel.org/linux-arm-msm/ZyTtVdkCCES0lkl4@hovoldconsulting.com/
>>>> Signed-off-by: Bhaumik Bhatt <bbhatt@codeaurora.org>
>>>> Signed-off-by: Chris Lew <quic_clew@quicinc.com>
>>>
>>>> @@ -53,6 +54,10 @@ static int qcom_mhi_qrtr_send(struct qrtr_endpoint *ep, struct sk_buff *skb)
>>>>    	if (skb->sk)
>>>>    		sock_hold(skb->sk);
>>>> +	rc = wait_for_completion_interruptible(&qdev->prepared);
>>>> +	if (rc)
>>>> +		goto free_skb;
>>>> +
>>>>    	rc = skb_linearize(skb);
>>>>    	if (rc)
>>>>    		goto free_skb;
>>>> @@ -85,6 +90,7 @@ static int qcom_mhi_qrtr_probe(struct mhi_device *mhi_dev,
>>>>    	qdev->mhi_dev = mhi_dev;
>>>>    	qdev->dev = &mhi_dev->dev;
>>>>    	qdev->ep.xmit = qcom_mhi_qrtr_send;
>>>> +	init_completion(&qdev->prepared);
>>>>    	dev_set_drvdata(&mhi_dev->dev, qdev);
>>>>    	rc = qrtr_endpoint_register(&qdev->ep, QRTR_EP_NID_AUTO);
>>>> @@ -97,6 +103,7 @@ static int qcom_mhi_qrtr_probe(struct mhi_device *mhi_dev,
>>>>    		qrtr_endpoint_unregister(&qdev->ep);
>>>>    		return rc;
>>>>    	}
>>>> +	complete_all(&qdev->prepared);
>>>>    	dev_dbg(qdev->dev, "Qualcomm MHI QRTR driver probed\n");
>>>
>>> While this probably works, it still looks like a bit of a hack.
>>>
>>> Why can't you restructure the code so that the channels are fully
>>> initialised before you register or enable them instead?
>>>
>>
>> Ok, I think we will have to stop using the autoqueue feature of MHI and
>> change the flow to be mhi_prepare_for_transfer() -->
>> qrtr_endpoint_register() --> mhi_queue_buf(DMA_FROM_DEVICE). This would make
>> it so ul_transfers only happen after mhi_prepare_for_transfer() and
>> dl_transfers happen after qrtr_endpoint_register().
>>
>> I'll take a stab at implementing this.
>>
> 
> Hmm, I thought 'autoqueue' was used for a specific reason. So it is not valid
> now?
> 

I think when MHI was being developed, I asked for an interface similar 
to rpmsg. The team came up with the autoqueue feature which made the 
qrtr mhi transport simpler and closer to the smd transport. I can't 
think of a specific reason that QRTR needs autoqueue, but maybe I'll find
it when I start poking at it.

> - Mani
>
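
Note: the reordered flow proposed in the thread (mhi_prepare_for_transfer() --> qrtr_endpoint_register() --> mhi_queue_buf(DMA_FROM_DEVICE)) can be modelled as a small state machine. This is a userspace sketch only. It does not call MHI or QRTR, and all names (transport_model and the flow_* functions) are hypothetical. It assumes that uplink traffic becomes possible only after the endpoint is registered, and that downlink traffic becomes possible only after receive buffers are queued.

```c
#include <errno.h>
#include <stdbool.h>

/* One flag per step of the proposed probe order (hypothetical model). */
struct transport_model {
	bool channels_prepared;   /* after the mhi_prepare_for_transfer() step */
	bool endpoint_registered; /* after the qrtr_endpoint_register() step */
	bool rx_buffers_queued;   /* after the mhi_queue_buf(DMA_FROM_DEVICE) step */
};

static int flow_prepare(struct transport_model *t)
{
	t->channels_prepared = true;
	return 0;
}

/* Registering before the channels are prepared is rejected in this model. */
static int flow_register(struct transport_model *t)
{
	if (!t->channels_prepared)
		return -EINVAL;
	t->endpoint_registered = true;
	return 0;
}

/* Receive buffers are only handed over once the endpoint exists. */
static int flow_queue_rx(struct transport_model *t)
{
	if (!t->endpoint_registered)
		return -EINVAL;
	t->rx_buffers_queued = true;
	return 0;
}

/* Uplink: reachable only via a registered endpoint. */
static int flow_ul(const struct transport_model *t)
{
	if (!t->endpoint_registered)
		return -ENODEV;
	return t->channels_prepared ? 0 : -EAGAIN;
}

/* Downlink: the device can only deliver into queued receive buffers. */
static int flow_dl(const struct transport_model *t)
{
	if (!t->rx_buffers_queued)
		return -EAGAIN;
	return t->endpoint_registered ? 0 : -ENODEV;
}

/* Probe in the proposed order: prepare, register, then queue rx buffers. */
static int flow_probe(struct transport_model *t)
{
	int rc;

	rc = flow_prepare(t);
	if (rc)
		return rc;
	rc = flow_register(t);
	if (rc)
		return rc;
	return flow_queue_rx(t);
}
```

With this ordering neither flow_ul() nor flow_dl() can succeed while the step it depends on is missing, so the model needs no wait in the send path. Whether the real driver can drop autoqueue without other side effects is the open question in the thread.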
Patch

diff --git a/net/qrtr/mhi.c b/net/qrtr/mhi.c
index 69f53625a049..5b7268868bbd 100644
--- a/net/qrtr/mhi.c
+++ b/net/qrtr/mhi.c
@@ -15,6 +15,7 @@  struct qrtr_mhi_dev {
 	struct qrtr_endpoint ep;
 	struct mhi_device *mhi_dev;
 	struct device *dev;
+	struct completion prepared;
 };
 
 /* From MHI to QRTR */
@@ -53,6 +54,10 @@  static int qcom_mhi_qrtr_send(struct qrtr_endpoint *ep, struct sk_buff *skb)
 	if (skb->sk)
 		sock_hold(skb->sk);
 
+	rc = wait_for_completion_interruptible(&qdev->prepared);
+	if (rc)
+		goto free_skb;
+
 	rc = skb_linearize(skb);
 	if (rc)
 		goto free_skb;
@@ -85,6 +90,7 @@  static int qcom_mhi_qrtr_probe(struct mhi_device *mhi_dev,
 	qdev->mhi_dev = mhi_dev;
 	qdev->dev = &mhi_dev->dev;
 	qdev->ep.xmit = qcom_mhi_qrtr_send;
+	init_completion(&qdev->prepared);
 
 	dev_set_drvdata(&mhi_dev->dev, qdev);
 	rc = qrtr_endpoint_register(&qdev->ep, QRTR_EP_NID_AUTO);
@@ -97,6 +103,7 @@  static int qcom_mhi_qrtr_probe(struct mhi_device *mhi_dev,
 		qrtr_endpoint_unregister(&qdev->ep);
 		return rc;
 	}
+	complete_all(&qdev->prepared);
 
 	dev_dbg(qdev->dev, "Qualcomm MHI QRTR driver probed\n");