diff mbox series

xhci: Fix Link TRB DMA in command ring stopped completion event

Message ID 20241018195953.12315-1-quic_faisalh@quicinc.com
State New
Headers show
Series xhci: Fix Link TRB DMA in command ring stopped completion event | expand

Commit Message

Faisal Hassan Oct. 18, 2024, 7:59 p.m. UTC
During the aborting of a command, the software receives a command
completion event for the command ring stopped, with the TRB pointing
to the next TRB after the aborted command.

If the command we abort is located just before the Link TRB in the
command ring, then during the 'command ring stopped' completion event,
the xHC gives the Link TRB in the event's cmd DMA, which causes a
mismatch in handling command completion event.

To handle this situation, an additional check has been added to ignore
the mismatch error and continue the operation.

Cc: stable@vger.kernel.org
Signed-off-by: Faisal Hassan <quic_faisalh@quicinc.com>
---
 drivers/usb/host/xhci-ring.c | 38 +++++++++++++++++++++++++++++++++---
 1 file changed, 35 insertions(+), 3 deletions(-)

Comments

Greg KH Oct. 19, 2024, 6:34 a.m. UTC | #1
On Sat, Oct 19, 2024 at 01:29:53AM +0530, Faisal Hassan wrote:
> During the aborting of a command, the software receives a command
> completion event for the command ring stopped, with the TRB pointing
> to the next TRB after the aborted command.
> 
> If the command we abort is located just before the Link TRB in the
> command ring, then during the 'command ring stopped' completion event,
> the xHC gives the Link TRB in the event's cmd DMA, which causes a
> mismatch in handling command completion event.
> 
> To handle this situation, an additional check has been added to ignore
> the mismatch error and continue the operation.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Faisal Hassan <quic_faisalh@quicinc.com>

What commit id does this fix?

thanks,

greg k-h
Michał Pecio Oct. 19, 2024, 7:20 a.m. UTC | #2
Hi,

> During the aborting of a command, the software receives a command
> completion event for the command ring stopped, with the TRB pointing
> to the next TRB after the aborted command.
>
> If the command we abort is located just before the Link TRB in the
> command ring, then during the 'command ring stopped' completion event,
> the xHC gives the Link TRB in the event's cmd DMA, which causes a
> mismatch in handling command completion event.
>
> To handle this situation, an additional check has been added to ignore
> the mismatch error and continue the operation.

Thanks, I remember having some issues with command aborts, but I blamed
them on my own bugs, although I never found what the problem was. I may
take a look at it later, but I'm currently busy with other things.

No comment about validity of this patch for now, but a few remarks:

>+static bool is_dma_link_trb(struct xhci_ring *ring, dma_addr_t dma)
>+{
>+	struct xhci_segment *seg;
>+	union xhci_trb *trb;
>+	dma_addr_t trb_dma;
>+	int i;
>+
>+	seg = ring->first_seg;
>+	do {
>+		for (i = 0; i < TRBS_PER_SEGMENT; i++) {
>+			trb = &seg->trbs[i];
>+			trb_dma = seg->dma + (i * sizeof(union xhci_trb));
>+
>+			if (trb_is_link(trb) && trb_dma == dma)
>+				return true;
>+		}

You don't need to iterate the array. Something like this should work:
do {
	if (in_range(dma, seg->dma, TRB_SEGMENT_SIZE)) {
		/* found the TRB, check if it's link */
		trb = &seg->trbs[(dma - seg->dma) / sizeof(*trb)];
		return trb_is_link(trb);
	}
	// try next seg, while (blah blah), return false

We should probably have a helper for (ring, dma)->trb lookups, but
for stable it may be sensible to do it without excess complication.

>+	if ((!cmd_dequeue_dma || cmd_dma != (u64)cmd_dequeue_dma) &&
>+	    !(cmd_comp_code == COMP_COMMAND_RING_STOPPED &&
>+	      is_dma_link_trb(xhci->cmd_ring, cmd_dma))) {

This if statement is quite complex now. I would be tempted to write
it this way instead:

/* original comment */
if (cmd_dma != dequeue_dma) {
	/* your new comment */
	if (! (RING_STOPPED && is_link)) {
		warn();
		return;
	}
}

Regards,
Michal
Faisal Hassan Oct. 19, 2024, 5:23 p.m. UTC | #3
On 10/19/2024 12:04 PM, Greg Kroah-Hartman wrote:
> On Sat, Oct 19, 2024 at 01:29:53AM +0530, Faisal Hassan wrote:
>> During the aborting of a command, the software receives a command
>> completion event for the command ring stopped, with the TRB pointing
>> to the next TRB after the aborted command.
>>
>> If the command we abort is located just before the Link TRB in the
>> command ring, then during the 'command ring stopped' completion event,
>> the xHC gives the Link TRB in the event's cmd DMA, which causes a
>> mismatch in handling command completion event.
>>
>> To handle this situation, an additional check has been added to ignore
>> the mismatch error and continue the operation.
>>
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Faisal Hassan <quic_faisalh@quicinc.com>
> 
> What commit id does this fix?
> 
> thanks,
> 
> greg k-h

This does not fix any specific commit. It appears that the
implementation has been missing since the very beginning of xhci-ring.c.
Therefore, can I reference the first commit in the Fixes tag:
7f84eef0dafb ("USB: xhci: No-op command queueing and irq handler.")?

Thanks,
Faisal
Faisal Hassan Oct. 19, 2024, 5:52 p.m. UTC | #4
Hi Michal,

On 10/19/2024 12:50 PM, Michał Pecio wrote:
> Hi,
> 
>> During the aborting of a command, the software receives a command
>> completion event for the command ring stopped, with the TRB pointing
>> to the next TRB after the aborted command.
>>
>> If the command we abort is located just before the Link TRB in the
>> command ring, then during the 'command ring stopped' completion event,
>> the xHC gives the Link TRB in the event's cmd DMA, which causes a
>> mismatch in handling command completion event.
>>
>> To handle this situation, an additional check has been added to ignore
>> the mismatch error and continue the operation.
> 
> Thanks, I remember having some issues with command aborts, but I blamed
> them on my own bugs, although I never found what the problem was. I may
> take a look at it later, but I'm currently busy with other things.
> 
> No comment about validity of this patch for now, but a few remarks:
> 
>> +static bool is_dma_link_trb(struct xhci_ring *ring, dma_addr_t dma)
>> +{
>> +	struct xhci_segment *seg;
>> +	union xhci_trb *trb;
>> +	dma_addr_t trb_dma;
>> +	int i;
>> +
>> +	seg = ring->first_seg;
>> +	do {
>> +		for (i = 0; i < TRBS_PER_SEGMENT; i++) {
>> +			trb = &seg->trbs[i];
>> +			trb_dma = seg->dma + (i * sizeof(union xhci_trb));
>> +
>> +			if (trb_is_link(trb) && trb_dma == dma)
>> +				return true;
>> +		}
> 
> You don't need to iterate the array. Something like this should work:
> do {
> 	if (in_range(dma, seg->dma, TRB_SEGMENT_SIZE)) {
> 		/* found the TRB, check if it's link */
> 		trb = &seg->trbs[(dma - seg->dma) / sizeof(*trb)];
> 		return trb_is_link(trb);
> 	}
> 	// try next seg, while (blah blah), return false
> 

Sure, this looks good. Let me revise the patch.

> We should probably have a helper for (ring, dma)->trb lookups, but
> for stable it may be sensible to do it without excess complication.
> 
>> +	if ((!cmd_dequeue_dma || cmd_dma != (u64)cmd_dequeue_dma) &&
>> +	    !(cmd_comp_code == COMP_COMMAND_RING_STOPPED &&
>> +	      is_dma_link_trb(xhci->cmd_ring, cmd_dma))) {
> 
> This if statement is quite complex now. I would be tempted to write
> it this way instead:
> 
> /* original comment */
> if (cmd_dma != dequeue_dma) {
> 	/* your new comment */
> 	if (! (RING_STOPPED && is_link)) {
> 		warn();
> 		return;
> 	}
> }
> 
> Regards,
> Michal

Thank you for the suggestions. I will submit a v2 version.

Regards,
Faisal
diff mbox series

Patch

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index b2950c35c740..43926c378df9 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -126,6 +126,32 @@  static void inc_td_cnt(struct urb *urb)
 	urb_priv->num_tds_done++;
 }
 
+/*
+ * Return true if the DMA is pointing to a Link TRB in the ring;
+ * otherwise, return false.
+ */
+static bool is_dma_link_trb(struct xhci_ring *ring, dma_addr_t dma)
+{
+	struct xhci_segment *seg;
+	union xhci_trb *trb;
+	dma_addr_t trb_dma;
+	int i;
+
+	seg = ring->first_seg;
+	do {
+		for (i = 0; i < TRBS_PER_SEGMENT; i++) {
+			trb = &seg->trbs[i];
+			trb_dma = seg->dma + (i * sizeof(union xhci_trb));
+
+			if (trb_is_link(trb) && trb_dma == dma)
+				return true;
+		}
+		seg = seg->next;
+	} while (seg != ring->first_seg);
+
+	return false;
+}
+
 static void trb_to_noop(union xhci_trb *trb, u32 noop_type)
 {
 	if (trb_is_link(trb)) {
@@ -1718,13 +1744,21 @@  static void handle_cmd_completion(struct xhci_hcd *xhci,
 
 	trace_xhci_handle_command(xhci->cmd_ring, &cmd_trb->generic);
 
+	cmd_comp_code = GET_COMP_CODE(le32_to_cpu(event->status));
 	cmd_dequeue_dma = xhci_trb_virt_to_dma(xhci->cmd_ring->deq_seg,
 			cmd_trb);
 	/*
 	 * Check whether the completion event is for our internal kept
 	 * command.
+	 * For the 'command ring stopped' completion event, there is a
+	 * risk of a mismatch in dequeue pointers if we abort the command
+	 * just before the link TRB in the command ring. In this scenario,
+	 * the cmd_dma in the event would point to a link TRB, while the
+	 * software dequeue pointer circles back to the start.
 	 */
-	if (!cmd_dequeue_dma || cmd_dma != (u64)cmd_dequeue_dma) {
+	if ((!cmd_dequeue_dma || cmd_dma != (u64)cmd_dequeue_dma) &&
+	    !(cmd_comp_code == COMP_COMMAND_RING_STOPPED &&
+	      is_dma_link_trb(xhci->cmd_ring, cmd_dma))) {
 		xhci_warn(xhci,
 			  "ERROR mismatched command completion event\n");
 		return;
@@ -1734,8 +1768,6 @@  static void handle_cmd_completion(struct xhci_hcd *xhci,
 
 	cancel_delayed_work(&xhci->cmd_timer);
 
-	cmd_comp_code = GET_COMP_CODE(le32_to_cpu(event->status));
-
 	/* If CMD ring stopped we own the trbs between enqueue and dequeue */
 	if (cmd_comp_code == COMP_COMMAND_RING_STOPPED) {
 		complete_all(&xhci->cmd_ring_stop_completion);