diff mbox series

[v2] usb: xhci: quirk for data loss in ISOC transfers

Message ID 20241112122104.120633-1-Raju.Rangoju@amd.com
State Superseded
Headers show
Series [v2] usb: xhci: quirk for data loss in ISOC transfers | expand

Commit Message

Raju Rangoju Nov. 12, 2024, 12:21 p.m. UTC
During the High-Speed Isochronous Audio transfers, xHCI
controller on certain AMD platforms experiences momentary data
loss. This results in Missed Service Errors (MSE) being
generated by the xHCI.

The root cause of the MSE is attributed to the ISOC OUT endpoint
being omitted from scheduling. This can happen either when an IN
endpoint with a 64ms service interval is pre-scheduled prior to
the ISOC OUT endpoint or when the interval of the ISOC OUT
endpoint is shorter than that of the IN endpoint. Consequently,
the OUT service is neglected when an IN endpoint with a service
interval exceeding 32ms is scheduled concurrently (every 64ms in
this scenario).

This issue is particularly seen on certain older AMD platforms.
To mitigate this problem, it is recommended to adjust the service
interval of the IN endpoint to not exceed 32ms (interval 8). This
adjustment ensures that the OUT endpoint will not be bypassed,
even if a smaller interval value is utilized.

Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
---
Changes since v1:
 - replaced hex values with pci device names
 - corrected the commit message

 drivers/usb/host/xhci-mem.c |  5 +++++
 drivers/usb/host/xhci-pci.c | 25 +++++++++++++++++++++++++
 drivers/usb/host/xhci.h     |  1 +
 3 files changed, 31 insertions(+)

Comments

Greg Kroah-Hartman Nov. 12, 2024, 12:24 p.m. UTC | #1
On Tue, Nov 12, 2024 at 05:51:04PM +0530, Raju Rangoju wrote:
> During the High-Speed Isochronous Audio transfers, xHCI
> controller on certain AMD platforms experiences momentary data
> loss. This results in Missed Service Errors (MSE) being
> generated by the xHCI.
> 
> The root cause of the MSE is attributed to the ISOC OUT endpoint
> being omitted from scheduling. This can happen either when an IN
> endpoint with a 64ms service interval is pre-scheduled prior to
> the ISOC OUT endpoint or when the interval of the ISOC OUT
> endpoint is shorter than that of the IN endpoint. Consequently,
> the OUT service is neglected when an IN endpoint with a service
> interval exceeding 32ms is scheduled concurrently (every 64ms in
> this scenario).
> 
> This issue is particularly seen on certain older AMD platforms.
> To mitigate this problem, it is recommended to adjust the service
> interval of the IN endpoint to not exceed 32ms (interval 8). This
> adjustment ensures that the OUT endpoint will not be bypassed,
> even if a smaller interval value is utilized.
> 
> Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>

You don't want this backported to any older kernels?  Why not?

thanks,

greg k-h
Greg Kroah-Hartman Nov. 13, 2024, 6:05 a.m. UTC | #2
On Tue, Nov 12, 2024 at 07:02:40PM +0530, Rangoju, Raju wrote:
> 
> 
> On 11/12/2024 5:54 PM, Greg KH wrote:
> > On Tue, Nov 12, 2024 at 05:51:04PM +0530, Raju Rangoju wrote:
> > > During the High-Speed Isochronous Audio transfers, xHCI
> > > controller on certain AMD platforms experiences momentary data
> > > loss. This results in Missed Service Errors (MSE) being
> > > generated by the xHCI.
> > > 
> > > The root cause of the MSE is attributed to the ISOC OUT endpoint
> > > being omitted from scheduling. This can happen either when an IN
> > > endpoint with a 64ms service interval is pre-scheduled prior to
> > > the ISOC OUT endpoint or when the interval of the ISOC OUT
> > > endpoint is shorter than that of the IN endpoint. Consequently,
> > > the OUT service is neglected when an IN endpoint with a service
> > > interval exceeding 32ms is scheduled concurrently (every 64ms in
> > > this scenario).
> > > 
> > > This issue is particularly seen on certain older AMD platforms.
> > > To mitigate this problem, it is recommended to adjust the service
> > > interval of the IN endpoint to not exceed 32ms (interval 8). This
> > > adjustment ensures that the OUT endpoint will not be bypassed,
> > > even if a smaller interval value is utilized.
> > > 
> > > Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
> > 
> > You don't want this backported to any older kernels?  Why not?
> 
> Hi Greg, Yes, backporting is needed, but some of products were released back
> in 2018, not sure of the exact commit id to quote here for backporting as
> there were no precise commits that added this initial support in the first
> place.
> 
> Would you mind tagging it to all stable kernels. Let me know if the patch
> needs to be respinned.

Yes, please resend it with the proper tag, don't ask maintainers to
hand-edit changes for you, that does not scale at all.

thanks,

greg k-h
Raju Rangoju Nov. 13, 2024, 6:11 a.m. UTC | #3
On 11/13/2024 11:35 AM, Greg KH wrote:
> On Tue, Nov 12, 2024 at 07:02:40PM +0530, Rangoju, Raju wrote:
>>
>>
>> On 11/12/2024 5:54 PM, Greg KH wrote:
>>> On Tue, Nov 12, 2024 at 05:51:04PM +0530, Raju Rangoju wrote:
>>>> During the High-Speed Isochronous Audio transfers, xHCI
>>>> controller on certain AMD platforms experiences momentary data
>>>> loss. This results in Missed Service Errors (MSE) being
>>>> generated by the xHCI.
>>>>
>>>> The root cause of the MSE is attributed to the ISOC OUT endpoint
>>>> being omitted from scheduling. This can happen either when an IN
>>>> endpoint with a 64ms service interval is pre-scheduled prior to
>>>> the ISOC OUT endpoint or when the interval of the ISOC OUT
>>>> endpoint is shorter than that of the IN endpoint. Consequently,
>>>> the OUT service is neglected when an IN endpoint with a service
>>>> interval exceeding 32ms is scheduled concurrently (every 64ms in
>>>> this scenario).
>>>>
>>>> This issue is particularly seen on certain older AMD platforms.
>>>> To mitigate this problem, it is recommended to adjust the service
>>>> interval of the IN endpoint to not exceed 32ms (interval 8). This
>>>> adjustment ensures that the OUT endpoint will not be bypassed,
>>>> even if a smaller interval value is utilized.
>>>>
>>>> Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
>>>
>>> You don't want this backported to any older kernels?  Why not?
>>
>> Hi Greg, Yes, backporting is needed, but some of products were released back
>> in 2018, not sure of the exact commit id to quote here for backporting as
>> there were no precise commits that added this initial support in the first
>> place.
>>
>> Would you mind tagging it to all stable kernels. Let me know if the patch
>> needs to be respinned.
> 
> Yes, please resend it with the proper tag, don't ask maintainers to
> hand-edit changes for you, that does not scale at all.
>

Sure, I'll re-spin with a stable tag.

> thanks,
> 
> greg k-h
diff mbox series

Patch

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index d2900197a49e..4892bb9afa6e 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -1426,6 +1426,11 @@  int xhci_endpoint_init(struct xhci_hcd *xhci,
 	/* Periodic endpoint bInterval limit quirk */
 	if (usb_endpoint_xfer_int(&ep->desc) ||
 	    usb_endpoint_xfer_isoc(&ep->desc)) {
+		if ((xhci->quirks & XHCI_LIMIT_ENDPOINT_INTERVAL_9) &&
+		    usb_endpoint_xfer_int(&ep->desc) &&
+		    interval >= 9) {
+			interval = 8;
+		}
 		if ((xhci->quirks & XHCI_LIMIT_ENDPOINT_INTERVAL_7) &&
 		    udev->speed >= USB_SPEED_HIGH &&
 		    interval >= 7) {
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index cb07cee9ed0c..82657ca30030 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -69,12 +69,22 @@ 
 #define PCI_DEVICE_ID_INTEL_TITAN_RIDGE_4C_XHCI		0x15ec
 #define PCI_DEVICE_ID_INTEL_TITAN_RIDGE_DD_XHCI		0x15f0
 
+#define PCI_DEVICE_ID_AMD_ARIEL_TYPEC_XHCI		0x13ed
+#define PCI_DEVICE_ID_AMD_ARIEL_TYPEA_XHCI		0x13ee
+#define PCI_DEVICE_ID_AMD_STARSHIP_XHCI			0x148c
+#define PCI_DEVICE_ID_AMD_FIREFLIGHT_15D4_XHCI		0x15d4
+#define PCI_DEVICE_ID_AMD_FIREFLIGHT_15D5_XHCI		0x15d5
+#define PCI_DEVICE_ID_AMD_RAVEN_15E0_XHCI		0x15e0
+#define PCI_DEVICE_ID_AMD_RAVEN_15E1_XHCI		0x15e1
+#define PCI_DEVICE_ID_AMD_RAVEN2_XHCI			0x15e5
 #define PCI_DEVICE_ID_AMD_RENOIR_XHCI			0x1639
 #define PCI_DEVICE_ID_AMD_PROMONTORYA_4			0x43b9
 #define PCI_DEVICE_ID_AMD_PROMONTORYA_3			0x43ba
 #define PCI_DEVICE_ID_AMD_PROMONTORYA_2			0x43bb
 #define PCI_DEVICE_ID_AMD_PROMONTORYA_1			0x43bc
 
+#define PCI_DEVICE_ID_ATI_NAVI10_7316_XHCI		0x7316
+
 #define PCI_DEVICE_ID_ASMEDIA_1042_XHCI			0x1042
 #define PCI_DEVICE_ID_ASMEDIA_1042A_XHCI		0x1142
 #define PCI_DEVICE_ID_ASMEDIA_1142_XHCI			0x1242
@@ -284,6 +294,21 @@  static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
 	if (pdev->vendor == PCI_VENDOR_ID_NEC)
 		xhci->quirks |= XHCI_NEC_HOST;
 
+	if (pdev->vendor == PCI_VENDOR_ID_AMD &&
+	    (pdev->device == PCI_DEVICE_ID_AMD_ARIEL_TYPEC_XHCI ||
+	     pdev->device == PCI_DEVICE_ID_AMD_ARIEL_TYPEA_XHCI ||
+	     pdev->device == PCI_DEVICE_ID_AMD_STARSHIP_XHCI ||
+	     pdev->device == PCI_DEVICE_ID_AMD_FIREFLIGHT_15D4_XHCI ||
+	     pdev->device == PCI_DEVICE_ID_AMD_FIREFLIGHT_15D5_XHCI ||
+	     pdev->device == PCI_DEVICE_ID_AMD_RAVEN_15E0_XHCI ||
+	     pdev->device == PCI_DEVICE_ID_AMD_RAVEN_15E1_XHCI ||
+	     pdev->device == PCI_DEVICE_ID_AMD_RAVEN2_XHCI))
+		xhci->quirks |= XHCI_LIMIT_ENDPOINT_INTERVAL_9;
+
+	if (pdev->vendor == PCI_VENDOR_ID_ATI &&
+	    pdev->device == PCI_DEVICE_ID_ATI_NAVI10_7316_XHCI)
+		xhci->quirks |= XHCI_LIMIT_ENDPOINT_INTERVAL_9;
+
 	if (pdev->vendor == PCI_VENDOR_ID_AMD && xhci->hci_version == 0x96)
 		xhci->quirks |= XHCI_AMD_0x96_HOST;
 
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index f0fb696d5619..fa69f7ac09b5 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1624,6 +1624,7 @@  struct xhci_hcd {
 #define XHCI_ZHAOXIN_HOST	BIT_ULL(46)
 #define XHCI_WRITE_64_HI_LO	BIT_ULL(47)
 #define XHCI_CDNS_SCTX_QUIRK	BIT_ULL(48)
+#define XHCI_LIMIT_ENDPOINT_INTERVAL_9	BIT_ULL(49)
 
 	unsigned int		num_active_eps;
 	unsigned int		limit_active_eps;