mbox series

[v3,0/4] Add logic to consolidate TRBs for Synopsys xHC

Message ID cover.1590415123.git.joglekar@synopsys.com
Headers show
Series Add logic to consolidate TRBs for Synopsys xHC | expand

Message

Tejas Joglekar May 27, 2020, 10:40 a.m. UTC
The Synopsys xHC has an internal TRB cache of size TRB_CACHE_SIZE for
each endpoint. The default value for TRB_CACHE_SIZE is 16 for SS and 8
for HS. The controller loads and updates the TRB cache from the
transfer ring in system memory whenever the driver issues a start
transfer or update transfer command.

For chained TRBs, the Synopsys xHC requires that the total amount of
bytes for all TRBs loaded in the TRB cache be greater than or equal to
1 MPS. Or the chain ends within the TRB cache (with a last TRB).

If this requirement is not met, the controller will not be able to
send or receive a packet and it will hang causing a driver timeout and
error.

This patch set adds logic to the XHCI driver to detect and prevent this
from happening along with the quirk to enable this logic for Synopsys
HAPS platform.

Based on Mathias's feedback on previous implementation where consolidation
was done in TRB cache, with this patch series the implementation is done
during mapping of the URB by consolidating the SG list into a temporary
buffer if the SG list buffer sizes within TRB_CACHE_SIZE is less than MPS.

Changes since v2:
 - Modified the xhci_unmap_temp_buffer function to unmap dma and clear
   the dma flag
 - Removed RFC tag

Changes since v1:
 - Comments from Greg are addressed on [PATCH 4/4] and [PATCH 1/4]
 - Renamed the property and quirk as in other patches based on [PATCH 1/4]

Tejas Joglekar (4):
  dt-bindings: usb: Add documentation for SG trb cache size quirk
  usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK
  usb: dwc3: Add device property sgl-trb-cache-size-quirk
  usb: xhci: Use temporary buffer to consolidate SG

 Documentation/devicetree/bindings/usb/dwc3.txt     |   4 +
 Documentation/devicetree/bindings/usb/usb-xhci.txt |   3 +
 drivers/usb/dwc3/core.c                            |   2 +
 drivers/usb/dwc3/core.h                            |   2 +
 drivers/usb/dwc3/dwc3-haps.c                       |   1 +
 drivers/usb/dwc3/host.c                            |   6 +-
 drivers/usb/host/xhci-pci.c                        |   3 +
 drivers/usb/host/xhci-plat.c                       |   4 +
 drivers/usb/host/xhci-ring.c                       |   2 +-
 drivers/usb/host/xhci.c                            | 135 +++++++++++++++++++++
 drivers/usb/host/xhci.h                            |   5 +
 11 files changed, 165 insertions(+), 2 deletions(-)

Comments

Tejas Joglekar July 6, 2020, 5:07 a.m. UTC | #1
Hi Felipe,
On 5/27/2020 4:11 PM, Tejas Joglekar wrote:
> This commit adds the sgl-trb-cache-size-quirk property to enable

> quirk for the XHCI driver with Synopsys xHC. This property is

> enabled as initial property for the dwc3-haps driver.

> 

> Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>

> ---

>  drivers/usb/dwc3/core.c      | 2 ++

>  drivers/usb/dwc3/core.h      | 2 ++

>  drivers/usb/dwc3/dwc3-haps.c | 1 +

>  drivers/usb/dwc3/host.c      | 6 +++++-

>  4 files changed, 10 insertions(+), 1 deletion(-)

> 

> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c

> index 25c686a752b0..bc295477e1bc 100644

> --- a/drivers/usb/dwc3/core.c

> +++ b/drivers/usb/dwc3/core.c

> @@ -1299,6 +1299,8 @@ static void dwc3_get_properties(struct dwc3 *dwc)

>  				"snps,usb3_lpm_capable");

>  	dwc->usb2_lpm_disable = device_property_read_bool(dev,

>  				"snps,usb2-lpm-disable");

> +	dwc->sgl_trb_cache_size_quirk = device_property_read_bool(dev,

> +				"snps,sgl-trb-cache-size-quirk");

>  	device_property_read_u8(dev, "snps,rx-thr-num-pkt-prd",

>  				&rx_thr_num_pkt_prd);

>  	device_property_read_u8(dev, "snps,rx-max-burst-prd",

> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h

> index 013f42a2b5dc..0dca0dbf4309 100644

> --- a/drivers/usb/dwc3/core.h

> +++ b/drivers/usb/dwc3/core.h

> @@ -1021,6 +1021,7 @@ struct dwc3_scratchpad_array {

>   *			not needed for DWC_usb31 version 1.70a-ea06 and below

>   * @usb3_lpm_capable: set if hadrware supports Link Power Management

>   * @usb2_lpm_disable: set to disable usb2 lpm

> + * @sgl_trb_cache_size_quirk: set to enable the SG list consolidation

>   * @disable_scramble_quirk: set if we enable the disable scramble quirk

>   * @u2exit_lfps_quirk: set if we enable u2exit lfps quirk

>   * @u2ss_inp3_quirk: set if we enable P3 OK for U2/SS Inactive quirk

> @@ -1220,6 +1221,7 @@ struct dwc3 {

>  	unsigned		dis_start_transfer_quirk:1;

>  	unsigned		usb3_lpm_capable:1;

>  	unsigned		usb2_lpm_disable:1;

> +	unsigned		sgl_trb_cache_size_quirk:1;

>  

>  	unsigned		disable_scramble_quirk:1;

>  	unsigned		u2exit_lfps_quirk:1;

> diff --git a/drivers/usb/dwc3/dwc3-haps.c b/drivers/usb/dwc3/dwc3-haps.c

> index 3cecbf169452..9311cbe5f264 100644

> --- a/drivers/usb/dwc3/dwc3-haps.c

> +++ b/drivers/usb/dwc3/dwc3-haps.c

> @@ -29,6 +29,7 @@ static const struct property_entry initial_properties[] = {

>  	PROPERTY_ENTRY_BOOL("snps,usb3_lpm_capable"),

>  	PROPERTY_ENTRY_BOOL("snps,has-lpm-erratum"),

>  	PROPERTY_ENTRY_BOOL("snps,dis_enblslpm_quirk"),

> +	PROPERTY_ENTRY_BOOL("snps,sgl-trb-cache-size-quirk"),

>  	PROPERTY_ENTRY_BOOL("linux,sysdev_is_parent"),

>  	{ },

>  };

> diff --git a/drivers/usb/dwc3/host.c b/drivers/usb/dwc3/host.c

> index bef1c1ac2067..e0089c82728e 100644

> --- a/drivers/usb/dwc3/host.c

> +++ b/drivers/usb/dwc3/host.c

> @@ -44,7 +44,7 @@ static int dwc3_host_get_irq(struct dwc3 *dwc)

>  

>  int dwc3_host_init(struct dwc3 *dwc)

>  {

> -	struct property_entry	props[4];

> +	struct property_entry	props[5];

>  	struct platform_device	*xhci;

>  	int			ret, irq;

>  	struct resource		*res;

> @@ -95,6 +95,10 @@ int dwc3_host_init(struct dwc3 *dwc)

>  	if (dwc->usb2_lpm_disable)

>  		props[prop_idx++] = PROPERTY_ENTRY_BOOL("usb2-lpm-disable");

>  

> +	if (dwc->sgl_trb_cache_size_quirk)

> +		props[prop_idx++] =

> +			PROPERTY_ENTRY_BOOL("sgl-trb-cache-size-quirk");

> +

>  	/**

>  	 * WORKAROUND: dwc3 revisions <=3.00a have a limitation

>  	 * where Port Disable command doesn't work.

> 

Does this implementation looks good to you? Rob has some concerned over the DT entries,
you suggested using compatible string with this quirk addition.
Can you please brief about how you would like to have this quirk implemented?
I can send the updated patch. My patch series is pending for merge just because of the
DT and quirk issue. Can you please help?

Thanks & Regards,
 Tejas Joglekar
Felipe Balbi July 6, 2020, 6:43 a.m. UTC | #2
Hi,

Tejas Joglekar <Tejas.Joglekar@synopsys.com> writes:
>> @@ -95,6 +95,10 @@ int dwc3_host_init(struct dwc3 *dwc)

>>  	if (dwc->usb2_lpm_disable)

>>  		props[prop_idx++] = PROPERTY_ENTRY_BOOL("usb2-lpm-disable");

>>  

>> +	if (dwc->sgl_trb_cache_size_quirk)

>> +		props[prop_idx++] =

>> +			PROPERTY_ENTRY_BOOL("sgl-trb-cache-size-quirk");

>> +

>>  	/**

>>  	 * WORKAROUND: dwc3 revisions <=3.00a have a limitation

>>  	 * where Port Disable command doesn't work.

>> 

> Does this implementation looks good to you? Rob has some concerned over the DT entries,

> you suggested using compatible string with this quirk addition.

> Can you please brief about how you would like to have this quirk implemented?

> I can send the updated patch. My patch series is pending for merge just because of the

> DT and quirk issue. Can you please help?


Yeah, you need to get into an agreement with Rob :-) I don't mind having
extra DT flags for things which can't be detected in runtime, Rob
disagrees.

-- 
balbi
Tejas Joglekar July 7, 2020, 5:21 p.m. UTC | #3
Hello Rob,
On 7/6/2020 12:13 PM, Felipe Balbi wrote:
> 

> Hi,

> 

> Tejas Joglekar <Tejas.Joglekar@synopsys.com> writes:

>>> @@ -95,6 +95,10 @@ int dwc3_host_init(struct dwc3 *dwc)

>>>  	if (dwc->usb2_lpm_disable)

>>>  		props[prop_idx++] = PROPERTY_ENTRY_BOOL("usb2-lpm-disable");

>>>  

>>> +	if (dwc->sgl_trb_cache_size_quirk)

>>> +		props[prop_idx++] =

>>> +			PROPERTY_ENTRY_BOOL("sgl-trb-cache-size-quirk");

>>> +

>>>  	/**

>>>  	 * WORKAROUND: dwc3 revisions <=3.00a have a limitation

>>>  	 * where Port Disable command doesn't work.

>>>

>> Does this implementation looks good to you? Rob has some concerned over the DT entries,

>> you suggested using compatible string with this quirk addition.

>> Can you please brief about how you would like to have this quirk implemented?

>> I can send the updated patch. My patch series is pending for merge just because of the

>> DT and quirk issue. Can you please help?

> 

> Yeah, you need to get into an agreement with Rob :-) I don't mind having

> extra DT flags for things which can't be detected in runtime, Rob

> disagrees.

> 

The compatible string is not suitable option as it does not work with platform drivers
with PCI based system. Also Synopsys controllers IP version register is not visible to xhci
driver and hence we don't have separate compatible string for each Synopsys version on the
xhci driver side. 
Due to which I depend on DT flag addition for the quirk. Can we add these DT flags and quirk?


Thanks & Regards,
 Tejas Joglekar
Tejas Joglekar July 12, 2020, 1:45 p.m. UTC | #4
Hello Rob,
On 7/6/2020 12:13 PM, Felipe Balbi wrote:
> 

> Hi,

> 

> Tejas Joglekar <Tejas.Joglekar@synopsys.com> writes:

>>> @@ -95,6 +95,10 @@ int dwc3_host_init(struct dwc3 *dwc)

>>>  	if (dwc->usb2_lpm_disable)

>>>  		props[prop_idx++] = PROPERTY_ENTRY_BOOL("usb2-lpm-disable");

>>>  

>>> +	if (dwc->sgl_trb_cache_size_quirk)

>>> +		props[prop_idx++] =

>>> +			PROPERTY_ENTRY_BOOL("sgl-trb-cache-size-quirk");

>>> +

>>>  	/**

>>>  	 * WORKAROUND: dwc3 revisions <=3.00a have a limitation

>>>  	 * where Port Disable command doesn't work.

>>>

>> Does this implementation looks good to you? Rob has some concerned over the DT entries,

>> you suggested using compatible string with this quirk addition.

>> Can you please brief about how you would like to have this quirk implemented?

>> I can send the updated patch. My patch series is pending for merge just because of the

>> DT and quirk issue. Can you please help?

> 

> Yeah, you need to get into an agreement with Rob :-) I don't mind having

> extra DT flags for things which can't be detected in runtime, Rob

> disagrees.

> 

The compatible string is not suitable option as it does not work with platform drivers
with PCI based system. Also Synopsys controllers IP version register is not visible to xhci
driver and hence we don't have separate compatible string for each Synopsys version on the
xhci driver side. 
Due to which I depend on DT flag addition for the quirk. Can we add these DT flags and quirk?
 
 
Thanks & Regards,
 Tejas Joglekar
Tejas Joglekar July 21, 2020, 4:57 p.m. UTC | #5
Hello,

On 7/21/2020 3:17 PM, Felipe Balbi wrote:
> Tejas Joglekar <Tejas.Joglekar@synopsys.com> writes:

> 

>> Hi Rob,

>>

>> On 7/6/2020 12:13 PM, Felipe Balbi wrote:

>>>

>>> Hi,

>>>

>>> Tejas Joglekar <Tejas.Joglekar@synopsys.com> writes:

>>>>> @@ -95,6 +95,10 @@ int dwc3_host_init(struct dwc3 *dwc)

>>>>>  	if (dwc->usb2_lpm_disable)

>>>>>  		props[prop_idx++] = PROPERTY_ENTRY_BOOL("usb2-lpm-disable");

>>>>>  

>>>>> +	if (dwc->sgl_trb_cache_size_quirk)

>>>>> +		props[prop_idx++] =

>>>>> +			PROPERTY_ENTRY_BOOL("sgl-trb-cache-size-quirk");

>>>>> +

>>>>>  	/**

>>>>>  	 * WORKAROUND: dwc3 revisions <=3.00a have a limitation

>>>>>  	 * where Port Disable command doesn't work.

>>>>>

>>>> Does this implementation looks good to you? Rob has some concerned over the DT entries,

>>>> you suggested using compatible string with this quirk addition.

>>>> Can you please brief about how you would like to have this quirk implemented?

>>>> I can send the updated patch. My patch series is pending for merge just because of the

>>>> DT and quirk issue. Can you please help?

>>>

>>> Yeah, you need to get into an agreement with Rob :-) I don't mind having

>>> extra DT flags for things which can't be detected in runtime, Rob

>>> disagrees.

>>>

>> The compatible string is not suitable option as it does not work with platform drivers

>> with PCI based system. Also Synopsys controllers IP version register is not visible to xhci

>> driver and hence we don't have separate compatible string for each Synopsys version on the

>> xhci driver side. 

>> Due to which I depend on DT flag addition for the quirk. Can we add these DT flags and quirk?

> 

> As I said, I'm well aware of the situation regarding usage of compatible

> strings and the fact that dwc3 must work on PCI and non-PCI systems (I

> wrote the thing as it is after all). The person blocking new quirk flags

> is Rob, not me. You need to convince Rob that this is the way to go.

> 

@Felipe: Sorry for confusion if any, previous mail was intended for Rob asking about his approval.

> Rob, ball's in your court. Sorry.> 

@Rob: As I and Felipe have mentioned before, it is very much necessary to have quirk flags
for the current changes as compatible string would not be a solution for PCI and non-PCI
systems. Can you please approve this change ? If you have any concern about naming or any
other thing, please let us know.

Thanks & Regards,
 Tejas Joglekar
Jun Li July 23, 2020, 10:35 a.m. UTC | #6
Tejas Joglekar <Tejas.Joglekar@synopsys.com> 于2020年5月27日周三 下午7:54写道:
>

> The Synopsys xHC has an internal TRB cache of size TRB_CACHE_SIZE for

> each endpoint. The default value for TRB_CACHE_SIZE is 16 for SS and 8

> for HS. The controller loads and updates the TRB cache from the

> transfer ring in system memory whenever the driver issues a start

> transfer or update transfer command.

>

> For chained TRBs, the Synopsys xHC requires that the total amount of

> bytes for all TRBs loaded in the TRB cache be greater than or equal to

> 1 MPS. Or the chain ends within the TRB cache (with a last TRB).

>

> If this requirement is not met, the controller will not be able to

> send or receive a packet and it will hang causing a driver timeout and

> error.


Hi Tejas Joglekar

I am debugging  a similar issue on Synipsys XHC, it's not the same case
but I am wondering if it also linked to this HW limitation.

My Synopsys XHC based host enable UAS, when enumerates a UAS
HDD, one BULK-IN EP with stream enabled will not generate event for
trb(with stream ID 1) after a 16/4096 bytes(with stream ID 2) finished in
previous trb.

If I change the last OK urb/trb's buffer length from 4096 to 512, the issue
will gone.

following is the sequence of the question EP-IN:

<idle>-0     [000] d.h1   154.961710: xhci_urb_giveback: ep3in-bulk:
urb ffff0001775f6f00 pipe 3221324672 slot 1 length 36/36 sgs 1/1
stream 1 flags 00040200
<idle>-0     [000] d.h1   154.962023: xhci_urb_giveback: ep3in-bulk:
urb ffff000177d00400 pipe 3221324672 slot 1 length 96/96 sgs 1/1
stream 1 flags 00040200
<idle>-0     [000] d.h1   154.970395: xhci_urb_giveback: ep3in-bulk:
urb ffff000177d00400 pipe 3221324672 slot 1 length 11/255 sgs 1/1
stream 1 flags 00040200
<idle>-0     [000] d.h1   154.970562: xhci_urb_giveback: ep3in-bulk:
urb ffff000177d00400 pipe 3221324672 slot 1 length 20/255 sgs 1/1
stream 1 flags 00040200
<idle>-0     [000] d.h1   154.970786: xhci_urb_giveback: ep3in-bulk:
urb ffff000177d00400 pipe 3221324672 slot 1 length 60/255 sgs 1/1
stream 1 flags 00040200
<idle>-0     [000] d.h1   155.851600: xhci_urb_giveback: ep3in-bulk:
urb ffff000177d00200 pipe 3221324672 slot 1 length 16/4096 sgs 1/1
stream 2 flags 00040200

/* then the next ep3-in trb will not generate event and stopped, so
driver timeout in the end */
kworker/u8:2-349   [003] d..3   155.851987: xhci_urb_enqueue:
ep3in-bulk: urb ffff000170492400 pipe 3221324672 slot 1 length 0/32
sgs 1/1 stream 1 flags 00040200
kworker/u8:2-349   [003] d..4   155.851989: xhci_queue_trb: STREAM:
Buffer 00000000c19cf000 length 32 TD size 0 intr 0 type 'Normal' flags
b:i:I:c:s:I:e:c
kworker/u8:2-349   [003] d..4   155.851991: xhci_inc_enq: STREAM
ffff000177f86f80: enq 0x00000000be0eb060(0x00000000be0eb000) deq
0x00000000be0eb050(0x00000000be0eb000) segs 2 stream 1 free_trbs 508
bounce 1024 cycle 1

Do you have any ideas?

thanks
Li Jun
>

> This patch set adds logic to the XHCI driver to detect and prevent this

> from happening along with the quirk to enable this logic for Synopsys

> HAPS platform.

>

> Based on Mathias's feedback on previous implementation where consolidation

> was done in TRB cache, with this patch series the implementation is done

> during mapping of the URB by consolidating the SG list into a temporary

> buffer if the SG list buffer sizes within TRB_CACHE_SIZE is less than MPS.

>

> Changes since v2:

>  - Modified the xhci_unmap_temp_buffer function to unmap dma and clear

>    the dma flag

>  - Removed RFC tag

>

> Changes since v1:

>  - Comments from Greg are addressed on [PATCH 4/4] and [PATCH 1/4]

>  - Renamed the property and quirk as in other patches based on [PATCH 1/4]

>

> Tejas Joglekar (4):

>   dt-bindings: usb: Add documentation for SG trb cache size quirk

>   usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK

>   usb: dwc3: Add device property sgl-trb-cache-size-quirk

>   usb: xhci: Use temporary buffer to consolidate SG

>

>  Documentation/devicetree/bindings/usb/dwc3.txt     |   4 +

>  Documentation/devicetree/bindings/usb/usb-xhci.txt |   3 +

>  drivers/usb/dwc3/core.c                            |   2 +

>  drivers/usb/dwc3/core.h                            |   2 +

>  drivers/usb/dwc3/dwc3-haps.c                       |   1 +

>  drivers/usb/dwc3/host.c                            |   6 +-

>  drivers/usb/host/xhci-pci.c                        |   3 +

>  drivers/usb/host/xhci-plat.c                       |   4 +

>  drivers/usb/host/xhci-ring.c                       |   2 +-

>  drivers/usb/host/xhci.c                            | 135 +++++++++++++++++++++

>  drivers/usb/host/xhci.h                            |   5 +

>  11 files changed, 165 insertions(+), 2 deletions(-)

>

> --

> 2.11.0

>