musb + full speed device

Message ID 20140623144156.GA16963@saruman.home
State Accepted
Commit 2035772010db634ec8566b658fb1cd87ec47ac77
Headers show

Commit Message

Felipe Balbi June 23, 2014, 2:42 p.m.
Hi,

On Sun, Jun 22, 2014 at 10:39:07AM -0400, Alan Stern wrote:
> > What happens if you attach the full-speed device to a high-speed hub
> > and plug that hub into the MUSB?
> > 
> > 
> > That's a very interesting test i will definitely do. Unfortunately, even 
> > if that solves the problem, we could not use that on our production 
> > boards because we cannot change the design, we are very advanced and 
> > nearly approaching the production phase. We have no chance of changing 
> > any hardware, it should be a software solution.
> > 
> > I'm also very interested in any opinions about my assumptions because I 
> > really don't know if I'm misunderstanding something and the problem 
> > could be on any other place.
> > 
> > Basically that assumptions were that  we are sending too small packets 
> > (64B) which combines with the interrupt latencies and the degraded 
> > performance of MUSB as it has more work done in software than an EHCI 
> > interface. So for reaching the required 0.5MB/s by reducing the time 
> > spent on the OMAP micro (which i find difficult) or by increasing the 
> > packet size, having to use the Isochronous interface with a risk of 
> > packet losing.
> 
> I don't know much about the details of the MUSB driver.  Your 
> assumption seems reasonable, though.

I must say, it's been years since I last tested full speed with musb,
but one thing to keep in mind is that MUSB's internal DMA engine (the
inventra dma engine) is pretty stupid and you end up having to program
each, in your case, 64B packet instead of starting a block transfer.

That's because mode1 DMA (Mentor's parlance for a 'block transfer') is
very quirky.

I would suggest using the kernel's function profiler to figure out where
most of the time is spent and trying to optimize that part of the code.

One thing I must ask, are you processing your packets in during
giveback? You might want to try setting HCD_BH flag if that's the case.

I just noticed that we missed sending that patch upstream. Here it is
and I'll send it formally soon if it helps you out, with your Tested-by
tag.

From 8027f584b4405902b7ec890841aa6c7b8ebd6161 Mon Sep 17 00:00:00 2001
From: George Cherian <george.cherian@ti.com>
Date: Wed, 5 Mar 2014 14:01:43 +0530
Subject: [PATCH] usb: musb: musb_host: Enable HCD_BH flag to handle urb return
 in bottom half

Enable HCD_BH flag for musb host controller driver.
This improves the MSC/UVC through put. With this enabled
even 640x480@30fps webcam streaming is also supported.

Signed-off-by: George Cherian <george.cherian@ti.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
---
 drivers/usb/musb/musb_host.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Pedro Erencia June 25, 2014, 8:59 a.m. | #1
Hi,

Thanks for the suggestions Felipe.

We are not processing our packets in during giveback, but anyway I've
tried your patch and it results in a very little (~5%) increase in
transfer rate.

I think it is a latency issue, since running the application reports a
97% idle time on top.

The profiling of the kernel functions gives the following


 30238 total                                      0.0046
 22903 omap3_enter_idle                          77.3750
  1468 notifier_call_chain                       10.4857
  1447 schedule                                   1.3251
   943 vfp_notifier                               4.5337
   737 thumbee_notifier                          11.5156
   625 finish_task_switch                         3.1888
   445 tick_nohz_stop_sched_tick                  0.4214
   360 __switch_to                                4.5000
   250 atomic_notifier_call_chain                 7.8125
   228 nwfpe_notify                               8.1429
   207 proc_do_submiturb                          0.0860
   182 cpu_idle                                   1.1974
    39 process_one_work                           0.0385
    37 __copy_to_user_std                         0.0395
    32 rcu_sched_qs                               0.1739
    21 __do_softirq                               0.0665
    18 notify_die                                 0.2647
    18 __delay                                    1.5000
    15 __hrtimer_start_range_ns                   0.0260
    14 __atomic_notifier_call_chain               3.5000
    13 tick_nohz_restart_sched_tick               0.0254
    10 smsc911x_rx_readfifo                       0.0309
    10 cpuidle_idle_call                          0.0385
...


We are using inventra dma with the option "Use System DMA for Mentor
DMA workaround".
Felipe, i'm not sure if we are using mode 1 or 0 on the DMA. This is
part of the log of one transaction

[   20.805786] musb-hdrc musb-hdrc: RXCSR10 := 0620
[   20.805908] musb-hdrc musb-hdrc: ** IRQ host usb0008 tx0000 rx0400
[   20.805938] musb-hdrc musb-hdrc: <== Power=e0, DevCtl=5d, int_usb=0x8
[   20.805969] musb-hdrc musb-hdrc: <== hw 10 rxcsr 0003, urb actual 0 (+dma 11)
[   20.806030] musb-hdrc musb-hdrc: RX10 count 64, buffer 0x8b8ee000 len 0/3851
[   20.806060] musb-hdrc musb-hdrc: <== hw 10 rxcsr a000, urb actual 0 (+dma 64)
[   20.806091] musb-hdrc musb-hdrc: ep 10 dma reset, rxcsr 0020, rxcount 0
[   20.806152] musb-hdrc musb-hdrc: ** IRQ host usb0000 tx0000 rx0400
[   20.806213] musb-hdrc musb-hdrc: <== hw 10 rxcsr 0203, urb actual
64 (+dma 64)
[   20.806243] musb-hdrc musb-hdrc: RX10 count 64, buffer 0x8b8ee040 len 64/3851
[   20.806274] musb-hdrc musb-hdrc: <== hw 10 rxcsr a200, urb actual
64 (+dma 64)
[   20.806335] musb-hdrc musb-hdrc: ep 10 dma reset, rxcsr 0220, rxcount 0
[   20.806640] musb-hdrc musb-hdrc: ** IRQ host usb0008 tx0000 rx0400
[   20.806671] musb-hdrc musb-hdrc: <== Power=e0, DevCtl=5d, int_usb=0x8
[   20.806732] musb-hdrc musb-hdrc: <== hw 10 rxcsr 0003, urb actual
128 (+dma 64)
[   20.806762] musb-hdrc musb-hdrc: RX10 count 64, buffer 0x8b8ee080
len 128/3851
[   20.806793] musb-hdrc musb-hdrc: <== hw 10 rxcsr a000, urb actual
128 (+dma 64)
[   20.806854] musb-hdrc musb-hdrc: ep 10 dma reset, rxcsr 0020, rxcount 0



Are those "dma reset" indicating we are using mode 0 ? If that's the
case, how could we use mode 1 ?
Also, those rxcsr 0020 and rxcsr 0220, I understand are not actual
errors, but set by the driver to initiate the new IN packet, right ?

Thanks Alan and Felipe :)

2014-06-23 16:42 GMT+02:00 Felipe Balbi <balbi@ti.com>:
> Hi,
>
> On Sun, Jun 22, 2014 at 10:39:07AM -0400, Alan Stern wrote:
>> > What happens if you attach the full-speed device to a high-speed hub
>> > and plug that hub into the MUSB?
>> >
>> >
>> > That's a very interesting test i will definitely do. Unfortunately, even
>> > if that solves the problem, we could not use that on our production
>> > boards because we cannot change the design, we are very advanced and
>> > nearly approaching the production phase. We have no chance of changing
>> > any hardware, it should be a software solution.
>> >
>> > I'm also very interested in any opinions about my assumptions because I
>> > really don't know if I'm misunderstanding something and the problem
>> > could be on any other place.
>> >
>> > Basically that assumptions were that  we are sending too small packets
>> > (64B) which combines with the interrupt latencies and the degraded
>> > performance of MUSB as it has more work done in software than an EHCI
>> > interface. So for reaching the required 0.5MB/s by reducing the time
>> > spent on the OMAP micro (which i find difficult) or by increasing the
>> > packet size, having to use the Isochronous interface with a risk of
>> > packet losing.
>>
>> I don't know much about the details of the MUSB driver.  Your
>> assumption seems reasonable, though.
>
> I must say, it's been years since I last tested full speed with musb,
> but one thing to keep in mind is that MUSB's internal DMA engine (the
> inventra dma engine) is pretty stupid and you end up having to program
> each, in your case, 64B packet instead of starting a block transfer.
>
> That's because mode1 DMA (Mentor's parlance for a 'block transfer') is
> very quirky.
>
> I would suggest using the kernel's function profiler to figure out where
> most of the time is spent and trying to optimize that part of the code.
>
> One thing I must ask, are you processing your packets in during
> giveback? You might want to try setting HCD_BH flag if that's the case.
>
> I just noticed that we missed sending that patch upstream. Here it is
> and I'll send it formally soon if it helps you out, with your Tested-by
> tag.
>
> From 8027f584b4405902b7ec890841aa6c7b8ebd6161 Mon Sep 17 00:00:00 2001
> From: George Cherian <george.cherian@ti.com>
> Date: Wed, 5 Mar 2014 14:01:43 +0530
> Subject: [PATCH] usb: musb: musb_host: Enable HCD_BH flag to handle urb return
>  in bottom half
>
> Enable HCD_BH flag for musb host controller driver.
> This improves the MSC/UVC through put. With this enabled
> even 640x480@30fps webcam streaming is also supported.
>
> Signed-off-by: George Cherian <george.cherian@ti.com>
> Signed-off-by: Felipe Balbi <balbi@ti.com>
> ---
>  drivers/usb/musb/musb_host.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
> index eb06291..bc3b28c 100644
> --- a/drivers/usb/musb/musb_host.c
> +++ b/drivers/usb/musb/musb_host.c
> @@ -2612,7 +2612,7 @@ static const struct hc_driver musb_hc_driver = {
>         .description            = "musb-hcd",
>         .product_desc           = "MUSB HDRC host driver",
>         .hcd_priv_size          = sizeof(struct musb *),
> -       .flags                  = HCD_USB2 | HCD_MEMORY,
> +       .flags                  = HCD_USB2 | HCD_MEMORY | HCD_BH,
>
>         /* not using irq handler or reset hooks from usbcore, since
>          * those must be shared with peripheral code for OTG configs
> --
> 2.0.0.390.gcb682f8
>
> ps: which kernel are you using ?
>
> --
> balbi
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index eb06291..bc3b28c 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -2612,7 +2612,7 @@  static const struct hc_driver musb_hc_driver = {
 	.description		= "musb-hcd",
 	.product_desc		= "MUSB HDRC host driver",
 	.hcd_priv_size		= sizeof(struct musb *),
-	.flags			= HCD_USB2 | HCD_MEMORY,
+	.flags			= HCD_USB2 | HCD_MEMORY | HCD_BH,
 
 	/* not using irq handler or reset hooks from usbcore, since
 	 * those must be shared with peripheral code for OTG configs