MUSB/OMAP: gadget does not start when cable is plugged after the driver is started

Message ID 20140407160410.GC20228@saruman.home
State Accepted
Commit 8b2bc2c9351b4c09bc3d9096e2a7af3988565dbf
Headers show

Commit Message

Felipe Balbi April 7, 2014, 4:04 p.m.
On Mon, Apr 07, 2014 at 01:21:10PM +0200, Stefan Roese wrote:
> Hi Felipe,
> 
> On 04.04.2014 17:35, Stefan Roese wrote:
> >>> I'm currently seeing a problem on an OMAP3530 based board (Technexion
> >>> TAO3530). Where the MUSB is configured as device/peripheral and the 
> >>> ethernet
> >>> gadget is compiled into the kernel. This works without any problems 
> >>> and usb0
> >>> is available when the USB cable is connected to a PC upon startup. 
> >>> But when
> >>> the cable is disconnected when the driver is started (doesn't matter 
> >>> if the
> >>> gadget is included in the kernel or if the gadget driver module is 
> >>> loaded)
> >>> and the cable is plugged in later, the ethernet gadget does not start 
> >>> up.
> >>> This is 100% reproducible.
> >>>
> >>> Before digging deeper into this, I wanted to check here if this is a 
> >>> known
> >>> issue. And if anybody already has a solution/hint for it.
> >>>
> >>> FYI: I'm using v3.14 right now.
> >>
> >> Can you see if this helps ?
> >>
> >> commit e8fbe7b90021960907e885e0b7a9b52d378b0202
> >> Author: Felipe Balbi <balbi@ti.com>
> >> Date:   Fri Mar 28 14:31:47 2014 -0500
> >>
> >>      usb: musb: fix PHY power on/off
> >>
> >>      commi 30a70b0 (usb: musb: fix obex in g_nokia.ko
> >>      causing kernel panic) removed phy_power_on()
> >>      and phy_power_off() calls from runtime PM callbacks
> >>      but it failed to note that the driver depended
> >>      on pm_runtime_get_sync() calls to power up the PHY,
> >>      thus leaving some platforms without any means to
> >>      have a working PHY.
> >>
> >>      Fix that by enabling the phy during omap2430_musb_init()
> >>      and killing it in omap2430_musb_exit().
> >>
> >>      Fixes: 30a70b0 (usb: musb: fix obex in g_nokia.ko causing kernel 
> >> panic)
> >>      Cc: <stable@vger.kernel.org> # v3.14
> >>      Cc: Pali Rohár <pali.rohar@gmail.com>
> >>      Cc: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
> >>      Reported-by: Michael Scott <hashcode0f@gmail.com>
> >>      Tested-by: Michael Scott <hashcode0f@gmail.com>
> >>      Reported-by: Rabin Vincent <rabin@rab.in>
> >>      Signed-off-by: Felipe Balbi <balbi@ti.com>
> > 
> > Yes, this patch fixes this problem. Thanks a lot!
> > 
> > Tested-by: Stefan Roese <sr@denx.de>
> 
> I just noticed that I still seem to have a problem. With your patch
> above applied, I do get an Kernel Oops when the USB cable is not
> connected upon kernel start (gadget compiled into the kernel). This
> oops happens upon the first MUSB register access in
> omap2430_musb_set_vbus():
> 
> 	devctl = musb_readb(musb->mregs, MUSB_DEVCTL);
> 
> ...
> [    2.457061] musb-hdrc: ConfigData=0xde (UTMI-8, dyn FIFOs, bulk combine, bulk split, HB-ISO Rx, HB-ISO Tx, SoftConn)
> [    2.468170] musb-hdrc: MHDRC RTL version 1.400 
> [    2.473052] musb-hdrc: setup fifo_mode 4
> [    2.477172] musb-hdrc: 28/31 max ep, 16384/16384 memory
> [    2.482849] musb-hdrc musb-hdrc.0.auto: MUSB HDRC host driver
> [    2.489440] musb-hdrc musb-hdrc.0.auto: new USB bus registered, assigned bus number 1
> [    2.497985] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
> [    2.505157] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
> [    2.512786] usb usb1: Product: MUSB HDRC host driver
> [    2.517974] usb usb1: Manufacturer: Linux 3.14.0-00067-g17ec48c-dirty musb-hcd
> [    2.525573] usb usb1: SerialNumber: musb-hdrc.0.auto
> [    2.532196] hub 1-0:1.0: USB hub found
> [    2.536193] hub 1-0:1.0: 1 port detected
> [    2.571777] udc musb-hdrc.0.auto: registering UDC driver [g_ether]
> [    2.578369] using random self ethernet address
> [    2.583190] using random host ethernet address
> [    2.587860] g_ether gadget: adding config #1 'CDC Ethernet (ECM)'/c088f36c
> [    2.595123] g_ether gadget: adding 'cdc_ethernet'/c67a5a80 to config 'CDC Ethernet (ECM)'/c088f36c
> [    2.605773] usb0: HOST MAC be:25:6c:0b:9c:f6
> [    2.610412] usb0: MAC 8e:1b:c7:a5:7c:69
> [    2.614532] g_ether gadget: CDC Ethernet: dual speed IN/ep1in OUT/ep1out NOTIFY/ep2in
> [    2.622802] g_ether gadget: cfg 1/c088f36c speeds: high full
> [    2.628692] g_ether gadget:   interface 0 = cdc_ethernet/c67a5a80
> [    2.635437] g_ether gadget:   interface 1 = cdc_ethernet/c67a5a80
> [    2.641906] g_ether gadget: Ethernet Gadget, version: Memorial Day 2008
> [    2.648834] g_ether gadget: g_ether ready
> [    2.654602] mousedev: PS/2 mouse device common for all mice
> [    2.662597] Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa0ab060
> [    2.670623] Internal error: : 1028 [#1] PREEMPT ARM
> [    2.675720] Modules linked in:
> [    2.678924] CPU: 0 PID: 52 Comm: kworker/0:1 Tainted: G        W    3.14.0-00067-g17ec48c-dirty #36
> [    2.688385] Workqueue: events omap_musb_mailbox_work
> [    2.693603] task: c655e000 ti: c6576000 task.ti: c6576000
> [    2.699249] PC is at omap2430_musb_set_vbus+0x30/0x158
> [    2.704620] LR is at omap2430_musb_set_vbus+0x20/0x158
> [    2.709960] pc : [<c0402544>]    lr : [<c0402534>]    psr: 20000113
> [    2.709960] sp : c6577ec8  ip : 00000000  fp : c08a2200
> [    2.721984] r10: c67959d0  r9 : ffff8bda  r8 : 00000064
> [    2.727447] r7 : c085d030  r6 : c6589810  r5 : c6764010  r4 : 00000000
> [    2.734252] r3 : fa0ab000  r2 : 333333fd  r1 : 00000000  r0 : 00000064
> [    2.741088] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
> [    2.748718] Control: 10c5387d  Table: 80004019  DAC: 00000017
> [    2.754730] Process kworker/0:1 (pid: 52, stack limit = 0xc6576240)
> [    2.761260] Stack: (0xc6577ec8 to 0xc6578000)
> [    2.765838] 7ec0:                   c66cac00 c00efa58 c00503e8 c6764010 c65f1490 c6589810
> [    2.774383] 7ee0: c65f1410 c6576010 00000000 00000000 c08a2200 c04027fc c04028c0 c65f149c
> [    2.782928] 7f00: c656ea80 c085d624 c7dcc300 c004f3cc c005f764 00000001 c64c5eb8 00000000
> [    2.791473] 7f20: c64c5eac c656ea80 c085d624 c085d634 c6576008 c656ea98 c08a1ea4 00000009
> [    2.800018] 7f40: c085d624 c0050324 c655e000 c656ff00 00000000 c656ea80 c0050210 00000000
> [    2.808593] 7f60: 00000000 00000000 00000000 c0055c24 00000000 00000000 c0863c10 c656ea80
> [    2.817138] 7f80: 00000000 c6577f84 c6577f84 00000000 c6577f90 c6577f90 c6577fac c656ff00
> [    2.825683] 7fa0: c0055b60 00000000 00000000 c000e4b8 00000000 00000000 00000000 00000000
> [    2.834228] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [    2.842773] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> [    2.851348] [<c0402544>] (omap2430_musb_set_vbus) from [<c04027fc>] (omap_musb_set_mailbox+0xcc/0x190)
> [    2.861083] [<c04027fc>] (omap_musb_set_mailbox) from [<c004f3cc>] (process_one_work+0x13c/0x458)
> [    2.870391] [<c004f3cc>] (process_one_work) from [<c0050324>] (worker_thread+0x114/0x400)
> [    2.878936] [<c0050324>] (worker_thread) from [<c0055c24>] (kthread+0xc4/0xe0)
> [    2.886505] [<c0055c24>] (kthread) from [<c000e4b8>] (ret_from_fork+0x14/0x3c)
> [    2.894042] Code: e59f7120 e5953274 e5979000 e1a08000 (e5d3b060) 
> [    2.900421] ---[ end trace e1fcc80bd44c7d71 ]---
> [    2.905273] In-band Error seen by MPU  at address 0
> [    2.910369] ------------[ cut here ]------------
> ...
> 
> 
> I didn't notice it at first since I had some additional debug
> printk's built into the kernel. And it seems that with the addition
> of this printk the Oops does not occur.
> 
> diff --git a/drivers/usb/musb/musb_gadget.c b/drivers/usb/musb/musb_gadget.c
> index d4aa779..d58bc61 100644
> --- a/drivers/usb/musb/musb_gadget.c
> +++ b/drivers/usb/musb/musb_gadget.c
> @@ -1844,6 +1844,7 @@ static int musb_gadget_start(struct usb_gadget *g,
>         unsigned long           flags;
>         int                     retval = 0;
>  
> +       printk("xxxxxxxxxxx %s (%d)\n", __func__, __LINE__);
>         if (driver->max_speed < USB_SPEED_HIGH) {
>                 retval = -EINVAL;
>                 goto err;
> 
> 
> I wanted to report this before this patch hits mainline so it might
> get updated/fixed. Any ideas?

that's not caused by my patch, it's a previously existing bug. This
should sort it out:

commit e7f69404a878b5345ad07bf06d78559ecd31db79
Author: Felipe Balbi <balbi@ti.com>
Date:   Mon Apr 7 10:58:01 2014 -0500

    usb: musb: omap2430: make sure clocks are enabled when running mailbox
    
    on early initialization we could fall into
    a situation where the mailbox is called before
    MUSB's clocks are running, in order to avoid
    that, make sure mailbox is always wrapped with
    pm_runtime calls.
    
    Reported-by: Stefan Roese <sr@denx.de>
    Signed-off-by: Felipe Balbi <balbi@ti.com>


let me know

Comments

Stefan Roese April 8, 2014, 7:24 a.m. | #1
On 07.04.2014 18:04, Felipe Balbi wrote:

<snip>

> that's not caused by my patch, it's a previously existing bug. This
> should sort it out:
> 
> commit e7f69404a878b5345ad07bf06d78559ecd31db79
> Author: Felipe Balbi <balbi@ti.com>
> Date:   Mon Apr 7 10:58:01 2014 -0500
> 
>      usb: musb: omap2430: make sure clocks are enabled when running mailbox
>      
>      on early initialization we could fall into
>      a situation where the mailbox is called before
>      MUSB's clocks are running, in order to avoid
>      that, make sure mailbox is always wrapped with
>      pm_runtime calls.
>      
>      Reported-by: Stefan Roese <sr@denx.de>
>      Signed-off-by: Felipe Balbi <balbi@ti.com>
> 
> diff --git a/drivers/usb/musb/omap2430.c b/drivers/usb/musb/omap2430.c
> index 819a7cd..d369bf1 100644
> --- a/drivers/usb/musb/omap2430.c
> +++ b/drivers/usb/musb/omap2430.c
> @@ -316,7 +316,13 @@ static void omap_musb_mailbox_work(struct work_struct *mailbox_work)
>   {
>   	struct omap2430_glue *glue = container_of(mailbox_work,
>   				struct omap2430_glue, omap_musb_mailbox_work);
> +	struct musb *musb = glue_to_musb(glue);
> +	struct device *dev = musb->controller;
> +
> +	pm_runtime_get_sync(dev);
>   	omap_musb_set_mailbox(glue);
> +	pm_runtime_mark_last_busy(dev);
> +	pm_runtime_put_autosuspend(dev);
>   }
>   
>   static irqreturn_t omap2430_musb_interrupt(int irq, void *__hci)
> 
> let me know

The crash doesn't appear anymore with this patch. But the ethernet
gadget is not started. This happens also with a plugged cable upon
driver startup. Now I see an error in the log:

[    2.793121] musb-hdrc musb-hdrc.0.auto: VBUS_ERROR in b_idle (80, <SessEnd), retry #0, port1 00000100


Thanks,
Stefan

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Felipe Balbi April 8, 2014, 2:31 p.m. | #2
Hi,

On Tue, Apr 08, 2014 at 09:24:09AM +0200, Stefan Roese wrote:
> On 07.04.2014 18:04, Felipe Balbi wrote:
> 
> <snip>
> 
> > that's not caused by my patch, it's a previously existing bug. This
> > should sort it out:
> > 
> > commit e7f69404a878b5345ad07bf06d78559ecd31db79
> > Author: Felipe Balbi <balbi@ti.com>
> > Date:   Mon Apr 7 10:58:01 2014 -0500
> > 
> >      usb: musb: omap2430: make sure clocks are enabled when running mailbox
> >      
> >      on early initialization we could fall into
> >      a situation where the mailbox is called before
> >      MUSB's clocks are running, in order to avoid
> >      that, make sure mailbox is always wrapped with
> >      pm_runtime calls.
> >      
> >      Reported-by: Stefan Roese <sr@denx.de>
> >      Signed-off-by: Felipe Balbi <balbi@ti.com>
> > 
> > diff --git a/drivers/usb/musb/omap2430.c b/drivers/usb/musb/omap2430.c
> > index 819a7cd..d369bf1 100644
> > --- a/drivers/usb/musb/omap2430.c
> > +++ b/drivers/usb/musb/omap2430.c
> > @@ -316,7 +316,13 @@ static void omap_musb_mailbox_work(struct work_struct *mailbox_work)
> >   {
> >   	struct omap2430_glue *glue = container_of(mailbox_work,
> >   				struct omap2430_glue, omap_musb_mailbox_work);
> > +	struct musb *musb = glue_to_musb(glue);
> > +	struct device *dev = musb->controller;
> > +
> > +	pm_runtime_get_sync(dev);
> >   	omap_musb_set_mailbox(glue);
> > +	pm_runtime_mark_last_busy(dev);
> > +	pm_runtime_put_autosuspend(dev);
> >   }
> >   
> >   static irqreturn_t omap2430_musb_interrupt(int irq, void *__hci)
> > 
> > let me know
> 
> The crash doesn't appear anymore with this patch. But the ethernet
> gadget is not started. This happens also with a plugged cable upon
> driver startup. Now I see an error in the log:
> 
> [    2.793121] musb-hdrc musb-hdrc.0.auto: VBUS_ERROR in b_idle (80, <SessEnd), retry #0, port1 00000100

that shouldn't happen... Can you add some extra debugging prints and try
to figure out what's going on ?
Stefan Roese April 10, 2014, 1:41 p.m. | #3
Hi Felipe,

On 08.04.2014 16:31, Felipe Balbi wrote:
>>> diff --git a/drivers/usb/musb/omap2430.c b/drivers/usb/musb/omap2430.c
>>> index 819a7cd..d369bf1 100644
>>> --- a/drivers/usb/musb/omap2430.c
>>> +++ b/drivers/usb/musb/omap2430.c
>>> @@ -316,7 +316,13 @@ static void omap_musb_mailbox_work(struct work_struct *mailbox_work)
>>>    {
>>>    	struct omap2430_glue *glue = container_of(mailbox_work,
>>>    				struct omap2430_glue, omap_musb_mailbox_work);
>>> +	struct musb *musb = glue_to_musb(glue);
>>> +	struct device *dev = musb->controller;
>>> +
>>> +	pm_runtime_get_sync(dev);
>>>    	omap_musb_set_mailbox(glue);
>>> +	pm_runtime_mark_last_busy(dev);
>>> +	pm_runtime_put_autosuspend(dev);
>>>    }
>>>    
>>>    static irqreturn_t omap2430_musb_interrupt(int irq, void *__hci)
>>>
>>> let me know
>>
>> The crash doesn't appear anymore with this patch. But the ethernet
>> gadget is not started. This happens also with a plugged cable upon
>> driver startup. Now I see an error in the log:
>>
>> [    2.793121] musb-hdrc musb-hdrc.0.auto: VBUS_ERROR in b_idle (80, <SessEnd), retry #0, port1 00000100
> 
> that shouldn't happen... Can you add some extra debugging prints and try
> to figure out what's going on ?

Unfortunately I can't reproduce this error. And the crash is also not
happening right now. Seems to be very timing sensitive.

Still with both patches applied, the ethernet gadget is only started
correctly (with cable connected after driver loading) when I add some
printk's. This one helps for example:

static void musb_do_idle(unsigned long _musb)
{
	struct musb	*musb = (void *)_musb;
	unsigned long	flags;
	u8	power;
	u8	devctl;

#if 1 // test-only: this helps as well with gadget to connect
	printk("*** %s (%d): state=%d\n", __func__, __LINE__, musb->xceiv->state); // test-only
#endif

Without this printk the gadget is not started upon cable insertion.
No MUSB interrupts occur when the USB cable is connected.

There seems to be some timing dependency which I fail to solve
quickly. The original locations (removed with [1]) for
phy_power_on() and phy_power_off() don't suffer these timing /
printk dependencies.

Any ideas?

Thanks,
Stefan

[1] 30a70b02 (usb: musb: fix obex in g_nokia.ko causing kernel panic)

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Felipe Balbi April 30, 2014, 4:48 p.m. | #4
On Thu, Apr 10, 2014 at 03:41:16PM +0200, Stefan Roese wrote:
> Hi Felipe,
> 
> On 08.04.2014 16:31, Felipe Balbi wrote:
> >>> diff --git a/drivers/usb/musb/omap2430.c b/drivers/usb/musb/omap2430.c
> >>> index 819a7cd..d369bf1 100644
> >>> --- a/drivers/usb/musb/omap2430.c
> >>> +++ b/drivers/usb/musb/omap2430.c
> >>> @@ -316,7 +316,13 @@ static void omap_musb_mailbox_work(struct work_struct *mailbox_work)
> >>>    {
> >>>    	struct omap2430_glue *glue = container_of(mailbox_work,
> >>>    				struct omap2430_glue, omap_musb_mailbox_work);
> >>> +	struct musb *musb = glue_to_musb(glue);
> >>> +	struct device *dev = musb->controller;
> >>> +
> >>> +	pm_runtime_get_sync(dev);
> >>>    	omap_musb_set_mailbox(glue);
> >>> +	pm_runtime_mark_last_busy(dev);
> >>> +	pm_runtime_put_autosuspend(dev);
> >>>    }
> >>>    
> >>>    static irqreturn_t omap2430_musb_interrupt(int irq, void *__hci)
> >>>
> >>> let me know
> >>
> >> The crash doesn't appear anymore with this patch. But the ethernet
> >> gadget is not started. This happens also with a plugged cable upon
> >> driver startup. Now I see an error in the log:
> >>
> >> [    2.793121] musb-hdrc musb-hdrc.0.auto: VBUS_ERROR in b_idle (80, <SessEnd), retry #0, port1 00000100
> > 
> > that shouldn't happen... Can you add some extra debugging prints and try
> > to figure out what's going on ?
> 
> Unfortunately I can't reproduce this error. And the crash is also not
> happening right now. Seems to be very timing sensitive.
> 
> Still with both patches applied, the ethernet gadget is only started
> correctly (with cable connected after driver loading) when I add some
> printk's. This one helps for example:
> 
> static void musb_do_idle(unsigned long _musb)
> {
> 	struct musb	*musb = (void *)_musb;
> 	unsigned long	flags;
> 	u8	power;
> 	u8	devctl;
> 
> #if 1 // test-only: this helps as well with gadget to connect
> 	printk("*** %s (%d): state=%d\n", __func__, __LINE__, musb->xceiv->state); // test-only
> #endif
> 
> Without this printk the gadget is not started upon cable insertion.
> No MUSB interrupts occur when the USB cable is connected.
> 
> There seems to be some timing dependency which I fail to solve
> quickly. The original locations (removed with [1]) for
> phy_power_on() and phy_power_off() don't suffer these timing /
> printk dependencies.
> 
> Any ideas?

no ideas, I'm guessing there's a race somewhere, but no time to deal
with that right now. Patches are welcome though.

Patch

diff --git a/drivers/usb/musb/omap2430.c b/drivers/usb/musb/omap2430.c
index 819a7cd..d369bf1 100644
--- a/drivers/usb/musb/omap2430.c
+++ b/drivers/usb/musb/omap2430.c
@@ -316,7 +316,13 @@  static void omap_musb_mailbox_work(struct work_struct *mailbox_work)
 {
 	struct omap2430_glue *glue = container_of(mailbox_work,
 				struct omap2430_glue, omap_musb_mailbox_work);
+	struct musb *musb = glue_to_musb(glue);
+	struct device *dev = musb->controller;
+
+	pm_runtime_get_sync(dev);
 	omap_musb_set_mailbox(glue);
+	pm_runtime_mark_last_busy(dev);
+	pm_runtime_put_autosuspend(dev);
 }
 
 static irqreturn_t omap2430_musb_interrupt(int irq, void *__hci)