diff mbox series

[v2] usb: chipidea: local_irq_save/restore added for hw_ep_prime

Message ID 20210817095313.GA671484@ubuntu
State New
Headers show
Series [v2] usb: chipidea: local_irq_save/restore added for hw_ep_prime | expand

Commit Message

Jeaho Hwang Aug. 17, 2021, 9:53 a.m. UTC
hw_ep_prime sometimes fails if irq occurs while it rus on RT kernel.
local_irq_save/restore is added inside the function to gurantee atomicity.
only effective for preempt_rt since hw_ep_prime is called inside top half
or spin_lock_irqsave. No effect is expected for standard linux.

Signed-off-by: Jeaho Hwang <jhhwang@rtst.co.kr>

Comments

Sebastian Andrzej Siewior Aug. 18, 2021, 4:17 p.m. UTC | #1
On 2021-08-17 18:53:13 [+0900], Jeaho Hwang wrote:
> hw_ep_prime sometimes fails if irq occurs while it rus on RT kernel.

How/ why does it fail? Which IRQ occurs? Does it also occur without RT
and with threadirqs enabled?

> local_irq_save/restore is added inside the function to gurantee atomicity.
> only effective for preempt_rt since hw_ep_prime is called inside top half
> or spin_lock_irqsave. No effect is expected for standard linux.

How is that helping?
#1 
  udc_irq() -> isr_tr_complete_handler() -> isr_tr_complete_low ->
   _hardware_dequeue() -> reprime_dtd() -> hw_ep_prime()

udc_irq() acquires ci->lock.

#2 
  ep_queue -> _ep_queue() ->_hardware_enqueue() -> hw_ep_prime()

ep_queue acquires hwep->lock. Which is actually ci->lock.

So if I read this right then hw_ep_prime() may not be interrupted in the
middle of its operation (but preempted) because each path is protected
by the lock.

isr_tr_complete_low() drops hwep->lock and acquires it again so it that
phase another thread may acquire it.

> Signed-off-by: Jeaho Hwang <jhhwang@rtst.co.kr>
> 
> diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
> index 8834ca613721..a624eddb3e22 100644
> --- a/drivers/usb/chipidea/udc.c
> +++ b/drivers/usb/chipidea/udc.c
> @@ -191,22 +191,31 @@ static int hw_ep_get_halt(struct ci_hdrc *ci, int num, int dir)
>  static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
>  {
>  	int n = hw_ep_bit(num, dir);
> +	unsigned long flags;
> +	int ret = 0;
>  
>  	/* Synchronize before ep prime */
>  	wmb();
>  
> -	if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num)))
> +	/* irq affects this routine so irq should be disabled on RT.
> +	 * on standard kernel, irq is already disabled by callers.

The important part is _how_ it is affected. If locking works then
nothing should read/ write the HW register. If the lock is briefly
dropped then another thread _may_ read/ write the registers but not
within this function.

If this function here is sensitive to timing (say the cpu_relax() loop
gets interrupt for 1ms) then it has to be documented as such.

> +	 */
> +	local_irq_save(flags);
> +	if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num))) {
> +		local_irq_restore(flags);
>  		return -EAGAIN;
> +	}
>  
>  	hw_write(ci, OP_ENDPTPRIME, ~0, BIT(n));
>  
>  	while (hw_read(ci, OP_ENDPTPRIME, BIT(n)))
>  		cpu_relax();
>  	if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num)))
> -		return -EAGAIN;
> +		ret = -EAGAIN;
>  
> +	local_irq_restore(flags);
>  	/* status shoult be tested according with manual but it doesn't work */
> -	return 0;
> +	return ret;
>  }
>  
>  /**

Sebastian
Jeaho Hwang Aug. 18, 2021, 11:50 p.m. UTC | #2
2021년 8월 19일 (목) 오전 1:17, Sebastian Andrzej Siewior
<bigeasy@linutronix.de>님이 작성:
>

> On 2021-08-17 18:53:13 [+0900], Jeaho Hwang wrote:

> > hw_ep_prime sometimes fails if irq occurs while it rus on RT kernel.

>

> How/ why does it fail? Which IRQ occurs? Does it also occur without RT

> and with threadirqs enabled?

>


We experienced priming failure while trying cable connection to a
Windows RNDIS host.
And we found that a twd interrupt occurs during execution of
hw_ep_prime for failure cases.
According to the manual, checking ENDPTSETUPSTAT before/after priming
should be done immediately since the host could send a setup packet
which makes the priming invalidated. So we think that the twd (or any)
interrupt could make a timing issue between udc_irq and the RNDIS
host.
Without RT, udc_irq runs as a forced threaded irq handler, so it runs
without any interruption or preemption. NO similar case is found on
non-RT.

> > local_irq_save/restore is added inside the function to gurantee atomicity.

> > only effective for preempt_rt since hw_ep_prime is called inside top half

> > or spin_lock_irqsave. No effect is expected for standard linux.

>

> How is that helping?

> #1

>   udc_irq() -> isr_tr_complete_handler() -> isr_tr_complete_low ->

>    _hardware_dequeue() -> reprime_dtd() -> hw_ep_prime()

>

> udc_irq() acquires ci->lock.

>

> #2

>   ep_queue -> _ep_queue() ->_hardware_enqueue() -> hw_ep_prime()

>

> ep_queue acquires hwep->lock. Which is actually ci->lock.

>

> So if I read this right then hw_ep_prime() may not be interrupted in the

> middle of its operation (but preempted) because each path is protected

> by the lock.

>

> isr_tr_complete_low() drops hwep->lock and acquires it again so it that

> phase another thread may acquire it.

>

> > Signed-off-by: Jeaho Hwang <jhhwang@rtst.co.kr>

> >

> > diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c

> > index 8834ca613721..a624eddb3e22 100644

> > --- a/drivers/usb/chipidea/udc.c

> > +++ b/drivers/usb/chipidea/udc.c

> > @@ -191,22 +191,31 @@ static int hw_ep_get_halt(struct ci_hdrc *ci, int num, int dir)

> >  static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)

> >  {

> >       int n = hw_ep_bit(num, dir);

> > +     unsigned long flags;

> > +     int ret = 0;

> >

> >       /* Synchronize before ep prime */

> >       wmb();

> >

> > -     if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num)))

> > +     /* irq affects this routine so irq should be disabled on RT.

> > +      * on standard kernel, irq is already disabled by callers.

>

> The important part is _how_ it is affected. If locking works then

> nothing should read/ write the HW register. If the lock is briefly

> dropped then another thread _may_ read/ write the registers but not

> within this function.

>

> If this function here is sensitive to timing (say the cpu_relax() loop

> gets interrupt for 1ms) then it has to be documented as such.


The controller sets ENDPTSETUPSTAT register if the host sent a setup packet.
yes it is a timing problem. I will document that and resubmit again if
you agree that local_irq_save could help from the timing problem.

Thanks for the advice.

>

> > +      */

> > +     local_irq_save(flags);

> > +     if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num))) {

> > +             local_irq_restore(flags);

> >               return -EAGAIN;

> > +     }

> >

> >       hw_write(ci, OP_ENDPTPRIME, ~0, BIT(n));

> >

> >       while (hw_read(ci, OP_ENDPTPRIME, BIT(n)))

> >               cpu_relax();

> >       if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num)))

> > -             return -EAGAIN;

> > +             ret = -EAGAIN;

> >

> > +     local_irq_restore(flags);

> >       /* status shoult be tested according with manual but it doesn't work */

> > -     return 0;

> > +     return ret;

> >  }

> >

> >  /**

>

> Sebastian




-- 
황재호, Jay Hwang, linux team manager of RTst
010-7242-1593
Jeaho Hwang Aug. 20, 2021, 5:15 a.m. UTC | #3
2021년 8월 19일 (목) 오후 5:48, Sebastian Andrzej Siewior
<bigeasy@linutronix.de>님이 작성:
>

> On 2021-08-19 08:50:27 [+0900], Jeaho Hwang wrote:

> > Without RT, udc_irq runs as a forced threaded irq handler, so it runs

> > without any interruption or preemption. NO similar case is found on

> > non-RT.

>

> I see only a devm_request_irq() so no force-threading here. Booting with

> threadirqs would not lead to the problem since commit

>    81e2073c175b8 ("genirq: Disable interrupts for force threaded handlers")

>


I was wrong. udc threaded irq handler allows twd interrupt even on
non-RT and with threaded irq.
I believed Chen's comment "The function hw_ep_prime is only called at
udc_irq which is registered as top-half irq handlers. Why the timer
interrupt is occurred when hw_ep_prime is executing?".
We have additional experiments and got the results like below. RNDIS
host was Windows.

RT, 1ms delay between first ENDPTSETUPSTAT read and priming : error
case occurred
RT, 1ms delay + irq_save : no error case occurred.
non-RT, threaded irq, 1ms delay : no error case occurred even twd
fires inside the function execution.

It doesn't seem to be a timing issue. But irq definitely affects
priming on the RT kernel. Do you RT experts have any idea about the
causes?
If isr_tr_complete_handler fails ep priming it calls _ep_set_halt and
goes an infinite loop in hw_ep_set_halt. It was an actual problem we
experienced.
So we protect irqs inside hw_ep_priming not to make error cases and
also add a timeout inside the hw_ep_set_halt loop for a walkaround.
The timeout patch is submitted to linux-usb.
( https://marc.info/?l=linux-usb&m=162918269024007&w=2 )

We withdrew this patch since we don't know if disabling irq is the
best solution to solve the problem and udc would work fine with
hw_ep_set_halt walkaround even though hw_ep_prime fails.
But we are still trying to find out the cause of this symptom so We'd
so appreciate it if RT or USB experts share some ideas or ways to
report somewhere. Xilinx doesn't provide any support without their
official kernel :(

Thanks for the discussion Sebastian.

Jeaho Hwang.

> …

> > > If this function here is sensitive to timing (say the cpu_relax() loop

> > > gets interrupt for 1ms) then it has to be documented as such.

> >

> > The controller sets ENDPTSETUPSTAT register if the host sent a setup packet.

> > yes it is a timing problem. I will document that and resubmit again if

> > you agree that local_irq_save could help from the timing problem.

> >

> > Thanks for the advice.

>

> If it is really a timing issue in the function as you describe below

> then disabling interrupts would help and it is indeed an RT only issue.

>

> So you read OP_ENDPTSETUPSTAT, it is 0, all good.

> You write OP_ENDPTPRIME, wait for it to be cleared.

> Then you read OP_ENDPTSETUPSTAT again and if it is 0, all good.

>

> And the TWD interrupt could delay say the second read would read 1 and

> it is invalidated. Which looks odd.

> However, it is "okay" if the TWD interrupt happens after the second

> read? Even if the host sends a setup packet, nothing breaks?

> Do you have numbers on how long irq-off section is here? It seems to

> depend on how long the HW needs to clear the OP_ENDPTPRIME bits.

>

> Sebastian




-- 
황재호, Jay Hwang, linux team manager of RTst
010-7242-1593
Sebastian Andrzej Siewior Aug. 23, 2021, 4:14 p.m. UTC | #4
On 2021-08-21 16:04:01 [+0900], Jeaho Hwang wrote:
> 2021년 8월 21일 (토) 오후 2:05, Peter Chen <peter.chen@kernel.org>님이 작성:
> >
> HI Peter.
> 
> We configured the kernel as "low latency desktop" and added
> "threadirqs" inside the cmdline parameter.
> Then udc irq handler runs as a thread and shows no suspicious working.
> I Hope It will help.

May I again point out that since commit
   81e2073c175b8 ("genirq: Disable interrupts for force threaded handlers")                                                                                         

the threaded handler runs with disabled interrupts. This time more
verbose. It is available since v5.12-rc4 but it has been also applied
for stable.

> Thanks.

Sebastian
Sebastian Andrzej Siewior Aug. 23, 2021, 4:51 p.m. UTC | #5
On 2021-08-20 14:15:55 [+0900], Jeaho Hwang wrote:
> So we protect irqs inside hw_ep_priming not to make error cases and

> also add a timeout inside the hw_ep_set_halt loop for a walkaround.

> The timeout patch is submitted to linux-usb.

> ( https://marc.info/?l=linux-usb&m=162918269024007&w=2 )

>

> We withdrew this patch since we don't know if disabling irq is the

> best solution to solve the problem and udc would work fine with

> hw_ep_set_halt walkaround even though hw_ep_prime fails.

> But we are still trying to find out the cause of this symptom so We'd

> so appreciate it if RT or USB experts share some ideas or ways to

> report somewhere. Xilinx doesn't provide any support without their

> official kernel :(


The UDC driver sometimes drops the lock and acquires it again. On UP it
would not matter but on SMP/RT the other may acquire lock and do
something with the HW. One thing that you could check if there is any HW
access in between which would affect the behaviour.

> Thanks for the discussion Sebastian.

> 

> Jeaho Hwang.


Sebastian
diff mbox series

Patch

diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
index 8834ca613721..a624eddb3e22 100644
--- a/drivers/usb/chipidea/udc.c
+++ b/drivers/usb/chipidea/udc.c
@@ -191,22 +191,31 @@  static int hw_ep_get_halt(struct ci_hdrc *ci, int num, int dir)
 static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
 {
 	int n = hw_ep_bit(num, dir);
+	unsigned long flags;
+	int ret = 0;
 
 	/* Synchronize before ep prime */
 	wmb();
 
-	if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num)))
+	/* irq affects this routine so irq should be disabled on RT.
+	 * on standard kernel, irq is already disabled by callers.
+	 */
+	local_irq_save(flags);
+	if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num))) {
+		local_irq_restore(flags);
 		return -EAGAIN;
+	}
 
 	hw_write(ci, OP_ENDPTPRIME, ~0, BIT(n));
 
 	while (hw_read(ci, OP_ENDPTPRIME, BIT(n)))
 		cpu_relax();
 	if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num)))
-		return -EAGAIN;
+		ret = -EAGAIN;
 
+	local_irq_restore(flags);
 	/* status shoult be tested according with manual but it doesn't work */
-	return 0;
+	return ret;
 }
 
 /**