diff mbox series

[v2] usb: chipidea: add loop timeout for hw_ep_set_halt()

Message ID 20210817064353.GA669425@ubuntu
State New
Series [v2] usb: chipidea: add loop timeout for hw_ep_set_halt()

Commit Message

Jeaho Hwang Aug. 17, 2021, 6:43 a.m. UTC
If priming the control endpoint fails (a very rare case on standard Linux),
hw_ep_set_halt() loops forever. Limiting the loop to 100 iterations was
enough on zynq7000.

Signed-off-by: Jeaho Hwang <jhhwang@rtst.co.kr>

Comments

Jeaho Hwang Aug. 24, 2021, 8:31 a.m. UTC | #1
On Tue, Aug 17, 2021 at 3:44 PM, Jeaho Hwang <jhhwang@rtst.co.kr> wrote:
>
> If ctrl EP priming is failed (very rare case in standard linux),
> hw_ep_set_halt goes infinite loop. waiting 100 times was enough
> for zynq7000.
>

Hi Peter.
I found in the zynq7000 TRM that the hardware clears the Stall bit when a
setup packet is received on a control endpoint.
I think hw_ep_set_halt loops forever because:

1. hw_ep_prime for the control EP, called from
isr_tr_complete_handler -> isr_setup_status_phase, fails because a
setup packet has been received.
2. isr_tr_complete_handler then calls _ep_set_halt if either
isr_tr_complete_low or isr_setup_status_phase returns an error.
3. Since the control EP got a setup packet, the HW clears the TXS bit as
soon as the driver sets it inside hw_ep_set_halt, so the loop never ends.

Does that make sense? If it is right, we shouldn't call _ep_set_halt when
the err is -EAGAIN, which can be returned ONLY because of the setup
packet issue described above.
In that case the loop timeout is not required anymore.
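
A minimal sketch of the caller-side change proposed above, assuming (as this
message argues, and the thread has not yet confirmed) that -EAGAIN is returned
only when a setup packet pre-empts the prime. The names should_halt_ep0,
fake_udc, fake_halt_ep0 and fake_tr_complete are hypothetical stand-ins for
illustration, not chipidea driver functions:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the controller state; only counts halt requests. */
struct fake_udc {
	int halt_calls;
};

/* Hypothetical stand-in for _ep_set_halt() on the control endpoint. */
static void fake_halt_ep0(struct fake_udc *udc)
{
	udc->halt_calls++;
}

/*
 * Decide whether the control endpoint should be stalled for a given error.
 * Assumption: -EAGAIN means "a new setup packet arrived", in which case the
 * hardware would clear the stall bit again, so stalling is skipped.
 */
static bool should_halt_ep0(int err)
{
	return err != 0 && err != -EAGAIN;
}

/* Hypothetical model of the error path at the end of transfer completion. */
static void fake_tr_complete(struct fake_udc *udc, int err)
{
	if (should_halt_ep0(err))
		fake_halt_ep0(udc);
}
```

With this shape, an -EAGAIN from the status phase is left for the next setup
interrupt to handle, while any other error still stalls the endpoint.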

Can I ask your opinion on this, Peter and USB experts?

Thanks.

> Signed-off-by: Jeaho Hwang <jhhwang@rtst.co.kr>
>
> diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
> index 8834ca613721..d73fadb18f32 100644
> --- a/drivers/usb/chipidea/udc.c
> +++ b/drivers/usb/chipidea/udc.c
> @@ -209,6 +209,9 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
>         return 0;
>  }
>
> +/* enough for zynq7000 evaluation board */
> +#define HW_EP_SET_HALT_COUNT_MAX 100
> +
>  /**
>   * hw_ep_set_halt: configures ep halt & resets data toggle after clear (execute
>   *                 without interruption)
> @@ -221,6 +224,7 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
>   */
>  static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
>  {
> +       int count = HW_EP_SET_HALT_COUNT_MAX;
>         if (value != 0 && value != 1)
>                 return -EINVAL;
>
> @@ -232,9 +236,9 @@ static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
>                 /* data toggle - reserved for EP0 but it's in ESS */
>                 hw_write(ci, reg, mask_xs|mask_xr,
>                           value ? mask_xs : mask_xr);
> -       } while (value != hw_ep_get_halt(ci, num, dir));
> +       } while (value != hw_ep_get_halt(ci, num, dir) && --count > 0);
>
> -       return 0;
> +       return count ? 0 : -EAGAIN;
>  }
>
>  /**
> --
> 2.25.1
>
Greg Kroah-Hartman Aug. 26, 2021, 11:17 a.m. UTC | #2
On Tue, Aug 17, 2021 at 03:43:53PM +0900, Jeaho Hwang wrote:
> If ctrl EP priming is failed (very rare case in standard linux),
> hw_ep_set_halt goes infinite loop. waiting 100 times was enough
> for zynq7000.
>
> Signed-off-by: Jeaho Hwang <jhhwang@rtst.co.kr>
>
> diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
> index 8834ca613721..d73fadb18f32 100644
> --- a/drivers/usb/chipidea/udc.c
> +++ b/drivers/usb/chipidea/udc.c
> @@ -209,6 +209,9 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
>  	return 0;
>  }
>  
> +/* enough for zynq7000 evaluation board */
> +#define HW_EP_SET_HALT_COUNT_MAX 100
> +
>  /**
>   * hw_ep_set_halt: configures ep halt & resets data toggle after clear (execute
>   *                 without interruption)
> @@ -221,6 +224,7 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
>   */
>  static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
>  {
> +	int count = HW_EP_SET_HALT_COUNT_MAX;
>  	if (value != 0 && value != 1)

You need a blank line after "int count..."


>  		return -EINVAL;
>  
> @@ -232,9 +236,9 @@ static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
>  		/* data toggle - reserved for EP0 but it's in ESS */
>  		hw_write(ci, reg, mask_xs|mask_xr,
>  			  value ? mask_xs : mask_xr);
> -	} while (value != hw_ep_get_halt(ci, num, dir));
> +	} while (value != hw_ep_get_halt(ci, num, dir) && --count > 0);
>  
> -	return 0;
> +	return count ? 0 : -EAGAIN;


Please spell this out:
	if (count)
		return 0;
	return -EAGAIN;

And will the caller properly handle this?
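
To make the two review points concrete, here is a stand-alone sketch of the
bounded loop with the return spelled out. It does not touch real registers:
struct fake_ep, fake_write_stall and fake_ep_set_halt are hypothetical names,
and the "hardware" that clears the stall bit while a setup packet is pending
is an assumption taken from the discussion in this thread:

```c
#include <errno.h>
#include <stdbool.h>

#define HW_EP_SET_HALT_COUNT_MAX 100

/* Hypothetical stand-in for one endpoint control register. */
struct fake_ep {
	bool stall;         /* models the stall (TXS/RXS) bit */
	bool setup_pending; /* while true, the model clears stall again */
};

/* Write the stall bit; the model undoes the write if a setup is pending. */
static void fake_write_stall(struct fake_ep *ep, bool value)
{
	ep->stall = value;
	if (ep->setup_pending)
		ep->stall = false;
}

/* Bounded version of the set-halt loop, returning -EAGAIN on exhaustion. */
static int fake_ep_set_halt(struct fake_ep *ep, int value)
{
	int count = HW_EP_SET_HALT_COUNT_MAX;

	if (value != 0 && value != 1)
		return -EINVAL;

	do {
		fake_write_stall(ep, value);
	} while (value != ep->stall && --count > 0);

	if (count)
		return 0;
	return -EAGAIN;
}
```

Because the `&&` short-circuits, count is only decremented after a failed
read-back, so a success on the final attempt still returns 0. Whether the
real callers check this return value is the open question raised here.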

thanks,

greg k-h
Peter Chen Aug. 26, 2021, 11:07 p.m. UTC | #3
On 21-08-24 17:31:44, Jeaho Hwang wrote:
> On Tue, Aug 17, 2021 at 3:44 PM, Jeaho Hwang <jhhwang@rtst.co.kr> wrote:
> >
> > If ctrl EP priming is failed (very rare case in standard linux),
> > hw_ep_set_halt goes infinite loop. waiting 100 times was enough
> > for zynq7000.
> >
>
> Hi Peter.
> I found from zynq7000 TRM that the hardware clears Stall bit if a
> setup packet is received on a control endpoint.
> I think hw_ep_set_halt goes infinite loop since:
>
> 1. hw_ep_prime for control EP which is called from
> isr_tr_complete_handler -> isr_setup_status_phase is failed due to a
> setup packet received.


How do you know that? Did you observe the new setup packet on the bus
before the current status stage? Usually, the host doesn't begin a new setup
transfer before the current setup transfer has finished.

Peter

> 2. in isr_tr_complete_handler it tries to call _ep_set_halt if either
> isr_tr_complete_low or isr_setup_status_phase returns error.
> 3. Since the control EP got a setup packet, HW resets TXS bit just as
> the driver sets inside hw_ep_set_halt so it goes infinite loop.
>
> Does it make sense? If it is right, we shouldn't call _ep_set_halt if
> the err is -EAGAIN, which could be returned ONLY due to the setup
> packet issue described above.
> And the loop timeout is not required anymore.
>
> Can I ask your opinion on this, Peter and USB experts?
>
> Thanks.
>
> > Signed-off-by: Jeaho Hwang <jhhwang@rtst.co.kr>
> >
> > diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
> > index 8834ca613721..d73fadb18f32 100644
> > --- a/drivers/usb/chipidea/udc.c
> > +++ b/drivers/usb/chipidea/udc.c
> > @@ -209,6 +209,9 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
> >         return 0;
> >  }
> >
> > +/* enough for zynq7000 evaluation board */
> > +#define HW_EP_SET_HALT_COUNT_MAX 100
> > +
> >  /**
> >   * hw_ep_set_halt: configures ep halt & resets data toggle after clear (execute
> >   *                 without interruption)
> > @@ -221,6 +224,7 @@ static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
> >   */
> >  static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
> >  {
> > +       int count = HW_EP_SET_HALT_COUNT_MAX;
> >         if (value != 0 && value != 1)
> >                 return -EINVAL;
> >
> > @@ -232,9 +236,9 @@ static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
> >                 /* data toggle - reserved for EP0 but it's in ESS */
> >                 hw_write(ci, reg, mask_xs|mask_xr,
> >                           value ? mask_xs : mask_xr);
> > -       } while (value != hw_ep_get_halt(ci, num, dir));
> > +       } while (value != hw_ep_get_halt(ci, num, dir) && --count > 0);
> >
> > -       return 0;
> > +       return count ? 0 : -EAGAIN;
> >  }
> >
> >  /**
> > --
> > 2.25.1
> >


-- 

Thanks,
Peter Chen
Jeaho Hwang Aug. 27, 2021, 1:35 a.m. UTC | #4
On Fri, Aug 27, 2021 at 8:08 AM, Peter Chen <peter.chen@kernel.org> wrote:
>
> On 21-08-24 17:31:44, Jeaho Hwang wrote:
> > On Tue, Aug 17, 2021 at 3:44 PM, Jeaho Hwang <jhhwang@rtst.co.kr> wrote:
> > >
> > > If ctrl EP priming is failed (very rare case in standard linux),
> > > hw_ep_set_halt goes infinite loop. waiting 100 times was enough
> > > for zynq7000.
> > >
> >
> > Hi Peter.
> > I found from zynq7000 TRM that the hardware clears Stall bit if a
> > setup packet is received on a control endpoint.
> > I think hw_ep_set_halt goes infinite loop since:
> >
> > 1. hw_ep_prime for control EP which is called from
> > isr_tr_complete_handler -> isr_setup_status_phase is failed due to a
> > setup packet received.
>
> How do you know that? Do you observe the new setup packet on the bus
> before the current status stage? Usually, the host doesn't begin new setup
> transfer before current setup transfer has finished.
>
> Peter
>


I saw the error returned from the second ENDPTSETUPSTAT check, and after
that, setting the stall bit (TXS) kept failing. So I guessed it was due
to a received setup packet.
I didn't observe the setup packet directly, e.g. with HW probes. Is there
any other reason that could produce that symptom?

As a reminder, this is only reproduced on a PREEMPT_RT kernel with a
Windows 10 RNDIS host.

thanks.

static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
{
    int n = hw_ep_bit(num, dir);

    /* Synchronize before ep prime */
    wmb();

    if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num)))
        return -EAGAIN;

    hw_write(ci, OP_ENDPTPRIME, ~0, BIT(n));

    while (hw_read(ci, OP_ENDPTPRIME, BIT(n)))
        cpu_relax();
    if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num)))
        return -EAGAIN;
        ~~~~~~~~~~~~~~~~

    /* status shoult be tested according with manual but it doesn't work */
    return 0;
}
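
The listing above checks ENDPTSETUPSTAT twice, once before and once after the
prime. A toy model of that ordering, assuming the setup-packet explanation
(which is a guess at this point in the thread), shows why the second check is
the one that fires when the packet arrives during the prime. Everything here
(struct fake_hw, fake_ep_prime, the setup_at values) is hypothetical and only
illustrates the control flow:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical moments at which a setup packet may arrive in the model. */
enum { BEFORE_PRIME = 0, DURING_PRIME = 1, NEVER = 2 };

struct fake_hw {
	int setup_at;    /* one of the enum values above */
	bool setupstat;  /* models the ENDPTSETUPSTAT bit for EP0 */
	int which_check; /* 0 = no check fired, 1 = first, 2 = second */
};

/* Mirrors only the ordering of the two checks around the prime. */
static int fake_ep_prime(struct fake_hw *hw)
{
	hw->which_check = 0;

	if (hw->setup_at == BEFORE_PRIME)
		hw->setupstat = true;

	/* first check: setup already pending, do not prime */
	if (hw->setupstat) {
		hw->which_check = 1;
		return -EAGAIN;
	}

	/* the prime and its busy-wait would happen here; in the model a
	 * setup packet may arrive while waiting */
	if (hw->setup_at == DURING_PRIME)
		hw->setupstat = true;

	/* second check: setup arrived while the prime was in flight */
	if (hw->setupstat) {
		hw->which_check = 2;
		return -EAGAIN;
	}

	return 0;
}
```

In this model an error from the second check implies the bit became set
between the two reads; the model says nothing about what set it on real
hardware.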

> > 2. in isr_tr_complete_handler it tries to call _ep_set_halt if either
> > isr_tr_complete_low or isr_setup_status_phase returns error.
> > 3. Since the control EP got a setup packet, HW resets TXS bit just as
> > the driver sets inside hw_ep_set_halt so it goes infinite loop.
> >
> > Does it make sense? If it is right, we shouldn't call _ep_set_halt if
> > the err is -EAGAIN, which could be returned ONLY due to the setup
> > packet issue described above.
> > And the loop timeout is not required anymore.
> >
> > Can I ask your opinion on this, Peter and USB experts?
> >
> > Thanks.
> >
Peter Chen Aug. 28, 2021, 1:38 a.m. UTC | #5
On 21-08-27 10:35:05, Jeaho Hwang wrote:
> On Fri, Aug 27, 2021 at 8:08 AM, Peter Chen <peter.chen@kernel.org> wrote:
> >
> > On 21-08-24 17:31:44, Jeaho Hwang wrote:
> > > On Tue, Aug 17, 2021 at 3:44 PM, Jeaho Hwang <jhhwang@rtst.co.kr> wrote:
> > > >
> > > > If ctrl EP priming is failed (very rare case in standard linux),
> > > > hw_ep_set_halt goes infinite loop. waiting 100 times was enough
> > > > for zynq7000.
> > > >
> > >
> > > Hi Peter.
> > > I found from zynq7000 TRM that the hardware clears Stall bit if a
> > > setup packet is received on a control endpoint.
> > > I think hw_ep_set_halt goes infinite loop since:
> > >
> > > 1. hw_ep_prime for control EP which is called from
> > > isr_tr_complete_handler -> isr_setup_status_phase is failed due to a
> > > setup packet received.
> >
> > How do you know that? Do you observe the new setup packet on the bus
> > before the current status stage? Usually, the host doesn't begin new setup
> > transfer before current setup transfer has finished.
> >
> > Peter
> >
>
> I found an error return from the second ENDPTSETUPSTAT checking
> routine, then setting the stall bit(TXS) kept failing. So I guessed it
> is due to a setup packet received.
> I didn't observe the setup packet by e.g. HW probes. Any other reason
> to produce that symptom?


I can think of two possible reasons for that:
- A new setup packet arrives after priming.
- An interrupt occurs after the prime, and on return from the interrupt
another thread doing a USB transfer is scheduled, e.g. usb_ep_queue from RNDIS.

From your experiments and observations, the first reason does not seem possible.
Did you get the failure on a UP system?

Peter

>
> For reminder, only reproduced on preemp_rt kernel and with Windows(10)
> RNDIS host.
>
> thanks.
>
> static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
> {
>     int n = hw_ep_bit(num, dir);
>
>     /* Synchronize before ep prime */
>     wmb();
>
>     if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num)))
>         return -EAGAIN;
>
>     hw_write(ci, OP_ENDPTPRIME, ~0, BIT(n));
>
>     while (hw_read(ci, OP_ENDPTPRIME, BIT(n)))
>         cpu_relax();
>     if (is_ctrl && dir == RX && hw_read(ci, OP_ENDPTSETUPSTAT, BIT(num)))
>         return -EAGAIN;
>         ~~~~~~~~~~~~~~~~
>
>     /* status shoult be tested according with manual but it doesn't work */
>     return 0;
> }
>
> > > 2. in isr_tr_complete_handler it tries to call _ep_set_halt if either
> > > isr_tr_complete_low or isr_setup_status_phase returns error.
> > > 3. Since the control EP got a setup packet, HW resets TXS bit just as
> > > the driver sets inside hw_ep_set_halt so it goes infinite loop.
> > >
> > > Does it make sense? If it is right, we shouldn't call _ep_set_halt if
> > > the err is -EAGAIN, which could be returned ONLY due to the setup
> > > packet issue described above.
> > > And the loop timeout is not required anymore.
> > >
> > > Can I ask your opinion on this, Peter and USB experts?
> > >
> > > Thanks.
> > >


-- 

Thanks,
Peter Chen
Jeaho Hwang Aug. 28, 2021, 3:10 a.m. UTC | #6
On Sat, Aug 28, 2021 at 10:38 AM, Peter Chen <peter.chen@kernel.org> wrote:
>
> On 21-08-27 10:35:05, Jeaho Hwang wrote:
> > On Fri, Aug 27, 2021 at 8:08 AM, Peter Chen <peter.chen@kernel.org> wrote:
> > >
> > > On 21-08-24 17:31:44, Jeaho Hwang wrote:
> > > > On Tue, Aug 17, 2021 at 3:44 PM, Jeaho Hwang <jhhwang@rtst.co.kr> wrote:
> > > > >
> > > > > If ctrl EP priming is failed (very rare case in standard linux),
> > > > > hw_ep_set_halt goes infinite loop. waiting 100 times was enough
> > > > > for zynq7000.
> > > > >
> > > >
> > > > Hi Peter.
> > > > I found from zynq7000 TRM that the hardware clears Stall bit if a
> > > > setup packet is received on a control endpoint.
> > > > I think hw_ep_set_halt goes infinite loop since:
> > > >
> > > > 1. hw_ep_prime for control EP which is called from
> > > > isr_tr_complete_handler -> isr_setup_status_phase is failed due to a
> > > > setup packet received.
> > >
> > > How do you know that? Do you observe the new setup packet on the bus
> > > before the current status stage? Usually, the host doesn't begin new setup
> > > transfer before current setup transfer has finished.
> > >
> > > Peter
> > >
> >
> > I found an error return from the second ENDPTSETUPSTAT checking
> > routine, then setting the stall bit(TXS) kept failing. So I guessed it
> > is due to a setup packet received.
> > I didn't observe the setup packet by e.g. HW probes. Any other reason
> > to produce that symptom?
>
> I guess two possible reasons for that:
> - The new setup is coming after priming
> - The interrupt occurs after prime, and when the back from interrupt,
> other thread for USB transfer is scheduled, eg, usb_ep_queue from RNDIS
>
> From your experiments and observation, it seems the first reason is not possible.


I will check whether any other thread calls into the UDC, but the only
workload using RNDIS was the host-side ping sender.
Thanks for the advice.

> Did your get failure with UP system?


I'm sorry, I don't understand what "UP system" means.

>
> Peter
>
> >
> > For reminder, only reproduced on preemp_rt kernel and with Windows(10)
> > RNDIS host.
> >
> > thanks.
> >
> >
> > > > 2. in isr_tr_complete_handler it tries to call _ep_set_halt if either
> > > > isr_tr_complete_low or isr_setup_status_phase returns error.
> > > > 3. Since the control EP got a setup packet, HW resets TXS bit just as
> > > > the driver sets inside hw_ep_set_halt so it goes infinite loop.
> > > >
> > > > Does it make sense? If it is right, we shouldn't call _ep_set_halt if
> > > > the err is -EAGAIN, which could be returned ONLY due to the setup
> > > > packet issue described above.
> > > > And the loop timeout is not required anymore.
> > > >
> > > > Can I ask your opinion on this, Peter and USB experts?
> > > >
> > > > Thanks.
> > > >
>
> --
>
> Thanks,
> Peter Chen
>



-- 
황재호, Jay Hwang, linux team manager of RTst
010-7242-1593

Patch

diff --git a/drivers/usb/chipidea/udc.c b/drivers/usb/chipidea/udc.c
index 8834ca613721..d73fadb18f32 100644
--- a/drivers/usb/chipidea/udc.c
+++ b/drivers/usb/chipidea/udc.c
@@ -209,6 +209,9 @@  static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
 	return 0;
 }
 
+/* enough for zynq7000 evaluation board */
+#define HW_EP_SET_HALT_COUNT_MAX 100
+
 /**
  * hw_ep_set_halt: configures ep halt & resets data toggle after clear (execute
  *                 without interruption)
@@ -221,6 +224,7 @@  static int hw_ep_prime(struct ci_hdrc *ci, int num, int dir, int is_ctrl)
  */
 static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
 {
+	int count = HW_EP_SET_HALT_COUNT_MAX;
 	if (value != 0 && value != 1)
 		return -EINVAL;
 
@@ -232,9 +236,9 @@  static int hw_ep_set_halt(struct ci_hdrc *ci, int num, int dir, int value)
 		/* data toggle - reserved for EP0 but it's in ESS */
 		hw_write(ci, reg, mask_xs|mask_xr,
 			  value ? mask_xs : mask_xr);
-	} while (value != hw_ep_get_halt(ci, num, dir));
+	} while (value != hw_ep_get_halt(ci, num, dir) && --count > 0);
 
-	return 0;
+	return count ? 0 : -EAGAIN;
 }
 
 /**