[v3,2/6] usb: xhci: return ERR_PTR(-ETIMEDOUT) at the end of xhci_wait_for_event()

Message ID 1515548724-31869-3-git-send-email-yamada.masahiro@socionext.com
State New
Headers show
Series
  • assert() is almost used in the same way as BUG_ON(), except:
Related show

Commit Message

Masahiro Yamada Jan. 10, 2018, 1:45 a.m.
xhci_wait_for_event() is supposed to return a pointer to union xhci_trb,
but it does not return anything at the end of the function.

This relies on that the end of the function is unreachable due to BUG().

We are planning to make BUG() no-op for platforms with strong image size
constraint.  Doing so would cause compiler warning:

drivers/usb/host/xhci-ring.c:475:1: warning: control reaches end of non-void function [-Wreturn-type]
 }
 ^

So, this function must return something.  From the error message just
above, ERR_PTR(-ETIMEDOUT) seems a good choice.

The use of BUG() looks suspicious here in the first place; no response
from hardware is not a bug.  It should be treated as a normal error.
So, this function must return an error pointer instead of BUG(), then
the caller must handle it properly.

I am not fixing the code because this is not the only place that stops
the system.  Just one failure of xHCI halts the system, here and there.

I left a comment block, hoping somebody will take a look.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
---

Changes in v3:
  - newly added

Changes in v2: None

 drivers/usb/host/xhci-ring.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Masahiro Yamada Jan. 28, 2018, 10:18 a.m. | #1
Bin, Marek,

Would you take a look at xHCI code?

Lots of xHCI code invokes BUG_ON()/BUG()
when the hardware access fails,
then halts the system completely.
This is not a software bug.


2018-01-10 10:45 GMT+09:00 Masahiro Yamada <yamada.masahiro@socionext.com>:
> xhci_wait_for_event() is supposed to return a pointer to union xhci_trb,
> but it does not return anything at the end of the function.
>
> This relies on that the end of the function is unreachable due to BUG().
>
> We are planning to make BUG() no-op for platforms with strong image size
> constraint.  Doing so would cause compiler warning:
>
> drivers/usb/host/xhci-ring.c:475:1: warning: control reaches end of non-void function [-Wreturn-type]
>  }
>  ^
>
> So, this function must return something.  From the error message just
> above, ERR_PTR(-ETIMEDOUT) seems a good choice.
>
> The use of BUG() looks suspicious here in the first place; no response
> from hardware is not a bug.  It should be treated as a normal error.
> So, this function must return an error pointer instead of BUG(), then
> the caller must handle it properly.
>
> I am not fixing the code because this is not the only place that stops
> the system.  Just one failure of xHCI halts the system, here and there.
>
> I left a comment block, hoping somebody will take a look.
>
> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
> ---
>
> Changes in v3:
>   - newly added
>
> Changes in v2: None
>
>  drivers/usb/host/xhci-ring.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index 579e670..d780367 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -471,7 +471,14 @@ union xhci_trb *xhci_wait_for_event(struct xhci_ctrl *ctrl, trb_type expected)
>                 return NULL;
>
>         printf("XHCI timeout on event type %d... cannot recover.\n", expected);
> +
> +       /*
> +        * CHECK:
> +        * Is this software bug?  Is this a good reason to halt the system?
> +        */
>         BUG();
> +
> +       return ERR_PTR(-ETIMEDOUT);
>  }
>
>  /*
> --
> 2.7.4
>
> _______________________________________________
> U-Boot mailing list
> U-Boot@lists.denx.de
> https://lists.denx.de/listinfo/u-boot
Marek Vasut Jan. 28, 2018, 12:43 p.m. | #2
On 01/28/2018 11:18 AM, Masahiro Yamada wrote:
> Bin, Marek,

Hi,

> Would you take a look at xHCI code?

I was kinda hoping Bin would, oh well ...

> Lots of xHCI code invokes BUG_ON()/BUG()
> when the hardware access fails,
> then halts the system completely.
> This is not a software bug.
> 
> 
> 2018-01-10 10:45 GMT+09:00 Masahiro Yamada <yamada.masahiro@socionext.com>:
>> xhci_wait_for_event() is supposed to return a pointer to union xhci_trb,
>> but it does not return anything at the end of the function.
>>
>> This relies on that the end of the function is unreachable due to BUG().
>>
>> We are planning to make BUG() no-op for platforms with strong image size
>> constraint.

But BUG() should hang the system because it means an unrecoverable issue
happened. Changing it to nop would cause a lot of weird misbehavior, so
that is IMO a bad idea. Just change it to hang(), that should be present
even on size-constrained systems.

>> Doing so would cause compiler warning:
>>
>> drivers/usb/host/xhci-ring.c:475:1: warning: control reaches end of non-void function [-Wreturn-type]
>>  }
>>  ^
>>
>> So, this function must return something.  From the error message just
>> above, ERR_PTR(-ETIMEDOUT) seems a good choice.

This is an imported code from Linux, so you might want to check what
they do/did there.

>> The use of BUG() looks suspicious here in the first place; no response
>> from hardware is not a bug.  It should be treated as a normal error.
>> So, this function must return an error pointer instead of BUG(), then
>> the caller must handle it properly.
>>
>> I am not fixing the code because this is not the only place that stops
>> the system.  Just one failure of xHCI halts the system, here and there.
>>
>> I left a comment block, hoping somebody will take a look.
>>
>> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
>> ---
>>
>> Changes in v3:
>>   - newly added
>>
>> Changes in v2: None
>>
>>  drivers/usb/host/xhci-ring.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
>> index 579e670..d780367 100644
>> --- a/drivers/usb/host/xhci-ring.c
>> +++ b/drivers/usb/host/xhci-ring.c
>> @@ -471,7 +471,14 @@ union xhci_trb *xhci_wait_for_event(struct xhci_ctrl *ctrl, trb_type expected)
>>                 return NULL;
>>
>>         printf("XHCI timeout on event type %d... cannot recover.\n", expected);
>> +
>> +       /*
>> +        * CHECK:
>> +        * Is this software bug?  Is this a good reason to halt the system?
>> +        */
>>         BUG();
>> +
>> +       return ERR_PTR(-ETIMEDOUT);
>>  }
>>
>>  /*
>> --
>> 2.7.4
>>
>> _______________________________________________
>> U-Boot mailing list
>> U-Boot@lists.denx.de
>> https://lists.denx.de/listinfo/u-boot
> 
> 
>
Masahiro Yamada Jan. 28, 2018, 2:30 p.m. | #3
Hi Marek,

2018-01-28 21:43 GMT+09:00 Marek Vasut <marex@denx.de>:
> On 01/28/2018 11:18 AM, Masahiro Yamada wrote:
>> Bin, Marek,
>
> Hi,
>
>> Would you take a look at xHCI code?
>
> I was kinda hoping Bin would, oh well ...
>
>> Lots of xHCI code invokes BUG_ON()/BUG()
>> when the hardware access fails,
>> then halts the system completely.
>> This is not a software bug.
>>
>>
>> 2018-01-10 10:45 GMT+09:00 Masahiro Yamada <yamada.masahiro@socionext.com>:
>>> xhci_wait_for_event() is supposed to return a pointer to union xhci_trb,
>>> but it does not return anything at the end of the function.
>>>
>>> This relies on that the end of the function is unreachable due to BUG().
>>>
>>> We are planning to make BUG() no-op for platforms with strong image size
>>> constraint.
>
> But BUG() should hang the system because it means an unrecoverable issue
> happened. Changing it to nop would cause a lot of weird misbehavior, so
> that is IMO a bad idea. Just change it to hang(), that should be present
> even on size-constrained systems.


In my opinion, BUG_ON(!x) and assert(x) are equivalent.

Both check software bugs that should never happen
(but useful for debugging).


When releasing software, we can turn them into no-op.

Actually, assert() works like that in user-space programs.
(assert() is no-op if NDEBUG is defined)



>>> Doing so would cause compiler warning:
>>>
>>> drivers/usb/host/xhci-ring.c:475:1: warning: control reaches end of non-void function [-Wreturn-type]
>>>  }
>>>  ^
>>>
>>> So, this function must return something.  From the error message just
>>> above, ERR_PTR(-ETIMEDOUT) seems a good choice.
>
> This is an imported code from Linux, so you might want to check what
> they do/did there.
>


I grepped xhci_wait_for_event in the kernel source,
but did not get any hit.
Marek Vasut Jan. 28, 2018, 2:39 p.m. | #4
On 01/28/2018 03:30 PM, Masahiro Yamada wrote:
> Hi Marek,
> 
> 2018-01-28 21:43 GMT+09:00 Marek Vasut <marex@denx.de>:
>> On 01/28/2018 11:18 AM, Masahiro Yamada wrote:
>>> Bin, Marek,
>>
>> Hi,
>>
>>> Would you take a look at xHCI code?
>>
>> I was kinda hoping Bin would, oh well ...
>>
>>> Lots of xHCI code invokes BUG_ON()/BUG()
>>> when the hardware access fails,
>>> then halts the system completely.
>>> This is not a software bug.
>>>
>>>
>>> 2018-01-10 10:45 GMT+09:00 Masahiro Yamada <yamada.masahiro@socionext.com>:
>>>> xhci_wait_for_event() is supposed to return a pointer to union xhci_trb,
>>>> but it does not return anything at the end of the function.
>>>>
>>>> This relies on that the end of the function is unreachable due to BUG().
>>>>
>>>> We are planning to make BUG() no-op for platforms with strong image size
>>>> constraint.
>>
>> But BUG() should hang the system because it means an unrecoverable issue
>> happened. Changing it to nop would cause a lot of weird misbehavior, so
>> that is IMO a bad idea. Just change it to hang(), that should be present
>> even on size-constrained systems.
> 
> 
> In my opinion, BUG_ON(!x) and assert(x) are equivalent.
> 
> Both check software bugs that should never happen
> (but useful for debugging).
> 
> 
> When releasing software, we can turn them into no-op.
> 
> Actually, assert() works like that in user-space programs.
> (assert() is no-op if NDEBUG is defined)

Except in kernel, BUG() means something really bad happened and is not
optimized away, ever.
Bin Meng Jan. 30, 2018, 6:13 a.m. | #5
Hi Masahiro,

On Sun, Jan 28, 2018 at 6:18 PM, Masahiro Yamada
<yamada.masahiro@socionext.com> wrote:
> Bin, Marek,
>
> Would you take a look at xHCI code?
>

Sorry I missed this before.

> Lots of xHCI code invokes BUG_ON()/BUG()
> when the hardware access fails,
> then halts the system completely.
> This is not a software bug.

Given xHCI codes were originally from Linux where BUG_ON()/BUG() was
once there but now disappears, I agree let's do it to clear the
compiler warnings for now.
Something needs to be done in the U-Boot xHCI codes to handle the
error path properly.

Regards,
Bin

Patch

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 579e670..d780367 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -471,7 +471,14 @@  union xhci_trb *xhci_wait_for_event(struct xhci_ctrl *ctrl, trb_type expected)
 		return NULL;
 
 	printf("XHCI timeout on event type %d... cannot recover.\n", expected);
+
+	/*
+	 * CHECK:
+	 * Is this software bug?  Is this a good reason to halt the system?
+	 */
 	BUG();
+
+	return ERR_PTR(-ETIMEDOUT);
 }
 
 /*