diff mbox

IRQ #0 broken on ARM

Message ID 87zjbk966i.fsf@approximate.cambridge.arm.com
State New
Headers show

Commit Message

Marc Zyngier Nov. 21, 2014, 10:52 a.m. UTC
On Fri, Nov 21 2014 at 10:31:05 am GMT, Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> wrote:
> Hello,
>
> After the commit a71b092a9c68685a270ebdde7b5986ba8787e575
> (ARM: Convert handle_IRQ to use __handle_domain_irq) IRQ #0 is broken
> on ARM. It is a valid IRQ and it is quite imporant (on sa1100 it's a GPIO0).

Well, this is a valid IRQ number if you're not using irq domains. I may
be a bit pedantic here, but I thing this is an important distinction.

> The worst thing is that the CPU will be stuck busy-looping around this
> IRQ w/o printing anything to the console or masking the irq. How
> should we cope with that? I'd like to propose to either revert the
> offending commit or to add the following patch.

Well, said commit fixes a rather important bug, so I suggest we keep
around. Now, as for your suggestion:

> -- 
> With best wishes
> Dmitry
>
> From e87f86497b796ed55fff644bbc75bf1890941829 Mon Sep 17 00:00:00 2001
> From: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
> Date: Fri, 21 Nov 2014 13:27:11 +0300
> Subject: [PATCH] genirq: handle IRQ 0 in __handle_domain_irq
>
> __handle_domain_irq() function will ignore (well, report as bad) the IRQ
> number 0. On some platforms IRQ0 is bad IRQ. On others it is not. And
> while platforms are still in the process of converging to not using
> IRQ number 0 as a valid IRQ, I'd like to propose to use IRQ0 as a valid
> one in __handle_domain_irq().
>
> Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
> ---
>  kernel/irq/irqdesc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index a1782f8..bfbeeb6 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -365,7 +365,7 @@ int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
>  	 * Some hardware gives randomly wrong interrupts.  Rather
>  	 * than crashing, do something sensible.
>  	 */
> -	if (unlikely(!irq || irq >= nr_irqs)) {
> +	if (unlikely(irq >= nr_irqs)) {
>  		ack_bad_irq(irq);
>  		ret = -EINVAL;
>  	} else {

As I mentioned above, IRQ0 is not valid when using irq domains. As an
alternative, how about this:


I don't have a platform to test this on, but maybe you could give it a
go and let me know if that helps?

Thanks,

	M.

Comments

Dmitry Baryshkov Nov. 21, 2014, 11:01 a.m. UTC | #1
2014-11-21 13:52 GMT+03:00 Marc Zyngier <marc.zyngier@arm.com>:
> On Fri, Nov 21 2014 at 10:31:05 am GMT, Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> wrote:
>> Hello,
>>
>> After the commit a71b092a9c68685a270ebdde7b5986ba8787e575
>> (ARM: Convert handle_IRQ to use __handle_domain_irq) IRQ #0 is broken
>> on ARM. It is a valid IRQ and it is quite imporant (on sa1100 it's a GPIO0).
>
> Well, this is a valid IRQ number if you're not using irq domains. I may
> be a bit pedantic here, but I thing this is an important distinction.
>
>> The worst thing is that the CPU will be stuck busy-looping around this
>> IRQ w/o printing anything to the console or masking the irq. How
>> should we cope with that? I'd like to propose to either revert the
>> offending commit or to add the following patch.
>
> Well, said commit fixes a rather important bug, so I suggest we keep
> around. Now, as for your suggestion:

From the commit message it was not clear, that there was a bug fixed
(it talks only about code duplication).

[skipped]

> As I mentioned above, IRQ0 is not valid when using irq domains. As an
> alternative, how about this:
>
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index a1782f8..9f5bc92 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -365,7 +365,7 @@ int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
>          * Some hardware gives randomly wrong interrupts.  Rather
>          * than crashing, do something sensible.
>          */
> -       if (unlikely(!irq || irq >= nr_irqs)) {
> +       if (unlikely((lookup && !irq) || irq >= nr_irqs)) {
>                 ack_bad_irq(irq);
>                 ret = -EINVAL;
>         } else {
>
> I don't have a platform to test this on, but maybe you could give it a
> go and let me know if that helps?

It helps in my case. Thank you. Please add me to Cc if you submit
this patch.
Russell King - ARM Linux Nov. 21, 2014, 11:01 a.m. UTC | #2
On Fri, Nov 21, 2014 at 10:52:37AM +0000, Marc Zyngier wrote:
> On Fri, Nov 21 2014 at 10:31:05 am GMT, Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> wrote:
> > Hello,
> >
> > After the commit a71b092a9c68685a270ebdde7b5986ba8787e575
> > (ARM: Convert handle_IRQ to use __handle_domain_irq) IRQ #0 is broken
> > on ARM. It is a valid IRQ and it is quite imporant (on sa1100 it's a GPIO0).
> 
> Well, this is a valid IRQ number if you're not using irq domains. I may
> be a bit pedantic here, but I thing this is an important distinction.

Linus has decreed it to not be a valid IRQ number, and that's basically
the end of the discussion.  Generic code, and drivers, will increasingly
decide that IRQ0 is not valid, and objecting to it has, and will continue
to elicit a response of "fix ARM".
Marc Zyngier Nov. 21, 2014, 11:17 a.m. UTC | #3
Hi Russell,

On 21/11/14 11:01, Russell King - ARM Linux wrote:
> On Fri, Nov 21, 2014 at 10:52:37AM +0000, Marc Zyngier wrote:
>> On Fri, Nov 21 2014 at 10:31:05 am GMT, Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> wrote:
>>> Hello,
>>>
>>> After the commit a71b092a9c68685a270ebdde7b5986ba8787e575
>>> (ARM: Convert handle_IRQ to use __handle_domain_irq) IRQ #0 is broken
>>> on ARM. It is a valid IRQ and it is quite imporant (on sa1100 it's a GPIO0).
>>
>> Well, this is a valid IRQ number if you're not using irq domains. I may
>> be a bit pedantic here, but I thing this is an important distinction.
> 
> Linus has decreed it to not be a valid IRQ number, and that's basically
> the end of the discussion.  Generic code, and drivers, will increasingly
> decide that IRQ0 is not valid, and objecting to it has, and will continue
> to elicit a response of "fix ARM".

I'm fine with that.

Thanks,

	M.
Dmitry Baryshkov Nov. 21, 2014, 10:20 p.m. UTC | #4
2014-11-22 0:32 GMT+03:00 Robert Jarzmik <robert.jarzmik@free.fr>:
> Marc Zyngier <marc.zyngier@arm.com> writes:
>>> Linus has decreed it to not be a valid IRQ number, and that's basically
>>> the end of the discussion.  Generic code, and drivers, will increasingly
>>> decide that IRQ0 is not valid, and objecting to it has, and will continue
>>> to elicit a response of "fix ARM".
>>
>> I'm fine with that.
> For pxa, why not do something like that [1] ?
>
> Cheers.
>
> --
> Robert
>
> [1]
> ---8>---
>
> From 551eaf75934bd84939a40781470ed3c04d17507a Mon Sep 17 00:00:00 2001
> From: Robert Jarzmik <robert.jarzmik@free.fr>
> Date: Fri, 21 Nov 2014 22:11:42 +0100
> Subject: [PATCH] ARM: pxa: arbitrarily set first interrupt number
>
> As IRQ0, the legacy timer interrupt should not be used as an interrupt
> number, shift the interrupts by a fixed number.
>
> As we had in a special case a shift of 16 when ISA bus was used on a
> PXA, use that value as the first interrupt number, regardless of ISA or
> not.

This will shift the issue from PXA_SSP3 interrupt to ISA IRQ0.
On the other hand ISA IRQs are used only on viper and zeus boards.
And those boards explicitly mark IRQ0 as unused (because it is not
routed through CPLD).

I have another question. As for me, those "ISA" interrupts being placed
in front of PXA interrupts look like some kind of legacy stuff. Do we
still require for "ISA" (well, PC/104) interrupts to be the first ones?
Rob Herring Nov. 21, 2014, 10:27 p.m. UTC | #5
On Fri, Nov 21, 2014 at 3:32 PM, Robert Jarzmik <robert.jarzmik@free.fr> wrote:
> Marc Zyngier <marc.zyngier@arm.com> writes:
>>> Linus has decreed it to not be a valid IRQ number, and that's basically
>>> the end of the discussion.  Generic code, and drivers, will increasingly
>>> decide that IRQ0 is not valid, and objecting to it has, and will continue
>>> to elicit a response of "fix ARM".
>>
>> I'm fine with that.
> For pxa, why not do something like that [1] ?
>
> Cheers.
>
> --
> Robert
>
> [1]
> ---8>---
>
> From 551eaf75934bd84939a40781470ed3c04d17507a Mon Sep 17 00:00:00 2001
> From: Robert Jarzmik <robert.jarzmik@free.fr>
> Date: Fri, 21 Nov 2014 22:11:42 +0100
> Subject: [PATCH] ARM: pxa: arbitrarily set first interrupt number
>
> As IRQ0, the legacy timer interrupt should not be used as an interrupt
> number, shift the interrupts by a fixed number.
>
> As we had in a special case a shift of 16 when ISA bus was used on a
> PXA, use that value as the first interrupt number, regardless of ISA or
> not.
>
> Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
> ---
>  arch/arm/mach-pxa/include/mach/irqs.h | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/arch/arm/mach-pxa/include/mach/irqs.h b/arch/arm/mach-pxa/include/mach/irqs.h
> index 48c2fd8..9d8983f 100644
> --- a/arch/arm/mach-pxa/include/mach/irqs.h
> +++ b/arch/arm/mach-pxa/include/mach/irqs.h
> @@ -14,12 +14,9 @@
>
>  #ifdef CONFIG_PXA_HAVE_ISA_IRQS

You can get rid of this ifdef and kconfig symbol. It is only used here.

>  #define PXA_ISA_IRQ(x) (x)
> -#define PXA_ISA_IRQ_NUM        (16)
> -#else
> -#define PXA_ISA_IRQ_NUM        (0)
>  #endif
>
> -#define PXA_IRQ(x)     (PXA_ISA_IRQ_NUM + (x))
> +#define PXA_IRQ(x)     (16 + (x))

Perhaps use NR_IRQS_LEGACY here.

>  #define IRQ_SSP3       PXA_IRQ(0)      /* SSP3 service request */
>  #define IRQ_MSL                PXA_IRQ(1)      /* MSL Interface interrupt */
> --
> 2.1.0
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Grant Likely Nov. 21, 2014, 10:31 p.m. UTC | #6
On Fri, Nov 21, 2014 at 10:52 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On Fri, Nov 21 2014 at 10:31:05 am GMT, Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> wrote:
>> Hello,
>>
>> After the commit a71b092a9c68685a270ebdde7b5986ba8787e575
>> (ARM: Convert handle_IRQ to use __handle_domain_irq) IRQ #0 is broken
>> on ARM. It is a valid IRQ and it is quite imporant (on sa1100 it's a GPIO0).
>
> Well, this is a valid IRQ number if you're not using irq domains. I may
> be a bit pedantic here, but I thing this is an important distinction.
>
>> The worst thing is that the CPU will be stuck busy-looping around this
>> IRQ w/o printing anything to the console or masking the irq. How
>> should we cope with that? I'd like to propose to either revert the
>> offending commit or to add the following patch.
>
> Well, said commit fixes a rather important bug, so I suggest we keep
> around. Now, as for your suggestion:
>
>> --
>> With best wishes
>> Dmitry
>>
>> From e87f86497b796ed55fff644bbc75bf1890941829 Mon Sep 17 00:00:00 2001
>> From: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
>> Date: Fri, 21 Nov 2014 13:27:11 +0300
>> Subject: [PATCH] genirq: handle IRQ 0 in __handle_domain_irq
>>
>> __handle_domain_irq() function will ignore (well, report as bad) the IRQ
>> number 0. On some platforms IRQ0 is bad IRQ. On others it is not. And
>> while platforms are still in the process of converging to not using
>> IRQ number 0 as a valid IRQ, I'd like to propose to use IRQ0 as a valid
>> one in __handle_domain_irq().
>>
>> Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
>> ---
>>  kernel/irq/irqdesc.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
>> index a1782f8..bfbeeb6 100644
>> --- a/kernel/irq/irqdesc.c
>> +++ b/kernel/irq/irqdesc.c
>> @@ -365,7 +365,7 @@ int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
>>        * Some hardware gives randomly wrong interrupts.  Rather
>>        * than crashing, do something sensible.
>>        */
>> -     if (unlikely(!irq || irq >= nr_irqs)) {
>> +     if (unlikely(irq >= nr_irqs)) {
>>               ack_bad_irq(irq);
>>               ret = -EINVAL;
>>       } else {
>
> As I mentioned above, IRQ0 is not valid when using irq domains. As an
> alternative, how about this:
>
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index a1782f8..9f5bc92 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -365,7 +365,7 @@ int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
>          * Some hardware gives randomly wrong interrupts.  Rather
>          * than crashing, do something sensible.
>          */
> -       if (unlikely(!irq || irq >= nr_irqs)) {
> +       if (unlikely((lookup && !irq) || irq >= nr_irqs)) {
>                 ack_bad_irq(irq);
>                 ret = -EINVAL;
>         } else {
>
> I don't have a platform to test this on, but maybe you could give it a
> go and let me know if that helps?

I have to nak this. It isn't just when using domains that virq 0 is
invalid. It is always invalid. As suggested elsewhere in this thread,
platform code should use a hard offset from the hwirq to the virq to
get off of irq0.

g.
diff mbox

Patch

diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index a1782f8..9f5bc92 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -365,7 +365,7 @@  int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
 	 * Some hardware gives randomly wrong interrupts.  Rather
 	 * than crashing, do something sensible.
 	 */
-	if (unlikely(!irq || irq >= nr_irqs)) {
+	if (unlikely((lookup && !irq) || irq >= nr_irqs)) {
 		ack_bad_irq(irq);
 		ret = -EINVAL;
 	} else {