Message ID | 1417019010-9220-5-git-send-email-daniel.thompson@linaro.org |
---|---|
State | New |
On 26/11/14 17:42, Jason Cooper wrote:
> Daniel,
>
> I've been a bit swamped this cycle and haven't kept as close an eye on
> this as I should have. :( fwiw, it's looking really good.

I'll treat that as good news. Thanks.

> I have one
> question below:
>
> On Wed, Nov 26, 2014 at 04:23:28PM +0000, Daniel Thompson wrote:
>> Currently it is not possible to exploit FIQ for systems with a GIC, even if
>> the systems are otherwise capable of it. This patch makes it possible
>> for IPIs to be delivered using FIQ.
>>
>> To do so it modifies the register state so that normal interrupts are
>> placed in group 1 and specific IPIs are placed into group 0. It also
>> configures the controller to raise group 0 interrupts using the FIQ
>> signal. It provides a means for architecture code to define which IPIs
>> shall use FIQ and to acknowledge any IPIs that are raised.
>>
>> All GIC hardware except GICv1-without-TrustZone support provides a means
>> to group exceptions into group 0 and group 1 but the hardware
>> functionality is unavailable to the kernel when a secure monitor is
>> present because access to the grouping registers are prohibited outside
>> "secure world". However when grouping is not available (or in the case
>> of early GICv1 implementations is very hard to configure) the code to
>> change groups does not deploy and all IPIs will be raised via IRQ.
>>
>> It has been tested and shown working on two systems capable of
>> supporting grouping (Freescale i.MX6 and STiH416). It has also been
>> tested for boot regressions on two systems that do not support grouping
>> (vexpress-a9 and Qualcomm Snapdragon 600).
>>
>> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Jason Cooper <jason@lakedaemon.net>
>> Cc: Russell King <linux@arm.linux.org.uk>
>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>> Tested-by: Jon Medhurst <tixy@linaro.org>
>> ---
>> arch/arm/kernel/traps.c | 5 +-
>> drivers/irqchip/irq-gic.c | 155 +++++++++++++++++++++++++++++++++++++---
>> include/linux/irqchip/arm-gic.h | 8 +++
>> 3 files changed, 158 insertions(+), 10 deletions(-)
> ...
>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
>> index 5d72823bc5e9..978e5e48d5c1 100644
>> --- a/drivers/irqchip/irq-gic.c
>> +++ b/drivers/irqchip/irq-gic.c
> ...
>> +/*
>> + * Test which group an interrupt belongs to.
>> + *
>> + * Returns 0 if the controller does not support grouping.
>> + */
>> +static int gic_get_group_irq(void __iomem *base, unsigned int hwirq)
>> +{
>> +	unsigned int grp_reg = hwirq / 32 * 4;
>> +	u32 grp_val;
>> +
>> +	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
>> +
>> +	return (grp_val >> (hwirq % 32)) & 1;
>> +}
> ...
>> @@ -669,7 +802,11 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>>  	dmb(ishst);
>>
>>  	/* this always happens on GIC0 */
>> -	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
>> +	softint = map << 16 | irq;
>> +	if (gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
>> +		softint |= 0x8000;
>> +	writel_relaxed(softint,
>> +		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
>>
>>  	bl_migration_unlock();
>>  }
>
> Is it worth the code complication to optimize this if the controller
> doesn't support grouping? Maybe set group_enabled at init so the above
> would become:
>
> 	softint = map << 16 | irq;
> 	if (group_enabled &&
> 	    gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
> 		softint |= 0x8000;
> 	writel_relaxed(...);

No objections.

However given this code always calls gic_get_group_irq() with irq < 16
we might be able to do better even than this. The lower 16-bits of
IGROUP[0] are constant after boot so if we keep a shadow copy around
instead of just a boolean then we can avoid the register read on all
code paths.

Daniel.
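For readers following the arithmetic, the IGROUP lookup and the SOFTINT word discussed in this reply can be modelled in plain C outside the kernel. This is only a sketch under stated assumptions: `struct gic_model`, `igroup_offset()`, `model_get_group()` and `model_softint()` are hypothetical names that do not appear in the patch, and an ordinary array stands in for the memory-mapped distributor registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model size: two 32-bit IGROUP registers, i.e. 64 interrupts. */
#define MODEL_NUM_IRQS 64u

/* Plain memory standing in for the distributor's IGROUP registers. */
struct gic_model {
	uint32_t igroup[MODEL_NUM_IRQS / 32];
};

/* Byte offset of the IGROUP register holding hwirq (same arithmetic as the patch). */
static unsigned int igroup_offset(unsigned int hwirq)
{
	return hwirq / 32 * 4;
}

/* Returns 1 if hwirq is in group 1, 0 if it is in group 0. */
static int model_get_group(const struct gic_model *gic, unsigned int hwirq)
{
	assert(hwirq < MODEL_NUM_IRQS);
	return (gic->igroup[igroup_offset(hwirq) / 4] >> (hwirq % 32)) & 1;
}

/*
 * Compose the word written to SOFTINT: CPU target map shifted up by 16,
 * the SGI number in the low bits, and bit 15 (0x8000 in the patch) set
 * when the SGI belongs to group 1.
 */
static uint32_t model_softint(const struct gic_model *gic, uint32_t map,
			      unsigned int irq)
{
	uint32_t softint;

	assert(irq < 16);	/* SGIs only, as noted in the reply */
	softint = map << 16 | irq;
	if (model_get_group(gic, irq))
		softint |= 0x8000;
	return softint;
}
```

With SGI 8 left in group 0 and everything else in group 1, `model_softint()` leaves bit 15 clear for SGI 8 and sets it for the other SGIs, which mirrors the condition under review.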
On 27/11/14 18:06, Jason Cooper wrote:
> Daniel,
>
> On Thu, Nov 27, 2014 at 01:39:01PM +0000, Daniel Thompson wrote:
>> On 26/11/14 17:42, Jason Cooper wrote:
>>> On Wed, Nov 26, 2014 at 04:23:28PM +0000, Daniel Thompson wrote:
>>>> Currently it is not possible to exploit FIQ for systems with a GIC, even if
>>>> the systems are otherwise capable of it. This patch makes it possible
>>>> for IPIs to be delivered using FIQ.
>>>>
>>>> To do so it modifies the register state so that normal interrupts are
>>>> placed in group 1 and specific IPIs are placed into group 0. It also
>>>> configures the controller to raise group 0 interrupts using the FIQ
>>>> signal. It provides a means for architecture code to define which IPIs
>>>> shall use FIQ and to acknowledge any IPIs that are raised.
>>>>
>>>> All GIC hardware except GICv1-without-TrustZone support provides a means
>>>> to group exceptions into group 0 and group 1 but the hardware
>>>> functionality is unavailable to the kernel when a secure monitor is
>>>> present because access to the grouping registers are prohibited outside
>>>> "secure world". However when grouping is not available (or in the case
>>>> of early GICv1 implementations is very hard to configure) the code to
>>>> change groups does not deploy and all IPIs will be raised via IRQ.
>>>>
>>>> It has been tested and shown working on two systems capable of
>>>> supporting grouping (Freescale i.MX6 and STiH416). It has also been
>>>> tested for boot regressions on two systems that do not support grouping
>>>> (vexpress-a9 and Qualcomm Snapdragon 600).
>>>>
>>>> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
>>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>>> Cc: Jason Cooper <jason@lakedaemon.net>
>>>> Cc: Russell King <linux@arm.linux.org.uk>
>>>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>>>> Tested-by: Jon Medhurst <tixy@linaro.org>
>>>> ---
>>>> arch/arm/kernel/traps.c | 5 +-
>>>> drivers/irqchip/irq-gic.c | 155 +++++++++++++++++++++++++++++++++++++---
>>>> include/linux/irqchip/arm-gic.h | 8 +++
>>>> 3 files changed, 158 insertions(+), 10 deletions(-)
>>> ...
>>>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
>>>> index 5d72823bc5e9..978e5e48d5c1 100644
>>>> --- a/drivers/irqchip/irq-gic.c
>>>> +++ b/drivers/irqchip/irq-gic.c
>>> ...
>>>> +/*
>>>> + * Test which group an interrupt belongs to.
>>>> + *
>>>> + * Returns 0 if the controller does not support grouping.
>>>> + */
>>>> +static int gic_get_group_irq(void __iomem *base, unsigned int hwirq)
>>>> +{
>>>> +	unsigned int grp_reg = hwirq / 32 * 4;
>>>> +	u32 grp_val;
>>>> +
>>>> +	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
>>>> +
>>>> +	return (grp_val >> (hwirq % 32)) & 1;
>>>> +}
>>> ...
>>>> @@ -669,7 +802,11 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>>>>  	dmb(ishst);
>>>>
>>>>  	/* this always happens on GIC0 */
>>>> -	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
>>>> +	softint = map << 16 | irq;
>>>> +	if (gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
>>>> +		softint |= 0x8000;
>>>> +	writel_relaxed(softint,
>>>> +		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
>>>>
>>>>  	bl_migration_unlock();
>>>>  }
>>>
>>> Is it worth the code complication to optimize this if the controller
>>> doesn't support grouping? Maybe set group_enabled at init so the above
>>> would become:
>>>
>>> 	softint = map << 16 | irq;
>>> 	if (group_enabled &&
>>> 	    gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
>>> 		softint |= 0x8000;
>>> 	writel_relaxed(...);
>>
>> No objections.
>>
>> However given this code always calls gic_get_group_irq() with irq < 16
>> we might be able to do better even than this. The lower 16-bits of
>> IGROUP[0] are constant after boot so if we keep a shadow copy around
>> instead of just a boolean then we can avoid the register read on all
>> code paths.
>
> Hmm, I'd look at that as a performance enhancement. I'm more concerned
> about performance regressions for current users of the gic (non-group
> enabled).

"Current users of the gic" doesn't imply "non-group enabled". Whether or
not grouping is enabled is a property of the hardware or (secure)
bootloader.

If we are seriously worried about a performance regression here we
actually have to care about both cases.

> Let's go ahead and do the change (well, a working facsimile) I suggested
> above, and we can do a follow on patch to increase performance for the
> group enabled use case.

Hmnnn...

I have a new patch ready to go that shadows IGROUP[0]. It looks OK to me
and I think it is actually fewer lines of code than v10 because we can
remove gic_get_group_irq() completely.

The code in question ends up looking like:

	softint = map << 16 | irq;
	if (gic->igroup0_shadow & BIT(irq))
		softint |= 0x8000;
	writel_relaxed(...);

This should end up with the same (data) cache profile as your proposal
in the non-group case and should normally be a win for the grouped case.
I even remembered an informative comment to make clear that the use of
shadowing is an optimization and has nothing to do with working around
stupid hardware ;-).

I hope you don't mind but I'm about to share a patchset based on the
above so you can see it in full and decide if you like it. I don't
object to adding an extra boolean (and will do that if you don't like
the above) but I think this code is better.

> If there's no objections, I'd like to try to get this in for v3.19, but
> it's really late. So we'll see how it goes.

I like that too. I also agree it's pretty late and that's one of the
reasons why I'm turning round new patchsets for each bit of feedback.

Daniel.
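The trade-off argued in this reply (a cached copy of IGROUP[0] versus a register read on every IPI) can be illustrated with a small userspace model. This is a sketch, not the follow-up patch: apart from the `igroup0_shadow` field name quoted in the reply, everything here (`struct gic_shadow_model`, `grouping_supported`, `reg_reads`, and the helper functions) is hypothetical. It also assumes that an unsupported IGROUP register ignores writes, as the RAZ/WI comment in the patch describes.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

struct gic_shadow_model {
	int grouping_supported;		/* 0 models a RAZ/WI IGROUP register */
	uint32_t igroup0;		/* stand-in for the IGROUP[0] register */
	uint32_t igroup0_shadow;	/* cached copy kept in normal memory */
	unsigned int reg_reads;		/* counts simulated register reads */
};

/* Write IGROUP[0] and refresh the shadow from what the "hardware" kept. */
static void model_write_igroup0(struct gic_shadow_model *gic, uint32_t val)
{
	if (gic->grouping_supported)
		gic->igroup0 = val;	/* otherwise the write is ignored */
	gic->igroup0_shadow = gic->igroup0;
}

static uint32_t model_read_igroup0(struct gic_shadow_model *gic)
{
	gic->reg_reads++;
	return gic->igroup0;
}

/* v10-style path: one register read per software interrupt. */
static uint32_t softint_via_register(struct gic_shadow_model *gic,
				     uint32_t map, unsigned int irq)
{
	uint32_t softint = map << 16 | irq;

	assert(irq < 16);
	if (model_read_igroup0(gic) & BIT(irq))
		softint |= 0x8000;
	return softint;
}

/* Shadowed path: same result, no register read. */
static uint32_t softint_via_shadow(const struct gic_shadow_model *gic,
				   uint32_t map, unsigned int irq)
{
	uint32_t softint = map << 16 | irq;

	assert(irq < 16);
	if (gic->igroup0_shadow & BIT(irq))
		softint |= 0x8000;
	return softint;
}
```

In this model both paths produce the same SOFTINT word, but only the register path increments `reg_reads`. When `grouping_supported` is 0 the shadow stays zero, so bit 15 is never set, which matches the behaviour expected on controllers without grouping.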
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 0c8b10801d36..4dc45b38e56e 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -26,6 +26,7 @@
 #include <linux/init.h>
 #include <linux/sched.h>
 #include <linux/irq.h>
+#include <linux/irqchip/arm-gic.h>

 #include <linux/atomic.h>
 #include <asm/cacheflush.h>
@@ -479,7 +480,9 @@ asmlinkage void __exception_irq_entry handle_fiq_as_nmi(struct pt_regs *regs)

 	nmi_enter();

-	/* nop. FIQ handlers for special arch/arm features can be added here. */
+#ifdef CONFIG_ARM_GIC
+	gic_handle_fiq_ipi();
+#endif

 	nmi_exit();

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 5d72823bc5e9..978e5e48d5c1 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -39,6 +39,7 @@
 #include <linux/slab.h>
 #include <linux/irqchip/chained_irq.h>
 #include <linux/irqchip/arm-gic.h>
+#include <linux/ratelimit.h>

 #include <asm/cputype.h>
 #include <asm/irq.h>
@@ -48,6 +49,10 @@
 #include "irq-gic-common.h"
 #include "irqchip.h"

+#ifndef SMP_IPI_FIQ_MASK
+#define SMP_IPI_FIQ_MASK 0
+#endif
+
 union gic_base {
 	void __iomem *common_base;
 	void __percpu * __iomem *percpu_base;
@@ -348,6 +353,93 @@ static struct irq_chip gic_chip = {
 	.irq_set_wake		= gic_set_wake,
 };

+/*
+ * Shift an interrupt between Group 0 and Group 1.
+ *
+ * In addition to changing the group we also modify the priority to
+ * match what "ARM strongly recommends" for a system where no Group 1
+ * interrupt must ever preempt a Group 0 interrupt.
+ *
+ * It is safe to call this function on systems which do not support
+ * grouping (it will have no effect).
+ */
+static void gic_set_group_irq(void __iomem *base, unsigned int hwirq,
+			      int group)
+{
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_mask = BIT(hwirq % 32);
+	u32 grp_val;
+
+	unsigned int pri_reg = (hwirq / 4) * 4;
+	u32 pri_mask = BIT(7 + ((hwirq % 4) * 8));
+	u32 pri_val;
+
+	/*
+	 * Systems which do not support grouping will not have
+	 * the EnableGrp1 bit set.
+	 */
+	if (!(GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL)))
+		return;
+
+	raw_spin_lock(&irq_controller_lock);
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+	pri_val = readl_relaxed(base + GIC_DIST_PRI + pri_reg);
+
+	if (group) {
+		grp_val |= grp_mask;
+		pri_val |= pri_mask;
+	} else {
+		grp_val &= ~grp_mask;
+		pri_val &= ~pri_mask;
+	}
+
+	writel_relaxed(grp_val, base + GIC_DIST_IGROUP + grp_reg);
+	writel_relaxed(pri_val, base + GIC_DIST_PRI + pri_reg);
+
+	raw_spin_unlock(&irq_controller_lock);
+}
+
+/*
+ * Test which group an interrupt belongs to.
+ *
+ * Returns 0 if the controller does not support grouping.
+ */
+static int gic_get_group_irq(void __iomem *base, unsigned int hwirq)
+{
+	unsigned int grp_reg = hwirq / 32 * 4;
+	u32 grp_val;
+
+	grp_val = readl_relaxed(base + GIC_DIST_IGROUP + grp_reg);
+
+	return (grp_val >> (hwirq % 32)) & 1;
+}
+
+/*
+ * Fully acknowledge (both ack and eoi) any outstanding FIQ-based IPI,
+ * otherwise do nothing.
+ */
+void gic_handle_fiq_ipi(void)
+{
+	struct gic_chip_data *gic = &gic_data[0];
+	void __iomem *cpu_base = gic_data_cpu_base(gic);
+	unsigned long irqstat, irqnr;
+
+	if (WARN_ON(!in_nmi()))
+		return;
+
+	while ((1u << readl_relaxed(cpu_base + GIC_CPU_HIGHPRI)) &
+	       SMP_IPI_FIQ_MASK) {
+		irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
+		writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
+
+		irqnr = irqstat & GICC_IAR_INT_ID_MASK;
+		WARN_RATELIMIT(irqnr > 16,
+			       "Unexpected irqnr %lu (bad prioritization?)\n",
+			       irqnr);
+	}
+}
+
 void __init gic_cascade_irq(unsigned int gic_nr, unsigned int irq)
 {
 	if (gic_nr >= MAX_GIC_NR)
@@ -379,15 +471,24 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
 static void gic_cpu_if_up(void)
 {
 	void __iomem *cpu_base = gic_data_cpu_base(&gic_data[0]);
-	u32 bypass = 0;
+	void __iomem *dist_base = gic_data_dist_base(&gic_data[0]);
+	u32 ctrl = 0;

 	/*
-	* Preserve bypass disable bits to be written back later
-	*/
-	bypass = readl(cpu_base + GIC_CPU_CTRL);
-	bypass &= GICC_DIS_BYPASS_MASK;
+	 * Preserve bypass disable bits to be written back later
+	 */
+	ctrl = readl(cpu_base + GIC_CPU_CTRL);
+	ctrl &= GICC_DIS_BYPASS_MASK;

-	writel_relaxed(bypass | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
+	/*
+	 * If EnableGrp1 is set in the distributor then enable group 1
+	 * support for this CPU (and route group 0 interrupts to FIQ).
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL))
+		ctrl |= GICC_COMMON_BPR | GICC_FIQ_EN | GICC_ACK_CTL |
+			GICC_ENABLE_GRP1;
+
+	writel_relaxed(ctrl | GICC_ENABLE, cpu_base + GIC_CPU_CTRL);
 }


@@ -411,7 +512,23 @@ static void __init gic_dist_init(struct gic_chip_data *gic)

 	gic_dist_config(base, gic_irqs, NULL);

-	writel_relaxed(GICD_ENABLE, base + GIC_DIST_CTRL);
+	/*
+	 * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only,
+	 * bit 1 ignored) depending on current mode.
+	 */
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE, base + GIC_DIST_CTRL);
+
+	/*
+	 * Set all global interrupts to be group 1 if (and only if) it
+	 * is possible to enable group 1 interrupts. This register is RAZ/WI
+	 * if not accessible or not implemented, however some GICv1 devices
+	 * do not implement the EnableGrp1 bit making it unsafe to set
+	 * this register unconditionally.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(base + GIC_DIST_CTRL))
+		for (i = 32; i < gic_irqs; i += 32)
+			writel_relaxed(0xffffffff,
+				       base + GIC_DIST_IGROUP + i * 4 / 32);
 }

 static void gic_cpu_init(struct gic_chip_data *gic)
@@ -420,6 +537,7 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 	void __iomem *base = gic_data_cpu_base(gic);
 	unsigned int cpu_mask, cpu = smp_processor_id();
 	int i;
+	unsigned long secure_irqs, secure_irq;

 	/*
 	 * Get what the GIC says our CPU mask is.
@@ -438,6 +556,19 @@ static void gic_cpu_init(struct gic_chip_data *gic)

 	gic_cpu_config(dist_base, NULL);

+	/*
+	 * If the distributor is configured to support interrupt grouping
+	 * then set any PPI and SGI interrupts not set in SMP_IPI_FIQ_MASK
+	 * to be group1 and ensure any remaining group 0 interrupts have
+	 * the right priority.
+	 */
+	if (GICD_ENABLE_GRP1 & readl_relaxed(dist_base + GIC_DIST_CTRL)) {
+		secure_irqs = SMP_IPI_FIQ_MASK;
+		writel_relaxed(~secure_irqs, dist_base + GIC_DIST_IGROUP + 0);
+		for_each_set_bit(secure_irq, &secure_irqs, 16)
+			gic_set_group_irq(dist_base, secure_irq, 0);
+	}
+
 	writel_relaxed(GICC_INT_PRI_THRESHOLD, base + GIC_CPU_PRIMASK);
 	gic_cpu_if_up();
 }
@@ -527,7 +658,8 @@ static void gic_dist_restore(unsigned int gic_nr)
 		writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
 			dist_base + GIC_DIST_ENABLE_SET + i * 4);

-	writel_relaxed(GICD_ENABLE, dist_base + GIC_DIST_CTRL);
+	writel_relaxed(GICD_ENABLE_GRP1 | GICD_ENABLE,
+		       dist_base + GIC_DIST_CTRL);
 }

 static void gic_cpu_save(unsigned int gic_nr)
@@ -655,6 +787,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
 	int cpu;
 	unsigned long map = 0;
+	unsigned long softint;

 	bl_migration_lock();

@@ -669,7 +802,11 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	dmb(ishst);

 	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	softint = map << 16 | irq;
+	if (gic_get_group_irq(gic_data_dist_base(&gic_data[0]), irq))
+		softint |= 0x8000;
+	writel_relaxed(softint,
+		       gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);

 	bl_migration_unlock();
 }
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 13eed92c7d24..e83d292d4dbc 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -22,6 +22,10 @@
 #define GIC_CPU_IDENT			0xfc

 #define GICC_ENABLE			0x1
+#define GICC_ENABLE_GRP1		0x2
+#define GICC_ACK_CTL			0x4
+#define GICC_FIQ_EN			0x8
+#define GICC_COMMON_BPR			0x10
 #define GICC_INT_PRI_THRESHOLD		0xf0
 #define GICC_IAR_INT_ID_MASK		0x3ff
 #define GICC_INT_SPURIOUS		1023
@@ -44,6 +48,7 @@
 #define GIC_DIST_SGI_PENDING_SET	0xf20

 #define GICD_ENABLE			0x1
+#define GICD_ENABLE_GRP1		0x2
 #define GICD_DISABLE			0x0
 #define GICD_INT_ACTLOW_LVLTRIG		0x0
 #define GICD_INT_EN_CLR_X32		0xffffffff
@@ -117,5 +122,8 @@ static inline void __init register_routable_domain_ops
 {
 	gic_routable_irq_domain_ops = ops;
 }
+
+void gic_handle_fiq_ipi(void);
+
 #endif /* __ASSEMBLY */
 #endif
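The offset and mask arithmetic at the top of gic_set_group_irq() can be checked in isolation. The sketch below copies the four expressions from the patch into a standalone helper. `struct irq_location`, `locate_irq()` and `apply_group()` are hypothetical names, and the bound of 1020 interrupt IDs is an assumption made only for the sake of the assertion.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Where a given hwirq lives in the group and priority register banks. */
struct irq_location {
	unsigned int grp_reg;	/* byte offset into the IGROUP bank */
	uint32_t grp_mask;	/* one bit per interrupt */
	unsigned int pri_reg;	/* byte offset into the priority bank */
	uint32_t pri_mask;	/* top bit of this interrupt's priority byte */
};

static struct irq_location locate_irq(unsigned int hwirq)
{
	struct irq_location loc;

	assert(hwirq < 1020);	/* assumed upper bound, see lead-in */

	/* 32 interrupts per 32-bit group register */
	loc.grp_reg = hwirq / 32 * 4;
	loc.grp_mask = BIT(hwirq % 32);

	/* 4 interrupts (one byte each) per 32-bit priority register */
	loc.pri_reg = (hwirq / 4) * 4;
	loc.pri_mask = BIT(7 + ((hwirq % 4) * 8));

	return loc;
}

/* Set the masked bit for group 1, clear it for group 0 (as in the patch). */
static uint32_t apply_group(uint32_t val, uint32_t mask, int group)
{
	return group ? (val | mask) : (val & ~mask);
}
```

For example, interrupt 41 sits in the second group register (offset 4, bit 9) and in the second byte of the priority register at offset 40. Clearing `pri_mask` for a group 0 interrupt clears the top bit of its priority byte, which gives it a numerically lower priority value than any interrupt with that bit set.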