From patchwork Wed Jan 7 12:28:18 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Thompson <daniel.thompson@linaro.org>
X-Patchwork-Id: 42815
From: Daniel Thompson <daniel.thompson@linaro.org>
To: Russell King, Will Deacon
Cc: Daniel Thompson <daniel.thompson@linaro.org>,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 Shawn Guo, Sascha Hauer, Peter Zijlstra, Paul Mackerras, Ingo Molnar,
 Arnaldo Carvalho de Melo, Thomas Gleixner, Lucas Stach, Linus Walleij,
 patches@linaro.org, linaro-kernel@lists.linaro.org, John Stultz,
 Sumit Semwal
Subject: [PATCH v3] arm: perf: Directly handle SMP platforms with one SPI
Date: Wed, 7 Jan 2015 12:28:18 +0000
Message-Id: <1420633698-11742-1-git-send-email-daniel.thompson@linaro.org>
X-Mailer: git-send-email 1.9.3
In-Reply-To: <1416581603-30557-1-git-send-email-daniel.thompson@linaro.org>
References: <1416581603-30557-1-git-send-email-daniel.thompson@linaro.org>

Some ARM platforms mux the PMU interrupt of every core into a single
SPI. On such platforms, if the PMU of any core except 0 raises an
interrupt then it cannot be serviced and eventually, if you are lucky,
the spurious irq detection might forcefully disable the interrupt.

On these SoCs it is not possible to determine which core raised the
interrupt, so work around this issue by queuing irqwork on the other
cores whenever the primary interrupt handler is unable to service the
interrupt.

The u8500 platform has an alternative workaround that dynamically
alters the affinity of the PMU interrupt. This workaround logic is no
longer required, so the original code is removed, as is the hook it
relied upon.

Tested on imx6q (which has four cores/PMUs all muxed to a single SPI).

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---

Notes:
    v2 was tested on u8500 (thanks to Linus Walleij). v3 doesn't change
    anything conceptually, but the changes were just enough that I have
    not preserved the Tested-by:.

    v3:
    * Removed the function pointer indirection when deploying the
      workaround code and reorganised the code accordingly (Mark
      Rutland).
    * Moved the workaround state tracking into the existing percpu
      data structure (Mark Rutland).
    * Renamed cret to percpu_ret and rewrote the comment describing
      the purpose of this variable (Mark Rutland).
    * Copied the cpu_online_mask and used that to act on a consistent
      set of cpus throughout the workaround (Mark Rutland).
    * Changed "single_irq" to "muxed_spi" to more explicitly describe
      the problem.

    v2:
    * Fixed build problems on systems without SMP.

    v1:
    * Thanks to Lucas Stach, Russell King and Thomas Gleixner for
      critiquing an older, completely different way to tackle the same
      problem.
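    For anyone who wants to experiment with the fan-out logic outside
    the kernel, below is a minimal userspace sketch of the pattern the
    workaround relies on: the core that owns the muxed line disables
    it, charges one unit of work per other online cpu, and the last
    worker to finish re-enables the line. This is an illustration
    only; the names (NR_CPUS, service_pmu, do_percpu_work) are
    invented for the sketch, and the pthread/atomic machinery stands
    in for irq_work_queue_on()/enable_irq(). It is not the patch code.

/* build: cc -pthread -std=c11 fanout.c */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4

static atomic_int remaining_work;      /* plays cpu_pmu->remaining_work */
static atomic_bool irq_enabled = true; /* state of the muxed "SPI" */

/* Stand-in for a cpu's handle_irq(): pretend only cpu 2's PMU fired. */
static bool service_pmu(int cpu)
{
	return cpu == 2;
}

/* Stand-in for the queued irq_work callback (cpu_pmu_do_percpu_work). */
static void *do_percpu_work(void *arg)
{
	int cpu = (int)(long)arg;

	printf("cpu%d: %s\n", cpu,
	       service_pmu(cpu) ? "IRQ_HANDLED" : "IRQ_NONE");

	/* The worker that drops the count to zero re-enables the line. */
	if (atomic_fetch_sub(&remaining_work, 1) == 1) {
		atomic_store(&irq_enabled, true);
		printf("cpu%d: re-enabled muxed irq\n", cpu);
	}
	return NULL;
}

int main(void)
{
	pthread_t workers[NR_CPUS];
	int cpu;

	/* cpu0 took the interrupt but cannot identify the source... */
	atomic_store(&irq_enabled, false);          /* disable_irq_nosync() */
	atomic_store(&remaining_work, NR_CPUS - 1); /* one unit per other cpu */

	/* ...so fan the work out to every other cpu (irq_work_queue_on). */
	for (cpu = 1; cpu < NR_CPUS; cpu++)
		pthread_create(&workers[cpu], NULL, do_percpu_work,
			       (void *)(long)cpu);

	for (cpu = 1; cpu < NR_CPUS; cpu++)
		pthread_join(workers[cpu], NULL);

	printf("irq enabled again: %d\n", (int)atomic_load(&irq_enabled));
	return 0;
}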
 arch/arm/include/asm/pmu.h       |  17 ++++++
 arch/arm/kernel/perf_event.c     |   9 +--
 arch/arm/kernel/perf_event_cpu.c | 122 +++++++++++++++++++++++++++++++++++++++
 arch/arm/kernel/perf_event_v7.c  |   2 +-
 arch/arm/mach-ux500/cpu-db8500.c |  29 ----------
 5 files changed, 141 insertions(+), 38 deletions(-)

--
1.9.3

diff --git a/arch/arm/include/asm/pmu.h b/arch/arm/include/asm/pmu.h
index b1596bd59129..295e762d5116 100644
--- a/arch/arm/include/asm/pmu.h
+++ b/arch/arm/include/asm/pmu.h
@@ -87,6 +87,19 @@ struct pmu_hw_events {
	 * already have to allocate this struct per cpu.
	 */
	struct arm_pmu *percpu_pmu;
+
+#ifdef CONFIG_SMP
+	/*
+	 * This is used to schedule workaround logic on platforms where all
+	 * the PMUs are attached to a single SPI.
+	 */
+	struct irq_work work;
+
+	/*
+	 * Used to track state when deploying the above workaround.
+	 */
+	atomic_t work_ret;
+#endif
 };

 struct arm_pmu {
@@ -117,6 +130,10 @@ struct arm_pmu {
	struct platform_device *plat_device;
	struct pmu_hw_events __percpu *hw_events;
	struct notifier_block hotplug_nb;
+#ifdef CONFIG_SMP
+	int muxed_spi_workaround_irq;
+	atomic_t remaining_work;
+#endif
 };

 #define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu))
diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index f7c65adaa428..e5c537b57f94 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -299,8 +299,6 @@ validate_group(struct perf_event *event)
 static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
 {
	struct arm_pmu *armpmu;
-	struct platform_device *plat_device;
-	struct arm_pmu_platdata *plat;
	int ret;
	u64 start_clock, finish_clock;

@@ -311,14 +309,9 @@ static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
	 * dereference.
	 */
	armpmu = *(void **)dev;
-	plat_device = armpmu->plat_device;
-	plat = dev_get_platdata(&plat_device->dev);

	start_clock = sched_clock();
-	if (plat && plat->handle_irq)
-		ret = plat->handle_irq(irq, armpmu, armpmu->handle_irq);
-	else
-		ret = armpmu->handle_irq(irq, armpmu);
+	ret = armpmu->handle_irq(irq, armpmu);
	finish_clock = sched_clock();

	perf_sample_event_took(finish_clock - start_clock);
diff --git a/arch/arm/kernel/perf_event_cpu.c b/arch/arm/kernel/perf_event_cpu.c
index dd9acc95ebc0..3d51c5f442eb 100644
--- a/arch/arm/kernel/perf_event_cpu.c
+++ b/arch/arm/kernel/perf_event_cpu.c
@@ -59,6 +59,116 @@ int perf_num_counters(void)
 }
 EXPORT_SYMBOL_GPL(perf_num_counters);

+#ifdef CONFIG_SMP
+/*
+ * Workaround logic that is distributed to all cores if the PMU has only
+ * a single IRQ and the CPU receiving that IRQ cannot handle it. Its
+ * job is to try to service the interrupt on the current CPU. It will
+ * also enable the IRQ again if all the other CPUs have already tried to
+ * service it.
+ */
+static void cpu_pmu_do_percpu_work(struct irq_work *w)
+{
+	struct pmu_hw_events *hw_events =
+		container_of(w, struct pmu_hw_events, work);
+	struct arm_pmu *cpu_pmu = hw_events->percpu_pmu;
+
+	atomic_set(&hw_events->work_ret,
+		   cpu_pmu->handle_irq(0, cpu_pmu));
+
+	if (atomic_dec_and_test(&cpu_pmu->remaining_work))
+		enable_irq(cpu_pmu->muxed_spi_workaround_irq);
+}
+
+/*
+ * Called when the main interrupt handler cannot determine the source
+ * of interrupt. It will deploy a workaround if we are running on an SMP
+ * platform with only a single muxed SPI.
+ *
+ * The workaround disables the interrupt and distributes irqwork to all
+ * other processors in the system. Hopefully one of them will clear the
+ * interrupt...
+ */
+static irqreturn_t cpu_pmu_handle_irq_none(int irq_num, struct arm_pmu *cpu_pmu)
+{
+	irqreturn_t ret = IRQ_NONE;
+	cpumask_t deploy_on_mask;
+	int cpu, work_ret;
+
+	if (irq_num != cpu_pmu->muxed_spi_workaround_irq)
+		return IRQ_NONE;
+
+	disable_irq_nosync(cpu_pmu->muxed_spi_workaround_irq);
+
+	cpumask_copy(&deploy_on_mask, cpu_online_mask);
+	cpumask_clear_cpu(smp_processor_id(), &deploy_on_mask);
+	atomic_add(cpumask_weight(&deploy_on_mask), &cpu_pmu->remaining_work);
+	smp_mb__after_atomic();
+
+	for_each_cpu(cpu, &deploy_on_mask) {
+		struct pmu_hw_events *hw_events =
+			per_cpu_ptr(cpu_pmu->hw_events, cpu);
+
+		/*
+		 * The workaround code exits immediately without waiting to
+		 * see if the interrupt was handled by another CPU. This makes
+		 * it hard for us to decide between IRQ_HANDLED and IRQ_NONE.
+		 * However, the handler isn't shared so we don't have to worry
+		 * about being a good citizen, only about keeping the spurious
+		 * interrupt detector working. This allows us to return the
+		 * result of our *previous* attempt to deploy the workaround.
+		 */
+		work_ret = atomic_read(&hw_events->work_ret);
+		if (work_ret != IRQ_NONE)
+			ret = work_ret;
+
+		if (!irq_work_queue_on(&hw_events->work, cpu))
+			if (atomic_dec_and_test(&cpu_pmu->remaining_work))
+				enable_irq(cpu_pmu->muxed_spi_workaround_irq);
+	}
+
+	return ret;
+}
+
+static int cpu_pmu_muxed_spi_workaround_init(struct arm_pmu *cpu_pmu)
+{
+	struct platform_device *pmu_device = cpu_pmu->plat_device;
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		struct pmu_hw_events *hw_events =
+			per_cpu_ptr(cpu_pmu->hw_events, cpu);
+
+		init_irq_work(&hw_events->work, cpu_pmu_do_percpu_work);
+		atomic_set(&hw_events->work_ret, IRQ_HANDLED);
+	}
+
+	atomic_set(&cpu_pmu->remaining_work, 0);
+	cpu_pmu->muxed_spi_workaround_irq = platform_get_irq(pmu_device, 0);
+
+	return 0;
+}
+
+static void cpu_pmu_muxed_spi_workaround_term(struct arm_pmu *cpu_pmu)
+{
+	cpu_pmu->muxed_spi_workaround_irq = 0;
+}
+#else /* CONFIG_SMP */
+static int cpu_pmu_muxed_spi_workaround_init(struct arm_pmu *cpu_pmu)
+{
+	return 0;
+}
+
+static void cpu_pmu_muxed_spi_workaround_term(struct arm_pmu *cpu_pmu)
+{
+}
+
+static irqreturn_t cpu_pmu_handle_irq_none(int irq_num, struct arm_pmu *cpu_pmu)
+{
+	return IRQ_NONE;
+}
+#endif /* CONFIG_SMP */
+
 /* Include the PMU-specific implementations. */
 #include "perf_event_xscale.c"
 #include "perf_event_v6.c"
@@ -98,6 +208,8 @@ static void cpu_pmu_free_irq(struct arm_pmu *cpu_pmu)
			if (irq >= 0)
				free_irq(irq, per_cpu_ptr(&hw_events->percpu_pmu, i));
		}
+
+		cpu_pmu_muxed_spi_workaround_term(cpu_pmu);
	}
 }
@@ -155,6 +267,16 @@ static int cpu_pmu_request_irq(struct arm_pmu *cpu_pmu, irq_handler_t handler)

			cpumask_set_cpu(i, &cpu_pmu->active_irqs);
		}
+
+		/*
+		 * If we are running SMP and have only one interrupt source
+		 * then get ready to share that single irq among the cores.
+		 */
+		if (nr_cpu_ids > 1 && irqs == 1) {
+			err = cpu_pmu_muxed_spi_workaround_init(cpu_pmu);
+			if (err)
+				return err;
+		}
	}

	return 0;
diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
index 8993770c47de..0dd914c10803 100644
--- a/arch/arm/kernel/perf_event_v7.c
+++ b/arch/arm/kernel/perf_event_v7.c
@@ -792,7 +792,7 @@ static irqreturn_t armv7pmu_handle_irq(int irq_num, void *dev)
	 * Did an overflow occur?
	 */
	if (!armv7_pmnc_has_overflowed(pmnc))
-		return IRQ_NONE;
+		return cpu_pmu_handle_irq_none(irq_num, cpu_pmu);

	/*
	 * Handle the counter(s) overflow(s)
diff --git a/arch/arm/mach-ux500/cpu-db8500.c b/arch/arm/mach-ux500/cpu-db8500.c
index 6f63954c8bde..917774999c5c 100644
--- a/arch/arm/mach-ux500/cpu-db8500.c
+++ b/arch/arm/mach-ux500/cpu-db8500.c
@@ -12,8 +12,6 @@
 #include
 #include
 #include
-#include <linux/interrupt.h>
-#include <linux/irq.h>
 #include
 #include
 #include
@@ -23,7 +21,6 @@
 #include
 #include

-#include <asm/pmu.h>
 #include

 #include "setup.h"
@@ -99,30 +96,6 @@ static void __init u8500_map_io(void)
	iotable_init(u8500_io_desc, ARRAY_SIZE(u8500_io_desc));
 }

-/*
- * The PMU IRQ lines of two cores are wired together into a single interrupt.
- * Bounce the interrupt to the other core if it's not ours.
- */
-static irqreturn_t db8500_pmu_handler(int irq, void *dev, irq_handler_t handler)
-{
-	irqreturn_t ret = handler(irq, dev);
-	int other = !smp_processor_id();
-
-	if (ret == IRQ_NONE && cpu_online(other))
-		irq_set_affinity(irq, cpumask_of(other));
-
-	/*
-	 * We should be able to get away with the amount of IRQ_NONEs we give,
-	 * while still having the spurious IRQ detection code kick in if the
-	 * interrupt really starts hitting spuriously.
-	 */
-	return ret;
-}
-
-static struct arm_pmu_platdata db8500_pmu_platdata = {
-	.handle_irq = db8500_pmu_handler,
-};
-
 static const char *db8500_read_soc_id(void)
 {
	void __iomem *uid = __io_address(U8500_BB_UID_BASE);
@@ -143,8 +116,6 @@ static struct device * __init db8500_soc_device_init(void)
 }

 static struct of_dev_auxdata u8500_auxdata_lookup[] __initdata = {
-	/* Requires call-back bindings. */
-	OF_DEV_AUXDATA("arm,cortex-a9-pmu", 0, "arm-pmu", &db8500_pmu_platdata),
	/* Requires DMA bindings. */
	OF_DEV_AUXDATA("stericsson,ux500-msp-i2s", 0x80123000, "ux500-msp-i2s.0",
		       &msp0_platform_data),