From patchwork Thu Apr 27 15:48:06 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Rutland
X-Patchwork-Id: 98291
Delivered-To: patch@linaro.org
Received: by 10.140.109.52 with SMTP id k49csp163866qgf;
        Thu, 27 Apr 2017 08:48:50 -0700 (PDT)
X-Received: by 10.98.42.2 with SMTP id q2mr6583336pfq.165.1493308130578;
        Thu, 27 Apr 2017 08:48:50 -0700 (PDT)
Return-Path:
Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67])
        by mx.google.com with ESMTP id q9si2920488pli.252.2017.04.27.08.48.50;
        Thu, 27 Apr 2017 08:48:50 -0700 (PDT)
Received-SPF: pass (google.com: best guess record for domain of
        linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as
        permitted sender) client-ip=209.132.180.67;
Authentication-Results: mx.google.com;
        spf=pass (google.com: best guess record for domain of
        linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as
        permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1032533AbdD0Psr (ORCPT + 25 others);
        Thu, 27 Apr 2017 11:48:47 -0400
Received: from foss.arm.com ([217.140.101.70]:38912 "EHLO foss.arm.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1031520AbdD0Psj (ORCPT); Thu, 27 Apr 2017 11:48:39 -0400
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])
        by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C099315A1;
        Thu, 27 Apr 2017 08:48:38 -0700 (PDT)
Received: from leverpostej (usa-sjc-imap-foss1.foss.arm.com [10.72.51.249])
        by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 32CEF3F23B;
        Thu, 27 Apr 2017 08:48:37 -0700 (PDT)
Date: Thu, 27 Apr 2017 16:48:06 +0100
From: Mark Rutland
To: Thomas Gleixner, catalin.marinas@arm.com, will.deacon@arm.com
Cc: Suzuki K Poulose, Peter Zijlstra, Sebastian Siewior, LKML,
        Steven Rostedt, Ingo Molnar, linux-arm-kernel@lists.infradead.org
Subject: [PATCH] arm64: cpufeature: use static_branch_enable_cpuslocked()
        (was: Re: [patch V2 00/24] cpu/hotplug: Convert get_online_cpus()
        to a percpu_rwsem)
Message-ID: <20170427154806.GA6646@leverpostej>
References: <20170418170442.665445272@linutronix.de>
        <20170425161037.GA27156@leverpostej>
        <20170425172838.mr3kyccsdteyjso5@linutronix.de>
        <20170426085958.GC27156@leverpostej>
        <20170426103236.GI27156@leverpostej>
        <20170427082719.3wyru4bk67kdmflb@linutronix.de>
        <20170427095744.GB31337@leverpostej>
        <20170427123056.GD31337@leverpostej>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <20170427123056.GD31337@leverpostej>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Catalin/Will,

The below addresses a boot failure Catalin spotted in next-20170424,
based on Sebastian's patch [1]. I've given it a spin on Juno R1, where I
can reproduce the issue prior to applying this patch.

I believe this would need to go via tip, as the issue is a result of a
change in the tip smp/hotplug branch [2], and the fix depends on
infrastructure introduced there.

Are you happy with the fix, and for it to go via the tip tree?

Thanks,
Mark.

[1] https://lkml.kernel.org/r/20170425172838.mr3kyccsdteyjso5@linutronix.de
[2] https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=smp/hotplug

---->8----
>From 6cdb503b060f74743769c9f601c35f985d3c58eb Mon Sep 17 00:00:00 2001
From: Mark Rutland
Date: Wed, 26 Apr 2017 09:46:47 +0100
Subject: [PATCH] arm64: cpufeature: use static_branch_enable_cpuslocked()

Recently, the hotplug locking was converted to use a percpu rwsem. Unlike
the existing {get,put}_online_cpus() logic, this can't nest.
Unfortunately, in arm64's secondary boot path we can end up nesting via
static_branch_enable() in cpus_set_cap() when we detect an erratum.

This leads to a stream of messages as below, where the secondary
attempts to schedule before it has been fully onlined.
As the CPU orchestrating the onlining holds the rwsem, this hangs the
system.

[    0.250334] BUG: scheduling while atomic: swapper/1/0/0x00000002
[    0.250337] Modules linked in:
[    0.250346] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc7-next-20170424 #2
[    0.250349] Hardware name: ARM Juno development board (r1) (DT)
[    0.250353] Call trace:
[    0.250365] [] dump_backtrace+0x0/0x238
[    0.250371] [] show_stack+0x14/0x20
[    0.250377] [] dump_stack+0x9c/0xc0
[    0.250384] [] __schedule_bug+0x50/0x70
[    0.250391] [] __schedule+0x52c/0x5a8
[    0.250395] [] schedule+0x38/0xa0
[    0.250400] [] rwsem_down_read_failed+0xc4/0x108
[    0.250407] [] __percpu_down_read+0x100/0x118
[    0.250414] [] get_online_cpus+0x70/0x78
[    0.250420] [] static_key_enable+0x28/0x48
[    0.250425] [] update_cpu_capabilities+0x78/0xf8
[    0.250430] [] update_cpu_errata_workarounds+0x1c/0x28
[    0.250435] [] check_local_cpu_capabilities+0xf4/0x128
[    0.250440] [] secondary_start_kernel+0x8c/0x118
[    0.250444] [<000000008093d1b4>] 0x8093d1b4

We call cpus_set_cap() from update_cpu_capabilities(), which is called
from the secondary boot path (where the CPU orchestrating the onlining
holds the hotplug rwsem), and in the primary boot path, where this is
not held.

This patch makes cpus_set_cap() use static_branch_enable_cpuslocked(),
and updates the primary CPU boot path to hold the rwsem so as to keep
the *_cpuslocked() code happy.

Signed-off-by: Mark Rutland
Reported-by: Catalin Marinas
Suggested-by: Sebastian Andrzej Siewior
Suggested-by: Thomas Gleixner
Cc: Will Deacon
Cc: Suzuki Poulose
Signed-off-by: Mark Rutland
---
 arch/arm64/include/asm/cpufeature.h | 2 +-
 arch/arm64/kernel/smp.c             | 8 +++++---
 2 files changed, 6 insertions(+), 4 deletions(-)

-- 
1.9.1

Signed-off-by: Mark Rutland
Reported-by: Catalin Marinas
Signed-off-by: Mark Rutland
Signed-off-by: Suzuki K Poulose

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index f31c48d..349b5cd 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -145,7 +145,7 @@ static inline void cpus_set_cap(unsigned int num)
 			num, ARM64_NCAPS);
 	} else {
 		__set_bit(num, cpu_hwcaps);
-		static_branch_enable(&cpu_hwcap_keys[num]);
+		static_branch_enable_cpuslocked(&cpu_hwcap_keys[num]);
 	}
 }
 
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 9b10365..c2ce9aa 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -447,11 +447,13 @@ void __init smp_prepare_boot_cpu(void)
 	cpuinfo_store_boot_cpu();
 	save_boot_cpu_run_el();
 	/*
-	 * Run the errata work around checks on the boot CPU, once we have
-	 * initialised the cpu feature infrastructure from
-	 * cpuinfo_store_boot_cpu() above.
+	 * Run the errata work around checks on the boot CPU, now that
+	 * cpuinfo_store_boot_cpu() has set things up. We hold the percpu rwsem
+	 * to keep the workaround setup code happy.
 	 */
+	get_online_cpus();
 	update_cpu_errata_workarounds();
+	put_online_cpus();
 }
 
 static u64 __init of_get_cpu_mpidr(struct device_node *dn)
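
For readers unfamiliar with the *_cpuslocked() convention, the idea behind
the change can be sketched in user-space C roughly as below. This is only
an illustrative model and not kernel code: every model_* name is
hypothetical, the percpu rwsem is reduced to a single boolean, and a
blocked acquisition is represented by a false return value rather than by
actually sleeping. It also collapses the orchestrating CPU and the
secondary CPU into one thread of control.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; these are NOT the kernel's data structures. */
static bool hotplug_lock_held;	/* stands in for the non-nesting hotplug rwsem */
static bool key_enabled;	/* stands in for one static key */

/* Returns false where the real lock would block (it is already held). */
static bool model_get_online_cpus(void)
{
	if (hotplug_lock_held)
		return false;
	hotplug_lock_held = true;
	return true;
}

static void model_put_online_cpus(void)
{
	assert(hotplug_lock_held);
	hotplug_lock_held = false;
}

/* "_cpuslocked" flavour: the caller must already hold the lock. */
static void model_enable_cpuslocked(void)
{
	assert(hotplug_lock_held);
	key_enabled = true;
}

/* Plain flavour: takes the lock itself, so it cannot be nested. */
static bool model_enable(void)
{
	if (!model_get_online_cpus())
		return false;
	model_enable_cpuslocked();
	model_put_online_cpus();
	return true;
}

/*
 * Secondary boot path in this model: the lock is taken on behalf of the
 * orchestrating CPU before the capability code runs.
 */
static bool model_secondary_boot(bool use_cpuslocked)
{
	bool ok = true;

	if (!model_get_online_cpus())
		return false;
	if (use_cpuslocked)
		model_enable_cpuslocked();
	else
		ok = model_enable();	/* nested acquisition fails */
	model_put_online_cpus();
	return ok;
}
```

In this model, model_secondary_boot(false) fails because the plain variant
tries to take a lock that is already held, while model_secondary_boot(true)
succeeds. The primary boot path corresponds to calling
model_get_online_cpus(), model_enable_cpuslocked() and
model_put_online_cpus() in sequence, which is the shape of the smp.c hunk.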