diff mbox series

[PATCHv2,2/2] arm64: cpufeature: use static_branch_enable_cpuslocked()

Message ID 1493315077-19496-3-git-send-email-mark.rutland@arm.com
State New
Headers show
Series arm64: fix hotplug rwsem boot fallout | expand

Commit Message

Mark Rutland April 27, 2017, 5:44 p.m. UTC
Recently, the hotplug locking was conveted to use a percpu rwsem. Unlike
the existing {get,put}_online_cpus() logic, this can't nest.
Unfortunately, in arm64's secondary boot path we can end up nesting via
static_branch_enable() in cpus_set_cap() when we detect an erratum.

This leads to a stream of messages as below, where the secondary
attempts to schedule before it has been fully onlined. As the CPU
orchestrating the onlining holds the rswem, this hangs the system.

[    0.250334] BUG: scheduling while atomic: swapper/1/0/0x00000002
[    0.250337] Modules linked in:
[    0.250346] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc7-next-20170424 #2
[    0.250349] Hardware name: ARM Juno development board (r1) (DT)
[    0.250353] Call trace:
[    0.250365] [<ffff000008088510>] dump_backtrace+0x0/0x238
[    0.250371] [<ffff00000808880c>] show_stack+0x14/0x20
[    0.250377] [<ffff00000839d854>] dump_stack+0x9c/0xc0
[    0.250384] [<ffff0000080e3540>] __schedule_bug+0x50/0x70
[    0.250391] [<ffff000008932ecc>] __schedule+0x52c/0x5a8
[    0.250395] [<ffff000008932f80>] schedule+0x38/0xa0
[    0.250400] [<ffff000008935e8c>] rwsem_down_read_failed+0xc4/0x108
[    0.250407] [<ffff0000080fe8e0>] __percpu_down_read+0x100/0x118
[    0.250414] [<ffff0000080c0b60>] get_online_cpus+0x70/0x78
[    0.250420] [<ffff0000081749e8>] static_key_enable+0x28/0x48
[    0.250425] [<ffff00000808de90>] update_cpu_capabilities+0x78/0xf8
[    0.250430] [<ffff00000808d14c>] update_cpu_errata_workarounds+0x1c/0x28
[    0.250435] [<ffff00000808e004>] check_local_cpu_capabilities+0xf4/0x128
[    0.250440] [<ffff00000808e894>] secondary_start_kernel+0x8c/0x118
[    0.250444] [<000000008093d1b4>] 0x8093d1b4

We call cpus_set_cap() from update_cpu_capabilities(), which is called
from the secondary boot path (where the CPU orchestrating the onlining
holds the hotplug rwsem), and in the primary boot path, where this is
not held.

This patch makes cpus_set_cap() use static_branch_enable_cpuslocked(),
and updates all the callers of update_cpu_capabilities() consistent with
the change.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>

Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

[Mark: minor fixups]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>

---
 arch/arm64/include/asm/cpufeature.h |  5 +++--
 arch/arm64/kernel/cpu_errata.c      | 13 ++++++++++++-
 arch/arm64/kernel/cpufeature.c      |  5 ++++-
 arch/arm64/kernel/smp.c             |  7 +++----
 4 files changed, 22 insertions(+), 8 deletions(-)

-- 
1.9.1

Comments

Will Deacon April 27, 2017, 6:01 p.m. UTC | #1
On Thu, Apr 27, 2017 at 06:44:37PM +0100, Mark Rutland wrote:
> Recently, the hotplug locking was conveted to use a percpu rwsem. Unlike

> the existing {get,put}_online_cpus() logic, this can't nest.

> Unfortunately, in arm64's secondary boot path we can end up nesting via

> static_branch_enable() in cpus_set_cap() when we detect an erratum.

> 

> This leads to a stream of messages as below, where the secondary

> attempts to schedule before it has been fully onlined. As the CPU

> orchestrating the onlining holds the rswem, this hangs the system.

> 

> [    0.250334] BUG: scheduling while atomic: swapper/1/0/0x00000002

> [    0.250337] Modules linked in:

> [    0.250346] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc7-next-20170424 #2

> [    0.250349] Hardware name: ARM Juno development board (r1) (DT)

> [    0.250353] Call trace:

> [    0.250365] [<ffff000008088510>] dump_backtrace+0x0/0x238

> [    0.250371] [<ffff00000808880c>] show_stack+0x14/0x20

> [    0.250377] [<ffff00000839d854>] dump_stack+0x9c/0xc0

> [    0.250384] [<ffff0000080e3540>] __schedule_bug+0x50/0x70

> [    0.250391] [<ffff000008932ecc>] __schedule+0x52c/0x5a8

> [    0.250395] [<ffff000008932f80>] schedule+0x38/0xa0

> [    0.250400] [<ffff000008935e8c>] rwsem_down_read_failed+0xc4/0x108

> [    0.250407] [<ffff0000080fe8e0>] __percpu_down_read+0x100/0x118

> [    0.250414] [<ffff0000080c0b60>] get_online_cpus+0x70/0x78

> [    0.250420] [<ffff0000081749e8>] static_key_enable+0x28/0x48

> [    0.250425] [<ffff00000808de90>] update_cpu_capabilities+0x78/0xf8

> [    0.250430] [<ffff00000808d14c>] update_cpu_errata_workarounds+0x1c/0x28

> [    0.250435] [<ffff00000808e004>] check_local_cpu_capabilities+0xf4/0x128

> [    0.250440] [<ffff00000808e894>] secondary_start_kernel+0x8c/0x118

> [    0.250444] [<000000008093d1b4>] 0x8093d1b4

> 

> We call cpus_set_cap() from update_cpu_capabilities(), which is called

> from the secondary boot path (where the CPU orchestrating the onlining

> holds the hotplug rwsem), and in the primary boot path, where this is

> not held.

> 

> This patch makes cpus_set_cap() use static_branch_enable_cpuslocked(),

> and updates all the callers of update_cpu_capabilities() consistent with

> the change.

> 

> Signed-off-by: Mark Rutland <mark.rutland@arm.com>

> Reported-by: Catalin Marinas <catalin.marinas@arm.com>

> Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

> Suggested-by: Thomas Gleixner <tglx@linutronix.de>

> Cc: Will Deacon <will.deacon@arm.com>

> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

> [Mark: minor fixups]

> Signed-off-by: Mark Rutland <mark.rutland@arm.com>

> ---

>  arch/arm64/include/asm/cpufeature.h |  5 +++--

>  arch/arm64/kernel/cpu_errata.c      | 13 ++++++++++++-

>  arch/arm64/kernel/cpufeature.c      |  5 ++++-

>  arch/arm64/kernel/smp.c             |  7 +++----

>  4 files changed, 22 insertions(+), 8 deletions(-)

> 

> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h

> index f31c48d..c96353a 100644

> --- a/arch/arm64/include/asm/cpufeature.h

> +++ b/arch/arm64/include/asm/cpufeature.h

> @@ -145,7 +145,7 @@ static inline void cpus_set_cap(unsigned int num)

>  			num, ARM64_NCAPS);

>  	} else {

>  		__set_bit(num, cpu_hwcaps);

> -		static_branch_enable(&cpu_hwcap_keys[num]);

> +		static_branch_enable_cpuslocked(&cpu_hwcap_keys[num]);

>  	}

>  }

>  

> @@ -222,7 +222,8 @@ void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,

>  void enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps);

>  void check_local_cpu_capabilities(void);

>  

> -void update_cpu_errata_workarounds(void);

> +void update_secondary_cpu_errata_workarounds(void);

> +void update_boot_cpu_errata_workarounds(void);

>  void __init enable_errata_workarounds(void);

>  void verify_local_cpu_errata_workarounds(void);

>  

> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c

> index f6cc67e..379ad8d 100644

> --- a/arch/arm64/kernel/cpu_errata.c

> +++ b/arch/arm64/kernel/cpu_errata.c

> @@ -175,9 +175,20 @@ void verify_local_cpu_errata_workarounds(void)

>  		}

>  }

>  

> -void update_cpu_errata_workarounds(void)

> +/*

> + * Secondary CPUs are booted with the waker holding the

> + * CPU hotplug lock, hence we don't need to lock it here again.

> + */

> +void update_secondary_cpu_errata_workarounds(void)

> +{

> +	update_cpu_capabilities(arm64_errata, "enabling workaround for");

> +}

> +

> +void update_boot_cpu_errata_workarounds(void)

>  {

> +	get_online_cpus();

>  	update_cpu_capabilities(arm64_errata, "enabling workaround for");

> +	put_online_cpus();

>  }


These functions seem to have unhelpful names, especially when compared to
the naming scheme used by the core code. I'd prefer to have:

update_cpu_errata_workarounds: just calls update_cpu_capabilities

update_cpu_errata_workarounds_cpuslocked: does get_online_cpus(), then calls
update_cpu_errata_workarounds, then does put_online_cpus();

With that change:

Acked-by: Will Deacon <will.deacon@arm.com>


for -tip.

Will
Mark Rutland April 28, 2017, 10:02 a.m. UTC | #2
On Thu, Apr 27, 2017 at 07:01:04PM +0100, Will Deacon wrote:
> On Thu, Apr 27, 2017 at 06:44:37PM +0100, Mark Rutland wrote:

> > diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c

> > index f6cc67e..379ad8d 100644

> > --- a/arch/arm64/kernel/cpu_errata.c

> > +++ b/arch/arm64/kernel/cpu_errata.c

> > @@ -175,9 +175,20 @@ void verify_local_cpu_errata_workarounds(void)

> >  		}

> >  }

> >  

> > -void update_cpu_errata_workarounds(void)

> > +/*

> > + * Secondary CPUs are booted with the waker holding the

> > + * CPU hotplug lock, hence we don't need to lock it here again.

> > + */

> > +void update_secondary_cpu_errata_workarounds(void)

> > +{

> > +	update_cpu_capabilities(arm64_errata, "enabling workaround for");

> > +}

> > +

> > +void update_boot_cpu_errata_workarounds(void)

> >  {

> > +	get_online_cpus();

> >  	update_cpu_capabilities(arm64_errata, "enabling workaround for");

> > +	put_online_cpus();

> >  }

> 

> These functions seem to have unhelpful names, especially when compared to

> the naming scheme used by the core code. I'd prefer to have:

> 

> update_cpu_errata_workarounds: just calls update_cpu_capabilities

> 

> update_cpu_errata_workarounds_cpuslocked: does get_online_cpus(), then calls

> update_cpu_errata_workarounds, then does put_online_cpus();


That's the opposite polarity to the other _cpuslocked functions, where
_cpuslocked means that the lock is already held (and should not be taken
by the _cpuslocked function itself.

So I'll make those changes, but I'll swap that so:
update_cpu_errata_workarounds() does:

  get_online_cpus()
  update_cpu_errata_workarounds_cpuslocked()
  put_online_cpus()

> With that change:

> 

> Acked-by: Will Deacon <will.deacon@arm.com>


I assume that will stand with the above change. Please shout if not!

Thanks,
Mark.
Will Deacon April 28, 2017, 10:07 a.m. UTC | #3
On Fri, Apr 28, 2017 at 11:02:30AM +0100, Mark Rutland wrote:
> On Thu, Apr 27, 2017 at 07:01:04PM +0100, Will Deacon wrote:

> > On Thu, Apr 27, 2017 at 06:44:37PM +0100, Mark Rutland wrote:

> > > diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c

> > > index f6cc67e..379ad8d 100644

> > > --- a/arch/arm64/kernel/cpu_errata.c

> > > +++ b/arch/arm64/kernel/cpu_errata.c

> > > @@ -175,9 +175,20 @@ void verify_local_cpu_errata_workarounds(void)

> > >  		}

> > >  }

> > >  

> > > -void update_cpu_errata_workarounds(void)

> > > +/*

> > > + * Secondary CPUs are booted with the waker holding the

> > > + * CPU hotplug lock, hence we don't need to lock it here again.

> > > + */

> > > +void update_secondary_cpu_errata_workarounds(void)

> > > +{

> > > +	update_cpu_capabilities(arm64_errata, "enabling workaround for");

> > > +}

> > > +

> > > +void update_boot_cpu_errata_workarounds(void)

> > >  {

> > > +	get_online_cpus();

> > >  	update_cpu_capabilities(arm64_errata, "enabling workaround for");

> > > +	put_online_cpus();

> > >  }

> > 

> > These functions seem to have unhelpful names, especially when compared to

> > the naming scheme used by the core code. I'd prefer to have:

> > 

> > update_cpu_errata_workarounds: just calls update_cpu_capabilities

> > 

> > update_cpu_errata_workarounds_cpuslocked: does get_online_cpus(), then calls

> > update_cpu_errata_workarounds, then does put_online_cpus();

> 

> That's the opposite polarity to the other _cpuslocked functions, where

> _cpuslocked means that the lock is already held (and should not be taken

> by the _cpuslocked function itself.

> 

> So I'll make those changes, but I'll swap that so:

> update_cpu_errata_workarounds() does:

> 

>   get_online_cpus()

>   update_cpu_errata_workarounds_cpuslocked()

>   put_online_cpus()

> 

> > With that change:

> > 

> > Acked-by: Will Deacon <will.deacon@arm.com>

> 

> I assume that will stand with the above change. Please shout if not!


Haha, yes, I got it downside-up. Thanks for working out what I meant.

Will
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index f31c48d..c96353a 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -145,7 +145,7 @@  static inline void cpus_set_cap(unsigned int num)
 			num, ARM64_NCAPS);
 	} else {
 		__set_bit(num, cpu_hwcaps);
-		static_branch_enable(&cpu_hwcap_keys[num]);
+		static_branch_enable_cpuslocked(&cpu_hwcap_keys[num]);
 	}
 }
 
@@ -222,7 +222,8 @@  void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
 void enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps);
 void check_local_cpu_capabilities(void);
 
-void update_cpu_errata_workarounds(void);
+void update_secondary_cpu_errata_workarounds(void);
+void update_boot_cpu_errata_workarounds(void);
 void __init enable_errata_workarounds(void);
 void verify_local_cpu_errata_workarounds(void);
 
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index f6cc67e..379ad8d 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -175,9 +175,20 @@  void verify_local_cpu_errata_workarounds(void)
 		}
 }
 
-void update_cpu_errata_workarounds(void)
+/*
+ * Secondary CPUs are booted with the waker holding the
+ * CPU hotplug lock, hence we don't need to lock it here again.
+ */
+void update_secondary_cpu_errata_workarounds(void)
+{
+	update_cpu_capabilities(arm64_errata, "enabling workaround for");
+}
+
+void update_boot_cpu_errata_workarounds(void)
 {
+	get_online_cpus();
 	update_cpu_capabilities(arm64_errata, "enabling workaround for");
+	put_online_cpus();
 }
 
 void __init enable_errata_workarounds(void)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index abda8e8..62d3a12 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -956,6 +956,7 @@  static void __init setup_elf_hwcaps(const struct arm64_cpu_capabilities *hwcaps)
 			cap_set_elf_hwcap(hwcaps);
 }
 
+/* Should be called with CPU hotplug lock held */
 void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
 			    const char *info)
 {
@@ -1075,14 +1076,16 @@  void check_local_cpu_capabilities(void)
 	 * advertised capabilities.
 	 */
 	if (!sys_caps_initialised)
-		update_cpu_errata_workarounds();
+		update_secondary_cpu_errata_workarounds();
 	else
 		verify_local_cpu_capabilities();
 }
 
 static void __init setup_feature_capabilities(void)
 {
+	get_online_cpus();
 	update_cpu_capabilities(arm64_features, "detected feature:");
+	put_online_cpus();
 	enable_cpu_capabilities(arm64_features);
 }
 
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 9b10365..d9ddd5b 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -447,11 +447,10 @@  void __init smp_prepare_boot_cpu(void)
 	cpuinfo_store_boot_cpu();
 	save_boot_cpu_run_el();
 	/*
-	 * Run the errata work around checks on the boot CPU, once we have
-	 * initialised the cpu feature infrastructure from
-	 * cpuinfo_store_boot_cpu() above.
+	 * Run the errata work around checks on the boot CPU, now that
+	 * cpuinfo_store_boot_cpu() has set things up.
 	 */
-	update_cpu_errata_workarounds();
+	update_boot_cpu_errata_workarounds();
 }
 
 static u64 __init of_get_cpu_mpidr(struct device_node *dn)