[3.10] pstore: Make spinlock per zone instead of global

Message ID 1501207574-24958-1-git-send-email-leo.yan@linaro.org
State New
Headers show

Commit Message

Leo Yan July 28, 2017, 2:06 a.m.
From: Joel Fernandes <joelaf@google.com>


Currently pstore has a global spinlock for all zones. Since the zones
are independent and modify different areas of memory, there is no need
for a global lock; make the lock per-zone instead, as introduced here.
Also, ramoops's ftrace use-case will later gain a FTRACE_PER_CPU flag
which splits the ftrace memory area into a single zone per CPU and thus
eliminates the need for locking there. In preparation for this, make
the locking optional.

Supplementary commit log (Leo):

A further effect of this patch is to fix a deadlock that occurs when
ftrace and console logging are enabled together in ramoops. In the old
code, the ftrace buffer and the console buffer in ramoops share the
same raw spinlock, "buffer_lock". In the case below, the kernel first
acquires the lock for the console buffer; when it finishes console
recording it calls _raw_spin_unlock_irqrestore(), which has the
function tracer enabled on entry. So before releasing the spinlock, the
function tracer fires and tries to acquire the same spinlock again; the
spinlock recursion then hangs the system.

This patch uses a separate lock for every buffer, so the console buffer
and the ftrace buffer use dedicated locks even on the same code path;
this effectively fixes the lock recursion issue.

[   65.103905] hrtimer: interrupt took 2759375 ns
[   65.108721] BUG: spinlock recursion on CPU#0, kschedfreq:0/1246
[   65.108760]  lock: buffer_lock+0x0/0x38, .magic: dead4ead, .owner: kschedfreq:0/1246, .owner_cpu: 0
[   65.108779] CPU: 0 PID: 1246 Comm: kschedfreq:0 Not tainted 4.4.74-07294-g5c996a9-dirty #130
[   65.108786] Hardware name: HiKey960 (DT)
[   65.108794] Call trace:
[   65.108820] [<ffffff800808ad64>] dump_backtrace+0x0/0x1e0
[   65.108835] [<ffffff800808af64>] show_stack+0x20/0x28
[   65.108857] [<ffffff80084ed4ec>] dump_stack+0xa8/0xe0
[   65.108872] [<ffffff800813c934>] spin_dump+0x88/0xac
[   65.108882] [<ffffff800813c988>] spin_bug+0x30/0x3c
[   65.108894] [<ffffff800813cb98>] do_raw_spin_lock+0xd0/0x1b8
[   65.108916] [<ffffff8008cba444>] _raw_spin_lock_irqsave+0x58/0x68
[   65.108935] [<ffffff8008453aec>] buffer_size_add.isra.4+0x30/0x78
[   65.108948] [<ffffff8008453f44>] persistent_ram_write+0x58/0x150
[   65.108961] [<ffffff8008452ca0>] ramoops_pstore_write_buf+0x14c/0x1d8
[   65.108974] [<ffffff8008452648>] pstore_ftrace_call+0x80/0xb4
[   65.108991] [<ffffff80081a9404>] ftrace_ops_no_ops+0xb8/0x154
[   65.109008] [<ffffff8008092e9c>] ftrace_graph_call+0x0/0x14
[   65.109023] [<ffffff8008cba594>] _raw_spin_unlock_irqrestore+0x20/0x90
[   65.109036] [<ffffff8008453b24>] buffer_size_add.isra.4+0x68/0x78
[   65.109048] [<ffffff8008453f44>] persistent_ram_write+0x58/0x150
[   65.109061] [<ffffff8008452ca0>] ramoops_pstore_write_buf+0x14c/0x1d8
[   65.109073] [<ffffff80084517c8>] pstore_write_compat+0x60/0x6c
[   65.109086] [<ffffff80084519d0>] pstore_console_write+0xa8/0xf4
[   65.109104] [<ffffff80081442e0>] call_console_drivers.constprop.21+0x1bc/0x1ec
[   65.109117] [<ffffff8008145488>] console_unlock+0x3a8/0x500
[   65.109129] [<ffffff8008145900>] vprintk_emit+0x320/0x62c
[   65.109142] [<ffffff8008145db0>] vprintk_default+0x48/0x54
[   65.109161] [<ffffff80081e3bec>] printk+0xa8/0xb4
[   65.109178] [<ffffff80081602a8>] hrtimer_interrupt+0x1f0/0x1f4
[   65.109197] [<ffffff80088eefd4>] arch_timer_handler_phys+0x3c/0x48
[   65.109211] [<ffffff800814bd00>] handle_percpu_devid_irq+0xd0/0x3c0
[   65.109225] [<ffffff800814718c>] generic_handle_irq+0x34/0x4c
[   65.109237] [<ffffff8008147234>] __handle_domain_irq+0x90/0xf8
[   65.109250] [<ffffff800808253c>] gic_handle_irq+0x5c/0xa8

Fixes: 0405a5cec340 ("pstore/ram: avoid atomic accesses for ioremapped regions")
Signed-off-by: Joel Fernandes <joelaf@google.com>

[kees: updated commit message]
Signed-off-by: Kees Cook <keescook@chromium.org>

Signed-off-by: Leo Yan <leo.yan@linaro.org>

---
 fs/pstore/ram_core.c       | 11 +++++------
 include/linux/pstore_ram.h |  1 +
 2 files changed, 6 insertions(+), 6 deletions(-)

-- 
2.7.4

Comments

Willy Tarreau July 28, 2017, 4:25 a.m. | #1
Hi Leo,

There was no upstream commit ID here but I found it in mainline here :

  commit 109704492ef637956265ec2eb72ae7b3b39eb6f4
  Author: Joel Fernandes <joelaf@google.com>
  Date:   Thu Oct 20 00:34:00 2016 -0700

    pstore: Make spinlock per zone instead of global
    
What worries me is that some later fixes were issued, apparently to fix
an oops and a warning after this patch :

  commit 76d5692a58031696e282384cbd893832bc92bd76
  Author: Kees Cook <keescook@chromium.org>
  Date:   Thu Feb 9 15:43:44 2017 -0800

    pstore: Correctly initialize spinlock and flags
    
    The ram backend wasn't always initializing its spinlock correctly. Since
    it was coming from kzalloc memory, though, it was harmless on
    architectures that initialize unlocked spinlocks to 0 (at least x86 and
    ARM). This also fixes a possibly ignored flag setting too.

and :

  commit e9a330c4289f2ba1ca4bf98c2b430ab165a8931b
  Author: Kees Cook <keescook@chromium.org>
  Date:   Sun Mar 5 22:08:58 2017 -0800

    pstore: Use dynamic spinlock initializer
    
    The per-prz spinlock should be using the dynamic initializer so that
    lockdep can correctly track it. Without this, under lockdep, we get a
    warning at boot that the lock is in non-static memory.

So I'm fine with merging this patch as long as Kees is OK with this and
we know what exact patch series needs to be merged.

Also, the information you added to the commit message references a trace
on a 4.4 kernel. Do you confirm that you got the same issue on 3.10 ? I
just prefer to avoid blindly backporting sensitive patches if they're not
absolutely needed.

> [   65.103905] hrtimer: interrupt took 2759375 ns
> [   65.108721] BUG: spinlock recursion on CPU#0, kschedfreq:0/1246
> [   65.108760]  lock: buffer_lock+0x0/0x38, .magic: dead4ead, .owner: kschedfreq:0/1246, .owner_cpu: 0
> [   65.108779] CPU: 0 PID: 1246 Comm: kschedfreq:0 Not tainted 4.4.74-07294-g5c996a9-dirty #130

Thanks!
willy
Leo Yan July 28, 2017, 6:52 a.m. | #2
On Fri, Jul 28, 2017 at 06:25:55AM +0200, Willy Tarreau wrote:
> Hi Leo,
>
> There was no upstream commit ID here but I found it in mainline here :
>
>   commit 109704492ef637956265ec2eb72ae7b3b39eb6f4
>   Author: Joel Fernandes <joelaf@google.com>
>   Date:   Thu Oct 20 00:34:00 2016 -0700
>
>     pstore: Make spinlock per zone instead of global
>
> What worries me is that some later fixes were issued, apparently to fix
> an oops and a warning after this patch :

Yes, I also noticed the two patches below, but at least I have not
reproduced those issues on the Android common kernel 4.4. I only hit
the hang issue, and the first patch fixes it.

BTW, I tried to port the second and third patches, but the second
patch seems to depend on one extra patch; so, to avoid introducing
extra complexity, I only ported the first one to fix the issue.

commit 663deb47880f2283809669563c5a52ac7c6aef1a
Author: Joel Fernandes <joelaf@google.com>
Date:   Thu Oct 20 00:34:01 2016 -0700

    pstore: Allow prz to control need for locking

    In preparation of not locking at all for certain buffers depending on if
    there's contention, make locking optional depending on the initialization
    of the prz.

>   commit 76d5692a58031696e282384cbd893832bc92bd76
>   Author: Kees Cook <keescook@chromium.org>
>   Date:   Thu Feb 9 15:43:44 2017 -0800
>
>     pstore: Correctly initialize spinlock and flags
>
>     The ram backend wasn't always initializing its spinlock correctly. Since
>     it was coming from kzalloc memory, though, it was harmless on
>     architectures that initialize unlocked spinlocks to 0 (at least x86 and
>     ARM). This also fixes a possibly ignored flag setting too.
>
> and :
>
>   commit e9a330c4289f2ba1ca4bf98c2b430ab165a8931b
>   Author: Kees Cook <keescook@chromium.org>
>   Date:   Sun Mar 5 22:08:58 2017 -0800
>
>     pstore: Use dynamic spinlock initializer
>
>     The per-prz spinlock should be using the dynamic initializer so that
>     lockdep can correctly track it. Without this, under lockdep, we get a
>     warning at boot that the lock is in non-static memory.
>
> So I'm fine with merging this patch as long as Kees is OK with this and
> we know what exact patch series needs to be merged.
>
> Also, the information you added to the commit message references a trace
> on a 4.4 kernel. Do you confirm that you got the same issue on 3.10 ?


No, I can only confirm this on kernel 4.4. Only kernel 4.4 is
currently available on the board, and I verified that the mainline
kernel works well; that is why I could compare the two and find that
the first patch is the critical one.

> I just prefer to avoid blindly backporting sensitive patches if they're not
> absolutely needed.
>
> > [   65.103905] hrtimer: interrupt took 2759375 ns
> > [   65.108721] BUG: spinlock recursion on CPU#0, kschedfreq:0/1246
> > [   65.108760]  lock: buffer_lock+0x0/0x38, .magic: dead4ead, .owner: kschedfreq:0/1246, .owner_cpu: 0
> > [   65.108779] CPU: 0 PID: 1246 Comm: kschedfreq:0 Not tainted 4.4.74-07294-g5c996a9-dirty #130
>
> Thanks!
> willy
Willy Tarreau July 28, 2017, 9:47 p.m. | #3
On Fri, Jul 28, 2017 at 02:52:15PM +0800, Leo Yan wrote:
> On Fri, Jul 28, 2017 at 06:25:55AM +0200, Willy Tarreau wrote:
> > Hi Leo,
> >
> > There was no upstream commit ID here but I found it in mainline here :
> >
> >   commit 109704492ef637956265ec2eb72ae7b3b39eb6f4
> >   Author: Joel Fernandes <joelaf@google.com>
> >   Date:   Thu Oct 20 00:34:00 2016 -0700
> >
> >     pstore: Make spinlock per zone instead of global
> >
> > What worries me is that some later fixes were issued, apparently to fix
> > an oops and a warning after this patch :
>
> Yes, I also noticed the two patches below, but at least I have not
> reproduced those issues on the Android common kernel 4.4. I only hit
> the hang issue, and the first patch fixes it.


OK but maybe by breaking something else that the other ones have to
fix. That's my main concern in fact.

> BTW, I tried to port the second and third patches, but the second
> patch seems to depend on one extra patch; so, to avoid introducing
> extra complexity, I only ported the first one to fix the issue.


OK.

> > Also, the information you added to the commit message references a trace
> > on a 4.4 kernel. Do you confirm that you got the same issue on 3.10 ?
>
> No, I can only confirm this on kernel 4.4. Only kernel 4.4 is
> currently available on the board, and I verified that the mainline
> kernel works well; that is why I could compare the two and find that
> the first patch is the critical one.


Given that 3.10 only has a few months left, if 3.10 isn't available on
this hardware, do you really think we need to fix something in it that
apparently nobody will be in a position to experience, at the risk of
possibly adding some partial breakage ?

I'm not opposed, really just asking.

Thanks,
Willy
Leo Yan July 30, 2017, 3:48 p.m. | #4
Hi Willy,

On Fri, Jul 28, 2017 at 11:47:52PM +0200, Willy Tarreau wrote:
> On Fri, Jul 28, 2017 at 02:52:15PM +0800, Leo Yan wrote:
> > On Fri, Jul 28, 2017 at 06:25:55AM +0200, Willy Tarreau wrote:
> > > Hi Leo,
> > >
> > > There was no upstream commit ID here but I found it in mainline here :
> > >
> > >   commit 109704492ef637956265ec2eb72ae7b3b39eb6f4
> > >   Author: Joel Fernandes <joelaf@google.com>
> > >   Date:   Thu Oct 20 00:34:00 2016 -0700
> > >
> > >     pstore: Make spinlock per zone instead of global
> > >
> > > What worries me is that some later fixes were issued, apparently to fix
> > > an oops and a warning after this patch :
> >
> > Yes, I also noticed the two patches below, but at least I have not
> > reproduced those issues on the Android common kernel 4.4. I only hit
> > the hang issue, and the first patch fixes it.
>
> OK but maybe by breaking something else that the other ones have to
> fix. That's my main concern in fact.


Yeah, I also want to check whether we need to backport the three extra
patches to the long-term support kernels.

> > > Also, the information you added to the commit message references a trace
> > > on a 4.4 kernel. Do you confirm that you got the same issue on 3.10 ?
> >
> > No, I can only confirm this on kernel 4.4. Only kernel 4.4 is
> > currently available on the board, and I verified that the mainline
> > kernel works well; that is why I could compare the two and find that
> > the first patch is the critical one.
>
> Given that 3.10 only has a few months left, if 3.10 isn't available on
> this hardware, do you really think we need to fix something in it that
> apparently nobody will be in a position to experience, at the risk of
> possibly adding some partial breakage ?
>
> I'm not opposed, really just asking.


Indeed, I have no requirement for the 3.10 kernel; Greg has taken the
patch for the 3.18/4.4/4.9 kernels, and he suggested that it also be
posted to the mailing list for kernel 3.10.

So for 3.10, it's okay with me to skip backporting this patch; or, if
you and Greg think we should backport the other 3 patches, that is
also fine. In the latter case, please let me know whether you want me
to follow up on it (I can do so a week from now, after my holiday).

Thanks,
Leo Yan
Willy Tarreau July 30, 2017, 8:29 p.m. | #5
Hi Leo,

On Sun, Jul 30, 2017 at 11:48:39PM +0800, Leo Yan wrote:
> > Given that 3.10 only has a few months left, if 3.10 isn't available on
> > this hardware, do you really think we need to fix something in it that
> > apparently nobody will be in a position to experience, at the risk of
> > possibly adding some partial breakage ?
> >
> > I'm not opposed, really just asking.
>
> Indeed, I have no requirement for the 3.10 kernel; Greg has taken the
> patch for the 3.18/4.4/4.9 kernels, and he suggested that it also be
> posted to the mailing list for kernel 3.10.
>
> So for 3.10, it's okay with me to skip backporting this patch; or, if
> you and Greg think we should backport the other 3 patches, that is
> also fine. In the latter case, please let me know whether you want me
> to follow up on it (I can do so a week from now, after my holiday).


Thanks, that's clearer now. I'm fine with taking the patches if you
are sure they're fine. There won't be many more 3.10 releases now (1
or 2 I think) so we won't have many chances to fix a late regression
if any. If you can double-check and let me know the preferred option,
I'll happily follow your advice. If it requires some extra work on
your side or you're unsure, better not change anything.

Thanks!
Willy
Greg Kroah-Hartman Aug. 4, 2017, 7:50 p.m. | #6
On Sun, Jul 30, 2017 at 11:48:39PM +0800, Leo Yan wrote:
> Hi Willy,
>
> On Fri, Jul 28, 2017 at 11:47:52PM +0200, Willy Tarreau wrote:
> > On Fri, Jul 28, 2017 at 02:52:15PM +0800, Leo Yan wrote:
> > > On Fri, Jul 28, 2017 at 06:25:55AM +0200, Willy Tarreau wrote:
> > > > Hi Leo,
> > > >
> > > > There was no upstream commit ID here but I found it in mainline here :
> > > >
> > > >   commit 109704492ef637956265ec2eb72ae7b3b39eb6f4
> > > >   Author: Joel Fernandes <joelaf@google.com>
> > > >   Date:   Thu Oct 20 00:34:00 2016 -0700
> > > >
> > > >     pstore: Make spinlock per zone instead of global
> > > >
> > > > What worries me is that some later fixes were issued, apparently to fix
> > > > an oops and a warning after this patch :
> > >
> > > Yes, I also noticed the two patches below, but at least I have not
> > > reproduced those issues on the Android common kernel 4.4. I only hit
> > > the hang issue, and the first patch fixes it.
> >
> > OK but maybe by breaking something else that the other ones have to
> > fix. That's my main concern in fact.
>
> Yeah, I also want to check whether we need to backport the three extra
> patches to the long-term support kernels.
>
> > > > Also, the information you added to the commit message references a trace
> > > > on a 4.4 kernel. Do you confirm that you got the same issue on 3.10 ?
> > >
> > > No, I can only confirm this on kernel 4.4. Only kernel 4.4 is
> > > currently available on the board, and I verified that the mainline
> > > kernel works well; that is why I could compare the two and find that
> > > the first patch is the critical one.
> >
> > Given that 3.10 only has a few months left, if 3.10 isn't available on
> > this hardware, do you really think we need to fix something in it that
> > apparently nobody will be in a position to experience, at the risk of
> > possibly adding some partial breakage ?
> >
> > I'm not opposed, really just asking.
>
> Indeed, I have no requirement for the 3.10 kernel; Greg has taken the
> patch for the 3.18/4.4/4.9 kernels, and he suggested that it also be
> posted to the mailing list for kernel 3.10.
>
> So for 3.10, it's okay with me to skip backporting this patch; or, if
> you and Greg think we should backport the other 3 patches, that is
> also fine. In the latter case, please let me know whether you want me
> to follow up on it (I can do so a week from now, after my holiday).


I'm going to take the other 3 as well for 3.18, 4.4, and 4.9.

thanks,

greg k-h

Patch

diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
index 7df456d..ac55707 100644
--- a/fs/pstore/ram_core.c
+++ b/fs/pstore/ram_core.c
@@ -45,8 +45,6 @@  static inline size_t buffer_start(struct persistent_ram_zone *prz)
 	return atomic_read(&prz->buffer->start);
 }
 
-static DEFINE_RAW_SPINLOCK(buffer_lock);
-
 /* increase and wrap the start pointer, returning the old value */
 static size_t buffer_start_add(struct persistent_ram_zone *prz, size_t a)
 {
@@ -54,7 +52,7 @@  static size_t buffer_start_add(struct persistent_ram_zone *prz, size_t a)
 	int new;
 	unsigned long flags;
 
-	raw_spin_lock_irqsave(&buffer_lock, flags);
+	raw_spin_lock_irqsave(&prz->buffer_lock, flags);
 
 	old = atomic_read(&prz->buffer->start);
 	new = old + a;
@@ -62,7 +60,7 @@  static size_t buffer_start_add(struct persistent_ram_zone *prz, size_t a)
 		new -= prz->buffer_size;
 	atomic_set(&prz->buffer->start, new);
 
-	raw_spin_unlock_irqrestore(&buffer_lock, flags);
+	raw_spin_unlock_irqrestore(&prz->buffer_lock, flags);
 
 	return old;
 }
@@ -74,7 +72,7 @@  static void buffer_size_add(struct persistent_ram_zone *prz, size_t a)
 	size_t new;
 	unsigned long flags;
 
-	raw_spin_lock_irqsave(&buffer_lock, flags);
+	raw_spin_lock_irqsave(&prz->buffer_lock, flags);
 
 	old = atomic_read(&prz->buffer->size);
 	if (old == prz->buffer_size)
@@ -86,7 +84,7 @@  static void buffer_size_add(struct persistent_ram_zone *prz, size_t a)
 	atomic_set(&prz->buffer->size, new);
 
 exit:
-	raw_spin_unlock_irqrestore(&buffer_lock, flags);
+	raw_spin_unlock_irqrestore(&prz->buffer_lock, flags);
 }
 
 static void notrace persistent_ram_encode_rs8(struct persistent_ram_zone *prz,
@@ -452,6 +450,7 @@  static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig,
 
 	prz->buffer->sig = sig;
 	persistent_ram_zap(prz);
+	prz->buffer_lock = __RAW_SPIN_LOCK_UNLOCKED(buffer_lock);
 
 	return 0;
 }
diff --git a/include/linux/pstore_ram.h b/include/linux/pstore_ram.h
index 4af3fdc..4bfcd43 100644
--- a/include/linux/pstore_ram.h
+++ b/include/linux/pstore_ram.h
@@ -39,6 +39,7 @@  struct persistent_ram_zone {
 	void *vaddr;
 	struct persistent_ram_buffer *buffer;
 	size_t buffer_size;
+	raw_spinlock_t buffer_lock;
 
 	/* ECC correction */
 	char *par_buffer;