Message ID | 1501207574-24958-1-git-send-email-leo.yan@linaro.org |
---|---|
State | New |
Hi Leo,

There was no upstream commit ID here but I found it in mainline here :

  commit 109704492ef637956265ec2eb72ae7b3b39eb6f4
  Author: Joel Fernandes <joelaf@google.com>
  Date:   Thu Oct 20 00:34:00 2016 -0700

      pstore: Make spinlock per zone instead of global

What worries me is that some later fixes were issued, apparently to fix
an oops and a warning after this patch :

  commit 76d5692a58031696e282384cbd893832bc92bd76
  Author: Kees Cook <keescook@chromium.org>
  Date:   Thu Feb 9 15:43:44 2017 -0800

      pstore: Correctly initialize spinlock and flags

      The ram backend wasn't always initializing its spinlock correctly. Since
      it was coming from kzalloc memory, though, it was harmless on
      architectures that initialize unlocked spinlocks to 0 (at least x86 and
      ARM). This also fixes a possibly ignored flag setting too.

and :

  commit e9a330c4289f2ba1ca4bf98c2b430ab165a8931b
  Author: Kees Cook <keescook@chromium.org>
  Date:   Sun Mar 5 22:08:58 2017 -0800

      pstore: Use dynamic spinlock initializer

      The per-prz spinlock should be using the dynamic initializer so that
      lockdep can correctly track it. Without this, under lockdep, we get a
      warning at boot that the lock is in non-static memory.

So I'm fine with merging this patch as long as Kees is OK with this and
we know what exact patch series needs to be merged.

Also, the information you added to the commit message references a trace
on a 4.4 kernel. Do you confirm that you got the same issue on 3.10 ?
I just prefer to avoid blindly backporting sensitive patches if they're not
absolutely needed.

> [ 65.103905] hrtimer: interrupt took 2759375 ns
> [ 65.108721] BUG: spinlock recursion on CPU#0, kschedfreq:0/1246
> [ 65.108760] lock: buffer_lock+0x0/0x38, .magic: dead4ead, .owner: kschedfreq:0/1246, .owner_cpu: 0
> [ 65.108779] CPU: 0 PID: 1246 Comm: kschedfreq:0 Not tainted 4.4.74-07294-g5c996a9-dirty #130

Thanks!
willy
On Fri, Jul 28, 2017 at 06:25:55AM +0200, Willy Tarreau wrote:
> Hi Leo,
>
> There was no upstream commit ID here but I found it in mainline here :
>
>   commit 109704492ef637956265ec2eb72ae7b3b39eb6f4
>   Author: Joel Fernandes <joelaf@google.com>
>   Date:   Thu Oct 20 00:34:00 2016 -0700
>
>       pstore: Make spinlock per zone instead of global
>
> What worries me is that some later fixes were issued, apparently to fix
> an oops and a warning after this patch :

Yes, below two patches I also notices. But at least I have not
reproduce them on Android common kernel 4.4. I only faced the hang
issue and the first patch just fixes it.

BTW, I tried to port the second and third patch, but seems the
second patch is dependency on one extra patch; so avoid to introduce
complexity to resolve issue, I just port the first one for fixing
issues.

  commit 663deb47880f2283809669563c5a52ac7c6aef1a
  Author: Joel Fernandes <joelaf@google.com>
  Date:   Thu Oct 20 00:34:01 2016 -0700

      pstore: Allow prz to control need for locking

      In preparation of not locking at all for certain buffers depending on if
      there's contention, make locking optional depending on the initialization
      of the prz.

>   commit 76d5692a58031696e282384cbd893832bc92bd76
>   Author: Kees Cook <keescook@chromium.org>
>   Date:   Thu Feb 9 15:43:44 2017 -0800
>
>       pstore: Correctly initialize spinlock and flags
>
>       The ram backend wasn't always initializing its spinlock correctly. Since
>       it was coming from kzalloc memory, though, it was harmless on
>       architectures that initialize unlocked spinlocks to 0 (at least x86 and
>       ARM). This also fixes a possibly ignored flag setting too.
>
> and :
>
>   commit e9a330c4289f2ba1ca4bf98c2b430ab165a8931b
>   Author: Kees Cook <keescook@chromium.org>
>   Date:   Sun Mar 5 22:08:58 2017 -0800
>
>       pstore: Use dynamic spinlock initializer
>
>       The per-prz spinlock should be using the dynamic initializer so that
>       lockdep can correctly track it. Without this, under lockdep, we get a
>       warning at boot that the lock is in non-static memory.
>
> So I'm fine with merging this patch as long as Kees is OK with this and
> we know what exact patch series needs to be merged.
>
> Also, the information you added to the commit message references a trace
> on a 4.4 kernel. Do you confirm that you got the same issue on 3.10 ?

No, I only can confirm this on kernel 4.4. Now only kernel 4.4 are
avaliable on the board, and I verified mainline kernel can work well;
so this is why I can check difference between them and find the first
patch is critical.

> I just prefer to avoid blindly backporting sensitive patches if they're not
> absolutely needed.
>
> > [ 65.103905] hrtimer: interrupt took 2759375 ns
> > [ 65.108721] BUG: spinlock recursion on CPU#0, kschedfreq:0/1246
> > [ 65.108760] lock: buffer_lock+0x0/0x38, .magic: dead4ead, .owner: kschedfreq:0/1246, .owner_cpu: 0
> > [ 65.108779] CPU: 0 PID: 1246 Comm: kschedfreq:0 Not tainted 4.4.74-07294-g5c996a9-dirty #130
>
> Thanks!
> willy
On Fri, Jul 28, 2017 at 02:52:15PM +0800, Leo Yan wrote:
> On Fri, Jul 28, 2017 at 06:25:55AM +0200, Willy Tarreau wrote:
> > Hi Leo,
> >
> > There was no upstream commit ID here but I found it in mainline here :
> >
> >   commit 109704492ef637956265ec2eb72ae7b3b39eb6f4
> >   Author: Joel Fernandes <joelaf@google.com>
> >   Date:   Thu Oct 20 00:34:00 2016 -0700
> >
> >       pstore: Make spinlock per zone instead of global
> >
> > What worries me is that some later fixes were issued, apparently to fix
> > an oops and a warning after this patch :
>
> Yes, below two patches I also notices. But at least I have not
> reproduce them on Android common kernel 4.4. I only faced the hang
> issue and the first patch just fixes it.

OK but maybe by breaking something else that the other ones have to
fix. That's my main concern in fact.

> BTW, I tried to port the second and third patch, but seems the
> second patch is dependency on one extra patch; so avoid to introduce
> complexity to resolve issue, I just port the first one for fixing
> issues.

OK.

> > Also, the information you added to the commit message references a trace
> > on a 4.4 kernel. Do you confirm that you got the same issue on 3.10 ?
>
> No, I only can confirm this on kernel 4.4. Now only kernel 4.4 are
> avaliable on the board, and I verified mainline kernel can work well;
> so this is why I can check difference between them and find the first
> patch is critical.

Given that 3.10 only has a few months left, if 3.10 isn't available on
this hardware, do you really think we need to fix something in it that
apparently nobody will be in situation to experience, at the risk of
possibly adding some partial breakage ?

I'm not opposed, really just asking.

Thanks,
Willy
Hi Willy,

On Fri, Jul 28, 2017 at 11:47:52PM +0200, Willy Tarreau wrote:
> On Fri, Jul 28, 2017 at 02:52:15PM +0800, Leo Yan wrote:
> > On Fri, Jul 28, 2017 at 06:25:55AM +0200, Willy Tarreau wrote:
> > > Hi Leo,
> > >
> > > There was no upstream commit ID here but I found it in mainline here :
> > >
> > >   commit 109704492ef637956265ec2eb72ae7b3b39eb6f4
> > >   Author: Joel Fernandes <joelaf@google.com>
> > >   Date:   Thu Oct 20 00:34:00 2016 -0700
> > >
> > >       pstore: Make spinlock per zone instead of global
> > >
> > > What worries me is that some later fixes were issued, apparently to fix
> > > an oops and a warning after this patch :
> >
> > Yes, below two patches I also notices. But at least I have not
> > reproduce them on Android common kernel 4.4. I only faced the hang
> > issue and the first patch just fixes it.
>
> OK but maybe by breaking something else that the other ones have to
> fix. That's my main concern in fact.

Yeah, I also want to check if we need back port another three extra
patches to long term support kernels.

> > > Also, the information you added to the commit message references a trace
> > > on a 4.4 kernel. Do you confirm that you got the same issue on 3.10 ?
> >
> > No, I only can confirm this on kernel 4.4. Now only kernel 4.4 are
> > avaliable on the board, and I verified mainline kernel can work well;
> > so this is why I can check difference between them and find the first
> > patch is critical.
>
> Given that 3.10 only has a few months left, if 3.10 isn't available on
> this hardware, do you really think we need to fix something in it that
> apparently nobody will be in situation to experience, at the risk of
> possibly adding some partial breakage ?
>
> I'm not opposed, really just asking.

Indeedly I have no requirement for 3.10 kernel; Greg has ported
patch to 3.18/4.4/4.9 kernels, so Greg suggested the patch can be
posted to mailing list for kernel 3.10.

So for 3.10, it's okay for me to ignore this patch backporting; or if
Greg and you think we should backport another 3 patches either is okay
for me. For later case, please let me know if me to follow this (I
can do this after one week later after holiday).

Thanks,
Leo Yan
Hi Leo,

On Sun, Jul 30, 2017 at 11:48:39PM +0800, Leo Yan wrote:
> > Given that 3.10 only has a few months left, if 3.10 isn't available on
> > this hardware, do you really think we need to fix something in it that
> > apparently nobody will be in situation to experience, at the risk of
> > possibly adding some partial breakage ?
> >
> > I'm not opposed, really just asking.
>
> Indeedly I have no requirement for 3.10 kernel; Greg has ported
> patch to 3.18/4.4/4.9 kernels, so Greg suggested the patch can be
> posted to mailing list for kernel 3.10.
>
> So for 3.10, it's okay for me to ignore this patch backporting; or if
> Greg and you think we should backport another 3 patches either is okay
> for me. For later case, please let me know if me to follow this (I
> can do this after one week later after holiday).

Thanks, that's clearer now. I'm fine with taking the patches if you are
sure they're fine. There won't be many more 3.10 now (1 or 2 I think) so
we won't have many chances to fix a late regression if any. If you can
double-check and let me know the preferred option, I'll happily follow
your advice. If it requires some extra work on your side or you're
unsure, better not change anything.

Thanks!
Willy
On Sun, Jul 30, 2017 at 11:48:39PM +0800, Leo Yan wrote:
> Hi Willy,
>
> On Fri, Jul 28, 2017 at 11:47:52PM +0200, Willy Tarreau wrote:
> > On Fri, Jul 28, 2017 at 02:52:15PM +0800, Leo Yan wrote:
> > > On Fri, Jul 28, 2017 at 06:25:55AM +0200, Willy Tarreau wrote:
> > > > Hi Leo,
> > > >
> > > > There was no upstream commit ID here but I found it in mainline here :
> > > >
> > > >   commit 109704492ef637956265ec2eb72ae7b3b39eb6f4
> > > >   Author: Joel Fernandes <joelaf@google.com>
> > > >   Date:   Thu Oct 20 00:34:00 2016 -0700
> > > >
> > > >       pstore: Make spinlock per zone instead of global
> > > >
> > > > What worries me is that some later fixes were issued, apparently to fix
> > > > an oops and a warning after this patch :
> > >
> > > Yes, below two patches I also notices. But at least I have not
> > > reproduce them on Android common kernel 4.4. I only faced the hang
> > > issue and the first patch just fixes it.
> >
> > OK but maybe by breaking something else that the other ones have to
> > fix. That's my main concern in fact.
>
> Yeah, I also want to check if we need back port another three extra
> patches to long term support kernels.
>
> > > > Also, the information you added to the commit message references a trace
> > > > on a 4.4 kernel. Do you confirm that you got the same issue on 3.10 ?
> > >
> > > No, I only can confirm this on kernel 4.4. Now only kernel 4.4 are
> > > avaliable on the board, and I verified mainline kernel can work well;
> > > so this is why I can check difference between them and find the first
> > > patch is critical.
> >
> > Given that 3.10 only has a few months left, if 3.10 isn't available on
> > this hardware, do you really think we need to fix something in it that
> > apparently nobody will be in situation to experience, at the risk of
> > possibly adding some partial breakage ?
> >
> > I'm not opposed, really just asking.
>
> Indeedly I have no requirement for 3.10 kernel; Greg has ported
> patch to 3.18/4.4/4.9 kernels, so Greg suggested the patch can be
> posted to mailing list for kernel 3.10.
>
> So for 3.10, it's okay for me to ignore this patch backporting; or if
> Greg and you think we should backport another 3 patches either is okay
> for me. For later case, please let me know if me to follow this (I
> can do this after one week later after holiday).

I'm going to take the other 3 as well for 3.18, 4.4, and 4.9.

thanks,

greg k-h
diff --git a/fs/pstore/ram_core.c b/fs/pstore/ram_core.c
index 7df456d..ac55707 100644
--- a/fs/pstore/ram_core.c
+++ b/fs/pstore/ram_core.c
@@ -45,8 +45,6 @@ static inline size_t buffer_start(struct persistent_ram_zone *prz)
 	return atomic_read(&prz->buffer->start);
 }
 
-static DEFINE_RAW_SPINLOCK(buffer_lock);
-
 /* increase and wrap the start pointer, returning the old value */
 static size_t buffer_start_add(struct persistent_ram_zone *prz, size_t a)
 {
@@ -54,7 +52,7 @@ static size_t buffer_start_add(struct persistent_ram_zone *prz, size_t a)
 	int new;
 	unsigned long flags;
 
-	raw_spin_lock_irqsave(&buffer_lock, flags);
+	raw_spin_lock_irqsave(&prz->buffer_lock, flags);
 
 	old = atomic_read(&prz->buffer->start);
 	new = old + a;
@@ -62,7 +60,7 @@ static size_t buffer_start_add(struct persistent_ram_zone *prz, size_t a)
 		new -= prz->buffer_size;
 	atomic_set(&prz->buffer->start, new);
 
-	raw_spin_unlock_irqrestore(&buffer_lock, flags);
+	raw_spin_unlock_irqrestore(&prz->buffer_lock, flags);
 
 	return old;
 }
@@ -74,7 +72,7 @@ static void buffer_size_add(struct persistent_ram_zone *prz, size_t a)
 	size_t new;
 	unsigned long flags;
 
-	raw_spin_lock_irqsave(&buffer_lock, flags);
+	raw_spin_lock_irqsave(&prz->buffer_lock, flags);
 
 	old = atomic_read(&prz->buffer->size);
 	if (old == prz->buffer_size)
@@ -86,7 +84,7 @@ static void buffer_size_add(struct persistent_ram_zone *prz, size_t a)
 	atomic_set(&prz->buffer->size, new);
 
 exit:
-	raw_spin_unlock_irqrestore(&buffer_lock, flags);
+	raw_spin_unlock_irqrestore(&prz->buffer_lock, flags);
 }
 
 static void notrace persistent_ram_encode_rs8(struct persistent_ram_zone *prz,
@@ -452,6 +450,7 @@ static int persistent_ram_post_init(struct persistent_ram_zone *prz, u32 sig,
 
 	prz->buffer->sig = sig;
 	persistent_ram_zap(prz);
+	prz->buffer_lock = __RAW_SPIN_LOCK_UNLOCKED(buffer_lock);
 
 	return 0;
 }
diff --git a/include/linux/pstore_ram.h b/include/linux/pstore_ram.h
index 4af3fdc..4bfcd43 100644
--- a/include/linux/pstore_ram.h
+++ b/include/linux/pstore_ram.h
@@ -39,6 +39,7 @@ struct persistent_ram_zone {
 	void *vaddr;
 	struct persistent_ram_buffer *buffer;
 	size_t buffer_size;
+	raw_spinlock_t buffer_lock;
 
 	/* ECC correction */
 	char *par_buffer;
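
For readers following the diff above, here is a minimal userspace sketch of what moving from one global lock to a per-zone lock changes. It is not kernel code and not the patch itself: `struct zone`, `struct model_lock`, `zone_start_add()` and `nested_write()` are hypothetical stand-ins, and the toy lock only models the same-context relock check behind "BUG: spinlock recursion". The nested call path is also an assumption, since the thread does not say how the lock came to be taken twice.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock: it only remembers whether it is held. That is enough to model
 * a relock from the same context; it gives no real mutual exclusion. */
struct model_lock {
	bool held;
};

static int model_lock_acquire(struct model_lock *l)
{
	if (l->held)
		return -EDEADLK;	/* stands in for "BUG: spinlock recursion" */
	l->held = true;
	return 0;
}

static void model_lock_release(struct model_lock *l)
{
	l->held = false;
}

/* Hypothetical stand-in for struct persistent_ram_zone. */
struct zone {
	size_t start;		/* models prz->buffer->start */
	size_t buffer_size;	/* models prz->buffer_size */
	struct model_lock lock;	/* models the per-zone prz->buffer_lock */
};

static void zone_init(struct zone *z, size_t buffer_size)
{
	assert(buffer_size > 0);
	z->start = 0;
	z->buffer_size = buffer_size;
	z->lock.held = false;	/* each zone sets up its own lock explicitly */
}

/* Advance and wrap the start offset while holding lock `l`.
 * Stores the previous offset in *old; returns 0 or -EDEADLK. */
static int zone_start_add(struct zone *z, struct model_lock *l,
			  size_t a, size_t *old)
{
	int ret = model_lock_acquire(l);

	if (ret)
		return ret;
	*old = z->start;
	z->start += a;
	while (z->start >= z->buffer_size)
		z->start -= z->buffer_size;
	model_lock_release(l);
	return 0;
}

/* Write 4 bytes to `inner` while `outer_lock` is already held.
 * The nesting is assumed for illustration only. */
static int nested_write(struct model_lock *outer_lock,
			struct zone *inner, struct model_lock *inner_lock)
{
	size_t old;
	int ret = model_lock_acquire(outer_lock);

	if (ret)
		return ret;
	ret = zone_start_add(inner, inner_lock, 4, &old);
	model_lock_release(outer_lock);
	return ret;
}
```

With one `struct model_lock` shared by two zones, `nested_write(&global, &b, &global)` returns `-EDEADLK` and leaves `b.start` untouched. With per-zone locks, `nested_write(&a.lock, &b, &b.lock)` returns 0 and advances `b.start` by 4, because the inner write no longer contends with the lock already held for the other zone.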