[next-20150119] regression (mm)?

Message ID CANMBJr4YOcHj2G7w-gwfoZjQQd=h0Mj59QNBo3ei_=ejYRcdnw@mail.gmail.com
State New

Commit Message

Tyler Baker Jan. 23, 2015, 10:42 p.m.
Hi Kirill,

On 23 January 2015 at 12:22, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> On Fri, Jan 23, 2015 at 12:37:06PM -0600, Nishanth Menon wrote:
>> On 09:39-20150123, Tyler Baker wrote:
>> > Hi,
>> >
>> > On 23 January 2015 at 09:27, Nishanth Menon <nm@ti.com> wrote:
>> > > On 16:05-20150120, Kirill A. Shutemov wrote:
>> > > [..]
>> > >> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> > >> Reported-by: Nishanth Menon <nm@ti.com>
>> > > Just to close on this thread:
>> > > https://github.com/nmenon/kernel-test-logs/tree/next-20150123 looks good
>> > > and back to old status. Thank you folks for all the help.
>> >
>> > I just reviewed the boot logs for next-20150123 and there still seems
>> > to be a related issue. I've been boot testing
>> > multi_v7_defconfig+CONFIG_ARM_LPAE=y kernel configurations which still
>> > seem broken.
>> >
>> > For example here are two boots with exynos5250-arndale, one with
>> > multi_v7_defconfig+CONFIG_ARM_LPAE=y [1] and the other with
>> > multi_v7_defconfig[2]. You can see the kernel configurations with
>> > CONFIG_ARM_LPAE=y show the splat:
>> >
>> > [   14.605950] ------------[ cut here ]------------
>> > [   14.609163] WARNING: CPU: 1 PID: 63 at ../mm/mmap.c:2858 exit_mmap+0x1b8/0x224()
>> > [   14.616548] Modules linked in:
>> > [   14.619553] CPU: 1 PID: 63 Comm: init Not tainted 3.19.0-rc5-next-20150123 #1
>> > [   14.626713] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
>> > [   14.632830] [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
>> > [   14.640473] [] (show_stack) from [] (dump_stack+0x78/0x94)
>> > [   14.647678] [] (dump_stack) from [] (warn_slowpath_common+0x74/0xb0)
>> > [   14.655744] [] (warn_slowpath_common) from [] (warn_slowpath_null+0x1c/0x24)
>> > [   14.664510] [] (warn_slowpath_null) from [] (exit_mmap+0x1b8/0x224)
>> > [   14.672497] [] (exit_mmap) from [] (mmput+0x40/0xf8)
>> > [   14.679180] [] (mmput) from [] (flush_old_exec+0x328/0x604)
>> > [   14.686471] [] (flush_old_exec) from [] (load_elf_binary+0x26c/0x11f4)
>> > [   14.694715] [] (load_elf_binary) from [] (search_binary_handler+0x98/0x244)
>> > [   14.703395] [] (search_binary_handler) from [] (do_execveat_common+0x4dc/0x5bc)
>> > [   14.712421] [] (do_execveat_common) from [] (do_execve+0x28/0x30)
>> > [   14.720235] [] (do_execve) from [] (ret_fast_syscall+0x0/0x34)
>> > [   14.727782] ---[ end trace 5e3ca48b454c7e0a ]---
>> > [   14.733758] ------------[ cut here ]------------
>> >
>> > Has anyone else tested with CONFIG_ARM_LPAE=y that can confirm my findings?
>> Uggh... I missed it since I was looking at non-LPAE omap2plus_defconfig.
>>
>> Dual A15 OMAP5432 with multi_v7_defconfig + CONFIG_ARM_LPAE=y
>> https://github.com/nmenon/kernel-test-logs/blob/next-20150123/multi_lpae_defconfig/omap5-evm.txt
>>
>> Dual A15 DRA7/AM572x with same configuration as above.
>> https://raw.githubusercontent.com/nmenon/kernel-test-logs/next-20150123/multi_lpae_defconfig/dra7xx-evm.txt
>> https://github.com/nmenon/kernel-test-logs/blob/next-20150123/multi_lpae_defconfig/am57xx-evm.txt
>>
>> Single A15 DRA72 with same configuration as above:
>> https://raw.githubusercontent.com/nmenon/kernel-test-logs/next-20150123/multi_lpae_defconfig/dra72x-evm.txt
>>
>> You are right. The issue reappears with LPAE on :(
>> Apologies for missing that.
>
> Guys, could you instrument mm_{inc,dec}_nr_pmds() with dump_stack() +
> printk() of the counter and add printk() on exit_mmap() then run a simple
> program which triggers the issue?

For reference, here is the patch I've applied for testing, mostly
stolen from Felipe's debug patch above in this thread.


I applied this patch to the tip of linux-next, configured for
multi_v7_defconfig and set CONFIG_ARM_LPAE=y. The log for this arndale
boot can be found here [1]. For good measure, I then rebuilt the
kernel with CONFIG_ARM_LPAE=n and booted the same platform again. This
log can be found here [2].

Happy hunting!

>
> --
>  Kirill A. Shutemov

[1] http://storage.kernelci.org/debug/mm/arndale-lpae-debug-next-20150123.html
[2] http://storage.kernelci.org/debug/mm/arndale-no-lpae-debug-next-20150123.html

Cheers,

Tyler
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Tyler Baker Jan. 26, 2015, 10:38 p.m. | #1
On 26 January 2015 at 04:00, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> On Fri, Jan 23, 2015 at 10:37:46PM -0600, Nishanth Menon wrote:
>> On 03:13-20150124, Kirill A. Shutemov wrote:
>> > > >> On 09:39-20150123, Tyler Baker wrote:
>> [...]
>> > > >> > I just reviewed the boot logs for next-20150123 and there still seems
>> > > >> > to be a related issue. I've been boot testing
>> > > >> > multi_v7_defconfig+CONFIG_ARM_LPAE=y kernel configurations which still
>> > > >> > seem broken.
>> [...]
>> > Okay, proof of concept patch is below. It's going to break every other
>> > architecture with FIRST_USER_ADDRESS != 0, but I think it's a cleaner way to
>> > go.
>>
>> Testing on my end:
>>
>> just ran through this set (+ logs similar to Tyler's from my side):
>>
>> next-20150123 (multi_v7_defconfig == !LPAE)
>>  1:    BeagleBoard-X15(am57xx-evm): BOOT: PASS: http://paste.ubuntu.org.cn/2219449
>>  2:                     dra72x-evm: BOOT: PASS: http://paste.ubuntu.org.cn/2219450
>>  3:                     dra7xx-evm: BOOT: PASS: http://paste.ubuntu.org.cn/2219451
>>  4:                      omap5-evm: BOOT: PASS: http://paste.ubuntu.org.cn/2219452
>> TOTAL = 4 boards, Booted Boards = 4, No Boot boards = 0
>>
>> next-20150123-LPAE-Logging enabled[1] (multi_v7_defconfig +LPAE)
>>  1:    BeagleBoard-X15(am57xx-evm): BOOT: FAIL: http://paste.ubuntu.org.cn/2220938
>>  2:                     dra72x-evm: BOOT: FAIL: http://paste.ubuntu.org.cn/2220943
>>  3:                     dra7xx-evm: BOOT: FAIL: http://paste.ubuntu.org.cn/2220947
>>  4:                      omap5-evm: BOOT: FAIL: http://paste.ubuntu.org.cn/2220955
>> TOTAL = 4 boards, Booted Boards = 0, No Boot boards = 4
>>
>> next-20150123-LPAE-new-patch [2] (multi_v7_defconfig + LPAE)
>>  1:    BeagleBoard-X15(am57xx-evm): BOOT: PASS: http://paste.ubuntu.org.cn/2221047
>>  2:                     dra72x-evm: BOOT: PASS: http://paste.ubuntu.org.cn/2221065
>>  3:                     dra7xx-evm: BOOT: PASS: http://paste.ubuntu.org.cn/2221069
>>  4:                      omap5-evm: BOOT: PASS: http://paste.ubuntu.org.cn/2221070
>> TOTAL = 4 boards, Booted Boards = 4, No Boot boards = 0
>>
>> next-20150123-new-patch[2] (multi_v7_defconfig == !LPAE)
>>  1:                     am335x-evm: BOOT: PASS: http://paste.ubuntu.org.cn/2221277
>>  2:                      am335x-sk: BOOT: PASS: http://paste.ubuntu.org.cn/2221278
>>  3:                      am437x-sk: BOOT: FAIL: http://paste.ubuntu.org.cn/2221279 (unrelated)
>>  4:                    am43xx-epos: BOOT: PASS: http://paste.ubuntu.org.cn/2221280
>>  5:                   am43xx-gpevm: BOOT: PASS: http://paste.ubuntu.org.cn/2221281
>>  6:    BeagleBoard-X15(am57xx-evm): BOOT: PASS: http://paste.ubuntu.org.cn/2221282
>>  7:                 BeagleBoard-XM: BOOT: FAIL: http://paste.ubuntu.org.cn/2221283 (unrelated)
>>  8:            beagleboard-vanilla: BOOT: PASS: http://paste.ubuntu.org.cn/2221284
>>  9:               beaglebone-black: BOOT: PASS: http://paste.ubuntu.org.cn/2221285
>> 10:                     beaglebone: BOOT: FAIL: http://paste.ubuntu.org.cn/2221286 (unrelated)
>> 11:                     dra72x-evm: BOOT: PASS: http://paste.ubuntu.org.cn/2221287
>> 12:                     dra7xx-evm: BOOT: PASS: http://paste.ubuntu.org.cn/2221288
>> 13:                      omap5-evm: BOOT: PASS: http://paste.ubuntu.org.cn/2221289
>> 14:                  pandaboard-es: BOOT: PASS: http://paste.ubuntu.org.cn/2221290
>> 15:             pandaboard-vanilla: BOOT: PASS: http://paste.ubuntu.org.cn/2221291
>> 16:                        sdp4430: BOOT: PASS: http://paste.ubuntu.org.cn/2221292
>> TOTAL = 16 boards, Booted Boards = 13, No Boot boards = 3
>>
>> next-20150123-new-patch[2] (omap2plus_defconfig)
>>  1:                     am335x-evm: BOOT: PASS: http://paste.ubuntu.org.cn/2221653
>>  2:                      am335x-sk: BOOT: PASS: http://paste.ubuntu.org.cn/2221654
>>  3:                      am437x-sk: BOOT: PASS: http://paste.ubuntu.org.cn/2221656
>>  4:                    am43xx-epos: BOOT: PASS: http://paste.ubuntu.org.cn/2221659
>>  5:                   am43xx-gpevm: BOOT: PASS: http://paste.ubuntu.org.cn/2221660
>>  6:    BeagleBoard-X15(am57xx-evm): BOOT: PASS: http://paste.ubuntu.org.cn/2221661
>>  7:                 BeagleBoard-XM: BOOT: PASS: http://paste.ubuntu.org.cn/2221670
>>  8:            beagleboard-vanilla: BOOT: PASS: http://paste.ubuntu.org.cn/2221676
>>  9:               beaglebone-black: BOOT: PASS: http://paste.ubuntu.org.cn/2221683
>> 10:                     beaglebone: BOOT: PASS: http://paste.ubuntu.org.cn/2221690
>> 11:                     dra72x-evm: BOOT: PASS: http://paste.ubuntu.org.cn/2221692
>> 12:                     dra7xx-evm: BOOT: PASS: http://paste.ubuntu.org.cn/2221695
>> 13:                      omap5-evm: BOOT: PASS: http://paste.ubuntu.org.cn/2221700
>> 14:                  pandaboard-es: BOOT: PASS: http://paste.ubuntu.org.cn/2221704
>> 15:             pandaboard-vanilla: BOOT: PASS: http://paste.ubuntu.org.cn/2221707
>> 16:                        sdp4430: BOOT: PASS: http://paste.ubuntu.org.cn/2221713
>> TOTAL = 16 boards, Booted Boards = 16, No Boot boards = 0
>
> Okay, thanks. Here's the proper patch.
>
> From 8f9845ab8d972164b700ff3e3ce53484cceb942b Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Mon, 26 Jan 2015 12:07:54 +0200
> Subject: [PATCH 1/2] mm: fix false-positive warning on exit due mm_nr_pmds(mm)
>
> The problem is that we check nr_ptes/nr_pmds in exit_mmap(), which happens
> *before* pgd_free(). If an arch allocates pte/pmd tables in pgd_alloc()
> and frees them in pgd_free(), the counters are still offset at the time
> of the checks.
>
> We tried to work around this by offsetting the expected counter value
> according to FIRST_USER_ADDRESS for both nr_ptes and nr_pmds in
> exit_mmap(). But it doesn't work in some cases:
>
> 1. ARM with LPAE enabled also has non-zero USER_PGTABLES_CEILING, but
>    the upper addresses are occupied by huge pmd entries, so the trick of
>    offsetting the expected counter value gets really ugly: we would have
>    to apply it to nr_pmds, but not to nr_ptes.
>
> 2. Metag has non-zero FIRST_USER_ADDRESS, but doesn't allocate pte/pmd
>    page tables in pgd_alloc(); it just sets up a pgd entry which is
>    allocated at boot and shared across all processes.
>
> The proposal is to move the check to check_mm() which happens *after*
> pgd_free() and do proper accounting during pgd_alloc() and pgd_free()
> which would bring counters to zero if nothing leaked.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

I've tested this patch on top of linux-next [1] on a wide array of
arm, arm64 and x86 hardware. I can confirm the issue with
CONFIG_ARM_LPAE=y has been resolved with no additional regressions
detected. The results can be found here [2].

Feel free to add:

Tested-by: Tyler Baker <tyler.baker@linaro.org>

> Reported-by: Tyler Baker <tyler.baker@linaro.org>
> Tested-by: Nishanth Menon <nm@ti.com>
> Cc: Russell King <linux@arm.linux.org.uk>
> Cc: James Hogan <james.hogan@imgtec.com>
> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
> ---
>  arch/arm/mm/pgd.c       | 4 ++++
>  arch/unicore32/mm/pgd.c | 3 +++
>  kernel/fork.c           | 8 ++++++++
>  mm/mmap.c               | 5 -----
>  4 files changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm/mm/pgd.c b/arch/arm/mm/pgd.c
> index 249379535be2..a3681f11dd9f 100644
> --- a/arch/arm/mm/pgd.c
> +++ b/arch/arm/mm/pgd.c
> @@ -97,6 +97,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
>
>  no_pte:
>         pmd_free(mm, new_pmd);
> +       mm_dec_nr_pmds(mm);
>  no_pmd:
>         pud_free(mm, new_pud);
>  no_pud:
> @@ -130,9 +131,11 @@ void pgd_free(struct mm_struct *mm, pgd_t *pgd_base)
>         pte = pmd_pgtable(*pmd);
>         pmd_clear(pmd);
>         pte_free(mm, pte);
> +       atomic_long_dec(&mm->nr_ptes);
>  no_pmd:
>         pud_clear(pud);
>         pmd_free(mm, pmd);
> +       mm_dec_nr_pmds(mm);
>  no_pud:
>         pgd_clear(pgd);
>         pud_free(mm, pud);
> @@ -152,6 +155,7 @@ no_pgd:
>                 pmd = pmd_offset(pud, 0);
>                 pud_clear(pud);
>                 pmd_free(mm, pmd);
> +               mm_dec_nr_pmds(mm);
>                 pgd_clear(pgd);
>                 pud_free(mm, pud);
>         }
> diff --git a/arch/unicore32/mm/pgd.c b/arch/unicore32/mm/pgd.c
> index 08b8d4295e70..1bc00d0305d4 100644
> --- a/arch/unicore32/mm/pgd.c
> +++ b/arch/unicore32/mm/pgd.c
> @@ -69,6 +69,7 @@ pgd_t *get_pgd_slow(struct mm_struct *mm)
>
>  no_pte:
>         pmd_free(mm, new_pmd);
> +       mm_dec_nr_pmds(mm);
>  no_pmd:
>         free_pages((unsigned long)new_pgd, 0);
>  no_pgd:
> @@ -96,7 +97,9 @@ void free_pgd_slow(struct mm_struct *mm, pgd_t *pgd)
>         pte = pmd_pgtable(*pmd);
>         pmd_clear(pmd);
>         pte_free(mm, pte);
> +       atomic_long_dec(&mm->nr_ptes);
>         pmd_free(mm, pmd);
> +       mm_dec_nr_pmds(mm);
>  free:
>         free_pages((unsigned long) pgd, 0);
>  }
> diff --git a/kernel/fork.c b/kernel/fork.c
> index c99098c52641..76d6f292274c 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -606,6 +606,14 @@ static void check_mm(struct mm_struct *mm)
>                         printk(KERN_ALERT "BUG: Bad rss-counter state "
>                                           "mm:%p idx:%d val:%ld\n", mm, i, x);
>         }
> +
> +       if (atomic_long_read(&mm->nr_ptes))
> +               pr_alert("BUG: non-zero nr_ptes on freeing mm: %ld\n",
> +                               atomic_long_read(&mm->nr_ptes));
> +       if (mm_nr_pmds(mm))
> +               pr_alert("BUG: non-zero nr_pmds on freeing mm: %ld\n",
> +                               mm_nr_pmds(mm));
> +
>  #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
>         VM_BUG_ON_MM(mm->pmd_huge_pte, mm);
>  #endif
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 6a7d36d133fb..c5f44682c0d1 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2851,11 +2851,6 @@ void exit_mmap(struct mm_struct *mm)
>                 vma = remove_vma(vma);
>         }
>         vm_unacct_memory(nr_accounted);
> -
> -       WARN_ON(atomic_long_read(&mm->nr_ptes) >
> -                       round_up(FIRST_USER_ADDRESS, PMD_SIZE) >> PMD_SHIFT);
> -       WARN_ON(mm_nr_pmds(mm) >
> -                       round_up(FIRST_USER_ADDRESS, PUD_SIZE) >> PUD_SHIFT);
>  }
>
>  /* Insert vm structure into process list sorted by address
> --
>  Kirill A. Shutemov
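
For anyone following along, the ordering problem described in the commit message above can be pictured with a small userspace model. This is only a sketch and not kernel code: struct mm_model and every function below are made-up names, the counts passed to model_pgd_alloc() are arbitrary, and old_allowance() simply restates the round_up(...) >> shift expression from the removed WARN_ON() lines.

```c
#include <stdbool.h>

/* Userspace model only; the fields echo mm->nr_ptes and mm_nr_pmds(mm). */
struct mm_model {
	long nr_ptes;
	long nr_pmds;
};

/* Allowance of the removed WARN_ON(): round_up(first_user, size) >> shift. */
static long old_allowance(unsigned long first_user, unsigned int shift)
{
	unsigned long size = 1UL << shift;

	return (long)(((first_user + size - 1) & ~(size - 1)) >> shift);
}

/* Hypothetical arch pgd_alloc(): accounts the tables it pre-allocates. */
static void model_pgd_alloc(struct mm_model *mm, long ptes, long pmds)
{
	mm->nr_ptes += ptes;
	mm->nr_pmds += pmds;
}

/* Patched pgd_free(): drops exactly what pgd_alloc() accounted. */
static void model_pgd_free(struct mm_model *mm, long ptes, long pmds)
{
	mm->nr_ptes -= ptes;
	mm->nr_pmds -= pmds;
}

/* Old check: runs at exit_mmap() time, i.e. before pgd_free(). */
static bool old_check_warns(const struct mm_model *mm, unsigned long first_user,
			    unsigned int pmd_shift, unsigned int pud_shift)
{
	return mm->nr_ptes > old_allowance(first_user, pmd_shift) ||
	       mm->nr_pmds > old_allowance(first_user, pud_shift);
}

/* New check: runs in check_mm(), after pgd_free(); non-zero means a leak. */
static bool new_check_reports(const struct mm_model *mm)
{
	return mm->nr_ptes != 0 || mm->nr_pmds != 0;
}
```

With illustrative values (first_user = 0x2000, pmd_shift = 21, pud_shift = 30) the old allowance is one table at each level, so a pgd_alloc() that accounts one pte table and two pmd tables trips old_check_warns() although nothing leaked, while new_check_reports() stays quiet once model_pgd_free() has run.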

[1] https://git.linaro.org/people/tyler.baker/linux-next.git/shortlog/refs/heads/next-testing
[2] http://kernelci.org/boot/all/job/tbaker/kernel/v3.19-rc5-5174-g384ba8a33c70/

Thanks,

Tyler

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1fbd0e8..e5b0444 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1455,11 +1455,17 @@  static inline unsigned long mm_nr_pmds(struct mm_struct *mm)
 static inline void mm_inc_nr_pmds(struct mm_struct *mm)
 {
        atomic_long_inc(&mm->nr_pmds);
+        dump_stack();
+        printk(KERN_INFO "===> %s nr_pmds %ld\n", __func__,
+                atomic_long_read(&mm->nr_pmds));
 }

 static inline void mm_dec_nr_pmds(struct mm_struct *mm)
 {
        atomic_long_dec(&mm->nr_pmds);
+        dump_stack();
+        printk(KERN_INFO "===> %s nr_pmds %ld\n", __func__,
+                atomic_long_read(&mm->nr_pmds));
 }
 #endif

diff --git a/mm/mmap.c b/mm/mmap.c
index 6a7d36d..a16471f 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2809,6 +2809,7 @@  EXPORT_SYMBOL(vm_brk);
 /* Release all mmaps. */
 void exit_mmap(struct mm_struct *mm)
 {
+       printk(KERN_INFO "===> %s exit_mmap enter\n", __func__);
        struct mmu_gather tlb;
        struct vm_area_struct *vma;
        unsigned long nr_accounted = 0;
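
One way to get a first impression from a log captured with the debug patch above is to count the increment and decrement lines it emits. The helper below is a userspace sketch and was not posted in this thread: struct pmd_tally, count_occurrences() and tally_pmd_events() are made-up names, and the matched strings assume the "===> %s nr_pmds %ld" format from the patch, with __func__ expanding to the enclosing function name. Because nr_pmds is per-mm and the log interleaves all processes, a global tally is only a rough hint, not proof of a leak.

```c
#include <string.h>

/* Counts of instrumented events found in a captured kernel log. */
struct pmd_tally {
	long incs;	/* lines printed by mm_inc_nr_pmds() */
	long decs;	/* lines printed by mm_dec_nr_pmds() */
};

/* Count non-overlapping occurrences of needle in text. */
static long count_occurrences(const char *text, const char *needle)
{
	long n = 0;
	size_t len = strlen(needle);
	const char *p;

	for (p = strstr(text, needle); p != NULL; p = strstr(p + len, needle))
		n++;
	return n;
}

/* Tally the debug lines for both instrumented helpers. */
static struct pmd_tally tally_pmd_events(const char *log)
{
	struct pmd_tally t;

	t.incs = count_occurrences(log, "===> mm_inc_nr_pmds nr_pmds ");
	t.decs = count_occurrences(log, "===> mm_dec_nr_pmds nr_pmds ");
	return t;
}
```

Feeding it the text of a boot log as one string gives the two totals; for a closer look the lines would have to be grouped per process using the dump_stack() output that precedes each of them.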