krzysztof/for-next bisection: baseline.dmesg.crit on bcm2837-rpi-3-b

Message ID 5ef21053.1c69fb81.b80ec.8649@mx.google.com
State New
Headers show
Series
  • krzysztof/for-next bisection: baseline.dmesg.crit on bcm2837-rpi-3-b
Related show

Commit Message

kernelci.org bot June 23, 2020, 2:23 p.m.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* This automated bisection report was sent to you on the basis  *
* that you may be involved with the breaking commit it has      *
* found.  No manual investigation has been done to verify it,   *
* and the root cause of the problem may be somewhere else.      *
*                                                               *
* If you do send a fix, please include this trailer:            *
*   Reported-by: "kernelci.org bot" <bot@kernelci.org>          *
*                                                               *
* Hope this helps!                                              *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

krzysztof/for-next bisection: baseline.dmesg.crit on bcm2837-rpi-3-b

Summary:
  Start:      d6fe116541b7 Merge branch 'next/soc' into for-next
  Plain log:  https://storage.kernelci.org/krzysztof/for-next/v5.8-rc1-14-gd6fe116541b7/arm64/defconfig+CONFIG_RANDOMIZE_BASE=y/gcc-8/lab-baylibre/baseline-bcm2837-rpi-3-b.txt
  HTML log:   https://storage.kernelci.org/krzysztof/for-next/v5.8-rc1-14-gd6fe116541b7/arm64/defconfig+CONFIG_RANDOMIZE_BASE=y/gcc-8/lab-baylibre/baseline-bcm2837-rpi-3-b.html
  Result:     5b17a04addc2 ARM: exynos: clear L310_AUX_CTRL_FULL_LINE_ZERO in default l2c_aux_val

Checks:
  revert:     PASS
  verify:     PASS

Parameters:
  Tree:       krzysztof
  URL:        https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git
  Branch:     for-next
  Target:     bcm2837-rpi-3-b
  CPU arch:   arm64
  Lab:        lab-baylibre
  Compiler:   gcc-8
  Config:     defconfig+CONFIG_RANDOMIZE_BASE=y
  Test case:  baseline.dmesg.crit

Breaking commit found:

-------------------------------------------------------------------------------
commit 5b17a04addc29201dc142c8d2c077eb7745d2e35
Author: Guillaume Tucker <guillaume.tucker@collabora.com>
Date:   Fri Jun 12 14:58:37 2020 +0100

    ARM: exynos: clear L310_AUX_CTRL_FULL_LINE_ZERO in default l2c_aux_val
    
    This "alert" error message can be seen on exynos4412-odroidx2:
    
        L2C: platform modifies aux control register: 0x02070000 -> 0x3e470001
        L2C: platform provided aux values permit register corruption.
    
    Followed by this plain error message:
    
        L2C-310: enabling full line of zeros but not enabled in Cortex-A9
    
    To fix it, don't set the L310_AUX_CTRL_FULL_LINE_ZERO flag (bit 0) in
    the default value of l2c_aux_val.  It may instead be enabled when
    applicable by the logic in l2c310_enable() if the attribute
    "arm,full-line-zero-disable" was set in the device tree.
    
    The initial commit that introduced this default value was in v2.6.38
    commit 1cf0eb799759 ("ARM: S5PV310: Add L2 cache init function in
    cpu.c").
    
    However, the code to set the L310_AUX_CTRL_FULL_LINE_ZERO flag and
    manage that feature was added much later and the default value was not
    updated then.  So this seems to have been a subtle oversight
    especially since enabling it only in the cache and not in the A9 core
    doesn't actually prevent the platform from running.  According to the
    TRM, the opposite would be a real issue, if the feature was enabled in
    the A9 core but not in the cache controller.
    
    Reported-by: "kernelci.org bot" <bot@kernelci.org>
    Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com>

    Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>


-------------------------------------------------------------------------------


Git bisection log:

-------------------------------------------------------------------------------
git bisect start
# good: [b0953d8b7cdb39493e67cff4b20b0ebe3a2bba92] Merge branch 'next/drivers' into for-next
git bisect good b0953d8b7cdb39493e67cff4b20b0ebe3a2bba92
# bad: [d6fe116541b73a56110310c39a270c99766cd909] Merge branch 'next/soc' into for-next
git bisect bad d6fe116541b73a56110310c39a270c99766cd909
# bad: [5b17a04addc29201dc142c8d2c077eb7745d2e35] ARM: exynos: clear L310_AUX_CTRL_FULL_LINE_ZERO in default l2c_aux_val
git bisect bad 5b17a04addc29201dc142c8d2c077eb7745d2e35
# first bad commit: [5b17a04addc29201dc142c8d2c077eb7745d2e35] ARM: exynos: clear L310_AUX_CTRL_FULL_LINE_ZERO in default l2c_aux_val
-------------------------------------------------------------------------------

Comments

Krzysztof Kozlowski June 23, 2020, 2:36 p.m. | #1
On Tue, 23 Jun 2020 at 16:23, kernelci.org bot <bot@kernelci.org> wrote:
>

> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

> * This automated bisection report was sent to you on the basis  *

> * that you may be involved with the breaking commit it has      *

> * found.  No manual investigation has been done to verify it,   *

> * and the root cause of the problem may be somewhere else.      *

> *                                                               *

> * If you do send a fix, please include this trailer:            *

> *   Reported-by: "kernelci.org bot" <bot@kernelci.org>          *

> *                                                               *

> * Hope this helps!                                              *

> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

>

> krzysztof/for-next bisection: baseline.dmesg.crit on bcm2837-rpi-3-b

>

> Summary:

>   Start:      d6fe116541b7 Merge branch 'next/soc' into for-next

>   Plain log:  https://storage.kernelci.org/krzysztof/for-next/v5.8-rc1-14-gd6fe116541b7/arm64/defconfig+CONFIG_RANDOMIZE_BASE=y/gcc-8/lab-baylibre/baseline-bcm2837-rpi-3-b.txt

>   HTML log:   https://storage.kernelci.org/krzysztof/for-next/v5.8-rc1-14-gd6fe116541b7/arm64/defconfig+CONFIG_RANDOMIZE_BASE=y/gcc-8/lab-baylibre/baseline-bcm2837-rpi-3-b.html

>   Result:     5b17a04addc2 ARM: exynos: clear L310_AUX_CTRL_FULL_LINE_ZERO in default l2c_aux_val

>

> Checks:

>   revert:     PASS

>   verify:     PASS

>

> Parameters:

>   Tree:       krzysztof

>   URL:        https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git

>   Branch:     for-next

>   Target:     bcm2837-rpi-3-b


Looks like false positive. The commit does not touch anything in arm64/bcm.

Best regards,
Krzysztof
Guillaume Tucker June 23, 2020, 3:19 p.m. | #2
On 23/06/2020 15:23, kernelci.org bot wrote:
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

> * This automated bisection report was sent to you on the basis  *

> * that you may be involved with the breaking commit it has      *

> * found.  No manual investigation has been done to verify it,   *

> * and the root cause of the problem may be somewhere else.      *

> *                                                               *

> * If you do send a fix, please include this trailer:            *

> *   Reported-by: "kernelci.org bot" <bot@kernelci.org>          *

> *                                                               *

> * Hope this helps!                                              *

> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

> 

> krzysztof/for-next bisection: baseline.dmesg.crit on bcm2837-rpi-3-b

> 

> Summary:

>   Start:      d6fe116541b7 Merge branch 'next/soc' into for-next

>   Plain log:  https://storage.kernelci.org/krzysztof/for-next/v5.8-rc1-14-gd6fe116541b7/arm64/defconfig+CONFIG_RANDOMIZE_BASE=y/gcc-8/lab-baylibre/baseline-bcm2837-rpi-3-b.txt

>   HTML log:   https://storage.kernelci.org/krzysztof/for-next/v5.8-rc1-14-gd6fe116541b7/arm64/defconfig+CONFIG_RANDOMIZE_BASE=y/gcc-8/lab-baylibre/baseline-bcm2837-rpi-3-b.html

>   Result:     5b17a04addc2 ARM: exynos: clear L310_AUX_CTRL_FULL_LINE_ZERO in default l2c_aux_val

> 

> Checks:

>   revert:     PASS

>   verify:     PASS

> 

> Parameters:

>   Tree:       krzysztof

>   URL:        https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git

>   Branch:     for-next

>   Target:     bcm2837-rpi-3-b

>   CPU arch:   arm64

>   Lab:        lab-baylibre

>   Compiler:   gcc-8

>   Config:     defconfig+CONFIG_RANDOMIZE_BASE=y

>   Test case:  baseline.dmesg.crit


The "crit" kernel message is:

[   17.566555] hwmon hwmon1: Undervoltage detected!

which does not seem to have anything to do with the patch found
by the bisection.  Also, the bcm2837-rpi-3-b uses Cortex-A53
cores and no L2C-310 cache.

This undervoltage issue is actually an intermittent issue that
was already present before.  See next-20200616:

  https://kernelci.org/test/case/id/5ee880c10e8d4cd38797bf52/
  https://storage.kernelci.org/next/master/next-20200616/arm64/defconfig/gcc-8/lab-baylibre/baseline-bcm2837-rpi-3-b.html#L708

I'll still take a closer look to be sure this is actually noise.
The same revision built without CONFIG_RAMDOMIZE_BASE=y passed
fine, although I don't see how this could be related:

  https://kernelci.org/test/plan/id/5ef1ccb2d9df2557d597bf20/

Maybe the rpi-3-b could get an undervoltage depending on the
address where the kernel was loaded, and somehow my patch would
make this more likely?  It sounds so far-fetched...

This is so ironic - after 6 months with no false positives in
kernelci bisections, and this rpi-3-b issue too random to ever
cause a bisection to succeed, I get this report which landed a
commit that I made, one week after enabling public bisection
email reports again.  It must be trying to tell me something :)

Guillaume


> Breaking commit found:

> 

> -------------------------------------------------------------------------------

> commit 5b17a04addc29201dc142c8d2c077eb7745d2e35

> Author: Guillaume Tucker <guillaume.tucker@collabora.com>

> Date:   Fri Jun 12 14:58:37 2020 +0100

> 

>     ARM: exynos: clear L310_AUX_CTRL_FULL_LINE_ZERO in default l2c_aux_val

>     

>     This "alert" error message can be seen on exynos4412-odroidx2:

>     

>         L2C: platform modifies aux control register: 0x02070000 -> 0x3e470001

>         L2C: platform provided aux values permit register corruption.

>     

>     Followed by this plain error message:

>     

>         L2C-310: enabling full line of zeros but not enabled in Cortex-A9

>     

>     To fix it, don't set the L310_AUX_CTRL_FULL_LINE_ZERO flag (bit 0) in

>     the default value of l2c_aux_val.  It may instead be enabled when

>     applicable by the logic in l2c310_enable() if the attribute

>     "arm,full-line-zero-disable" was set in the device tree.

>     

>     The initial commit that introduced this default value was in v2.6.38

>     commit 1cf0eb799759 ("ARM: S5PV310: Add L2 cache init function in

>     cpu.c").

>     

>     However, the code to set the L310_AUX_CTRL_FULL_LINE_ZERO flag and

>     manage that feature was added much later and the default value was not

>     updated then.  So this seems to have been a subtle oversight

>     especially since enabling it only in the cache and not in the A9 core

>     doesn't actually prevent the platform from running.  According to the

>     TRM, the opposite would be a real issue, if the feature was enabled in

>     the A9 core but not in the cache controller.

>     

>     Reported-by: "kernelci.org bot" <bot@kernelci.org>

>     Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com>

>     Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>

> 

> diff --git a/arch/arm/mach-exynos/exynos.c b/arch/arm/mach-exynos/exynos.c

> index 7a8d1555db40..36c37444485a 100644

> --- a/arch/arm/mach-exynos/exynos.c

> +++ b/arch/arm/mach-exynos/exynos.c

> @@ -193,7 +193,7 @@ static void __init exynos_dt_fixup(void)

>  }

>  

>  DT_MACHINE_START(EXYNOS_DT, "Samsung Exynos (Flattened Device Tree)")

> -	.l2c_aux_val	= 0x3c400001,

> +	.l2c_aux_val	= 0x3c400000,

>  	.l2c_aux_mask	= 0xc20fffff,

>  	.smp		= smp_ops(exynos_smp_ops),

>  	.map_io		= exynos_init_io,

> -------------------------------------------------------------------------------

> 

> 

> Git bisection log:

> 

> -------------------------------------------------------------------------------

> git bisect start

> # good: [b0953d8b7cdb39493e67cff4b20b0ebe3a2bba92] Merge branch 'next/drivers' into for-next

> git bisect good b0953d8b7cdb39493e67cff4b20b0ebe3a2bba92

> # bad: [d6fe116541b73a56110310c39a270c99766cd909] Merge branch 'next/soc' into for-next

> git bisect bad d6fe116541b73a56110310c39a270c99766cd909

> # bad: [5b17a04addc29201dc142c8d2c077eb7745d2e35] ARM: exynos: clear L310_AUX_CTRL_FULL_LINE_ZERO in default l2c_aux_val

> git bisect bad 5b17a04addc29201dc142c8d2c077eb7745d2e35

> # first bad commit: [5b17a04addc29201dc142c8d2c077eb7745d2e35] ARM: exynos: clear L310_AUX_CTRL_FULL_LINE_ZERO in default l2c_aux_val

> -------------------------------------------------------------------------------

>

Patch

diff --git a/arch/arm/mach-exynos/exynos.c b/arch/arm/mach-exynos/exynos.c
index 7a8d1555db40..36c37444485a 100644
--- a/arch/arm/mach-exynos/exynos.c
+++ b/arch/arm/mach-exynos/exynos.c
@@ -193,7 +193,7 @@  static void __init exynos_dt_fixup(void)
 }
 
 DT_MACHINE_START(EXYNOS_DT, "Samsung Exynos (Flattened Device Tree)")
-	.l2c_aux_val	= 0x3c400001,
+	.l2c_aux_val	= 0x3c400000,
 	.l2c_aux_mask	= 0xc20fffff,
 	.smp		= smp_ops(exynos_smp_ops),
 	.map_io		= exynos_init_io,