drivers/of: crash on boot

Message ID CAL_JsqJsjh+Shk1nD5YQeQ=6B1MZOyEj5oX-q_xKH642toURdA@mail.gmail.com
State New

Commit Message

Rob Herring May 18, 2016, 7:36 p.m. UTC
On Wed, May 18, 2016 at 10:34 AM, Sasha Levin <sasha.levin@oracle.com> wrote:
> Hi Rhyland,
>
> I'm seeing a crash on boot that seems to have been caused by
> "drivers/of: Fix depth when unflattening devicetree":
>
> [   61.145229] ==================================================================
>
> [   61.147588] BUG: KASAN: stack-out-of-bounds in unflatten_dt_nodes+0x11d2/0x1290 at addr ffff88005b30777c


The following appears to fix it for me. Rhyland, please confirm.

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Rob Herring May 19, 2016, 12:23 a.m. UTC | #1
On Wed, May 18, 2016 at 4:26 PM, Rhyland Klein <rklein@nvidia.com> wrote:
> On 5/18/2016 3:58 PM, Rhyland Klein wrote:

>> On 5/18/2016 3:36 PM, Rob Herring wrote:

>>> On Wed, May 18, 2016 at 10:34 AM, Sasha Levin <sasha.levin@oracle.com> wrote:

>>>> Hi Rhyland,

>>>>

>>>> I'm seeing a crash on boot that seems to have been caused by

>>>> "drivers/of: Fix depth when unflattening devicetree":

>>>>

>>>> [   61.145229] ==================================================================

>>>>

>>>> [   61.147588] BUG: KASAN: stack-out-of-bounds in unflatten_dt_nodes+0x11d2/0x1290 at addr ffff88005b30777c

>>>

>>> The following appears to fix it for me. Rhyland, please confirm.

>>>

>>> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c

>>> index 7f38241..888ec2a 100644

>>> --- a/drivers/of/fdt.c

>>> +++ b/drivers/of/fdt.c

>>> @@ -409,7 +409,7 @@ static int unflatten_dt_nodes(const void *blob,

>>>         fpsizes[depth] = dad ? strlen(of_node_full_name(dad)) : 0;

>>>         nps[depth+1] = dad;

>>>         for (offset = 0;

>>> -            offset >= 0;

>>> +            offset >= 0, depth >= 0;

>>>              offset = fdt_next_node(blob, offset, &depth)) {

>>>                 if (WARN_ON_ONCE(depth >= FDT_MAX_DEPTH))

>>>                         continue;

>>>

>>

>> If I try that patch, I see this when compiling:

>>

>> In function ‘unflatten_dt_nodes’:

>> warning: left-hand operand of comma expression has no effect

>> [-Wunused-value]

>>       offset >= 0, depth >= 0;


Doh! However, that does make the unit test pass and I don't see a NULL ptr...
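
For reference, a minimal sketch of why the compiler warns about the comma form and why it is not equivalent to `&&`. The helper names below are hypothetical and are not part of drivers/of/fdt.c; they only isolate the two loop conditions discussed in this thread.

```c
#include <stdbool.h>

/*
 * Comma form: the left operand is evaluated and then discarded, so only
 * `depth >= 0` decides the result. The (void) cast is here solely to
 * silence -Wunused-value in this illustration.
 */
static inline bool keep_going_comma(int offset, int depth)
{
	return ((void)(offset >= 0), depth >= 0);
}

/* Logical-AND form: both conditions must hold for the loop to continue. */
static inline bool keep_going_and(int offset, int depth)
{
	return offset >= 0 && depth >= 0;
}
```

With a negative offset and a non-negative depth, the comma form still reports "keep going" while the `&&` form stops, which is the difference the warning is pointing at.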

>>

>

> This patch seems to work for me. I found a bug in my original patch.

> Sasha/Rob, can you see if this works for you too:

>

> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c

> index 0b5850027bb5..e7a8caac5b27 100644

> --- a/drivers/of/fdt.c

> +++ b/drivers/of/fdt.c

> @@ -407,9 +407,9 @@ static int unflatten_dt_nodes(const void *blob,

>

>         root = dad;

>         fpsizes[depth] = dad ? strlen(of_node_full_name(dad)) : 0;

> -       nps[depth+1] = dad;

> +       nps[depth] = dad;

>         for (offset = 0;

> -            offset >= 0;

> +            offset >= 0 && depth >= 0;

>              offset = fdt_next_node(blob, offset, &depth)) {

>                 if (WARN_ON_ONCE(depth >= FDT_MAX_DEPTH))

>                         continue;


This does not work for me. I'm booting x86 with the DT unit test and
KASAN enabled. I suspect our differences are due to different data
after the end of the dtb. Also, I think there may be a bug in
fdt_next_node FDT_END handling. The "!depth" seems suspicious to me
and I think it should be "!(*depth)".

The DT overlay unit tests are also failing. Not sure if that's related.

Rob
Rob Herring May 19, 2016, 1:51 a.m. UTC | #2
On Wed, May 18, 2016 at 7:23 PM, Rob Herring <robh@kernel.org> wrote:
> On Wed, May 18, 2016 at 4:26 PM, Rhyland Klein <rklein@nvidia.com> wrote:

>> On 5/18/2016 3:58 PM, Rhyland Klein wrote:

>>> On 5/18/2016 3:36 PM, Rob Herring wrote:

>>>> On Wed, May 18, 2016 at 10:34 AM, Sasha Levin <sasha.levin@oracle.com> wrote:

>>>>> Hi Rhyland,

>>>>>

>>>>> I'm seeing a crash on boot that seems to have been caused by

>>>>> "drivers/of: Fix depth when unflattening devicetree":

>>>>>

>>>>> [   61.145229] ==================================================================

>>>>>

>>>>> [   61.147588] BUG: KASAN: stack-out-of-bounds in unflatten_dt_nodes+0x11d2/0x1290 at addr ffff88005b30777c


[...]

>> This patch seems to work for me. I found a bug in my original patch.

>> Sasha/Rob, can you see if this works for you too:

>>

>> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c

>> index 0b5850027bb5..e7a8caac5b27 100644

>> --- a/drivers/of/fdt.c

>> +++ b/drivers/of/fdt.c

>> @@ -407,9 +407,9 @@ static int unflatten_dt_nodes(const void *blob,

>>

>>         root = dad;

>>         fpsizes[depth] = dad ? strlen(of_node_full_name(dad)) : 0;

>> -       nps[depth+1] = dad;

>> +       nps[depth] = dad;

>>         for (offset = 0;

>> -            offset >= 0;

>> +            offset >= 0 && depth >= 0;

>>              offset = fdt_next_node(blob, offset, &depth)) {

>>                 if (WARN_ON_ONCE(depth >= FDT_MAX_DEPTH))

>>                         continue;

>

> This does not work for me. I'm booting x86 with the DT unit test and

> KASAN enabled. I suspect our differences are due to different data

> after the end of the dtb. Also, I think there may be a bug in

> fdt_next_node FDT_END handling. The "!depth" seems suspicious to me

> and I think it should be "!(*depth)".

>

> The DT overlay unit tests are also failing. Not sure if that's related.


Seems with the above patch and the fix to fdt_next_node, the problem
is fixed both for KASAN and the DT overlay tests. Trying it out now
with some other configurations.

Rob
Rob Herring May 19, 2016, 12:48 p.m. UTC | #3
On Thu, May 19, 2016 at 6:19 AM, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
> On Wed, May 18, 2016 at 08:51:59PM -0500, Rob Herring wrote:

>>On Wed, May 18, 2016 at 7:23 PM, Rob Herring <robh@kernel.org> wrote:

>>> On Wed, May 18, 2016 at 4:26 PM, Rhyland Klein <rklein@nvidia.com> wrote:

>>>> On 5/18/2016 3:58 PM, Rhyland Klein wrote:

>>>>> On 5/18/2016 3:36 PM, Rob Herring wrote:

>>>>>> On Wed, May 18, 2016 at 10:34 AM, Sasha Levin <sasha.levin@oracle.com> wrote:

>>>>>>> Hi Rhyland,

>>>>>>>

>>>>>>> I'm seeing a crash on boot that seems to have been caused by

>>>>>>> "drivers/of: Fix depth when unflattening devicetree":

>>>>>>>

>>>>>>> [   61.145229] ==================================================================

>>>>>>>

>>>>>>> [   61.147588] BUG: KASAN: stack-out-of-bounds in unflatten_dt_nodes+0x11d2/0x1290 at addr ffff88005b30777c

>>

>>[...]

>>

>>>> This patch seems to work for me. I found a bug in my original patch.

>>>> Sasha/Rob, can you see if this works for you too:

>>>>

>>>> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c

>>>> index 0b5850027bb5..e7a8caac5b27 100644

>>>> --- a/drivers/of/fdt.c

>>>> +++ b/drivers/of/fdt.c

>>>> @@ -407,9 +407,9 @@ static int unflatten_dt_nodes(const void *blob,

>>>>

>>>>         root = dad;

>>>>         fpsizes[depth] = dad ? strlen(of_node_full_name(dad)) : 0;

>>>> -       nps[depth+1] = dad;

>>>> +       nps[depth] = dad;

>>>>         for (offset = 0;

>>>> -            offset >= 0;

>>>> +            offset >= 0 && depth >= 0;

>>>>              offset = fdt_next_node(blob, offset, &depth)) {

>>>>                 if (WARN_ON_ONCE(depth >= FDT_MAX_DEPTH))

>>>>                         continue;

>>>

>>> This does not work for me. I'm booting x86 with the DT unit test and

>>> KASAN enabled. I suspect our differences are due to different data

>>> after the end of the dtb. Also, I think there may be a bug in

>>> fdt_next_node FDT_END handling. The "!depth" seems suspicious to me

>>> and I think it should be "!(*depth)".

>>>

>>> The DT overlay unit tests are also failing. Not sure if that's related.

>>

>>Seems with the above patch and the fix to fdt_next_node, the problem

>>is fixed both for KASAN and the DT overlay tests. Trying it out now

>>with some other configurations.

>>

>

> There're 5 patches I introduced to drivers/of/fdt.c (A). Rhyland had

> one patch based on them (B). The code change in this thread is (C).

> I tried several cases as below.

>

> There is one failing case caused by something we don't know yet. I

> will do some investigation unless it's not an issue or a known issue

> of unittest itself.

>

> [1]. (A) excluded, (B) excluded, (C) excluded

> =============================================

> device-tree: Duplicate name in testcase-data, renamed to "duplicate-name#1"

> ### dt-test ### start of unittest - you will see error messages

> /testcase-data/phandle-tests/consumer-a: could not get #phandle-cells-missing for /testcase-data/phandle-tests/provider1

> /testcase-data/phandle-tests/consumer-a: could not get #phandle-cells-missing for /testcase-data/phandle-tests/provider1

> /testcase-data/phandle-tests/consumer-a: could not find phandle

> /testcase-data/phandle-tests/consumer-a: could not find phandle

> /testcase-data/phandle-tests/consumer-a: arguments longer than property

> /testcase-data/phandle-tests/consumer-a: arguments longer than property

> irq: XICS didn't like hwirq-0x1 to VIRQ32 mapping (rc=-22)

> irq: XICS didn't like hwirq-0x1 to VIRQ32 mapping (rc=-22)

> ### dt-test ### FAIL of_unittest_platform_populate():783 device deferred probe failed - 0


Humm, I'm not seeing this one.

> overlay_is_topmost: #5 clashes #6 @/testcase-data/overlay-node/test-bus/test-unittest8

> overlay_removal_is_ok: overlay #5 is not topmost

> of_overlay_destroy: removal check failed for overlay #5

> ### dt-test ### end of unittest - 147 passed, 1 failed

>

> [2]. (A) included, (B) excluded, (C) excluded

> =============================================

> Same output as [1]

>

> [3]. (A) included, (B) included, (C) excluded

> =============================================

> System fails to boot

>

> [4]. (A) included, (B) included, (C) included

> =============================================

> Same output as [1] and [2].


For C, this includes the fix to depth in fdt_next_node?

While case 2 works for you, do you agree that there is an off by one
error and initially fdt_next_node should be called with depth=0?
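
To make the termination question concrete, here is a toy model of a depth-tracked walk. It is a sketch under stated assumptions, not libfdt or the kernel code: `toy_next_node()` and `toy_walk()` are hypothetical names, the token stream is invented, and the root is assumed to sit at depth 0. The assumed behaviour being modelled is that the iterator can hand back a non-negative offset after the depth has already dropped below zero, so testing the offset alone does not stop the loop.

```c
enum tok { TOK_BEGIN, TOK_END };

/*
 * Toy stand-in for a "next node" iterator (hypothetical, not libfdt).
 * Scans tokens after `offset`, adjusting *depth as it goes. Returns the
 * index of the next TOK_BEGIN, the index of the TOK_END that took *depth
 * below zero, or -1 when the tokens run out.
 */
static inline int toy_next_node(const enum tok *t, int n, int offset, int *depth)
{
	for (int i = offset + 1; i < n; i++) {
		if (t[i] == TOK_BEGIN) {
			(*depth)++;
			return i;
		}
		if (--(*depth) < 0)
			return i;	/* left the tree, yet the offset is still >= 0 */
	}
	return -1;
}

/*
 * Walk the stream starting at the root (offset 0, depth 0). Returns how
 * many times the loop body was entered with a negative depth, which in a
 * real walker would mean indexing a per-depth array out of bounds.
 * *visited receives the number of nodes seen at a valid depth.
 */
static inline int toy_walk(const enum tok *t, int n, int check_depth, int *visited)
{
	int depth = 0, bad = 0;

	*visited = 0;
	for (int offset = 0;
	     offset >= 0 && (!check_depth || depth >= 0);
	     offset = toy_next_node(t, n, offset, &depth)) {
		if (depth < 0) {
			bad++;
			continue;
		}
		(*visited)++;
	}
	return bad;
}
```

On a stream holding a root with one child followed by stray end tokens, the guarded walk (`check_depth` set) never enters the body with a negative depth, while the unguarded walk keeps going into the trailing data.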

Rob
Rob Herring May 19, 2016, 2:20 p.m. UTC | #4
On Wed, May 18, 2016 at 8:51 PM, Rob Herring <robh@kernel.org> wrote:
> On Wed, May 18, 2016 at 7:23 PM, Rob Herring <robh@kernel.org> wrote:

>> On Wed, May 18, 2016 at 4:26 PM, Rhyland Klein <rklein@nvidia.com> wrote:

>>> On 5/18/2016 3:58 PM, Rhyland Klein wrote:

>>>> On 5/18/2016 3:36 PM, Rob Herring wrote:

>>>>> On Wed, May 18, 2016 at 10:34 AM, Sasha Levin <sasha.levin@oracle.com> wrote:

>>>>>> Hi Rhyland,

>>>>>>

>>>>>> I'm seeing a crash on boot that seems to have been caused by

>>>>>> "drivers/of: Fix depth when unflattening devicetree":

>>>>>>

>>>>>> [   61.145229] ==================================================================

>>>>>>

>>>>>> [   61.147588] BUG: KASAN: stack-out-of-bounds in unflatten_dt_nodes+0x11d2/0x1290 at addr ffff88005b30777c

>

> [...]

>

>>> This patch seems to work for me. I found a bug in my original patch.

>>> Sasha/Rob, can you see if this works for you too:

>>>

>>> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c

>>> index 0b5850027bb5..e7a8caac5b27 100644

>>> --- a/drivers/of/fdt.c

>>> +++ b/drivers/of/fdt.c

>>> @@ -407,9 +407,9 @@ static int unflatten_dt_nodes(const void *blob,

>>>

>>>         root = dad;

>>>         fpsizes[depth] = dad ? strlen(of_node_full_name(dad)) : 0;

>>> -       nps[depth+1] = dad;

>>> +       nps[depth] = dad;

>>>         for (offset = 0;

>>> -            offset >= 0;

>>> +            offset >= 0 && depth >= 0;

>>>              offset = fdt_next_node(blob, offset, &depth)) {

>>>                 if (WARN_ON_ONCE(depth >= FDT_MAX_DEPTH))

>>>                         continue;

>>

>> This does not work for me. I'm booting x86 with the DT unit test and

>> KASAN enabled. I suspect our differences are due to different data

>> after the end of the dtb. Also, I think there may be a bug in

>> fdt_next_node FDT_END handling. The "!depth" seems suspicious to me

>> and I think it should be "!(*depth)".


I take that back. Your change does work for me. Must have had something stale.

>> The DT overlay unit tests are also failing. Not sure if that's related.

>

> Seems with the above patch and the fix to fdt_next_node, the problem

> is fixed both for KASAN and the DT overlay tests. Trying it out now

> with some other configurations.


fdt_next_node is in fact correct. Changing it caused failures in the
dtc unit tests.
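
As an illustration of the two expressions that were being compared, assuming `depth` is a pointer argument (as in the `fdt_next_node(blob, offset, &depth)` call quoted above): `!depth` asks whether the caller passed a pointer at all, whereas `!(*depth)` asks whether the pointed-to value is zero. The helper names are hypothetical and exist only for this sketch.

```c
#include <stddef.h>

/* True when no depth pointer was supplied; this is what `!depth` tests. */
static inline int depth_ptr_is_null(const int *depth)
{
	return !depth;
}

/*
 * True when the tracked depth value is zero; this is what `!(*depth)`
 * tests. The caller must pass a non-NULL pointer.
 */
static inline int depth_value_is_zero(const int *depth)
{
	return !(*depth);
}
```

The two checks answer different questions, so swapping one for the other changes behaviour for every caller that passes a valid pointer.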

So I have squashed the above fix into your original fix and pushed
that out to -next. kernelci.org is also seeing some failures due to
this. I'll give this another day or so before sending to Linus.

Rob

Patch

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 7f38241..888ec2a 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -409,7 +409,7 @@  static int unflatten_dt_nodes(const void *blob,
        fpsizes[depth] = dad ? strlen(of_node_full_name(dad)) : 0;
        nps[depth+1] = dad;
        for (offset = 0;
-            offset >= 0;
+            offset >= 0, depth >= 0;
             offset = fdt_next_node(blob, offset, &depth)) {
                if (WARN_ON_ONCE(depth >= FDT_MAX_DEPTH))
                        continue;