Problems with commit 'kallsyms: add support for relative offsets in kallsyms address table' (in mmotm)

Message ID CAKv+Gu9wJ8QKK98H5O_1Hn+xpwDxim1XrCmSGnzfQr0cqooUZw@mail.gmail.com
State New

Commit Message

Ard Biesheuvel Jan. 24, 2016, 8:21 a.m. UTC
On 24 January 2016 at 08:06, Guenter Roeck <linux@roeck-us.net> wrote:
> On 01/23/2016 10:10 PM, Ard Biesheuvel wrote:
>>
>>> On 24 jan. 2016, at 03:35, Guenter Roeck <linux@roeck-us.net> wrote:
>>>
>>>> On 01/23/2016 06:06 PM, Guenter Roeck wrote:
>>>> Hi,
>>>>
>>>> I see runtime problems with the current mmotm branch. All qemu mips
>>>> targets (32 and 64 bit, big and little endian) are stuck in boot after
>>>> this commit.
>>>>
>>>> Bisect points to commit d13682e4d9d2 ("kallsyms: add support for
>>>> relative offsets in kallsyms address table"). Disabling
>>>> CONFIG_KALLSYMS_BASE_RELATIVE fixes the problem, i.e. I can boot the
>>>> image with qemu.
>>>>
>>>> Bisect log is attached.
>>>>
>>>> Playing with the problem, I found the following:
>>>>
>>>> 1) The problem is only seen with a toolchain using binutils 2.22, but
>>>>    not with a toolchain using binutils 2.25. The compiler configuration
>>>>    may be different for both toolchains.
>>>> 2) Message "kallsyms failure: absolute symbol value 0xffffffff807afd14
>>>>    out of range in relative mode" (twice) when using the toolchain with
>>>>    binutils 2.22. This does not cause the build to fail, though.
>>>> 3) kallsyms_sym_address() parameter variable type is "int". In the
>>>>    calling code, the variable type used is "unsigned long". That has no
>>>>    impact on the problem, though.
>>>
>>> An additional data point: When using the older toolchain, many symbols in
>>> System.map are marked "A".
>>>     ffffffff80100000 A _text
>>> With the more recent toolchain, the same symbols are marked "T".
>>>     ffffffff80100000 T _text
>>>
>>
>> Thanks for the analysis. It is surprising that the build does not fail
>> when this occurs, and the subsequent hangs themselves are probably caused by
>> missing kallsyms data.
>>
> Yes, I wondered why the build doesn't fail. Seems odd.
>
>> scripts/kallsyms.c ignores all A symbols except _text, which is actually a
>> relative symbol by nature so we can simply assume it is relative (i.e.,
>> override it as T)
>>
>> Re x86_64 !SMP, any build time errors there as well? Likewise for sparc32?
>>
>
> Yes, same kind of errors for both. For x86_64/nosmp I also get the error
> message when using the Ubuntu native toolchain, so it doesn't seem to be
> (directly) related to binutils 2.22 vs. 2.25 for that architecture.
>
> Runtime behavior is a bit different for the different architectures.
> x86_64 dies silently without any console output, mips just hangs,
> and sparc32 gets a panic with NULL pointer access.
> Of course, with missing kallsyms data all bets are off.
>
>>
>> Thanks again, and sorry for the trouble,
>
> No worries. Hope you'll get this sorted out.
>

OK, there's an additional issue in my latest version: the
kallsyms_relative_base value itself is not relocated.

If you have more time to burn on this, could you try the following on
top? (If not, that is also fine, I will look into it myself on Monday)


Thanks,
Ard.

Comments

Ard Biesheuvel Jan. 24, 2016, 5:20 p.m. UTC | #1
> On 24 jan. 2016, at 18:05, Guenter Roeck <linux@roeck-us.net> wrote:
>
>> On 01/24/2016 12:21 AM, Ard Biesheuvel wrote:
>> [ ... ]
>> OK, there's an additional issue in my latest version: the
>> kallsyms_relative_base value itself is not relocated.
>>
>> If you have more time to burn on this, could you try the following on
>> top? (If not, that is also fine, I will look into it myself on Monday)
>>
>> diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
>> index 5ab13394dfd9..0f43f0751d47 100644
>> --- a/scripts/kallsyms.c
>> +++ b/scripts/kallsyms.c
>> @@ -137,8 +137,10 @@ static int read_symbol(FILE *in, struct sym_entry *s)
>>                 sym++;
>>
>>         /* Ignore most absolute/undefined (?) symbols. */
>> -       if (strcmp(sym, "_text") == 0)
>> +       if (strcmp(sym, "_text") == 0) {
>>                 _text = s->addr;
>> +               stype = 'T';
>> +       }
>>         else if (check_symbol_range(sym, s->addr, text_ranges,
>>                                     ARRAY_SIZE(text_ranges)) == 0)
>>                 /* nothing to do */;
>> @@ -406,7 +408,7 @@ static void write_src(void)
>>
>>         if (base_relative) {
>>                 output_label("kallsyms_relative_base");
>> -               printf("\tPTR\t%#llx\n", relative_base);
>> +               printf("\tPTR\t_text - %#llx\n", _text - relative_base);
>>                 printf("\n");
>>         }
>
> Does not help.
>

For x86? Or none of them?

> Here is part of the problem. This is from a log message added to make_percpus_absolute().
>
> Marking symbol 'B__bss_start' as absolute
> Marking symbol '?__init_end' as absolute
> Marking symbol 'D__nosave_begin' as absolute
> Marking symbol 'D__nosave_end' as absolute
> Marking symbol 'D__per_cpu_end' as absolute
> Marking symbol 'D__per_cpu_load' as absolute
> Marking symbol 'D__per_cpu_start' as absolute
> Marking symbol '?__smp_locks' as absolute
> Marking symbol '?__smp_locks_end' as absolute
> Marking symbol 'Bempty_zero_page' as absolute
>
> This is with x86_64/nosmp. At least some of those symbols don't really reflect
> 'percpu' values. Maybe the distinction between percpu and non-percpu variables
> gets lost if SMP is not configured.
>

Yes, sounds plausible, and that probably means some latent issue gets uncovered here rather than created. I suppose few people are testing x86_64+!SMP+CONFIG_RELOCATABLE thoroughly.

> On top of that, older versions of binutils mark additional symbols as absolute,
> even with x86_64.
>
> ffffffff81a00000 A __end_rodata_hpage_align
> ffffffff81b19000 A __vvar_page
> ffffffff81d3d000 A _end
>

Yes, but _text is the *only* symbol that is natively A that does not get filtered out (save for some ia64 specific ones) so these should not matter. Only _text and the percpu ones that get marked A explicitly should end up in the final table.

> Hope this helps,


A great deal, thanks a lot
Ard.
Ard Biesheuvel Jan. 24, 2016, 7:01 p.m. UTC | #2
> On 24 jan. 2016, at 19:01, Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 01/24/2016 09:20 AM, Ard Biesheuvel wrote:
> [ ... ]
>>> Does not help.
>>
>> For x86? Or none of them?
>
> I tested sparc32 and x86_64/nosmp. Doesn't help for any of them.
> sparc32 has the following absolute symbols.
>
> f035a420 A _etext
> f03d9000 A _sdata
> f03de8c4 A jiffies
> f03f8860 A _edata
> f03fc000 A __init_begin
> f041bdc8 A __init_text_end
> f0423000 A __bss_start
> f0423000 A __init_end
> f044457d A __bss_stop
> f044457d A _end
>

Any clue why these don't get dropped? Am I missing something? AFAICT, A symbols get dropped unless they are whitelisted (i.e., the few ia64 ones).

> This results in:
>
> kallsyms failure: absolute symbol value 0xf035a420 out of range in relative mode
>
> This is with binutils 2.22. I didn't test with binutils 2.25 for sparc, or re-test mips.
>
> Looks like I'll need to add more test cases with binutils 2.22 vs. 2.25 for various
> architectures, as well as more SMP vs. !SMP builds.
>

Thanks once again
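
For readers following along, the "out of range in relative mode" failure quoted above can be illustrated with a small check. This is only a sketch under an assumption the thread does not spell out: that in relative mode an absolute symbol value has to fit in a non-negative 32-bit signed slot. The helper name absolute_value_fits() is hypothetical and does not exist in scripts/kallsyms.c.

```c
#include <stdint.h>

/*
 * Assumption (not confirmed by the thread): in relative mode, an
 * absolute symbol value is stored in a 32-bit signed slot and must be
 * non-negative, so anything above INT32_MAX is rejected.
 *
 * Hypothetical helper: returns 1 if the value fits, 0 otherwise.
 */
static int absolute_value_fits(unsigned long long addr)
{
	return addr <= (unsigned long long)INT32_MAX;
}
```

Under that assumption, both values reported in this thread (0xf035a420 on sparc32 and 0xffffffff807afd14 on mips) are rejected, whereas small values such as typical percpu offsets would pass.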
Ard Biesheuvel Jan. 24, 2016, 7:16 p.m. UTC | #3
> On 24 jan. 2016, at 20:01, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>
>> On 24 jan. 2016, at 19:01, Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> [ ... ]
>> sparc32 has the following absolute symbols.
>>
>> f035a420 A _etext
>> f03d9000 A _sdata
>> f03de8c4 A jiffies
>> f03f8860 A _edata
>> f03fc000 A __init_begin
>> f041bdc8 A __init_text_end
>> f0423000 A __bss_start
>> f0423000 A __init_end
>> f044457d A __bss_stop
>> f044457d A _end
>
> Any clue why these don't get dropped? Am I missing something? Afaict A symbols get dropped unless they are whitelisted (i.e., the few ia64 ones)
>

ok, never mind. it's the symbol range check.

anyway, i should have enough info now to get this sorted

thanks,
ard

>> This results in:
>>
>> kallsyms failure: absolute symbol value 0xf035a420 out of range in relative mode
>>
>> This is with binutils 2.22. I didn't test with binutils 2.25 for sparc, or re-test mips.
>>
>> Looks like I'll need to add more test cases with binutils 2.22 vs. 2.25 for various
>> architectures, as well as more SMP vs. !SMP builds.
>
> Thanks once again
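
The "symbol range check" remark above can be sketched as follows. The ordering of the branches is modelled on the context lines visible in the patch (the _text test, then check_symbol_range(), then everything else). The rest is an assumption made for illustration: the marker list is a hypothetical stand-in for text_ranges, classify() is not a real function in scripts/kallsyms.c, and the ia64 whitelist mentioned in the thread is left out.

```c
#include <ctype.h>
#include <string.h>

enum verdict { KEEP, DROP };

/* Hypothetical stand-in for text_ranges; the real table may hold more pairs. */
static const char *range_markers[] = { "_stext", "_etext" };

static int is_range_marker(const char *sym)
{
	size_t i;

	for (i = 0; i < sizeof(range_markers) / sizeof(range_markers[0]); i++)
		if (strcmp(sym, range_markers[i]) == 0)
			return 1;
	return 0;
}

/* Mirrors the if / else-if ordering seen in the patch context. */
static enum verdict classify(const char *sym, char stype)
{
	if (strcmp(sym, "_text") == 0)
		return KEEP;	/* recorded as the text base */
	else if (is_range_marker(sym))
		return KEEP;	/* "nothing to do": skips the 'A' filter below */
	else if (toupper((unsigned char)stype) == 'A')
		return DROP;	/* ordinary absolute symbols are filtered out */
	return KEEP;
}
```

With this ordering, an absolute `_etext` is kept because the range-marker branch is taken before the 'A' filter is reached, while an absolute `jiffies` is dropped. That would be consistent with the sparc32 failure being reported for 0xf035a420, the address of `_etext` in the listing above.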
Patch

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 5ab13394dfd9..0f43f0751d47 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -137,8 +137,10 @@  static int read_symbol(FILE *in, struct sym_entry *s)
                sym++;

        /* Ignore most absolute/undefined (?) symbols. */
-       if (strcmp(sym, "_text") == 0)
+       if (strcmp(sym, "_text") == 0) {
                _text = s->addr;
+               stype = 'T';
+       }
        else if (check_symbol_range(sym, s->addr, text_ranges,
                                    ARRAY_SIZE(text_ranges)) == 0)
                /* nothing to do */;
@@ -406,7 +408,7 @@  static void write_src(void)

        if (base_relative) {
                output_label("kallsyms_relative_base");
-               printf("\tPTR\t%#llx\n", relative_base);
+               printf("\tPTR\t_text - %#llx\n", _text - relative_base);
                printf("\n");
        }
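
To make the intent of the second hunk easier to see, here is a minimal sketch that formats both variants of the emitted assembly line into a buffer. The function names emit_literal() and emit_symbolic() are hypothetical; only the two format strings are taken from the patch. The presumed benefit, in line with the remark that kallsyms_relative_base "is not relocated", is that an expression written against the _text symbol can be fixed up together with _text, whereas a bare literal stays constant.

```c
#include <stdio.h>

/* Old form from the patch: the base is emitted as a plain literal. */
static int emit_literal(char *buf, size_t len,
			unsigned long long relative_base)
{
	return snprintf(buf, len, "\tPTR\t%#llx\n", relative_base);
}

/* New form from the patch: the base is expressed relative to _text. */
static int emit_symbolic(char *buf, size_t len,
			 unsigned long long text,
			 unsigned long long relative_base)
{
	return snprintf(buf, len, "\tPTR\t_text - %#llx\n",
			text - relative_base);
}
```

For example, with _text at 0xffffffff81000000 and a relative base of 0xffffffff80f00000 (values chosen purely for illustration), the old form emits the full 64-bit literal and the new form emits `_text - 0x100000`.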