diff mbox series

mips: vdso: conditionalize 32-bit time functions on COMPAT_32BIT_TIME

Message ID 20191224135404.389039-1-Jason@zx2c4.com
State New
Headers show
Series mips: vdso: conditionalize 32-bit time functions on COMPAT_32BIT_TIME | expand

Commit Message

Jason A. Donenfeld Dec. 24, 2019, 1:54 p.m. UTC
When the VDSO falls back to 32-bit time functions on kernels with
COMPAT_32BIT_TIME=n, userspace becomes corrupted and appears to crash
shortly after, with something like:

[    0.359617] do_page_fault(): sending SIGSEGV to init for invalid read access from 000000007ff790d0
[    0.359843] epc = 0000000077e45df4 in libc.so[77da6000+de000]
[    0.360319] ra  = 0000000010000c50 in init[10000000+2000]
[    0.364456] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

This can be reproduced with simply calling `clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts)`,
since `CLOCK_PROCESS_CPUTIME_ID` is not exported to the VDSO, invoking
the syscall callback branch. This crash was observed with musl 1.20's
clock_gettime implementation:

int __clock_gettime(clockid_t clk, struct timespec *ts)
{
  int r;

  int (*f)(clockid_t, struct timespec *) =
    (int (*)(clockid_t, struct timespec *))vdso_func;
  if (f) {
    r = f(clk, ts);
    if (!r) {
      return r;
    }
    if (r == -EINVAL)
      return __syscall_ret(r);
  }
  r = __syscall(SYS_clock_gettime, clk, ts); // <-- CRASH
  if (r == -ENOSYS) {
    if (clk == CLOCK_REALTIME) {
      __syscall(SYS_gettimeofday, ts, 0);
      ts->tv_nsec = (int)ts->tv_nsec * 1000;
      return 0;
    }
    r = -EINVAL;
  }
  return __syscall_ret(r);
}

The particular kernel and libc are built as part of the MIPS64 CI on
build.wireguard.com which generally uses as minimal configurations as
possible.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

Cc: Paul Burton <paulburton@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Christian Brauner <christian.brauner@canonical.com>
Fixes: 942437c97fd9 ("y2038: allow disabling time32 system calls")
---
 arch/mips/include/asm/vdso/gettimeofday.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.24.1

Comments

Arnd Bergmann Dec. 30, 2019, 11:57 a.m. UTC | #1
On Tue, Dec 24, 2019 at 2:54 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>

> When the VDSO falls back to 32-bit time functions on kernels with

> COMPAT_32BIT_TIME=n, userspace becomes corrupted and appears to crash

> shortly after, with something like:

>

> [    0.359617] do_page_fault(): sending SIGSEGV to init for invalid read access from 000000007ff790d0

> [    0.359843] epc = 0000000077e45df4 in libc.so[77da6000+de000]

> [    0.360319] ra  = 0000000010000c50 in init[10000000+2000]

> [    0.364456] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

>

> This can be reproduced with simply calling `clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts)`,

> since `CLOCK_PROCESS_CPUTIME_ID` is not exported to the VDSO, invoking

> the syscall callback branch. This crash was observed with musl 1.20's

> clock_gettime implementation:


Thanks for the bug report! I'm not completely sure why this fails in
this particular
way though. I assume you are using musl-1.1.20, not a musl-1.2.0 snapshot
(the version 1.20 you list does not exist), so the combination you are testing
is supposed to just return -ENOSYS here to match the behavior of hte
system call.

> --- a/arch/mips/include/asm/vdso/gettimeofday.h

> +++ b/arch/mips/include/asm/vdso/gettimeofday.h

> @@ -107,7 +107,7 @@ static __always_inline int clock_getres_fallback(

>        return error ? -ret : ret;

> }

>

> -#if _MIPS_SIM != _MIPS_SIM_ABI64

> +#if _MIPS_SIM != _MIPS_SIM_ABI64 && defined(CONFIG_COMPAT_32BIT_TIME)

>

>  #define VDSO_HAS_32BIT_FALLBACK        1

>


I don't think this is the correct fix, it may actually make it worse
by changing the vdso implementation for clock_gettime32()
to fall back to clock_gettime64(), which would appear to work
correctly before y2038 but fail afterwards.  How about this one:

diff --git a/arch/mips/vdso/vdso.lds.S b/arch/mips/vdso/vdso.lds.S
index da4627430aba..0bdc6a026be8 100644
--- a/arch/mips/vdso/vdso.lds.S
+++ b/arch/mips/vdso/vdso.lds.S
@@ -93,9 +93,11 @@ VERSION
        LINUX_2.6 {
 #ifndef DISABLE_MIPS_VDSO
        global:
+#if (_MIPS_SIM == _MIPS_SIM_ABI64) || defined(CONFIG_COMPAT_32BIT_TIME)
                __vdso_clock_gettime;
                __vdso_gettimeofday;
                __vdso_clock_getres;
+#endif
 #if _MIPS_SIM != _MIPS_SIM_ABI64
                __vdso_clock_gettime64;
 #endif

That should ensure that no user space can call the old vdso
functions on a kernel that intentionally breaks the actual
syscalls.

      Arnd
Jason A. Donenfeld Dec. 30, 2019, 12:26 p.m. UTC | #2
On Mon, Dec 30, 2019 at 12:58 PM Arnd Bergmann <arnd@arndb.de> wrote:
> Thanks for the bug report! I'm not completely sure why this fails in

> this particular

> way though. I assume you are using musl-1.1.20, not a musl-1.2.0 snapshot


Yes, that's the one, sorry.

> diff --git a/arch/mips/vdso/vdso.lds.S b/arch/mips/vdso/vdso.lds.S

> index da4627430aba..0bdc6a026be8 100644

> --- a/arch/mips/vdso/vdso.lds.S

> +++ b/arch/mips/vdso/vdso.lds.S

> @@ -93,9 +93,11 @@ VERSION

>         LINUX_2.6 {

>  #ifndef DISABLE_MIPS_VDSO

>         global:

> +#if (_MIPS_SIM == _MIPS_SIM_ABI64) || defined(CONFIG_COMPAT_32BIT_TIME)

>                 __vdso_clock_gettime;

>                 __vdso_gettimeofday;

>                 __vdso_clock_getres;

> +#endif

>  #if _MIPS_SIM != _MIPS_SIM_ABI64

>                 __vdso_clock_gettime64;

>  #endif

>

> That should ensure that no user space can call the old vdso

> functions on a kernel that intentionally breaks the actual

> syscalls.


I can confirm that patch fixes things. Thanks.

Jason
Arnd Bergmann Dec. 30, 2019, 12:34 p.m. UTC | #3
On Mon, Dec 30, 2019 at 1:27 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>

> On Mon, Dec 30, 2019 at 12:58 PM Arnd Bergmann <arnd@arndb.de> wrote:

> > Thanks for the bug report! I'm not completely sure why this fails in

> > this particular

> > way though. I assume you are using musl-1.1.20, not a musl-1.2.0 snapshot

>

> Yes, that's the one, sorry.

>

> > diff --git a/arch/mips/vdso/vdso.lds.S b/arch/mips/vdso/vdso.lds.S

> > index da4627430aba..0bdc6a026be8 100644

> > --- a/arch/mips/vdso/vdso.lds.S

> > +++ b/arch/mips/vdso/vdso.lds.S

> > @@ -93,9 +93,11 @@ VERSION

> >         LINUX_2.6 {

> >  #ifndef DISABLE_MIPS_VDSO

> >         global:

> > +#if (_MIPS_SIM == _MIPS_SIM_ABI64) || defined(CONFIG_COMPAT_32BIT_TIME)

> >                 __vdso_clock_gettime;

> >                 __vdso_gettimeofday;

> >                 __vdso_clock_getres;

> > +#endif

> >  #if _MIPS_SIM != _MIPS_SIM_ABI64

> >                 __vdso_clock_gettime64;

> >  #endif

> >

> > That should ensure that no user space can call the old vdso

> > functions on a kernel that intentionally breaks the actual

> > syscalls.

>

> I can confirm that patch fixes things. Thanks.


Ok, that's good news, but I think we still need to answer two questions:

- Why does it crash in the first place rather than returning -ENOSYS?

- How does it actually work if you run an application built against
  an old musl version on a kernel that tries to make this not work?
  Do you just get a random time (uninitialized user space stack) and
  work with that without checking the error code?

      Arnd
Jason A. Donenfeld Dec. 30, 2019, 2:37 p.m. UTC | #4
On Mon, Dec 30, 2019 at 1:34 PM Arnd Bergmann <arnd@arndb.de> wrote:
>

> - Why does it crash in the first place rather than returning -ENOSYS?


There's a bit of speculation about this in the original thread that
prompted this patch (you're CC'd).

>

> - How does it actually work if you run an application built against

>   an old musl version on a kernel that tries to make this not work?

>   Do you just get a random time (uninitialized user space stack) and

>   work with that without checking the error code?


Actually, your patch fails here. The ts struct remains as it was
before, filled with garbage. No good. My original patch in this
thread, though, does result in the correct value being written to ts.

Jason
Jason A. Donenfeld Dec. 30, 2019, 3:10 p.m. UTC | #5
On Mon, Dec 30, 2019 at 3:37 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>

> On Mon, Dec 30, 2019 at 1:34 PM Arnd Bergmann <arnd@arndb.de> wrote:

> >

> > - Why does it crash in the first place rather than returning -ENOSYS?

>

> There's a bit of speculation about this in the original thread that

> prompted this patch (you're CC'd).


The following will provoke the crash:

__attribute__((noinline)) void somefunc(void) { }

int __clock_gettime(clockid_t clk, struct timespec *ts)
{
       ((int (*)(clockid_t, struct timespec *))vdso_func)(clk, ts);
       somefunc();
       return 88;
}

It seems like the VDSO is doing something to the stack.
Arnd Bergmann Dec. 30, 2019, 3:37 p.m. UTC | #6
On Mon, Dec 30, 2019 at 3:37 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>

> On Mon, Dec 30, 2019 at 1:34 PM Arnd Bergmann <arnd@arndb.de> wrote:

> >

> > - Why does it crash in the first place rather than returning -ENOSYS?

>

> There's a bit of speculation about this in the original thread that

> prompted this patch (you're CC'd).

>

> >

> > - How does it actually work if you run an application built against

> >   an old musl version on a kernel that tries to make this not work?

> >   Do you just get a random time (uninitialized user space stack) and

> >   work with that without checking the error code?

>

> Actually, your patch fails here. The ts struct remains as it was

> before, filled with garbage. No good. My original patch in this

> thread, though, does result in the correct value being written to ts.


Ok, that is the intended behavior then, clock_gettime() needs
to fail with -EINVAL or -ENOSYS here (depending on the libc
implementation), and of course the data is not updated.

Returning success from clock_gettime() on a kernel with only
time64 support and a libc with only time32 support (or vice
versa) would be a bug.

    Arnd
Jason A. Donenfeld Dec. 30, 2019, 3:39 p.m. UTC | #7
On Mon, Dec 30, 2019 at 4:37 PM Arnd Bergmann <arnd@arndb.de> wrote:
>

> On Mon, Dec 30, 2019 at 3:37 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:

> >

> > On Mon, Dec 30, 2019 at 1:34 PM Arnd Bergmann <arnd@arndb.de> wrote:

> > >

> > > - Why does it crash in the first place rather than returning -ENOSYS?

> >

> > There's a bit of speculation about this in the original thread that

> > prompted this patch (you're CC'd).

> >

> > >

> > > - How does it actually work if you run an application built against

> > >   an old musl version on a kernel that tries to make this not work?

> > >   Do you just get a random time (uninitialized user space stack) and

> > >   work with that without checking the error code?

> >

> > Actually, your patch fails here. The ts struct remains as it was

> > before, filled with garbage. No good. My original patch in this

> > thread, though, does result in the correct value being written to ts.

>

> Ok, that is the intended behavior then, clock_gettime() needs

> to fail with -EINVAL or -ENOSYS here (depending on the libc

> implementation), and of course the data is not updated.

>

> Returning success from clock_gettime() on a kernel with only

> time64 support and a libc with only time32 support (or vice

> versa) would be a bug.


Ah, right, hence why the 32-bit compat code is behind a
still-on-by-default-but-not-for-long menu option.
Arnd Bergmann Dec. 30, 2019, 3:47 p.m. UTC | #8
On Mon, Dec 30, 2019 at 4:39 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>

> On Mon, Dec 30, 2019 at 4:37 PM Arnd Bergmann <arnd@arndb.de> wrote:

> >

> > On Mon, Dec 30, 2019 at 3:37 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:

> > >

> > > On Mon, Dec 30, 2019 at 1:34 PM Arnd Bergmann <arnd@arndb.de> wrote:


> > Returning success from clock_gettime() on a kernel with only

> > time64 support and a libc with only time32 support (or vice

> > versa) would be a bug.

>

> Ah, right, hence why the 32-bit compat code is behind a

> still-on-by-default-but-not-for-long menu option.


I expect this to remain on-by-default for a rather long time, as we still
to run old binaries well into 2030s. Making it configurable is done for
two reasons:

- on embedded systems that have all user space built with time64,
  turning this off can make the kernel slightly smaller

- turning it off lets you check that no code relies on calling the old
  interfaces internally, bypassing the C library, in a way that works
  at the moment but may break after 2038.

         Arnd
Jason A. Donenfeld Dec. 30, 2019, 3:58 p.m. UTC | #9
Makes sense w.r.t. time32 situation.

I still think that in spite of that there's still something weird
happening with the mips VDSO.

Here's a register dump before the call:

 $ 0   : 0000000000000000 0000000000000001 0000000010000000 fffffffffffffffc
 $ 4   : 0000000000000002 000000007fff2e40 0000000000000000 0000000000000001
 $ 8   : 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 $12   : 0000000000000000 000000000000000a ffffffff80000000 000000007fffffda
 $16   : 0000000010001ba8 0000005800000015 0000000010000000 0000000010000000
 $20   : 0000000010000000 0000000010000000 0000000000000000 0000000077ff2ae8
 $24   : 0000000000000005 0000000077fa1d18
 $28   : 0000000010019cf0 000000007fff2e40 0000000000000000 0000000010000c30
 Hi    : 0000000000000000
 Lo    : 0000000000000000
 epc   : 0000000077fa1d18 0x77fa1d18
 ra    : 0000000010000c30 0x10000c30

And here it is immediately after:

 $ 0   : 0000000000000000 0000000000000001 ffffffffffffffa7 000000007fff5000
 $ 4   : 0000000000000002 000000007fff2e40 0000000077ff2000 0000000000000001
 $ 8   : 0000000000000006 0000000000000020 0000000000000002 0000000000000000
 $12   : 0000000000000000 0000000000001852 ffffffff80156160 000000007fffffda
 $16   : 0000000010001ba8 0000005800000015 0000000010000000 0000000010000000
 $20   : 0000000010000000 0000000010000000 0000000000000000 0000000077ff2b00
 $24   : 0000000000000005 0000000000000000
 $28   : 000000007fff5000 000000007fff2e30 0000000000000000 0000000077fa1e00
 Hi    : 0000000000000000
 Lo    : 0000000000000000
 epc   : 0000000077fa1e00 0x77fa1e00
 ra    : 0000000077fa1e00 0x77fa1e00

I wonder if a toolchain option or compiler bug or something is causing
the vdso to not restore certain registers (gp? ra?).
Arnd Bergmann Dec. 30, 2019, 5:33 p.m. UTC | #10
On Mon, Dec 30, 2019 at 4:58 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>

> Makes sense w.r.t. time32 situation.

>

> I still think that in spite of that there's still something weird

> happening with the mips VDSO.


Agreed.

> Here's a register dump before the call:

>

>  $ 0   : 0000000000000000 0000000000000001 0000000010000000 fffffffffffffffc

>  $ 4   : 0000000000000002 000000007fff2e40 0000000000000000 0000000000000001

>  $ 8   : 0000000000000000 0000000000000000 0000000000000000 0000000000000000

>  $12   : 0000000000000000 000000000000000a ffffffff80000000 000000007fffffda

>  $16   : 0000000010001ba8 0000005800000015 0000000010000000 0000000010000000

>  $20   : 0000000010000000 0000000010000000 0000000000000000 0000000077ff2ae8

>  $24   : 0000000000000005 0000000077fa1d18

>  $28   : 0000000010019cf0 000000007fff2e40 0000000000000000 0000000010000c30

>  Hi    : 0000000000000000

>  Lo    : 0000000000000000

>  epc   : 0000000077fa1d18 0x77fa1d18

>  ra    : 0000000010000c30 0x10000c30

>

> And here it is immediately after:

>

>  $ 0   : 0000000000000000 0000000000000001 ffffffffffffffa7 000000007fff5000

>  $ 4   : 0000000000000002 000000007fff2e40 0000000077ff2000 0000000000000001

>  $ 8   : 0000000000000006 0000000000000020 0000000000000002 0000000000000000

>  $12   : 0000000000000000 0000000000001852 ffffffff80156160 000000007fffffda

>  $16   : 0000000010001ba8 0000005800000015 0000000010000000 0000000010000000

>  $20   : 0000000010000000 0000000010000000 0000000000000000 0000000077ff2b00

>  $24   : 0000000000000005 0000000000000000

>  $28   : 000000007fff5000 000000007fff2e30 0000000000000000 0000000077fa1e00

>  Hi    : 0000000000000000

>  Lo    : 0000000000000000

>  epc   : 0000000077fa1e00 0x77fa1e00

>  ra    : 0000000077fa1e00 0x77fa1e00


Is this immediately before/after the syscall instruction or the
indirect function call?

> I wonder if a toolchain option or compiler bug or something is causing

> the vdso to not restore certain registers (gp? ra?).


Here is the assembler output I see for the o32 vdso, hopefully I got all
the relevant bits:

 # /git/arm-soc/lib/vdso/gettimeofday.c:130:    if (unlikely(ret))
        .set    noreorder
        .set    nomacro
        beqz    $2,$L86  #,,
        lw      $28,16($sp)      #,
        .set    macro
        .set    reorder

$L46:
 # /git/arm-soc/arch/mips/include/asm/vdso/gettimeofday.h:118:
register struct old_timespec32 *ts asm("a1") = _ts;
        move    $5,$16   # ts, ts
 # /git/arm-soc/arch/mips/include/asm/vdso/gettimeofday.h:119:
register clockid_t clkid asm("a0") = _clkid;
        move    $4,$17   # clkid, clock
 # /git/arm-soc/arch/mips/include/asm/vdso/gettimeofday.h:121:
register long nr asm("v0") = __NR_clock_gettime;
        li      $2,4263                 # 0x10a7         # nr,
 # /git/arm-soc/arch/mips/include/asm/vdso/gettimeofday.h:124:  asm volatile(
#APP
 # 124 "/git/arm-soc/arch/mips/include/asm/vdso/gettimeofday.h" 1
               syscall

 # 0 "" 2
 # /git/arm-soc/arch/mips/vdso/vgettimeofday.c:18: }
#NO_APP
        lw      $31,60($sp)      #,
        lw      $19,56($sp)      #,
 # /git/arm-soc/arch/mips/include/asm/vdso/gettimeofday.h:131:  return
error ? -ret : ret;
        subu    $3,$0,$2         # <retval>, ret
        selnez  $3,$3,$7         # tmp406, <retval>, error
        seleqz  $2,$2,$7         # tmp407, ret, error
        or      $3,$3,$2         # <retval>, tmp406, tmp407
 # /git/arm-soc/arch/mips/vdso/vgettimeofday.c:18: }
        lw      $18,52($sp)      #,
        lw      $17,48($sp)      #,
        lw      $16,44($sp)      #,
        move    $2,$3    #, <retval>
        .set    noreorder
        .set    nomacro
        jr      $31      #
        addiu   $sp,$sp,64       #,,
        .set    macro
        .set    reorder

gp ($28) and ra ($31) sound like good guesses to me,

SP ($r29) changed from 000000007fff2e40
to 000000007fff2e30, if that is not the intention, it would clearly explain why
anything afterwards crashes, but that seems unlikely.

r3 contains the error code -ENOSYS on mips, so that's good.

$23 is supposed to be preserved across function calls and is
consequently not part of the clobber list but is modified.

$25 is part of the clobber list and is also modified, but there
is no code to save/restore it in the assembler output.

      Arnd
Jason A. Donenfeld Dec. 30, 2019, 9:09 p.m. UTC | #11
On Mon, Dec 30, 2019 at 6:33 PM Arnd Bergmann <arnd@arndb.de> wrote
> Is this immediately before/after the syscall instruction or the

> indirect function call?


It's immediately after/before the call to the VDSO function itself.
Next I'll try to instrument the VDSO to get closer to that syscall.

I produced those reg dumps by hooking the page fault handler in the
kernel to print them and then disabling aslr and sticking a
`*(volatile int *)0 = 0;` in the code. Pretty gnarly.
Jason A. Donenfeld Dec. 30, 2019, 9:42 p.m. UTC | #12
On Mon, Dec 30, 2019 at 10:09 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>

> On Mon, Dec 30, 2019 at 6:33 PM Arnd Bergmann <arnd@arndb.de> wrote

> > Is this immediately before/after the syscall instruction or the

> > indirect function call?

>

> It's immediately after/before the call to the VDSO function itself.

> Next I'll try to instrument the VDSO to get closer to that syscall.

>

> I produced those reg dumps by hooking the page fault handler in the

> kernel to print them and then disabling aslr and sticking a

> `*(volatile int *)0 = 0;` in the code. Pretty gnarly.


Here's immediately before and immediately after the syscall asm that
the vdso has in mips/include/asm/vdso/gettimeofday.h. sp and ra are
wrong?

Before:

[    0.546364] $ 0   : 0000000000000000 0000000000000001
0000000000000002 0000000000000000
[    0.546545] $ 4   : 000000007fff4000 0000000000000000
0000000077ff0000 0000000000000406
[    0.546762] $ 8   : 000000007fff5000 0000000000000020
0000000000000002 0000000000000000
[    0.546912] $12   : 0000000000000000 000000000000000a
ffffffff80000000 000000000000006d
[    0.547046] $16   : 000000007fff2e40 000000007fff2e40
0000000010000000 0000000010000000
[    0.547178] $20   : 0000000010000000 0000000010000000
0000000000000000 0000000077ff0000
[    0.547548] $24   : 0000000000000005 0000000000000000
[    0.547743] $28   : 000000007fff5000 000000007fff2df0
0000000000000000 000000007fff550c
[    0.547898] Hi    : 0000000000000000
[    0.548010] Lo    : 0000000000000000
[    0.548175] epc   : 000000007fff5580 0x7fff5580
[    0.548358] ra    : 000000007fff550c 0x7fff550c
[    0.549305] Stack : 0000000000000002 000000007fff2e40
0000000000000002 0000000077f9e80c
[    0.549500]         0000000000000000 0000000000000000
ffffffffffffffff 0000000010000000
[    0.549687]         0000000010019dd0 0000000010000c20
0000000077ff0000 0000000077fa4868
[    0.549951]         0000000377ff19b8 0000000000000000
000000007fff2f04 0000000000000001
[    0.550127]         0000000010000870 0000000077ff0000
0000000077fa4868 0000000077ff19b8
[    0.550277]         0000000077ff7180 0000000077f297ac
7fff2f0c77ff7180 0000000077f29800
[    0.550458]         0000000000000000 000000007fff2f00
0000000077ff19b8 0000000077ff1e30
[    0.550613]         0000000010019dd0 0000000010000dec
0000000010019dd0 0000000010000db0
[    0.550811]         0000000000000000 0000000000000000
000000017fff2fda 000000007fff2fe0
[    0.550957]         7fff2fe700000000 000000217fff5000
0000001000000020 0000000600001000

After:

[    0.577975] $ 0   : 0000000000000000 0000000000000001
0000000000000059 000000007fff5000
[    0.578191] $ 4   : 0000000000000002 000000007fff2e40
0000000077ff0000 0000000000000001
[    0.578392] $ 8   : 0000000000000006 0000000000000020
0000000000000002 0000000000000000
[    0.578611] $12   : 0000000000000000 0000000000001852
ffffffff801560e0 000000000000006d
[    0.578817] $16   : 0000000000000002 000000007fff2e40
0000000010000000 0000000010000000
[    0.579004] $20   : 0000000010000000 0000000010000000
0000000000000000 0000000077ff0000
[    0.579149] $24   : 0000000000000005 0000000000000000
[    0.579375] $28   : 000000007fff5000 000000007fff2de0
0000000000000000 000000007fff551c
[    0.579640] Hi    : 0000000000000000
[    0.579799] Lo    : 0000000000000000
[    0.579974] epc   : 000000007fff55a0 0x7fff55a0
[    0.580134] ra    : 000000007fff551c 0x7fff551c
[    0.581293] Stack : 0000000000000000 0000000077f9e760
0000000000000002 000000007fff2e40
[    0.581456]         0000000077ff0000 0000000077f9e80c
0000000000000000 0000000000000000
[    0.581619]         ffffffffffffffff 0000000010000000
0000000010019dd0 0000000010000c20
[    0.581834]         0000000077ff0000 0000000077fa4868
0000000377ff19b8 0000000000000000
[    0.581985]         000000007fff2f04 0000000000000001
0000000010000870 0000000077ff0000
[    0.582136]         0000000077fa4868 0000000077ff19b8
0000000077ff7180 0000000077f297ac
[    0.582288]         7fff2f0c77ff7180 0000000077f29800
0000000000000000 000000007fff2f00
[    0.582438]         0000000077ff19b8 0000000077ff1e30
0000000010019dd0 0000000010000dec
[    0.582585]         0000000010019dd0 0000000010000db0
0000000000000000 0000000000000000
[    0.582732]         000000017fff2fda 000000007fff2fe0
7fff2fe700000000 000000217fff5000
Jason A. Donenfeld Dec. 31, 2019, 4:14 p.m. UTC | #13
Here's a "one click" reproducer:
https://data.zx2c4.com/mips-musl-libc-weird-crash-time32-compat.tar.xz

Untar that and hit `make -j$(nproc)`, and you'll get a freshly built
and crashing kernel+userland.
Paul Burton Jan. 1, 2020, 4:10 a.m. UTC | #14
Hi Jason,

On Tue, Dec 31, 2019 at 05:14:41PM +0100, Jason A. Donenfeld wrote:
> Here's a "one click" reproducer:

> https://data.zx2c4.com/mips-musl-libc-weird-crash-time32-compat.tar.xz

> 

> Untar that and hit `make -j$(nproc)`, and you'll get a freshly built

> and crashing kernel+userland.


Thanks for the test case. It seems like the VDSO code isn't saving &
restoring $gp/$28, even though it's meant to be callee-saved in both the
n32 & n64 ABIs. With some digging I found that the below seems to
resolve the issue. Could you check whether it works for you?

I'm still not quite sure *why* this happens; perhaps GCC just decides it
doesn't need to save & restore $gp/$28 when it spots that it's being
"used" for __current_thread_info (even though that's never actually
referenced in the VDSO)?

Just moving the declaration of __current_thread_info inside the
current_thread_info() function seems to do the trick too, and is
probably a bit neater.

Thanks,
    Paul

---
diff --git a/arch/mips/include/asm/thread_info.h b/arch/mips/include/asm/thread_info.h
index 4993db40482c..ac33959bbb1f 100644
--- a/arch/mips/include/asm/thread_info.h
+++ b/arch/mips/include/asm/thread_info.h
@@ -50,7 +50,11 @@ struct thread_info {
 }

 /* How to get the thread information struct from C.  */
+#ifdef __VDSO__
+register struct thread_info *__current_thread_info __asm__("$0");
+#else
 register struct thread_info *__current_thread_info __asm__("$28");
+#endif

 static inline struct thread_info *current_thread_info(void)
 {
Paul Burton Jan. 1, 2020, 4:25 a.m. UTC | #15
Hi Jason,

On Tue, Dec 31, 2019 at 08:10:56PM -0800, Paul Burton wrote:
> I'm still not quite sure *why* this happens; perhaps GCC just decides it

> doesn't need to save & restore $gp/$28 when it spots that it's being

> "used" for __current_thread_info (even though that's never actually

> referenced in the VDSO)?


Ah:

> After defining a global register variable, for the current compilation

> unit:

> 

> - If the register is a call-saved register, call ABI is affected: the

>   register will not be restored in function epilogue sequences after

>   the variable has been assigned. Therefore, functions cannot safely

>   return to callers that assume standard ABI.


https://gcc.gnu.org/onlinedocs/gcc/Global-Register-Variables.html

That makes sense then. What doesn't make sense is how this ever
worked. A question for another day...

Thanks,
    Paul
Jason A. Donenfeld Jan. 1, 2020, 9:47 a.m. UTC | #16
On Wed, Jan 1, 2020 at 5:08 AM Paul Burton <paulburton@kernel.org> wrote:
>

> Hi Jason,

>

> On Tue, Dec 31, 2019 at 05:14:41PM +0100, Jason A. Donenfeld wrote:

> > Here's a "one click" reproducer:

> > https://data.zx2c4.com/mips-musl-libc-weird-crash-time32-compat.tar.xz

> >

> > Untar that and hit `make -j$(nproc)`, and you'll get a freshly built

> > and crashing kernel+userland.

>

> Thanks for the test case. It seems like the VDSO code isn't saving &

> restoring $gp/$28, even though it's meant to be callee-saved in both the

> n32 & n64 ABIs. With some digging I found that the below seems to

> resolve the issue. Could you check whether it works for you?

>

> I'm still not quite sure *why* this happens; perhaps GCC just decides it

> doesn't need to save & restore $gp/$28 when it spots that it's being

> "used" for __current_thread_info (even though that's never actually

> referenced in the VDSO)?

>

> Just moving the declaration of __current_thread_info inside the

> current_thread_info() function seems to do the trick too, and is

> probably a bit neater.

>

> Thanks,

>     Paul

>

> ---

> diff --git a/arch/mips/include/asm/thread_info.h b/arch/mips/include/asm/thread_info.h

> index 4993db40482c..ac33959bbb1f 100644

> --- a/arch/mips/include/asm/thread_info.h

> +++ b/arch/mips/include/asm/thread_info.h

> @@ -50,7 +50,11 @@ struct thread_info {

>  }

>

>  /* How to get the thread information struct from C.  */

> +#ifdef __VDSO__

> +register struct thread_info *__current_thread_info __asm__("$0");

> +#else

>  register struct thread_info *__current_thread_info __asm__("$28");

> +#endif

>

>  static inline struct thread_info *current_thread_info(void)

>  {


Holy guacamole, nice catch. That's interesting behavior indeed...

I'll leave it to you to submit for 5.5?
Jason A. Donenfeld Jan. 1, 2020, 9:47 a.m. UTC | #17
On Wed, Jan 1, 2020 at 5:22 AM Paul Burton <paulburton@kernel.org> wrote:
> That makes sense then. What doesn't make sense is how this ever

> worked. A question for another day...


In most other cases, calls to the vdso would be followed in the parent
function immediately by a return, restoring gp then.
diff mbox series

Patch

diff --git a/arch/mips/include/asm/vdso/gettimeofday.h b/arch/mips/include/asm/vdso/gettimeofday.h
index b08825531e9f..7f1aa610e68e 100644
--- a/arch/mips/include/asm/vdso/gettimeofday.h
+++ b/arch/mips/include/asm/vdso/gettimeofday.h
@@ -107,7 +107,7 @@  static __always_inline int clock_getres_fallback(
 	return error ? -ret : ret;
 }
 
-#if _MIPS_SIM != _MIPS_SIM_ABI64
+#if _MIPS_SIM != _MIPS_SIM_ABI64 && defined(CONFIG_COMPAT_32BIT_TIME)
 
 #define VDSO_HAS_32BIT_FALLBACK	1