Message ID | 9f513eef618b6e72a088cc8b2787496f190d1c2d.1600203307.git.luto@kernel.org |
---|---|
State | New |
Series | x86/smap: Fix the smap_save() asm |
On Tue, Sep 15, 2020 at 1:56 PM Andy Lutomirski <luto@kernel.org> wrote:
>
> The old smap_save() code was:
>
>         pushf
>         pop %0
>
> with %0 defined by an "=rm" constraint.  This is fine if the
> compiler picked the register option, but it was incorrect with an
> %rsp-relative memory operand.

It is incorrect because ... (I think mentioning the point about the
red zone would be good, unless there were additional concerns?)

The patch should be fine, so
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
regardless of whether or not you choose to amend the commit message.

I suspect that the use of "r" exclusively without "m" could lead to
register exhaustion (though it's more likely the compiler will spill
some other variable to stack), which is why it's common to use it in
conjunction with "m."  As Bill's patch notes, getting the "m" version
is poor for performance of this pattern, or at least for
native_{save|restore}_fl.  Being able to use the compiler builtins is
another possibility here.

> With some intentional abuse, I can
> get both gcc and clang to generate code along these lines:
>
>         pushfq
>         popq 0x8(%rsp)
>         mov 0x8(%rsp),%rax
>
> which is incorrect and will not work as intended.
>
> Fix it by removing the memory option.  This issue is exacerbated by
> a clang optimization bug:
>
>     https://bugs.llvm.org/show_bug.cgi?id=47530

This is something we should fix.  Bill, James, and I are discussing
this internally.  Thank you for filing a bug; I owe you a beer just
for that.

>
> Fixes: e74deb11931f ("x86/uaccess: Introduce user_access_{save,restore}()")
> Cc: stable@vger.kernel.org
> Reported-by: Bill Wendling <morbo@google.com> # I think

LOL, yes, the comment can be dropped...though I guess someone else may
have reported the problem to Bill?

> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>  arch/x86/include/asm/smap.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
> index 8b58d6975d5d..be6d675ae3ac 100644
> --- a/arch/x86/include/asm/smap.h
> +++ b/arch/x86/include/asm/smap.h
> @@ -61,7 +61,7 @@ static __always_inline unsigned long smap_save(void)
>                       ALTERNATIVE("jmp 1f", "", X86_FEATURE_SMAP)
>                       "pushf; pop %0; " __ASM_CLAC "\n\t"
>                       "1:"
> -                     : "=rm" (flags) : : "memory", "cc");
> +                     : "=r" (flags) : : "memory", "cc");
>
>         return flags;
>  }
> --
> 2.26.2
>

--
Thanks,
~Nick Desaulniers
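For reference, a minimal sketch of the "compiler builtins" alternative Nick
mentions, assuming GCC/Clang's __builtin_ia32_readeflags_u64(). This is only
an illustration (the name smap_save_builtin is made up here), not the patch
under review, and it glosses over the ALTERNATIVE()/X86_FEATURE_SMAP runtime
patching that the real smap_save() performs:

	/*
	 * Hypothetical sketch only: read EFLAGS via the compiler builtin,
	 * then clear AC with the existing clac() helper from asm/smap.h.
	 * Unlike the real smap_save(), this does not skip the work on
	 * non-SMAP CPUs via ALTERNATIVE("jmp 1f", ...).
	 */
	static __always_inline unsigned long smap_save_builtin(void)
	{
		unsigned long flags = __builtin_ia32_readeflags_u64();

		clac();
		return flags;
	}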
On Tue, Sep 15, 2020 at 2:24 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
> On Tue, Sep 15, 2020 at 1:56 PM Andy Lutomirski <luto@kernel.org> wrote:
> > Fixes: e74deb11931f ("x86/uaccess: Introduce user_access_{save,restore}()")
> > Cc: stable@vger.kernel.org
> > Reported-by: Bill Wendling <morbo@google.com> # I think
>
> LOL, yes, the comment can be dropped...though I guess someone else may
> have reported the problem to Bill?
>
I found this instance, but not the general issue. :-)

-bw
> On Sep 15, 2020, at 2:24 PM, Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Tue, Sep 15, 2020 at 1:56 PM Andy Lutomirski <luto@kernel.org> wrote:
>>
>> The old smap_save() code was:
>>
>>         pushf
>>         pop %0
>>
>> with %0 defined by an "=rm" constraint.  This is fine if the
>> compiler picked the register option, but it was incorrect with an
>> %rsp-relative memory operand.
>
> It is incorrect because ... (I think mentioning the point about the
> red zone would be good, unless there were additional concerns?)

This isn’t a red zone issue — it’s a just-plain-wrong issue.  The popq is
storing the result in the wrong place in memory — it’s RSP-relative, but
RSP is whatever the compiler thinks it should be minus 8, because the
compiler doesn’t know that pushfq changed RSP.

>
> This is something we should fix.  Bill, James, and I are discussing
> this internally.  Thank you for filing a bug; I owe you a beer just
> for that.

I’m looking forward to the day that beers can be exchanged in person
again :)

>
>>
>> Fixes: e74deb11931f ("x86/uaccess: Introduce user_access_{save,restore}()")
>> Cc: stable@vger.kernel.org
>> Reported-by: Bill Wendling <morbo@google.com> # I think
>
> LOL, yes, the comment can be dropped...though I guess someone else may
> have reported the problem to Bill?

The “I think” is because I’m not sure whether Bill reported this
particular issue.  But I’m fine with dropping it.
On Tue, Sep 15, 2020 at 4:40 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
> On 16/09/2020 00:11, Andy Lutomirski wrote:
> >> On Sep 15, 2020, at 2:24 PM, Nick Desaulniers <ndesaulniers@google.com> wrote:
> >>
> >> On Tue, Sep 15, 2020 at 1:56 PM Andy Lutomirski <luto@kernel.org> wrote:
> >>> The old smap_save() code was:
> >>>
> >>>         pushf
> >>>         pop %0
> >>>
> >>> with %0 defined by an "=rm" constraint.  This is fine if the
> >>> compiler picked the register option, but it was incorrect with an
> >>> %rsp-relative memory operand.
> >> It is incorrect because ... (I think mentioning the point about the
> >> red zone would be good, unless there were additional concerns?)
> > This isn’t a red zone issue — it’s a just-plain-wrong issue. The popq is storing the result in the wrong place in memory — it’s RSP-relative, but RSP is whatever the compiler thinks it should be minus 8, because the compiler doesn’t know that pushfq changed RSP.
>
> It's worse than that.  Even when stating that %rsp is modified in the
> asm, the generated code sequence is still buggy, for recent Clang and GCC.
>
> https://godbolt.org/z/ccz9v7
>
> It's clearly not safe to ever use memory operands with pushf/popf asm
> fragments.
>
Would this apply to native_save_fl() and native_restore_fl in
arch/x86/include/asm/irqflags.h? It was like that two revisions ago, but
it was changed (back) to "=rm" with a comment about it being safe.

> >> This is something we should fix.  Bill, James, and I are discussing
> >> this internally.  Thank you for filing a bug; I owe you a beer just
> >> for that.
> > I’m looking forward to the day that beers can be exchanged in person again :)
>
> +1 to that.
>
+100

-bw
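For context, the native_save_fl() being asked about in
arch/x86/include/asm/irqflags.h looked roughly like this at the time; this is
reproduced from memory, so treat the exact comment wording as approximate:

	/* arch/x86/include/asm/irqflags.h (approximate) */
	extern __always_inline unsigned long native_save_fl(void)
	{
		unsigned long flags;

		/*
		 * "=rm" is safe here, because "pop" adjusts the stack before
		 * it evaluates its effective address -- this is part of the
		 * documented behavior of the "pop" instruction.
		 */
		asm volatile("# __raw_save_flags\n\t"
			     "pushf ; pop %0"
			     : "=rm" (flags)
			     : /* no input */
			     : "memory");

		return flags;
	}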
On Tue, Sep 15, 2020 at 01:55:58PM -0700, Andy Lutomirski wrote:
> The old smap_save() code was:
>
>         pushf
>         pop %0
>
> with %0 defined by an "=rm" constraint.  This is fine if the
> compiler picked the register option, but it was incorrect with an
> %rsp-relative memory operand.  With some intentional abuse, I can
> get both gcc and clang to generate code along these lines:
>
>         pushfq
>         popq 0x8(%rsp)
>         mov 0x8(%rsp),%rax
>
> which is incorrect and will not work as intended.

We need another constraint :-)

> Fix it by removing the memory option.  This issue is exacerbated by
> a clang optimization bug:
>
>     https://bugs.llvm.org/show_bug.cgi?id=47530
>
> Fixes: e74deb11931f ("x86/uaccess: Introduce user_access_{save,restore}()")
> Cc: stable@vger.kernel.org
> Reported-by: Bill Wendling <morbo@google.com> # I think
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>  arch/x86/include/asm/smap.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
> index 8b58d6975d5d..be6d675ae3ac 100644
> --- a/arch/x86/include/asm/smap.h
> +++ b/arch/x86/include/asm/smap.h
> @@ -61,7 +61,7 @@ static __always_inline unsigned long smap_save(void)
>                       ALTERNATIVE("jmp 1f", "", X86_FEATURE_SMAP)
>                       "pushf; pop %0; " __ASM_CLAC "\n\t"
>                       "1:"
> -                     : "=rm" (flags) : : "memory", "cc");
> +                     : "=r" (flags) : : "memory", "cc");

native_save_fl() has the exact same code; you'll need to fix both.
On Wed, Sep 16, 2020 at 12:40:30AM +0100, Andrew Cooper wrote:
> It's worse than that.  Even when stating that %rsp is modified in the
> asm, the generated code sequence is still buggy, for recent Clang and GCC.
>
> https://godbolt.org/z/ccz9v7
>
> It's clearly not safe to ever use memory operands with pushf/popf asm
> fragments.

So I went and singlestepped your snippet in gdb. And it all seems to
work - it is simply a bit confusing: :-)

eflags         0x246               [ PF ZF IF ]

=> 0x000055555555505d <main+13>:    9c      pushfq

0x7fffffffe440: 0x00007fffffffe540      0x0000000000000000
0x7fffffffe450: 0x0000000000000000      0x00007ffff7e0ecca
0x7fffffffe460: 0x00007fffffffe548      0x00000001ffffe7c9
0x7fffffffe470: 0x0000555555555050      0x00007ffff7e0e8f8
0x7fffffffe480: 0x0000000000000000      0x0c710afd7e78681b

those lines under the "=>" line are the stack contents printed with

  $ x/10gx $sp

Then, we will pop into 0x8(%rsp):

=> 0x55555555505e <main+14>:    popq   0x8(%rsp)

0x7fffffffe438: 0x0000000000000346      0x00007fffffffe540
0x7fffffffe448: 0x0000000000000000      0x0000000000000000
0x7fffffffe458: 0x00007ffff7e0ecca      0x00007fffffffe548
0x7fffffffe468: 0x00000001ffffe7c9      0x0000555555555050
0x7fffffffe478: 0x00007ffff7e0e8f8      0x0000000000000000

Now, POP copies the value pointed to by %rsp, *increments* the stack
pointer and *then* computes the effective address of the operand. It
says so in the SDM too (thanks peterz!):

"If the ESP register is used as a base register for addressing a
destination operand in memory, the POP instruction computes the
effective address of the operand after it increments the ESP register."

*That*s why, FLAGS is in 0x7fffffffe448! which is %rsp + 8.

Basically flags is there *twice* on the stack:

(gdb) x/10x 0x7fffffffe438
0x7fffffffe438: 0x0000000000000346      0x00007fffffffe540
                ^^^^^^^^^^^^^^^^^^
0x7fffffffe448: 0x0000000000000346      0x0000000000000000
                ^^^^^^^^^^^^^^^^^^
0x7fffffffe458: 0x00007ffff7e0ecca      0x00007fffffffe548
0x7fffffffe468: 0x00000001ffffe7c9      0x0000555555555050
0x7fffffffe478: 0x00007ffff7e0e8f8      0x0000000000000000

and now we read the second copy into %rsi.

=> 0x555555555062 <main+18>:    mov    0x8(%rsp),%rsi

0x7fffffffe440: 0x00007fffffffe540      0x0000000000000346
0x7fffffffe450: 0x0000000000000000      0x00007ffff7e0ecca
0x7fffffffe460: 0x00007fffffffe548      0x00000001ffffe7c9
0x7fffffffe470: 0x0000555555555050      0x00007ffff7e0e8f8
0x7fffffffe480: 0x0000000000000000      0x0c710afd7e78681b

Looks like it works as designed.

--
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Sep 16, 2020 at 11:48:42PM +0100, Andrew Cooper wrote:
> Every day is a school day.

Tell me about it...

> This is very definitely one to be filed in obscure x86 corner cases.
>
> The code snippet above is actually wrong for the kernel, as it uses one
> slot of the red-zone.  Recompiling with -mno-red-zone makes something
> which looks safe stack-wise, give or take this behaviour.

Right, we recently disabled red zone in the early decompression stage,
for SEV-ES:

https://git.kernel.org/tip/6ba0efa46047936afa81460489cfd24bc95dd863

I probably should go audit that for similar funnies:

$ objdump -d arch/x86/boot/compressed/vmlinux | grep -E "pop.*\(%[er]?sp"
$

Nope, nothing.

Because building your snippet with:

$ gcc -Wall -O2 -mno-red-zone -o flags{,.c}

still does use that one slot:

0000000000001050 <main>:
    1050:  48 83 ec 18             sub    $0x18,%rsp
    1054:  48 8d 3d a9 0f 00 00    lea    0xfa9(%rip),%rdi   # 2004 <_IO_stdin_used+0x4>
    105b:  31 c0                   xor    %eax,%eax
    105d:  9c                      pushfq
    105e:  8f 44 24 08             popq   0x8(%rsp)
    1062:  48 8b 74 24 08          mov    0x8(%rsp),%rsi

Wonder if that flag -mno-red-zone even does anything...

--
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
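For anyone wanting to reproduce this locally, flags.c is presumably something
along these lines; this is a reconstruction from the disassembly above, not
the original source, and it uses "=m" rather than "=rm" to force the memory
operand that "=rm" merely allows:

	#include <stdio.h>

	int main(void)
	{
		unsigned long flags;

		/*
		 * Force the memory flavour of the operand so the compiler
		 * has to emit "popq <mem>" instead of "popq %reg".
		 */
		asm volatile("pushfq; popq %0" : "=m" (flags) : : "cc");

		printf("flags: %#lx\n", flags);
		return 0;
	}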
On Thu, Sep 17, 2020 at 10:05:37AM +0000, David Laight wrote:
> The 'red-zone' allows leaf function to use stack memory for locals
> that is below (ie the wrong side of) the stack pointer.

After looking at "Figure 3.3: Stack Frame with Base Pointer" in the
x86-64 ABI doc, you're probably right:

	0(%rbp)
	-8(%rbp)
	...
	0(%rsp)
	..		red zone
	-128(%rsp)

i.e., rsp-relative addresses with negative offsets are in the red zone.

So it looks like the compiler actually knows very well what's going on
here and allocates room on the stack for that 0x8(%rsp) slot. Good.

--
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
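To make the red-zone behaviour concrete, here is a hypothetical userspace
leaf function (not taken from the thread); built with gcc -O2 and without
-mno-red-zone, the compiler is free to keep the local below %rsp instead of
adjusting the stack pointer:

	/*
	 * Hypothetical example: in a leaf function the local may live in
	 * the red zone, i.e. at a negative offset from %rsp, because no
	 * stack adjustment is needed for it.
	 */
	long leaf(long x)
	{
		volatile long tmp = x * 3;	/* volatile forces a stack slot */

		return tmp + 1;
	}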
On Thu, Sep 17, 2020 at 02:25:50PM +0000, David Laight wrote:
> I actually wonder if there is any code that really benefits from
> the red-zone.

The kernel has been without a red zone since 2002 at least:

commit 47f16da277d10ef9494f3e9da2a9113bb22bcd75
Author: Andi Kleen <ak@muc.de>
Date:   Tue Feb 12 20:17:35 2002 -0800

    [PATCH] x86_64 merge: arch + asm

    This adds the x86_64 arch and asm directories and a Documentation/x86_64.

...
+CFLAGS += $(shell if $(CC) -mno-red-zone -S -o /dev/null -xc /dev/null >/dev/null 2>&1; then echo "-mno-red-zone"; fi )

Also, from the ABI doc:

"A.2.2 Stack Layout

The Linux kernel may align the end of the input argument area to a
8, instead of 16, byte boundary. It does not honor the red zone (see
section 3.2.2) and therefore this area is not allowed to be used by
kernel code. Kernel code should be compiled by GCC with the option
-mno-red-zone."

so forget the red zone.

--
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Sep 17, 2020 at 7:39 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Thu, Sep 17, 2020 at 02:25:50PM +0000, David Laight wrote:
> > I actually wonder if there is any code that really benefits from
> > the red-zone.
>
> The kernel has been without a red zone since 2002 at least:
>
> commit 47f16da277d10ef9494f3e9da2a9113bb22bcd75
> Author: Andi Kleen <ak@muc.de>
> Date:   Tue Feb 12 20:17:35 2002 -0800
>
>     [PATCH] x86_64 merge: arch + asm
>
>     This adds the x86_64 arch and asm directories and a Documentation/x86_64.
>
> ...
> +CFLAGS += $(shell if $(CC) -mno-red-zone -S -o /dev/null -xc /dev/null >/dev/null 2>&1; then echo "-mno-red-zone"; fi )
>
>
> Also, from the ABI doc:
>
> "A.2.2 Stack Layout
>
> The Linux kernel may align the end of the input argument area to a
> 8, instead of 16, byte boundary. It does not honor the red zone (see
> section 3.2.2) and therefore this area is not allowed to be used by
> kernel code. Kernel code should be compiled by GCC with the option
> -mno-red-zone."
>
> so forget the red zone.
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

Regardless of anything that any docs may or may not say, the kernel
*can't* use a redzone -- an interrupt would overwrite it.
diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index 8b58d6975d5d..be6d675ae3ac 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -61,7 +61,7 @@ static __always_inline unsigned long smap_save(void)
 		      ALTERNATIVE("jmp 1f", "", X86_FEATURE_SMAP)
 		      "pushf; pop %0; " __ASM_CLAC "\n\t"
 		      "1:"
-		      : "=rm" (flags) : : "memory", "cc");
+		      : "=r" (flags) : : "memory", "cc");
 
 	return flags;
 }
The old smap_save() code was:

	pushf
	pop %0

with %0 defined by an "=rm" constraint.  This is fine if the
compiler picked the register option, but it was incorrect with an
%rsp-relative memory operand.  With some intentional abuse, I can
get both gcc and clang to generate code along these lines:

	pushfq
	popq 0x8(%rsp)
	mov 0x8(%rsp),%rax

which is incorrect and will not work as intended.

Fix it by removing the memory option.  This issue is exacerbated by
a clang optimization bug:

  https://bugs.llvm.org/show_bug.cgi?id=47530

Fixes: e74deb11931f ("x86/uaccess: Introduce user_access_{save,restore}()")
Cc: stable@vger.kernel.org
Reported-by: Bill Wendling <morbo@google.com> # I think
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/smap.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)