[RFC,AArch64] Add support for system register based stack protector canary access

Message ID 7a5a57fa-629d-d2ff-6292-e0893647ec8a@arm.com
State New
Headers show
Series
  • [RFC,AArch64] Add support for system register based stack protector canary access
Related show

Commit Message

Ramana Radhakrishnan Dec. 3, 2018, 9:55 a.m.
For quite sometime the kernel guys, (more specifically Ard) have been 
talking about using a system register (sp_el0) and an offset from that 
for a canary based access. This patchset adds support for a new set of
command line options similar to how powerpc has done this.

I don't intend to change the defaults in userland, we've discussed this 
for user-land in the past and as far as glibc and userland is concerned 
we stick to the options as currently existing. The system register 
option is really for the kernel to use along with an offset as they 
control their ABI and this is a decision for them to make.

I did consider sticking this all under a mcmodel=kernel-small option but
thought that would be a bit too aggressive. There is very little error
checking I can do in terms of the system register being used and really
the assembler would barf quite quickly in case things go wrong. I've
managed to rebuild Ard's kernel tree with an additional patch that
I will send to him. I haven't managed to boot this kernel.

There was an additional question asked about the performance 
characteristics of this but it's a security feature and the kernel 
doesn't have the luxury of a hidden symbol. Further since the kernel 
uses sp_el0 for access everywhere and if they choose to use the same 
register I don't think the performance characteristics would be too bad, 
but that's a decision for the kernel folks to make when taking in the 
feature into the kernel.

I still need to add some tests and documentation in invoke.texi but
this is at the stage where it would be nice for some other folks
to look at this.

The difference in code generated is as below.

extern void bar (char *);
int foo (void)
{
   char a[100];
   bar (&a);
}

$GCC -O2  -fstack-protector-strong  vs 
-mstack-protector-guard-reg=sp_el0 -mstack-protector-guard=sysreg 
-mstack-protector-guard-offset=1024 -fstack-protector-strong

	



I will be afk tomorrow and day after but this is to elicit some comments 
and for Ard to try this out with his kernel patches.

Thoughts ?

regards
Ramana

gcc/ChangeLog:

2018-11-23  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

         * config/aarch64/aarch64-opts.h (enum stack_protector_guard): New
         * config/aarch64/aarch64.c (aarch64_override_options_internal): 
Handle
         and put in error checks for stack protector guard options.
         (aarch64_stack_protect_guard): New.
         (TARGET_STACK_PROTECT_GUARD): Define.
         * config/aarch64/aarch64.md (UNSPEC_SSP_SYSREG): New.
         (reg_stack_protect_address<mode>): New.
         (stack_protect_set): Adjust for SSP_GLOBAL.
         (stack_protect_test): Likewise.
         * config/aarch64/aarch64.opt (-mstack-protector-guard-reg): New.
         (-mstack-protector-guard): Likewise.
         (-mstack-protector-guard-offset): Likewise.
         * doc/invoke.texi: Document new AArch64 options.
commit 9febaa23c114e598ddc9a2406ad96d8fa3ebe0c6
Author: Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>
Date:   Mon Nov 19 10:12:12 2018 +0000


diff --git a/gcc/config/aarch64/aarch64-opts.h b/gcc/config/aarch64/aarch64-opts.h
index 7a5c6d7664f..2f06f3e0e5a 100644
--- a/gcc/config/aarch64/aarch64-opts.h
+++ b/gcc/config/aarch64/aarch64-opts.h
@@ -91,4 +91,10 @@ enum aarch64_sve_vector_bits_enum {
   SVE_2048 = 2048
 };
 
+/* Where to get the canary for the stack protector.  */
+enum stack_protector_guard {
+  SSP_SYSREG,			/* per-thread canary in special system register */
+  SSP_GLOBAL			/* global canary */
+};
+
 #endif
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 0d89ba27e4a..a56b2166542 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -10955,6 +10955,41 @@ aarch64_override_options_internal (struct gcc_options *opts)
   if (opts->x_flag_strict_volatile_bitfields < 0 && abi_version_at_least (2))
     opts->x_flag_strict_volatile_bitfields = 1;
 
+  if (aarch64_stack_protector_guard == SSP_GLOBAL
+      && opts->x_aarch64_stack_protector_guard_offset_str)
+    {
+      error ("Incompatible options -mstack-protector-guard=global and"
+	     "-mstack-protector-guard-offset=%qs",
+	     aarch64_stack_protector_guard_offset_str);
+    }
+
+  if (aarch64_stack_protector_guard == SSP_SYSREG
+      && !(opts->x_aarch64_stack_protector_guard_offset_str
+	   && opts->x_aarch64_stack_protector_guard_reg_str))
+    {
+      error ("Both -mstack-protector-guard-offset and "
+	     "-mstack-protector-guard-reg must be used "
+	     "with -mstack-protector-guard=sysreg");
+    }
+
+  if (opts->x_aarch64_stack_protector_guard_reg_str)
+    {
+      if (strlen (opts->x_aarch64_stack_protector_guard_reg_str) > 100)
+	  error ("Specify a system register with a small string length.");
+    }
+
+  if (opts->x_aarch64_stack_protector_guard_offset_str)
+    {
+      char *end;
+      const char *str = aarch64_stack_protector_guard_offset_str;
+      errno = 0;
+      long offs = strtol (aarch64_stack_protector_guard_offset_str, &end, 0);
+      if (!*str || *end || errno)
+	error ("%qs is not a valid offset in %qs", str,
+	       "-mstack-protector-guard-offset=");
+      aarch64_stack_protector_guard_offset = offs;
+    }
+
   initialize_aarch64_code_model (opts);
   initialize_aarch64_tls_size (opts);
 
@@ -17872,8 +17907,24 @@ aarch64_run_selftests (void)
 
 } // namespace selftest
 
+/* Implement TARGET_STACK_PROTECT_GUARD. In case of a
+   global variable based guard use the default else
+   return a null tree.  */
+static tree
+aarch64_stack_protect_guard (void)
+{
+  if (aarch64_stack_protector_guard == SSP_GLOBAL)
+    return default_stack_protect_guard ();
+
+  return NULL_TREE;
+}
+
+
 #endif /* #if CHECKING_P */
 
+#undef TARGET_STACK_PROTECT_GUARD
+#define TARGET_STACK_PROTECT_GUARD aarch64_stack_protect_guard
+
 #undef TARGET_ADDRESS_COST
 #define TARGET_ADDRESS_COST aarch64_address_cost
 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 82af4d47f78..8b0eb9e4382 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -194,6 +194,7 @@
     UNSPEC_UCVTF
     UNSPEC_USHL_2S
     UNSPEC_VSTRUCTDUMMY
+    UNSPEC_SSP_SYSREG
     UNSPEC_SP_SET
     UNSPEC_SP_TEST
     UNSPEC_RSQRT
@@ -6561,13 +6562,46 @@
   ""
 {
   machine_mode mode = GET_MODE (operands[0]);
+  if (aarch64_stack_protector_guard != SSP_GLOBAL)
+  {
+    /* Generate access through the system register.  */
+    rtx tmp_reg = gen_reg_rtx (mode);
+    if (mode == DImode)
+    {
+        emit_insn (gen_reg_stack_protect_address_di (tmp_reg));
+        emit_insn (gen_adddi3 (tmp_reg, tmp_reg,
+			       GEN_INT (aarch64_stack_protector_guard_offset)));
+    }
+    else
+    {
+	emit_insn (gen_reg_stack_protect_address_si (tmp_reg));
+	emit_insn (gen_addsi3 (tmp_reg, tmp_reg,
+			       GEN_INT (aarch64_stack_protector_guard_offset)));
 
+    }
+    operands[1] = gen_rtx_MEM (mode, tmp_reg);
+  }
+  
   emit_insn ((mode == DImode
 	      ? gen_stack_protect_set_di
 	      : gen_stack_protect_set_si) (operands[0], operands[1]));
   DONE;
 })
 
+(define_insn "reg_stack_protect_address_<mode>"
+ [(set (match_operand:PTR 0 "register_operand" "=r")
+       (unspec:PTR [(const_int 0)]
+	UNSPEC_SSP_SYSREG))]
+ "aarch64_stack_protector_guard != SSP_GLOBAL"
+ {
+   char buf[150];
+   snprintf (buf, 150, "mrs\\t%%<w>0, %s",
+	    aarch64_stack_protector_guard_reg_str);
+   output_asm_insn (buf, operands);
+   return "";
+ }
+ [(set_attr "type" "mrs")])
+
 (define_insn "stack_protect_set_<mode>"
   [(set (match_operand:PTR 0 "memory_operand" "=m")
 	(unspec:PTR [(match_operand:PTR 1 "memory_operand" "m")]
@@ -6588,12 +6622,34 @@
   machine_mode mode = GET_MODE (operands[0]);
 
   result = gen_reg_rtx(mode);
+  if (aarch64_stack_protector_guard != SSP_GLOBAL)
+  {
+    /* Generate access through the system register. The
+       sequence we want here is the access
+       of the stack offset to come with
+       mrs scratch_reg, <system_register>
+       add scratch_reg, scratch_reg, :lo12:offset. */
+    rtx tmp_reg = gen_reg_rtx (mode);
+    if (mode == DImode)
+    {
+       emit_insn (gen_reg_stack_protect_address_di (tmp_reg));
+       emit_insn (gen_adddi3 (tmp_reg, tmp_reg,
+       		              GEN_INT (aarch64_stack_protector_guard_offset)));
+    }
+    else
+    {
+	emit_insn (gen_reg_stack_protect_address_si (tmp_reg));
+	emit_insn (gen_addsi3 (tmp_reg, tmp_reg,
+			       GEN_INT (aarch64_stack_protector_guard_offset)));
 
+    }
+    operands[1] = gen_rtx_MEM (mode, tmp_reg);
+  }
   emit_insn ((mode == DImode
-	      ? gen_stack_protect_test_di
-	      : gen_stack_protect_test_si) (result,
-					    operands[0],
-					    operands[1]));
+		  ? gen_stack_protect_test_di
+		  : gen_stack_protect_test_si) (result,
+					        operands[0],
+					        operands[1]));
 
   if (mode == DImode)
     emit_jump_insn (gen_cbranchdi4 (gen_rtx_EQ (VOIDmode, result, const0_rtx),
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index b2e80cbf6f1..1aaf4beb329 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -218,3 +218,33 @@ Enables verbose cost model dumping in the debug dump files.
 mtrack-speculation
 Target Var(aarch64_track_speculation)
 Generate code to track when the CPU might be speculating incorrectly.
+
+mstack-protector-guard=
+Target RejectNegative Joined Enum(stack_protector_guard) Var(aarch64_stack_protector_guard) Init(SSP_GLOBAL)
+Use given stack-protector guard.
+
+Enum
+Name(stack_protector_guard) Type(enum stack_protector_guard)
+Valid arguments to -mstack-protector-guard=:
+
+EnumValue
+Enum(stack_protector_guard) String(sysreg) Value(SSP_SYSREG)
+
+EnumValue
+Enum(stack_protector_guard) String(global) Value(SSP_GLOBAL)
+
+mstack-protector-guard-reg=
+Target Joined RejectNegative String Var(aarch64_stack_protector_guard_reg_str)
+Use the system register specified on the command line as the stack protector
+guard register. This option is for use with fstack-protector-strong and
+not for use in user-land code.
+
+mstack-protector-guard-offset=
+Target Joined RejectNegative String Var(aarch64_stack_protector_guard_offset_str)
+Use an immediate to offset from the stack protector guard register, sp_el0.
+This option is for use with fstack-protector-strong and not for use in
+user-land code.
+
+TargetVariable
+long aarch64_stack_protector_guard_offset = 0
+
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d46ebd02c4e..dbe1ca42be4 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -629,7 +629,9 @@ Objective-C and Objective-C++ Dialects}.
 -mpc-relative-literal-loads @gol
 -msign-return-address=@var{scope} @gol
 -march=@var{name}  -mcpu=@var{name}  -mtune=@var{name}  @gol
--moverride=@var{string}  -mverbose-cost-dump  -mtrack-speculation} 
+-moverride=@var{string}  -mverbose-cost-dump @gol
+-mstack-protector-guard=@var{guard} -mstack-protector-guard-reg=@var{sysreg} @gol
+-mstack-protector-guard-offset=@var{offset} -mtrack-speculation }
 
 @emph{Adapteva Epiphany Options}
 @gccoptlist{-mhalf-reg-file  -mprefer-short-insn-regs @gol
@@ -15450,6 +15452,24 @@ object boundary as described in the architecture specification.
 Omit or keep the frame pointer in leaf functions.  The former behavior is the
 default.
 
+@item -mstack-protector-guard=@var{guard}
+@itemx -mstack-protector-guard-reg=@var{reg}
+@itemx -mstack-protector-guard-offset=@var{offset}
+@opindex mstack-protector-guard
+@opindex mstack-protector-guard-reg
+@opindex mstack-protector-guard-offset
+Generate stack protection code using canary at @var{guard}.  Supported
+locations are @samp{global} for a global canary or @samp{sysreg} for a
+canary in an appropriate system register.
+
+With the latter choice the options
+@option{-mstack-protector-guard-reg=@var{reg}} and
+@option{-mstack-protector-guard-offset=@var{offset}} furthermore specify
+which system register to use as base register for reading the canary,
+and from what offset from that base register. There is no default
+register or offset as this is entirely for use within the Linux
+kernel.
+
 @item -mtls-dialect=desc
 @opindex mtls-dialect=desc
 Use TLS descriptors as the thread-local storage mechanism for dynamic accesses

Comments

Jakub Jelinek Dec. 3, 2018, 9:59 a.m. | #1
On Mon, Dec 03, 2018 at 09:55:36AM +0000, Ramana Radhakrishnan wrote:
> +  if (aarch64_stack_protector_guard == SSP_GLOBAL

> +      && opts->x_aarch64_stack_protector_guard_offset_str)

> +    {

> +      error ("Incompatible options -mstack-protector-guard=global and"


Diagnostic messages shouldn't start with capital letters (3 times in the
patch), unless it is something that is capitalized always, even in the
middle of sentences (say GNU etc.).

	Jakub
Ramana Radhakrishnan Dec. 3, 2018, 10:03 a.m. | #2
On 03/12/2018 09:59, Jakub Jelinek wrote:
> On Mon, Dec 03, 2018 at 09:55:36AM +0000, Ramana Radhakrishnan wrote:

>> +  if (aarch64_stack_protector_guard == SSP_GLOBAL

>> +      && opts->x_aarch64_stack_protector_guard_offset_str)

>> +    {

>> +      error ("Incompatible options -mstack-protector-guard=global and"

> 

> Diagnostic messages shouldn't start with capital letters (3 times in the

> patch), unless it is something that is capitalized always, even in the

> middle of sentences (say GNU etc.).

> 

> 	Jakub

> 

Sorry - will fix in next iteration.

Ramana
Florian Weimer Dec. 3, 2018, 3:30 p.m. | #3
* Ramana Radhakrishnan:

> I don't intend to change the defaults in userland, we've discussed this 

> for user-land in the past and as far as glibc and userland is concerned 

> we stick to the options as currently existing. The system register 

> option is really for the kernel to use along with an offset as they 

> control their ABI and this is a decision for them to make.


For userland, I would like to eventually copy the OpenBSD approach for
architectures which have some form of PC-relative addressing: we can
have multiple random canaries in (RELRO) .rodata in sufficiently close
to the code that needs them (assuming that we have split .rodata).  At
least for x86-64, I expect this to be a small win.  It's also a slight
hardening improvement if the reference canary is not stored in writable
memory.

Thanks,
Florian
Ard Biesheuvel Dec. 3, 2018, 4:39 p.m. | #4
On Mon, 3 Dec 2018 at 10:55, Ramana Radhakrishnan
<Ramana.Radhakrishnan@arm.com> wrote:
>

> For quite sometime the kernel guys, (more specifically Ard) have been

> talking about using a system register (sp_el0) and an offset from that

> for a canary based access. This patchset adds support for a new set of

> command line options similar to how powerpc has done this.

>

> I don't intend to change the defaults in userland, we've discussed this

> for user-land in the past and as far as glibc and userland is concerned

> we stick to the options as currently existing. The system register

> option is really for the kernel to use along with an offset as they

> control their ABI and this is a decision for them to make.

>

> I did consider sticking this all under a mcmodel=kernel-small option but

> thought that would be a bit too aggressive. There is very little error

> checking I can do in terms of the system register being used and really

> the assembler would barf quite quickly in case things go wrong. I've

> managed to rebuild Ard's kernel tree with an additional patch that

> I will send to him. I haven't managed to boot this kernel.

>

> There was an additional question asked about the performance

> characteristics of this but it's a security feature and the kernel

> doesn't have the luxury of a hidden symbol. Further since the kernel

> uses sp_el0 for access everywhere and if they choose to use the same

> register I don't think the performance characteristics would be too bad,

> but that's a decision for the kernel folks to make when taking in the

> feature into the kernel.

>

> I still need to add some tests and documentation in invoke.texi but

> this is at the stage where it would be nice for some other folks

> to look at this.

>

> The difference in code generated is as below.

>

> extern void bar (char *);

> int foo (void)

> {

>    char a[100];

>    bar (&a);

> }

>

> $GCC -O2  -fstack-protector-strong  vs

> -mstack-protector-guard-reg=sp_el0 -mstack-protector-guard=sysreg

> -mstack-protector-guard-offset=1024 -fstack-protector-strong

>

>

> --- tst.s       2018-12-03 09:46:21.174167443 +0000

> +++ tst.s.1     2018-12-03 09:46:03.546257203 +0000

> @@ -15,15 +15,14 @@

>         mov     x29, sp

>         str     x19, [sp, 16]

>         .cfi_offset 19, -128

> -       adrp    x19, __stack_chk_guard

> -       add     x19, x19, :lo12:__stack_chk_guard

> -       ldr     x0, [x19]

> -       str     x0, [sp, 136]

> -       mov     x0,0

> +       mrs     x19, sp_el0

>         add     x0, sp, 32

> +       ldr     x1, [x19, 1024]

> +       str     x1, [sp, 136]

> +       mov     x1,0

>         bl      bar

>         ldr     x0, [sp, 136]

> -       ldr     x1, [x19]

> +       ldr     x1, [x19, 1024]

>         eor     x1, x0, x1

>         cbnz    x1, .L5

>

>

>

>

> I will be afk tomorrow and day after but this is to elicit some comments

> and for Ard to try this out with his kernel patches.

>


Thanks Ramana. I managed to build and run a complete kernel (including
modules) on a bare metal system, and everything works as expected.

The only thing I'd like to confirm with you is the logic wrt the
command line arguments, more specifically, if/when all 3 arguments
have to appear, and whether they are permitted to appear if
-fstack-protector is not set.

This is relevant given that we invoke the compiler in 3 different ways:
- at the configure stage, we invoke the compiler with some/all of
these options to decide whether the feature is supported, but the
actual offset is not known, but also irrelevant
- we invoke the compiler to build the header file that actually gives
us the offset to pass to later invocations
- finally, all kernel objects are built with all 3 arguments passed on
the command line

It looks like your code permits -mstack-protector-guard-reg at any
time, but only permits -mstack-protector-guard-offset if
-mstack-protector-guard is set to sysreg (and thus set explicitly,
since the default is global). Is that intentional? Can we expect this
to remain like that?
Ramana Radhakrishnan Jan. 10, 2019, 10:53 a.m. | #5
On Mon, Dec 3, 2018 at 9:55 AM Ramana Radhakrishnan
<Ramana.Radhakrishnan@arm.com> wrote:
>

> For quite sometime the kernel guys, (more specifically Ard) have been

> talking about using a system register (sp_el0) and an offset from that

> for a canary based access. This patchset adds support for a new set of

> command line options similar to how powerpc has done this.

>

> I don't intend to change the defaults in userland, we've discussed this

> for user-land in the past and as far as glibc and userland is concerned

> we stick to the options as currently existing. The system register

> option is really for the kernel to use along with an offset as they

> control their ABI and this is a decision for them to make.

>

> I did consider sticking this all under a mcmodel=kernel-small option but

> thought that would be a bit too aggressive. There is very little error

> checking I can do in terms of the system register being used and really

> the assembler would barf quite quickly in case things go wrong. I've

> managed to rebuild Ard's kernel tree with an additional patch that

> I will send to him. I haven't managed to boot this kernel.

>

> There was an additional question asked about the performance

> characteristics of this but it's a security feature and the kernel

> doesn't have the luxury of a hidden symbol. Further since the kernel

> uses sp_el0 for access everywhere and if they choose to use the same

> register I don't think the performance characteristics would be too bad,

> but that's a decision for the kernel folks to make when taking in the

> feature into the kernel.

>

> I still need to add some tests and documentation in invoke.texi but

> this is at the stage where it would be nice for some other folks

> to look at this.

>

> The difference in code generated is as below.

>

> extern void bar (char *);

> int foo (void)

> {

>    char a[100];

>    bar (&a);

> }

>

> $GCC -O2  -fstack-protector-strong  vs

> -mstack-protector-guard-reg=sp_el0 -mstack-protector-guard=sysreg

> -mstack-protector-guard-offset=1024 -fstack-protector-strong

>

>

> --- tst.s       2018-12-03 09:46:21.174167443 +0000

> +++ tst.s.1     2018-12-03 09:46:03.546257203 +0000

> @@ -15,15 +15,14 @@

>         mov     x29, sp

>         str     x19, [sp, 16]

>         .cfi_offset 19, -128

> -       adrp    x19, __stack_chk_guard

> -       add     x19, x19, :lo12:__stack_chk_guard

> -       ldr     x0, [x19]

> -       str     x0, [sp, 136]

> -       mov     x0,0

> +       mrs     x19, sp_el0

>         add     x0, sp, 32

> +       ldr     x1, [x19, 1024]

> +       str     x1, [sp, 136]

> +       mov     x1,0

>         bl      bar

>         ldr     x0, [sp, 136]

> -       ldr     x1, [x19]

> +       ldr     x1, [x19, 1024]

>         eor     x1, x0, x1

>         cbnz    x1, .L5

>

>

>

>

> I will be afk tomorrow and day after but this is to elicit some comments

> and for Ard to try this out with his kernel patches.

>

> Thoughts ?

>

> regards

> Ramana

>

> gcc/ChangeLog:

>

> 2018-11-23  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

>

>          * config/aarch64/aarch64-opts.h (enum stack_protector_guard): New

>          * config/aarch64/aarch64.c (aarch64_override_options_internal):

> Handle

>          and put in error checks for stack protector guard options.

>          (aarch64_stack_protect_guard): New.

>          (TARGET_STACK_PROTECT_GUARD): Define.

>          * config/aarch64/aarch64.md (UNSPEC_SSP_SYSREG): New.

>          (reg_stack_protect_address<mode>): New.

>          (stack_protect_set): Adjust for SSP_GLOBAL.

>          (stack_protect_test): Likewise.

>          * config/aarch64/aarch64.opt (-mstack-protector-guard-reg): New.

>          (-mstack-protector-guard): Likewise.

>          (-mstack-protector-guard-offset): Likewise.

>          * doc/invoke.texi: Document new AArch64 options.


Any further thoughts or is it just Jakub's comments that I need to
address on this patch ? It looks like the kernel folks have queued
this for the next kernel release and given this is helping the kernel
with a security feature, can we move this forward ?

Ramana
Jakub Jelinek Jan. 10, 2019, 11:05 a.m. | #6
On Thu, Jan 10, 2019 at 10:53:32AM +0000, Ramana Radhakrishnan wrote:
> > 2018-11-23  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

> >

> >          * config/aarch64/aarch64-opts.h (enum stack_protector_guard): New

> >          * config/aarch64/aarch64.c (aarch64_override_options_internal):

> > Handle

> >          and put in error checks for stack protector guard options.

> >          (aarch64_stack_protect_guard): New.

> >          (TARGET_STACK_PROTECT_GUARD): Define.

> >          * config/aarch64/aarch64.md (UNSPEC_SSP_SYSREG): New.

> >          (reg_stack_protect_address<mode>): New.

> >          (stack_protect_set): Adjust for SSP_GLOBAL.

> >          (stack_protect_test): Likewise.

> >          * config/aarch64/aarch64.opt (-mstack-protector-guard-reg): New.

> >          (-mstack-protector-guard): Likewise.

> >          (-mstack-protector-guard-offset): Likewise.

> >          * doc/invoke.texi: Document new AArch64 options.

> 

> Any further thoughts or is it just Jakub's comments that I need to

> address on this patch ? It looks like the kernel folks have queued

> this for the next kernel release and given this is helping the kernel

> with a security feature, can we move this forward ?


From RM POV this is ok in stage4 if you commit it RSN.
Both x86 and powerpc have -mstack-protector-guard{,-reg,-offset}= options,
x86 even has -mstack-protector-guard-symbol=.  So it would be nice if the
aarch64 options are compatible with those other arches.

Please make sure you don't regress non-glibc SSP support (don't repeat
PR85644/PR86832).

	Jakub
Ramana Radhakrishnan Jan. 10, 2019, 12:51 p.m. | #7
On Thu, Jan 10, 2019 at 11:05 AM Jakub Jelinek <jakub@redhat.com> wrote:
>

> On Thu, Jan 10, 2019 at 10:53:32AM +0000, Ramana Radhakrishnan wrote:

> > > 2018-11-23  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

> > >

> > >          * config/aarch64/aarch64-opts.h (enum stack_protector_guard): New

> > >          * config/aarch64/aarch64.c (aarch64_override_options_internal):

> > > Handle

> > >          and put in error checks for stack protector guard options.

> > >          (aarch64_stack_protect_guard): New.

> > >          (TARGET_STACK_PROTECT_GUARD): Define.

> > >          * config/aarch64/aarch64.md (UNSPEC_SSP_SYSREG): New.

> > >          (reg_stack_protect_address<mode>): New.

> > >          (stack_protect_set): Adjust for SSP_GLOBAL.

> > >          (stack_protect_test): Likewise.

> > >          * config/aarch64/aarch64.opt (-mstack-protector-guard-reg): New.

> > >          (-mstack-protector-guard): Likewise.

> > >          (-mstack-protector-guard-offset): Likewise.

> > >          * doc/invoke.texi: Document new AArch64 options.

> >

> > Any further thoughts or is it just Jakub's comments that I need to

> > address on this patch ? It looks like the kernel folks have queued

> > this for the next kernel release and given this is helping the kernel

> > with a security feature, can we move this forward ?

>

> From RM POV this is ok in stage4 if you commit it RSN.

> Both x86 and powerpc have -mstack-protector-guard{,-reg,-offset}= options,

> x86 even has -mstack-protector-guard-symbol=.  So it would be nice if the

> aarch64 options are compatible with those other arches.

>


Thanks Jakub. I haven't added the -mstack-protector-guard-symbol as
there is no requirement to do so now and I don't want to add an option
that isn't being used. IIRC, the other options seem to be in sync with
x86 and powerpc.

> Please make sure you don't regress non-glibc SSP support (don't repeat

> PR85644/PR86832).

>


That should be ok as I'm not changing any defaults. I would expect
that non-glibc based libraries that support SSP must be mimicking
glibc support for this using the global symbol as there is nothing
special in the backend for this today. I guess there is freebsd as a
non-glibc target or musl that I can look at but I don't expect that to
be an issue.

I'll wait until tomorrow to respin just to see if I can get any
further feedback.

regards
Ramana



>         Jakub
James Greenhalgh Jan. 10, 2019, 3:49 p.m. | #8
On Mon, Dec 03, 2018 at 03:55:36AM -0600, Ramana Radhakrishnan wrote:
> For quite sometime the kernel guys, (more specifically Ard) have been 

> talking about using a system register (sp_el0) and an offset from that 

> for a canary based access. This patchset adds support for a new set of

> command line options similar to how powerpc has done this.

> 

> I don't intend to change the defaults in userland, we've discussed this 

> for user-land in the past and as far as glibc and userland is concerned 

> we stick to the options as currently existing. The system register 

> option is really for the kernel to use along with an offset as they 

> control their ABI and this is a decision for them to make.

> 

> I did consider sticking this all under a mcmodel=kernel-small option but

> thought that would be a bit too aggressive. There is very little error

> checking I can do in terms of the system register being used and really

> the assembler would barf quite quickly in case things go wrong. I've

> managed to rebuild Ard's kernel tree with an additional patch that

> I will send to him. I haven't managed to boot this kernel.

> 

> There was an additional question asked about the performance 

> characteristics of this but it's a security feature and the kernel 

> doesn't have the luxury of a hidden symbol. Further since the kernel 

> uses sp_el0 for access everywhere and if they choose to use the same 

> register I don't think the performance characteristics would be too bad, 

> but that's a decision for the kernel folks to make when taking in the 

> feature into the kernel.

> 

> I still need to add some tests and documentation in invoke.texi but

> this is at the stage where it would be nice for some other folks

> to look at this.

> 

> The difference in code generated is as below.

> 

> extern void bar (char *);

> int foo (void)

> {

>    char a[100];

>    bar (&a);

> }

> 

> $GCC -O2  -fstack-protector-strong  vs 

> -mstack-protector-guard-reg=sp_el0 -mstack-protector-guard=sysreg 

> -mstack-protector-guard-offset=1024 -fstack-protector-strong

> 

> 	

> --- tst.s	2018-12-03 09:46:21.174167443 +0000

> +++ tst.s.1	2018-12-03 09:46:03.546257203 +0000

> @@ -15,15 +15,14 @@

>   	mov	x29, sp

>   	str	x19, [sp, 16]

>   	.cfi_offset 19, -128

> -	adrp	x19, __stack_chk_guard

> -	add	x19, x19, :lo12:__stack_chk_guard

> -	ldr	x0, [x19]

> -	str	x0, [sp, 136]

> -	mov	x0,0

> +	mrs	x19, sp_el0

>   	add	x0, sp, 32

> +	ldr	x1, [x19, 1024]

> +	str	x1, [sp, 136]

> +	mov	x1,0

>   	bl	bar

>   	ldr	x0, [sp, 136]

> -	ldr	x1, [x19]

> +	ldr	x1, [x19, 1024]

>   	eor	x1, x0, x1

>   	cbnz	x1, .L5

> 

> 

> 

> 

> I will be afk tomorrow and day after but this is to elicit some comments 

> and for Ard to try this out with his kernel patches.

> 

> Thoughts ?


I didn't see ananswer on list to Ard's questions about the command-line logic.
Remember to also fix up the error message concerns Florian raised.

That said, if Jakub is happy with this in Stage 4, I am too.

My biggest concern is the -mstack-protector-guard-reg interface, which
is unchecked user input and so opens up nasty ways to force the compiler
towards out of bounds accesses (e.g.
-mstack-protector-guard-reg="What memory is at %10")

Thanks,
James

> 

> regards

> Ramana

> 

> gcc/ChangeLog:

> 

> 2018-11-23  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

> 

>          * config/aarch64/aarch64-opts.h (enum stack_protector_guard): New

>          * config/aarch64/aarch64.c (aarch64_override_options_internal): 

> Handle

>          and put in error checks for stack protector guard options.

>          (aarch64_stack_protect_guard): New.

>          (TARGET_STACK_PROTECT_GUARD): Define.

>          * config/aarch64/aarch64.md (UNSPEC_SSP_SYSREG): New.

>          (reg_stack_protect_address<mode>): New.

>          (stack_protect_set): Adjust for SSP_GLOBAL.

>          (stack_protect_test): Likewise.

>          * config/aarch64/aarch64.opt (-mstack-protector-guard-reg): New.

>          (-mstack-protector-guard): Likewise.

>          (-mstack-protector-guard-offset): Likewise.

>          * doc/invoke.texi: Document new AArch64 options.
Will Deacon Jan. 10, 2019, 3:55 p.m. | #9
On Thu, Jan 10, 2019 at 03:49:27PM +0000, James Greenhalgh wrote:
> On Mon, Dec 03, 2018 at 03:55:36AM -0600, Ramana Radhakrishnan wrote:

> > For quite sometime the kernel guys, (more specifically Ard) have been 

> > talking about using a system register (sp_el0) and an offset from that 

> > for a canary based access. This patchset adds support for a new set of

> > command line options similar to how powerpc has done this.

> > 

> > I don't intend to change the defaults in userland, we've discussed this 

> > for user-land in the past and as far as glibc and userland is concerned 

> > we stick to the options as currently existing. The system register 

> > option is really for the kernel to use along with an offset as they 

> > control their ABI and this is a decision for them to make.

> > 

> > I did consider sticking this all under a mcmodel=kernel-small option but

> > thought that would be a bit too aggressive. There is very little error

> > checking I can do in terms of the system register being used and really

> > the assembler would barf quite quickly in case things go wrong. I've

> > managed to rebuild Ard's kernel tree with an additional patch that

> > I will send to him. I haven't managed to boot this kernel.

> > 

> > There was an additional question asked about the performance 

> > characteristics of this but it's a security feature and the kernel 

> > doesn't have the luxury of a hidden symbol. Further since the kernel 

> > uses sp_el0 for access everywhere and if they choose to use the same 

> > register I don't think the performance characteristics would be too bad, 

> > but that's a decision for the kernel folks to make when taking in the 

> > feature into the kernel.

> > 

> > I still need to add some tests and documentation in invoke.texi but

> > this is at the stage where it would be nice for some other folks

> > to look at this.

> > 

> > The difference in code generated is as below.

> > 

> > extern void bar (char *);

> > int foo (void)

> > {

> >    char a[100];

> >    bar (&a);

> > }

> > 

> > $GCC -O2  -fstack-protector-strong  vs 

> > -mstack-protector-guard-reg=sp_el0 -mstack-protector-guard=sysreg 

> > -mstack-protector-guard-offset=1024 -fstack-protector-strong

> > 

> > 	

> > --- tst.s	2018-12-03 09:46:21.174167443 +0000

> > +++ tst.s.1	2018-12-03 09:46:03.546257203 +0000

> > @@ -15,15 +15,14 @@

> >   	mov	x29, sp

> >   	str	x19, [sp, 16]

> >   	.cfi_offset 19, -128

> > -	adrp	x19, __stack_chk_guard

> > -	add	x19, x19, :lo12:__stack_chk_guard

> > -	ldr	x0, [x19]

> > -	str	x0, [sp, 136]

> > -	mov	x0,0

> > +	mrs	x19, sp_el0

> >   	add	x0, sp, 32

> > +	ldr	x1, [x19, 1024]

> > +	str	x1, [sp, 136]

> > +	mov	x1,0

> >   	bl	bar

> >   	ldr	x0, [sp, 136]

> > -	ldr	x1, [x19]

> > +	ldr	x1, [x19, 1024]

> >   	eor	x1, x0, x1

> >   	cbnz	x1, .L5

> > 

> > 

> > 

> > 

> > I will be afk tomorrow and day after but this is to elicit some comments 

> > and for Ard to try this out with his kernel patches.

> > 

> > Thoughts ?

> 

> I didn't see ananswer on list to Ard's questions about the command-line logic.


FWIW: the kernel-side is now merged upstream in 5.0-rc1:

http://git.kernel.org/linus/0a1213fa7432

where we ended up checking for the presence of all three options to be
on the safe side.

Will
Ramana Radhakrishnan Jan. 10, 2019, 4:49 p.m. | #10
On 10/01/2019 15:49, James Greenhalgh wrote:
> On Mon, Dec 03, 2018 at 03:55:36AM -0600, Ramana Radhakrishnan wrote:

>> For quite sometime the kernel guys, (more specifically Ard) have been

>> talking about using a system register (sp_el0) and an offset from that

>> for a canary based access. This patchset adds support for a new set of

>> command line options similar to how powerpc has done this.

>>

>> I don't intend to change the defaults in userland, we've discussed this

>> for user-land in the past and as far as glibc and userland is concerned

>> we stick to the options as currently existing. The system register

>> option is really for the kernel to use along with an offset as they

>> control their ABI and this is a decision for them to make.

>>

>> I did consider sticking this all under a mcmodel=kernel-small option but

>> thought that would be a bit too aggressive. There is very little error

>> checking I can do in terms of the system register being used and really

>> the assembler would barf quite quickly in case things go wrong. I've

>> managed to rebuild Ard's kernel tree with an additional patch that

>> I will send to him. I haven't managed to boot this kernel.

>>

>> There was an additional question asked about the performance

>> characteristics of this but it's a security feature and the kernel

>> doesn't have the luxury of a hidden symbol. Further since the kernel

>> uses sp_el0 for access everywhere and if they choose to use the same

>> register I don't think the performance characteristics would be too bad,

>> but that's a decision for the kernel folks to make when taking in the

>> feature into the kernel.

>>

>> I still need to add some tests and documentation in invoke.texi but

>> this is at the stage where it would be nice for some other folks

>> to look at this.

>>

>> The difference in code generated is as below.

>>

>> extern void bar (char *);

>> int foo (void)

>> {

>>     char a[100];

>>     bar (&a);

>> }

>>

>> $GCC -O2  -fstack-protector-strong  vs

>> -mstack-protector-guard-reg=sp_el0 -mstack-protector-guard=sysreg

>> -mstack-protector-guard-offset=1024 -fstack-protector-strong

>>

>> 	

>> --- tst.s	2018-12-03 09:46:21.174167443 +0000

>> +++ tst.s.1	2018-12-03 09:46:03.546257203 +0000

>> @@ -15,15 +15,14 @@

>>    	mov	x29, sp

>>    	str	x19, [sp, 16]

>>    	.cfi_offset 19, -128

>> -	adrp	x19, __stack_chk_guard

>> -	add	x19, x19, :lo12:__stack_chk_guard

>> -	ldr	x0, [x19]

>> -	str	x0, [sp, 136]

>> -	mov	x0,0

>> +	mrs	x19, sp_el0

>>    	add	x0, sp, 32

>> +	ldr	x1, [x19, 1024]

>> +	str	x1, [sp, 136]

>> +	mov	x1,0

>>    	bl	bar

>>    	ldr	x0, [sp, 136]

>> -	ldr	x1, [x19]

>> +	ldr	x1, [x19, 1024]

>>    	eor	x1, x0, x1

>>    	cbnz	x1, .L5

>>

>>

>>

>>

>> I will be afk tomorrow and day after but this is to elicit some comments

>> and for Ard to try this out with his kernel patches.

>>

>> Thoughts ?

> 

> I didn't see ananswer on list to Ard's questions about the command-line logic.


Ah I must have missed that - will take that up separately.

> Remember to also fix up the error message concerns Florian raised.

> 



> That said, if Jakub is happy with this in Stage 4, I am too.

> 

> My biggest concern is the -mstack-protector-guard-reg interface, which

> is unchecked user input and so opens up nasty ways to force the compiler

> towards out of bounds accesses (e.g.

> -mstack-protector-guard-reg="What memory is at %10")

> 


-mstack-protector-guard-reg is fine - it's a system register , if the 
assembler doesn't recognize it , it will barf.

-mstack-protector-guard-offset=<offset> I assume is what you are 
concerned about. I don't have a good answer to that one and am going to 
chicken out and say this is the same interface as x86 and power and 
while I accept it's an access to any location, the user can still do 
that with a C program and any arbitrary inline asm :-/



regards
Ramana

> Thanks,

> James

> 

>>

>> regards

>> Ramana

>>

>> gcc/ChangeLog:

>>

>> 2018-11-23  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>

>>

>>           * config/aarch64/aarch64-opts.h (enum stack_protector_guard): New

>>           * config/aarch64/aarch64.c (aarch64_override_options_internal):

>> Handle

>>           and put in error checks for stack protector guard options.

>>           (aarch64_stack_protect_guard): New.

>>           (TARGET_STACK_PROTECT_GUARD): Define.

>>           * config/aarch64/aarch64.md (UNSPEC_SSP_SYSREG): New.

>>           (reg_stack_protect_address<mode>): New.

>>           (stack_protect_set): Adjust for SSP_GLOBAL.

>>           (stack_protect_test): Likewise.

>>           * config/aarch64/aarch64.opt (-mstack-protector-guard-reg): New.

>>           (-mstack-protector-guard): Likewise.

>>           (-mstack-protector-guard-offset): Likewise.

>>           * doc/invoke.texi: Document new AArch64 options.

>
Ramana Radhakrishnan Jan. 10, 2019, 4:53 p.m. | #11
On 03/12/2018 16:39, Ard Biesheuvel wrote:
> On Mon, 3 Dec 2018 at 10:55, Ramana Radhakrishnan

> <Ramana.Radhakrishnan@arm.com> wrote:

>>

>> For quite sometime the kernel guys, (more specifically Ard) have been

>> talking about using a system register (sp_el0) and an offset from that

>> for a canary based access. This patchset adds support for a new set of

>> command line options similar to how powerpc has done this.

>>

>> I don't intend to change the defaults in userland, we've discussed this

>> for user-land in the past and as far as glibc and userland is concerned

>> we stick to the options as currently existing. The system register

>> option is really for the kernel to use along with an offset as they

>> control their ABI and this is a decision for them to make.

>>

>> I did consider sticking this all under a mcmodel=kernel-small option but

>> thought that would be a bit too aggressive. There is very little error

>> checking I can do in terms of the system register being used and really

>> the assembler would barf quite quickly in case things go wrong. I've

>> managed to rebuild Ard's kernel tree with an additional patch that

>> I will send to him. I haven't managed to boot this kernel.

>>

>> There was an additional question asked about the performance

>> characteristics of this but it's a security feature and the kernel

>> doesn't have the luxury of a hidden symbol. Further since the kernel

>> uses sp_el0 for access everywhere and if they choose to use the same

>> register I don't think the performance characteristics would be too bad,

>> but that's a decision for the kernel folks to make when taking in the

>> feature into the kernel.

>>

>> I still need to add some tests and documentation in invoke.texi but

>> this is at the stage where it would be nice for some other folks

>> to look at this.

>>

>> The difference in code generated is as below.

>>

>> extern void bar (char *);

>> int foo (void)

>> {

>>     char a[100];

>>     bar (&a);

>> }

>>

>> $GCC -O2  -fstack-protector-strong  vs

>> -mstack-protector-guard-reg=sp_el0 -mstack-protector-guard=sysreg

>> -mstack-protector-guard-offset=1024 -fstack-protector-strong

>>

>>

>> --- tst.s       2018-12-03 09:46:21.174167443 +0000

>> +++ tst.s.1     2018-12-03 09:46:03.546257203 +0000

>> @@ -15,15 +15,14 @@

>>          mov     x29, sp

>>          str     x19, [sp, 16]

>>          .cfi_offset 19, -128

>> -       adrp    x19, __stack_chk_guard

>> -       add     x19, x19, :lo12:__stack_chk_guard

>> -       ldr     x0, [x19]

>> -       str     x0, [sp, 136]

>> -       mov     x0,0

>> +       mrs     x19, sp_el0

>>          add     x0, sp, 32

>> +       ldr     x1, [x19, 1024]

>> +       str     x1, [sp, 136]

>> +       mov     x1,0

>>          bl      bar

>>          ldr     x0, [sp, 136]

>> -       ldr     x1, [x19]

>> +       ldr     x1, [x19, 1024]

>>          eor     x1, x0, x1

>>          cbnz    x1, .L5

>>

>>

>>

>>

>> I will be afk tomorrow and day after but this is to elicit some comments

>> and for Ard to try this out with his kernel patches.

>>

> 

> Thanks Ramana. I managed to build and run a complete kernel (including

> modules) on a bare metal system, and everything works as expected.

> 

> The only thing I'd like to confirm with you is the logic wrt the

> command line arguments, more specifically, if/when all 3 arguments

> have to appear, and whether they are permitted to appear if

> -fstack-protector is not set.


They are permitted to appear without -fstack-protector even though it 
doesn't make much sense ...

> 

> This is relevant given that we invoke the compiler in 3 different ways:

> - at the configure stage, we invoke the compiler with some/all of

> these options to decide whether the feature is supported, but the

> actual offset is not known, but also irrelevant

> - we invoke the compiler to build the header file that actually gives

> us the offset to pass to later invocations

> - finally, all kernel objects are built with all 3 arguments passed on

> the command line

> 

> It looks like your code permits -mstack-protector-guard-reg at any

> time, but only permits -mstack-protector-guard-offset if

> -mstack-protector-guard is set to sysreg (and thus set explicitly,

> since the default is global). Is that intentional? Can we expect this

> to remain like that?


It doesn't make sense to permit an offset if the stack protector guard 
is a global variable.


If the default changes to sysreg which I doubt, then I would expect 
-mstack-protector-guard-offset to be useable without 
-mstack-protector-guard=sysreg . However changing the default is not 
something I'm sure we have the appetite for yet in user land. The 
decision was made in 2015 that for user land the stack protector guard 
would be a hidden symbol and I expect there to be quite a lot of 
protracted discussion before changing this.


regards
Ramana


>

Patch

--- tst.s	2018-12-03 09:46:21.174167443 +0000
+++ tst.s.1	2018-12-03 09:46:03.546257203 +0000
@@ -15,15 +15,14 @@ 
  	mov	x29, sp
  	str	x19, [sp, 16]
  	.cfi_offset 19, -128
-	adrp	x19, __stack_chk_guard
-	add	x19, x19, :lo12:__stack_chk_guard
-	ldr	x0, [x19]
-	str	x0, [sp, 136]
-	mov	x0,0
+	mrs	x19, sp_el0
  	add	x0, sp, 32
+	ldr	x1, [x19, 1024]
+	str	x1, [sp, 136]
+	mov	x1,0
  	bl	bar
  	ldr	x0, [sp, 136]
-	ldr	x1, [x19]
+	ldr	x1, [x19, 1024]
  	eor	x1, x0, x1
  	cbnz	x1, .L5