diff mbox series

[aarch64] disable shrink wrapping when tracking speculative execution

Message ID 5198b7ac-afa0-193f-6c2c-47feacc91cf2@arm.com
State New
Headers show
Series [aarch64] disable shrink wrapping when tracking speculative execution | expand

Commit Message

Richard Earnshaw (lists) Nov. 2, 2018, 1:38 p.m. UTC
Although there's no fundamental reason why shrink wrapping and
speculation tracking are incompatible, a phase-ordering requirement (we
need to do speculation tracking before the final basic block clean-up)
means that the shrink wrapping pass can undo some of the changes the
speculation tracking pass makes.  The result is that the tracking, while
still safe is less comprehensive than we really want.

So to keep things simple, and because the tracking code is quite
expensive anyway, it seems best to just disable that pass when we are
tracking speculative execution.

	* config/aarch64/aarch64.c (aarch64_override_options): Disable
	shrink-wrapping when -mtrack-speculation.

Committed.

Comments

Richard Biener Nov. 2, 2018, 1:53 p.m. UTC | #1
On Fri, Nov 2, 2018 at 2:38 PM Richard Earnshaw (lists)
<Richard.Earnshaw@arm.com> wrote:
>

> Although there's no fundamental reason why shrink wrapping and

> speculation tracking are incompatible, a phase-ordering requirement (we

> need to do speculation tracking before the final basic block clean-up)

> means that the shrink wrapping pass can undo some of the changes the

> speculation tracking pass makes.  The result is that the tracking, while

> still safe is less comprehensive than we really want.

>

> So to keep things simple, and because the tracking code is quite

> expensive anyway, it seems best to just disable that pass when we are

> tracking speculative execution.


Shouldn't you be able to do this per function at least?

Richard.

>         * config/aarch64/aarch64.c (aarch64_override_options): Disable

>         shrink-wrapping when -mtrack-speculation.

>

> Committed.
Richard Earnshaw (lists) Nov. 2, 2018, 2:04 p.m. UTC | #2
On 02/11/2018 13:53, Richard Biener wrote:
> On Fri, Nov 2, 2018 at 2:38 PM Richard Earnshaw (lists)

> <Richard.Earnshaw@arm.com> wrote:

>>

>> Although there's no fundamental reason why shrink wrapping and

>> speculation tracking are incompatible, a phase-ordering requirement (we

>> need to do speculation tracking before the final basic block clean-up)

>> means that the shrink wrapping pass can undo some of the changes the

>> speculation tracking pass makes.  The result is that the tracking, while

>> still safe is less comprehensive than we really want.

>>

>> So to keep things simple, and because the tracking code is quite

>> expensive anyway, it seems best to just disable that pass when we are

>> tracking speculative execution.

> 

> Shouldn't you be able to do this per function at least?

> 


do what per function?  track speculation?

R.

> Richard.

> 

>>         * config/aarch64/aarch64.c (aarch64_override_options): Disable

>>         shrink-wrapping when -mtrack-speculation.

>>

>> Committed.
Richard Biener Nov. 5, 2018, 10:05 a.m. UTC | #3
On Fri, Nov 2, 2018 at 3:04 PM Richard Earnshaw (lists)
<Richard.Earnshaw@arm.com> wrote:
>

> On 02/11/2018 13:53, Richard Biener wrote:

> > On Fri, Nov 2, 2018 at 2:38 PM Richard Earnshaw (lists)

> > <Richard.Earnshaw@arm.com> wrote:

> >>

> >> Although there's no fundamental reason why shrink wrapping and

> >> speculation tracking are incompatible, a phase-ordering requirement (we

> >> need to do speculation tracking before the final basic block clean-up)

> >> means that the shrink wrapping pass can undo some of the changes the

> >> speculation tracking pass makes.  The result is that the tracking, while

> >> still safe is less comprehensive than we really want.

> >>

> >> So to keep things simple, and because the tracking code is quite

> >> expensive anyway, it seems best to just disable that pass when we are

> >> tracking speculative execution.

> >

> > Shouldn't you be able to do this per function at least?

> >

>

> do what per function?  track speculation?


disable shrink-wrapping only when any speculation was there
(this is about __bultin_speculation_safe_value, no?)

Richard.

> R.

>

> > Richard.

> >

> >>         * config/aarch64/aarch64.c (aarch64_override_options): Disable

> >>         shrink-wrapping when -mtrack-speculation.

> >>

> >> Committed.

>
Richard Earnshaw (lists) Nov. 5, 2018, 10:09 a.m. UTC | #4
On 05/11/2018 10:05, Richard Biener wrote:
> On Fri, Nov 2, 2018 at 3:04 PM Richard Earnshaw (lists)

> <Richard.Earnshaw@arm.com> wrote:

>>

>> On 02/11/2018 13:53, Richard Biener wrote:

>>> On Fri, Nov 2, 2018 at 2:38 PM Richard Earnshaw (lists)

>>> <Richard.Earnshaw@arm.com> wrote:

>>>>

>>>> Although there's no fundamental reason why shrink wrapping and

>>>> speculation tracking are incompatible, a phase-ordering requirement (we

>>>> need to do speculation tracking before the final basic block clean-up)

>>>> means that the shrink wrapping pass can undo some of the changes the

>>>> speculation tracking pass makes.  The result is that the tracking, while

>>>> still safe is less comprehensive than we really want.

>>>>

>>>> So to keep things simple, and because the tracking code is quite

>>>> expensive anyway, it seems best to just disable that pass when we are

>>>> tracking speculative execution.

>>>

>>> Shouldn't you be able to do this per function at least?

>>>

>>

>> do what per function?  track speculation?

> 

> disable shrink-wrapping only when any speculation was there

> (this is about __bultin_speculation_safe_value, no?)

> 


Only indirectly.  This is about the tracking code that tracks
conditional branches and propagates that information through call/return
sequences.  Shrink wrapping messes with the prologue/epilogue sequences
after the speculation tracking pass has run and unknowingly deletes some
of the additional code that was previously inserted by the tracking pass.

R.

> Richard.

> 

>> R.

>>

>>> Richard.

>>>

>>>>         * config/aarch64/aarch64.c (aarch64_override_options): Disable

>>>>         shrink-wrapping when -mtrack-speculation.

>>>>

>>>> Committed.

>>
Segher Boessenkool Nov. 6, 2018, 1:40 a.m. UTC | #5
Hi Richard,

On Mon, Nov 05, 2018 at 10:09:30AM +0000, Richard Earnshaw (lists) wrote:
> >>> Shouldn't you be able to do this per function at least?

> >>

> >> do what per function?  track speculation?

> > 

> > disable shrink-wrapping only when any speculation was there

> > (this is about __bultin_speculation_safe_value, no?)

> 

> Only indirectly.  This is about the tracking code that tracks

> conditional branches and propagates that information through call/return

> sequences.  Shrink wrapping messes with the prologue/epilogue sequences

> after the speculation tracking pass has run and unknowingly deletes some

> of the additional code that was previously inserted by the tracking pass.


Do you have an example of this?  Shrink-wrapping does not generally
delete any code.


Segher
Richard Earnshaw (lists) Nov. 6, 2018, 11:46 a.m. UTC | #6
On 06/11/2018 01:40, Segher Boessenkool wrote:
> Hi Richard,

> 

> On Mon, Nov 05, 2018 at 10:09:30AM +0000, Richard Earnshaw (lists) wrote:

>>>>> Shouldn't you be able to do this per function at least?

>>>>

>>>> do what per function?  track speculation?

>>>

>>> disable shrink-wrapping only when any speculation was there

>>> (this is about __bultin_speculation_safe_value, no?)

>>

>> Only indirectly.  This is about the tracking code that tracks

>> conditional branches and propagates that information through call/return

>> sequences.  Shrink wrapping messes with the prologue/epilogue sequences

>> after the speculation tracking pass has run and unknowingly deletes some

>> of the additional code that was previously inserted by the tracking pass.

> 

> Do you have an example of this?  Shrink-wrapping does not generally

> delete any code.

> 


Well it generates new 'light-weight' prologue and epilogue sequences for
the 'shrunk' code path that lack the establishment of the tracker
register and doesn't know how to move the existing sequence to the new
entry sequence.

Consider this (trivial shrink-wrap case).

int f ();

int g (int a)
{
  if (a > 3)
    return f() + 4;
  return a;
}

Without speculation tracking we get (sorry, aarch64, hope you can follow
this):

g:
        cmp     w0, 3
        bgt     .L8	// a > 3
        ret
        .p2align 2
.L8:
        stp     x29, x30, [sp, -16]!
        mov     x29, sp
        bl      f		// f()
        add     w0, w0, 4	// + 4
        ldp     x29, x30, [sp], 16
        ret

With shrink-wrapping and speculation both enabled, we get the following
code:

g:
        cmp     sp, 0
        csetm   x15, ne  // Establish tracker - OK
        cmp     w0, 3
        bgt     .L8
	// tracker not updated, speculation state not re-encoded in SP
        ret
        .p2align 2
.L8:
        stp     x29, x30, [sp, -16]!
        csel    x15, x15, xzr, gt  // tracker updated for branch
        mov     x14, sp
        mov     x29, sp
        and     x14, x14, x15	  // Encode speculation status in SP
        mov     sp, x14
        bl      f
        add     w0, w0, 4
        cmp     sp, 0		 // extract speculation status from call
        ldp     x29, x30, [sp], 16
        csetm   x15, ne
        mov     x14, sp
        and     x14, x14, x15   // And re-encode it in return sp
        mov     sp, x14
        ret

So although this code executes correctly from an architectural
perspective, if the early return path is taken speclatively, the caller
receives incomplete information about the speculation that has taken place.

And if we disable shrink wrapping, then obviously things work as
originally intended:

g:
        cmp     sp, 0
        stp     x29, x30, [sp, -16]!
        csetm   x15, ne		// establish tracker
        mov     x29, sp
        cmp     w0, 3
        bgt     .L5
        ldp     x29, x30, [sp], 16
        csel    x15, x15, xzr, le  // Update tracker for branch
        mov     x14, sp
        and     x14, x14, x15
        mov     sp, x14		// tracking encoded into SP for return
        ret
        .p2align 2
.L5:
        csel    x15, x15, xzr, gt  // As above.
        mov     x14, sp
        and     x14, x14, x15
        mov     sp, x14
        bl      f
        add     w0, w0, 4
        cmp     sp, 0
        ldp     x29, x30, [sp], 16
        csetm   x15, ne
        mov     x14, sp
        and     x14, x14, x15
        mov     sp, x14
        ret

I'm not asking that shrink wrapping be updated to handle all this; in
fact, I'm not sure it's that easy to do as the branch patterns and
simple-return patterns aren't set up to handle this.

R.

> 

> Segher

>
Segher Boessenkool Nov. 6, 2018, 6:18 p.m. UTC | #7
Hi Richard,

On Tue, Nov 06, 2018 at 11:46:53AM +0000, Richard Earnshaw (lists) wrote:
> On 06/11/2018 01:40, Segher Boessenkool wrote:

> > Hi Richard,

> > 

> > On Mon, Nov 05, 2018 at 10:09:30AM +0000, Richard Earnshaw (lists) wrote:

> >>>>> Shouldn't you be able to do this per function at least?

> >>>>

> >>>> do what per function?  track speculation?

> >>>

> >>> disable shrink-wrapping only when any speculation was there

> >>> (this is about __bultin_speculation_safe_value, no?)

> >>

> >> Only indirectly.  This is about the tracking code that tracks

> >> conditional branches and propagates that information through call/return

> >> sequences.  Shrink wrapping messes with the prologue/epilogue sequences

> >> after the speculation tracking pass has run and unknowingly deletes some

> >> of the additional code that was previously inserted by the tracking pass.

> > 

> > Do you have an example of this?  Shrink-wrapping does not generally

> > delete any code.

> > 

> 

> Well it generates new 'light-weight' prologue and epilogue sequences for

> the 'shrunk' code path that lack the establishment of the tracker

> register and doesn't know how to move the existing sequence to the new

> entry sequence.


Ah, so the shrink-wrapping code is not deleting anything at all (just
not adding it).  Gotcha :-)

[ snip example code; thanks, that helped ]

> I'm not asking that shrink wrapping be updated to handle all this; in

> fact, I'm not sure it's that easy to do as the branch patterns and

> simple-return patterns aren't set up to handle this.


One thing you could do is make shrink-wrap aware what part of the code
needs the speculation tracking parts of the prologue.  You could do this
by making a separate shrink-wrapping component for it, or you can do it
by marking the places needing it as needing the full prologue, e.g. by
emitting a fake call into it (and not outputting any code for that call).
The latter does cause a stack frame to be emitted even when it wouldn't
otherwise, unfortunately.  The separate shrink-wrapping approach should
work beautifully as far as I see.


Segher
Richard Earnshaw (lists) Nov. 6, 2018, 7:43 p.m. UTC | #8
On 06/11/2018 18:18, Segher Boessenkool wrote:
> Hi Richard,

> 

> On Tue, Nov 06, 2018 at 11:46:53AM +0000, Richard Earnshaw (lists) wrote:

>> On 06/11/2018 01:40, Segher Boessenkool wrote:

>>> Hi Richard,

>>>

>>> On Mon, Nov 05, 2018 at 10:09:30AM +0000, Richard Earnshaw (lists) wrote:

>>>>>>> Shouldn't you be able to do this per function at least?

>>>>>>

>>>>>> do what per function?  track speculation?

>>>>>

>>>>> disable shrink-wrapping only when any speculation was there

>>>>> (this is about __bultin_speculation_safe_value, no?)

>>>>

>>>> Only indirectly.  This is about the tracking code that tracks

>>>> conditional branches and propagates that information through call/return

>>>> sequences.  Shrink wrapping messes with the prologue/epilogue sequences

>>>> after the speculation tracking pass has run and unknowingly deletes some

>>>> of the additional code that was previously inserted by the tracking pass.

>>>

>>> Do you have an example of this?  Shrink-wrapping does not generally

>>> delete any code.

>>>

>>

>> Well it generates new 'light-weight' prologue and epilogue sequences for

>> the 'shrunk' code path that lack the establishment of the tracker

>> register and doesn't know how to move the existing sequence to the new

>> entry sequence.

> 

> Ah, so the shrink-wrapping code is not deleting anything at all (just

> not adding it).  Gotcha :-)


Well.... you could argue that it deleted the tracker update for the case
where the branch was not taken, and it also deleted the part of the
prologue where the tracker state was restored into SP before the return.
 But I'm being picky... :-)

> 

> [ snip example code; thanks, that helped ]

> 

>> I'm not asking that shrink wrapping be updated to handle all this; in

>> fact, I'm not sure it's that easy to do as the branch patterns and

>> simple-return patterns aren't set up to handle this.

> 

> One thing you could do is make shrink-wrap aware what part of the code

> needs the speculation tracking parts of the prologue.  You could do this

> by making a separate shrink-wrapping component for it, or you can do it

> by marking the places needing it as needing the full prologue, e.g. by

> emitting a fake call into it (and not outputting any code for that call).

> The latter does cause a stack frame to be emitted even when it wouldn't

> otherwise, unfortunately.  The separate shrink-wrapping approach should

> work beautifully as far as I see.

> 

> 


There are number of optimizations that are worth investigation with the
tracking support; but whether they'll notably improve performance I'm
not sure.  Tracking just just expensive and the main problem is the
serialization of the state, which limits the core's ability to reorder
stuff internally.

R.
Richard Earnshaw (lists) Nov. 6, 2018, 7:45 p.m. UTC | #9
On 06/11/2018 19:43, Richard Earnshaw (lists) wrote:
> On 06/11/2018 18:18, Segher Boessenkool wrote:

>> Hi Richard,

>>

>> On Tue, Nov 06, 2018 at 11:46:53AM +0000, Richard Earnshaw (lists) wrote:

>>> On 06/11/2018 01:40, Segher Boessenkool wrote:

>>>> Hi Richard,

>>>>

>>>> On Mon, Nov 05, 2018 at 10:09:30AM +0000, Richard Earnshaw (lists) wrote:

>>>>>>>> Shouldn't you be able to do this per function at least?

>>>>>>>

>>>>>>> do what per function?  track speculation?

>>>>>>

>>>>>> disable shrink-wrapping only when any speculation was there

>>>>>> (this is about __bultin_speculation_safe_value, no?)

>>>>>

>>>>> Only indirectly.  This is about the tracking code that tracks

>>>>> conditional branches and propagates that information through call/return

>>>>> sequences.  Shrink wrapping messes with the prologue/epilogue sequences

>>>>> after the speculation tracking pass has run and unknowingly deletes some

>>>>> of the additional code that was previously inserted by the tracking pass.

>>>>

>>>> Do you have an example of this?  Shrink-wrapping does not generally

>>>> delete any code.

>>>>

>>>

>>> Well it generates new 'light-weight' prologue and epilogue sequences for

>>> the 'shrunk' code path that lack the establishment of the tracker

>>> register and doesn't know how to move the existing sequence to the new

>>> entry sequence.

>>

>> Ah, so the shrink-wrapping code is not deleting anything at all (just

>> not adding it).  Gotcha :-)

> 

> Well.... you could argue that it deleted the tracker update for the case

> where the branch was not taken, and it also deleted the part of the

> prologue where the tracker state was restored into SP before the return.


Duh! epilogue, of course.

R.

>  But I'm being picky... :-)

> 

>>

>> [ snip example code; thanks, that helped ]

>>

>>> I'm not asking that shrink wrapping be updated to handle all this; in

>>> fact, I'm not sure it's that easy to do as the branch patterns and

>>> simple-return patterns aren't set up to handle this.

>>

>> One thing you could do is make shrink-wrap aware what part of the code

>> needs the speculation tracking parts of the prologue.  You could do this

>> by making a separate shrink-wrapping component for it, or you can do it

>> by marking the places needing it as needing the full prologue, e.g. by

>> emitting a fake call into it (and not outputting any code for that call).

>> The latter does cause a stack frame to be emitted even when it wouldn't

>> otherwise, unfortunately.  The separate shrink-wrapping approach should

>> work beautifully as far as I see.

>>

>>

> 

> There are number of optimizations that are worth investigation with the

> tracking support; but whether they'll notably improve performance I'm

> not sure.  Tracking just just expensive and the main problem is the

> serialization of the state, which limits the core's ability to reorder

> stuff internally.

> 

> R.

> 

>
Segher Boessenkool Nov. 6, 2018, 8:18 p.m. UTC | #10
On Tue, Nov 06, 2018 at 07:43:36PM +0000, Richard Earnshaw (lists) wrote:
> On 06/11/2018 18:18, Segher Boessenkool wrote:

> > On Tue, Nov 06, 2018 at 11:46:53AM +0000, Richard Earnshaw (lists) wrote:

> >> Well it generates new 'light-weight' prologue and epilogue sequences for

> >> the 'shrunk' code path that lack the establishment of the tracker

> >> register and doesn't know how to move the existing sequence to the new

> >> entry sequence.

> > 

> > Ah, so the shrink-wrapping code is not deleting anything at all (just

> > not adding it).  Gotcha :-)

> 

> Well.... you could argue that it deleted the tracker update for the case

> where the branch was not taken, and it also deleted the part of the

> prologue where the tracker state was restored into SP before the return.

>  But I'm being picky... :-)


When I say "deleted" I mean "deleted RTL code that was actually there".
You seem to mean "prevented it from being created later"?

What I'm after is, if the shrink-wrapping code is deleting RTL it has
no business touching, that sounds like a serious bug.

> > [ snip example code; thanks, that helped ]

> > 

> >> I'm not asking that shrink wrapping be updated to handle all this; in

> >> fact, I'm not sure it's that easy to do as the branch patterns and

> >> simple-return patterns aren't set up to handle this.

> > 

> > One thing you could do is make shrink-wrap aware what part of the code

> > needs the speculation tracking parts of the prologue.  You could do this

> > by making a separate shrink-wrapping component for it, or you can do it

> > by marking the places needing it as needing the full prologue, e.g. by

> > emitting a fake call into it (and not outputting any code for that call).

> > The latter does cause a stack frame to be emitted even when it wouldn't

> > otherwise, unfortunately.  The separate shrink-wrapping approach should

> > work beautifully as far as I see.

> 

> There are number of optimizations that are worth investigation with the

> tracking support; but whether they'll notably improve performance I'm

> not sure.  Tracking just just expensive and the main problem is the

> serialization of the state, which limits the core's ability to reorder

> stuff internally.


Yeah, it will be seriously expensive always.  If people still use this
in production code you really _do_ want to optimise it.  If that helps
measurably, anyway.


Segher
Richard Earnshaw (lists) Nov. 6, 2018, 10:33 p.m. UTC | #11
On 06/11/2018 20:18, Segher Boessenkool wrote:
> On Tue, Nov 06, 2018 at 07:43:36PM +0000, Richard Earnshaw (lists) wrote:

>> On 06/11/2018 18:18, Segher Boessenkool wrote:

>>> On Tue, Nov 06, 2018 at 11:46:53AM +0000, Richard Earnshaw (lists) wrote:

>>>> Well it generates new 'light-weight' prologue and epilogue sequences for

>>>> the 'shrunk' code path that lack the establishment of the tracker

>>>> register and doesn't know how to move the existing sequence to the new

>>>> entry sequence.

>>>

>>> Ah, so the shrink-wrapping code is not deleting anything at all (just

>>> not adding it).  Gotcha :-)

>>

>> Well.... you could argue that it deleted the tracker update for the case

>> where the branch was not taken, and it also deleted the part of the

>> prologue where the tracker state was restored into SP before the return.

>>  But I'm being picky... :-)

> 

> When I say "deleted" I mean "deleted RTL code that was actually there".

> You seem to mean "prevented it from being created later"?

> 

> What I'm after is, if the shrink-wrapping code is deleting RTL it has

> no business touching, that sounds like a serious bug.


Well it has 'deleted' the update of the tracker register after the
conditional branch leading directly to the return insn.  But it's
possible that what has happened is that the use of the tracker variable
has been deleted (not re-emitted for the shrunk-wrap return sequence)
and thus another optimization has deleted the update as being dead.  I
haven't checked the rtl output directly to see how this is happening.

R.

> 

>>> [ snip example code; thanks, that helped ]

>>>

>>>> I'm not asking that shrink wrapping be updated to handle all this; in

>>>> fact, I'm not sure it's that easy to do as the branch patterns and

>>>> simple-return patterns aren't set up to handle this.

>>>

>>> One thing you could do is make shrink-wrap aware what part of the code

>>> needs the speculation tracking parts of the prologue.  You could do this

>>> by making a separate shrink-wrapping component for it, or you can do it

>>> by marking the places needing it as needing the full prologue, e.g. by

>>> emitting a fake call into it (and not outputting any code for that call).

>>> The latter does cause a stack frame to be emitted even when it wouldn't

>>> otherwise, unfortunately.  The separate shrink-wrapping approach should

>>> work beautifully as far as I see.

>>

>> There are number of optimizations that are worth investigation with the

>> tracking support; but whether they'll notably improve performance I'm

>> not sure.  Tracking just just expensive and the main problem is the

>> serialization of the state, which limits the core's ability to reorder

>> stuff internally.

> 

> Yeah, it will be seriously expensive always.  If people still use this

> in production code you really _do_ want to optimise it.  If that helps

> measurably, anyway.

> 

> 

> Segher

>
diff mbox series

Patch

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 54f57463e97..d356fa37823 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11346,6 +11346,12 @@  aarch64_override_options (void)
        || (aarch64_arch_string && valid_arch))
     gcc_assert (explicit_arch != aarch64_no_arch);
 
+  /* The pass to insert speculation tracking runs before
+     shrink-wrapping and the latter does not know how to update the
+     tracking status.  So disable it in this case.  */
+  if (aarch64_track_speculation)
+    flag_shrink_wrap = 0;
+
   aarch64_override_options_internal (&global_options);
 
   /* Save these options as the default ones in case we push and pop them later