diff mbox

[v2,9/9] target-arm: Add WFx instruction trap support

Message ID 1429722561-12651-10-git-send-email-greg.bellows@linaro.org
State New
Headers show

Commit Message

Greg Bellows April 22, 2015, 5:09 p.m. UTC
Add support for trapping WFI and WFE instructions to the proper EL when
SCTLR/SCR/HCR settings apply.

Signed-off-by: Greg Bellows <greg.bellows@linaro.org>

---

v1 -> v2
- Replace check loop with simpler if checks.
- Changed WFx syncdrome function to take bool
- Changed return of uint32_t to int
- Added cdditional comments.
---
 target-arm/internals.h |  2 +-
 target-arm/op_helper.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 74 insertions(+), 4 deletions(-)

Comments

Peter Maydell April 23, 2015, 7:59 a.m. UTC | #1
On 23 April 2015 at 03:49, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
> On Wed, Apr 22, 2015 at 12:09:21PM -0500, Greg Bellows wrote:
>>  void HELPER(wfe)(CPUARMState *env)
>>  {
>>      CPUState *cs = CPU(arm_env_get_cpu(env));
>> +    int target_el = check_wfx_trap(env, true);
>>
>>      /* Don't actually halt the CPU, just yield back to top
>>       * level loop
>>       */
>> -    cs->exception_index = EXCP_YIELD;
>> +    if (target_el) {
>> +        cs->exception_index = EXCP_UDEF;
>> +        env->exception.syndrome = syn_wfx(1, 0xe, true);
>> +        env->exception.target_el = target_el;
>> +        env->pc -= 4;
>> +    } else {
>> +        cs->exception_index = EXCP_YIELD;
>> +    }
>>      cpu_loop_exit(cs);
>>  }
>
>
> Hi Greg,
>
> Before trapping, don't you need to check that the CPU has no actual work?
> e.g:
> if (!cc->has_work(cs) && ..)

We don't track WFE-wakeup events, so we must always trap. (Well,
I suppose strictly we could also go for "never trap" on the basis
that really our implementation here is pretty much a NOP, but I
think always-trap is closer to reasonable.)

In theory you could maybe check has_work() for the WFI case,
since doing an EXCP_HLT really should cause us to stop until
has_work is true, but it seems a bit fragile -- could we really
guarantee that nothing would change between this point and
when we went back through the main loop that would change
whether has_work evaluates true or not? I think that it's better
there too to just always take the trap: setting EXCP_HLT is our
"going into a low power state" and so we should take the trap
if we would otherwise have done that.

> I think you can still EXCP_YIELD the core if it has work though
> as it's probably a good place to reschedule other CPUs in our
> single-threaded SMP model.

That is indeed the only reason we do anything with WFE at all
(it used to be a no-op, but yielding gives better performance).

-- PMM
Peter Maydell April 23, 2015, 10:57 a.m. UTC | #2
On 23 April 2015 at 11:39, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
>
> On 23/04/2015 6:00 pm, "Peter Maydell" <peter.maydell@linaro.org> wrote:
>> In theory you could maybe check has_work() for the WFI case,
>> since doing an EXCP_HLT really should cause us to stop until
>> has_work is true, but it seems a bit fragile -- could we really
>> guarantee that nothing would change between this point and
>> when we went back through the main loop that would change
>> whether has_work evaluates true or not? I think that it's better
>> there too to just always take the trap: setting EXCP_HLT is our
>> "going into a low power state" and so we should take the trap
>> if we would otherwise have done that.
>
> I think functional wise we are OK.
> The implementation can AFAIK always choose to nop for whatever reason (e.g
> has_work()). Only when we choose to enter low power, the trap comes into
> play.

Ah, so in helper_wfi() do something like

    if (!has_work()) {
       if (trapping wfi) {
           EXCP_UDEF code;
       } else {
           EXCP_HALT code;
       }
    }
    /* otherwise just return, making this WFI a nop */

?

I think that would work.

> Maybe wfe is the most problematic one because it fires more frequently and
> often when has_work() is true?

Yes, I think we should start by not trapping on WFE and then look
at how good/bad perf is.

-- PMM
Peter Maydell April 23, 2015, 11:28 a.m. UTC | #3
On 23 April 2015 at 12:24, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
> Maybe we can consider YIELD instead of NOP when has_work() is true as a WFI
> is probably a good hint from guests to reschedule QEMU CPUs.

That I'm not convinced about. If we have a pending interrupt then
our best bet is probably to take it immediately on this CPU, not
yield to another CPU and take the interrupt when we eventually
get control back.

-- PMM
Greg Bellows April 23, 2015, 2:26 p.m. UTC | #4
On Thu, Apr 23, 2015 at 6:34 AM, Edgar E. Iglesias <edgar.iglesias@gmail.com
> wrote:

> On Thu, Apr 23, 2015 at 12:28:43PM +0100, Peter Maydell wrote:
> > On 23 April 2015 at 12:24, Edgar E. Iglesias <edgar.iglesias@gmail.com>
> wrote:
> > > Maybe we can consider YIELD instead of NOP when has_work() is true as
> a WFI
> > > is probably a good hint from guests to reschedule QEMU CPUs.
> >
> > That I'm not convinced about. If we have a pending interrupt then
> > our best bet is probably to take it immediately on this CPU, not
> > yield to another CPU and take the interrupt when we eventually
> > get control back.
>
> Yeah, true. It's actually a very bad YIELD point when has_work() is true
> for a WFI.
>
> For WFE too I guess, when interrupts are unmasked.
>
> Good point.
>
> Cheers,
> Edgar
>


Good catch Edgar, we shouldn't trap if we are "going into a low-power
state".

It sounds like we arrived at the following:

wfi-  check has_work before taking either the UDEF or HLT exception hereby
making it a noop.

wfe - Never trap, making it always a noop.  This includes undoing the
existing YIELD exception.

Sound right?
Peter Maydell April 23, 2015, 2:30 p.m. UTC | #5
On 23 April 2015 at 15:26, Greg Bellows <greg.bellows@linaro.org> wrote:
> Good catch Edgar, we shouldn't trap if we are "going into a low-power
> state".

You mean "if we are not going into a low-power state".

> It sounds like we arrived at the following:
>
> wfi-  check has_work before taking either the UDEF or HLT exception hereby
> making it a noop.

This is OK (as per the code sketch in my earlier message).

> wfe - Never trap, making it always a noop.  This includes undoing the
> existing YIELD exception.

No, the YIELD behaviour is correct for WFE. This is important
to avoid us spinning for the rest of this vcpu's timeslice
(never making progress because the other vcpu which could
unblock us from this spinlock doesn't run). But we don't ever
want to trap to EL2.

-- PMM
Greg Bellows April 23, 2015, 2:41 p.m. UTC | #6
On Thu, Apr 23, 2015 at 9:30 AM, Peter Maydell <peter.maydell@linaro.org>
wrote:

> On 23 April 2015 at 15:26, Greg Bellows <greg.bellows@linaro.org> wrote:
> > Good catch Edgar, we shouldn't trap if we are "going into a low-power
> > state".
>
> You mean "if we are not going into a low-power state".
>

​Yes, sorry.
​


>
> > It sounds like we arrived at the following:
> >
> > wfi-  check has_work before taking either the UDEF or HLT exception
> hereby
> > making it a noop.
>
> This is OK (as per the code sketch in my earlier message).
>
> > wfe - Never trap, making it always a noop.  This includes undoing the
> > existing YIELD exception.
>
> No, the YIELD behaviour is correct for WFE. This is important
> to avoid us spinning for the rest of this vcpu's timeslice
> (never making progress because the other vcpu which could
> unblock us from this spinlock doesn't run). But we don't ever
> want to trap to EL2.
>

​Ok it sounds like the primary concern is the performance impact of
constantly trapping wfe to EL2 (in which case HCR must be set to cause
this).  Should we still honor EL3 and EL1 trapping ofr wfe?  In other words
check if they are the target and allow it if !has_work?



>
> -- PMM
>
Peter Maydell April 23, 2015, 2:51 p.m. UTC | #7
On 23 April 2015 at 15:41, Greg Bellows <greg.bellows@linaro.org> wrote:
> Ok it sounds like the primary concern is the performance impact of
> constantly trapping wfe to EL2 (in which case HCR must be set to cause
> this).  Should we still honor EL3 and EL1 trapping ofr wfe?  In other words
> check if they are the target and allow it if !has_work?

We should check what the architecture manual says, but I think
that in all the trap-WFE cases it says "traps if the CPU
would go into a low power state". QEMU *never* goes into a
low power state on WFE, so we never trap.

-- PMM
Greg Bellows April 23, 2015, 3 p.m. UTC | #8
On Thu, Apr 23, 2015 at 9:51 AM, Peter Maydell <peter.maydell@linaro.org>
wrote:

> On 23 April 2015 at 15:41, Greg Bellows <greg.bellows@linaro.org> wrote:
> > Ok it sounds like the primary concern is the performance impact of
> > constantly trapping wfe to EL2 (in which case HCR must be set to cause
> > this).  Should we still honor EL3 and EL1 trapping ofr wfe?  In other
> words
> > check if they are the target and allow it if !has_work?
>
> We should check what the architecture manual says, but I think
> that in all the trap-WFE cases it says "traps if the CPU
> would go into a low power state". QEMU *never* goes into a
> low power state on WFE, so we never trap.
>

​Right, that is exactly what it says.  So if QEMU never goes into a low
power state than neither wfi or wfe should trap as they have the same
explanation. In which case the best solution is to pull the patches out.
That work?



>
> -- PMM
>
Peter Maydell April 23, 2015, 3:11 p.m. UTC | #9
On 23 April 2015 at 16:00, Greg Bellows <greg.bellows@linaro.org> wrote:
>
>
> On Thu, Apr 23, 2015 at 9:51 AM, Peter Maydell <peter.maydell@linaro.org>
> wrote:
>> We should check what the architecture manual says, but I think
>> that in all the trap-WFE cases it says "traps if the CPU
>> would go into a low power state". QEMU *never* goes into a
>> low power state on WFE, so we never trap.
>
>
> Right, that is exactly what it says.  So if QEMU never goes into a low power
> state than neither wfi or wfe should trap as they have the same explanation.
> In which case the best solution is to pull the patches out.  That work?

No, because for WFI we *do* go into a low-power state (EXCP_HLT),
so we must trap.

-- PMM
diff mbox

Patch

diff --git a/target-arm/internals.h b/target-arm/internals.h
index de0a9c1..8e208f7 100644
--- a/target-arm/internals.h
+++ b/target-arm/internals.h
@@ -347,7 +347,7 @@  static inline uint32_t syn_breakpoint(int same_el)
         | ARM_EL_IL | 0x22;
 }
 
-static inline uint32_t syn_wfx(int cv, int cond, int ti)
+static inline uint32_t syn_wfx(int cv, int cond, bool ti)
 {
     return (EC_WFX_TRAP << ARM_EL_EC_SHIFT) |
            (cv << 24) | (cond << 20) | ti;
diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
index 971edc7..43aa8ef 100644
--- a/target-arm/op_helper.c
+++ b/target-arm/op_helper.c
@@ -223,23 +223,93 @@  uint32_t HELPER(usat16)(CPUARMState *env, uint32_t x, uint32_t shift)
     return res;
 }
 
+/* Function checks whether WFx (WFI/WFE) instructions are set-up to be trapped.
+ * The function returns the target EL (1-3) if the instruction is to be trapped
+ * otherwise it returns 0 indicating it is not trapped.
+ */
+static inline int check_wfx_trap(CPUARMState *env, bool is_wfe)
+{
+    int cur_el = arm_current_el(env);
+    uint64_t mask;
+
+    /* If we are currently in EL0 then we need to check if SCTLR is set up for
+     * WFx instructions being trapped to EL1.
+     */
+    if (cur_el < 1) {
+        mask = is_wfe ? SCTLR_nTWE : SCTLR_nTWI;
+        if (arm_el_is_aa64(env, 1)) {
+            if (!(env->cp15.sctlr_el[1] & mask)) {
+                return 1;
+            }
+        } else if (arm_feature(env, ARM_FEATURE_V8)) {
+            /* SCTLR WFx SCTLR trap bits only exist in ARMv8 */
+            if (!(A32_BANKED_CURRENT_REG_GET(env, sctlr) & mask)) {
+                return 1;
+            }
+        }
+    }
+
+    /* We are not trapping to EL1 at this point.
+     * Check if we are trapping WFx to EL2 it present.
+     */
+    if (cur_el < 2) {
+        if (arm_feature(env, ARM_FEATURE_EL2) && !arm_is_secure(env)) {
+            mask = (is_wfe) ? HCR_TWE : HCR_TWI;
+            if (env->cp15.hcr_el2 & mask) {
+                return 2;
+            }
+        }
+    }
+
+    /* We are not trapping to EL1 or EL2 at this point.
+     * Check if we are trapping WFx to EL3 if present.
+     */
+    if (cur_el < 3) {
+        if (arm_feature(env, ARM_FEATURE_EL3) &&
+            arm_feature(env, ARM_FEATURE_V8)) {
+            mask = (is_wfe) ? SCR_TWE : SCR_TWI;
+            if (env->cp15.scr_el3 & mask) {
+                return 3;
+            }
+        }
+    }
+
+    return 0;
+}
+
 void HELPER(wfi)(CPUARMState *env)
 {
     CPUState *cs = CPU(arm_env_get_cpu(env));
+    int target_el = check_wfx_trap(env, false);
 
-    cs->exception_index = EXCP_HLT;
-    cs->halted = 1;
+    if (target_el) {
+        cs->exception_index = EXCP_UDEF;
+        env->exception.syndrome = syn_wfx(1, 0xe, false);
+        env->exception.target_el = target_el;
+        env->pc -= 4;
+    } else {
+        cs->exception_index = EXCP_HLT;
+        cs->halted = 1;
+    }
     cpu_loop_exit(cs);
 }
 
 void HELPER(wfe)(CPUARMState *env)
 {
     CPUState *cs = CPU(arm_env_get_cpu(env));
+    int target_el = check_wfx_trap(env, true);
 
     /* Don't actually halt the CPU, just yield back to top
      * level loop
      */
-    cs->exception_index = EXCP_YIELD;
+    if (target_el) {
+        cs->exception_index = EXCP_UDEF;
+        env->exception.syndrome = syn_wfx(1, 0xe, true);
+        env->exception.target_el = target_el;
+        env->pc -= 4;
+    } else {
+        cs->exception_index = EXCP_YIELD;
+    }
     cpu_loop_exit(cs);
 }