[v2,3/3] target/arm: Flush only the TLBs affected by TTBR*_EL1

Message ID 20181019015617.22583-4-richard.henderson@linaro.org
State New
Series target/arm: Reduce tlb_flush overhead

Commit Message

Richard Henderson Oct. 19, 2018, 1:56 a.m. UTC
Only the EL0 and EL1 TLBs are affected by the EL1 register,
so flush only 2 of the 8 TLBs.

In testing a boot of the Ubuntu installer to the first menu, this
accounts for nearly all of the full tlb flushes: all but 11k of
the 1.2M instances without the patch.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

---
 target/arm/helper.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

-- 
2.17.2

Comments

Peter Maydell Oct. 19, 2018, 2:28 p.m. UTC | #1
On 19 October 2018 at 02:56, Richard Henderson
<richard.henderson@linaro.org> wrote:
> Only the EL0 and EL1 TLBs are affected by the EL1 register,
> so flush only 2 of the 8 TLBs.
>
> In testing a boot of the Ubuntu installer to the first menu, this
> accounts for nearly all of the full tlb flushes: all but 11k of
> the 1.2M instances without the patch.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/helper.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index ed70ac645e..3ba8e66487 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -2706,14 +2706,16 @@ static void vmsa_tcr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>      tcr->raw_tcr = value;
>  }
>
> -static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
> -                            uint64_t value)
> +static void vmsa_ttbr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
> +                                uint64_t value)
>  {
>      /* If the ASID changes (with a 64-bit write), we must flush the TLB.  */
>      if (cpreg_field_is_64bit(ri) &&
>          extract64(raw_read(env, ri) ^ value, 48, 16) != 0) {
>          ARMCPU *cpu = arm_env_get_cpu(env);
> -        tlb_flush(CPU(cpu));
> +        tlb_flush_by_mmuidx(CPU(cpu),
> +                            ARMMMUIdxBit_S12NSE1 |
> +                            ARMMMUIdxBit_S12NSE0);

This isn't taking account of the possibility of secure mode.
ARMMMUIdxBit_S1SE0 and ARMMMUIdxBit_S1SE1 might also be affected.

And for AArch32, this writefn is used for the secure-banked versions
of TTBR0/TTBR1, which means ARMMMUIdxBit_S1E3 may also need flushing.

thanks
-- PMM
Richard Henderson Oct. 19, 2018, 3:21 p.m. UTC | #2
On 10/19/18 7:28 AM, Peter Maydell wrote:
> On 19 October 2018 at 02:56, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>> Only the EL0 and EL1 TLBs are affected by the EL1 register,
>> so flush only 2 of the 8 TLBs.
>>
>> In testing a boot of the Ubuntu installer to the first menu, this
>> accounts for nearly all of the full tlb flushes: all but 11k of
>> the 1.2M instances without the patch.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>  target/arm/helper.c | 16 +++++++++-------
>>  1 file changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/target/arm/helper.c b/target/arm/helper.c
>> index ed70ac645e..3ba8e66487 100644
>> --- a/target/arm/helper.c
>> +++ b/target/arm/helper.c
>> @@ -2706,14 +2706,16 @@ static void vmsa_tcr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>      tcr->raw_tcr = value;
>>  }
>>
>> -static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>> -                            uint64_t value)
>> +static void vmsa_ttbr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>> +                                uint64_t value)
>>  {
>>      /* If the ASID changes (with a 64-bit write), we must flush the TLB.  */
>>      if (cpreg_field_is_64bit(ri) &&
>>          extract64(raw_read(env, ri) ^ value, 48, 16) != 0) {
>>          ARMCPU *cpu = arm_env_get_cpu(env);
>> -        tlb_flush(CPU(cpu));
>> +        tlb_flush_by_mmuidx(CPU(cpu),
>> +                            ARMMMUIdxBit_S12NSE1 |
>> +                            ARMMMUIdxBit_S12NSE0);
>
> This isn't taking account of the possibility of secure mode.
> ARMMMUIdxBit_S1SE0 and ARMMMUIdxBit_S1SE1 might also be affected.

Ah.  Is there an easy way to tell if secure mode is present/enabled?  It'd be
nice to not flush tlbs that aren't in use...

> And for AArch32, this writefn is used for the secure-banked versions
> of TTBR0/TTBR1, which means ARMMMUIdxBit_S1E3 may also need flushing.

For aarch32, we don't have an asid, and so do not flush at all.


r~
Peter Maydell Oct. 19, 2018, 4:12 p.m. UTC | #3
On 19 October 2018 at 16:21, Richard Henderson
<richard.henderson@linaro.org> wrote:
> On 10/19/18 7:28 AM, Peter Maydell wrote:
>> On 19 October 2018 at 02:56, Richard Henderson
>> <richard.henderson@linaro.org> wrote:
>>> Only the EL0 and EL1 TLBs are affected by the EL1 register,
>>> so flush only 2 of the 8 TLBs.
>>>
>>> In testing a boot of the Ubuntu installer to the first menu, this
>>> accounts for nearly all of the full tlb flushes: all but 11k of
>>> the 1.2M instances without the patch.
>>>
>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>> ---
>>>  target/arm/helper.c | 16 +++++++++-------
>>>  1 file changed, 9 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/target/arm/helper.c b/target/arm/helper.c
>>> index ed70ac645e..3ba8e66487 100644
>>> --- a/target/arm/helper.c
>>> +++ b/target/arm/helper.c
>>> @@ -2706,14 +2706,16 @@ static void vmsa_tcr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>>      tcr->raw_tcr = value;
>>>  }
>>>
>>> -static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>> -                            uint64_t value)
>>> +static void vmsa_ttbr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>> +                                uint64_t value)
>>>  {
>>>      /* If the ASID changes (with a 64-bit write), we must flush the TLB.  */
>>>      if (cpreg_field_is_64bit(ri) &&
>>>          extract64(raw_read(env, ri) ^ value, 48, 16) != 0) {
>>>          ARMCPU *cpu = arm_env_get_cpu(env);
>>> -        tlb_flush(CPU(cpu));
>>> +        tlb_flush_by_mmuidx(CPU(cpu),
>>> +                            ARMMMUIdxBit_S12NSE1 |
>>> +                            ARMMMUIdxBit_S12NSE0);
>>
>> This isn't taking account of the possibility of secure mode.
>> ARMMMUIdxBit_S1SE0 and ARMMMUIdxBit_S1SE1 might also be affected.
>
> Ah.  Is there an easy way to tell if secure mode is present/enabled?  It'd be
> nice to not flush tlbs that aren't in use...

If ARM_FEATURE_EL3 is set, we have an EL3. It is probably possible
to look at the current state of the CPU and determine whether we
need to flush the NS TLBs or the S TLBs, but I would need to
think about that. For AArch32 there are banked registers here, so
we could in theory distinguish writes to the S banked and NS banked
regs. For AArch64 there's only one EL1 register, so we'd
need to check the Arm ARM for how interleaved writes to the TTBR_EL1
and flipping between S and NS work (and when the guest is supposed
to do the TLB maintenance anyway).

A conservative check is probably:
  if arm_current_el() < 3
     // TTBR definitely can only be affecting the EL0/1
     // translation regime for the current security state
     if arm_is_secure_below_el3()
        if EL3 is AArch32
            flush S1SE0, S1E3
        else
            flush S1SE0, S1SE1
     else
        flush S12NSE1, S12NSE0
  else
     // err on the side of flushing more than maybe we need to
     flush S1SE0, S12NSE1, S12NSE0
     if EL3 is AArch32
          flush S1E3
     else
          flush S1SE1

(but you should check my logic ;-))
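The pseudocode above can be sketched as a small standalone C function that returns the set of TLBs to flush. This is only a model of the decision tree, not QEMU code: the `IDX_*` constants are hypothetical stand-ins for the `ARMMMUIdxBit_*` values, and `ttbr_flush_mask()` with its three parameters is an assumed interface standing in for `arm_current_el()`, `arm_is_secure_below_el3()` and an "EL3 is AArch32" check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for QEMU's ARMMMUIdxBit_* constants.
 * The real values are defined by QEMU and are not reproduced here. */
enum {
    IDX_S12NSE0 = 1u << 0,
    IDX_S12NSE1 = 1u << 1,
    IDX_S1SE0   = 1u << 2,
    IDX_S1SE1   = 1u << 3,
    IDX_S1E3    = 1u << 4,
};

/* Return the conservative set of TLBs to flush after a TTBR write
 * that changed the ASID, following the pseudocode in the thread. */
uint16_t ttbr_flush_mask(int current_el, bool secure_below_el3,
                         bool el3_is_aa32)
{
    assert(current_el >= 0 && current_el <= 3);

    /* The pseudocode picks S1E3 when EL3 is AArch32, else S1SE1. */
    uint16_t secure_pl1 = el3_is_aa32 ? IDX_S1E3 : IDX_S1SE1;

    if (current_el < 3) {
        /* Only the EL0/1 regime of the current security state. */
        if (secure_below_el3) {
            return IDX_S1SE0 | secure_pl1;
        }
        return IDX_S12NSE1 | IDX_S12NSE0;
    }

    /* At EL3: err on the side of flushing both security states. */
    return IDX_S1SE0 | IDX_S12NSE1 | IDX_S12NSE0 | secure_pl1;
}
```

In a real writefn the returned mask would presumably be handed to `tlb_flush_by_mmuidx()`, as the patch already does with its fixed pair of bits.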

>> And for AArch32, this writefn is used for the secure-banked versions
>> of TTBR0/TTBR1, which means ARMMMUIdxBit_S1E3 may also need flushing.
>
> For aarch32, we don't have an asid, and so do not flush at all.

We do for AArch32 with TTBCR.EAE == 1 (ie LPAE, when you want to
use the 64-bit form of the register, accessed via MRRC/MCRR).
cpreg_field_is_64bit() is true for both "AArch64 sysreg" and
"AArch32 64-bit cp reg".
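For reference, the flush condition in the patch XORs the old and new register values and tests bits [63:48]. A minimal standalone sketch of that test follows, assuming a 64-bit register value in either case; `extract_bits64()` and `ttbr_asid_changed()` are illustrative names, with the former written to behave like the `extract64()` call used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return `length` bits of `value`, starting at bit `start`. */
uint64_t extract_bits64(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/* True if bits [63:48] differ between the old and new TTBR values.
 * XOR leaves a 1 in every bit position that changed, so a nonzero
 * field means the ASID field was modified. */
bool ttbr_asid_changed(uint64_t old_ttbr, uint64_t new_ttbr)
{
    return extract_bits64(old_ttbr ^ new_ttbr, 48, 16) != 0;
}
```

A write that only changes the table base address leaves the top 16 bits alone and so does not trigger a flush; a write that changes any of those bits does.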

thanks
-- PMM
Richard Henderson Oct. 19, 2018, 4:31 p.m. UTC | #4
On 10/19/18 9:12 AM, Peter Maydell wrote:
> A conservative check is probably:
>   if arm_current_el() < 3
>      // TTBR definitely can only be affecting the EL0/1
>      // translation regime for the current security state
>      if arm_is_secure_below_el3()
>         if EL3 is AArch32
>             flush S1SE0, S1E3
>         else
>             flush S1SE0, S1SE1
>      else
>         flush S12NSE1, S12NSE0
>   else
>      // err on the side of flushing more than maybe we need to
>      flush S1SE0, S12NSE1, S12NSE0
>      if EL3 is AArch32
>           flush S1E3
>      else
>           flush S1SE1
>
> (but you should check my logic ;-))

Riiight.

Clearly it would be simpler and safer to track unused tlbs within cputlb.c.

> We do for AArch32 with TTBCR.EAE == 1 (ie LPAE, when you want to
> use the 64-bit form of the register, accessed via MRRC/MCRR).
> cpreg_field_is_64bit() is true for both "AArch64 sysreg" and
> "AArch32 64-bit cp reg".

Ok, thanks.

So drop this patch for now and I'll get back to it.  The other two in this
series are at least incremental improvement in the meantime.


r~
Peter Maydell Oct. 19, 2018, 4:37 p.m. UTC | #5
On 19 October 2018 at 17:31, Richard Henderson
<richard.henderson@linaro.org> wrote:
> On 10/19/18 9:12 AM, Peter Maydell wrote:
>> A conservative check is probably:
>>   if arm_current_el() < 3
>>      // TTBR definitely can only be affecting the EL0/1
>>      // translation regime for the current security state
>>      if arm_is_secure_below_el3()
>>         if EL3 is AArch32
>>             flush S1SE0, S1E3
>>         else
>>             flush S1SE0, S1SE1
>>      else
>>         flush S12NSE1, S12NSE0
>>   else
>>      // err on the side of flushing more than maybe we need to
>>      flush S1SE0, S12NSE1, S12NSE0
>>      if EL3 is AArch32
>>           flush S1E3
>>      else
>>           flush S1SE1
>>
>> (but you should check my logic ;-))
>
> Riiight.
>
> Clearly it would be simpler and safer to track unused tlbs within cputlb.c.

The advantage of the above logic is that it means that even
when we do have trustzone and are using the secure TLBs,
we only flush the side we need to, not the whole lot.
(But yeah, it's a bit hairy.)

> So drop this patch for now and I'll get back to it.  The other two in this
> series are at least incremental improvement in the meantime.

OK, I'll put those in target-arm.next.

thanks
-- PMM
Patch

diff --git a/target/arm/helper.c b/target/arm/helper.c
index ed70ac645e..3ba8e66487 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2706,14 +2706,16 @@  static void vmsa_tcr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
     tcr->raw_tcr = value;
 }
 
-static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                            uint64_t value)
+static void vmsa_ttbr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                                uint64_t value)
 {
     /* If the ASID changes (with a 64-bit write), we must flush the TLB.  */
     if (cpreg_field_is_64bit(ri) &&
         extract64(raw_read(env, ri) ^ value, 48, 16) != 0) {
         ARMCPU *cpu = arm_env_get_cpu(env);
-        tlb_flush(CPU(cpu));
+        tlb_flush_by_mmuidx(CPU(cpu),
+                            ARMMMUIdxBit_S12NSE1 |
+                            ARMMMUIdxBit_S12NSE0);
     }
     raw_write(env, ri, value);
 }
@@ -2761,12 +2763,12 @@  static const ARMCPRegInfo vmsa_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, cp15.esr_el[1]), .resetvalue = 0, },
     { .name = "TTBR0_EL1", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 0,
-      .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
+      .access = PL1_RW, .writefn = vmsa_ttbr_el1_write, .resetvalue = 0,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                              offsetof(CPUARMState, cp15.ttbr0_ns) } },
     { .name = "TTBR1_EL1", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 1,
-      .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
+      .access = PL1_RW, .writefn = vmsa_ttbr_el1_write, .resetvalue = 0,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                              offsetof(CPUARMState, cp15.ttbr1_ns) } },
     { .name = "TCR_EL1", .state = ARM_CP_STATE_AA64,
@@ -3018,12 +3020,12 @@  static const ARMCPRegInfo lpae_cp_reginfo[] = {
       .access = PL1_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                              offsetof(CPUARMState, cp15.ttbr0_ns) },
-      .writefn = vmsa_ttbr_write, },
+      .writefn = vmsa_ttbr_el1_write, },
     { .name = "TTBR1", .cp = 15, .crm = 2, .opc1 = 1,
       .access = PL1_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                              offsetof(CPUARMState, cp15.ttbr1_ns) },
-      .writefn = vmsa_ttbr_write, },
+      .writefn = vmsa_ttbr_el1_write, },
     REGINFO_SENTINEL
 };