Message ID | 20181019015617.22583-4-richard.henderson@linaro.org |
---|---|
State | New |
Series | target/arm: Reduce tlb_flush overhead |
On 19 October 2018 at 02:56, Richard Henderson <richard.henderson@linaro.org> wrote:
> Only the EL0 and EL1 TLBs are affected by the EL1 register,
> so flush only 2 of the 8 TLBs.
>
> In testing a boot of the Ubuntu installer to the first menu, this
> accounts for nearly all of the full tlb flushes: all but 11k of
> the 1.2M instances without the patch.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/helper.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index ed70ac645e..3ba8e66487 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -2706,14 +2706,16 @@ static void vmsa_tcr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>      tcr->raw_tcr = value;
>  }
>
> -static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
> -                            uint64_t value)
> +static void vmsa_ttbr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
> +                                uint64_t value)
>  {
>      /* If the ASID changes (with a 64-bit write), we must flush the TLB. */
>      if (cpreg_field_is_64bit(ri) &&
>          extract64(raw_read(env, ri) ^ value, 48, 16) != 0) {
>          ARMCPU *cpu = arm_env_get_cpu(env);
> -        tlb_flush(CPU(cpu));
> +        tlb_flush_by_mmuidx(CPU(cpu),
> +                            ARMMMUIdxBit_S12NSE1 |
> +                            ARMMMUIdxBit_S12NSE0);

This isn't taking account of the possibility of secure mode.
ARMMMUIdxBit_S1SE0 and ARMMMUIdxBit_S1SE1 might also be affected.

And for AArch32, this writefn is used for the secure-banked versions
of TTBR0/TTBR1, which means ARMMMUIdxBit_S1E3 may also need flushing.

thanks
-- PMM
On 10/19/18 7:28 AM, Peter Maydell wrote:
> On 19 October 2018 at 02:56, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>> Only the EL0 and EL1 TLBs are affected by the EL1 register,
>> so flush only 2 of the 8 TLBs.
>>
>> In testing a boot of the Ubuntu installer to the first menu, this
>> accounts for nearly all of the full tlb flushes: all but 11k of
>> the 1.2M instances without the patch.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>  target/arm/helper.c | 16 +++++++++-------
>>  1 file changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/target/arm/helper.c b/target/arm/helper.c
>> index ed70ac645e..3ba8e66487 100644
>> --- a/target/arm/helper.c
>> +++ b/target/arm/helper.c
>> @@ -2706,14 +2706,16 @@ static void vmsa_tcr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>      tcr->raw_tcr = value;
>>  }
>>
>> -static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>> -                            uint64_t value)
>> +static void vmsa_ttbr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>> +                                uint64_t value)
>>  {
>>      /* If the ASID changes (with a 64-bit write), we must flush the TLB. */
>>      if (cpreg_field_is_64bit(ri) &&
>>          extract64(raw_read(env, ri) ^ value, 48, 16) != 0) {
>>          ARMCPU *cpu = arm_env_get_cpu(env);
>> -        tlb_flush(CPU(cpu));
>> +        tlb_flush_by_mmuidx(CPU(cpu),
>> +                            ARMMMUIdxBit_S12NSE1 |
>> +                            ARMMMUIdxBit_S12NSE0);
>
> This isn't taking account of the possibility of secure mode.
> ARMMMUIdxBit_S1SE0 and ARMMMUIdxBit_S1SE1 might also be affected.

Ah. Is there an easy way to tell if secure mode is present/enabled? It'd be
nice to not flush tlbs that aren't in use...

> And for AArch32, this writefn is used for the secure-banked versions
> of TTBR0/TTBR1, which means ARMMMUIdxBit_S1E3 may also need flushing.

For aarch32, we don't have an asid, and so do not flush at all.

r~
On 19 October 2018 at 16:21, Richard Henderson <richard.henderson@linaro.org> wrote:
> On 10/19/18 7:28 AM, Peter Maydell wrote:
>> On 19 October 2018 at 02:56, Richard Henderson
>> <richard.henderson@linaro.org> wrote:
>>> Only the EL0 and EL1 TLBs are affected by the EL1 register,
>>> so flush only 2 of the 8 TLBs.
>>>
>>> In testing a boot of the Ubuntu installer to the first menu, this
>>> accounts for nearly all of the full tlb flushes: all but 11k of
>>> the 1.2M instances without the patch.
>>>
>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>> ---
>>>  target/arm/helper.c | 16 +++++++++-------
>>>  1 file changed, 9 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/target/arm/helper.c b/target/arm/helper.c
>>> index ed70ac645e..3ba8e66487 100644
>>> --- a/target/arm/helper.c
>>> +++ b/target/arm/helper.c
>>> @@ -2706,14 +2706,16 @@ static void vmsa_tcr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>>      tcr->raw_tcr = value;
>>>  }
>>>
>>> -static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>> -                            uint64_t value)
>>> +static void vmsa_ttbr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
>>> +                                uint64_t value)
>>>  {
>>>      /* If the ASID changes (with a 64-bit write), we must flush the TLB. */
>>>      if (cpreg_field_is_64bit(ri) &&
>>>          extract64(raw_read(env, ri) ^ value, 48, 16) != 0) {
>>>          ARMCPU *cpu = arm_env_get_cpu(env);
>>> -        tlb_flush(CPU(cpu));
>>> +        tlb_flush_by_mmuidx(CPU(cpu),
>>> +                            ARMMMUIdxBit_S12NSE1 |
>>> +                            ARMMMUIdxBit_S12NSE0);
>>
>> This isn't taking account of the possibility of secure mode.
>> ARMMMUIdxBit_S1SE0 and ARMMMUIdxBit_S1SE1 might also be affected.
>
> Ah. Is there an easy way to tell if secure mode is present/enabled? It'd be
> nice to not flush tlbs that aren't in use...

If ARM_FEATURE_EL3 is set, we have an EL3. It is probably possible to
look at the current state of the CPU and determine whether we need
to flush the NS TLBs or the S TLBs, but I would need to think about
that.

For AArch32 there are banked registers here, so we could
in theory distinguish writes to the S banked and NS banked regs.
For AArch64 there's only one EL1 register, so we'd need to
check the Arm ARM for how interleaved writes to the TTBR_EL1
and flipping between S and NS work (and when the guest is
supposed to do the TLB maintenance anyway).

A conservative check is probably:
  if arm_current_el() < 3
     // TTBR definitely can only be affecting the EL0/1
     // translation regime for the current security state
     if arm_is_secure_below_el3()
        if EL3 is AArch32
           flush S1SE0, S1E3
        else
           flush S1SE0, S1SE1
     else
        flush S12NSE1, S12NSE0
  else
     // err on the side of flushing more than maybe we need to
     flush S1SE0, S12NSE1, S12NSE0
     if EL3 is AArch32
        flush S1E3
     else
        flush S1SE1

(but you should check my logic ;-))

>> And for AArch32, this writefn is used for the secure-banked versions
>> of TTBR0/TTBR1, which means ARMMMUIdxBit_S1E3 may also need flushing.
>
> For aarch32, we don't have an asid, and so do not flush at all.

We do for AArch32 with TTBCR.EAE == 1 (ie LPAE, when you want to
use the 64-bit form of the register, accessed via MRRC/MCRR).
cpreg_field_is_64bit() is true for both "AArch64 sysreg" and
"AArch32 64-bit cp reg".

thanks
-- PMM
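For readers following along, the conservative check above can be written as a small C function that returns the set of TLBs to flush. This is only a sketch of the pseudocode, not QEMU code: the `IDX_*` bit values, `struct cpu_view`, and `ttbr_flush_mask()` are hypothetical stand-ins for the real `ARMMMUIdxBit_*` constants, `arm_current_el()`, and `arm_is_secure_below_el3()`, so that the branch structure can be compiled and checked on its own.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Stand-in bit values for illustration only; the real ARMMMUIdxBit_*
 * constants are defined by QEMU and are not reproduced here.
 */
enum {
    IDX_S12NSE0 = 1 << 0,
    IDX_S12NSE1 = 1 << 1,
    IDX_S1E3    = 1 << 2,
    IDX_S1SE0   = 1 << 3,
    IDX_S1SE1   = 1 << 4,
};

/* Hypothetical snapshot of the CPU state that the check depends on. */
struct cpu_view {
    int current_el;        /* what arm_current_el() would return */
    bool secure_below_el3; /* what arm_is_secure_below_el3() would return */
    bool el3_is_aa32;      /* true if EL3 is AArch32 */
};

/* Return the mask of TLBs to flush, following the pseudocode above. */
static uint16_t ttbr_flush_mask(const struct cpu_view *s)
{
    /* Secure EL1 translations live in S1E3 when EL3 is AArch32. */
    uint16_t secure_el1 = s->el3_is_aa32 ? IDX_S1E3 : IDX_S1SE1;

    if (s->current_el < 3) {
        /* Only the EL0/1 regime of the current security state. */
        if (s->secure_below_el3) {
            return IDX_S1SE0 | secure_el1;
        }
        return IDX_S12NSE1 | IDX_S12NSE0;
    }
    /* At EL3: err on the side of flushing more than may be needed. */
    return IDX_S1SE0 | IDX_S12NSE1 | IDX_S12NSE0 | secure_el1;
}
```

In the patch itself the returned mask would be what gets passed to `tlb_flush_by_mmuidx()`; whether this is the right set for every case is exactly the open question in the thread.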
On 10/19/18 9:12 AM, Peter Maydell wrote:
> A conservative check is probably:
>   if arm_current_el() < 3
>      // TTBR definitely can only be affecting the EL0/1
>      // translation regime for the current security state
>      if arm_is_secure_below_el3()
>         if EL3 is AArch32
>            flush S1SE0, S1E3
>         else
>            flush S1SE0, S1SE1
>      else
>         flush S12NSE1, S12NSE0
>   else
>      // err on the side of flushing more than maybe we need to
>      flush S1SE0, S12NSE1, S12NSE0
>      if EL3 is AArch32
>         flush S1E3
>      else
>         flush S1SE1
>
> (but you should check my logic ;-))

Riiight.

Clearly it would be simpler and safer to track unused tlbs within cputlb.c.

> We do for AArch32 with TTBCR.EAE == 1 (ie LPAE, when you want to
> use the 64-bit form of the register, accessed via MRRC/MCRR).
> cpreg_field_is_64bit() is true for both "AArch64 sysreg" and
> "AArch32 64-bit cp reg".

Ok, thanks.

So drop this patch for now and I'll get back to it. The other two in this
series are at least incremental improvement in the meantime.

r~
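One possible reading of "track unused tlbs" is a per-CPU bitmask recording which TLBs may hold entries, so that a flush request can skip the ones that are already empty. The sketch below is hypothetical and is not taken from cputlb.c: `struct tlb_tracker`, `tracker_fill()`, and `tracker_flush()` are invented names, and the actual clearing of TLB entries is left out.

```c
#include <stdint.h>

/* Hypothetical per-CPU bookkeeping; one bit per MMU index. */
struct tlb_tracker {
    uint16_t dirty; /* bit i set: TLB i may contain entries */
};

/* Record that an entry was installed into the TLB for mmu_idx. */
static void tracker_fill(struct tlb_tracker *t, int mmu_idx)
{
    t->dirty |= (uint16_t)(1u << mmu_idx);
}

/*
 * Given a requested flush mask, return the subset that actually needs
 * work and mark those TLBs clean. A caller would flush only the TLBs
 * named in the returned mask.
 */
static uint16_t tracker_flush(struct tlb_tracker *t, uint16_t idxmap)
{
    uint16_t work = idxmap & t->dirty;

    t->dirty &= (uint16_t)~work;
    return work;
}
```

With bookkeeping like this, a caller could request a flush of every index and still pay only for the TLBs in use, at the cost of not distinguishing the secure and non-secure sides when both are populated.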
On 19 October 2018 at 17:31, Richard Henderson <richard.henderson@linaro.org> wrote:
> On 10/19/18 9:12 AM, Peter Maydell wrote:
>> A conservative check is probably:
>>   if arm_current_el() < 3
>>      // TTBR definitely can only be affecting the EL0/1
>>      // translation regime for the current security state
>>      if arm_is_secure_below_el3()
>>         if EL3 is AArch32
>>            flush S1SE0, S1E3
>>         else
>>            flush S1SE0, S1SE1
>>      else
>>         flush S12NSE1, S12NSE0
>>   else
>>      // err on the side of flushing more than maybe we need to
>>      flush S1SE0, S12NSE1, S12NSE0
>>      if EL3 is AArch32
>>         flush S1E3
>>      else
>>         flush S1SE1
>>
>> (but you should check my logic ;-))
>
> Riiight.
>
> Clearly it would be simpler and safer to track unused tlbs within cputlb.c.

The advantage of the above logic is that it means that even
when we do have trustzone and are using the secure TLBs,
we only flush the side we need to, not the whole lot.
(But yeah, it's a bit hairy.)

> So drop this patch for now and I'll get back to it. The other two in this
> series are at least incremental improvement in the meantime.

OK, I'll put those in target-arm.next.

thanks
-- PMM
diff --git a/target/arm/helper.c b/target/arm/helper.c
index ed70ac645e..3ba8e66487 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2706,14 +2706,16 @@ static void vmsa_tcr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
     tcr->raw_tcr = value;
 }
 
-static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
-                            uint64_t value)
+static void vmsa_ttbr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                                uint64_t value)
 {
     /* If the ASID changes (with a 64-bit write), we must flush the TLB. */
     if (cpreg_field_is_64bit(ri) &&
         extract64(raw_read(env, ri) ^ value, 48, 16) != 0) {
         ARMCPU *cpu = arm_env_get_cpu(env);
-        tlb_flush(CPU(cpu));
+        tlb_flush_by_mmuidx(CPU(cpu),
+                            ARMMMUIdxBit_S12NSE1 |
+                            ARMMMUIdxBit_S12NSE0);
     }
     raw_write(env, ri, value);
 }
@@ -2761,12 +2763,12 @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
       .fieldoffset = offsetof(CPUARMState, cp15.esr_el[1]), .resetvalue = 0, },
     { .name = "TTBR0_EL1", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 0,
-      .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
+      .access = PL1_RW, .writefn = vmsa_ttbr_el1_write, .resetvalue = 0,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                              offsetof(CPUARMState, cp15.ttbr0_ns) } },
     { .name = "TTBR1_EL1", .state = ARM_CP_STATE_BOTH,
       .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 1,
-      .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
+      .access = PL1_RW, .writefn = vmsa_ttbr_el1_write, .resetvalue = 0,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                              offsetof(CPUARMState, cp15.ttbr1_ns) } },
     { .name = "TCR_EL1", .state = ARM_CP_STATE_AA64,
@@ -3018,12 +3020,12 @@ static const ARMCPRegInfo lpae_cp_reginfo[] = {
       .access = PL1_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
                              offsetof(CPUARMState, cp15.ttbr0_ns) },
-      .writefn = vmsa_ttbr_write, },
+      .writefn = vmsa_ttbr_el1_write, },
     { .name = "TTBR1", .cp = 15, .crm = 2, .opc1 = 1,
       .access = PL1_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
       .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr1_s),
                              offsetof(CPUARMState, cp15.ttbr1_ns) },
-      .writefn = vmsa_ttbr_write, },
+      .writefn = vmsa_ttbr_el1_write, },
     REGINFO_SENTINEL
 };
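The ASID test in vmsa_ttbr_el1_write() XORs the old and new register values and looks at bits [63:48]; any non-zero bit there means the ASID field differs. The standalone sketch below illustrates that check outside QEMU. `extract_bits64()` is a local helper written to behave like the `extract64()` call in the patch for this use, and `ttbr_asid_changed()` is a hypothetical name introduced only for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return `length` bits of `value` starting at bit `start`.
 * Assumes 1 <= length <= 64 and start + length <= 64.
 */
static uint64_t extract_bits64(uint64_t value, int start, int length)
{
    return (value >> start) & (~0ULL >> (64 - length));
}

/* True if the 16-bit ASID field in bits [63:48] differs between writes. */
static bool ttbr_asid_changed(uint64_t old_ttbr, uint64_t new_ttbr)
{
    /* XOR leaves a 1 wherever the two values differ. */
    return extract_bits64(old_ttbr ^ new_ttbr, 48, 16) != 0;
}
```

A write that changes only the table base address leaves bits [63:48] of the XOR at zero, so no flush is triggered; that is why the comparison is on the XOR rather than on the new value alone.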
Only the EL0 and EL1 TLBs are affected by the EL1 register,
so flush only 2 of the 8 TLBs.

In testing a boot of the Ubuntu installer to the first menu, this
accounts for nearly all of the full tlb flushes: all but 11k of
the 1.2M instances without the patch.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

-- 
2.17.2