diff mbox series

[PULL,11/20] target/arm: Don't mishandle count when enabling or disabling PMU counters

Message ID 20220914115217.117532-13-richard.henderson@linaro.org
State Accepted
Commit 01765386a88868ae993bcbb39f111501fe690035
Headers show
Series [PULL,01/20] target/arm: Add cortex-a35 | expand

Commit Message

Richard Henderson Sept. 14, 2022, 11:52 a.m. UTC
From: Peter Maydell <peter.maydell@linaro.org>

The PMU cycle and event counter infrastructure design requires that
operations on the PMU register fields are wrapped in pmu_op_start()
and pmu_op_finish() calls (or their more specific pmmcntr and
pmevcntr equivalents).  This includes any changes to registers which
affect whether the counter should be enabled or disabled, but we
forgot to do this.

The effect of this bug is that in sequences like:
 * disable the cycle counter (PMCCNTR) using the PMCNTEN register
 * write a value such as 0xfffff000 to the PMCCNTR
 * restart the counter by writing to PMCNTEN
the value written to the cycle counter is corrupted, and it starts
counting from the wrong place. (Essentially, we fail to record that
the QEMU_CLOCK_VIRTUAL timestamp when the counter should be considered
to have started counting is the point when PMCNTEN is written to enable
the counter.)

Add the necessary bracketing calls, so that updates to the various
registers which affect whether the PMU is counting are handled
correctly.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220822132358.3524971-4-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

Comments

Thomas Huth Sept. 20, 2022, 4:45 p.m. UTC | #1
On 14/09/2022 13.52, Richard Henderson wrote:
> From: Peter Maydell <peter.maydell@linaro.org>
> 
> The PMU cycle and event counter infrastructure design requires that
> operations on the PMU register fields are wrapped in pmu_op_start()
> and pmu_op_finish() calls (or their more specific pmmcntr and
> pmevcntr equivalents).  This includes any changes to registers which
> affect whether the counter should be enabled or disabled, but we
> forgot to do this.
> 
> The effect of this bug is that in sequences like:
>   * disable the cycle counter (PMCCNTR) using the PMCNTEN register
>   * write a value such as 0xfffff000 to the PMCCNTR
>   * restart the counter by writing to PMCNTEN
> the value written to the cycle counter is corrupted, and it starts
> counting from the wrong place. (Essentially, we fail to record that
> the QEMU_CLOCK_VIRTUAL timestamp when the counter should be considered
> to have started counting is the point when PMCNTEN is written to enable
> the counter.)
> 
> Add the necessary bracketing calls, so that updates to the various
> registers which affect whether the PMU is counting are handled
> correctly.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> Message-Id: <20220822132358.3524971-4-peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/arm/helper.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 45 insertions(+)

  Hi Peter, hi Richard,

this seems to break some Avocado based test(s) in our CI:

  make check-venv
  ./tests/venv/bin/avocado run tests/avocado/replay_kernel.py:ReplayKernelNormal.test_aarch64_virt

... fails with commit 01765386a88868ae993bcb but still passes
with the preceeding commit. Could you please have a look?

  Thanks,
   Thomas
Peter Maydell Sept. 23, 2022, 10:50 a.m. UTC | #2
On Tue, 20 Sept 2022 at 17:45, Thomas Huth <thuth@redhat.com> wrote:
> this seems to break some Avocado based test(s) in our CI:
>
>   make check-venv
>   ./tests/venv/bin/avocado run tests/avocado/replay_kernel.py:ReplayKernelNormal.test_aarch64_virt
>
> ... fails with commit 01765386a88868ae993bcb but still passes
> with the preceeding commit. Could you please have a look?

Thanks for the report. I can reproduce it, it seems to happen
with all kinds of icount, not specifically with record-replay.
We hit the "Bad icount read" exit in icount_get_raw_locked().
I'll investigate further.

(Rather unhelpfully, the avocado framework has ignored the fact
that the QEMU process exited with an non-zero status, has
failed to log the "Bad icount read" message in the log, and
has only failed the test for hitting the 120s timeout :-( )

thanks
-- PMM
diff mbox series

Patch

diff --git a/target/arm/helper.c b/target/arm/helper.c
index e4824e01b8..a348c7407d 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -1079,6 +1079,14 @@  static CPAccessResult pmreg_access_ccntr(CPUARMState *env,
     return pmreg_access(env, ri, isread);
 }
 
+/*
+ * Bits in MDCR_EL2 and MDCR_EL3 which pmu_counter_enabled() looks at.
+ * We use these to decide whether we need to wrap a write to MDCR_EL2
+ * or MDCR_EL3 in pmu_op_start()/pmu_op_finish() calls.
+ */
+#define MDCR_EL2_PMU_ENABLE_BITS (MDCR_HPME | MDCR_HPMD | MDCR_HPMN)
+#define MDCR_EL3_PMU_ENABLE_BITS (MDCR_SPME)
+
 /* Returns true if the counter (pass 31 for PMCCNTR) should count events using
  * the current EL, security state, and register configuration.
  */
@@ -1432,15 +1440,19 @@  static uint64_t pmccfiltr_read_a32(CPUARMState *env, const ARMCPRegInfo *ri)
 static void pmcntenset_write(CPUARMState *env, const ARMCPRegInfo *ri,
                             uint64_t value)
 {
+    pmu_op_start(env);
     value &= pmu_counter_mask(env);
     env->cp15.c9_pmcnten |= value;
+    pmu_op_finish(env);
 }
 
 static void pmcntenclr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                              uint64_t value)
 {
+    pmu_op_start(env);
     value &= pmu_counter_mask(env);
     env->cp15.c9_pmcnten &= ~value;
+    pmu_op_finish(env);
 }
 
 static void pmovsr_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -4681,7 +4693,39 @@  static void sctlr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 static void sdcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                        uint64_t value)
 {
+    /*
+     * Some MDCR_EL3 bits affect whether PMU counters are running:
+     * if we are trying to change any of those then we must
+     * bracket this update with PMU start/finish calls.
+     */
+    bool pmu_op = (env->cp15.mdcr_el3 ^ value) & MDCR_EL3_PMU_ENABLE_BITS;
+
+    if (pmu_op) {
+        pmu_op_start(env);
+    }
     env->cp15.mdcr_el3 = value & SDCR_VALID_MASK;
+    if (pmu_op) {
+        pmu_op_finish(env);
+    }
+}
+
+static void mdcr_el2_write(CPUARMState *env, const ARMCPRegInfo *ri,
+                           uint64_t value)
+{
+    /*
+     * Some MDCR_EL2 bits affect whether PMU counters are running:
+     * if we are trying to change any of those then we must
+     * bracket this update with PMU start/finish calls.
+     */
+    bool pmu_op = (env->cp15.mdcr_el2 ^ value) & MDCR_EL2_PMU_ENABLE_BITS;
+
+    if (pmu_op) {
+        pmu_op_start(env);
+    }
+    env->cp15.mdcr_el2 = value;
+    if (pmu_op) {
+        pmu_op_finish(env);
+    }
 }
 
 static const ARMCPRegInfo v8_cp_reginfo[] = {
@@ -7724,6 +7768,7 @@  void register_cp_regs_for_features(ARMCPU *cpu)
         ARMCPRegInfo mdcr_el2 = {
             .name = "MDCR_EL2", .state = ARM_CP_STATE_BOTH,
             .opc0 = 3, .opc1 = 4, .crn = 1, .crm = 1, .opc2 = 1,
+            .writefn = mdcr_el2_write,
             .access = PL2_RW, .resetvalue = pmu_num_counters(env),
             .fieldoffset = offsetof(CPUARMState, cp15.mdcr_el2),
         };