
target/arm: Do not use gen_mte_checkN in trans_STGP

Message ID 20230727162621.445400-1-richard.henderson@linaro.org
State Superseded
Series target/arm: Do not use gen_mte_checkN in trans_STGP

Commit Message

Richard Henderson July 27, 2023, 4:26 p.m. UTC
STGP writes to tag memory; it does not check it.
This happened to work because we wrote tag memory first
so that the check always succeeded.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/translate-a64.c | 41 +++++++++++++---------------------
 1 file changed, 15 insertions(+), 26 deletions(-)

Comments

Peter Maydell July 28, 2023, 1:17 p.m. UTC | #1
On Thu, 27 Jul 2023 at 17:33, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> STGP writes to tag memory; it does not check it.
> This happened to work because we wrote tag memory first
> so that the check always succeeded.

So this is code cleanup to be more sensible, rather
than a guest-visible bug?

thanks
-- PMM
Richard Henderson July 28, 2023, 1:30 p.m. UTC | #2
On 7/28/23 06:17, Peter Maydell wrote:
> On Thu, 27 Jul 2023 at 17:33, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> STGP writes to tag memory; it does not check it.
>> This happened to work because we wrote tag memory first
>> so that the check always succeeded.
> 
> So this is code cleanup to be more sensible, rather
> than a guest-visible bug?

Yes.


r~
Peter Maydell Aug. 3, 2023, 1:10 p.m. UTC | #3
On Thu, 27 Jul 2023 at 17:33, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> STGP writes to tag memory; it does not check it.
> This happened to work because we wrote tag memory first
> so that the check always succeeded.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/tcg/translate-a64.c | 41 +++++++++++++---------------------
>  1 file changed, 15 insertions(+), 26 deletions(-)
>
> diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
> index 5fa1257d32..dfd18e19ca 100644
> --- a/target/arm/tcg/translate-a64.c
> +++ b/target/arm/tcg/translate-a64.c
> @@ -3020,37 +3020,17 @@ static bool trans_STGP(DisasContext *s, arg_ldstpair *a)
>          tcg_gen_addi_i64(dirty_addr, dirty_addr, offset);
>      }
>
> -    if (!s->ata) {
> -        /*
> -         * TODO: We could rely on the stores below, at least for
> -         * system mode, if we arrange to add MO_ALIGN_16.
> -         */
> -        gen_helper_stg_stub(cpu_env, dirty_addr);
> -    } else if (tb_cflags(s->base.tb) & CF_PARALLEL) {
> -        gen_helper_stg_parallel(cpu_env, dirty_addr, dirty_addr);
> -    } else {
> -        gen_helper_stg(cpu_env, dirty_addr, dirty_addr);
> -    }
> -
> -    mop = finalize_memop(s, MO_64);
> -    clean_addr = gen_mte_checkN(s, dirty_addr, true, false, 2 << MO_64, mop);
> -
> +    clean_addr = clean_data_tbi(s, dirty_addr);
>      tcg_rt = cpu_reg(s, a->rt);
>      tcg_rt2 = cpu_reg(s, a->rt2);
>
>      /*
> -     * STGP is defined as two 8-byte memory operations and one tag operation.
> -     * We implement it as one single 16-byte memory operation for convenience.
> -     * Rebuild mop as for STP.
> -     * TODO: The atomicity with LSE2 is stronger than required.
> -     * Need a form of MO_ATOM_WITHIN16_PAIR that never requires
> -     * 16-byte atomicity.
> +     * STGP is defined as two 8-byte memory operations, aligned to TAG_GRANULE,
> +     * and one tag operation.  We implement it as one single aligned 16-byte
> +     * memory operation for convenience.  Note that the alignment ensures
> +     * MO_ATOM_IFALIGN_PAIR produces 8-byte atomicity for the memory store.
>       */
> -    mop = MO_128;
> -    if (s->align_mem) {
> -        mop |= MO_ALIGN_8;
> -    }
> -    mop = finalize_memop_pair(s, mop);
> +    mop = MO_128 | MO_ALIGN | MO_ATOM_IFALIGN_PAIR;

So here we're implicitly assuming TAG_GRANULE is 16 and
then relying on the codegen for a MO_128 | MO_ALIGN
operation to give us the alignment fault if the guest
address isn't aligned to the tag granule, right ?

Previously we also put s->be_data into the MemOp
(via finalize_memop_pair() calling finalize_memop_atom()).
Don't we still need to do that? (We do explicitly swap
the two i64s into the i128 in different orders depending
on the be_data setting, but I think that is to handle
MO_128 | MO_BE giving a true BE 128 bit store, whereas
what we're implementing is two BE 64 bit stores.)


>      tmp = tcg_temp_new_i128();
>      if (s->be_data == MO_LE) {
> @@ -3060,6 +3040,15 @@ static bool trans_STGP(DisasContext *s, arg_ldstpair *a)
>      }
>      tcg_gen_qemu_st_i128(tmp, clean_addr, get_mem_index(s), mop);

thanks
-- PMM
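
A minimal sketch of the layout distinction being raised here (an illustration
only, not code from the patch; it assumes the tcg_gen_concat_i64_i128(dest,
lo, hi) operand order):

    /*
     * STGP stores Xt at [addr] and Xt2 at [addr + 8], each as a single
     * 64-bit value in the guest's endianness:
     *
     *   LE guest:  i128 low half  = Xt, high half = Xt2, stored little-endian
     *   BE guest:  i128 high half = Xt, low half  = Xt2, stored big-endian
     *
     * The concat order chooses which register lands at the lower address,
     * but the byte order of each 64-bit half still comes from the
     * MO_BE/MO_LE bit in the MemOp, which is why s->be_data still needs to
     * be folded into mop.
     */
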
Richard Henderson Aug. 3, 2023, 2:42 p.m. UTC | #4
On 8/3/23 06:10, Peter Maydell wrote:
> On Thu, 27 Jul 2023 at 17:33, Richard Henderson
>> -    mop = MO_128;
>> -    if (s->align_mem) {
>> -        mop |= MO_ALIGN_8;
>> -    }
>> -    mop = finalize_memop_pair(s, mop);
>> +    mop = MO_128 | MO_ALIGN | MO_ATOM_IFALIGN_PAIR;
> 
> So here we're implicitly assuming TAG_GRANULE is 16 and
> then relying on the codegen for a MO_128 | MO_ALIGN
> operation to give us the alignment fault if the guest
> address isn't aligned to the tag granule, right ?

Yes.

> 
> Previously we also put s->be_data into the MemOp
> (via finalize_memop_pair() calling finalize_memop_atom()).
> Don't we still need to do that?

Whoops, yes.


r~
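
For reference, one way a respin could fold s->be_data back in is to go through
the existing finalize_memop_atom() helper that the thread mentions (a sketch of
a plausible fix, not the posted v2):

    /*
     * Rebuild mop as for STP, but force the 16-byte alignment required by
     * the tag granule.  MO_ATOM_IFALIGN_PAIR keeps the required atomicity
     * down to 8 bytes per half, and finalize_memop_atom() ORs in s->be_data,
     * restoring the big-endian handling that the bare
     * MO_128 | MO_ALIGN | MO_ATOM_IFALIGN_PAIR constant dropped.
     */
    mop = finalize_memop_atom(s, MO_128 | MO_ALIGN, MO_ATOM_IFALIGN_PAIR);
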
Peter Maydell Sept. 1, 2023, 12:36 p.m. UTC | #5
On Thu, 3 Aug 2023 at 15:42, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 8/3/23 06:10, Peter Maydell wrote:
> > On Thu, 27 Jul 2023 at 17:33, Richard Henderson
> >> -    mop = MO_128;
> >> -    if (s->align_mem) {
> >> -        mop |= MO_ALIGN_8;
> >> -    }
> >> -    mop = finalize_memop_pair(s, mop);
> >> +    mop = MO_128 | MO_ALIGN | MO_ATOM_IFALIGN_PAIR;
> >
> > So here we're implicitly assuming TAG_GRANULE is 16 and
> > then relying on the codegen for a MO_128 | MO_ALIGN
> > operation to give us the alignment fault if the guest
> > address isn't aligned to the tag granule, right ?
>
> Yes.
>
> >
> > Previously we also put s->be_data into the MemOp
> > (via finalize_memop_pair() calling finalize_memop_atom()).
> > Don't we still need to do that?
>
> Whoops, yes.

I think you still need to respin this one, right?

(I re-found it because I remembered this "writes tags,
doesn't check them" thing and that it ought to apply to
the FEAT_MOPS SETG block-tag-set that I'm writing also.)

thanks
-- PMM

Patch

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 5fa1257d32..dfd18e19ca 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -3020,37 +3020,17 @@  static bool trans_STGP(DisasContext *s, arg_ldstpair *a)
         tcg_gen_addi_i64(dirty_addr, dirty_addr, offset);
     }
 
-    if (!s->ata) {
-        /*
-         * TODO: We could rely on the stores below, at least for
-         * system mode, if we arrange to add MO_ALIGN_16.
-         */
-        gen_helper_stg_stub(cpu_env, dirty_addr);
-    } else if (tb_cflags(s->base.tb) & CF_PARALLEL) {
-        gen_helper_stg_parallel(cpu_env, dirty_addr, dirty_addr);
-    } else {
-        gen_helper_stg(cpu_env, dirty_addr, dirty_addr);
-    }
-
-    mop = finalize_memop(s, MO_64);
-    clean_addr = gen_mte_checkN(s, dirty_addr, true, false, 2 << MO_64, mop);
-
+    clean_addr = clean_data_tbi(s, dirty_addr);
     tcg_rt = cpu_reg(s, a->rt);
     tcg_rt2 = cpu_reg(s, a->rt2);
 
     /*
-     * STGP is defined as two 8-byte memory operations and one tag operation.
-     * We implement it as one single 16-byte memory operation for convenience.
-     * Rebuild mop as for STP.
-     * TODO: The atomicity with LSE2 is stronger than required.
-     * Need a form of MO_ATOM_WITHIN16_PAIR that never requires
-     * 16-byte atomicity.
+     * STGP is defined as two 8-byte memory operations, aligned to TAG_GRANULE,
+     * and one tag operation.  We implement it as one single aligned 16-byte
+     * memory operation for convenience.  Note that the alignment ensures
+     * MO_ATOM_IFALIGN_PAIR produces 8-byte atomicity for the memory store.
      */
-    mop = MO_128;
-    if (s->align_mem) {
-        mop |= MO_ALIGN_8;
-    }
-    mop = finalize_memop_pair(s, mop);
+    mop = MO_128 | MO_ALIGN | MO_ATOM_IFALIGN_PAIR;
 
     tmp = tcg_temp_new_i128();
     if (s->be_data == MO_LE) {
@@ -3060,6 +3040,15 @@  static bool trans_STGP(DisasContext *s, arg_ldstpair *a)
     }
     tcg_gen_qemu_st_i128(tmp, clean_addr, get_mem_index(s), mop);
 
+    /* Perform the tag store, if tag access enabled. */
+    if (s->ata) {
+        if (tb_cflags(s->base.tb) & CF_PARALLEL) {
+            gen_helper_stg_parallel(cpu_env, dirty_addr, dirty_addr);
+        } else {
+            gen_helper_stg(cpu_env, dirty_addr, dirty_addr);
+        }
+    }
+
     op_addr_ldstpair_post(s, a, dirty_addr, offset);
     return true;
 }
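
If the implicit TAG_GRANULE == 16 assumption discussed in the review is worth
making explicit, a compile-time assertion next to the store could record it
(a suggestion only, not part of this patch; TAG_GRANULE comes from
target/arm/internals.h):

    /*
     * The MO_128 | MO_ALIGN store faults on any address that is not
     * 16-byte aligned, which covers the tag-granule alignment requirement
     * only as long as the granule really is 16 bytes.
     */
    QEMU_BUILD_BUG_ON(TAG_GRANULE != 16);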