| Message ID | 20230727162621.445400-1-richard.henderson@linaro.org |
|---|---|
| State | Superseded |
| Series | target/arm: Do not use gen_mte_checkN in trans_STGP |
On Thu, 27 Jul 2023 at 17:33, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> STGP writes to tag memory, it does not check it.
> This happened to work because we wrote tag memory first
> so that the check always succeeded.

So this is code cleanup to be more sensible, rather
than a guest-visible bug?

thanks
-- PMM
On 7/28/23 06:17, Peter Maydell wrote:
> On Thu, 27 Jul 2023 at 17:33, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> STGP writes to tag memory, it does not check it.
>> This happened to work because we wrote tag memory first
>> so that the check always succeeded.
>
> So this is code cleanup to be more sensible, rather
> than a guest visible bug ?

Yes.

r~
On Thu, 27 Jul 2023 at 17:33, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> STGP writes to tag memory, it does not check it.
> This happened to work because we wrote tag memory first
> so that the check always succeeded.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  target/arm/tcg/translate-a64.c | 41 +++++++++++++---------------------
>  1 file changed, 15 insertions(+), 26 deletions(-)
>
> diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
> index 5fa1257d32..dfd18e19ca 100644
> --- a/target/arm/tcg/translate-a64.c
> +++ b/target/arm/tcg/translate-a64.c
> @@ -3020,37 +3020,17 @@ static bool trans_STGP(DisasContext *s, arg_ldstpair *a)
>          tcg_gen_addi_i64(dirty_addr, dirty_addr, offset);
>      }
>
> -    if (!s->ata) {
> -        /*
> -         * TODO: We could rely on the stores below, at least for
> -         * system mode, if we arrange to add MO_ALIGN_16.
> -         */
> -        gen_helper_stg_stub(cpu_env, dirty_addr);
> -    } else if (tb_cflags(s->base.tb) & CF_PARALLEL) {
> -        gen_helper_stg_parallel(cpu_env, dirty_addr, dirty_addr);
> -    } else {
> -        gen_helper_stg(cpu_env, dirty_addr, dirty_addr);
> -    }
> -
> -    mop = finalize_memop(s, MO_64);
> -    clean_addr = gen_mte_checkN(s, dirty_addr, true, false, 2 << MO_64, mop);
> -
> +    clean_addr = clean_data_tbi(s, dirty_addr);
>      tcg_rt = cpu_reg(s, a->rt);
>      tcg_rt2 = cpu_reg(s, a->rt2);
>
>      /*
> -     * STGP is defined as two 8-byte memory operations and one tag operation.
> -     * We implement it as one single 16-byte memory operation for convenience.
> -     * Rebuild mop as for STP.
> -     * TODO: The atomicity with LSE2 is stronger than required.
> -     * Need a form of MO_ATOM_WITHIN16_PAIR that never requires
> -     * 16-byte atomicity.
> +     * STGP is defined as two 8-byte memory operations, aligned to TAG_GRANULE,
> +     * and one tag operation.  We implement it as one single aligned 16-byte
> +     * memory operation for convenience.  Note that the alignment ensures
> +     * MO_ATOM_IFALIGN_PAIR produces 8-byte atomicity for the memory store.
>      */
> -    mop = MO_128;
> -    if (s->align_mem) {
> -        mop |= MO_ALIGN_8;
> -    }
> -    mop = finalize_memop_pair(s, mop);
> +    mop = MO_128 | MO_ALIGN | MO_ATOM_IFALIGN_PAIR;

So here we're implicitly assuming TAG_GRANULE is 16 and
then relying on the codegen for a MO_128 | MO_ALIGN
operation to give us the alignment fault if the guest
address isn't aligned to the tag granule, right?

Previously we also put s->be_data into the MemOp
(via finalize_memop_pair() calling finalize_memop_atom()).
Don't we still need to do that? (We do explicitly swap the
two i64s into the i128 in different orders depending on
the be_data setting, but I think that is to handle
MO_128 | MO_BE giving a true BE 128-bit store, whereas
what we're implementing is two BE 64-bit stores.)

>      tmp = tcg_temp_new_i128();
>      if (s->be_data == MO_LE) {
> @@ -3060,6 +3040,15 @@ static bool trans_STGP(DisasContext *s, arg_ldstpair *a)
>      }
>      tcg_gen_qemu_st_i128(tmp, clean_addr, get_mem_index(s), mop);

thanks
-- PMM
On 8/3/23 06:10, Peter Maydell wrote:
> On Thu, 27 Jul 2023 at 17:33, Richard Henderson
>> -    mop = MO_128;
>> -    if (s->align_mem) {
>> -        mop |= MO_ALIGN_8;
>> -    }
>> -    mop = finalize_memop_pair(s, mop);
>> +    mop = MO_128 | MO_ALIGN | MO_ATOM_IFALIGN_PAIR;
>
> So here we're implicitly assuming TAG_GRANULE is 16 and
> then relying on the codegen for a MO_128 | MO_ALIGN
> operation to give us the alignment fault if the guest
> address isn't aligned to the tag granule, right ?

Yes.

> Previously we also put s->be_data into the MemOp
> (via finalize_memop_pair() calling finalize_memop_atom()).
> Don't we still need to do that ?

Whoops, yes.

r~
On Thu, 3 Aug 2023 at 15:42, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 8/3/23 06:10, Peter Maydell wrote:
> > On Thu, 27 Jul 2023 at 17:33, Richard Henderson
> >> -    mop = MO_128;
> >> -    if (s->align_mem) {
> >> -        mop |= MO_ALIGN_8;
> >> -    }
> >> -    mop = finalize_memop_pair(s, mop);
> >> +    mop = MO_128 | MO_ALIGN | MO_ATOM_IFALIGN_PAIR;
> >
> > So here we're implicitly assuming TAG_GRANULE is 16 and
> > then relying on the codegen for a MO_128 | MO_ALIGN
> > operation to give us the alignment fault if the guest
> > address isn't aligned to the tag granule, right ?
>
> Yes.
>
> > Previously we also put s->be_data into the MemOp
> > (via finalize_memop_pair() calling finalize_memop_atom()).
> > Don't we still need to do that ?
>
> Whoops, yes.

I think you still need to respin this one, right?
(I re-found it because I remembered this "writes tags,
doesn't check them" thing and that it ought to apply to
the FEAT_MOPS SETG block-tag-set that I'm writing also.)

thanks
-- PMM
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 5fa1257d32..dfd18e19ca 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -3020,37 +3020,17 @@ static bool trans_STGP(DisasContext *s, arg_ldstpair *a)
         tcg_gen_addi_i64(dirty_addr, dirty_addr, offset);
     }
 
-    if (!s->ata) {
-        /*
-         * TODO: We could rely on the stores below, at least for
-         * system mode, if we arrange to add MO_ALIGN_16.
-         */
-        gen_helper_stg_stub(cpu_env, dirty_addr);
-    } else if (tb_cflags(s->base.tb) & CF_PARALLEL) {
-        gen_helper_stg_parallel(cpu_env, dirty_addr, dirty_addr);
-    } else {
-        gen_helper_stg(cpu_env, dirty_addr, dirty_addr);
-    }
-
-    mop = finalize_memop(s, MO_64);
-    clean_addr = gen_mte_checkN(s, dirty_addr, true, false, 2 << MO_64, mop);
-
+    clean_addr = clean_data_tbi(s, dirty_addr);
     tcg_rt = cpu_reg(s, a->rt);
     tcg_rt2 = cpu_reg(s, a->rt2);
 
     /*
-     * STGP is defined as two 8-byte memory operations and one tag operation.
-     * We implement it as one single 16-byte memory operation for convenience.
-     * Rebuild mop as for STP.
-     * TODO: The atomicity with LSE2 is stronger than required.
-     * Need a form of MO_ATOM_WITHIN16_PAIR that never requires
-     * 16-byte atomicity.
+     * STGP is defined as two 8-byte memory operations, aligned to TAG_GRANULE,
+     * and one tag operation.  We implement it as one single aligned 16-byte
+     * memory operation for convenience.  Note that the alignment ensures
+     * MO_ATOM_IFALIGN_PAIR produces 8-byte atomicity for the memory store.
      */
-    mop = MO_128;
-    if (s->align_mem) {
-        mop |= MO_ALIGN_8;
-    }
-    mop = finalize_memop_pair(s, mop);
+    mop = MO_128 | MO_ALIGN | MO_ATOM_IFALIGN_PAIR;
 
     tmp = tcg_temp_new_i128();
     if (s->be_data == MO_LE) {
@@ -3060,6 +3040,15 @@ static bool trans_STGP(DisasContext *s, arg_ldstpair *a)
     }
     tcg_gen_qemu_st_i128(tmp, clean_addr, get_mem_index(s), mop);
 
+    /* Perform the tag store, if tag access enabled. */
+    if (s->ata) {
+        if (tb_cflags(s->base.tb) & CF_PARALLEL) {
+            gen_helper_stg_parallel(cpu_env, dirty_addr, dirty_addr);
+        } else {
+            gen_helper_stg(cpu_env, dirty_addr, dirty_addr);
+        }
+    }
+
     op_addr_ldstpair_post(s, a, dirty_addr, offset);
     return true;
 }
STGP writes to tag memory, it does not check it.
This happened to work because we wrote tag memory first
so that the check always succeeded.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/tcg/translate-a64.c | 41 +++++++++++++---------------------
 1 file changed, 15 insertions(+), 26 deletions(-)