Message ID | 20180809042206.15726-12-richard.henderson@linaro.org |
---|---|
State | Superseded |
Headers | show |
Series | target/arm: sve system mode patches | expand |
On 9 August 2018 at 05:21, Richard Henderson <richard.henderson@linaro.org> wrote: > The 16-byte load only uses 16 predicate bits. But while > reusing the other load infrastructure, we find other bits > that are set and trigger an assert. To avoid this and > retain the assert, zero-extend the predicate that we pass > to the LD1 helper. > > Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com> > Signed-off-by: Richard Henderson <richard.henderson@linaro.org> > --- > target/arm/translate-sve.c | 25 +++++++++++++++++++++++-- > 1 file changed, 23 insertions(+), 2 deletions(-) > > diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c > index d27bc8c946..bef6b8242d 100644 > --- a/target/arm/translate-sve.c > +++ b/target/arm/translate-sve.c > @@ -4765,12 +4765,33 @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz) > unsigned vsz = vec_full_reg_size(s); > TCGv_ptr t_pg; > TCGv_i32 desc; > + int poff; > > /* Load the first quadword using the normal predicated load helpers. */ > desc = tcg_const_i32(simd_desc(16, 16, zt)); > - t_pg = tcg_temp_new_ptr(); > > - tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); > + poff = pred_full_reg_offset(s, pg); > + if (vsz > 16) { > + /* > + * Zero-extend the first 16 bits of the predicate into a temporary. > + * This avoids triggering an assert making sure we don't have bits > + * set within a predicate beyond VQ, but we have lowered VQ to 1 > + * for this load operation. > + */ > + TCGv_i64 tmp = tcg_temp_new_i64(); > +#ifdef HOST_WORDS_BIGENDIAN > + poff += 6; > +#endif > + tcg_gen_ld16u_i64(tmp, cpu_env, poff); > + > + poff = offsetof(CPUARMState, vfp.preg_tmp); > + tcg_gen_st_i64(tmp, cpu_env, poff); > + tcg_temp_free_i64(tmp); > + } > + > + t_pg = tcg_temp_new_ptr(); > + tcg_gen_addi_ptr(t_pg, cpu_env, poff); Reviewed-by: Peter Maydell <peter.maydell@linaro.org> The bigendian #ifdef in the middle of the code is a little ugly, though -- I don't suppose it's possible to avoid it (or abstract it away) somehow? thanks -- PMM
On 08/23/2018 08:21 AM, Peter Maydell wrote: > On 9 August 2018 at 05:21, Richard Henderson > <richard.henderson@linaro.org> wrote: >> The 16-byte load only uses 16 predicate bits. But while >> reusing the other load infrastructure, we find other bits >> that are set and trigger an assert. To avoid this and >> retain the assert, zero-extend the predicate that we pass >> to the LD1 helper. >> >> Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com> >> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> >> --- >> target/arm/translate-sve.c | 25 +++++++++++++++++++++++-- >> 1 file changed, 23 insertions(+), 2 deletions(-) >> >> diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c >> index d27bc8c946..bef6b8242d 100644 >> --- a/target/arm/translate-sve.c >> +++ b/target/arm/translate-sve.c >> @@ -4765,12 +4765,33 @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz) >> unsigned vsz = vec_full_reg_size(s); >> TCGv_ptr t_pg; >> TCGv_i32 desc; >> + int poff; >> >> /* Load the first quadword using the normal predicated load helpers. */ >> desc = tcg_const_i32(simd_desc(16, 16, zt)); >> - t_pg = tcg_temp_new_ptr(); >> >> - tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); >> + poff = pred_full_reg_offset(s, pg); >> + if (vsz > 16) { >> + /* >> + * Zero-extend the first 16 bits of the predicate into a temporary. >> + * This avoids triggering an assert making sure we don't have bits >> + * set within a predicate beyond VQ, but we have lowered VQ to 1 >> + * for this load operation. >> + */ >> + TCGv_i64 tmp = tcg_temp_new_i64(); >> +#ifdef HOST_WORDS_BIGENDIAN >> + poff += 6; >> +#endif >> + tcg_gen_ld16u_i64(tmp, cpu_env, poff); >> + >> + poff = offsetof(CPUARMState, vfp.preg_tmp); >> + tcg_gen_st_i64(tmp, cpu_env, poff); >> + tcg_temp_free_i64(tmp); >> + } >> + >> + t_pg = tcg_temp_new_ptr(); >> + tcg_gen_addi_ptr(t_pg, cpu_env, poff); > > Reviewed-by: Peter Maydell <peter.maydell@linaro.org> > > The bigendian #ifdef in the middle of the code is a little > ugly, though -- I don't suppose it's possible to avoid it > (or abstract it away) somehow? Adding a helper function for this one use didn't seem worthwhile. But I certainly can duplicate the form of vec_reg_offset if you prefer. r~
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index d27bc8c946..bef6b8242d 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4765,12 +4765,33 @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz) unsigned vsz = vec_full_reg_size(s); TCGv_ptr t_pg; TCGv_i32 desc; + int poff; /* Load the first quadword using the normal predicated load helpers. */ desc = tcg_const_i32(simd_desc(16, 16, zt)); - t_pg = tcg_temp_new_ptr(); - tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + poff = pred_full_reg_offset(s, pg); + if (vsz > 16) { + /* + * Zero-extend the first 16 bits of the predicate into a temporary. + * This avoids triggering an assert making sure we don't have bits + * set within a predicate beyond VQ, but we have lowered VQ to 1 + * for this load operation. + */ + TCGv_i64 tmp = tcg_temp_new_i64(); +#ifdef HOST_WORDS_BIGENDIAN + poff += 6; +#endif + tcg_gen_ld16u_i64(tmp, cpu_env, poff); + + poff = offsetof(CPUARMState, vfp.preg_tmp); + tcg_gen_st_i64(tmp, cpu_env, poff); + tcg_temp_free_i64(tmp); + } + + t_pg = tcg_temp_new_ptr(); + tcg_gen_addi_ptr(t_pg, cpu_env, poff); + fns[msz](cpu_env, t_pg, addr, desc); tcg_temp_free_ptr(t_pg);
The 16-byte load only uses 16 predicate bits. But while reusing the other load infrastructure, we find other bits that are set and trigger an assert. To avoid this and retain the assert, zero-extend the predicate that we pass to the LD1 helper. Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- target/arm/translate-sve.c | 25 +++++++++++++++++++++++-- 1 file changed, 23 insertions(+), 2 deletions(-) -- 2.17.1