diff mbox series

translate.c: Fix usermode big-endian AArch32 LDREXD and STREXD

Message ID 1508522170-22539-1-git-send-email-peter.maydell@linaro.org
State Superseded
Headers show
Series translate.c: Fix usermode big-endian AArch32 LDREXD and STREXD | expand

Commit Message

Peter Maydell Oct. 20, 2017, 5:56 p.m. UTC
For AArch32 LDREXD and STREXD, architecturally the 32-bit word at the
lowest address is always Rt and the one at addr+4 is Rt2, even if the
CPU is big-endian. Our implementation does these with a single
64-bit store, so if we're big-endian then we need to put the two
32-bit halves together in the opposite order to little-endian,
so that they end up in the right places. We were trying to do
this with the gen_aa32_frob64() function, but that is not correct
for the usermode emulator, because there there is a distinction
between "load a 64 bit value" (which does a BE 64-bit access
and doesn't need swapping) and "load two 32 bit values as one
64 bit access" (where we still need to do the swapping, like
system mode BE32).

Fixes: https://bugs.launchpad.net/qemu/+bug/1725267

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

---
This is very much "last thing on Friday", but I'm going to be
travelling next week so thought I'd shove it out for review
and testing. It does fix the test case in the bug. I have
not tested:
 * BE32 system mode
 * BE8
 * little endian :-)

This should probably be cc: stable when it gets through
review and testing.
---
 target/arm/translate.c | 40 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 35 insertions(+), 5 deletions(-)

-- 
2.7.4

Comments

Richard Henderson Oct. 20, 2017, 6:15 p.m. UTC | #1
On 10/20/2017 10:56 AM, Peter Maydell wrote:
> +        if (s->be_data) {

> +            tcg_gen_extr_i64_i32(tmp2, tmp, t64);

> +        } else {

> +            tcg_gen_extr_i64_i32(tmp, tmp2, t64);

> +        }


The test against be_data must be of the form s->be_data == MO_BE/LE.

The actual bits have MO_BSWAP non-zero, and then set MO_BE/LE to either 0 or
MO_BSWAP depending on the host endianness.


> -        gen_aa32_frob64(s, o64);

> +        if (s->be_data) {

> +            tcg_gen_rotri_i64(o64, o64, 32);

> +        }

>          tcg_gen_setcond_i64(TCG_COND_NE, o64, o64, cpu_exclusive_val);


We're not splitting o64 to parts.  Are you sure it shouldn't stay as frob?


r~
Peter Maydell Oct. 21, 2017, 10:17 a.m. UTC | #2
On 20 October 2017 at 19:15, Richard Henderson <rth@twiddle.net> wrote:
> On 10/20/2017 10:56 AM, Peter Maydell wrote:

>> +        if (s->be_data) {

>> +            tcg_gen_extr_i64_i32(tmp2, tmp, t64);

>> +        } else {

>> +            tcg_gen_extr_i64_i32(tmp, tmp2, t64);

>> +        }

>

> The test against be_data must be of the form s->be_data == MO_BE/LE.

>

> The actual bits have MO_BSWAP non-zero, and then set MO_BE/LE to either 0 or

> MO_BSWAP depending on the host endianness.


Oops, yes.

>> -        gen_aa32_frob64(s, o64);

>> +        if (s->be_data) {

>> +            tcg_gen_rotri_i64(o64, o64, 32);

>> +        }

>>          tcg_gen_setcond_i64(TCG_COND_NE, o64, o64, cpu_exclusive_val);

>

> We're not splitting o64 to parts.  Are you sure it shouldn't stay as frob?


This is confusing, but I don't think frob is right. We want to
be matching either (a) the transformation we just did to produce
the 64 bit data we're storing, or (b) the transformation we
do on the ldrexd, don't we? Neither of those is frob.

I think I need to think through a bit more carefully about
what's actually going on here, since we seem to have the
value in two places (actual memory, and the cpu_exclusive_val
TCGv.)

thanks
-- PMM
Richard Henderson Oct. 21, 2017, 6:56 p.m. UTC | #3
On 10/21/2017 03:17 AM, Peter Maydell wrote:
>>> -        gen_aa32_frob64(s, o64);

>>> +        if (s->be_data) {

>>> +            tcg_gen_rotri_i64(o64, o64, 32);

>>> +        }

>>>          tcg_gen_setcond_i64(TCG_COND_NE, o64, o64, cpu_exclusive_val);

>>

>> We're not splitting o64 to parts.  Are you sure it shouldn't stay as frob?

> 

> This is confusing, but I don't think frob is right. We want to

> be matching either (a) the transformation we just did to produce

> the 64 bit data we're storing, or (b) the transformation we

> do on the ldrexd, don't we? Neither of those is frob.

> 

> I think I need to think through a bit more carefully about

> what's actually going on here, since we seem to have the

> value in two places (actual memory, and the cpu_exclusive_val

> TCGv.)


We should match what we do in ldrexd, I think.  Which is just a straight BE
load.  So I guess the frob64 should be dropped and nothing should replace it.


r~
diff mbox series

Patch

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 4da1a4c..88892e0 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7902,9 +7902,25 @@  static void gen_load_exclusive(DisasContext *s, int rt, int rt2,
         TCGv_i32 tmp2 = tcg_temp_new_i32();
         TCGv_i64 t64 = tcg_temp_new_i64();
 
-        gen_aa32_ld_i64(s, t64, addr, get_mem_index(s), opc);
+        /* For AArch32, architecturally the 32-bit word at the lowest
+         * address is always Rt and the one at addr+4 is Rt2, even if
+         * the CPU is big-endian. That means we don't want to do a
+         * gen_aa32_ld_i64(), which invokes gen_aa32_frob64() as if
+         * for an architecturally 64-bit access, but instead do a
+         * 64-bit access using MO_BE if appropriate and then split
+         * the two halves.
+         * This only makes a difference for BE32 user-mode, where
+         * frob64() must not flip the two halves of the 64-bit data
+         * but this code must treat BE32 user-mode like BE32 system.
+         */
+        TCGv a = gen_aa32_addr(s, addr, opc);
+        tcg_gen_qemu_ld_i64(t64, a, get_mem_index(s), opc);
         tcg_gen_mov_i64(cpu_exclusive_val, t64);
-        tcg_gen_extr_i64_i32(tmp, tmp2, t64);
+        if (s->be_data) {
+            tcg_gen_extr_i64_i32(tmp2, tmp, t64);
+        } else {
+            tcg_gen_extr_i64_i32(tmp, tmp2, t64);
+        }
         tcg_temp_free_i64(t64);
 
         store_reg(s, rt2, tmp2);
@@ -7953,15 +7969,29 @@  static void gen_store_exclusive(DisasContext *s, int rd, int rt, int rt2,
         TCGv_i64 n64 = tcg_temp_new_i64();
 
         t2 = load_reg(s, rt2);
-        tcg_gen_concat_i32_i64(n64, t1, t2);
+        /* For AArch32, architecturally the 32-bit word at the lowest
+         * address is always Rt and the one at addr+4 is Rt2, even if
+         * the CPU is big-endian. Since we're going to treat this as a
+         * single 64-bit BE store, we need to put the two halves in the
+         * opposite order for BE to LE, so that they end up in the right
+         * places.
+         * We don't want gen_aa32_frob64() because that does the wrong
+         * thing for BE32 usermode.
+         */
+        if (s->be_data) {
+            tcg_gen_concat_i32_i64(n64, t2, t1);
+        } else {
+            tcg_gen_concat_i32_i64(n64, t1, t2);
+        }
         tcg_temp_free_i32(t2);
-        gen_aa32_frob64(s, n64);
 
         tcg_gen_atomic_cmpxchg_i64(o64, taddr, cpu_exclusive_val, n64,
                                    get_mem_index(s), opc);
         tcg_temp_free_i64(n64);
 
-        gen_aa32_frob64(s, o64);
+        if (s->be_data) {
+            tcg_gen_rotri_i64(o64, o64, 32);
+        }
         tcg_gen_setcond_i64(TCG_COND_NE, o64, o64, cpu_exclusive_val);
         tcg_gen_extrl_i64_i32(t0, o64);