diff mbox series

[PULL,11/27] arm: Allow system registers for KVM guests to be changed by QEMU code

Message ID 20190214190603.25030-12-peter.maydell@linaro.org
State Accepted
Commit 823e1b3818f9b10b824ddcd756983b6e2fa68730
Headers show
Series target-arm queue | expand

Commit Message

Peter Maydell Feb. 14, 2019, 7:05 p.m. UTC
At the moment the Arm implementations of kvm_arch_{get,put}_registers()
don't support having QEMU change the values of system registers
(aka coprocessor registers for AArch32). This is because although
kvm_arch_get_registers() calls write_list_to_cpustate() to
update the CPU state struct fields (so QEMU code can read the
values in the usual way), kvm_arch_put_registers() does not
call write_cpustate_to_list(), meaning that any changes to
the CPU state struct fields will not be passed back to KVM.

The rationale for this design is documented in a comment in the
AArch32 kvm_arch_put_registers() -- writing the values in the
cpregs list into the CPU state struct is "lossy" because the
write of a register might not succeed, and so if we blindly
copy the CPU state values back again we will incorrectly
change register values for the guest. The assumption was that
no QEMU code would need to write to the registers.

However, when we implemented debug support for KVM guests, we
broke that assumption: the code to handle "set the guest up
to take a breakpoint exception" does so by updating various
guest registers including ESR_EL1.

Support this by making kvm_arch_put_registers() synchronize
CPU state back into the list. We sync only those registers
where the initial write succeeds, which should be sufficient.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

Tested-by: Alex Bennée <alex.bennee@linaro.org>

Tested-by: Dongjiu Geng <gengdongjiu@huawei.com>

---
 target/arm/cpu.h     |  9 ++++++++-
 target/arm/helper.c  | 27 +++++++++++++++++++++++++--
 target/arm/kvm32.c   | 20 ++------------------
 target/arm/kvm64.c   |  2 ++
 target/arm/machine.c |  2 +-
 5 files changed, 38 insertions(+), 22 deletions(-)

-- 
2.20.1

Comments

Eric Auger Feb. 21, 2019, 2:20 p.m. UTC | #1
Hi Peter, Alex,
On 2/14/19 8:05 PM, Peter Maydell wrote:
> At the moment the Arm implementations of kvm_arch_{get,put}_registers()

> don't support having QEMU change the values of system registers

> (aka coprocessor registers for AArch32). This is because although

> kvm_arch_get_registers() calls write_list_to_cpustate() to

> update the CPU state struct fields (so QEMU code can read the

> values in the usual way), kvm_arch_put_registers() does not

> call write_cpustate_to_list(), meaning that any changes to

> the CPU state struct fields will not be passed back to KVM.

> 

> The rationale for this design is documented in a comment in the

> AArch32 kvm_arch_put_registers() -- writing the values in the

> cpregs list into the CPU state struct is "lossy" because the

> write of a register might not succeed, and so if we blindly

> copy the CPU state values back again we will incorrectly

> change register values for the guest. The assumption was that

> no QEMU code would need to write to the registers.

> 

> However, when we implemented debug support for KVM guests, we

> broke that assumption: the code to handle "set the guest up

> to take a breakpoint exception" does so by updating various

> guest registers including ESR_EL1.

> 

> Support this by making kvm_arch_put_registers() synchronize

> CPU state back into the list. We sync only those registers

> where the initial write succeeds, which should be sufficient.

> 

> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> Tested-by: Alex Bennée <alex.bennee@linaro.org>

> Tested-by: Dongjiu Geng <gengdongjiu@huawei.com>

This commit introduces a regression when running with EDK2 FW:

I get the following traces:

error: kvm run failed Function not implemented
 PC=000000013f5a6208 X00=00000000404003c4 X01=000000000000003a
X02=0000000000000000 X03=00000000404003c4 X04=0000000000000000
X05=0000000096000046 X06=000000013d2ef270 X07=000000013e3d1710
X08=09010755ffaf8ba8 X09=ffaf8b9cfeeb5468 X10=feeb546409010756
X11=09010757ffaf8b90 X12=feeb50680903068b X13=090306a1ffaf8bc0
X14=0000000000000000 X15=0000000000000000 X16=000000013f872da0
X17=00000000ffffa6ab X18=0000000000000000 X19=000000013f5a92d0
X20=000000013f5a7a78 X21=000000000000003a X22=000000013f5a7ab2
X23=000000013f5a92e8 X24=000000013f631090 X25=0000000000000010
X26=0000000000000100 X27=000000013f89501b X28=000000013e3d14e0
X29=000000013e3d12a0 X30=000000013f5a2518  SP=000000013b7be0b0
PSTATE=404003c4 -Z-- EL1t


and in host dmesg:
[ 3507.926571] kvm [35042]: load/store instruction decoding not implemented


My qemu cmd line is:

qemu-system-aarch64 -M virt,gic-version=3 -cpu host -smp 2 -m 4G
-display none --enable-kvm -serial tcp:localhost:4444,server -qmp
unix:/home/augere/TEST/QEMU/qmp-sock,server,nowait -device
virtio-blk-pci,bus=pcie.0,scsi=off,drive=drv0,id=virtio-disk0,bootindex=1,werror=stop,rerror=stop
-drive
file=aarch64-vm0-rhel-alt-7.6.qcow2,format=qcow2,if=none,cache=writethrough,id=drv0
-device virtio-net-pci,bus=pcie.0,netdev=nic0,mac=6a:f5:10:b1:3d:d2
-netdev
tap,id=nic0,script=/home/augere/TEST/SCRIPTS/qemu-ifup,downscript=/home/augere/TEST/SCRIPTS/qemu-ifdown
-bios
/home/augere/UPSTREAM/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/FV/QEMU_EFI.fd
-net none -d guest_errors

reverting the patch removes the guest crash. Any clue?

thanks

Eric


> ---

>  target/arm/cpu.h     |  9 ++++++++-

>  target/arm/helper.c  | 27 +++++++++++++++++++++++++--

>  target/arm/kvm32.c   | 20 ++------------------

>  target/arm/kvm64.c   |  2 ++

>  target/arm/machine.c |  2 +-

>  5 files changed, 38 insertions(+), 22 deletions(-)

> 

> diff --git a/target/arm/cpu.h b/target/arm/cpu.h

> index f0334413ece..bfc05c796a5 100644

> --- a/target/arm/cpu.h

> +++ b/target/arm/cpu.h

> @@ -2535,18 +2535,25 @@ bool write_list_to_cpustate(ARMCPU *cpu);

>  /**

>   * write_cpustate_to_list:

>   * @cpu: ARMCPU

> + * @kvm_sync: true if this is for syncing back to KVM

>   *

>   * For each register listed in the ARMCPU cpreg_indexes list, write

>   * its value from the ARMCPUState structure into the cpreg_values list.

>   * This is used to copy info from TCG's working data structures into

>   * KVM or for outbound migration.

>   *

> + * @kvm_sync is true if we are doing this in order to sync the

> + * register state back to KVM. In this case we will only update

> + * values in the list if the previous list->cpustate sync actually

> + * successfully wrote the CPU state. Otherwise we will keep the value

> + * that is in the list.

> + *

>   * Returns: true if all register values were read correctly,

>   * false if some register was unknown or could not be read.

>   * Note that we do not stop early on failure -- we will attempt

>   * reading all registers in the list.

>   */

> -bool write_cpustate_to_list(ARMCPU *cpu);

> +bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);

>  

>  #define ARM_CPUID_TI915T      0x54029152

>  #define ARM_CPUID_TI925T      0x54029252

> diff --git a/target/arm/helper.c b/target/arm/helper.c

> index 5ac335f598c..7653aa6a50a 100644

> --- a/target/arm/helper.c

> +++ b/target/arm/helper.c

> @@ -264,7 +264,7 @@ static bool raw_accessors_invalid(const ARMCPRegInfo *ri)

>      return true;

>  }

>  

> -bool write_cpustate_to_list(ARMCPU *cpu)

> +bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync)

>  {

>      /* Write the coprocessor state from cpu->env to the (index,value) list. */

>      int i;

> @@ -273,6 +273,7 @@ bool write_cpustate_to_list(ARMCPU *cpu)

>      for (i = 0; i < cpu->cpreg_array_len; i++) {

>          uint32_t regidx = kvm_to_cpreg_id(cpu->cpreg_indexes[i]);

>          const ARMCPRegInfo *ri;

> +        uint64_t newval;

>  

>          ri = get_arm_cp_reginfo(cpu->cp_regs, regidx);

>          if (!ri) {

> @@ -282,7 +283,29 @@ bool write_cpustate_to_list(ARMCPU *cpu)

>          if (ri->type & ARM_CP_NO_RAW) {

>              continue;

>          }

> -        cpu->cpreg_values[i] = read_raw_cp_reg(&cpu->env, ri);

> +

> +        newval = read_raw_cp_reg(&cpu->env, ri);

> +        if (kvm_sync) {

> +            /*

> +             * Only sync if the previous list->cpustate sync succeeded.

> +             * Rather than tracking the success/failure state for every

> +             * item in the list, we just recheck "does the raw write we must

> +             * have made in write_list_to_cpustate() read back OK" here.

> +             */

> +            uint64_t oldval = cpu->cpreg_values[i];

> +

> +            if (oldval == newval) {

> +                continue;

> +            }

> +

> +            write_raw_cp_reg(&cpu->env, ri, oldval);

> +            if (read_raw_cp_reg(&cpu->env, ri) != oldval) {

> +                continue;

> +            }

> +

> +            write_raw_cp_reg(&cpu->env, ri, newval);

> +        }

> +        cpu->cpreg_values[i] = newval;

>      }

>      return ok;

>  }

> diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c

> index bd51eb43c86..a75e04cc8f3 100644

> --- a/target/arm/kvm32.c

> +++ b/target/arm/kvm32.c

> @@ -387,24 +387,8 @@ int kvm_arch_put_registers(CPUState *cs, int level)

>          return ret;

>      }

>  

> -    /* Note that we do not call write_cpustate_to_list()

> -     * here, so we are only writing the tuple list back to

> -     * KVM. This is safe because nothing can change the

> -     * CPUARMState cp15 fields (in particular gdb accesses cannot)

> -     * and so there are no changes to sync. In fact syncing would

> -     * be wrong at this point: for a constant register where TCG and

> -     * KVM disagree about its value, the preceding write_list_to_cpustate()

> -     * would not have had any effect on the CPUARMState value (since the

> -     * register is read-only), and a write_cpustate_to_list() here would

> -     * then try to write the TCG value back into KVM -- this would either

> -     * fail or incorrectly change the value the guest sees.

> -     *

> -     * If we ever want to allow the user to modify cp15 registers via

> -     * the gdb stub, we would need to be more clever here (for instance

> -     * tracking the set of registers kvm_arch_get_registers() successfully

> -     * managed to update the CPUARMState with, and only allowing those

> -     * to be written back up into the kernel).

> -     */

> +    write_cpustate_to_list(cpu, true);

> +

>      if (!write_list_to_kvmstate(cpu, level)) {

>          return EINVAL;

>      }

> diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c

> index 089af9c5f02..e3ba1492482 100644

> --- a/target/arm/kvm64.c

> +++ b/target/arm/kvm64.c

> @@ -838,6 +838,8 @@ int kvm_arch_put_registers(CPUState *cs, int level)

>          return ret;

>      }

>  

> +    write_cpustate_to_list(cpu, true);

> +

>      if (!write_list_to_kvmstate(cpu, level)) {

>          return EINVAL;

>      }

> diff --git a/target/arm/machine.c b/target/arm/machine.c

> index b2925496148..124192bfc26 100644

> --- a/target/arm/machine.c

> +++ b/target/arm/machine.c

> @@ -630,7 +630,7 @@ static int cpu_pre_save(void *opaque)

>              abort();

>          }

>      } else {

> -        if (!write_cpustate_to_list(cpu)) {

> +        if (!write_cpustate_to_list(cpu, false)) {

>              /* This should never fail. */

>              abort();

>          }

>
Peter Maydell Feb. 21, 2019, 2:23 p.m. UTC | #2
On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote:
> On 2/14/19 8:05 PM, Peter Maydell wrote:

> > Support this by making kvm_arch_put_registers() synchronize

> > CPU state back into the list. We sync only those registers

> > where the initial write succeeds, which should be sufficient.


> This commit introduces a regression when running with EDK2 FW:

>

> I get the following traces:

>

> error: kvm run failed Function not implemented

>  PC=000000013f5a6208 X00=00000000404003c4 X01=000000000000003a

> X02=0000000000000000 X03=00000000404003c4 X04=0000000000000000

> X05=0000000096000046 X06=000000013d2ef270 X07=000000013e3d1710

> X08=09010755ffaf8ba8 X09=ffaf8b9cfeeb5468 X10=feeb546409010756

> X11=09010757ffaf8b90 X12=feeb50680903068b X13=090306a1ffaf8bc0

> X14=0000000000000000 X15=0000000000000000 X16=000000013f872da0

> X17=00000000ffffa6ab X18=0000000000000000 X19=000000013f5a92d0

> X20=000000013f5a7a78 X21=000000000000003a X22=000000013f5a7ab2

> X23=000000013f5a92e8 X24=000000013f631090 X25=0000000000000010

> X26=0000000000000100 X27=000000013f89501b X28=000000013e3d14e0

> X29=000000013e3d12a0 X30=000000013f5a2518  SP=000000013b7be0b0

> PSTATE=404003c4 -Z-- EL1t

>

>

> and in host dmesg:

> [ 3507.926571] kvm [35042]: load/store instruction decoding not implemented


Ugh. Presumably this means that at some point we're writing
back a wrong value to a guest system register and making
it fall over :-( I guess debug would be by identifying
when execution diverges...

thanks
-- PMM
Peter Maydell Feb. 21, 2019, 2:26 p.m. UTC | #3
On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote:
> My qemu cmd line is:

>

> qemu-system-aarch64 -M virt,gic-version=3 -cpu host -smp 2 -m 4G

> -display none --enable-kvm -serial tcp:localhost:4444,server -qmp

> unix:/home/augere/TEST/QEMU/qmp-sock,server,nowait -device

> virtio-blk-pci,bus=pcie.0,scsi=off,drive=drv0,id=virtio-disk0,bootindex=1,werror=stop,rerror=stop

> -drive

> file=aarch64-vm0-rhel-alt-7.6.qcow2,format=qcow2,if=none,cache=writethrough,id=drv0

> -device virtio-net-pci,bus=pcie.0,netdev=nic0,mac=6a:f5:10:b1:3d:d2

> -netdev

> tap,id=nic0,script=/home/augere/TEST/SCRIPTS/qemu-ifup,downscript=/home/augere/TEST/SCRIPTS/qemu-ifdown

> -bios

> /home/augere/UPSTREAM/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/FV/QEMU_EFI.fd

> -net none -d guest_errors


Does the bug still occur if you get rid of all the network/disk related
command line options here (hopefully it does, to reduce the size of
the test case) ? Can you provide the QEMU_EFI.fd ?

thanks
-- PMM
Alex Bennée Feb. 21, 2019, 2:42 p.m. UTC | #4
Auger Eric <eric.auger@redhat.com> writes:

> Hi Peter, Alex,

> On 2/14/19 8:05 PM, Peter Maydell wrote:

>> At the moment the Arm implementations of kvm_arch_{get,put}_registers()

>> don't support having QEMU change the values of system registers

>> (aka coprocessor registers for AArch32). This is because although

>> kvm_arch_get_registers() calls write_list_to_cpustate() to

>> update the CPU state struct fields (so QEMU code can read the

>> values in the usual way), kvm_arch_put_registers() does not

>> call write_cpustate_to_list(), meaning that any changes to

>> the CPU state struct fields will not be passed back to KVM.

>>

>> The rationale for this design is documented in a comment in the

>> AArch32 kvm_arch_put_registers() -- writing the values in the

>> cpregs list into the CPU state struct is "lossy" because the

>> write of a register might not succeed, and so if we blindly

>> copy the CPU state values back again we will incorrectly

>> change register values for the guest. The assumption was that

>> no QEMU code would need to write to the registers.

>>

>> However, when we implemented debug support for KVM guests, we

>> broke that assumption: the code to handle "set the guest up

>> to take a breakpoint exception" does so by updating various

>> guest registers including ESR_EL1.

>>

>> Support this by making kvm_arch_put_registers() synchronize

>> CPU state back into the list. We sync only those registers

>> where the initial write succeeds, which should be sufficient.

>>

>> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

>> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

>> Tested-by: Alex Bennée <alex.bennee@linaro.org>

>> Tested-by: Dongjiu Geng <gengdongjiu@huawei.com>

> This commit introduces a regression when running with EDK2 FW:

>

> I get the following traces:

>

> error: kvm run failed Function not implemented

>  PC=000000013f5a6208 X00=00000000404003c4 X01=000000000000003a


Any chance of attaching to the gdbstub and an x/10i around that PC?

--
Alex Bennée
Eric Auger Feb. 21, 2019, 2:55 p.m. UTC | #5
Hi,
On 2/21/19 3:26 PM, Peter Maydell wrote:
> On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote:

>> My qemu cmd line is:

>>

>> qemu-system-aarch64 -M virt,gic-version=3 -cpu host -smp 2 -m 4G

>> -display none --enable-kvm -serial tcp:localhost:4444,server -qmp

>> unix:/home/augere/TEST/QEMU/qmp-sock,server,nowait -device

>> virtio-blk-pci,bus=pcie.0,scsi=off,drive=drv0,id=virtio-disk0,bootindex=1,werror=stop,rerror=stop

>> -drive

>> file=aarch64-vm0-rhel-alt-7.6.qcow2,format=qcow2,if=none,cache=writethrough,id=drv0

>> -device virtio-net-pci,bus=pcie.0,netdev=nic0,mac=6a:f5:10:b1:3d:d2

>> -netdev

>> tap,id=nic0,script=/home/augere/TEST/SCRIPTS/qemu-ifup,downscript=/home/augere/TEST/SCRIPTS/qemu-ifdown

>> -bios

>> /home/augere/UPSTREAM/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/FV/QEMU_EFI.fd

>> -net none -d guest_errors

> 

> Does the bug still occur if you get rid of all the network/disk related

> command line options here (hopefully it does, to reduce the size of

> the test case) ? Can you provide the QEMU_EFI.fd ?

I am using upstream EDK2 (https://github.com/tianocore/edk2) compiled in
debug mode

I can reproduce the bug with

qemu-system-aarch64 -M virt,gic-version=3 -cpu host -smp 4 -m 4G
-display none --enable-kvm -serial stdio \
-device
virtio-blk-pci,bus=pcie.0,scsi=off,drive=drv0,id=virtio-disk0,bootindex=1,werror=stop,rerror=stop
\
-drive
file=/home/augere/VM/IMAGES/aarch64-vm0-rhel-alt-7.6.qcow2,format=qcow2,if=none,cache=writethrough,id=drv0
\
-bios
/home/augere/UPSTREAM/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/FV/QEMU_EFI.fd
\
-net none -d guest_errors

also with virtio-blk-device

Thanks

Eric
> 

> thanks

> -- PMM

>
Peter Maydell Feb. 26, 2019, 4:53 p.m. UTC | #6
On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote:
>

> Hi Peter, Alex,

> On 2/14/19 8:05 PM, Peter Maydell wrote:

> > Support this by making kvm_arch_put_registers() synchronize

> > CPU state back into the list. We sync only those registers

> > where the initial write succeeds, which should be sufficient.


> This commit introduces a regression when running with EDK2 FW:


I haven't found time to investigate this yet, so I propose
to just revert this commit for the moment.

thanks
-- PMM
Peter Maydell March 8, 2019, 3:53 p.m. UTC | #7
On Thu, 21 Feb 2019 at 14:55, Auger Eric <eric.auger@redhat.com> wrote:
>

> Hi,

> On 2/21/19 3:26 PM, Peter Maydell wrote:

> > On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote:

> >> [a regression using aarch64 KVM]

> >Can you provide the QEMU_EFI.fd ?

> I am using upstream EDK2 (https://github.com/tianocore/edk2) compiled in

> debug mode


I don't really want to mess about building EDK2. Could you just
provide the blob that demonstrates the bug, please?

thanks
-- PMM
Eric Auger March 8, 2019, 4:54 p.m. UTC | #8
Hi Peter,

On 3/8/19 4:53 PM, Peter Maydell wrote:
> On Thu, 21 Feb 2019 at 14:55, Auger Eric <eric.auger@redhat.com> wrote:

>>

>> Hi,

>> On 2/21/19 3:26 PM, Peter Maydell wrote:

>>> On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote:

>>>> [a regression using aarch64 KVM]

>>> Can you provide the QEMU_EFI.fd ?

>> I am using upstream EDK2 (https://github.com/tianocore/edk2) compiled in

>> debug mode

> 

> I don't really want to mess about building EDK2. Could you just

> provide the blob that demonstrates the bug, please?


I just sent you the binary in a separate email.

Thanks

Eric
> 

> thanks

> -- PMM

>
Peter Maydell March 11, 2019, 1:26 p.m. UTC | #9
On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote:
> This commit introduces a regression when running with EDK2 FW:

>

> I get the following traces:

>

> error: kvm run failed Function not implemented


Unfortunately I can't repro this, on a Mustang host with
4.8.0-42-generic host kernel, with this command line:

 ./build/kvm/aarch64-softmmu/qemu-system-aarch64  -M
virt,gic-version=host -cpu host -smp 2 -m 4G -display none
--enable-kvm  -serial stdio -bios
~/QEMU_EFI_cbuild_debug_feb21_2019.fd

Does that simplified command line still repro the problem
for you on your hardware, or does the bug require all
the extra stuff with blk and net devices and a real
guest filesystem to trigger? (You'll need to revert the
revert commit 942f99c825fc94c8b1a4, obivously.)

More generally, can you strip the repro command line
down as much as possible to something that still shows
the bug? I'd rather not have to sort out getting TAP networking
on this box if it isn't actually a necessary component of
reproducing the bug :-)

thanks
-- PMM
Eric Auger March 11, 2019, 2:54 p.m. UTC | #10
Hi Peter,

On 3/11/19 2:26 PM, Peter Maydell wrote:
> On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote:

>> This commit introduces a regression when running with EDK2 FW:

>>

>> I get the following traces:

>>

>> error: kvm run failed Function not implemented

> 

> Unfortunately I can't repro this, on a Mustang host with

> 4.8.0-42-generic host kernel, with this command line:

> 

>  ./build/kvm/aarch64-softmmu/qemu-system-aarch64  -M

> virt,gic-version=host -cpu host -smp 2 -m 4G -display none

> --enable-kvm  -serial stdio -bios

> ~/QEMU_EFI_cbuild_debug_feb21_2019.fd

> 

> Does that simplified command line still repro the problem

> for you on your hardware, or does the bug require all

> the extra stuff with blk and net devices and a real

> guest filesystem to trigger? (You'll need to revert the

> revert commit 942f99c825fc94c8b1a4, obivously.)

> 

> More generally, can you strip the repro command line

> down as much as possible to something that still shows

> the bug? I'd rather not have to sort out getting TAP networking

> on this box if it isn't actually a necessary component of

> reproducing the bug :-)


yes I hit the bug with your reduced command line:

aarch64-softmmu/qemu-system-aarch64 -M virt,gic-version=host -cpu host
-smp 2 -m 4G -display none --enable-kvm  -serial stdio -bios
~/VM/UEFI/QEMU_EFI_cbuild_debug_feb21_2019.fd

[285347.260500] kvm [82944]: load/store instruction decoding not implemented

InstallProtocolInterface: CE345171-BA0B-11D2-8E4F-00A0C969723B 13E6C2F20
 BlockSize : 262144
 LastBlock : FF
error: kvm run failed Function not implemented
 PC=000000013f5a6004 X00=0000000000000000 X01=000000013e7310b0
X02=000000013f5a6004 X03=00000000404003c5 X04=0000000000000000
X05=0000000096000046 X06=000000013f828210 X07=000000013f828310
X08=000000013f8504c8 X09=0000000400000000 X10=000000013d6c1000
X11=000000013f297fff X12=0000000000000000 X13=0000000000000008
X14=0000000000000002 X15=00000000000000ff X16=000000013f872da0
X17=00000000ffffa6ab X18=0000000000000000 X19=0000000009000000
X20=000000013f5a7a78 X21=000000000000003a X22=000000013f5a7ab2
X23=0000000009000018 X24=000000013f631090 X25=0000000000000010
X26=0000000000000100 X27=000000013f89501b X28=000000013e7311b0
X29=000000013e7313e0 X30=000000013f5a5e40  SP=000000013f7bdf60
PSTATE=804003c5 N--- EL1h

Thanks

Eric

> 

> thanks

> -- PMM

>
Peter Maydell March 11, 2019, 2:55 p.m. UTC | #11
On Mon, 11 Mar 2019 at 14:54, Auger Eric <eric.auger@redhat.com> wrote:
> yes I hit the bug with your reduced command line:

>

> aarch64-softmmu/qemu-system-aarch64 -M virt,gic-version=host -cpu host

> -smp 2 -m 4G -display none --enable-kvm  -serial stdio -bios

> ~/VM/UEFI/QEMU_EFI_cbuild_debug_feb21_2019.fd


OK, that's good to know. What hardware are you using, and what host
kernel version ?

thanks
-- PMM
Eric Auger March 11, 2019, 3:09 p.m. UTC | #12
Hi Peter,

On 3/11/19 3:55 PM, Peter Maydell wrote:
> On Mon, 11 Mar 2019 at 14:54, Auger Eric <eric.auger@redhat.com> wrote:

>> yes I hit the bug with your reduced command line:

>>

>> aarch64-softmmu/qemu-system-aarch64 -M virt,gic-version=host -cpu host

>> -smp 2 -m 4G -display none --enable-kvm  -serial stdio -bios

>> ~/VM/UEFI/QEMU_EFI_cbuild_debug_feb21_2019.fd

> 

> OK, that's good to know. What hardware are you using, and what host

> kernel version ?


I am using Cavium ThunderX 2 and 5.0 host

Thanks

Eric
> 

> thanks

> -- PMM

>
Peter Maydell March 11, 2019, 4:07 p.m. UTC | #13
On Mon, 11 Mar 2019 at 15:09, Auger Eric <eric.auger@redhat.com> wrote:
>

> Hi Peter,

>

> On 3/11/19 3:55 PM, Peter Maydell wrote:

> > On Mon, 11 Mar 2019 at 14:54, Auger Eric <eric.auger@redhat.com> wrote:

> >> yes I hit the bug with your reduced command line:

> >>

> >> aarch64-softmmu/qemu-system-aarch64 -M virt,gic-version=host -cpu host

> >> -smp 2 -m 4G -display none --enable-kvm  -serial stdio -bios

> >> ~/VM/UEFI/QEMU_EFI_cbuild_debug_feb21_2019.fd

> >

> > OK, that's good to know. What hardware are you using, and what host

> > kernel version ?

>

> I am using Cavium ThunderX 2 and 5.0 host


Thanks -- I got access to a thunderx 2 and have repro'd the failure
there.

-- PMM
Peter Maydell March 15, 2019, 1:51 p.m. UTC | #14
On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote:
> This commit introduces a regression when running with EDK2 FW:


I believe I've now tracked down what was going wrong here.
The problem was with the guest CPU reset path -- in
kvm_arm_reset_vcpu() we copy the kernel's reset state
into the cpreg_indexes/cpreg_values arrays, because the
next thing we do after a reset is a kvm_arch_put_registers()
which will copy from the list values back to the kernel.
But now we've updated kvm_arch_put_registers() to do
a two-step CPUState -> cpreg* arrays -> kernel sync,
we need kvm_arm_reset_vcpu() to also sync cpreg arrays to
the CPUState. Otherwise the first kvm_arch_put_registers()
will write incorrect values back to the kernel, which for
some guests which are more trusting of reset register values
than Linux causes them to fail.

The fix is just this, on top of reverting the revert:

--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -507,6 +507,14 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
         fprintf(stderr, "write_kvmstate_to_list failed\n");
         abort();
     }
+    /*
+     * Sync the reset values also into the CPUState. This is necessary
+     * because the next thing we do will be a kvm_arch_put_registers()
+     * which will update the list values from the CPUState before copying
+     * the list values back to KVM. It's OK to ignore failure returns here
+     * for the same reason we do so in kvm_arch_get_registers().
+     */
+    write_list_to_cpustate(cpu);
 }

 /*

I'll send out a proper patch in a bit.

thanks
-- PMM
diff mbox series

Patch

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index f0334413ece..bfc05c796a5 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2535,18 +2535,25 @@  bool write_list_to_cpustate(ARMCPU *cpu);
 /**
  * write_cpustate_to_list:
  * @cpu: ARMCPU
+ * @kvm_sync: true if this is for syncing back to KVM
  *
  * For each register listed in the ARMCPU cpreg_indexes list, write
  * its value from the ARMCPUState structure into the cpreg_values list.
  * This is used to copy info from TCG's working data structures into
  * KVM or for outbound migration.
  *
+ * @kvm_sync is true if we are doing this in order to sync the
+ * register state back to KVM. In this case we will only update
+ * values in the list if the previous list->cpustate sync actually
+ * successfully wrote the CPU state. Otherwise we will keep the value
+ * that is in the list.
+ *
  * Returns: true if all register values were read correctly,
  * false if some register was unknown or could not be read.
  * Note that we do not stop early on failure -- we will attempt
  * reading all registers in the list.
  */
-bool write_cpustate_to_list(ARMCPU *cpu);
+bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync);
 
 #define ARM_CPUID_TI915T      0x54029152
 #define ARM_CPUID_TI925T      0x54029252
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 5ac335f598c..7653aa6a50a 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -264,7 +264,7 @@  static bool raw_accessors_invalid(const ARMCPRegInfo *ri)
     return true;
 }
 
-bool write_cpustate_to_list(ARMCPU *cpu)
+bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync)
 {
     /* Write the coprocessor state from cpu->env to the (index,value) list. */
     int i;
@@ -273,6 +273,7 @@  bool write_cpustate_to_list(ARMCPU *cpu)
     for (i = 0; i < cpu->cpreg_array_len; i++) {
         uint32_t regidx = kvm_to_cpreg_id(cpu->cpreg_indexes[i]);
         const ARMCPRegInfo *ri;
+        uint64_t newval;
 
         ri = get_arm_cp_reginfo(cpu->cp_regs, regidx);
         if (!ri) {
@@ -282,7 +283,29 @@  bool write_cpustate_to_list(ARMCPU *cpu)
         if (ri->type & ARM_CP_NO_RAW) {
             continue;
         }
-        cpu->cpreg_values[i] = read_raw_cp_reg(&cpu->env, ri);
+
+        newval = read_raw_cp_reg(&cpu->env, ri);
+        if (kvm_sync) {
+            /*
+             * Only sync if the previous list->cpustate sync succeeded.
+             * Rather than tracking the success/failure state for every
+             * item in the list, we just recheck "does the raw write we must
+             * have made in write_list_to_cpustate() read back OK" here.
+             */
+            uint64_t oldval = cpu->cpreg_values[i];
+
+            if (oldval == newval) {
+                continue;
+            }
+
+            write_raw_cp_reg(&cpu->env, ri, oldval);
+            if (read_raw_cp_reg(&cpu->env, ri) != oldval) {
+                continue;
+            }
+
+            write_raw_cp_reg(&cpu->env, ri, newval);
+        }
+        cpu->cpreg_values[i] = newval;
     }
     return ok;
 }
diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c
index bd51eb43c86..a75e04cc8f3 100644
--- a/target/arm/kvm32.c
+++ b/target/arm/kvm32.c
@@ -387,24 +387,8 @@  int kvm_arch_put_registers(CPUState *cs, int level)
         return ret;
     }
 
-    /* Note that we do not call write_cpustate_to_list()
-     * here, so we are only writing the tuple list back to
-     * KVM. This is safe because nothing can change the
-     * CPUARMState cp15 fields (in particular gdb accesses cannot)
-     * and so there are no changes to sync. In fact syncing would
-     * be wrong at this point: for a constant register where TCG and
-     * KVM disagree about its value, the preceding write_list_to_cpustate()
-     * would not have had any effect on the CPUARMState value (since the
-     * register is read-only), and a write_cpustate_to_list() here would
-     * then try to write the TCG value back into KVM -- this would either
-     * fail or incorrectly change the value the guest sees.
-     *
-     * If we ever want to allow the user to modify cp15 registers via
-     * the gdb stub, we would need to be more clever here (for instance
-     * tracking the set of registers kvm_arch_get_registers() successfully
-     * managed to update the CPUARMState with, and only allowing those
-     * to be written back up into the kernel).
-     */
+    write_cpustate_to_list(cpu, true);
+
     if (!write_list_to_kvmstate(cpu, level)) {
         return EINVAL;
     }
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index 089af9c5f02..e3ba1492482 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -838,6 +838,8 @@  int kvm_arch_put_registers(CPUState *cs, int level)
         return ret;
     }
 
+    write_cpustate_to_list(cpu, true);
+
     if (!write_list_to_kvmstate(cpu, level)) {
         return EINVAL;
     }
diff --git a/target/arm/machine.c b/target/arm/machine.c
index b2925496148..124192bfc26 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -630,7 +630,7 @@  static int cpu_pre_save(void *opaque)
             abort();
         }
     } else {
-        if (!write_cpustate_to_list(cpu)) {
+        if (!write_cpustate_to_list(cpu, false)) {
             /* This should never fail. */
             abort();
         }