Message ID | 20190214190603.25030-12-peter.maydell@linaro.org |
---|---|
State | Accepted |
Commit | 823e1b3818f9b10b824ddcd756983b6e2fa68730 |
Headers | show |
Series | target-arm queue | expand |
Hi Peter, Alex, On 2/14/19 8:05 PM, Peter Maydell wrote: > At the moment the Arm implementations of kvm_arch_{get,put}_registers() > don't support having QEMU change the values of system registers > (aka coprocessor registers for AArch32). This is because although > kvm_arch_get_registers() calls write_list_to_cpustate() to > update the CPU state struct fields (so QEMU code can read the > values in the usual way), kvm_arch_put_registers() does not > call write_cpustate_to_list(), meaning that any changes to > the CPU state struct fields will not be passed back to KVM. > > The rationale for this design is documented in a comment in the > AArch32 kvm_arch_put_registers() -- writing the values in the > cpregs list into the CPU state struct is "lossy" because the > write of a register might not succeed, and so if we blindly > copy the CPU state values back again we will incorrectly > change register values for the guest. The assumption was that > no QEMU code would need to write to the registers. > > However, when we implemented debug support for KVM guests, we > broke that assumption: the code to handle "set the guest up > to take a breakpoint exception" does so by updating various > guest registers including ESR_EL1. > > Support this by making kvm_arch_put_registers() synchronize > CPU state back into the list. We sync only those registers > where the initial write succeeds, which should be sufficient. > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > Reviewed-by: Alex Bennée <alex.bennee@linaro.org> > Tested-by: Alex Bennée <alex.bennee@linaro.org> > Tested-by: Dongjiu Geng <gengdongjiu@huawei.com> This commit introduces a regression when running with EDK2 FW: I get the following traces: error: kvm run failed Function not implemented PC=000000013f5a6208 X00=00000000404003c4 X01=000000000000003a X02=0000000000000000 X03=00000000404003c4 X04=0000000000000000 X05=0000000096000046 X06=000000013d2ef270 X07=000000013e3d1710 X08=09010755ffaf8ba8 X09=ffaf8b9cfeeb5468 X10=feeb546409010756 X11=09010757ffaf8b90 X12=feeb50680903068b X13=090306a1ffaf8bc0 X14=0000000000000000 X15=0000000000000000 X16=000000013f872da0 X17=00000000ffffa6ab X18=0000000000000000 X19=000000013f5a92d0 X20=000000013f5a7a78 X21=000000000000003a X22=000000013f5a7ab2 X23=000000013f5a92e8 X24=000000013f631090 X25=0000000000000010 X26=0000000000000100 X27=000000013f89501b X28=000000013e3d14e0 X29=000000013e3d12a0 X30=000000013f5a2518 SP=000000013b7be0b0 PSTATE=404003c4 -Z-- EL1t and in host dmesg: [ 3507.926571] kvm [35042]: load/store instruction decoding not implemented My qemu cmd line is: qemu-system-aarch64 -M virt,gic-version=3 -cpu host -smp 2 -m 4G -display none --enable-kvm -serial tcp:localhost:4444,server -qmp unix:/home/augere/TEST/QEMU/qmp-sock,server,nowait -device virtio-blk-pci,bus=pcie.0,scsi=off,drive=drv0,id=virtio-disk0,bootindex=1,werror=stop,rerror=stop -drive file=aarch64-vm0-rhel-alt-7.6.qcow2,format=qcow2,if=none,cache=writethrough,id=drv0 -device virtio-net-pci,bus=pcie.0,netdev=nic0,mac=6a:f5:10:b1:3d:d2 -netdev tap,id=nic0,script=/home/augere/TEST/SCRIPTS/qemu-ifup,downscript=/home/augere/TEST/SCRIPTS/qemu-ifdown -bios /home/augere/UPSTREAM/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/FV/QEMU_EFI.fd -net none -d guest_errors reverting the patch removes the guest crash. Any clue? thanks Eric > --- > target/arm/cpu.h | 9 ++++++++- > target/arm/helper.c | 27 +++++++++++++++++++++++++-- > target/arm/kvm32.c | 20 ++------------------ > target/arm/kvm64.c | 2 ++ > target/arm/machine.c | 2 +- > 5 files changed, 38 insertions(+), 22 deletions(-) > > diff --git a/target/arm/cpu.h b/target/arm/cpu.h > index f0334413ece..bfc05c796a5 100644 > --- a/target/arm/cpu.h > +++ b/target/arm/cpu.h > @@ -2535,18 +2535,25 @@ bool write_list_to_cpustate(ARMCPU *cpu); > /** > * write_cpustate_to_list: > * @cpu: ARMCPU > + * @kvm_sync: true if this is for syncing back to KVM > * > * For each register listed in the ARMCPU cpreg_indexes list, write > * its value from the ARMCPUState structure into the cpreg_values list. > * This is used to copy info from TCG's working data structures into > * KVM or for outbound migration. > * > + * @kvm_sync is true if we are doing this in order to sync the > + * register state back to KVM. In this case we will only update > + * values in the list if the previous list->cpustate sync actually > + * successfully wrote the CPU state. Otherwise we will keep the value > + * that is in the list. > + * > * Returns: true if all register values were read correctly, > * false if some register was unknown or could not be read. > * Note that we do not stop early on failure -- we will attempt > * reading all registers in the list. > */ > -bool write_cpustate_to_list(ARMCPU *cpu); > +bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync); > > #define ARM_CPUID_TI915T 0x54029152 > #define ARM_CPUID_TI925T 0x54029252 > diff --git a/target/arm/helper.c b/target/arm/helper.c > index 5ac335f598c..7653aa6a50a 100644 > --- a/target/arm/helper.c > +++ b/target/arm/helper.c > @@ -264,7 +264,7 @@ static bool raw_accessors_invalid(const ARMCPRegInfo *ri) > return true; > } > > -bool write_cpustate_to_list(ARMCPU *cpu) > +bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync) > { > /* Write the coprocessor state from cpu->env to the (index,value) list. */ > int i; > @@ -273,6 +273,7 @@ bool write_cpustate_to_list(ARMCPU *cpu) > for (i = 0; i < cpu->cpreg_array_len; i++) { > uint32_t regidx = kvm_to_cpreg_id(cpu->cpreg_indexes[i]); > const ARMCPRegInfo *ri; > + uint64_t newval; > > ri = get_arm_cp_reginfo(cpu->cp_regs, regidx); > if (!ri) { > @@ -282,7 +283,29 @@ bool write_cpustate_to_list(ARMCPU *cpu) > if (ri->type & ARM_CP_NO_RAW) { > continue; > } > - cpu->cpreg_values[i] = read_raw_cp_reg(&cpu->env, ri); > + > + newval = read_raw_cp_reg(&cpu->env, ri); > + if (kvm_sync) { > + /* > + * Only sync if the previous list->cpustate sync succeeded. > + * Rather than tracking the success/failure state for every > + * item in the list, we just recheck "does the raw write we must > + * have made in write_list_to_cpustate() read back OK" here. > + */ > + uint64_t oldval = cpu->cpreg_values[i]; > + > + if (oldval == newval) { > + continue; > + } > + > + write_raw_cp_reg(&cpu->env, ri, oldval); > + if (read_raw_cp_reg(&cpu->env, ri) != oldval) { > + continue; > + } > + > + write_raw_cp_reg(&cpu->env, ri, newval); > + } > + cpu->cpreg_values[i] = newval; > } > return ok; > } > diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c > index bd51eb43c86..a75e04cc8f3 100644 > --- a/target/arm/kvm32.c > +++ b/target/arm/kvm32.c > @@ -387,24 +387,8 @@ int kvm_arch_put_registers(CPUState *cs, int level) > return ret; > } > > - /* Note that we do not call write_cpustate_to_list() > - * here, so we are only writing the tuple list back to > - * KVM. This is safe because nothing can change the > - * CPUARMState cp15 fields (in particular gdb accesses cannot) > - * and so there are no changes to sync. In fact syncing would > - * be wrong at this point: for a constant register where TCG and > - * KVM disagree about its value, the preceding write_list_to_cpustate() > - * would not have had any effect on the CPUARMState value (since the > - * register is read-only), and a write_cpustate_to_list() here would > - * then try to write the TCG value back into KVM -- this would either > - * fail or incorrectly change the value the guest sees. > - * > - * If we ever want to allow the user to modify cp15 registers via > - * the gdb stub, we would need to be more clever here (for instance > - * tracking the set of registers kvm_arch_get_registers() successfully > - * managed to update the CPUARMState with, and only allowing those > - * to be written back up into the kernel). > - */ > + write_cpustate_to_list(cpu, true); > + > if (!write_list_to_kvmstate(cpu, level)) { > return EINVAL; > } > diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c > index 089af9c5f02..e3ba1492482 100644 > --- a/target/arm/kvm64.c > +++ b/target/arm/kvm64.c > @@ -838,6 +838,8 @@ int kvm_arch_put_registers(CPUState *cs, int level) > return ret; > } > > + write_cpustate_to_list(cpu, true); > + > if (!write_list_to_kvmstate(cpu, level)) { > return EINVAL; > } > diff --git a/target/arm/machine.c b/target/arm/machine.c > index b2925496148..124192bfc26 100644 > --- a/target/arm/machine.c > +++ b/target/arm/machine.c > @@ -630,7 +630,7 @@ static int cpu_pre_save(void *opaque) > abort(); > } > } else { > - if (!write_cpustate_to_list(cpu)) { > + if (!write_cpustate_to_list(cpu, false)) { > /* This should never fail. */ > abort(); > } >
On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote: > On 2/14/19 8:05 PM, Peter Maydell wrote: > > Support this by making kvm_arch_put_registers() synchronize > > CPU state back into the list. We sync only those registers > > where the initial write succeeds, which should be sufficient. > This commit introduces a regression when running with EDK2 FW: > > I get the following traces: > > error: kvm run failed Function not implemented > PC=000000013f5a6208 X00=00000000404003c4 X01=000000000000003a > X02=0000000000000000 X03=00000000404003c4 X04=0000000000000000 > X05=0000000096000046 X06=000000013d2ef270 X07=000000013e3d1710 > X08=09010755ffaf8ba8 X09=ffaf8b9cfeeb5468 X10=feeb546409010756 > X11=09010757ffaf8b90 X12=feeb50680903068b X13=090306a1ffaf8bc0 > X14=0000000000000000 X15=0000000000000000 X16=000000013f872da0 > X17=00000000ffffa6ab X18=0000000000000000 X19=000000013f5a92d0 > X20=000000013f5a7a78 X21=000000000000003a X22=000000013f5a7ab2 > X23=000000013f5a92e8 X24=000000013f631090 X25=0000000000000010 > X26=0000000000000100 X27=000000013f89501b X28=000000013e3d14e0 > X29=000000013e3d12a0 X30=000000013f5a2518 SP=000000013b7be0b0 > PSTATE=404003c4 -Z-- EL1t > > > and in host dmesg: > [ 3507.926571] kvm [35042]: load/store instruction decoding not implemented Ugh. Presumably this means that at some point we're writing back a wrong value to a guest system register and making it fall over :-( I guess debug would be by identifying when execution diverges... thanks -- PMM
On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote: > My qemu cmd line is: > > qemu-system-aarch64 -M virt,gic-version=3 -cpu host -smp 2 -m 4G > -display none --enable-kvm -serial tcp:localhost:4444,server -qmp > unix:/home/augere/TEST/QEMU/qmp-sock,server,nowait -device > virtio-blk-pci,bus=pcie.0,scsi=off,drive=drv0,id=virtio-disk0,bootindex=1,werror=stop,rerror=stop > -drive > file=aarch64-vm0-rhel-alt-7.6.qcow2,format=qcow2,if=none,cache=writethrough,id=drv0 > -device virtio-net-pci,bus=pcie.0,netdev=nic0,mac=6a:f5:10:b1:3d:d2 > -netdev > tap,id=nic0,script=/home/augere/TEST/SCRIPTS/qemu-ifup,downscript=/home/augere/TEST/SCRIPTS/qemu-ifdown > -bios > /home/augere/UPSTREAM/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/FV/QEMU_EFI.fd > -net none -d guest_errors Does the bug still occur if you get rid of all the network/disk related command line options here (hopefully it does, to reduce the size of the test case) ? Can you provide the QEMU_EFI.fd ? thanks -- PMM
Auger Eric <eric.auger@redhat.com> writes: > Hi Peter, Alex, > On 2/14/19 8:05 PM, Peter Maydell wrote: >> At the moment the Arm implementations of kvm_arch_{get,put}_registers() >> don't support having QEMU change the values of system registers >> (aka coprocessor registers for AArch32). This is because although >> kvm_arch_get_registers() calls write_list_to_cpustate() to >> update the CPU state struct fields (so QEMU code can read the >> values in the usual way), kvm_arch_put_registers() does not >> call write_cpustate_to_list(), meaning that any changes to >> the CPU state struct fields will not be passed back to KVM. >> >> The rationale for this design is documented in a comment in the >> AArch32 kvm_arch_put_registers() -- writing the values in the >> cpregs list into the CPU state struct is "lossy" because the >> write of a register might not succeed, and so if we blindly >> copy the CPU state values back again we will incorrectly >> change register values for the guest. The assumption was that >> no QEMU code would need to write to the registers. >> >> However, when we implemented debug support for KVM guests, we >> broke that assumption: the code to handle "set the guest up >> to take a breakpoint exception" does so by updating various >> guest registers including ESR_EL1. >> >> Support this by making kvm_arch_put_registers() synchronize >> CPU state back into the list. We sync only those registers >> where the initial write succeeds, which should be sufficient. >> >> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> >> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> >> Tested-by: Alex Bennée <alex.bennee@linaro.org> >> Tested-by: Dongjiu Geng <gengdongjiu@huawei.com> > This commit introduces a regression when running with EDK2 FW: > > I get the following traces: > > error: kvm run failed Function not implemented > PC=000000013f5a6208 X00=00000000404003c4 X01=000000000000003a Any chance of attaching to the gdbstub and an x/10i around that PC? -- Alex Bennée
Hi, On 2/21/19 3:26 PM, Peter Maydell wrote: > On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote: >> My qemu cmd line is: >> >> qemu-system-aarch64 -M virt,gic-version=3 -cpu host -smp 2 -m 4G >> -display none --enable-kvm -serial tcp:localhost:4444,server -qmp >> unix:/home/augere/TEST/QEMU/qmp-sock,server,nowait -device >> virtio-blk-pci,bus=pcie.0,scsi=off,drive=drv0,id=virtio-disk0,bootindex=1,werror=stop,rerror=stop >> -drive >> file=aarch64-vm0-rhel-alt-7.6.qcow2,format=qcow2,if=none,cache=writethrough,id=drv0 >> -device virtio-net-pci,bus=pcie.0,netdev=nic0,mac=6a:f5:10:b1:3d:d2 >> -netdev >> tap,id=nic0,script=/home/augere/TEST/SCRIPTS/qemu-ifup,downscript=/home/augere/TEST/SCRIPTS/qemu-ifdown >> -bios >> /home/augere/UPSTREAM/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/FV/QEMU_EFI.fd >> -net none -d guest_errors > > Does the bug still occur if you get rid of all the network/disk related > command line options here (hopefully it does, to reduce the size of > the test case) ? Can you provide the QEMU_EFI.fd ? I am using upstream EDK2 (https://github.com/tianocore/edk2) compiled in debug mode I can reproduce the bug with qemu-system-aarch64 -M virt,gic-version=3 -cpu host -smp 4 -m 4G -display none --enable-kvm -serial stdio \ -device virtio-blk-pci,bus=pcie.0,scsi=off,drive=drv0,id=virtio-disk0,bootindex=1,werror=stop,rerror=stop \ -drive file=/home/augere/VM/IMAGES/aarch64-vm0-rhel-alt-7.6.qcow2,format=qcow2,if=none,cache=writethrough,id=drv0 \ -bios /home/augere/UPSTREAM/edk2/Build/ArmVirtQemu-AARCH64/DEBUG_GCC5/FV/QEMU_EFI.fd \ -net none -d guest_errors also with virtio-blk-device Thanks Eric > > thanks > -- PMM >
On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote: > > Hi Peter, Alex, > On 2/14/19 8:05 PM, Peter Maydell wrote: > > Support this by making kvm_arch_put_registers() synchronize > > CPU state back into the list. We sync only those registers > > where the initial write succeeds, which should be sufficient. > This commit introduces a regression when running with EDK2 FW: I haven't found time to investigate this yet, so I propose to just revert this commit for the moment. thanks -- PMM
On Thu, 21 Feb 2019 at 14:55, Auger Eric <eric.auger@redhat.com> wrote: > > Hi, > On 2/21/19 3:26 PM, Peter Maydell wrote: > > On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote: > >> [a regression using aarch64 KVM] > >Can you provide the QEMU_EFI.fd ? > I am using upstream EDK2 (https://github.com/tianocore/edk2) compiled in > debug mode I don't really want to mess about building EDK2. Could you just provide the blob that demonstrates the bug, please? thanks -- PMM
Hi Peter, On 3/8/19 4:53 PM, Peter Maydell wrote: > On Thu, 21 Feb 2019 at 14:55, Auger Eric <eric.auger@redhat.com> wrote: >> >> Hi, >> On 2/21/19 3:26 PM, Peter Maydell wrote: >>> On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote: >>>> [a regression using aarch64 KVM] >>> Can you provide the QEMU_EFI.fd ? >> I am using upstream EDK2 (https://github.com/tianocore/edk2) compiled in >> debug mode > > I don't really want to mess about building EDK2. Could you just > provide the blob that demonstrates the bug, please? I just sent you the binary in a separate email. Thanks Eric > > thanks > -- PMM >
On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote: > This commit introduces a regression when running with EDK2 FW: > > I get the following traces: > > error: kvm run failed Function not implemented Unfortunately I can't repro this, on a Mustang host with 4.8.0-42-generic host kernel, with this command line: ./build/kvm/aarch64-softmmu/qemu-system-aarch64 -M virt,gic-version=host -cpu host -smp 2 -m 4G -display none --enable-kvm -serial stdio -bios ~/QEMU_EFI_cbuild_debug_feb21_2019.fd Does that simplified command line still repro the problem for you on your hardware, or does the bug require all the extra stuff with blk and net devices and a real guest filesystem to trigger? (You'll need to revert the revert commit 942f99c825fc94c8b1a4, obivously.) More generally, can you strip the repro command line down as much as possible to something that still shows the bug? I'd rather not have to sort out getting TAP networking on this box if it isn't actually a necessary component of reproducing the bug :-) thanks -- PMM
Hi Peter, On 3/11/19 2:26 PM, Peter Maydell wrote: > On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote: >> This commit introduces a regression when running with EDK2 FW: >> >> I get the following traces: >> >> error: kvm run failed Function not implemented > > Unfortunately I can't repro this, on a Mustang host with > 4.8.0-42-generic host kernel, with this command line: > > ./build/kvm/aarch64-softmmu/qemu-system-aarch64 -M > virt,gic-version=host -cpu host -smp 2 -m 4G -display none > --enable-kvm -serial stdio -bios > ~/QEMU_EFI_cbuild_debug_feb21_2019.fd > > Does that simplified command line still repro the problem > for you on your hardware, or does the bug require all > the extra stuff with blk and net devices and a real > guest filesystem to trigger? (You'll need to revert the > revert commit 942f99c825fc94c8b1a4, obivously.) > > More generally, can you strip the repro command line > down as much as possible to something that still shows > the bug? I'd rather not have to sort out getting TAP networking > on this box if it isn't actually a necessary component of > reproducing the bug :-) yes I hit the bug with your reduced command line: aarch64-softmmu/qemu-system-aarch64 -M virt,gic-version=host -cpu host -smp 2 -m 4G -display none --enable-kvm -serial stdio -bios ~/VM/UEFI/QEMU_EFI_cbuild_debug_feb21_2019.fd [285347.260500] kvm [82944]: load/store instruction decoding not implemented InstallProtocolInterface: CE345171-BA0B-11D2-8E4F-00A0C969723B 13E6C2F20 BlockSize : 262144 LastBlock : FF error: kvm run failed Function not implemented PC=000000013f5a6004 X00=0000000000000000 X01=000000013e7310b0 X02=000000013f5a6004 X03=00000000404003c5 X04=0000000000000000 X05=0000000096000046 X06=000000013f828210 X07=000000013f828310 X08=000000013f8504c8 X09=0000000400000000 X10=000000013d6c1000 X11=000000013f297fff X12=0000000000000000 X13=0000000000000008 X14=0000000000000002 X15=00000000000000ff X16=000000013f872da0 X17=00000000ffffa6ab X18=0000000000000000 X19=0000000009000000 X20=000000013f5a7a78 X21=000000000000003a X22=000000013f5a7ab2 X23=0000000009000018 X24=000000013f631090 X25=0000000000000010 X26=0000000000000100 X27=000000013f89501b X28=000000013e7311b0 X29=000000013e7313e0 X30=000000013f5a5e40 SP=000000013f7bdf60 PSTATE=804003c5 N--- EL1h Thanks Eric > > thanks > -- PMM >
On Mon, 11 Mar 2019 at 14:54, Auger Eric <eric.auger@redhat.com> wrote: > yes I hit the bug with your reduced command line: > > aarch64-softmmu/qemu-system-aarch64 -M virt,gic-version=host -cpu host > -smp 2 -m 4G -display none --enable-kvm -serial stdio -bios > ~/VM/UEFI/QEMU_EFI_cbuild_debug_feb21_2019.fd OK, that's good to know. What hardware are you using, and what host kernel version ? thanks -- PMM
Hi Peter, On 3/11/19 3:55 PM, Peter Maydell wrote: > On Mon, 11 Mar 2019 at 14:54, Auger Eric <eric.auger@redhat.com> wrote: >> yes I hit the bug with your reduced command line: >> >> aarch64-softmmu/qemu-system-aarch64 -M virt,gic-version=host -cpu host >> -smp 2 -m 4G -display none --enable-kvm -serial stdio -bios >> ~/VM/UEFI/QEMU_EFI_cbuild_debug_feb21_2019.fd > > OK, that's good to know. What hardware are you using, and what host > kernel version ? I am using Cavium ThunderX 2 and 5.0 host Thanks Eric > > thanks > -- PMM >
On Mon, 11 Mar 2019 at 15:09, Auger Eric <eric.auger@redhat.com> wrote: > > Hi Peter, > > On 3/11/19 3:55 PM, Peter Maydell wrote: > > On Mon, 11 Mar 2019 at 14:54, Auger Eric <eric.auger@redhat.com> wrote: > >> yes I hit the bug with your reduced command line: > >> > >> aarch64-softmmu/qemu-system-aarch64 -M virt,gic-version=host -cpu host > >> -smp 2 -m 4G -display none --enable-kvm -serial stdio -bios > >> ~/VM/UEFI/QEMU_EFI_cbuild_debug_feb21_2019.fd > > > > OK, that's good to know. What hardware are you using, and what host > > kernel version ? > > I am using Cavium ThunderX 2 and 5.0 host Thanks -- I got access to a thunderx 2 and have repro'd the failure there. -- PMM
On Thu, 21 Feb 2019 at 14:20, Auger Eric <eric.auger@redhat.com> wrote:
> This commit introduces a regression when running with EDK2 FW:
I believe I've now tracked down what was going wrong here.
The problem was with the guest CPU reset path -- in
kvm_arm_reset_vcpu() we copy the kernel's reset state
into the cpreg_indexes/cpreg_values arrays, because the
next thing we do after a reset is a kvm_arch_put_registers()
which will copy from the list values back to the kernel.
But now we've updated kvm_arch_put_registers() to do
a two-step CPUState -> cpreg* arrays -> kernel sync,
we need kvm_arm_reset_vcpu() to also sync cpreg arrays to
the CPUState. Otherwise the first kvm_arch_put_registers()
will write incorrect values back to the kernel, which for
some guests which are more trusting of reset register values
than Linux causes them to fail.
The fix is just this, on top of reverting the revert:
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -507,6 +507,14 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
fprintf(stderr, "write_kvmstate_to_list failed\n");
abort();
}
+ /*
+ * Sync the reset values also into the CPUState. This is necessary
+ * because the next thing we do will be a kvm_arch_put_registers()
+ * which will update the list values from the CPUState before copying
+ * the list values back to KVM. It's OK to ignore failure returns here
+ * for the same reason we do so in kvm_arch_get_registers().
+ */
+ write_list_to_cpustate(cpu);
}
/*
I'll send out a proper patch in a bit.
thanks
-- PMM
diff --git a/target/arm/cpu.h b/target/arm/cpu.h index f0334413ece..bfc05c796a5 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -2535,18 +2535,25 @@ bool write_list_to_cpustate(ARMCPU *cpu); /** * write_cpustate_to_list: * @cpu: ARMCPU + * @kvm_sync: true if this is for syncing back to KVM * * For each register listed in the ARMCPU cpreg_indexes list, write * its value from the ARMCPUState structure into the cpreg_values list. * This is used to copy info from TCG's working data structures into * KVM or for outbound migration. * + * @kvm_sync is true if we are doing this in order to sync the + * register state back to KVM. In this case we will only update + * values in the list if the previous list->cpustate sync actually + * successfully wrote the CPU state. Otherwise we will keep the value + * that is in the list. + * * Returns: true if all register values were read correctly, * false if some register was unknown or could not be read. * Note that we do not stop early on failure -- we will attempt * reading all registers in the list. */ -bool write_cpustate_to_list(ARMCPU *cpu); +bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync); #define ARM_CPUID_TI915T 0x54029152 #define ARM_CPUID_TI925T 0x54029252 diff --git a/target/arm/helper.c b/target/arm/helper.c index 5ac335f598c..7653aa6a50a 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -264,7 +264,7 @@ static bool raw_accessors_invalid(const ARMCPRegInfo *ri) return true; } -bool write_cpustate_to_list(ARMCPU *cpu) +bool write_cpustate_to_list(ARMCPU *cpu, bool kvm_sync) { /* Write the coprocessor state from cpu->env to the (index,value) list. */ int i; @@ -273,6 +273,7 @@ bool write_cpustate_to_list(ARMCPU *cpu) for (i = 0; i < cpu->cpreg_array_len; i++) { uint32_t regidx = kvm_to_cpreg_id(cpu->cpreg_indexes[i]); const ARMCPRegInfo *ri; + uint64_t newval; ri = get_arm_cp_reginfo(cpu->cp_regs, regidx); if (!ri) { @@ -282,7 +283,29 @@ bool write_cpustate_to_list(ARMCPU *cpu) if (ri->type & ARM_CP_NO_RAW) { continue; } - cpu->cpreg_values[i] = read_raw_cp_reg(&cpu->env, ri); + + newval = read_raw_cp_reg(&cpu->env, ri); + if (kvm_sync) { + /* + * Only sync if the previous list->cpustate sync succeeded. + * Rather than tracking the success/failure state for every + * item in the list, we just recheck "does the raw write we must + * have made in write_list_to_cpustate() read back OK" here. + */ + uint64_t oldval = cpu->cpreg_values[i]; + + if (oldval == newval) { + continue; + } + + write_raw_cp_reg(&cpu->env, ri, oldval); + if (read_raw_cp_reg(&cpu->env, ri) != oldval) { + continue; + } + + write_raw_cp_reg(&cpu->env, ri, newval); + } + cpu->cpreg_values[i] = newval; } return ok; } diff --git a/target/arm/kvm32.c b/target/arm/kvm32.c index bd51eb43c86..a75e04cc8f3 100644 --- a/target/arm/kvm32.c +++ b/target/arm/kvm32.c @@ -387,24 +387,8 @@ int kvm_arch_put_registers(CPUState *cs, int level) return ret; } - /* Note that we do not call write_cpustate_to_list() - * here, so we are only writing the tuple list back to - * KVM. This is safe because nothing can change the - * CPUARMState cp15 fields (in particular gdb accesses cannot) - * and so there are no changes to sync. In fact syncing would - * be wrong at this point: for a constant register where TCG and - * KVM disagree about its value, the preceding write_list_to_cpustate() - * would not have had any effect on the CPUARMState value (since the - * register is read-only), and a write_cpustate_to_list() here would - * then try to write the TCG value back into KVM -- this would either - * fail or incorrectly change the value the guest sees. - * - * If we ever want to allow the user to modify cp15 registers via - * the gdb stub, we would need to be more clever here (for instance - * tracking the set of registers kvm_arch_get_registers() successfully - * managed to update the CPUARMState with, and only allowing those - * to be written back up into the kernel). - */ + write_cpustate_to_list(cpu, true); + if (!write_list_to_kvmstate(cpu, level)) { return EINVAL; } diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c index 089af9c5f02..e3ba1492482 100644 --- a/target/arm/kvm64.c +++ b/target/arm/kvm64.c @@ -838,6 +838,8 @@ int kvm_arch_put_registers(CPUState *cs, int level) return ret; } + write_cpustate_to_list(cpu, true); + if (!write_list_to_kvmstate(cpu, level)) { return EINVAL; } diff --git a/target/arm/machine.c b/target/arm/machine.c index b2925496148..124192bfc26 100644 --- a/target/arm/machine.c +++ b/target/arm/machine.c @@ -630,7 +630,7 @@ static int cpu_pre_save(void *opaque) abort(); } } else { - if (!write_cpustate_to_list(cpu)) { + if (!write_cpustate_to_list(cpu, false)) { /* This should never fail. */ abort(); }