diff mbox series

[RFC] tests/qtest: properly initialise the vring used idx

Message ID 20220406173356.1891500-1-alex.bennee@linaro.org
State Superseded
Headers show
Series [RFC] tests/qtest: properly initialise the vring used idx | expand

Commit Message

Alex Bennée April 6, 2022, 5:33 p.m. UTC
Eric noticed while attempting to enable the vhost-user-blk-test for
Aarch64 that that things didn't work unless he put in a dummy
guest_malloc() at the start of the test. Without it
qvirtio_wait_used_elem() would assert when it reads a junk value for
idx resulting in:

  qvirtqueue_get_buf: idx:2401 last_idx:0
  qvirtqueue_get_buf: 0x7ffcb6d3fe74, (nil)
  qvirtio_wait_used_elem: 3000000/0
  ERROR:../../tests/qtest/libqos/virtio.c:226:qvirtio_wait_used_elem: assertion failed (got_desc_idx == desc_idx): (50331648 == 0)
  Bail out! ERROR:../../tests/qtest/libqos/virtio.c:226:qvirtio_wait_used_elem: assertion failed (got_desc_idx == desc_idx): (50331648 == 0)

What was actually happening is the guest_malloc() effectively pushed
the allocation of the vring into the next page which just happened to
have clear memory. After much tedious tracing of the code I could see
that qvring_init() does attempt initialise a bunch of the vring
structures but skips the vring->used.idx value. It is probably not
wise to assume guest memory is zeroed anyway. Once the ring is
properly initialised the hack is no longer needed to get things
working.

Thanks-to: John Snow <jsnow@redhat.com> for helping debug
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 tests/qtest/libqos/virtio.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Peter Maydell April 6, 2022, 7:26 p.m. UTC | #1
On Wed, 6 Apr 2022 at 18:36, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Eric noticed while attempting to enable the vhost-user-blk-test for
> Aarch64 that that things didn't work unless he put in a dummy
> guest_malloc() at the start of the test. Without it
> qvirtio_wait_used_elem() would assert when it reads a junk value for
> idx resulting in:
>
>   qvirtqueue_get_buf: idx:2401 last_idx:0
>   qvirtqueue_get_buf: 0x7ffcb6d3fe74, (nil)
>   qvirtio_wait_used_elem: 3000000/0
>   ERROR:../../tests/qtest/libqos/virtio.c:226:qvirtio_wait_used_elem: assertion failed (got_desc_idx == desc_idx): (50331648 == 0)
>   Bail out! ERROR:../../tests/qtest/libqos/virtio.c:226:qvirtio_wait_used_elem: assertion failed (got_desc_idx == desc_idx): (50331648 == 0)
>
> What was actually happening is the guest_malloc() effectively pushed
> the allocation of the vring into the next page which just happened to
> have clear memory. After much tedious tracing of the code I could see
> that qvring_init() does attempt initialise a bunch of the vring
> structures but skips the vring->used.idx value. It is probably not
> wise to assume guest memory is zeroed anyway. Once the ring is
> properly initialised the hack is no longer needed to get things
> working.

Guest memory is generally zero at startup. Do we manage to
hit the bit of memory at the start of the virt machine's RAM
where we store the DTB ? (As you say, initializing the data
structures is the right thing anyway.)

thanks
-- PMM
Alex Bennée April 6, 2022, 8:06 p.m. UTC | #2
Peter Maydell <peter.maydell@linaro.org> writes:

> On Wed, 6 Apr 2022 at 18:36, Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>> Eric noticed while attempting to enable the vhost-user-blk-test for
>> Aarch64 that that things didn't work unless he put in a dummy
>> guest_malloc() at the start of the test. Without it
>> qvirtio_wait_used_elem() would assert when it reads a junk value for
>> idx resulting in:
>>
>>   qvirtqueue_get_buf: idx:2401 last_idx:0
>>   qvirtqueue_get_buf: 0x7ffcb6d3fe74, (nil)
>>   qvirtio_wait_used_elem: 3000000/0
>>   ERROR:../../tests/qtest/libqos/virtio.c:226:qvirtio_wait_used_elem: assertion failed (got_desc_idx == desc_idx): (50331648 == 0)
>>   Bail out! ERROR:../../tests/qtest/libqos/virtio.c:226:qvirtio_wait_used_elem: assertion failed (got_desc_idx == desc_idx): (50331648 == 0)
>>
>> What was actually happening is the guest_malloc() effectively pushed
>> the allocation of the vring into the next page which just happened to
>> have clear memory. After much tedious tracing of the code I could see
>> that qvring_init() does attempt initialise a bunch of the vring
>> structures but skips the vring->used.idx value. It is probably not
>> wise to assume guest memory is zeroed anyway. Once the ring is
>> properly initialised the hack is no longer needed to get things
>> working.
>
> Guest memory is generally zero at startup. Do we manage to
> hit the bit of memory at the start of the virt machine's RAM
> where we store the DTB ? (As you say, initializing the data
> structures is the right thing anyway.)

I don't know - where is the DTB loaded? Currently we are using the first
couple of pages in qtest because that where the qtest allocater is
initialised:

  static void *qos_create_machine_arm_virt(QTestState *qts)
  {
      QVirtMachine *machine = g_new0(QVirtMachine, 1);

      alloc_init(&machine->alloc, 0,
                 ARM_VIRT_RAM_ADDR,
                 ARM_VIRT_RAM_ADDR + ARM_VIRT_RAM_SIZE,
                 ARM_PAGE_SIZE);
      qvirtio_mmio_init_device(&machine->virtio_mmio, qts, VIRTIO_MMIO_BASE_ADDR,
                               VIRTIO_MMIO_SIZE);

      qos_create_generic_pcihost(&machine->bridge, qts, &machine->alloc);

      machine->obj.get_device = virt_get_device;
      machine->obj.get_driver = virt_get_driver;
      machine->obj.destructor = virt_destructor;
      return machine;
  }

I don't know if there is a more sane piece of memory we should be using.


>
> thanks
> -- PMM
Peter Maydell April 6, 2022, 8:28 p.m. UTC | #3
On Wed, 6 Apr 2022 at 21:07, Alex Bennée <alex.bennee@linaro.org> wrote:
>
>
> Peter Maydell <peter.maydell@linaro.org> writes:
> > Guest memory is generally zero at startup. Do we manage to
> > hit the bit of memory at the start of the virt machine's RAM
> > where we store the DTB ? (As you say, initializing the data
> > structures is the right thing anyway.)
>
> I don't know - where is the DTB loaded?

Start of RAM (that's physaddr 0x4000_0000). The thing I'm not sure
about is whether these qtests go through the code in hw/arm/boot.c
that loads the DTB into guest RAM or not.

> Currently we are using the first
> couple of pages in qtest because that where the qtest allocater is
> initialised:
>
>   static void *qos_create_machine_arm_virt(QTestState *qts)
>   {
>       QVirtMachine *machine = g_new0(QVirtMachine, 1);
>
>       alloc_init(&machine->alloc, 0,
>                  ARM_VIRT_RAM_ADDR,
>                  ARM_VIRT_RAM_ADDR + ARM_VIRT_RAM_SIZE,
>                  ARM_PAGE_SIZE);
>       qvirtio_mmio_init_device(&machine->virtio_mmio, qts, VIRTIO_MMIO_BASE_ADDR,
>                                VIRTIO_MMIO_SIZE);
>
>       qos_create_generic_pcihost(&machine->bridge, qts, &machine->alloc);
>
>       machine->obj.get_device = virt_get_device;
>       machine->obj.get_driver = virt_get_driver;
>       machine->obj.destructor = virt_destructor;
>       return machine;
>   }
>
> I don't know if there is a more sane piece of memory we should be using.

The first part of RAM is fine, it's just you can't assume it's
all zeroes :-)

-- PMM
Stefan Hajnoczi April 7, 2022, 7:02 a.m. UTC | #4
On Wed, Apr 06, 2022 at 06:33:56PM +0100, Alex Bennée wrote:
> Eric noticed while attempting to enable the vhost-user-blk-test for
> Aarch64 that that things didn't work unless he put in a dummy
> guest_malloc() at the start of the test. Without it
> qvirtio_wait_used_elem() would assert when it reads a junk value for
> idx resulting in:
> 
>   qvirtqueue_get_buf: idx:2401 last_idx:0
>   qvirtqueue_get_buf: 0x7ffcb6d3fe74, (nil)
>   qvirtio_wait_used_elem: 3000000/0
>   ERROR:../../tests/qtest/libqos/virtio.c:226:qvirtio_wait_used_elem: assertion failed (got_desc_idx == desc_idx): (50331648 == 0)
>   Bail out! ERROR:../../tests/qtest/libqos/virtio.c:226:qvirtio_wait_used_elem: assertion failed (got_desc_idx == desc_idx): (50331648 == 0)
> 
> What was actually happening is the guest_malloc() effectively pushed
> the allocation of the vring into the next page which just happened to
> have clear memory. After much tedious tracing of the code I could see
> that qvring_init() does attempt initialise a bunch of the vring
> structures but skips the vring->used.idx value. It is probably not
> wise to assume guest memory is zeroed anyway. Once the ring is
> properly initialised the hack is no longer needed to get things
> working.
> 
> Thanks-to: John Snow <jsnow@redhat.com> for helping debug
> Cc: Eric Auger <eric.auger@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Raphael Norwitz <raphael.norwitz@nutanix.com>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  tests/qtest/libqos/virtio.c | 2 ++
>  1 file changed, 2 insertions(+)

Nice work!

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Alex Bennée April 7, 2022, 8:24 a.m. UTC | #5
Peter Maydell <peter.maydell@linaro.org> writes:

> On Wed, 6 Apr 2022 at 21:07, Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>>
>> Peter Maydell <peter.maydell@linaro.org> writes:
>> > Guest memory is generally zero at startup. Do we manage to
>> > hit the bit of memory at the start of the virt machine's RAM
>> > where we store the DTB ? (As you say, initializing the data
>> > structures is the right thing anyway.)
>>
>> I don't know - where is the DTB loaded?
>
> Start of RAM (that's physaddr 0x4000_0000). The thing I'm not sure
> about is whether these qtests go through the code in hw/arm/boot.c
> that loads the DTB into guest RAM or not.

Yes because it's linked to the machine creation:

Thread 1 hit Breakpoint 1, arm_load_dtb (addr=1073741824, binfo=binfo@entry=0x55bc4ce26970, addr_limit=0, as=as@entry=0x55bc4d119c50, ms=ms@entry=0x55bc4ce26800) at ../../hw/arm/boot.c:534
534     {
(rr) bt
#0  arm_load_dtb (addr=1073741824, binfo=binfo@entry=0x55bc4ce26970, addr_limit=0, as=as@entry=0x55bc4d119c50, ms=ms@entry=0x55bc4ce26800) at ../../hw/arm/boot.c:534
#1  0x000055bc4a9f7ded in virt_machine_done (notifier=0x55bc4ce26910, data=<optimized out>) at ../../hw/arm/virt.c:1637
#2  0x000055bc4aebc807 in notifier_list_notify (list=list@entry=0x55bc4b5f8b20 <machine_init_done_notifiers>, data=data@entry=0x0) at ../../util/notify.c:39
#3  0x000055bc4a7f82db in qdev_machine_creation_done () at ../../hw/core/machine.c:1235
#4  0x000055bc4a744b19 in qemu_machine_creation_done () at ../../softmmu/vl.c:2725
#5  qmp_x_exit_preconfig (errp=<optimized out>) at ../../softmmu/vl.c:2748
#6  0x000055bc4a748a14 in qmp_x_exit_preconfig (errp=<optimized out>) at ../../softmmu/vl.c:2741
#7  qemu_init (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../../softmmu/vl.c:3776
#8  0x000055bc4a6de639 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../../softmmu/main.c:49

(ION: yay, I can capture qtest runs in rr now ;-)

>
>> Currently we are using the first
>> couple of pages in qtest because that where the qtest allocater is
>> initialised:
>>
>>   static void *qos_create_machine_arm_virt(QTestState *qts)
>>   {
>>       QVirtMachine *machine = g_new0(QVirtMachine, 1);
>>
>>       alloc_init(&machine->alloc, 0,
>>                  ARM_VIRT_RAM_ADDR,
>>                  ARM_VIRT_RAM_ADDR + ARM_VIRT_RAM_SIZE,
>>                  ARM_PAGE_SIZE);
>>       qvirtio_mmio_init_device(&machine->virtio_mmio, qts, VIRTIO_MMIO_BASE_ADDR,
>>                                VIRTIO_MMIO_SIZE);
>>
>>       qos_create_generic_pcihost(&machine->bridge, qts, &machine->alloc);
>>
>>       machine->obj.get_device = virt_get_device;
>>       machine->obj.get_driver = virt_get_driver;
>>       machine->obj.destructor = virt_destructor;
>>       return machine;
>>   }
>>
>> I don't know if there is a more sane piece of memory we should be using.
>
> The first part of RAM is fine, it's just you can't assume it's
> all zeroes :-)
>
> -- PMM
Eric Auger April 7, 2022, 8:34 a.m. UTC | #6
Hi Alex,

On 4/6/22 7:33 PM, Alex Bennée wrote:
> Eric noticed while attempting to enable the vhost-user-blk-test for
> Aarch64 that that things didn't work unless he put in a dummy
> guest_malloc() at the start of the test. Without it
> qvirtio_wait_used_elem() would assert when it reads a junk value for
> idx resulting in:
>
>   qvirtqueue_get_buf: idx:2401 last_idx:0
>   qvirtqueue_get_buf: 0x7ffcb6d3fe74, (nil)
>   qvirtio_wait_used_elem: 3000000/0
>   ERROR:../../tests/qtest/libqos/virtio.c:226:qvirtio_wait_used_elem: assertion failed (got_desc_idx == desc_idx): (50331648 == 0)
>   Bail out! ERROR:../../tests/qtest/libqos/virtio.c:226:qvirtio_wait_used_elem: assertion failed (got_desc_idx == desc_idx): (50331648 == 0)
>
> What was actually happening is the guest_malloc() effectively pushed
> the allocation of the vring into the next page which just happened to
> have clear memory. After much tedious tracing of the code I could see
Many thanks for the tedious investigation!
> that qvring_init() does attempt initialise a bunch of the vring
> structures but skips the vring->used.idx value. It is probably not
> wise to assume guest memory is zeroed anyway. Once the ring is
> properly initialised the hack is no longer needed to get things
> working.
>
> Thanks-to: John Snow <jsnow@redhat.com> for helping debug
> Cc: Eric Auger <eric.auger@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Raphael Norwitz <raphael.norwitz@nutanix.com>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  tests/qtest/libqos/virtio.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/tests/qtest/libqos/virtio.c b/tests/qtest/libqos/virtio.c
> index 6fe7bf9555..fba9186659 100644
> --- a/tests/qtest/libqos/virtio.c
> +++ b/tests/qtest/libqos/virtio.c
> @@ -260,6 +260,8 @@ void qvring_init(QTestState *qts, const QGuestAllocator *alloc, QVirtQueue *vq,
>  
>      /* vq->used->flags */
>      qvirtio_writew(vq->vdev, qts, vq->used, 0);
> +    /* vq->used->idx */
> +    qvirtio_writew(vq->vdev, qts, vq->used + 2, 0);
>      /* vq->used->avail_event */
>      qvirtio_writew(vq->vdev, qts, vq->used + 2 +
>                     sizeof(struct vring_used_elem) * vq->size, 0);
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>

Eric
diff mbox series

Patch

diff --git a/tests/qtest/libqos/virtio.c b/tests/qtest/libqos/virtio.c
index 6fe7bf9555..fba9186659 100644
--- a/tests/qtest/libqos/virtio.c
+++ b/tests/qtest/libqos/virtio.c
@@ -260,6 +260,8 @@  void qvring_init(QTestState *qts, const QGuestAllocator *alloc, QVirtQueue *vq,
 
     /* vq->used->flags */
     qvirtio_writew(vq->vdev, qts, vq->used, 0);
+    /* vq->used->idx */
+    qvirtio_writew(vq->vdev, qts, vq->used + 2, 0);
     /* vq->used->avail_event */
     qvirtio_writew(vq->vdev, qts, vq->used + 2 +
                    sizeof(struct vring_used_elem) * vq->size, 0);