diff mbox series

[v4,4/5] hw/virtio: generalise CHR_EVENT_CLOSED handling

Message ID 20221130112439.2527228-5-alex.bennee@linaro.org
State New
Headers show
Series final vhost-user fixes | expand

Commit Message

Alex Bennée Nov. 30, 2022, 11:24 a.m. UTC
..and use for both virtio-user-blk and virtio-user-gpio. This avoids
the circular close by deferring shutdown due to disconnection until a
later point. virtio-user-blk already had this mechanism in place so
generalise it as a vhost-user helper function and use for both blk and
gpio devices.

While we are at it we also fix up vhost-user-gpio to re-establish the
event handler after close down so we can reconnect later.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
---
 include/hw/virtio/vhost-user.h | 18 +++++++++
 hw/block/vhost-user-blk.c      | 41 +++-----------------
 hw/virtio/vhost-user-gpio.c    | 11 +++++-
 hw/virtio/vhost-user.c         | 71 ++++++++++++++++++++++++++++++++++
 4 files changed, 104 insertions(+), 37 deletions(-)

Comments

Stefan Hajnoczi Nov. 30, 2022, 9:21 p.m. UTC | #1
On Wed, Nov 30, 2022 at 11:24:38AM +0000, Alex Bennée wrote:
> ..and use for both virtio-user-blk and virtio-user-gpio. This avoids
> the circular close by deferring shutdown due to disconnection until a
> later point.

I thought re-entrancy was already avoided by Patch 3?

> virtio-user-blk already had this mechanism in place so
> generalise it as a vhost-user helper function and use for both blk and
> gpio devices.
> 
> While we are at it we also fix up vhost-user-gpio to re-establish the
> event handler after close down so we can reconnect later.

I don't understand how the BH is supposed to help. Normally BHs are used
to defer processing while we're in a call stack of some depth. The
callers still require the resource we want to free, so we scheduled a
BH.

That's not the case here. The CHR_EVENT_CLOSED callback is just another
top-level event handler, not something that is called deep in a call
stack.

This BH approach isn't safe against nested event loops (something that
calls aio_poll() to wait), because now the BH can be invoked deep in a
call stack.

That said, if Patch 3 isn't enough and this patch fixes a problem, then
let's keep it for 7.2. It feels like a possibly incomplete solution and
maybe something that should be solved differently in 8.0.

> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
> ---
>  include/hw/virtio/vhost-user.h | 18 +++++++++
>  hw/block/vhost-user-blk.c      | 41 +++-----------------
>  hw/virtio/vhost-user-gpio.c    | 11 +++++-
>  hw/virtio/vhost-user.c         | 71 ++++++++++++++++++++++++++++++++++
>  4 files changed, 104 insertions(+), 37 deletions(-)
> 
> diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
> index c6e693cd3f..191216a74f 100644
> --- a/include/hw/virtio/vhost-user.h
> +++ b/include/hw/virtio/vhost-user.h
> @@ -68,4 +68,22 @@ bool vhost_user_init(VhostUserState *user, CharBackend *chr, Error **errp);
>   */
>  void vhost_user_cleanup(VhostUserState *user);
>  
> +/**
> + * vhost_user_async_close() - cleanup vhost-user post connection drop
> + * @d: DeviceState for the associated device (passed to callback)
> + * @chardev: the CharBackend associated with the connection
> + * @vhost: the common vhost device
> + * @cb: the user callback function to complete the clean-up
> + *
> + * This function is used to handle the shutdown of a vhost-user
> + * connection to a backend. We handle this centrally to make sure we
> + * do all the steps and handle potential races due to VM shutdowns.
> + * Once the connection is disabled we call a backhalf to ensure
> + */
> +typedef void (*vu_async_close_fn)(DeviceState *cb);
> +
> +void vhost_user_async_close(DeviceState *d,
> +                            CharBackend *chardev, struct vhost_dev *vhost,
> +                            vu_async_close_fn cb);
> +
>  #endif
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index 1177064631..aff4d2b8cb 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -369,17 +369,10 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
>      vhost_user_blk_stop(vdev);
>  
>      vhost_dev_cleanup(&s->dev);
> -}
>  
> -static void vhost_user_blk_chr_closed_bh(void *opaque)
> -{
> -    DeviceState *dev = opaque;
> -    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> -    VHostUserBlk *s = VHOST_USER_BLK(vdev);
> -
> -    vhost_user_blk_disconnect(dev);
> +    /* Re-instate the event handler for new connections */
>      qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, vhost_user_blk_event,
> -                             NULL, opaque, NULL, true);
> +                             NULL, dev, NULL, true);
>  }
>  
>  static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
> @@ -398,33 +391,9 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
>          }
>          break;
>      case CHR_EVENT_CLOSED:
> -        if (!runstate_check(RUN_STATE_SHUTDOWN)) {
> -            /*
> -             * A close event may happen during a read/write, but vhost
> -             * code assumes the vhost_dev remains setup, so delay the
> -             * stop & clear.
> -             */
> -            AioContext *ctx = qemu_get_current_aio_context();
> -
> -            qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL, NULL,
> -                    NULL, NULL, false);
> -            aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, opaque);
> -
> -            /*
> -             * Move vhost device to the stopped state. The vhost-user device
> -             * will be clean up and disconnected in BH. This can be useful in
> -             * the vhost migration code. If disconnect was caught there is an
> -             * option for the general vhost code to get the dev state without
> -             * knowing its type (in this case vhost-user).
> -             *
> -             * FIXME: this is sketchy to be reaching into vhost_dev
> -             * now because we are forcing something that implies we
> -             * have executed vhost_dev_stop() but that won't happen
> -             * until vhost_user_blk_stop() gets called from the bh.
> -             * Really this state check should be tracked locally.
> -             */
> -            s->dev.started = false;
> -        }
> +        /* defer close until later to avoid circular close */
> +        vhost_user_async_close(dev, &s->chardev, &s->dev,
> +                               vhost_user_blk_disconnect);
>          break;
>      case CHR_EVENT_BREAK:
>      case CHR_EVENT_MUX_IN:
> diff --git a/hw/virtio/vhost-user-gpio.c b/hw/virtio/vhost-user-gpio.c
> index be9be08b4c..b7b82a1099 100644
> --- a/hw/virtio/vhost-user-gpio.c
> +++ b/hw/virtio/vhost-user-gpio.c
> @@ -233,6 +233,8 @@ static int vu_gpio_connect(DeviceState *dev, Error **errp)
>      return 0;
>  }
>  
> +static void vu_gpio_event(void *opaque, QEMUChrEvent event);
> +
>  static void vu_gpio_disconnect(DeviceState *dev)
>  {
>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> @@ -245,6 +247,11 @@ static void vu_gpio_disconnect(DeviceState *dev)
>  
>      vu_gpio_stop(vdev);
>      vhost_dev_cleanup(&gpio->vhost_dev);
> +
> +    /* Re-instate the event handler for new connections */
> +    qemu_chr_fe_set_handlers(&gpio->chardev,
> +                             NULL, NULL, vu_gpio_event,
> +                             NULL, dev, NULL, true);
>  }
>  
>  static void vu_gpio_event(void *opaque, QEMUChrEvent event)
> @@ -262,7 +269,9 @@ static void vu_gpio_event(void *opaque, QEMUChrEvent event)
>          }
>          break;
>      case CHR_EVENT_CLOSED:
> -        vu_gpio_disconnect(dev);
> +        /* defer close until later to avoid circular close */
> +        vhost_user_async_close(dev, &gpio->chardev, &gpio->vhost_dev,
> +                               vu_gpio_disconnect);
>          break;
>      case CHR_EVENT_BREAK:
>      case CHR_EVENT_MUX_IN:
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index abe23d4ebe..8f635844af 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -21,6 +21,7 @@
>  #include "qemu/error-report.h"
>  #include "qemu/main-loop.h"
>  #include "qemu/sockets.h"
> +#include "sysemu/runstate.h"
>  #include "sysemu/cryptodev.h"
>  #include "migration/migration.h"
>  #include "migration/postcopy-ram.h"
> @@ -2670,6 +2671,76 @@ void vhost_user_cleanup(VhostUserState *user)
>      user->chr = NULL;
>  }
>  
> +
> +typedef struct {
> +    vu_async_close_fn cb;
> +    DeviceState *dev;
> +    CharBackend *cd;
> +    struct vhost_dev *vhost;
> +} VhostAsyncCallback;
> +
> +static void vhost_user_async_close_bh(void *opaque)
> +{
> +    VhostAsyncCallback *data = opaque;
> +    struct vhost_dev *vhost = data->vhost;
> +
> +    /*
> +     * If the vhost_dev has been cleared in the meantime there is
> +     * nothing left to do as some other path has completed the
> +     * cleanup.
> +     */
> +    if (vhost->vdev) {
> +        data->cb(data->dev);
> +    }
> +
> +    g_free(data);
> +}
> +
> +/*
> + * We only schedule the work if the machine is running. If suspended
> + * we want to keep all the in-flight data as is for migration
> + * purposes.
> + */
> +void vhost_user_async_close(DeviceState *d,
> +                            CharBackend *chardev, struct vhost_dev *vhost,
> +                            vu_async_close_fn cb)
> +{
> +    if (!runstate_check(RUN_STATE_SHUTDOWN)) {
> +        /*
> +         * A close event may happen during a read/write, but vhost
> +         * code assumes the vhost_dev remains setup, so delay the
> +         * stop & clear.
> +         */
> +        AioContext *ctx = qemu_get_current_aio_context();
> +        VhostAsyncCallback *data = g_new0(VhostAsyncCallback, 1);
> +
> +        /* Save data for the callback */
> +        data->cb = cb;
> +        data->dev = d;
> +        data->cd = chardev;
> +        data->vhost = vhost;
> +
> +        /* Disable any further notifications on the chardev */
> +        qemu_chr_fe_set_handlers(chardev,
> +                                 NULL, NULL, NULL, NULL, NULL, NULL,
> +                                 false);
> +
> +        aio_bh_schedule_oneshot(ctx, vhost_user_async_close_bh, data);
> +
> +        /*
> +         * Move vhost device to the stopped state. The vhost-user device
> +         * will be clean up and disconnected in BH. This can be useful in
> +         * the vhost migration code. If disconnect was caught there is an
> +         * option for the general vhost code to get the dev state without
> +         * knowing its type (in this case vhost-user).
> +         *
> +         * Note if the vhost device is fully cleared by the time we
> +         * execute the bottom half we won't continue with the cleanup.
> +         */
> +        vhost->started = false;
> +    }
> +}
> +
>  static int vhost_user_dev_start(struct vhost_dev *dev, bool started)
>  {
>      if (!virtio_has_feature(dev->protocol_features,
> -- 
> 2.34.1
>
Michael S. Tsirkin Dec. 1, 2022, 6:56 a.m. UTC | #2
On Wed, Nov 30, 2022 at 11:24:38AM +0000, Alex Bennée wrote:
> ..and use for both virtio-user-blk and virtio-user-gpio. This avoids
> the circular close by deferring shutdown due to disconnection until a
> later point. virtio-user-blk already had this mechanism in place so
> generalise it as a vhost-user helper function and use for both blk and
> gpio devices.
> 
> While we are at it we also fix up vhost-user-gpio to re-establish the
> event handler after close down so we can reconnect later.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>


I have some ideas on how to improve the coding style here. Can be patches on
top.

> ---
>  include/hw/virtio/vhost-user.h | 18 +++++++++
>  hw/block/vhost-user-blk.c      | 41 +++-----------------
>  hw/virtio/vhost-user-gpio.c    | 11 +++++-
>  hw/virtio/vhost-user.c         | 71 ++++++++++++++++++++++++++++++++++
>  4 files changed, 104 insertions(+), 37 deletions(-)
> 
> diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
> index c6e693cd3f..191216a74f 100644
> --- a/include/hw/virtio/vhost-user.h
> +++ b/include/hw/virtio/vhost-user.h
> @@ -68,4 +68,22 @@ bool vhost_user_init(VhostUserState *user, CharBackend *chr, Error **errp);
>   */
>  void vhost_user_cleanup(VhostUserState *user);
>  
> +/**
> + * vhost_user_async_close() - cleanup vhost-user post connection drop
> + * @d: DeviceState for the associated device (passed to callback)
> + * @chardev: the CharBackend associated with the connection
> + * @vhost: the common vhost device
> + * @cb: the user callback function to complete the clean-up
> + *
> + * This function is used to handle the shutdown of a vhost-user
> + * connection to a backend. We handle this centrally to make sure we
> + * do all the steps and handle potential races due to VM shutdowns.
> + * Once the connection is disabled we call a backhalf to ensure

This sentence is incomplete. Fixing that can be done later.

> + */
> +typedef void (*vu_async_close_fn)(DeviceState *cb);

we should change the typedef to VhostUserAsyncCloseFn to agree with
the coding style:

	 Enum type
	names and function type names should also be in CamelCase. 

also, it's best to eschew abbreviation.

> +
> +void vhost_user_async_close(DeviceState *d,
> +                            CharBackend *chardev, struct vhost_dev *vhost,
> +                            vu_async_close_fn cb);
> +
>  #endif
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index 1177064631..aff4d2b8cb 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -369,17 +369,10 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
>      vhost_user_blk_stop(vdev);
>  
>      vhost_dev_cleanup(&s->dev);
> -}
>  
> -static void vhost_user_blk_chr_closed_bh(void *opaque)
> -{
> -    DeviceState *dev = opaque;
> -    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> -    VHostUserBlk *s = VHOST_USER_BLK(vdev);
> -
> -    vhost_user_blk_disconnect(dev);
> +    /* Re-instate the event handler for new connections */
>      qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, vhost_user_blk_event,
> -                             NULL, opaque, NULL, true);
> +                             NULL, dev, NULL, true);
>  }
>  
>  static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
> @@ -398,33 +391,9 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
>          }
>          break;
>      case CHR_EVENT_CLOSED:
> -        if (!runstate_check(RUN_STATE_SHUTDOWN)) {
> -            /*
> -             * A close event may happen during a read/write, but vhost
> -             * code assumes the vhost_dev remains setup, so delay the
> -             * stop & clear.
> -             */
> -            AioContext *ctx = qemu_get_current_aio_context();
> -
> -            qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL, NULL,
> -                    NULL, NULL, false);
> -            aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, opaque);
> -
> -            /*
> -             * Move vhost device to the stopped state. The vhost-user device
> -             * will be clean up and disconnected in BH. This can be useful in
> -             * the vhost migration code. If disconnect was caught there is an
> -             * option for the general vhost code to get the dev state without
> -             * knowing its type (in this case vhost-user).
> -             *
> -             * FIXME: this is sketchy to be reaching into vhost_dev
> -             * now because we are forcing something that implies we
> -             * have executed vhost_dev_stop() but that won't happen
> -             * until vhost_user_blk_stop() gets called from the bh.
> -             * Really this state check should be tracked locally.
> -             */
> -            s->dev.started = false;
> -        }
> +        /* defer close until later to avoid circular close */
> +        vhost_user_async_close(dev, &s->chardev, &s->dev,
> +                               vhost_user_blk_disconnect);
>          break;
>      case CHR_EVENT_BREAK:
>      case CHR_EVENT_MUX_IN:
> diff --git a/hw/virtio/vhost-user-gpio.c b/hw/virtio/vhost-user-gpio.c
> index be9be08b4c..b7b82a1099 100644
> --- a/hw/virtio/vhost-user-gpio.c
> +++ b/hw/virtio/vhost-user-gpio.c
> @@ -233,6 +233,8 @@ static int vu_gpio_connect(DeviceState *dev, Error **errp)
>      return 0;
>  }
>  
> +static void vu_gpio_event(void *opaque, QEMUChrEvent event);
> +
>  static void vu_gpio_disconnect(DeviceState *dev)
>  {
>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);

we should reorder functions down the road to avoid declarations like
this.

> @@ -245,6 +247,11 @@ static void vu_gpio_disconnect(DeviceState *dev)
>  
>      vu_gpio_stop(vdev);
>      vhost_dev_cleanup(&gpio->vhost_dev);
> +
> +    /* Re-instate the event handler for new connections */
> +    qemu_chr_fe_set_handlers(&gpio->chardev,
> +                             NULL, NULL, vu_gpio_event,
> +                             NULL, dev, NULL, true);
>  }
>  
>  static void vu_gpio_event(void *opaque, QEMUChrEvent event)
> @@ -262,7 +269,9 @@ static void vu_gpio_event(void *opaque, QEMUChrEvent event)
>          }
>          break;
>      case CHR_EVENT_CLOSED:
> -        vu_gpio_disconnect(dev);
> +        /* defer close until later to avoid circular close */
> +        vhost_user_async_close(dev, &gpio->chardev, &gpio->vhost_dev,
> +                               vu_gpio_disconnect);
>          break;
>      case CHR_EVENT_BREAK:
>      case CHR_EVENT_MUX_IN:
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index abe23d4ebe..8f635844af 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -21,6 +21,7 @@
>  #include "qemu/error-report.h"
>  #include "qemu/main-loop.h"
>  #include "qemu/sockets.h"
> +#include "sysemu/runstate.h"
>  #include "sysemu/cryptodev.h"
>  #include "migration/migration.h"
>  #include "migration/postcopy-ram.h"
> @@ -2670,6 +2671,76 @@ void vhost_user_cleanup(VhostUserState *user)
>      user->chr = NULL;
>  }
>  
> +

extra empty line

> +typedef struct {
> +    vu_async_close_fn cb;

cb can be renamed to e.g. "close" for clarity?

> +    DeviceState *dev;
> +    CharBackend *cd;
> +    struct vhost_dev *vhost;
> +} VhostAsyncCallback;
> +
> +static void vhost_user_async_close_bh(void *opaque)
> +{
> +    VhostAsyncCallback *data = opaque;
> +    struct vhost_dev *vhost = data->vhost;
> +
> +    /*
> +     * If the vhost_dev has been cleared in the meantime there is
> +     * nothing left to do as some other path has completed the
> +     * cleanup.
> +     */
> +    if (vhost->vdev) {

it's clearer to drop the vhost variable here. It's only used once.
Above will be
if (data->vhost->vdev)

> +        data->cb(data->dev);
> +    }
> +
> +    g_free(data);
> +}
> +
> +/*
> + * We only schedule the work if the machine is running. If suspended
> + * we want to keep all the in-flight data as is for migration
> + * purposes.
> + */
> +void vhost_user_async_close(DeviceState *d,
> +                            CharBackend *chardev, struct vhost_dev *vhost,
> +                            vu_async_close_fn cb)
> +{
> +    if (!runstate_check(RUN_STATE_SHUTDOWN)) {
> +        /*
> +         * A close event may happen during a read/write, but vhost
> +         * code assumes the vhost_dev remains setup, so delay the
> +         * stop & clear.
> +         */
> +        AioContext *ctx = qemu_get_current_aio_context();
> +        VhostAsyncCallback *data = g_new0(VhostAsyncCallback, 1);
> +
> +        /* Save data for the callback */
> +        data->cb = cb;
> +        data->dev = d;
> +        data->cd = chardev;
> +        data->vhost = vhost;
> +
> +        /* Disable any further notifications on the chardev */
> +        qemu_chr_fe_set_handlers(chardev,
> +                                 NULL, NULL, NULL, NULL, NULL, NULL,
> +                                 false);
> +
> +        aio_bh_schedule_oneshot(ctx, vhost_user_async_close_bh, data);
> +
> +        /*
> +         * Move vhost device to the stopped state. The vhost-user device
> +         * will be clean up and disconnected in BH. This can be useful in
> +         * the vhost migration code. If disconnect was caught there is an
> +         * option for the general vhost code to get the dev state without
> +         * knowing its type (in this case vhost-user).
> +         *
> +         * Note if the vhost device is fully cleared by the time we
> +         * execute the bottom half we won't continue with the cleanup.
> +         */
> +        vhost->started = false;
> +    }
> +}
> +
>  static int vhost_user_dev_start(struct vhost_dev *dev, bool started)
>  {
>      if (!virtio_has_feature(dev->protocol_features,
> -- 
> 2.34.1
Alex Bennée Dec. 1, 2022, 5:05 p.m. UTC | #3
Stefan Hajnoczi <stefanha@redhat.com> writes:

> [[PGP Signed Part:Undecided]]
> On Wed, Nov 30, 2022 at 11:24:38AM +0000, Alex Bennée wrote:
>> ..and use for both virtio-user-blk and virtio-user-gpio. This avoids
>> the circular close by deferring shutdown due to disconnection until a
>> later point.
>
> I thought re-entrancy was already avoided by Patch 3?

No because vu_gpio_disconnect() gets called while we are in the process
of tearing things down for the vu_gpio_stop(). As it hasn't disconnected
yet we end up doing the teardown there:

  #2  0x0000557adbe04d56 in vu_gpio_disconnect (dev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:255
  #3  0x0000557adbe049bb in vu_gpio_event (opaque=0x557adf80d640, event=CHR_EVENT_CLOSED) at ../../hw/virtio/vhost-user-gpio.c:274
  #4  0x0000557adc0539ef in chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:61
  #5  0x0000557adc0506aa in qemu_chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:81
  #6  0x0000557adc04f666 in tcp_chr_disconnect_locked (chr=0x557adea51f10) at ../../chardev/char-socket.c:470
  #7  0x0000557adc04c81a in tcp_chr_write (chr=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-socket.c:129
  #8  0x0000557adc050999 in qemu_chr_write_buffer (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, offset=0x7ffe8588cbe4, write_all=true) at ../../chardev/char.c:121
  #9  0x0000557adc0507c7 in qemu_chr_write (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, write_all=true) at ../../chardev/char.c:173
  #10 0x0000557adc046f3a in qemu_chr_fe_write_all (be=0x557adf80d830, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-fe.c:53
  #11 0x0000557adbddc02f in vhost_user_write (dev=0x557adf80d878, msg=0x7ffe8588cce0, fds=0x0, fd_num=0) at ../../hw/virtio/vhost-user.c:490
  #12 0x0000557adbddd48f in vhost_user_get_vring_base (dev=0x557adf80d878, ring=0x7ffe8588d000) at ../../hw/virtio/vhost-user.c:1260
  #13 0x0000557adbdd4bd6 in vhost_virtqueue_stop (dev=0x557adf80d878, vdev=0x557adf80d640, vq=0x557adf843570, idx=0) at ../../hw/virtio/vhost.c:1220
  #14 0x0000557adbdd7eda in vhost_dev_stop (hdev=0x557adf80d878, vdev=0x557adf80d640, vrings=false) at ../../hw/virtio/vhost.c:1916
  #15 0x0000557adbe051a6 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:142

and when we finally return the disconnect event has wiped out vhost_user.

>
>> virtio-user-blk already had this mechanism in place so
>> generalise it as a vhost-user helper function and use for both blk and
>> gpio devices.
>> 
>> While we are at it we also fix up vhost-user-gpio to re-establish the
>> event handler after close down so we can reconnect later.
>
> I don't understand how the BH is supposed to help. Normally BHs are used
> to defer processing while we're in a call stack of some depth. The
> callers still require the resource we want to free, so we scheduled a
> BH.
>
> That's not the case here. The CHR_EVENT_CLOSED callback is just another
> top-level event handler, not something that is called deep in a call
> stack.

Well it is in the shutdown case because as we write to the vhost_user
backend to shut things down it dies and we immediately process the
disconnect event.

It is racy though so some times you get away with it.

> This BH approach isn't safe against nested event loops (something that
> calls aio_poll() to wait), because now the BH can be invoked deep in a
> call stack.

Well I did ask if anyone had any better suggestions. In this case I'm
mostly just copying what vhost-user-blk did.

>
> That said, if Patch 3 isn't enough and this patch fixes a problem, then
> let's keep it for 7.2. It feels like a possibly incomplete solution and
> maybe something that should be solved differently in 8.0.
>
>> 
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
>> ---
>>  include/hw/virtio/vhost-user.h | 18 +++++++++
>>  hw/block/vhost-user-blk.c      | 41 +++-----------------
>>  hw/virtio/vhost-user-gpio.c    | 11 +++++-
>>  hw/virtio/vhost-user.c         | 71 ++++++++++++++++++++++++++++++++++
>>  4 files changed, 104 insertions(+), 37 deletions(-)
>> 
>> diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
>> index c6e693cd3f..191216a74f 100644
>> --- a/include/hw/virtio/vhost-user.h
>> +++ b/include/hw/virtio/vhost-user.h
>> @@ -68,4 +68,22 @@ bool vhost_user_init(VhostUserState *user, CharBackend *chr, Error **errp);
>>   */
>>  void vhost_user_cleanup(VhostUserState *user);
>>  
>> +/**
>> + * vhost_user_async_close() - cleanup vhost-user post connection drop
>> + * @d: DeviceState for the associated device (passed to callback)
>> + * @chardev: the CharBackend associated with the connection
>> + * @vhost: the common vhost device
>> + * @cb: the user callback function to complete the clean-up
>> + *
>> + * This function is used to handle the shutdown of a vhost-user
>> + * connection to a backend. We handle this centrally to make sure we
>> + * do all the steps and handle potential races due to VM shutdowns.
>> + * Once the connection is disabled we call a backhalf to ensure
>> + */
>> +typedef void (*vu_async_close_fn)(DeviceState *cb);
>> +
>> +void vhost_user_async_close(DeviceState *d,
>> +                            CharBackend *chardev, struct vhost_dev *vhost,
>> +                            vu_async_close_fn cb);
>> +
>>  #endif
>> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
>> index 1177064631..aff4d2b8cb 100644
>> --- a/hw/block/vhost-user-blk.c
>> +++ b/hw/block/vhost-user-blk.c
>> @@ -369,17 +369,10 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
>>      vhost_user_blk_stop(vdev);
>>  
>>      vhost_dev_cleanup(&s->dev);
>> -}
>>  
>> -static void vhost_user_blk_chr_closed_bh(void *opaque)
>> -{
>> -    DeviceState *dev = opaque;
>> -    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>> -    VHostUserBlk *s = VHOST_USER_BLK(vdev);
>> -
>> -    vhost_user_blk_disconnect(dev);
>> +    /* Re-instate the event handler for new connections */
>>      qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, vhost_user_blk_event,
>> -                             NULL, opaque, NULL, true);
>> +                             NULL, dev, NULL, true);
>>  }
>>  
>>  static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
>> @@ -398,33 +391,9 @@ static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
>>          }
>>          break;
>>      case CHR_EVENT_CLOSED:
>> -        if (!runstate_check(RUN_STATE_SHUTDOWN)) {
>> -            /*
>> -             * A close event may happen during a read/write, but vhost
>> -             * code assumes the vhost_dev remains setup, so delay the
>> -             * stop & clear.
>> -             */
>> -            AioContext *ctx = qemu_get_current_aio_context();
>> -
>> -            qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL, NULL,
>> -                    NULL, NULL, false);
>> -            aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, opaque);
>> -
>> -            /*
>> -             * Move vhost device to the stopped state. The vhost-user device
>> -             * will be clean up and disconnected in BH. This can be useful in
>> -             * the vhost migration code. If disconnect was caught there is an
>> -             * option for the general vhost code to get the dev state without
>> -             * knowing its type (in this case vhost-user).
>> -             *
>> -             * FIXME: this is sketchy to be reaching into vhost_dev
>> -             * now because we are forcing something that implies we
>> -             * have executed vhost_dev_stop() but that won't happen
>> -             * until vhost_user_blk_stop() gets called from the bh.
>> -             * Really this state check should be tracked locally.
>> -             */
>> -            s->dev.started = false;
>> -        }
>> +        /* defer close until later to avoid circular close */
>> +        vhost_user_async_close(dev, &s->chardev, &s->dev,
>> +                               vhost_user_blk_disconnect);
>>          break;
>>      case CHR_EVENT_BREAK:
>>      case CHR_EVENT_MUX_IN:
>> diff --git a/hw/virtio/vhost-user-gpio.c b/hw/virtio/vhost-user-gpio.c
>> index be9be08b4c..b7b82a1099 100644
>> --- a/hw/virtio/vhost-user-gpio.c
>> +++ b/hw/virtio/vhost-user-gpio.c
>> @@ -233,6 +233,8 @@ static int vu_gpio_connect(DeviceState *dev, Error **errp)
>>      return 0;
>>  }
>>  
>> +static void vu_gpio_event(void *opaque, QEMUChrEvent event);
>> +
>>  static void vu_gpio_disconnect(DeviceState *dev)
>>  {
>>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>> @@ -245,6 +247,11 @@ static void vu_gpio_disconnect(DeviceState *dev)
>>  
>>      vu_gpio_stop(vdev);
>>      vhost_dev_cleanup(&gpio->vhost_dev);
>> +
>> +    /* Re-instate the event handler for new connections */
>> +    qemu_chr_fe_set_handlers(&gpio->chardev,
>> +                             NULL, NULL, vu_gpio_event,
>> +                             NULL, dev, NULL, true);
>>  }
>>  
>>  static void vu_gpio_event(void *opaque, QEMUChrEvent event)
>> @@ -262,7 +269,9 @@ static void vu_gpio_event(void *opaque, QEMUChrEvent event)
>>          }
>>          break;
>>      case CHR_EVENT_CLOSED:
>> -        vu_gpio_disconnect(dev);
>> +        /* defer close until later to avoid circular close */
>> +        vhost_user_async_close(dev, &gpio->chardev, &gpio->vhost_dev,
>> +                               vu_gpio_disconnect);
>>          break;
>>      case CHR_EVENT_BREAK:
>>      case CHR_EVENT_MUX_IN:
>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>> index abe23d4ebe..8f635844af 100644
>> --- a/hw/virtio/vhost-user.c
>> +++ b/hw/virtio/vhost-user.c
>> @@ -21,6 +21,7 @@
>>  #include "qemu/error-report.h"
>>  #include "qemu/main-loop.h"
>>  #include "qemu/sockets.h"
>> +#include "sysemu/runstate.h"
>>  #include "sysemu/cryptodev.h"
>>  #include "migration/migration.h"
>>  #include "migration/postcopy-ram.h"
>> @@ -2670,6 +2671,76 @@ void vhost_user_cleanup(VhostUserState *user)
>>      user->chr = NULL;
>>  }
>>  
>> +
>> +typedef struct {
>> +    vu_async_close_fn cb;
>> +    DeviceState *dev;
>> +    CharBackend *cd;
>> +    struct vhost_dev *vhost;
>> +} VhostAsyncCallback;
>> +
>> +static void vhost_user_async_close_bh(void *opaque)
>> +{
>> +    VhostAsyncCallback *data = opaque;
>> +    struct vhost_dev *vhost = data->vhost;
>> +
>> +    /*
>> +     * If the vhost_dev has been cleared in the meantime there is
>> +     * nothing left to do as some other path has completed the
>> +     * cleanup.
>> +     */
>> +    if (vhost->vdev) {
>> +        data->cb(data->dev);
>> +    }
>> +
>> +    g_free(data);
>> +}
>> +
>> +/*
>> + * We only schedule the work if the machine is running. If suspended
>> + * we want to keep all the in-flight data as is for migration
>> + * purposes.
>> + */
>> +void vhost_user_async_close(DeviceState *d,
>> +                            CharBackend *chardev, struct vhost_dev *vhost,
>> +                            vu_async_close_fn cb)
>> +{
>> +    if (!runstate_check(RUN_STATE_SHUTDOWN)) {
>> +        /*
>> +         * A close event may happen during a read/write, but vhost
>> +         * code assumes the vhost_dev remains setup, so delay the
>> +         * stop & clear.
>> +         */
>> +        AioContext *ctx = qemu_get_current_aio_context();
>> +        VhostAsyncCallback *data = g_new0(VhostAsyncCallback, 1);
>> +
>> +        /* Save data for the callback */
>> +        data->cb = cb;
>> +        data->dev = d;
>> +        data->cd = chardev;
>> +        data->vhost = vhost;
>> +
>> +        /* Disable any further notifications on the chardev */
>> +        qemu_chr_fe_set_handlers(chardev,
>> +                                 NULL, NULL, NULL, NULL, NULL, NULL,
>> +                                 false);
>> +
>> +        aio_bh_schedule_oneshot(ctx, vhost_user_async_close_bh, data);
>> +
>> +        /*
>> +         * Move vhost device to the stopped state. The vhost-user device
>> +         * will be clean up and disconnected in BH. This can be useful in
>> +         * the vhost migration code. If disconnect was caught there is an
>> +         * option for the general vhost code to get the dev state without
>> +         * knowing its type (in this case vhost-user).
>> +         *
>> +         * Note if the vhost device is fully cleared by the time we
>> +         * execute the bottom half we won't continue with the cleanup.
>> +         */
>> +        vhost->started = false;
>> +    }
>> +}
>> +
>>  static int vhost_user_dev_start(struct vhost_dev *dev, bool started)
>>  {
>>      if (!virtio_has_feature(dev->protocol_features,
>> -- 
>> 2.34.1
>> 
>
> [[End of PGP Signed Part]]
diff mbox series

Patch

diff --git a/include/hw/virtio/vhost-user.h b/include/hw/virtio/vhost-user.h
index c6e693cd3f..191216a74f 100644
--- a/include/hw/virtio/vhost-user.h
+++ b/include/hw/virtio/vhost-user.h
@@ -68,4 +68,22 @@  bool vhost_user_init(VhostUserState *user, CharBackend *chr, Error **errp);
  */
 void vhost_user_cleanup(VhostUserState *user);
 
+/**
+ * vhost_user_async_close() - cleanup vhost-user post connection drop
+ * @d: DeviceState for the associated device (passed to callback)
+ * @chardev: the CharBackend associated with the connection
+ * @vhost: the common vhost device
+ * @cb: the user callback function to complete the clean-up
+ *
+ * This function is used to handle the shutdown of a vhost-user
+ * connection to a backend. We handle this centrally to make sure we
+ * do all the steps and handle potential races due to VM shutdowns.
+ * Once the connection is disabled we call a backhalf to ensure
+ */
+typedef void (*vu_async_close_fn)(DeviceState *cb);
+
+void vhost_user_async_close(DeviceState *d,
+                            CharBackend *chardev, struct vhost_dev *vhost,
+                            vu_async_close_fn cb);
+
 #endif
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 1177064631..aff4d2b8cb 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -369,17 +369,10 @@  static void vhost_user_blk_disconnect(DeviceState *dev)
     vhost_user_blk_stop(vdev);
 
     vhost_dev_cleanup(&s->dev);
-}
 
-static void vhost_user_blk_chr_closed_bh(void *opaque)
-{
-    DeviceState *dev = opaque;
-    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
-    VHostUserBlk *s = VHOST_USER_BLK(vdev);
-
-    vhost_user_blk_disconnect(dev);
+    /* Re-instate the event handler for new connections */
     qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, vhost_user_blk_event,
-                             NULL, opaque, NULL, true);
+                             NULL, dev, NULL, true);
 }
 
 static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
@@ -398,33 +391,9 @@  static void vhost_user_blk_event(void *opaque, QEMUChrEvent event)
         }
         break;
     case CHR_EVENT_CLOSED:
-        if (!runstate_check(RUN_STATE_SHUTDOWN)) {
-            /*
-             * A close event may happen during a read/write, but vhost
-             * code assumes the vhost_dev remains setup, so delay the
-             * stop & clear.
-             */
-            AioContext *ctx = qemu_get_current_aio_context();
-
-            qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL, NULL,
-                    NULL, NULL, false);
-            aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, opaque);
-
-            /*
-             * Move vhost device to the stopped state. The vhost-user device
-             * will be clean up and disconnected in BH. This can be useful in
-             * the vhost migration code. If disconnect was caught there is an
-             * option for the general vhost code to get the dev state without
-             * knowing its type (in this case vhost-user).
-             *
-             * FIXME: this is sketchy to be reaching into vhost_dev
-             * now because we are forcing something that implies we
-             * have executed vhost_dev_stop() but that won't happen
-             * until vhost_user_blk_stop() gets called from the bh.
-             * Really this state check should be tracked locally.
-             */
-            s->dev.started = false;
-        }
+        /* defer close until later to avoid circular close */
+        vhost_user_async_close(dev, &s->chardev, &s->dev,
+                               vhost_user_blk_disconnect);
         break;
     case CHR_EVENT_BREAK:
     case CHR_EVENT_MUX_IN:
diff --git a/hw/virtio/vhost-user-gpio.c b/hw/virtio/vhost-user-gpio.c
index be9be08b4c..b7b82a1099 100644
--- a/hw/virtio/vhost-user-gpio.c
+++ b/hw/virtio/vhost-user-gpio.c
@@ -233,6 +233,8 @@  static int vu_gpio_connect(DeviceState *dev, Error **errp)
     return 0;
 }
 
+static void vu_gpio_event(void *opaque, QEMUChrEvent event);
+
 static void vu_gpio_disconnect(DeviceState *dev)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -245,6 +247,11 @@  static void vu_gpio_disconnect(DeviceState *dev)
 
     vu_gpio_stop(vdev);
     vhost_dev_cleanup(&gpio->vhost_dev);
+
+    /* Re-instate the event handler for new connections */
+    qemu_chr_fe_set_handlers(&gpio->chardev,
+                             NULL, NULL, vu_gpio_event,
+                             NULL, dev, NULL, true);
 }
 
 static void vu_gpio_event(void *opaque, QEMUChrEvent event)
@@ -262,7 +269,9 @@  static void vu_gpio_event(void *opaque, QEMUChrEvent event)
         }
         break;
     case CHR_EVENT_CLOSED:
-        vu_gpio_disconnect(dev);
+        /* defer close until later to avoid circular close */
+        vhost_user_async_close(dev, &gpio->chardev, &gpio->vhost_dev,
+                               vu_gpio_disconnect);
         break;
     case CHR_EVENT_BREAK:
     case CHR_EVENT_MUX_IN:
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index abe23d4ebe..8f635844af 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -21,6 +21,7 @@ 
 #include "qemu/error-report.h"
 #include "qemu/main-loop.h"
 #include "qemu/sockets.h"
+#include "sysemu/runstate.h"
 #include "sysemu/cryptodev.h"
 #include "migration/migration.h"
 #include "migration/postcopy-ram.h"
@@ -2670,6 +2671,76 @@  void vhost_user_cleanup(VhostUserState *user)
     user->chr = NULL;
 }
 
+
+typedef struct {
+    vu_async_close_fn cb;
+    DeviceState *dev;
+    CharBackend *cd;
+    struct vhost_dev *vhost;
+} VhostAsyncCallback;
+
+static void vhost_user_async_close_bh(void *opaque)
+{
+    VhostAsyncCallback *data = opaque;
+    struct vhost_dev *vhost = data->vhost;
+
+    /*
+     * If the vhost_dev has been cleared in the meantime there is
+     * nothing left to do as some other path has completed the
+     * cleanup.
+     */
+    if (vhost->vdev) {
+        data->cb(data->dev);
+    }
+
+    g_free(data);
+}
+
+/*
+ * We only schedule the work if the machine is running. If suspended
+ * we want to keep all the in-flight data as is for migration
+ * purposes.
+ */
+void vhost_user_async_close(DeviceState *d,
+                            CharBackend *chardev, struct vhost_dev *vhost,
+                            vu_async_close_fn cb)
+{
+    if (!runstate_check(RUN_STATE_SHUTDOWN)) {
+        /*
+         * A close event may happen during a read/write, but vhost
+         * code assumes the vhost_dev remains setup, so delay the
+         * stop & clear.
+         */
+        AioContext *ctx = qemu_get_current_aio_context();
+        VhostAsyncCallback *data = g_new0(VhostAsyncCallback, 1);
+
+        /* Save data for the callback */
+        data->cb = cb;
+        data->dev = d;
+        data->cd = chardev;
+        data->vhost = vhost;
+
+        /* Disable any further notifications on the chardev */
+        qemu_chr_fe_set_handlers(chardev,
+                                 NULL, NULL, NULL, NULL, NULL, NULL,
+                                 false);
+
+        aio_bh_schedule_oneshot(ctx, vhost_user_async_close_bh, data);
+
+        /*
+         * Move vhost device to the stopped state. The vhost-user device
+         * will be clean up and disconnected in BH. This can be useful in
+         * the vhost migration code. If disconnect was caught there is an
+         * option for the general vhost code to get the dev state without
+         * knowing its type (in this case vhost-user).
+         *
+         * Note if the vhost device is fully cleared by the time we
+         * execute the bottom half we won't continue with the cleanup.
+         */
+        vhost->started = false;
+    }
+}
+
 static int vhost_user_dev_start(struct vhost_dev *dev, bool started)
 {
     if (!virtio_has_feature(dev->protocol_features,