diff mbox series

[v2,3/3] vdpa/mlx5: defer clear_virtqueues to until DRIVER_OK

Message ID 1612993680-29454-4-git-send-email-si-wei.liu@oracle.com
State New
Headers show
Series None | expand

Commit Message

Si-Wei Liu Feb. 10, 2021, 9:48 p.m. UTC
While virtq is stopped,  get_vq_state() is supposed to
be  called to  get  sync'ed  with  the latest internal
avail_index from device. The saved avail_index is used
to restate  the virtq  once device is started.  Commit
b35ccebe3ef7 introduced the clear_virtqueues() routine
to  reset  the saved  avail_index,  however, the index
gets cleared a bit earlier before get_vq_state() tries
to read it. This would cause consistency problems when
virtq is restarted, e.g. through a series of link down
and link up events. We  could  defer  the  clearing of
avail_index  to  until  the  device  is to be started,
i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in
set_status().

Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Eli Cohen Feb. 11, 2021, 7:33 a.m. UTC | #1
On Wed, Feb 10, 2021 at 01:48:00PM -0800, Si-Wei Liu wrote:
> While virtq is stopped,  get_vq_state() is supposed to

> be  called to  get  sync'ed  with  the latest internal

> avail_index from device. The saved avail_index is used

> to restate  the virtq  once device is started.  Commit

> b35ccebe3ef7 introduced the clear_virtqueues() routine

> to  reset  the saved  avail_index,  however, the index

> gets cleared a bit earlier before get_vq_state() tries

> to read it. This would cause consistency problems when

> virtq is restarted, e.g. through a series of link down

> and link up events. We  could  defer  the  clearing of

> avail_index  to  until  the  device  is to be started,

> i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in

> set_status().

> 

> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")

> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>

> Acked-by: Jason Wang <jasowang@redhat.com>


Acked-by: Eli Cohen <elic@nvidia.com>


> ---

>  drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-

>  1 file changed, 1 insertion(+), 1 deletion(-)

> 

> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c

> index 7c1f789..ce6aae8 100644

> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c

> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c

> @@ -1777,7 +1777,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)

>  	if (!status) {

>  		mlx5_vdpa_info(mvdev, "performing device reset\n");

>  		teardown_driver(ndev);

> -		clear_virtqueues(ndev);

>  		mlx5_vdpa_destroy_mr(&ndev->mvdev);

>  		ndev->mvdev.status = 0;

>  		++mvdev->generation;

> @@ -1786,6 +1785,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)

>  

>  	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {

>  		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {

> +			clear_virtqueues(ndev);

>  			err = setup_driver(ndev);

>  			if (err) {

>  				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");

> -- 

> 1.8.3.1

>
Eli Cohen Feb. 16, 2021, 3:21 p.m. UTC | #2
On Thu, Feb 11, 2021 at 09:33:14AM +0200, Eli Cohen wrote:
> On Wed, Feb 10, 2021 at 01:48:00PM -0800, Si-Wei Liu wrote:

> > While virtq is stopped,  get_vq_state() is supposed to

> > be  called to  get  sync'ed  with  the latest internal

> > avail_index from device. The saved avail_index is used

> > to restate  the virtq  once device is started.  Commit

> > b35ccebe3ef7 introduced the clear_virtqueues() routine

> > to  reset  the saved  avail_index,  however, the index

> > gets cleared a bit earlier before get_vq_state() tries

> > to read it. This would cause consistency problems when

> > virtq is restarted, e.g. through a series of link down

> > and link up events. We  could  defer  the  clearing of

> > avail_index  to  until  the  device  is to be started,

> > i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in

> > set_status().

> > 

> > Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")

> > Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>

> > Acked-by: Jason Wang <jasowang@redhat.com>

> 

> Acked-by: Eli Cohen <elic@nvidia.com>

> 


I take it back. I think we don't need to clear the indexes at all. In
case we need to restore indexes we'll get the right values through
set_vq_state(). If we suspend the virtqueue due to VM being suspended,
qemu will query first and will provide the the queried value. In case of
VM reboot, it will provide 0 in set_vq_state().

I am sending a patch that addresses both reboot and suspend.

> > ---

> >  drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-

> >  1 file changed, 1 insertion(+), 1 deletion(-)

> > 

> > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c

> > index 7c1f789..ce6aae8 100644

> > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c

> > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c

> > @@ -1777,7 +1777,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)

> >  	if (!status) {

> >  		mlx5_vdpa_info(mvdev, "performing device reset\n");

> >  		teardown_driver(ndev);

> > -		clear_virtqueues(ndev);

> >  		mlx5_vdpa_destroy_mr(&ndev->mvdev);

> >  		ndev->mvdev.status = 0;

> >  		++mvdev->generation;

> > @@ -1786,6 +1785,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)

> >  

> >  	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {

> >  		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {

> > +			clear_virtqueues(ndev);

> >  			err = setup_driver(ndev);

> >  			if (err) {

> >  				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");

> > -- 

> > 1.8.3.1

> >
Si-Wei Liu Feb. 17, 2021, 9:55 p.m. UTC | #3
On 2/16/2021 7:21 AM, Eli Cohen wrote:
> On Thu, Feb 11, 2021 at 09:33:14AM +0200, Eli Cohen wrote:

>> On Wed, Feb 10, 2021 at 01:48:00PM -0800, Si-Wei Liu wrote:

>>> While virtq is stopped,  get_vq_state() is supposed to

>>> be  called to  get  sync'ed  with  the latest internal

>>> avail_index from device. The saved avail_index is used

>>> to restate  the virtq  once device is started.  Commit

>>> b35ccebe3ef7 introduced the clear_virtqueues() routine

>>> to  reset  the saved  avail_index,  however, the index

>>> gets cleared a bit earlier before get_vq_state() tries

>>> to read it. This would cause consistency problems when

>>> virtq is restarted, e.g. through a series of link down

>>> and link up events. We  could  defer  the  clearing of

>>> avail_index  to  until  the  device  is to be started,

>>> i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in

>>> set_status().

>>>

>>> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index after change map")

>>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>

>>> Acked-by: Jason Wang <jasowang@redhat.com>

>> Acked-by: Eli Cohen <elic@nvidia.com>

>>

> I take it back. I think we don't need to clear the indexes at all. In

> case we need to restore indexes we'll get the right values through

> set_vq_state(). If we suspend the virtqueue due to VM being suspended,

> qemu will query first and will provide the the queried value. In case of

> VM reboot, it will provide 0 in set_vq_state().

>

> I am sending a patch that addresses both reboot and suspend.

With set_vq_state() repurposed to restoring used_index I'm fine with 
this approach.

Do I have to repost a v3 of this series while dropping the 3rd patch?

-Siwei
>

>>> ---

>>>   drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +-

>>>   1 file changed, 1 insertion(+), 1 deletion(-)

>>>

>>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c

>>> index 7c1f789..ce6aae8 100644

>>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c

>>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c

>>> @@ -1777,7 +1777,6 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)

>>>   	if (!status) {

>>>   		mlx5_vdpa_info(mvdev, "performing device reset\n");

>>>   		teardown_driver(ndev);

>>> -		clear_virtqueues(ndev);

>>>   		mlx5_vdpa_destroy_mr(&ndev->mvdev);

>>>   		ndev->mvdev.status = 0;

>>>   		++mvdev->generation;

>>> @@ -1786,6 +1785,7 @@ static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)

>>>   

>>>   	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {

>>>   		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {

>>> +			clear_virtqueues(ndev);

>>>   			err = setup_driver(ndev);

>>>   			if (err) {

>>>   				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");

>>> -- 

>>> 1.8.3.1

>>>
Jason Wang Feb. 18, 2021, 6:37 a.m. UTC | #4
On 2021/2/18 上午5:55, Si-Wei Liu wrote:
>

>

> On 2/16/2021 7:21 AM, Eli Cohen wrote:

>> On Thu, Feb 11, 2021 at 09:33:14AM +0200, Eli Cohen wrote:

>>> On Wed, Feb 10, 2021 at 01:48:00PM -0800, Si-Wei Liu wrote:

>>>> While virtq is stopped,  get_vq_state() is supposed to

>>>> be  called to  get  sync'ed  with  the latest internal

>>>> avail_index from device. The saved avail_index is used

>>>> to restate  the virtq  once device is started.  Commit

>>>> b35ccebe3ef7 introduced the clear_virtqueues() routine

>>>> to  reset  the saved  avail_index,  however, the index

>>>> gets cleared a bit earlier before get_vq_state() tries

>>>> to read it. This would cause consistency problems when

>>>> virtq is restarted, e.g. through a series of link down

>>>> and link up events. We  could  defer  the  clearing of

>>>> avail_index  to  until  the  device  is to be started,

>>>> i.e. until  VIRTIO_CONFIG_S_DRIVER_OK  is set again in

>>>> set_status().

>>>>

>>>> Fixes: b35ccebe3ef7 ("vdpa/mlx5: Restore the hardware used index 

>>>> after change map")

>>>> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>

>>>> Acked-by: Jason Wang <jasowang@redhat.com>

>>> Acked-by: Eli Cohen <elic@nvidia.com>

>>>

>> I take it back. I think we don't need to clear the indexes at all. In

>> case we need to restore indexes we'll get the right values through

>> set_vq_state(). If we suspend the virtqueue due to VM being suspended,

>> qemu will query first and will provide the the queried value. In case of

>> VM reboot, it will provide 0 in set_vq_state().

>>

>> I am sending a patch that addresses both reboot and suspend.

> With set_vq_state() repurposed to restoring used_index I'm fine with 

> this approach.

>

> Do I have to repost a v3 of this series while dropping the 3rd patch?

>

> -Siwei 



Yes, please.

Thanks
diff mbox series

Patch

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 7c1f789..ce6aae8 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -1777,7 +1777,6 @@  static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
 	if (!status) {
 		mlx5_vdpa_info(mvdev, "performing device reset\n");
 		teardown_driver(ndev);
-		clear_virtqueues(ndev);
 		mlx5_vdpa_destroy_mr(&ndev->mvdev);
 		ndev->mvdev.status = 0;
 		++mvdev->generation;
@@ -1786,6 +1785,7 @@  static void mlx5_vdpa_set_status(struct vdpa_device *vdev, u8 status)
 
 	if ((status ^ ndev->mvdev.status) & VIRTIO_CONFIG_S_DRIVER_OK) {
 		if (status & VIRTIO_CONFIG_S_DRIVER_OK) {
+			clear_virtqueues(ndev);
 			err = setup_driver(ndev);
 			if (err) {
 				mlx5_vdpa_warn(mvdev, "failed to setup driver\n");