[01/15] md: remove the code to flush an old instance in md_open

Message ID 20210330161727.2297292-2-hch@lst.de
State New
Headers show
Series
  • [01/15] md: remove the code to flush an old instance in md_open
Related show

Commit Message

Christoph Hellwig March 30, 2021, 4:17 p.m.
Due to the flush_workqueue() call in md_alloc no previous instance of
mddev can still be around at this point.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/md.c | 35 +++++++----------------------------
 1 file changed, 7 insertions(+), 28 deletions(-)

Comments

heming.zhao@suse.com March 31, 2021, 3:29 a.m. | #1
On 3/31/21 12:17 AM, Christoph Hellwig wrote:
> Due to the flush_workqueue() call in md_alloc no previous instance of

> mddev can still be around at this point.

> 

> Signed-off-by: Christoph Hellwig <hch@lst.de>

> ---

>   drivers/md/md.c | 35 +++++++----------------------------

>   1 file changed, 7 insertions(+), 28 deletions(-)

> 

> diff --git a/drivers/md/md.c b/drivers/md/md.c

> index 368cad6cd53a6e..cd2d825dd4f881 100644

> --- a/drivers/md/md.c

> +++ b/drivers/md/md.c

> @@ -7807,43 +7807,22 @@ static int md_open(struct block_device *bdev, fmode_t mode)

>   	 * Succeed if we can lock the mddev, which confirms that

>   	 * it isn't being stopped right now.

>   	 */

> -	struct mddev *mddev = mddev_find(bdev->bd_dev);

> +	struct mddev *mddev = bdev->bd_disk->private_data;

>   	int err;

>   

> -	if (!mddev)

> -		return -ENODEV;

> -

> -	if (mddev->gendisk != bdev->bd_disk) {

> -		/* we are racing with mddev_put which is discarding this

> -		 * bd_disk.

> -		 */

> -		mddev_put(mddev);

> -		/* Wait until bdev->bd_disk is definitely gone */

> -		if (work_pending(&mddev->del_work))

> -			flush_workqueue(md_misc_wq);

> -		/* Then retry the open from the top */

> -		return -ERESTARTSYS;

> -	}

> -	BUG_ON(mddev != bdev->bd_disk->private_data);

> -

> -	if ((err = mutex_lock_interruptible(&mddev->open_mutex)))

> -		goto out;

> -

> +	err = mutex_lock_interruptible(&mddev->open_mutex);

> +	if (err)

> +		return err;

>   	if (test_bit(MD_CLOSING, &mddev->flags)) {

>   		mutex_unlock(&mddev->open_mutex);

> -		err = -ENODEV;

> -		goto out;

> +		return -ENODEV;

>   	}

> -

> -	err = 0;

> +	mddev_get(mddev);

>   	atomic_inc(&mddev->openers);

>   	mutex_unlock(&mddev->open_mutex);

>   

>   	bdev_check_media_change(bdev);

> - out:

> -	if (err)

> -		mddev_put(mddev);

> -	return err;

> +	return 0;

>   }

>   

>   static void md_release(struct gendisk *disk, fmode_t mode)

> 


Hello Christoph,

After applying your patch, the md_open() will be:
```
static int md_open(struct block_device *bdev, fmode_t mode)
{
     /* ...  */
     struct mddev *mddev = bdev->bd_disk->private_data;
     int err;

     err = mutex_lock_interruptible(&mddev->open_mutex);
     if (err)
         return err;

     if (test_bit(MD_CLOSING, &mddev->flags)) {
         mutex_unlock(&mddev->open_mutex);
         return -ENODEV;
     }

     mddev_get(mddev);
     atomic_inc(&mddev->openers);
     mutex_unlock(&mddev->open_mutex);

     bdev_check_media_change(bdev);
     return 0;
}
```

in clean path, MD_CLOSING only lives a very short time, then be cleaned in md_clean:
```
ioctl
  + test_and_set_bit(MD_CLOSING, &mddev->flags)
  + do_md_stop //case STOP_ARRAY
     md_clean
      mddev->flags = 0;
```

when userspace "mdadm -Ss" finish (the ioctl STOP_ARRAY returns),
mddev->flags will be zero. and you can see my patch email (date: 2021-3-30).
At this time, userspace will execute "mdadm --monitor" to scan the
closing md device. the md_open will trigger very soon. at this time,
bdev->bd_disk->private_data is only a skeleton, your shouldn't trust & use it.

So mddev with MD_CLOSING protection, the md_open is not safety.

PS:
Neil Brown legacy commit d3374825ce57ba2214d37502397 also describes this condition.

Thanks,
heming
Christoph Hellwig March 31, 2021, 6:53 a.m. | #2
On Wed, Mar 31, 2021 at 11:29:39AM +0800, heming.zhao@suse.com wrote:
> when userspace "mdadm -Ss" finish (the ioctl STOP_ARRAY returns),

> mddev->flags will be zero. and you can see my patch email (date: 2021-3-30).

> At this time, userspace will execute "mdadm --monitor" to scan the

> closing md device. the md_open will trigger very soon. at this time,

> bdev->bd_disk->private_data is only a skeleton, your shouldn't trust & use it.


Ermm, the block layer rules require the device to be fully set up
when add_disk is called.  So if that is not the case (and I'd like
to see hints how) we need to fix this properly instead of using a hack
in ->open.

Patch

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 368cad6cd53a6e..cd2d825dd4f881 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7807,43 +7807,22 @@  static int md_open(struct block_device *bdev, fmode_t mode)
 	 * Succeed if we can lock the mddev, which confirms that
 	 * it isn't being stopped right now.
 	 */
-	struct mddev *mddev = mddev_find(bdev->bd_dev);
+	struct mddev *mddev = bdev->bd_disk->private_data;
 	int err;
 
-	if (!mddev)
-		return -ENODEV;
-
-	if (mddev->gendisk != bdev->bd_disk) {
-		/* we are racing with mddev_put which is discarding this
-		 * bd_disk.
-		 */
-		mddev_put(mddev);
-		/* Wait until bdev->bd_disk is definitely gone */
-		if (work_pending(&mddev->del_work))
-			flush_workqueue(md_misc_wq);
-		/* Then retry the open from the top */
-		return -ERESTARTSYS;
-	}
-	BUG_ON(mddev != bdev->bd_disk->private_data);
-
-	if ((err = mutex_lock_interruptible(&mddev->open_mutex)))
-		goto out;
-
+	err = mutex_lock_interruptible(&mddev->open_mutex);
+	if (err)
+		return err;
 	if (test_bit(MD_CLOSING, &mddev->flags)) {
 		mutex_unlock(&mddev->open_mutex);
-		err = -ENODEV;
-		goto out;
+		return -ENODEV;
 	}
-
-	err = 0;
+	mddev_get(mddev);
 	atomic_inc(&mddev->openers);
 	mutex_unlock(&mddev->open_mutex);
 
 	bdev_check_media_change(bdev);
- out:
-	if (err)
-		mddev_put(mddev);
-	return err;
+	return 0;
 }
 
 static void md_release(struct gendisk *disk, fmode_t mode)