diff mbox series

scsi: sd: unregister device if device_add_disk() failed in sd_probe()

Message ID 20231208082335.1754205-1-linan666@huaweicloud.com
State New
Headers show
Series scsi: sd: unregister device if device_add_disk() failed in sd_probe() | expand

Commit Message

Li Nan Dec. 8, 2023, 8:23 a.m. UTC
From: Li Nan <linan122@huawei.com>

"if device_add() succeeds, you should call device_del() when you want to
get rid of it."

In sd_probe(), device_add_disk() fails when device_add() has already
succeeded, so change put_device() to device_unregister() to ensure device
resources are released.

Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
Signed-off-by: Li Nan <linan122@huawei.com>
---
 drivers/scsi/sd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Li Nan Jan. 29, 2024, 1:26 p.m. UTC | #1
friendly ping ...

在 2023/12/8 16:23, linan666@huaweicloud.com 写道:
> From: Li Nan <linan122@huawei.com>
> 
> "if device_add() succeeds, you should call device_del() when you want to
> get rid of it."
> 
> In sd_probe(), device_add_disk() fails when device_add() has already
> succeeded, so change put_device() to device_unregister() to ensure device
> resources are released.
> 
> Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
> Signed-off-by: Li Nan <linan122@huawei.com>
> ---
>   drivers/scsi/sd.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 542a4bbb21bc..d81cbeee06eb 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3736,7 +3736,7 @@ static int sd_probe(struct device *dev)
>   
>   	error = device_add_disk(dev, gd, NULL);
>   	if (error) {
> -		put_device(&sdkp->disk_dev);
> +		device_unregister(&sdkp->disk_dev);
>   		put_disk(gd);
>   		goto out;
>   	}
Luis Chamberlain Jan. 29, 2024, 5:46 p.m. UTC | #2
On Fri, Dec 22, 2023 at 04:27:16PM +0800, Yu Kuai wrote:
> Hi,
> 
> 在 2023/12/22 14:49, Luis Chamberlain 写道:
> > On Fri, Dec 08, 2023 at 04:23:35PM +0800, linan666@huaweicloud.com wrote:
> > > From: Li Nan <linan122@huawei.com>
> > > 
> > > "if device_add() succeeds, you should call device_del() when you want to
> > > get rid of it."
> > > 
> > > In sd_probe(), device_add_disk() fails when device_add() has already
> > > succeeded, so change put_device() to device_unregister() to ensure device
> > > resources are released.
> > > 
> > > Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
> > > Signed-off-by: Li Nan <linan122@huawei.com>
> > 
> > Nacked-by: Luis Chamberlain <mcgrof@kernel.org>
> > 
> > > ---
> > >   drivers/scsi/sd.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> > > index 542a4bbb21bc..d81cbeee06eb 100644
> > > --- a/drivers/scsi/sd.c
> > > +++ b/drivers/scsi/sd.c
> > > @@ -3736,7 +3736,7 @@ static int sd_probe(struct device *dev)
> > >   	error = device_add_disk(dev, gd, NULL);
> > >   	if (error) {
> > > -		put_device(&sdkp->disk_dev);
> > > +		device_unregister(&sdkp->disk_dev);
> > >   		put_disk(gd);
> > >   		goto out;
> > >   	}
> > 
> > This is incorrect, device_unregister() calls:
> > 
> > void device_unregister(struct device *dev)
> > {
> > 	pr_debug("device: '%s': %s\n", dev_name(dev), __func__);
> > 	device_del(dev);
> > 	put_device(dev);
> > }
> > 
> > So you're adding what you believe to be a correct missing device_del().
> > But what you missed is that if device_add_disk() fails then device_add()
> > did not succeed because the new code we have in the kernel *today* unwinds
> > this for us now.
> 
> I'm confused here, there are two device here, one is 'sdkp->disk_dev',
> one is gendisk->part0->bd_device, and the order in which they
> initialize:
> 
> sd_probe
> device_add(&sdkp->disk_dev) -> succeed
> device_add_disk -> failed, and device_add(bd_device) did not succeed
> put_device(&sdkp->disk_dev) -> device_del is missed
> 
> I don't see that if device_add_disk() fail, device_del() for
> 'sdkp->disk_dev'is called from anywhere. Do I missing anything?

Ah then the fix is still incorrect and the commit log should
describe that this is for another device.

How about this instead?
Yu Kuai Jan. 30, 2024, 1:30 a.m. UTC | #3
Hi,

在 2024/01/30 1:46, Luis Chamberlain 写道:
> On Fri, Dec 22, 2023 at 04:27:16PM +0800, Yu Kuai wrote:
>> Hi,
>>
>> 在 2023/12/22 14:49, Luis Chamberlain 写道:
>>> On Fri, Dec 08, 2023 at 04:23:35PM +0800, linan666@huaweicloud.com wrote:
>>>> From: Li Nan <linan122@huawei.com>
>>>>
>>>> "if device_add() succeeds, you should call device_del() when you want to
>>>> get rid of it."
>>>>
>>>> In sd_probe(), device_add_disk() fails when device_add() has already
>>>> succeeded, so change put_device() to device_unregister() to ensure device
>>>> resources are released.
>>>>
>>>> Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
>>>> Signed-off-by: Li Nan <linan122@huawei.com>
>>>
>>> Nacked-by: Luis Chamberlain <mcgrof@kernel.org>
>>>
>>>> ---
>>>>    drivers/scsi/sd.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>>>> index 542a4bbb21bc..d81cbeee06eb 100644
>>>> --- a/drivers/scsi/sd.c
>>>> +++ b/drivers/scsi/sd.c
>>>> @@ -3736,7 +3736,7 @@ static int sd_probe(struct device *dev)
>>>>    	error = device_add_disk(dev, gd, NULL);
>>>>    	if (error) {
>>>> -		put_device(&sdkp->disk_dev);
>>>> +		device_unregister(&sdkp->disk_dev);
>>>>    		put_disk(gd);
>>>>    		goto out;
>>>>    	}
>>>
>>> This is incorrect, device_unregister() calls:
>>>
>>> void device_unregister(struct device *dev)
>>> {
>>> 	pr_debug("device: '%s': %s\n", dev_name(dev), __func__);
>>> 	device_del(dev);
>>> 	put_device(dev);
>>> }
>>>
>>> So you're adding what you believe to be a correct missing device_del().
>>> But what you missed is that if device_add_disk() fails then device_add()
>>> did not succeed because the new code we have in the kernel *today* unwinds
>>> this for us now.
>>
>> I'm confused here, there are two device here, one is 'sdkp->disk_dev',
>> one is gendisk->part0->bd_device, and the order in which they
>> initialize:
>>
>> sd_probe
>> device_add(&sdkp->disk_dev) -> succeed
>> device_add_disk -> failed, and device_add(bd_device) did not succeed
>> put_device(&sdkp->disk_dev) -> device_del is missed
>>
>> I don't see that if device_add_disk() fail, device_del() for
>> 'sdkp->disk_dev'is called from anywhere. Do I missing anything?
> 
> Ah then the fix is still incorrect and the commit log should
> describe that this is for another device.
> 
> How about this instead?
> 
>>From c3f6e03f4a82aa253b6c487a293dcd576393b606 Mon Sep 17 00:00:00 2001
> From: Luis Chamberlain <mcgrof@kernel.org>
> Date: Mon, 29 Jan 2024 09:25:18 -0800
> Subject: [PATCH] sd: remove extra put_device() for extra scsi device
> 
> The sd driver first device_add() its own device, and later use
> device_add_disk() with another device. When we added error handling
> for device_add_disk() we now call put_disk() and that will trigger
> disk_release() when the refcount is 0. That will end up calling
> the block driver's disk->fops->free_disk() if one is defined. The

This is incorrect. GD_ADDED will only set when device_add_disk()
succeed, and free_disk() will only be called from disk_release() if
GD_ADDED is set. I think Li Nan's patch is correct.

> sd driver has scsi_disk_free_disk() as its free_disk() and that
> does the proper put_device(&sdkp->disk_dev) for us so we should not
> need to call it, however we are left still missing the device_del()
> for it.
> 
> While at it, unwind with scsi_autopm_put_device(sdp) *prior* to
> putting to device as we do in sd_remove().
> 
> Reported-by: Li Nan <linan122@huawei.com>
> Reported-by: Yu Kuai <yukuai1@huaweicloud.com>
> Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>   drivers/scsi/sd.c | 8 +++++---
>   1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 7f949adbadfd..6475a3c947f8 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3693,8 +3693,9 @@ static int sd_probe(struct device *dev)
>   
>   	error = device_add(&sdkp->disk_dev);
>   	if (error) {
> +		scsi_autopm_put_device(sdp);
>   		put_device(&sdkp->disk_dev);
> -		goto out;
> +		return error;

I don't see why this is necessary, the tag 'out' is still there. If
you think is a problem, I think you need a separate patch to call
scsi_autopm_put_device() before putting the device.

Thanks,
Kuai

>   	}
>   
>   	dev_set_drvdata(dev, sdkp);
> @@ -3734,9 +3735,10 @@ static int sd_probe(struct device *dev)
>   
>   	error = device_add_disk(dev, gd, NULL);
>   	if (error) {
> -		put_device(&sdkp->disk_dev);
> +		scsi_autopm_put_device(sdp);
> +		device_del(&sdkp->disk_dev);
>   		put_disk(gd);
> -		goto out;
> +		return error;
>   	}
>   
>   	if (sdkp->security) {
>
Li Nan Feb. 22, 2024, 9:24 a.m. UTC | #4
friendly ping...

在 2023/12/8 16:23, linan666@huaweicloud.com 写道:
> From: Li Nan <linan122@huawei.com>
> 
> "if device_add() succeeds, you should call device_del() when you want to
> get rid of it."
> 
> In sd_probe(), device_add_disk() fails when device_add() has already
> succeeded, so change put_device() to device_unregister() to ensure device
> resources are released.
> 
> Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
> Signed-off-by: Li Nan <linan122@huawei.com>
> ---
>   drivers/scsi/sd.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 542a4bbb21bc..d81cbeee06eb 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3736,7 +3736,7 @@ static int sd_probe(struct device *dev)
>   
>   	error = device_add_disk(dev, gd, NULL);
>   	if (error) {
> -		put_device(&sdkp->disk_dev);
> +		device_unregister(&sdkp->disk_dev);
>   		put_disk(gd);
>   		goto out;
>   	}
Bart Van Assche April 1, 2024, 5:49 p.m. UTC | #5
On 12/8/23 00:23, linan666@huaweicloud.com wrote:
> From: Li Nan <linan122@huawei.com>
> 
> "if device_add() succeeds, you should call device_del() when you want to
> get rid of it."
> 
> In sd_probe(), device_add_disk() fails when device_add() has already
> succeeded, so change put_device() to device_unregister() to ensure device
> resources are released.
> 
> Fixes: 2a7a891f4c40 ("scsi: sd: Add error handling support for add_disk()")
> Signed-off-by: Li Nan <linan122@huawei.com>
> ---
>   drivers/scsi/sd.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index 542a4bbb21bc..d81cbeee06eb 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -3736,7 +3736,7 @@ static int sd_probe(struct device *dev)
>   
>   	error = device_add_disk(dev, gd, NULL);
>   	if (error) {
> -		put_device(&sdkp->disk_dev);
> +		device_unregister(&sdkp->disk_dev);
>   		put_disk(gd);
>   		goto out;
>   	}

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Bart Van Assche April 1, 2024, 5:51 p.m. UTC | #6
On 3/31/24 18:31, Yu Kuai wrote:
> I was shocked that this patch is still there. This patch is easy and
> straightforward.
> 
> LGTM
> Reviewed-by: Yu Kuai <yukuai3@huawei.com>
> 
> BTW, Nan, it will be better if you have a reporducer for this.
Agreed that a reproducer would help. No information is present in the
patch description about how this issue was detected nor about how it was
tested. That probably would have encouraged reviewers.

Bart.
Damien Le Moal April 2, 2024, 12:16 a.m. UTC | #7
On 4/2/24 02:51, Bart Van Assche wrote:
> On 3/31/24 18:31, Yu Kuai wrote:
>> I was shocked that this patch is still there. This patch is easy and
>> straightforward.
>>
>> LGTM
>> Reviewed-by: Yu Kuai <yukuai3@huawei.com>
>>
>> BTW, Nan, it will be better if you have a reporducer for this.
> Agreed that a reproducer would help. No information is present in the
> patch description about how this issue was detected nor about how it was
> tested. That probably would have encouraged reviewers.

May be submit the reproducer as a blktest case while at it ?

> 
> Bart.
>
Martin K. Petersen April 2, 2024, 1:48 a.m. UTC | #8
On Fri, 08 Dec 2023 16:23:35 +0800, linan666@huaweicloud.com wrote:

> "if device_add() succeeds, you should call device_del() when you want to
> get rid of it."
> 
> In sd_probe(), device_add_disk() fails when device_add() has already
> succeeded, so change put_device() to device_unregister() to ensure device
> resources are released.
> 
> [...]

Applied to 6.9/scsi-fixes, thanks!

[1/1] scsi: sd: unregister device if device_add_disk() failed in sd_probe()
      https://git.kernel.org/mkp/scsi/c/0296bea01cfa
diff mbox series

Patch

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 542a4bbb21bc..d81cbeee06eb 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3736,7 +3736,7 @@  static int sd_probe(struct device *dev)
 
 	error = device_add_disk(dev, gd, NULL);
 	if (error) {
-		put_device(&sdkp->disk_dev);
+		device_unregister(&sdkp->disk_dev);
 		put_disk(gd);
 		goto out;
 	}