diff mbox series

restrict legal sdev_state transitions via sysfs

Message ID 20220924000241.2967323-1-ushankar@purestorage.com
State New
Headers show
Series restrict legal sdev_state transitions via sysfs | expand

Commit Message

Uday Shankar Sept. 24, 2022, 12:02 a.m. UTC
Userspace can currently write to sysfs to transition sdev_state to
RUNNING or OFFLINE from any source state. This causes issues because
proper transitioning out of some states involves steps besides just
changing sdev_state, so allowing userspace to change sdev_state
regardless of the source state can result in inconsistencies; e.g. with
iscsi we can end up with sdev_state == SDEV_RUNNING while the device
queue is quiesced. Any task attempting IO on the device will then hang,
and in more recent kernels, iscsid will hang as well. More detail about
this bug is provided in my first attempt:
https://groups.google.com/g/open-iscsi/c/PNKca4HgPDs/m/CXaDkntOAQAJ

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Suggested-by: Mike Christie <michael.christie@oracle.com>
---
Looking for feedback on the "allowed source states" list. The bug I hit
is solved by prohibiting transitions out of SDEV_BLOCKED, but I think
most others shouldn't be allowed either.

 drivers/scsi/scsi_sysfs.c | 8 ++++++++
 1 file changed, 8 insertions(+)


base-commit: 7f615c1b5986ff08a725ee489e838c90f8197bcd

Comments

Mike Christie Sept. 30, 2022, 2:57 a.m. UTC | #1
On 9/23/22 7:02 PM, Uday Shankar wrote:
> ---
> Looking for feedback on the "allowed source states" list. The bug I hit
> is solved by prohibiting transitions out of SDEV_BLOCKED, but I think
> most others shouldn't be allowed either.

I think it's ok to be restrictive:

1. BLOCKED is just broken. When the transport classes and scsi_lib transition
out of that state they do a lot more than just set the set. We are leaving
the kernel in mismatched state where the device state is running but the
block layerand transport classes are not ready for IO.

2. CREATED does not happen. We go into RUNNING then do scsi_sysfs_add_sdev so
userspace should not see the CREATED state.

3. I'm not 100% sure about SDEV_QUIESCE though. It looks like it has similar issues
as BLOCKED where scsi_device_resume will undo things it did in scsi_device_quiesce,
so we can't just set the state to RUNNING and expect things to work. I'm not
sure about the scsi_transport_spi cases.

It would be best for James or Hannes to comment because they know that code well.

4. The transport classes are the ones that put devs in SDEV_TRANSPORT_OFFLINE
and then transition them when they are ready. The block layer is at least in
the correct state, but the transport classes may not be ready for IO since they
are not expecting IO to be queued to them at that time.

> 
>  drivers/scsi/scsi_sysfs.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 9dad2fd5297f..b38c30fe681d 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -822,6 +822,14 @@ store_state_field(struct device *dev, struct device_attribute *attr,
>  	}
>  
>  	mutex_lock(&sdev->state_mutex);
> +	switch (sdev->sdev_state) {
> +	case SDEV_RUNNING:
> +	case SDEV_OFFLINE:
> +		break;
> +	default:
> +		mutex_unlock(&sdev->state_mutex);
> +		return -EINVAL;
> +	}
>  	if (sdev->sdev_state == SDEV_RUNNING && state == SDEV_RUNNING) {
>  		ret = 0;
>  	} else {
> 
> base-commit: 7f615c1b5986ff08a725ee489e838c90f8197bcd
Uday Shankar Oct. 6, 2022, 7:27 p.m. UTC | #2
On Thu, Sep 29, 2022 at 09:57:19PM -0500, Mike Christie wrote:
> On 9/23/22 7:02 PM, Uday Shankar wrote:
> > ---
> > Looking for feedback on the "allowed source states" list. The bug I hit
> > is solved by prohibiting transitions out of SDEV_BLOCKED, but I think
> > most others shouldn't be allowed either.
> 
> I think it's ok to be restrictive:
> 
> 1. BLOCKED is just broken. When the transport classes and scsi_lib transition
> out of that state they do a lot more than just set the set. We are leaving
> the kernel in mismatched state where the device state is running but the
> block layerand transport classes are not ready for IO.
> 
> 2. CREATED does not happen. We go into RUNNING then do scsi_sysfs_add_sdev so
> userspace should not see the CREATED state.
> 
> 3. I'm not 100% sure about SDEV_QUIESCE though. It looks like it has similar issues
> as BLOCKED where scsi_device_resume will undo things it did in scsi_device_quiesce,
> so we can't just set the state to RUNNING and expect things to work. I'm not
> sure about the scsi_transport_spi cases.
> 
> It would be best for James or Hannes to comment because they know that code well.

Adding Hannes to CC.

> 4. The transport classes are the ones that put devs in SDEV_TRANSPORT_OFFLINE
> and then transition them when they are ready. The block layer is at least in
> the correct state, but the transport classes may not be ready for IO since they
> are not expecting IO to be queued to them at that time.
> 
> > 
> >  drivers/scsi/scsi_sysfs.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> > index 9dad2fd5297f..b38c30fe681d 100644
> > --- a/drivers/scsi/scsi_sysfs.c
> > +++ b/drivers/scsi/scsi_sysfs.c
> > @@ -822,6 +822,14 @@ store_state_field(struct device *dev, struct device_attribute *attr,
> >  	}
> >  
> >  	mutex_lock(&sdev->state_mutex);
> > +	switch (sdev->sdev_state) {
> > +	case SDEV_RUNNING:
> > +	case SDEV_OFFLINE:
> > +		break;
> > +	default:
> > +		mutex_unlock(&sdev->state_mutex);
> > +		return -EINVAL;
> > +	}
> >  	if (sdev->sdev_state == SDEV_RUNNING && state == SDEV_RUNNING) {
> >  		ret = 0;
> >  	} else {
> > 
> > base-commit: 7f615c1b5986ff08a725ee489e838c90f8197bcd
>
Bart Van Assche Oct. 6, 2022, 7:53 p.m. UTC | #3
On 9/23/22 17:02, Uday Shankar wrote:
> Userspace can currently write to sysfs to transition sdev_state to
> RUNNING or OFFLINE from any source state. This causes issues because
> proper transitioning out of some states involves steps besides just
> changing sdev_state, so allowing userspace to change sdev_state
> regardless of the source state can result in inconsistencies; e.g. with
> iscsi we can end up with sdev_state == SDEV_RUNNING while the device
> queue is quiesced. Any task attempting IO on the device will then hang,
> and in more recent kernels, iscsid will hang as well. More detail about
> this bug is provided in my first attempt:
> https://groups.google.com/g/open-iscsi/c/PNKca4HgPDs/m/CXaDkntOAQAJ
> 
> Signed-off-by: Uday Shankar <ushankar@purestorage.com>
> Suggested-by: Mike Christie <michael.christie@oracle.com>
> ---
> Looking for feedback on the "allowed source states" list. The bug I hit
> is solved by prohibiting transitions out of SDEV_BLOCKED, but I think
> most others shouldn't be allowed either.
> 
>   drivers/scsi/scsi_sysfs.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 9dad2fd5297f..b38c30fe681d 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -822,6 +822,14 @@ store_state_field(struct device *dev, struct device_attribute *attr,
>   	}
>   
>   	mutex_lock(&sdev->state_mutex);
> +	switch (sdev->sdev_state) {
> +	case SDEV_RUNNING:
> +	case SDEV_OFFLINE:
> +		break;
> +	default:
> +		mutex_unlock(&sdev->state_mutex);
> +		return -EINVAL;
> +	}
>   	if (sdev->sdev_state == SDEV_RUNNING && state == SDEV_RUNNING) {
>   		ret = 0;
>   	} else {

The return value -EAGAIN might be more appropriate since it is not the value
written into sysfs that is invalid but the current state that is inappropriate
for the requested transition.

Thanks,

Bart.
Hannes Reinecke Oct. 12, 2022, 8:13 a.m. UTC | #4
On 9/30/22 04:57, Mike Christie wrote:
> On 9/23/22 7:02 PM, Uday Shankar wrote:
>> ---
>> Looking for feedback on the "allowed source states" list. The bug I hit
>> is solved by prohibiting transitions out of SDEV_BLOCKED, but I think
>> most others shouldn't be allowed either.
> 
> I think it's ok to be restrictive:
> 
> 1. BLOCKED is just broken. When the transport classes and scsi_lib transition
> out of that state they do a lot more than just set the set. We are leaving
> the kernel in mismatched state where the device state is running but the
> block layerand transport classes are not ready for IO.
> 
> 2. CREATED does not happen. We go into RUNNING then do scsi_sysfs_add_sdev so
> userspace should not see the CREATED state.
> 
> 3. I'm not 100% sure about SDEV_QUIESCE though. It looks like it has similar issues
> as BLOCKED where scsi_device_resume will undo things it did in scsi_device_quiesce,
> so we can't just set the state to RUNNING and expect things to work. I'm not
> sure about the scsi_transport_spi cases.
> 
> It would be best for James or Hannes to comment because they know that code well.
> 
Indeed, you are correct.
The only sensible state transitions to be modified from userland is 
indeed between SDEV_RUNNING and SDEV_OFFLINE.
All other states are in fact part of the SCSI midlayer operations, and 
there a quite strict rules on which state transitions are allowed and 
who should be initiating them.

So yes, I'm fine with this patch.

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
Martin K. Petersen Oct. 18, 2022, 3:52 a.m. UTC | #5
On Fri, 23 Sep 2022 18:02:42 -0600, Uday Shankar wrote:

> Userspace can currently write to sysfs to transition sdev_state to
> RUNNING or OFFLINE from any source state. This causes issues because
> proper transitioning out of some states involves steps besides just
> changing sdev_state, so allowing userspace to change sdev_state
> regardless of the source state can result in inconsistencies; e.g. with
> iscsi we can end up with sdev_state == SDEV_RUNNING while the device
> queue is quiesced. Any task attempting IO on the device will then hang,
> and in more recent kernels, iscsid will hang as well. More detail about
> this bug is provided in my first attempt:
> https://groups.google.com/g/open-iscsi/c/PNKca4HgPDs/m/CXaDkntOAQAJ
> 
> [...]

Applied to 6.1/scsi-fixes, thanks!

[1/1] restrict legal sdev_state transitions via sysfs
      https://git.kernel.org/mkp/scsi/c/2331ce6126be
diff mbox series

Patch

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 9dad2fd5297f..b38c30fe681d 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -822,6 +822,14 @@  store_state_field(struct device *dev, struct device_attribute *attr,
 	}
 
 	mutex_lock(&sdev->state_mutex);
+	switch (sdev->sdev_state) {
+	case SDEV_RUNNING:
+	case SDEV_OFFLINE:
+		break;
+	default:
+		mutex_unlock(&sdev->state_mutex);
+		return -EINVAL;
+	}
 	if (sdev->sdev_state == SDEV_RUNNING && state == SDEV_RUNNING) {
 		ret = 0;
 	} else {