
[5/6] hisi_sas: add hisi_sas_slave_configure()

Message ID 1455625351-165881-6-git-send-email-john.garry@huawei.com
State Superseded

Commit Message

John Garry Feb. 16, 2016, 12:22 p.m. UTC
In high-datarate aging tests, it is found that
the SCSI framework can periodically issue LU resets
to the device because SCSI commands begin to time out.
The timeouts occur because TASK SET FULL may be
returned many times for the same command.
To overcome this, the queue depth for the device needs
to be reduced to 64 (from 256, as set in
sas_slave_configure()).

Signed-off-by: John Garry <john.garry@huawei.com>

---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

-- 
1.9.1

Comments

John Garry Feb. 16, 2016, 4:56 p.m. UTC | #1
On 16/02/2016 15:33, Hannes Reinecke wrote:
> On 02/16/2016 01:22 PM, John Garry wrote:
>> In high-datarate aging tests, it is found that
>> the SCSI framework can periodically issue LU resets
>> to the device because SCSI commands begin to time out.
>> The timeouts occur because TASK SET FULL may be
>> returned many times for the same command.
>> To overcome this, the queue depth for the device needs
>> to be reduced to 64 (from 256, as set in
>> sas_slave_configure()).
>>
> Hmm. TASK SET FULL should cause the queue depth to be reduced
> automatically, no?
>
> Cheers,
>
> Hannes
>

I need to double-check whether TASK SET FULL reduces the depth; I don't
think it does.

Regardless, I found we were getting a combination of commands being
retried due to TASK SET FULL and also SAS_QUEUE_FULL errors. The
SAS_QUEUE_FULL task errors certainly reduce the queue depth in
scsi_track_queue_full(). However, I found the tracking to be very slow,
and we were getting commands timing out before the queue depth fell far
enough.

It would be nice to change the default queue depth in
sas_slave_configure() to a lower value so we can avoid this patch, but I
am not sure whether other vendors' HBA performance would be affected.
From looking at the history of sas_slave_configure(), it seems the value
of 256 was inherited from the mpt2sas driver, so I'm not sure.

Cheers,
John
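For reference, the midlayer tracking mentioned above is driven roughly as
follows: each QUEUE FULL / TASK SET FULL event is fed to
scsi_track_queue_full(), which only lowers the device queue depth after the
condition has repeated, hence the slow ramp-down. Only
scsi_track_queue_full() and the scsi_device fields are existing kernel APIs;
the wrapper below is invented for illustration:

#include <scsi/scsi_device.h>

/*
 * Illustrative only: feed a QUEUE FULL / TASK SET FULL event to the
 * midlayer's tracking.  scsi_track_queue_full() deliberately waits for
 * the condition to repeat before it lowers the device queue depth,
 * which is why the ramp-down can lag behind a burst of TASK SET FULL
 * responses.
 */
static void example_note_task_set_full(struct scsi_device *sdev)
{
	/* Suggest dropping the depth by one; the midlayer decides when. */
	scsi_track_queue_full(sdev, sdev->queue_depth - 1);
}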
John Garry Feb. 18, 2016, 10:12 a.m. UTC | #2
On 18/02/2016 07:40, Hannes Reinecke wrote:
> On 02/16/2016 05:56 PM, John Garry wrote:
>> On 16/02/2016 15:33, Hannes Reinecke wrote:
>>> On 02/16/2016 01:22 PM, John Garry wrote:
>>>> In high-datarate aging tests, it is found that
>>>> the SCSI framework can periodically issue LU resets
>>>> to the device because SCSI commands begin to time out.
>>>> The timeouts occur because TASK SET FULL may be
>>>> returned many times for the same command.
>>>> To overcome this, the queue depth for the device needs
>>>> to be reduced to 64 (from 256, as set in
>>>> sas_slave_configure()).
>>>>
>>> Hmm. TASK SET FULL should cause the queue depth to be reduced
>>> automatically, no?
>>>
>>> Cheers,
>>>
>>> Hannes
>>>
>>
>> I need to double-check whether TASK SET FULL reduces the depth;
>> I don't think it does.
>>
>> Regardless, I found we were getting a combination of commands being
>> retried due to TASK SET FULL and also SAS_QUEUE_FULL errors. The
>> SAS_QUEUE_FULL task errors certainly reduce the queue depth in
>> scsi_track_queue_full(). However, I found the tracking to be very
>> slow, and we were getting commands timing out before the queue
>> depth fell far enough.
>>
>> It would be nice to change the default queue depth in
>> sas_slave_configure() to a lower value so we can avoid this patch,
>> but I am not sure whether other vendors' HBA performance would be
>> affected. From looking at the history of sas_slave_configure(), it
>> seems the value of 256 was inherited from the mpt2sas driver, so
>> I'm not sure.
>>
> Well, the classical thing would be to associate each request tag
> with a SAS task; or, in your case, associate each slot index with a
> request tag.
> You probably would need to reserve some slots for TMFs, ie you'd
> need to decrease the resulting ->can_queue variable by that.
> But once you've done that you shouldn't hit any QUEUE_FULL issues,
> as the block layer will ensure that no tags will be reused while the
> command is in flight.
> Plus this is something you really need to be doing if you ever
> consider moving to scsi-mq ...
>
> Cheers,
>
> Hannes
>

Hi,

So would you recommend this method under the assumption that the 
can_queue value for the host is similar to the queue depth for the device?

Regards,
John
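A minimal sketch of the tag-to-slot scheme Hannes describes above, assuming
a hypothetical 2048-entry slot table and an arbitrary TMF reservation; only
shost->can_queue, scmd->request->tag and struct hisi_hba are existing names,
the rest is invented for illustration:

#include <scsi/scsi_host.h>
#include <scsi/scsi_cmnd.h>
#include "hisi_sas.h"

#define SLOT_SKETCH_MAX_COMMANDS	2048	/* assumed slot count */
#define SLOT_SKETCH_RESERVED_TMF	8	/* assumed TMF reservation */

/* Advertise only the non-reserved slots to the SCSI midlayer. */
static void slot_sketch_set_can_queue(struct Scsi_Host *shost)
{
	shost->can_queue = SLOT_SKETCH_MAX_COMMANDS - SLOT_SKETCH_RESERVED_TMF;
}

/*
 * Slot selection: normal I/O reuses the block-layer request tag, which
 * is guaranteed unique while the command is in flight, so no search of
 * a slot bitmap and no QUEUE_FULL handling is needed.  TMFs and other
 * internal commands come from the reserved tail region beyond can_queue.
 */
static int slot_sketch_get_index(struct hisi_hba *hisi_hba,
				 struct scsi_cmnd *scmd)
{
	if (scmd)
		return scmd->request->tag;

	/* slot_sketch_alloc_tmf() is hypothetical, not existing code. */
	return slot_sketch_alloc_tmf(hisi_hba);
}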
John Garry Feb. 18, 2016, 10:57 a.m. UTC | #3
On 18/02/2016 10:30, Hannes Reinecke wrote:
> On 02/18/2016 11:12 AM, John Garry wrote:
>> On 18/02/2016 07:40, Hannes Reinecke wrote:
> [ .. ]
>>> Well, the classical thing would be to associate each request tag
>>> with a SAS task; or, in your case, associate each slot index with a
>>> request tag.
>>> You probably would need to reserve some slots for TMFs, ie you'd
>>> need to decrease the resulting ->can_queue variable by that.
>>> But once you've done that you shouldn't hit any QUEUE_FULL issues,
>>> as the block layer will ensure that no tags will be reused while the
>>> command is in flight.
>>> Plus this is something you really need to be doing if you ever
>>> consider moving to scsi-mq ...
>>>
>>> Cheers,
>>>
>>> Hannes
>>>
>> Hi,
>>
>> So would you recommend this method under the assumption that the
>> can_queue value for the host is similar to the queue depth for the
>> device?
>>
> That depends.
> Typically the can_queue setting reflects the number of commands the
> _host_ can queue internally (due to hardware limitations etc).
> They do not necessarily reflect the queue depth for the device
> (unless you have a single device, of course).
> So if the host has a hardware limit on the number of commands it can
> queue, it should set the 'can_queue' variable to the appropriate
> number; a host-wide shared tag map is always assumed with recent
> kernels.
>
> The queue_depth of an individual device is controlled by the
> 'cmd_per_lun' setting, and of course capped by can_queue.
>
> But yes, I definitely recommend this method.
> It saves one _so much_ time trying to figure out which command slot
> to use. The drawback is that you have to have some sort of fixed
> order on the slots to do an efficient lookup.
>
> Cheers,
>
> Hannes
>


I would like to make a point on cmd_per_lun before considering tagging
slots: for our host, can_queue is considerably greater than cmd_per_lun
(even though we initially set them to the same value in the host template,
which would be incorrect). Regardless, I find the host's cmd_per_lun is
effectively ignored for the slave device queue depth, as it is reset in
sas_slave_configure() to 256 [if this function is used and tagging is
enabled]. So if we choose a reasonable cmd_per_lun for our host, it is
ignored, right? Or am I missing something?

Thanks,
John
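For context on why cmd_per_lun ends up being ignored here: libsas's
sas_slave_configure() resets the device queue depth to its own fixed default
for tagged devices. The snippet below paraphrases that depth-setting step
from memory (it is not verbatim libsas source); SAS_DEF_QD stands for the
256 discussed above:

#include <scsi/scsi_device.h>

#define SAS_DEF_QD	256	/* the libsas default the thread refers to */

/* Paraphrased: the depth-setting part of libsas's sas_slave_configure(). */
static void sas_slave_configure_depth_paraphrase(struct scsi_device *sdev)
{
	if (sdev->tagged_supported)
		/* Overrides whatever cmd_per_lun the LLDD template set. */
		scsi_change_queue_depth(sdev, SAS_DEF_QD);
	else
		scsi_change_queue_depth(sdev, 1);
}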
John Garry Feb. 19, 2016, 10:46 a.m. UTC | #4
On 18/02/2016 10:57, John Garry wrote:
> On 18/02/2016 10:30, Hannes Reinecke wrote:
>> On 02/18/2016 11:12 AM, John Garry wrote:
>>> On 18/02/2016 07:40, Hannes Reinecke wrote:
>> [ .. ]
>>>> Well, the classical thing would be to associate each request tag
>>>> with a SAS task; or, in your case, associate each slot index with a
>>>> request tag.
>>>> You probably would need to reserve some slots for TMFs, ie you'd
>>>> need to decrease the resulting ->can_queue variable by that.
>>>> But once you've done that you shouldn't hit any QUEUE_FULL issues,
>>>> as the block layer will ensure that no tags will be reused while the
>>>> command is in flight.
>>>> Plus this is something you really need to be doing if you ever
>>>> consider moving to scsi-mq ...
>>>>
>>>> Cheers,
>>>>
>>>> Hannes
>>>>
>>> Hi,
>>>
>>> So would you recommend this method under the assumption that the
>>> can_queue value for the host is similar to the queue depth for the
>>> device?
>>>
>> That depends.
>> Typically the can_queue setting reflects the number of commands the
>> _host_ can queue internally (due to hardware limitations etc).
>> They do not necessarily reflect the queue depth for the device
>> (unless you have a single device, of course).
>> So if the host has a hardware limit on the number of commands it can
>> queue, it should set the 'can_queue' variable to the appropriate
>> number; a host-wide shared tag map is always assumed with recent
>> kernels.
>>
>> The queue_depth of an individual device is controlled by the
>> 'cmd_per_lun' setting, and of course capped by can_queue.
>>
>> But yes, I definitely recommend this method.
>> It saves one _so much_ time trying to figure out which command slot
>> to use. The drawback is that you have to have some sort of fixed
>> order on the slots to do an efficient lookup.
>>
>> Cheers,
>>
>> Hannes
>>
>
> I would like to make a point on cmd_per_lun before considering tagging
> slots: for our host, can_queue is considerably greater than cmd_per_lun
> (even though we initially set them to the same value in the host
> template, which would be incorrect). Regardless, I find the host's
> cmd_per_lun is effectively ignored for the slave device queue depth, as
> it is reset in sas_slave_configure() to 256 [if this function is used
> and tagging is enabled]. So if we choose a reasonable cmd_per_lun for
> our host, it is ignored, right? Or am I missing something?
>
> Thanks,
> John
>
>


I would like to make another point about why I am making this change, in
case it is not clear. The queue full events are from
TRANS_TX_CREDIT_TIMEOUT_ERR and TRANS_TX_CLOSE_NORMAL_ERR errors in the
slot: I want the slot retried when this occurs, so I set the status to
SAS_QUEUE_FULL just so we will report DID_SOFT_ERROR to the SCSI midlayer
and get a retry. I could alternatively use SAS_OPEN_REJECT as the error,
which would have the same effect.
The queue full events are not from all slots being consumed in the HBA.

thanks again,
John
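A sketch of the kind of error translation described here, for illustration
only (this is not the actual hisi_sas v2 hw completion code);
TRANS_TX_CREDIT_TIMEOUT_ERR and TRANS_TX_CLOSE_NORMAL_ERR are the hardware
error codes named in the thread, and slot_err_code() is a hypothetical
accessor:

#include <scsi/libsas.h>
#include "hisi_sas.h"

static void slot_err_sketch(struct hisi_hba *hisi_hba,
			    struct sas_task *task,
			    struct hisi_sas_slot *slot)
{
	struct task_status_struct *ts = &task->task_status;
	/* slot_err_code() is a hypothetical accessor for the error field. */
	u32 err = slot_err_code(hisi_hba, slot);

	ts->resp = SAS_TASK_COMPLETE;

	switch (err) {
	case TRANS_TX_CREDIT_TIMEOUT_ERR:
	case TRANS_TX_CLOSE_NORMAL_ERR:
		/*
		 * Not a genuine "HBA out of slots" condition: the host may
		 * still have most slots free.  SAS_QUEUE_FULL is chosen only
		 * because it is reported to the midlayer as DID_SOFT_ERROR,
		 * which makes the command get retried.
		 */
		ts->stat = SAS_QUEUE_FULL;
		break;
	default:
		/* Other errors are handled elsewhere in the driver. */
		break;
	}
}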


John Garry Feb. 22, 2016, 10:02 a.m. UTC | #5
>> I would like to make another point about why I am making this change,
>> in case it is not clear. The queue full events are from
>> TRANS_TX_CREDIT_TIMEOUT_ERR and TRANS_TX_CLOSE_NORMAL_ERR errors in
>> the slot: I want the slot retried when this occurs, so I set the
>> status to SAS_QUEUE_FULL just so we will report DID_SOFT_ERROR to the
>> SCSI midlayer and get a retry. I could alternatively use
>> SAS_OPEN_REJECT as the error, which would have the same effect.
>> The queue full events are not from all slots being consumed in the HBA.
>>
> Ah, right. So you might be getting those errors even with some free
> slots on the HBA. As such they are roughly equivalent to a
> QUEUE_FULL SCSI status, right?
> So after reading the SPL I guess you are right here; using tags
> wouldn't help for this situation.
>


Yes, I think we have 90% of slots free in the host when this occurs for
one particular test - our v2 hw has 2K slots, which is much greater than
cmd_per_lun. The errors are equivalent to queue full for the device.

Thanks,
John

> Cheers,
>
> Hannes
>

Patch

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 65509eb..dde7c5a 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -454,6 +454,19 @@  static int hisi_sas_dev_found(struct domain_device *device)
 	return 0;
 }
 
+static int hisi_sas_slave_configure(struct scsi_device *sdev)
+{
+	struct domain_device *dev = sdev_to_domain_dev(sdev);
+	int ret = sas_slave_configure(sdev);
+
+	if (ret)
+		return ret;
+	if (!dev_is_sata(dev))
+		sas_change_queue_depth(sdev, 64);
+
+	return 0;
+}
+
 static void hisi_sas_scan_start(struct Scsi_Host *shost)
 {
 	struct hisi_hba *hisi_hba = shost_priv(shost);
@@ -997,7 +1010,7 @@  static struct scsi_host_template hisi_sas_sht = {
 	.name			= DRV_NAME,
 	.queuecommand		= sas_queuecommand,
 	.target_alloc		= sas_target_alloc,
-	.slave_configure	= sas_slave_configure,
+	.slave_configure	= hisi_sas_slave_configure,
 	.scan_finished		= hisi_sas_scan_finished,
 	.scan_start		= hisi_sas_scan_start,
 	.change_queue_depth	= sas_change_queue_depth,