diff mbox series

[1/5] scsi: hisi_sas: Add host_tagset_enable module param for v3 hw

Message ID 20250329073236.2300582-2-liyihang9@huawei.com
State New
Headers show
Series hisi_sas: Misc patches and cleanups | expand

Commit Message

Yihang Li March 29, 2025, 7:32 a.m. UTC
From: Xingui Yang <yangxingui@huawei.com>

After the driver exposes all HW queues and an application submits I/O to
multiple queues in parallel, if NCQ and non-NCQ commands are mixed on a SATA
disk, ata_qc_defer() causes non-NCQ commands to be requeued, possibly
forever. Add a host_tagset_enable module parameter to expose multiple queues
for v3 hw.

Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Signed-off-by: Yihang Li <liyihang9@huawei.com>
---
 drivers/scsi/hisi_sas/hisi_sas.h       |  1 +
 drivers/scsi/hisi_sas/hisi_sas_main.c  | 10 +++++----
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 30 ++++++++++++++++++++++++--
 3 files changed, 35 insertions(+), 6 deletions(-)

Comments

John Garry March 29, 2025, 8:50 a.m. UTC | #1
On 29/03/2025 07:32, Yihang Li wrote:

+

> From: Xingui Yang<yangxingui@huawei.com>
> 
> After the driver exposes all HW queues and an application submits I/O to
> multiple queues in parallel, if NCQ and non-NCQ commands are mixed on a
> SATA disk, ata_qc_defer() causes non-NCQ commands to be requeued, possibly
> forever.

I don't think that it is a good idea to mask out bugs with module 
parameters.

Was this the same libata/libsas issue reported some time ago?

> Add host_tagset_enable module parameter to expose multiple queues
> for v3 hw.
> 
> Signed-off-by: Xingui Yang<yangxingui@huawei.com>
> Signed-off-by: Yihang Li<liyihang9@huawei.com>
yangxingui March 29, 2025, 9:49 a.m. UTC | #2
Hi, John

On 2025/3/29 16:50, John Garry wrote:
> On 29/03/2025 07:32, Yihang Li wrote:
> 
> +
> 
>> From: Xingui Yang<yangxingui@huawei.com>
>>
>> After the driver exposes all HW queues and an application submits I/O to
>> multiple queues in parallel, if NCQ and non-NCQ commands are mixed on a
>> SATA disk, ata_qc_defer() causes non-NCQ commands to be requeued,
>> possibly forever.
> 
> I don't think that it is a good idea to mask out bugs with module 
> parameters.
> 
> Was this the same libata/libsas issue reported some time ago?

Yeah, related to this issue: 
https://lore.kernel.org/linux-block/eef1e927-c9b2-c61d-7f48-92e65d8b0418@huawei.com/

And Niklas tried to help fix this problem: 
https://lore.kernel.org/linux-scsi/ZynmfyDA9R-lrW71@ryzen/

Considering that there is no formal solution yet, and that our users rarely 
use SATA disks and SAS disks together on a single machine, this parameter 
lets them turn off the exposure of multiple queues when only SATA disks are 
used. It also makes it convenient to run performance comparisons between 
multiple hardware queues and a single queue.

Thanks,
Xingui
Niklas Cassel March 31, 2025, 8:40 a.m. UTC | #3
Hello Xingui,

On Sat, Mar 29, 2025 at 05:49:47PM +0800, yangxingui wrote:
> Hi, John
> 
> On 2025/3/29 16:50, John Garry wrote:
> > On 29/03/2025 07:32, Yihang Li wrote:
> > 
> > +
> > 
> > > From: Xingui Yang<yangxingui@huawei.com>
> > > 
> > > After the driver exposes all HW queues and an application submits I/O
> > > to multiple queues in parallel, if NCQ and non-NCQ commands are mixed
> > > on a SATA disk, ata_qc_defer() causes non-NCQ commands to be requeued,
> > > possibly forever.
> > 
> > I don't think that it is a good idea to mask out bugs with module
> > parameters.
> > 
> > Was this the same libata/libsas issue reported some time ago?
> 
> Yeah, related to this issue: https://lore.kernel.org/linux-block/eef1e927-c9b2-c61d-7f48-92e65d8b0418@huawei.com/
> 
> And Niklas tried to help fix this problem:
> https://lore.kernel.org/linux-scsi/ZynmfyDA9R-lrW71@ryzen/
> 
> Considering that there is no formal solution yet, and that our users rarely
> use SATA disks and SAS disks together on a single machine, this parameter
> lets them turn off the exposure of multiple queues when only SATA disks are
> used. It also makes it convenient to run performance comparisons between
> multiple hardware queues and a single queue.

The solution I sent is not good since it relies on EH.

One would need to come up with a better solution to fix libsas drivers,
possibly a workqueue.

I think Damien is planning to add a workqueue submit path to libata,
if so, perhaps we could base a better solution on top of that.


Kind regards,
Niklas
yangxingui April 1, 2025, 1:18 a.m. UTC | #4
On 2025/3/31 16:40, Niklas Cassel wrote:
> Hello Xingui,
> 
> On Sat, Mar 29, 2025 at 05:49:47PM +0800, yangxingui wrote:
>> Hi, John
>>
>> On 2025/3/29 16:50, John Garry wrote:
>>> On 29/03/2025 07:32, Yihang Li wrote:
>>>
>>> +
>>>
>>>> From: Xingui Yang<yangxingui@huawei.com>
>>>>
>>>> After the driver exposes all HW queues and an application submits I/O
>>>> to multiple queues in parallel, if NCQ and non-NCQ commands are mixed
>>>> on a SATA disk, ata_qc_defer() causes non-NCQ commands to be requeued,
>>>> possibly forever.
>>>
>>> I don't think that it is a good idea to mask out bugs with module
>>> parameters.
>>>
>>> Was this the same libata/libsas issue reported some time ago?
>>
>> Yeah, related to this issue: https://lore.kernel.org/linux-block/eef1e927-c9b2-c61d-7f48-92e65d8b0418@huawei.com/
>>
>> And Niklas tried to help fix this problem:
>> https://lore.kernel.org/linux-scsi/ZynmfyDA9R-lrW71@ryzen/
>>
>> Considering that there is no formal solution yet, and that our users rarely
>> use SATA disks and SAS disks together on a single machine, this parameter
>> lets them turn off the exposure of multiple queues when only SATA disks are
>> used. It also makes it convenient to run performance comparisons between
>> multiple hardware queues and a single queue.
> 
> The solution I sent is not good since it relies on EH.
> 
> One would need to come up with a better solution to fix libsas drivers,
> possibly a workqueue.
> 
> I think Damien is planning to add a workqueue submit path to libata,
> if so, perhaps we could base a better solution on top of that.

Thank you for the suggestion. As you said, we may need a better solution.

Thanks,
Xingui

Patch

diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h
index e17f5d8226bf..f93eefe68ccb 100644
--- a/drivers/scsi/hisi_sas/hisi_sas.h
+++ b/drivers/scsi/hisi_sas/hisi_sas.h
@@ -456,6 +456,7 @@  struct hisi_hba {
 	u32 intr_coal_count;	/* Interrupt count to coalesce */
 
 	int cq_nvecs;
+	unsigned int *reply_map;
 
 	/* bist */
 	enum sas_linkrate debugfs_bist_linkrate;
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 7a484ad0f9ab..3d2e8ec7f110 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -489,11 +489,12 @@  static int hisi_sas_queue_command(struct sas_task *task, gfp_t gfp_flags)
 	struct asd_sas_port *sas_port = device->port;
 	struct hisi_sas_device *sas_dev = device->lldd_dev;
 	bool internal_abort = sas_is_internal_abort(task);
+	struct request *rq = sas_task_find_rq(task);
 	struct hisi_sas_dq *dq = NULL;
 	struct hisi_sas_port *port;
 	struct hisi_hba *hisi_hba;
 	struct hisi_sas_slot *slot;
-	struct request *rq = NULL;
+	unsigned int dq_index;
 	struct device *dev;
 	int rc;
 
@@ -548,9 +549,10 @@  static int hisi_sas_queue_command(struct sas_task *task, gfp_t gfp_flags)
 				return -ECOMM;
 		}
 
-		rq = sas_task_find_rq(task);
-		if (rq) {
-			unsigned int dq_index;
+		if (hisi_hba->reply_map) {
+			dq_index = hisi_hba->reply_map[raw_smp_processor_id()];
+			dq = &hisi_hba->dq[dq_index];
+		} else if (rq) {
 			u32 blk_tag;
 
 			blk_tag = blk_mq_unique_tag(rq);
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 6a0656f3b596..a1f0a59a8c8d 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -561,7 +561,12 @@  MODULE_PARM_DESC(prot_mask, " host protection capabilities mask, def=0x0 ");
 /* the index of iopoll queues are bigger than interrupt queues' */
 static int experimental_iopoll_q_cnt;
 module_param(experimental_iopoll_q_cnt, int, 0444);
-MODULE_PARM_DESC(experimental_iopoll_q_cnt, "number of queues to be used as poll mode, def=0");
+MODULE_PARM_DESC(experimental_iopoll_q_cnt, "number of queues to be used as poll mode, def=0 &\n\t\t"
+		"this parameter is effective only if host_tagset_enable=1");
+
+static bool host_tagset_enable = true;
+module_param(host_tagset_enable, bool, 0444);
+MODULE_PARM_DESC(host_tagset_enable, "shared host tagset enable (0-1), def=1");
 
 static int debugfs_snapshot_regs_v3_hw(struct hisi_hba *hisi_hba);
 
@@ -2574,6 +2579,18 @@  static int interrupt_preinit_v3_hw(struct hisi_hba *hisi_hba)
 	hisi_hba->cq_nvecs = vectors - BASE_VECTORS_V3_HW - hisi_hba->iopoll_q_cnt;
 	shost->nr_hw_queues = hisi_hba->cq_nvecs + hisi_hba->iopoll_q_cnt;
 
+	if (!host_tagset_enable) {
+		shost->host_tagset = 0;
+		shost->nr_hw_queues = 1;
+		hisi_hba->reply_map = devm_kcalloc(&pdev->dev, nr_cpu_ids,
+				sizeof(unsigned int),
+				GFP_KERNEL);
+		if (!hisi_hba->reply_map) {
+			hisi_sas_v3_free_vectors(hisi_hba);
+			return -ENOMEM;
+		}
+	}
+
 	return devm_add_action(&pdev->dev, hisi_sas_v3_free_vectors, pdev);
 }
 
@@ -2615,6 +2632,7 @@  static int interrupt_init_v3_hw(struct hisi_hba *hisi_hba)
 		int nr = hisi_sas_intr_conv ? 16 : 16 + i;
 		unsigned long irqflags = hisi_sas_intr_conv ? IRQF_SHARED :
 							      IRQF_ONESHOT;
+		int cpu;
 
 		cq->irq_no = pci_irq_vector(pdev, nr);
 		rc = devm_request_threaded_irq(dev, cq->irq_no,
@@ -2632,6 +2650,9 @@  static int interrupt_init_v3_hw(struct hisi_hba *hisi_hba)
 			dev_err(dev, "could not get cq%d irq affinity!\n", i);
 			return -ENOENT;
 		}
+		if (!host_tagset_enable)
+			for_each_cpu(cpu, cq->irq_mask)
+				hisi_hba->reply_map[cpu] = i;
 	}
 
 	return 0;
@@ -3318,6 +3339,9 @@  static void hisi_sas_map_queues(struct Scsi_Host *shost)
 	struct blk_mq_queue_map *qmap;
 	int i, qoff;
 
+	if (shost->nr_hw_queues == 1)
+		return;
+
 	for (i = 0, qoff = 0; i < shost->nr_maps; i++) {
 		qmap = &shost->tag_set.map[i];
 		if (i == HCTX_TYPE_DEFAULT) {
@@ -3435,7 +3459,9 @@  hisi_sas_shost_alloc_pci(struct pci_dev *pdev)
 	if (check_fw_info_v3_hw(hisi_hba) < 0)
 		goto err_out;
 
-	if (experimental_iopoll_q_cnt < 0 ||
+	if (!host_tagset_enable)
+		dev_info(dev, "Shared host tag set disabled, iopoll queue cnt uses default 0\n");
+	else if (experimental_iopoll_q_cnt < 0 ||
 		experimental_iopoll_q_cnt >= hisi_hba->queue_count)
 		dev_err(dev, "iopoll queue count %d cannot exceed or equal 16, using default 0\n",
 			experimental_iopoll_q_cnt);