[v5,0/3] Disable fair tag sharing for UFS devices

Message ID 20231114180426.1184601-1-bvanassche@acm.org

Message

Bart Van Assche Nov. 14, 2023, 6:04 p.m. UTC
Hi Jens,

The fair tag sharing algorithm significantly reduces performance for UFS
devices. There are several reasons: UFS devices have multiple logical units
and a limited queue depth (32 for UFS 3.1 devices), multiple logical units
are often accessed, and it takes time before tags are given back after
activity on a request queue has stopped. This patch series restores UFS
device performance to that of the legacy block layer by disabling fair tag
sharing for UFS devices.

Please consider this patch series for the next merge window.

Thanks,

Bart.

Changes compared to v4:
 - Rebased on top of kernel v6.7-rc1.

Changes compared to v3:
 - Instead of disabling fair tag sharing for all block drivers, introduce a
   flag for disabling it conditionally.

Changes between v2 and v3:
 - Rebased on top of the latest kernel.

Changes between v1 and v2:
 - Restored the tags->active_queues variable and thereby fixed the
   "uninitialized variable" warning reported by the kernel test robot.

Bart Van Assche (3):
  block: Introduce flag BLK_MQ_F_DISABLE_FAIR_TAG_SHARING
  scsi: core: Support disabling fair tag sharing
  scsi: ufs: Disable fair tag sharing

 block/blk-mq-debugfs.c    | 1 +
 block/blk-mq.h            | 3 ++-
 drivers/scsi/hosts.c      | 1 +
 drivers/scsi/scsi_lib.c   | 2 ++
 drivers/ufs/core/ufshcd.c | 1 +
 include/linux/blk-mq.h    | 1 +
 include/scsi/scsi_host.h  | 6 ++++++
 7 files changed, 14 insertions(+), 1 deletion(-)
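
The effect described in the cover letter can be illustrated with a small
user-space model. This is only a sketch, not the kernel code: the names
fair_share_depth() and may_queue() are hypothetical, and the rounding and the
floor of 4 tags per queue are assumptions modelled on hctx_may_queue() in
block/blk-mq.h. With 32 tags and two active request queues, each queue is
limited to 16 tags while fair sharing is enabled, even if the other queue is
not using its share.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the per-queue tag limit that applies while fair
 * tag sharing is active. Not the kernel implementation.
 */
static unsigned int fair_share_depth(unsigned int total_tags,
				     unsigned int active_queues)
{
	unsigned int depth;

	assert(active_queues > 0);
	/* Round up so that the per-queue shares cover all tags. */
	depth = (total_tags + active_queues - 1) / active_queues;
	/* Assumption: every active queue may use at least 4 tags. */
	return depth < 4 ? 4 : depth;
}

/*
 * Returns true if a queue with in_flight outstanding requests may allocate
 * one more tag. With fair sharing disabled the check is skipped, which is
 * what the proposed flag achieves in this model.
 */
static bool may_queue(unsigned int in_flight, unsigned int total_tags,
		      unsigned int active_queues, bool fair_sharing)
{
	if (!fair_sharing || active_queues == 0)
		return true;
	return in_flight < fair_share_depth(total_tags, active_queues);
}
```

In this model, may_queue(16, 32, 2, true) is false while
may_queue(16, 32, 2, false) is true: a single busy logical unit can only use
the full queue depth once fair sharing is disabled or the other queue is no
longer counted as active.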

Comments

Bart Van Assche Nov. 15, 2023, 6:19 p.m. UTC | #1
On 11/14/23 23:24, Yu Kuai wrote:
> On 2023/11/15 2:04, Bart Van Assche wrote:
>> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
>> index d7f51b84f3c7..872f87001374 100644
>> --- a/drivers/scsi/hosts.c
>> +++ b/drivers/scsi/hosts.c
>> @@ -442,6 +442,7 @@ struct Scsi_Host *scsi_host_alloc(const struct scsi_host_template *sht, int priv
>>       shost->no_write_same = sht->no_write_same;
>>       shost->host_tagset = sht->host_tagset;
>>       shost->queuecommand_may_block = sht->queuecommand_may_block;
>> +    shost->disable_fair_tag_sharing = sht->disable_fair_tag_sharing;
> 
> Can we also consider to disable fair tag sharing by default for the
> driver that total driver tags is less than a threshold?
I don't want to do this because such a change could disable fair tag
sharing for drivers that support both SSDs and hard disks being associated
with a single SCSI host.

Thanks,

Bart.
Yu Kuai Nov. 16, 2023, 1:08 a.m. UTC | #2
Hi,

On 2023/11/16 2:19, Bart Van Assche wrote:
> On 11/14/23 23:24, Yu Kuai wrote:
>> On 2023/11/15 2:04, Bart Van Assche wrote:
>>> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
>>> index d7f51b84f3c7..872f87001374 100644
>>> --- a/drivers/scsi/hosts.c
>>> +++ b/drivers/scsi/hosts.c
>>> @@ -442,6 +442,7 @@ struct Scsi_Host *scsi_host_alloc(const struct scsi_host_template *sht, int priv
>>>       shost->no_write_same = sht->no_write_same;
>>>       shost->host_tagset = sht->host_tagset;
>>>       shost->queuecommand_may_block = sht->queuecommand_may_block;
>>> +    shost->disable_fair_tag_sharing = sht->disable_fair_tag_sharing;
>>
>> Can we also consider to disable fair tag sharing by default for the
>> driver that total driver tags is less than a threshold?
> I don't want to do this because such a change could disable fair tag
> sharing for drivers that support both SSDs and hard disks being associated
> with a single SCSI host.

OK, then is it possible to add a sysfs entry to disable/enable fair
tag sharing manually?

Thanks,
Kuai

> 
> Thanks,
> 
> Bart.
> 
> .
>
Bart Van Assche Nov. 16, 2023, 9:35 p.m. UTC | #3
On 11/15/23 17:08, Yu Kuai wrote:
> On 2023/11/16 2:19, Bart Van Assche wrote:
>> On 11/14/23 23:24, Yu Kuai wrote:
>>> Can we also consider to disable fair tag sharing by default for the
>>> driver that total driver tags is less than a threshold?
>> I don't want to do this because such a change could disable fair tag
>> sharing for drivers that support both SSDs and hard disks being associated
>> with a single SCSI host.
> 
> Ok, then is this possible to add a sysfs entry to disable/enable fair
> tag sharing manually?

Hi Yu,

I will look into this.

Thanks,

Bart.
Bart Van Assche Nov. 20, 2023, 11:03 p.m. UTC | #4
On 11/15/23 17:08, Yu Kuai wrote:
> Hi,
> 
> On 2023/11/16 2:19, Bart Van Assche wrote:
>> On 11/14/23 23:24, Yu Kuai wrote:
>>> On 2023/11/15 2:04, Bart Van Assche wrote:
>>>> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
>>>> index d7f51b84f3c7..872f87001374 100644
>>>> --- a/drivers/scsi/hosts.c
>>>> +++ b/drivers/scsi/hosts.c
>>>> @@ -442,6 +442,7 @@ struct Scsi_Host *scsi_host_alloc(const struct scsi_host_template *sht, int priv
>>>>       shost->no_write_same = sht->no_write_same;
>>>>       shost->host_tagset = sht->host_tagset;
>>>>       shost->queuecommand_may_block = sht->queuecommand_may_block;
>>>> +    shost->disable_fair_tag_sharing = sht->disable_fair_tag_sharing;
>>>
>>> Can we also consider to disable fair tag sharing by default for the
>>> driver that total driver tags is less than a threshold?
>> I don't want to do this because such a change could disable fair tag
>> sharing for drivers that support both SSDs and hard disks being associated
>> with a single SCSI host.
> 
> Ok, then is this possible to add a sysfs entry to disable/enable fair
> tag sharing manually?

How about replacing patch 1/3 from this series with the patch below?

Thanks,

Bart.


     block: Make fair tag sharing configurable

     The fair sharing algorithm has a negative performance impact for storage
     devices for which the full queue depth is required to reach peak
     performance, e.g. UFS devices. This is because it takes long after a
     request queue became inactive until tags are reassigned to the active
     request queue(s). Since making tag sharing fair is not needed if the
     request processing latency is similar for all request queues, introduce
     a sysfs attribute for controlling the fair tag sharing algorithm.
     Increase BLK_MQ_F_ALLOC_POLICY_START_BIT to prevent that the fair tag
     sharing flag overlaps with the tag allocation policy.

     Cc: Christoph Hellwig <hch@lst.de>
     Cc: Martin K. Petersen <martin.petersen@oracle.com>
     Cc: Ming Lei <ming.lei@redhat.com>
     Cc: Keith Busch <kbusch@kernel.org>
     Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
     Cc: Yu Kuai <yukuai1@huaweicloud.com>
     Cc: Ed Tsai <ed.tsai@mediatek.com>
     Signed-off-by: Bart Van Assche <bvanassche@acm.org>

diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
index 1fe9a553c37b..7b66eb938882 100644
--- a/Documentation/ABI/stable/sysfs-block
+++ b/Documentation/ABI/stable/sysfs-block
@@ -269,6 +269,19 @@ Description:
  		specific passthrough mechanisms.


+What:		/sys/block/<disk>/queue/fair_sharing
+Date:		November 2023
+Contact:	linux-block@vger.kernel.org
+Description:
+		[RW] If hardware queues are shared across request queues, by
+		default the request tags are distributed evenly across the
+		active request queues. If the total number of tags is low and
+		if the workload differs per request queue this approach may
+		reduce throughput. This sysfs attribute controls whether or not
+		the fair tag sharing algorithm is enabled. 1 means enabled
+		while 0 means disabled.
+
+
  What:		/sys/block/<disk>/queue/fua
  Date:		May 2018
  Contact:	linux-block@vger.kernel.org
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 5cbeb9344f2f..f41408103106 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -198,6 +198,7 @@ static const char *const hctx_flag_name[] = {
  	HCTX_FLAG_NAME(NO_SCHED),
  	HCTX_FLAG_NAME(STACKING),
  	HCTX_FLAG_NAME(TAG_HCTX_SHARED),
+	HCTX_FLAG_NAME(DISABLE_FAIR_TAG_SHARING),
  };
  #undef HCTX_FLAG_NAME

diff --git a/block/blk-mq.h b/block/blk-mq.h
index f75a9ecfebde..eda6bd0611ea 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -416,7 +416,8 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
  {
  	unsigned int depth, users;

-	if (!hctx || !(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED))
+	if (!hctx || !(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED) ||
+	    (hctx->flags & BLK_MQ_F_DISABLE_FAIR_TAG_SHARING))
  		return true;

  	/*
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 63e481262336..f044bbe57509 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -473,6 +473,43 @@ static ssize_t queue_dax_show(struct request_queue *q, char *page)
  	return queue_var_show(blk_queue_dax(q), page);
  }

+static ssize_t queue_fair_sharing_show(struct request_queue *q, char *page)
+{
+	struct blk_mq_hw_ctx *hctx;
+	unsigned long i;
+	bool fair_sharing = true;
+
+	/* q->sysfs_lock serializes against blk_mq_realloc_hw_ctxs() */
+	queue_for_each_hw_ctx(q, hctx, i)
+		if (hctx->flags & BLK_MQ_F_DISABLE_FAIR_TAG_SHARING)
+			fair_sharing = false;
+
+	return sysfs_emit(page, "%u\n", fair_sharing);
+}
+
+static ssize_t queue_fair_sharing_store(struct request_queue *q,
+					const char *page, size_t count)
+{
+	struct blk_mq_hw_ctx *hctx;
+	unsigned long i;
+	int res, val;
+
+	res = kstrtoint(page, 0, &val);
+	if (res < 0)
+		return res;
+
+	/* q->sysfs_lock serializes against blk_mq_realloc_hw_ctxs() */
+	if (val) {
+		queue_for_each_hw_ctx(q, hctx, i)
+			hctx->flags &= ~BLK_MQ_F_DISABLE_FAIR_TAG_SHARING;
+	} else {
+		queue_for_each_hw_ctx(q, hctx, i)
+			hctx->flags |= BLK_MQ_F_DISABLE_FAIR_TAG_SHARING;
+	}
+
+	return count;
+}
+
  #define QUEUE_RO_ENTRY(_prefix, _name)			\
  static struct queue_sysfs_entry _prefix##_entry = {	\
  	.attr	= { .name = _name, .mode = 0444 },	\
@@ -542,6 +579,7 @@ QUEUE_RW_ENTRY(queue_nonrot, "rotational");
  QUEUE_RW_ENTRY(queue_iostats, "iostats");
  QUEUE_RW_ENTRY(queue_random, "add_random");
  QUEUE_RW_ENTRY(queue_stable_writes, "stable_writes");
+QUEUE_RW_ENTRY(queue_fair_sharing, "fair_sharing");

  #ifdef CONFIG_BLK_WBT
  static ssize_t queue_var_store64(s64 *var, const char *page)
@@ -664,6 +702,7 @@ static struct attribute *blk_mq_queue_attrs[] = {
  	&elv_iosched_entry.attr,
  	&queue_rq_affinity_entry.attr,
  	&queue_io_timeout_entry.attr,
+	&queue_fair_sharing_entry.attr,
  #ifdef CONFIG_BLK_WBT
  	&queue_wb_lat_entry.attr,
  #endif
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 1ab3081c82ed..fd5a51e8b628 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -662,7 +662,8 @@ enum {
  	 * or shared hwqs instead of 'mq-deadline'.
  	 */
  	BLK_MQ_F_NO_SCHED_BY_DEFAULT	= 1 << 7,
-	BLK_MQ_F_ALLOC_POLICY_START_BIT = 8,
+	BLK_MQ_F_DISABLE_FAIR_TAG_SHARING = 1 << 8,
+	BLK_MQ_F_ALLOC_POLICY_START_BIT = 16,
  	BLK_MQ_F_ALLOC_POLICY_BITS = 1,

  	BLK_MQ_S_STOPPED	= 0,
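
The commit message above says that BLK_MQ_F_ALLOC_POLICY_START_BIT has to be
increased so that the new flag does not overlap with the tag allocation
policy. The sketch below illustrates why, under the assumption that the
policy is stored in the flags word as a bit field that starts at the start
bit and is BLK_MQ_F_ALLOC_POLICY_BITS wide. The helpers alloc_policy() and
policy_to_flags() are hypothetical user-space stand-ins, not kernel functions.

```c
#include <assert.h>

enum {
	F_NO_SCHED_BY_DEFAULT		= 1 << 7,
	F_DISABLE_FAIR_TAG_SHARING	= 1 << 8,	/* new flag from the patch */
	ALLOC_POLICY_BITS		= 1,
};

/* Extract the allocation policy that is stored at start_bit. */
static unsigned int alloc_policy(unsigned int flags, unsigned int start_bit)
{
	return (flags >> start_bit) & ((1u << ALLOC_POLICY_BITS) - 1);
}

/* Encode an allocation policy at start_bit. */
static unsigned int policy_to_flags(unsigned int policy,
				    unsigned int start_bit)
{
	assert(policy < (1u << ALLOC_POLICY_BITS));
	return policy << start_bit;
}
```

If the policy field still started at bit 8, setting
F_DISABLE_FAIR_TAG_SHARING would turn a stored policy of 0 into 1. With the
field moved to bit 16, the policy reads back unchanged regardless of the new
flag.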
Yu Kuai Nov. 21, 2023, 1:35 a.m. UTC | #5
Hi,

On 2023/11/21 7:03, Bart Van Assche wrote:
> On 11/15/23 17:08, Yu Kuai wrote:
>> Hi,
>>
>> On 2023/11/16 2:19, Bart Van Assche wrote:
>>> On 11/14/23 23:24, Yu Kuai wrote:
>>>> On 2023/11/15 2:04, Bart Van Assche wrote:
>>>>> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
>>>>> index d7f51b84f3c7..872f87001374 100644
>>>>> --- a/drivers/scsi/hosts.c
>>>>> +++ b/drivers/scsi/hosts.c
>>>>> @@ -442,6 +442,7 @@ struct Scsi_Host *scsi_host_alloc(const struct scsi_host_template *sht, int priv
>>>>>       shost->no_write_same = sht->no_write_same;
>>>>>       shost->host_tagset = sht->host_tagset;
>>>>>       shost->queuecommand_may_block = sht->queuecommand_may_block;
>>>>> +    shost->disable_fair_tag_sharing = sht->disable_fair_tag_sharing;
>>>>
>>>> Can we also consider to disable fair tag sharing by default for the
>>>> driver that total driver tags is less than a threshold?
>>> I don't want to do this because such a change could disable fair tag
>>> sharing for drivers that support both SSDs and hard disks being associated
>>> with a single SCSI host.
>>
>> Ok, then is this possible to add a sysfs entry to disable/enable fair
>> tag sharing manually?
> 
> How about replacing patch 1/3 from this series with the patch below?
> 
> Thanks,
> 
> Bart.
> 
> 
>      block: Make fair tag sharing configurable
> 
>      The fair sharing algorithm has a negative performance impact for storage
>      devices for which the full queue depth is required to reach peak
>      performance, e.g. UFS devices. This is because it takes long after a
>      request queue became inactive until tags are reassigned to the active
>      request queue(s). Since making tag sharing fair is not needed if the
>      request processing latency is similar for all request queues, introduce
>      a sysfs attribute for controlling the fair tag sharing algorithm.
>      Increase BLK_MQ_F_ALLOC_POLICY_START_BIT to prevent that the fair tag
>      sharing flag overlaps with the tag allocation policy.
> 
>      Cc: Christoph Hellwig <hch@lst.de>
>      Cc: Martin K. Petersen <martin.petersen@oracle.com>
>      Cc: Ming Lei <ming.lei@redhat.com>
>      Cc: Keith Busch <kbusch@kernel.org>
>      Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
>      Cc: Yu Kuai <yukuai1@huaweicloud.com>
>      Cc: Ed Tsai <ed.tsai@mediatek.com>
>      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> 
> diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
> index 1fe9a553c37b..7b66eb938882 100644
> --- a/Documentation/ABI/stable/sysfs-block
> +++ b/Documentation/ABI/stable/sysfs-block
> @@ -269,6 +269,19 @@ Description:
>           specific passthrough mechanisms.
> 
> 
> +What:        /sys/block/<disk>/queue/fair_sharing
> +Date:        November 2023
> +Contact:    linux-block@vger.kernel.org
> +Description:
> +        [RW] If hardware queues are shared across request queues, by
> +        default the request tags are distributed evenly across the
> +        active request queues. If the total number of tags is low and
> +        if the workload differs per request queue this approach may
> +        reduce throughput. This sysfs attribute controls whether or not
> +        the fair tag sharing algorithm is enabled. 1 means enabled
> +        while 0 means disabled.
> +
> +
>   What:        /sys/block/<disk>/queue/fua
>   Date:        May 2018
>   Contact:    linux-block@vger.kernel.org
> diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
> index 5cbeb9344f2f..f41408103106 100644
> --- a/block/blk-mq-debugfs.c
> +++ b/block/blk-mq-debugfs.c
> @@ -198,6 +198,7 @@ static const char *const hctx_flag_name[] = {
>       HCTX_FLAG_NAME(NO_SCHED),
>       HCTX_FLAG_NAME(STACKING),
>       HCTX_FLAG_NAME(TAG_HCTX_SHARED),
> +    HCTX_FLAG_NAME(DISABLE_FAIR_TAG_SHARING),
>   };
>   #undef HCTX_FLAG_NAME
> 
> diff --git a/block/blk-mq.h b/block/blk-mq.h
> index f75a9ecfebde..eda6bd0611ea 100644
> --- a/block/blk-mq.h
> +++ b/block/blk-mq.h
> @@ -416,7 +416,8 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
>   {
>       unsigned int depth, users;
> 
> -    if (!hctx || !(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED))
> +    if (!hctx || !(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED) ||
> +        (hctx->flags & BLK_MQ_F_DISABLE_FAIR_TAG_SHARING))
>           return true;
> 
>       /*
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 63e481262336..f044bbe57509 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -473,6 +473,43 @@ static ssize_t queue_dax_show(struct request_queue *q, char *page)
>       return queue_var_show(blk_queue_dax(q), page);
>   }
> 
> +static ssize_t queue_fair_sharing_show(struct request_queue *q, char *page)
> +{
> +    struct blk_mq_hw_ctx *hctx;
> +    unsigned long i;
> +    bool fair_sharing = true;
> +
> +    /* q->sysfs_lock serializes against blk_mq_realloc_hw_ctxs() */
> +    queue_for_each_hw_ctx(q, hctx, i)
> +        if (hctx->flags & BLK_MQ_F_DISABLE_FAIR_TAG_SHARING)
> +            fair_sharing = false;
> +
> +    return sysfs_emit(page, "%u\n", fair_sharing);
> +}
> +
> +static ssize_t queue_fair_sharing_store(struct request_queue *q,
> +                    const char *page, size_t count)
> +{
> +    struct blk_mq_hw_ctx *hctx;
> +    unsigned long i;
> +    int res, val;
> +
> +    res = kstrtoint(page, 0, &val);
> +    if (res < 0)
> +        return res;
> +
> +    /* q->sysfs_lock serializes against blk_mq_realloc_hw_ctxs() */
> +    if (val) {
> +        queue_for_each_hw_ctx(q, hctx, i)
> +            hctx->flags &= ~BLK_MQ_F_DISABLE_FAIR_TAG_SHARING;
> +    } else {
> +        queue_for_each_hw_ctx(q, hctx, i)
> +            hctx->flags |= BLK_MQ_F_DISABLE_FAIR_TAG_SHARING;
> +    }

I'm not sure that changing just one queue, instead of all queues using the
same tag_set, won't cause any regression. For example,
BLK_MQ_F_TAG_QUEUE_SHARED is not cleared, and other queues are still
sharing tags fairly while this queue doesn't.

Perhaps we can add a helper similar to __blk_mq_update_nr_hw_queues
to update all queues using the same tag_set?

Thanks,
Kuai

> +
> +    return count;
> +}
> +
>   #define QUEUE_RO_ENTRY(_prefix, _name)            \
>   static struct queue_sysfs_entry _prefix##_entry = {    \
>       .attr    = { .name = _name, .mode = 0444 },    \
> @@ -542,6 +579,7 @@ QUEUE_RW_ENTRY(queue_nonrot, "rotational");
>   QUEUE_RW_ENTRY(queue_iostats, "iostats");
>   QUEUE_RW_ENTRY(queue_random, "add_random");
>   QUEUE_RW_ENTRY(queue_stable_writes, "stable_writes");
> +QUEUE_RW_ENTRY(queue_fair_sharing, "fair_sharing");
> 
>   #ifdef CONFIG_BLK_WBT
>   static ssize_t queue_var_store64(s64 *var, const char *page)
> @@ -664,6 +702,7 @@ static struct attribute *blk_mq_queue_attrs[] = {
>       &elv_iosched_entry.attr,
>       &queue_rq_affinity_entry.attr,
>       &queue_io_timeout_entry.attr,
> +    &queue_fair_sharing_entry.attr,
>   #ifdef CONFIG_BLK_WBT
>       &queue_wb_lat_entry.attr,
>   #endif
> diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
> index 1ab3081c82ed..fd5a51e8b628 100644
> --- a/include/linux/blk-mq.h
> +++ b/include/linux/blk-mq.h
> @@ -662,7 +662,8 @@ enum {
>        * or shared hwqs instead of 'mq-deadline'.
>        */
>       BLK_MQ_F_NO_SCHED_BY_DEFAULT    = 1 << 7,
> -    BLK_MQ_F_ALLOC_POLICY_START_BIT = 8,
> +    BLK_MQ_F_DISABLE_FAIR_TAG_SHARING = 1 << 8,
> +    BLK_MQ_F_ALLOC_POLICY_START_BIT = 16,
>       BLK_MQ_F_ALLOC_POLICY_BITS = 1,
> 
>       BLK_MQ_S_STOPPED    = 0,
> 
> 
> .
>
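
The concern raised in this reply, that flipping the flag on a single request
queue leaves the other queues of the same tag set in a different state, can
be illustrated with a simplified user-space model. Everything here is
hypothetical: struct model_queue and struct model_tag_set are stand-ins that
keep one flags word per queue, whereas the kernel keeps the flags per
hardware queue context and links the queues of a tag set through a list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DISABLE_FAIR_TAG_SHARING (1u << 8)

/* Hypothetical stand-ins for a request queue and a tag set. */
struct model_queue {
	unsigned int flags;
};

struct model_tag_set {
	struct model_queue *queues;
	size_t nr_queues;
};

/* Apply the setting to every queue that shares the tag set. */
static void set_fair_sharing(struct model_tag_set *set, bool enable)
{
	for (size_t i = 0; i < set->nr_queues; i++) {
		if (enable)
			set->queues[i].flags &= ~DISABLE_FAIR_TAG_SHARING;
		else
			set->queues[i].flags |= DISABLE_FAIR_TAG_SHARING;
	}
}

/* True if all queues of the tag set agree on the fair sharing setting. */
static bool fair_sharing_consistent(const struct model_tag_set *set)
{
	assert(set->nr_queues > 0);
	for (size_t i = 1; i < set->nr_queues; i++) {
		if ((set->queues[i].flags ^ set->queues[0].flags) &
		    DISABLE_FAIR_TAG_SHARING)
			return false;
	}
	return true;
}
```

Setting the flag directly on one queue makes fair_sharing_consistent()
return false in this model, while going through set_fair_sharing() keeps all
queues of the tag set in agreement.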
Bart Van Assche Nov. 27, 2023, 11:05 p.m. UTC | #6
On 11/22/23 22:29, Yu Kuai wrote:
> On 2023/11/22 3:32, Bart Van Assche wrote:
>> +static ssize_t queue_fair_sharing_store(struct request_queue *q,
>> +                    const char *page, size_t count)
>> +{
>> +    const unsigned int DFTS_BIT = ilog2(BLK_MQ_F_DISABLE_FAIR_TAG_SHARING);
>> +    struct blk_mq_tag_set *set = q->tag_set;
>> +    struct blk_mq_hw_ctx *hctx;
>> +    unsigned long i;
>> +    int res;
>> +    bool val;
>> +
>> +    res = kstrtobool(page, &val);
>> +    if (res < 0)
>> +        return res;
>> +
>> +    mutex_lock(&set->tag_list_lock);
>> +    clear_bit(DFTS_BIT, &set->flags);
>> +    list_for_each_entry(q, &set->tag_list, tag_set_list) {
>> +        /* Serialize against blk_mq_realloc_hw_ctxs() */
> 
> If set/clear bit concurrent with test bit from io path, will there be
> problem? Why don't freeze these queues?

If that happens the changes applied through this sysfs attribute may only take
effect after a short delay (depending on how fast changes are propagated from
one CPU to another). I don't think that this is an issue?

>> +#define QUEUE_RW_ENTRY_NO_SYSFS_MUTEX(_prefix, _name)       \
>> +    static struct queue_sysfs_entry _prefix##_entry = { \
>> +        .attr = { .name = _name, .mode = 0644 },    \
>> +        .show = _prefix##_show,                     \
>> +        .store = _prefix##_store,                   \
>> +        .no_sysfs_mutex = true,                     \
>> +    };
>> +
> 
> This actually change all the queues from the same tagset, can we add
> this new entry to /sys/class/scsi_host/hostx/xxx ?

That would make it impossible to disable fair tag sharing for block drivers
that are not based on the SCSI core. Are you sure that's what you want?

Thanks,

Bart.
Yu Kuai Nov. 28, 2023, 2:03 a.m. UTC | #7
Hi,

On 2023/11/28 7:05, Bart Van Assche wrote:
> On 11/22/23 22:29, Yu Kuai wrote:
>> On 2023/11/22 3:32, Bart Van Assche wrote:
>>> +static ssize_t queue_fair_sharing_store(struct request_queue *q,
>>> +                    const char *page, size_t count)
>>> +{
>>> +    const unsigned int DFTS_BIT = ilog2(BLK_MQ_F_DISABLE_FAIR_TAG_SHARING);
>>> +    struct blk_mq_tag_set *set = q->tag_set;
>>> +    struct blk_mq_hw_ctx *hctx;
>>> +    unsigned long i;
>>> +    int res;
>>> +    bool val;
>>> +
>>> +    res = kstrtobool(page, &val);
>>> +    if (res < 0)
>>> +        return res;
>>> +
>>> +    mutex_lock(&set->tag_list_lock);
>>> +    clear_bit(DFTS_BIT, &set->flags);
>>> +    list_for_each_entry(q, &set->tag_list, tag_set_list) {
>>> +        /* Serialize against blk_mq_realloc_hw_ctxs() */
>>
>> If set/clear bit concurrent with test bit from io path, will there be
>> problem? Why don't freeze these queues?
> 
> If that happens the changes applied through this sysfs attribute may only take
> effect after a short delay (depending on how fast changes are propagated from
> one CPU to another). I don't think that this is an issue?

Because wake_batch is not updated, the wait/wakeup behavior stays the
same as before fair tag sharing was disabled.

I was worried that there might be missed wakeups. Why not use
blk_mq_update_tag_set_shared() directly to disable tag sharing? And for
new disks, change blk_mq_add_queue_tag_set() to not set
BLK_MQ_F_TAG_QUEUE_SHARED as well. This way we only need a new flag for
the tag_set. That's why I want to add the new sysfs entry for scsi_host,
since there is no entry that represents a tag_set for now...

>>> +#define QUEUE_RW_ENTRY_NO_SYSFS_MUTEX(_prefix, _name)       \
>>> +    static struct queue_sysfs_entry _prefix##_entry = { \
>>> +        .attr = { .name = _name, .mode = 0644 },    \
>>> +        .show = _prefix##_show,                     \
>>> +        .store = _prefix##_store,                   \
>>> +        .no_sysfs_mutex = true,                     \
>>> +    };
>>> +
>>
>> This actually change all the queues from the same tagset, can we add
>> this new entry to /sys/class/scsi_host/hostx/xxx ?
> 
> That would make it impossible to disable fair tag sharing for block drivers
> that are not based on the SCSI core. Are you sure that's what you want?

Yes, if there are other drivers that are sharing driver tags, this is
not good, can you give some examples?

Thanks,
Kuai
> 
> Thanks,
> 
> Bart.
> .
>
Bart Van Assche Nov. 28, 2023, 6:17 p.m. UTC | #8
On 11/27/23 18:03, Yu Kuai wrote:
> I was worried that there might be missing wakeups, why not using
> blk_mq_update_tag_set_shared() directly to disable tag sharing?

I think that calling blk_mq_update_tag_set_shared() to disable tag sharing
would be wrong because BLK_MQ_F_TAG_QUEUE_SHARED is also used for other
purposes than fair tag sharing. See e.g. blk_mq_mark_tag_wait().

>>>> +#define QUEUE_RW_ENTRY_NO_SYSFS_MUTEX(_prefix, _name)       \
>>>> +    static struct queue_sysfs_entry _prefix##_entry = { \
>>>> +        .attr = { .name = _name, .mode = 0644 },    \
>>>> +        .show = _prefix##_show,                     \
>>>> +        .store = _prefix##_store,                   \
>>>> +        .no_sysfs_mutex = true,                     \
>>>> +    };
>>>> +
>>>
>>> This actually change all the queues from the same tagset, can we add
>>> this new entry to /sys/class/scsi_host/hostx/xxx ?
>>
>> That would make it impossible to disable fair tag sharing for block drivers
>> that are not based on the SCSI core. Are you sure that's what you want?
> 
> Yes, if there are other drivers that are sharing driver tags, this is
> not good, can you give some examples?

There is one tag set for all NVMe namespaces associated with the same
controller. Anyway, I will move this sysfs attribute to the SCSI host and
will organize the code such that a similar sysfs attribute can be added
easily to other block drivers than the SCSI core if that would be considered
useful.

Bart.