mbox series

[v6,00/16] Add Multi Circular Queue Support

Message ID cover.1669684648.git.quic_asutoshd@quicinc.com
Headers show
Series Add Multi Circular Queue Support | expand

Message

Asutosh Das Nov. 29, 2022, 1:20 a.m. UTC
UFS Multi-Circular Queue (MCQ) has been added in UFSHCI v4.0 to improve storage performance.
The implementation uses the shared tagging mechanism so that tags are shared
among the hardware queues. The number of hardware queues is configurable.
This series doesn't include the ESI implementation for completion handling.
This implementation has been verified by on a Qualcomm platform.

Please take a look and let us know your thoughts.

v5 -> v6:
- Addressed Mani's comments
- Addressed Bart's comments

v4 -> v5:
- Fixed failure to fallback to SDB during initialization
- Fixed failure when rpm-lvl=5 in the ufshcd_host_reset_and_restore() path
- Improved ufshcd_mcq_config_nr_queues() to handle different configurations
- Addressed Bart's comments
- Verified read/write using FIO, clock gating, runtime-pm[lvl=3, lvl=5]

v3 -> v4:
- Added a kernel module parameter to disable MCQ mode
- Added Bart's reviewed-by tag for some patches
- Addressed Bart's comments

v2 -> v3:
- Split ufshcd_config_mcq() into ufshcd_alloc_mcq() and ufshcd_config_mcq()
- Use devm_kzalloc() in ufshcd_mcq_init()
- Free memory and resource allocation on error paths
- Corrected typos in code comments

v1 -> v2:
- Added a non MCQ related change to use a function to extrace ufs extended
feature
- Addressed Mani's comments
- Addressed Bart's comments

v1:
- Split the changes
- Addressed Bart's comments
- Addressed Bean's comments

* RFC versions:
v2 -> v3:
- Split the changes based on functionality
- Addressed queue configuration issues
- Faster SQE tail pointer increments
- Addressed comments from Bart and Manivannan

v1 -> v2:
- Enabled host_tagset
- Added queue num configuration support
- Added one more vops to allow vendor provide the wanted MAC
- Determine nutrs and can_queue by considering both MAC, bqueuedepth and EXT_IID support
- Postponed MCQ initialization and scsi_add_host() to async probe
- Used (EXT_IID, Task Tag) tuple to support up to 4096 tasks (theoretically)


Asutosh Das (16):
  ufs: core: Optimize duplicate code to read extended feature
  ufs: core: Probe for ext_iid support
  ufs: core: Introduce Multi-circular queue capability
  ufs: core: Defer adding host to scsi if mcq is supported
  ufs: core: mcq: Add support to allocate multiple queues
  ufs: core: mcq: Configure resource regions
  ufs: core: mcq: Calculate queue depth
  ufs: core: mcq: Allocate memory for mcq mode
  ufs: core: mcq: Configure operation and runtime interface
  ufs: core: mcq: Use shared tags for MCQ mode
  ufs: core: Prepare ufshcd_send_command for mcq
  ufs: core: mcq: Find hardware queue to queue request
  ufs: core: Prepare for completion in mcq
  ufs: mcq: Add completion support of a cqe
  ufs: core: mcq: Add completion support in poll
  ufs: core: mcq: Enable Multi Circular Queue

 drivers/ufs/core/Makefile      |   2 +-
 drivers/ufs/core/ufs-mcq.c     | 416 +++++++++++++++++++++++++++++++++++++++++
 drivers/ufs/core/ufshcd-priv.h |  92 ++++++++-
 drivers/ufs/core/ufshcd.c      | 395 +++++++++++++++++++++++++++++++-------
 drivers/ufs/host/ufs-qcom.c    | 148 +++++++++++++++
 drivers/ufs/host/ufs-qcom.h    |   5 +
 include/ufs/ufs.h              |   6 +
 include/ufs/ufshcd.h           | 128 +++++++++++++
 include/ufs/ufshci.h           |  64 +++++++
 9 files changed, 1189 insertions(+), 67 deletions(-)
 create mode 100644 drivers/ufs/core/ufs-mcq.c

Comments

Manivannan Sadhasivam Nov. 29, 2022, 3:25 p.m. UTC | #1
On Mon, Nov 28, 2022 at 05:20:43PM -0800, Asutosh Das wrote:
> Task Tag is limited to 8 bits and this restricts the number
> of active IOs to 255.
> In Multi-circular queue mode, this may not be enough.
> The specification provides EXT_IID which can be used to increase
> the number of IOs if the UFS device and UFSHC support it.
> This patch adds support to probe for ext_iid support in
> ufs device and UFSHC.
> 
> Co-developed-by: Can Guo <quic_cang@quicinc.com>
> Signed-off-by: Can Guo <quic_cang@quicinc.com>
> Signed-off-by: Asutosh Das <quic_asutoshd@quicinc.com>

Here also, my tag is missed:

Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>

Thanks,
Mani

> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
> Reviewed-by: Avri Altman <avri.altman@wdc.com>
> ---
>  drivers/ufs/core/ufshcd.c | 31 +++++++++++++++++++++++++++++++
>  include/ufs/ufs.h         |  4 ++++
>  include/ufs/ufshcd.h      |  4 ++++
>  include/ufs/ufshci.h      |  7 +++++++
>  4 files changed, 46 insertions(+)
> 
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 6ea22b5..595fd3c 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -2258,6 +2258,10 @@ static inline int ufshcd_hba_capabilities(struct ufs_hba *hba)
>  	if (err)
>  		dev_err(hba->dev, "crypto setup failed\n");
>  
> +	hba->mcq_capabilities = ufshcd_readl(hba, REG_MCQCAP);
> +	hba->ext_iid_sup = FIELD_GET(MASK_EXT_IID_SUPPORT,
> +				     hba->mcq_capabilities);
> +
>  	return err;
>  }
>  
> @@ -7687,6 +7691,30 @@ static void ufshcd_temp_notif_probe(struct ufs_hba *hba, const u8 *desc_buf)
>  	}
>  }
>  
> +static void ufshcd_ext_iid_probe(struct ufs_hba *hba, u8 *desc_buf)
> +{
> +	struct ufs_dev_info *dev_info = &hba->dev_info;
> +	u32 ext_ufs_feature;
> +	u32 ext_iid_en = 0;
> +	int err;
> +
> +	/* Only UFS-4.0 and above may support EXT_IID */
> +	if (dev_info->wspecversion < 0x400)
> +		goto out;
> +
> +	ext_ufs_feature = ufs_get_ext_ufs_feature(hba, desc_buf);
> +	if (!(ext_ufs_feature & UFS_DEV_EXT_IID_SUP))
> +		goto out;
> +
> +	err = ufshcd_query_attr_retry(hba, UPIU_QUERY_OPCODE_READ_ATTR,
> +				      QUERY_ATTR_IDN_EXT_IID_EN, 0, 0, &ext_iid_en);
> +	if (err)
> +		dev_err(hba->dev, "failed reading bEXTIIDEn. err = %d\n", err);
> +
> +out:
> +	dev_info->b_ext_iid_en = ext_iid_en;
> +}
> +
>  void ufshcd_fixup_dev_quirks(struct ufs_hba *hba,
>  			     const struct ufs_dev_quirk *fixups)
>  {
> @@ -7785,6 +7813,9 @@ static int ufs_get_device_desc(struct ufs_hba *hba)
>  
>  	ufshcd_temp_notif_probe(hba, desc_buf);
>  
> +	if (hba->ext_iid_sup)
> +		ufshcd_ext_iid_probe(hba, desc_buf);
> +
>  	/*
>  	 * ufshcd_read_string_desc returns size of the string
>  	 * reset the error value
> diff --git a/include/ufs/ufs.h b/include/ufs/ufs.h
> index 1bba3fe..ba2a1d8 100644
> --- a/include/ufs/ufs.h
> +++ b/include/ufs/ufs.h
> @@ -165,6 +165,7 @@ enum attr_idn {
>  	QUERY_ATTR_IDN_AVAIL_WB_BUFF_SIZE       = 0x1D,
>  	QUERY_ATTR_IDN_WB_BUFF_LIFE_TIME_EST    = 0x1E,
>  	QUERY_ATTR_IDN_CURR_WB_BUFF_SIZE        = 0x1F,
> +	QUERY_ATTR_IDN_EXT_IID_EN		= 0x2A,
>  };
>  
>  /* Descriptor idn for Query requests */
> @@ -352,6 +353,7 @@ enum {
>  	UFS_DEV_EXT_TEMP_NOTIF		= BIT(6),
>  	UFS_DEV_HPB_SUPPORT		= BIT(7),
>  	UFS_DEV_WRITE_BOOSTER_SUP	= BIT(8),
> +	UFS_DEV_EXT_IID_SUP		= BIT(16),
>  };
>  #define UFS_DEV_HPB_SUPPORT_VERSION		0x310
>  
> @@ -601,6 +603,8 @@ struct ufs_dev_info {
>  
>  	bool	b_rpm_dev_flush_capable;
>  	u8	b_presrv_uspc_en;
> +	/* UFS EXT_IID Enable */
> +	bool	b_ext_iid_en;
>  };
>  
>  /*
> diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
> index 5cf81df..aec37cb9 100644
> --- a/include/ufs/ufshcd.h
> +++ b/include/ufs/ufshcd.h
> @@ -747,6 +747,7 @@ struct ufs_hba_monitor {
>   * @outstanding_lock: Protects @outstanding_reqs.
>   * @outstanding_reqs: Bits representing outstanding transfer requests
>   * @capabilities: UFS Controller Capabilities
> + * @mcq_capabilities: UFS Multi Circular Queue capabilities
>   * @nutrs: Transfer Request Queue depth supported by controller
>   * @nutmrs: Task Management Queue depth supported by controller
>   * @reserved_slot: Used to submit device commands. Protected by @dev_cmd.lock.
> @@ -830,6 +831,7 @@ struct ufs_hba_monitor {
>   *	device
>   * @complete_put: whether or not to call ufshcd_rpm_put() from inside
>   *	ufshcd_resume_complete()
> + * @ext_iid_sup: is EXT_IID is supported by UFSHC
>   */
>  struct ufs_hba {
>  	void __iomem *mmio_base;
> @@ -871,6 +873,7 @@ struct ufs_hba {
>  
>  	u32 capabilities;
>  	int nutrs;
> +	u32 mcq_capabilities;
>  	int nutmrs;
>  	u32 reserved_slot;
>  	u32 ufs_version;
> @@ -978,6 +981,7 @@ struct ufs_hba {
>  #endif
>  	u32 luns_avail;
>  	bool complete_put;
> +	bool ext_iid_sup;
>  };
>  
>  /* Returns true if clocks can be gated. Otherwise false */
> diff --git a/include/ufs/ufshci.h b/include/ufs/ufshci.h
> index f525566..4d4da06 100644
> --- a/include/ufs/ufshci.h
> +++ b/include/ufs/ufshci.h
> @@ -22,6 +22,7 @@ enum {
>  /* UFSHCI Registers */
>  enum {
>  	REG_CONTROLLER_CAPABILITIES		= 0x00,
> +	REG_MCQCAP				= 0x04,
>  	REG_UFS_VERSION				= 0x08,
>  	REG_CONTROLLER_DEV_ID			= 0x10,
>  	REG_CONTROLLER_PROD_ID			= 0x14,
> @@ -68,6 +69,12 @@ enum {
>  	MASK_OUT_OF_ORDER_DATA_DELIVERY_SUPPORT	= 0x02000000,
>  	MASK_UIC_DME_TEST_MODE_SUPPORT		= 0x04000000,
>  	MASK_CRYPTO_SUPPORT			= 0x10000000,
> +	MASK_MCQ_SUPPORT			= 0x40000000,
> +};
> +
> +/* MCQ capability mask */
> +enum {
> +	MASK_EXT_IID_SUPPORT = 0x00000400,
>  };
>  
>  #define UFS_MASK(mask, offset)		((mask) << (offset))
> -- 
> 2.7.4
>
Manivannan Sadhasivam Nov. 29, 2022, 3:28 p.m. UTC | #2
On Mon, Nov 28, 2022 at 05:20:46PM -0800, Asutosh Das wrote:
> Multi-circular queue (MCQ) has been added in UFSHC v4.0
> standard in addition to the Single Doorbell mode.
> The MCQ mode supports multiple submission and completion queues.
> Add support to allocate and configure the queues.
> Add module parameters support to configure the queues.
> 
> Co-developed-by: Can Guo <quic_cang@quicinc.com>
> Signed-off-by: Can Guo <quic_cang@quicinc.com>
> Signed-off-by: Asutosh Das <quic_asutoshd@quicinc.com>

Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>

Thanks,
Mani

> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/ufs/core/Makefile      |   2 +-
>  drivers/ufs/core/ufs-mcq.c     | 125 +++++++++++++++++++++++++++++++++++++++++
>  drivers/ufs/core/ufshcd-priv.h |   1 +
>  drivers/ufs/core/ufshcd.c      |  12 ++++
>  include/ufs/ufshcd.h           |   4 ++
>  5 files changed, 143 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/ufs/core/ufs-mcq.c
> 
> diff --git a/drivers/ufs/core/Makefile b/drivers/ufs/core/Makefile
> index 62f38c5..4d02e0f 100644
> --- a/drivers/ufs/core/Makefile
> +++ b/drivers/ufs/core/Makefile
> @@ -1,7 +1,7 @@
>  # SPDX-License-Identifier: GPL-2.0
>  
>  obj-$(CONFIG_SCSI_UFSHCD)		+= ufshcd-core.o
> -ufshcd-core-y				+= ufshcd.o ufs-sysfs.o
> +ufshcd-core-y				+= ufshcd.o ufs-sysfs.o ufs-mcq.o
>  ufshcd-core-$(CONFIG_DEBUG_FS)		+= ufs-debugfs.o
>  ufshcd-core-$(CONFIG_SCSI_UFS_BSG)	+= ufs_bsg.o
>  ufshcd-core-$(CONFIG_SCSI_UFS_CRYPTO)	+= ufshcd-crypto.o
> diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-mcq.c
> new file mode 100644
> index 0000000..bf08ec5
> --- /dev/null
> +++ b/drivers/ufs/core/ufs-mcq.c
> @@ -0,0 +1,125 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022 Qualcomm Innovation Center. All rights reserved.
> + *
> + * Authors:
> + *	Asutosh Das <quic_asutoshd@quicinc.com>
> + *	Can Guo <quic_cang@quicinc.com>
> + */
> +
> +#include <asm/unaligned.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/module.h>
> +#include <linux/platform_device.h>
> +#include "ufshcd-priv.h"
> +
> +#define MAX_QUEUE_SUP GENMASK(7, 0)
> +#define UFS_MCQ_MIN_RW_QUEUES 2
> +#define UFS_MCQ_MIN_READ_QUEUES 0
> +#define UFS_MCQ_NUM_DEV_CMD_QUEUES 1
> +#define UFS_MCQ_MIN_POLL_QUEUES 0
> +
> +static int rw_queue_count_set(const char *val, const struct kernel_param *kp)
> +{
> +	return param_set_uint_minmax(val, kp, UFS_MCQ_MIN_RW_QUEUES,
> +				     num_possible_cpus());
> +}
> +
> +static const struct kernel_param_ops rw_queue_count_ops = {
> +	.set = rw_queue_count_set,
> +	.get = param_get_uint,
> +};
> +
> +static unsigned int rw_queues;
> +module_param_cb(rw_queues, &rw_queue_count_ops, &rw_queues, 0644);
> +MODULE_PARM_DESC(rw_queues,
> +		 "Number of interrupt driven I/O queues used for rw. Default value is nr_cpus");
> +
> +static int read_queue_count_set(const char *val, const struct kernel_param *kp)
> +{
> +	return param_set_uint_minmax(val, kp, UFS_MCQ_MIN_READ_QUEUES,
> +				     num_possible_cpus());
> +}
> +
> +static const struct kernel_param_ops read_queue_count_ops = {
> +	.set = read_queue_count_set,
> +	.get = param_get_uint,
> +};
> +
> +static unsigned int read_queues;
> +module_param_cb(read_queues, &read_queue_count_ops, &read_queues, 0644);
> +MODULE_PARM_DESC(read_queues,
> +		 "Number of interrupt driven read queues used for read. Default value is 0");
> +
> +static int poll_queue_count_set(const char *val, const struct kernel_param *kp)
> +{
> +	return param_set_uint_minmax(val, kp, UFS_MCQ_MIN_POLL_QUEUES,
> +				     num_possible_cpus());
> +}
> +
> +static const struct kernel_param_ops poll_queue_count_ops = {
> +	.set = poll_queue_count_set,
> +	.get = param_get_uint,
> +};
> +
> +static unsigned int poll_queues = 1;
> +module_param_cb(poll_queues, &poll_queue_count_ops, &poll_queues, 0644);
> +MODULE_PARM_DESC(poll_queues,
> +		 "Number of poll queues used for r/w. Default value is 1");
> +
> +static int ufshcd_mcq_config_nr_queues(struct ufs_hba *hba)
> +{
> +	int i;
> +	u32 hba_maxq, rem, tot_queues;
> +	struct Scsi_Host *host = hba->host;
> +
> +	hba_maxq = FIELD_GET(MAX_QUEUE_SUP, hba->mcq_capabilities);
> +
> +	tot_queues = UFS_MCQ_NUM_DEV_CMD_QUEUES + read_queues + poll_queues +
> +			rw_queues;
> +
> +	if (hba_maxq < tot_queues) {
> +		dev_err(hba->dev, "Total queues (%d) exceeds HC capacity (%d)\n",
> +			tot_queues, hba_maxq);
> +		return -EOPNOTSUPP;
> +	}
> +
> +	rem = hba_maxq - UFS_MCQ_NUM_DEV_CMD_QUEUES;
> +
> +	if (rw_queues) {
> +		hba->nr_queues[HCTX_TYPE_DEFAULT] = rw_queues;
> +		rem -= hba->nr_queues[HCTX_TYPE_DEFAULT];
> +	} else {
> +		rw_queues = num_possible_cpus();
> +	}
> +
> +	if (poll_queues) {
> +		hba->nr_queues[HCTX_TYPE_POLL] = poll_queues;
> +		rem -= hba->nr_queues[HCTX_TYPE_POLL];
> +	}
> +
> +	if (read_queues) {
> +		hba->nr_queues[HCTX_TYPE_READ] = read_queues;
> +		rem -= hba->nr_queues[HCTX_TYPE_READ];
> +	}
> +
> +	if (!hba->nr_queues[HCTX_TYPE_DEFAULT])
> +		hba->nr_queues[HCTX_TYPE_DEFAULT] = min3(rem, rw_queues,
> +							 num_possible_cpus());
> +
> +	for (i = 0; i < HCTX_MAX_TYPES; i++)
> +		host->nr_hw_queues += hba->nr_queues[i];
> +
> +	hba->nr_hw_queues = host->nr_hw_queues + UFS_MCQ_NUM_DEV_CMD_QUEUES;
> +	return 0;
> +}
> +
> +int ufshcd_mcq_init(struct ufs_hba *hba)
> +{
> +	int ret;
> +
> +	ret = ufshcd_mcq_config_nr_queues(hba);
> +
> +	return ret;
> +}
> +
> diff --git a/drivers/ufs/core/ufshcd-priv.h b/drivers/ufs/core/ufshcd-priv.h
> index a9e8e1f..9368ba2 100644
> --- a/drivers/ufs/core/ufshcd-priv.h
> +++ b/drivers/ufs/core/ufshcd-priv.h
> @@ -61,6 +61,7 @@ int ufshcd_query_attr(struct ufs_hba *hba, enum query_opcode opcode,
>  int ufshcd_query_flag(struct ufs_hba *hba, enum query_opcode opcode,
>  	enum flag_idn idn, u8 index, bool *flag_res);
>  void ufshcd_auto_hibern8_update(struct ufs_hba *hba, u32 ahit);
> +int ufshcd_mcq_init(struct ufs_hba *hba);
>  
>  #define SD_ASCII_STD true
>  #define SD_RAW false
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 3c2220c..9b78814 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -8220,6 +8220,11 @@ static int ufshcd_add_lus(struct ufs_hba *hba)
>  	return ret;
>  }
>  
> +static int ufshcd_alloc_mcq(struct ufs_hba *hba)
> +{
> +	return ufshcd_mcq_init(hba);
> +}
> +
>  /**
>   * ufshcd_probe_hba - probe hba to detect device and initialize it
>   * @hba: per-adapter instance
> @@ -8269,6 +8274,13 @@ static int ufshcd_probe_hba(struct ufs_hba *hba, bool init_dev_params)
>  			goto out;
>  
>  		if (is_mcq_supported(hba)) {
> +			ret = ufshcd_alloc_mcq(hba);
> +			if (ret) {
> +				/* Continue with SDB mode */
> +				use_mcq_mode = false;
> +				dev_err(hba->dev, "MCQ mode is disabled, err=%d\n",
> +					 ret);
> +			}
>  			ret = scsi_add_host(host, hba->dev);
>  			if (ret) {
>  				dev_err(hba->dev, "scsi_add_host failed\n");
> diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
> index 70c0f9f..146b613 100644
> --- a/include/ufs/ufshcd.h
> +++ b/include/ufs/ufshcd.h
> @@ -829,6 +829,8 @@ struct ufs_hba_monitor {
>   *	ee_ctrl_mask
>   * @luns_avail: number of regular and well known LUNs supported by the UFS
>   *	device
> + * @nr_hw_queues: number of hardware queues configured
> + * @nr_queues: number of Queues of different queue types
>   * @complete_put: whether or not to call ufshcd_rpm_put() from inside
>   *	ufshcd_resume_complete()
>   * @ext_iid_sup: is EXT_IID is supported by UFSHC
> @@ -981,6 +983,8 @@ struct ufs_hba {
>  	u32 debugfs_ee_rate_limit_ms;
>  #endif
>  	u32 luns_avail;
> +	unsigned int nr_hw_queues;
> +	unsigned int nr_queues[HCTX_MAX_TYPES];
>  	bool complete_put;
>  	bool ext_iid_sup;
>  	bool mcq_sup;
> -- 
> 2.7.4
>
Manivannan Sadhasivam Nov. 29, 2022, 3:33 p.m. UTC | #3
On Mon, Nov 28, 2022 at 05:20:48PM -0800, Asutosh Das wrote:
> The ufs device defines the supported queuedepth by
> bqueuedepth which has a max value of 256.
> The HC defines MAC (Max Active Commands) that define
> the max number of commands that in flight to the ufs
> device.
> Calculate and configure the nutrs based on both these
> values.
> 
> Co-developed-by: Can Guo <quic_cang@quicinc.com>
> Signed-off-by: Can Guo <quic_cang@quicinc.com>
> Signed-off-by: Asutosh Das <quic_asutoshd@quicinc.com>

Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>

Thanks,
Mani

> ---
>  drivers/ufs/core/ufs-mcq.c     | 35 +++++++++++++++++++++++++++++++++++
>  drivers/ufs/core/ufshcd-priv.h |  9 +++++++++
>  drivers/ufs/core/ufshcd.c      | 17 ++++++++++++++++-
>  drivers/ufs/host/ufs-qcom.c    |  7 +++++++
>  drivers/ufs/host/ufs-qcom.h    |  1 +
>  include/ufs/ufs.h              |  2 ++
>  include/ufs/ufshcd.h           |  2 ++
>  include/ufs/ufshci.h           |  1 +
>  8 files changed, 73 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-mcq.c
> index d6807e3..6f66bd7 100644
> --- a/drivers/ufs/core/ufs-mcq.c
> +++ b/drivers/ufs/core/ufs-mcq.c
> @@ -19,6 +19,9 @@
>  #define UFS_MCQ_NUM_DEV_CMD_QUEUES 1
>  #define UFS_MCQ_MIN_POLL_QUEUES 0
>  
> +#define MAX_DEV_CMD_ENTRIES	2
> +#define MCQ_CFG_MAC_MASK	GENMASK(16, 8)
> +
>  static int rw_queue_count_set(const char *val, const struct kernel_param *kp)
>  {
>  	return param_set_uint_minmax(val, kp, UFS_MCQ_MIN_RW_QUEUES,
> @@ -67,6 +70,38 @@ module_param_cb(poll_queues, &poll_queue_count_ops, &poll_queues, 0644);
>  MODULE_PARM_DESC(poll_queues,
>  		 "Number of poll queues used for r/w. Default value is 1");
>  
> +/**
> + * ufshcd_mcq_decide_queue_depth - decide the queue depth
> + * @hba - per adapter instance
> + *
> + * Returns queue-depth on success, non-zero on error
> + *
> + * MAC - Max. Active Command of the Host Controller (HC)
> + * HC wouldn't send more than this commands to the device.
> + * It is mandatory to implement get_hba_mac() to enable MCQ mode.
> + * Calculates and adjusts the queue depth based on the depth
> + * supported by the HC and ufs device.
> + */
> +int ufshcd_mcq_decide_queue_depth(struct ufs_hba *hba)
> +{
> +	int mac;
> +
> +	/* Mandatory to implement get_hba_mac() */
> +	mac = ufshcd_mcq_vops_get_hba_mac(hba);
> +	if (mac < 0) {
> +		dev_err(hba->dev, "Failed to get mac, err=%d\n", mac);
> +		return mac;
> +	}
> +
> +	WARN_ON_ONCE(!hba->dev_info.bqueuedepth);
> +	/*
> +	 * max. value of bqueuedepth = 256, mac is host dependent.
> +	 * It is mandatory for UFS device to define bQueueDepth if
> +	 * shared queuing architecture is enabled.
> +	 */
> +	return min_t(int, mac, hba->dev_info.bqueuedepth);
> +}
> +
>  static int ufshcd_mcq_config_nr_queues(struct ufs_hba *hba)
>  {
>  	int i;
> diff --git a/drivers/ufs/core/ufshcd-priv.h b/drivers/ufs/core/ufshcd-priv.h
> index 74cb17b9..da974a9 100644
> --- a/drivers/ufs/core/ufshcd-priv.h
> +++ b/drivers/ufs/core/ufshcd-priv.h
> @@ -62,6 +62,7 @@ int ufshcd_query_flag(struct ufs_hba *hba, enum query_opcode opcode,
>  	enum flag_idn idn, u8 index, bool *flag_res);
>  void ufshcd_auto_hibern8_update(struct ufs_hba *hba, u32 ahit);
>  int ufshcd_mcq_init(struct ufs_hba *hba);
> +int ufshcd_mcq_decide_queue_depth(struct ufs_hba *hba);
>  
>  #define SD_ASCII_STD true
>  #define SD_RAW false
> @@ -235,6 +236,14 @@ static inline int ufshcd_vops_mcq_config_resource(struct ufs_hba *hba)
>  	return -EOPNOTSUPP;
>  }
>  
> +static inline int ufshcd_mcq_vops_get_hba_mac(struct ufs_hba *hba)
> +{
> +	if (hba->vops && hba->vops->get_hba_mac)
> +		return hba->vops->get_hba_mac(hba);
> +
> +	return -EOPNOTSUPP;
> +}
> +
>  extern const struct ufs_pm_lvl_states ufs_pm_lvl_states[];
>  
>  /**
> diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
> index 9b78814..e17159a 100644
> --- a/drivers/ufs/core/ufshcd.c
> +++ b/drivers/ufs/core/ufshcd.c
> @@ -7807,6 +7807,7 @@ static int ufs_get_device_desc(struct ufs_hba *hba)
>  	/* getting Specification Version in big endian format */
>  	dev_info->wspecversion = desc_buf[DEVICE_DESC_PARAM_SPEC_VER] << 8 |
>  				      desc_buf[DEVICE_DESC_PARAM_SPEC_VER + 1];
> +	dev_info->bqueuedepth = desc_buf[DEVICE_DESC_PARAM_Q_DPTH];
>  	b_ufs_feature_sup = desc_buf[DEVICE_DESC_PARAM_UFS_FEAT];
>  
>  	model_index = desc_buf[DEVICE_DESC_PARAM_PRDCT_NAME];
> @@ -8222,7 +8223,21 @@ static int ufshcd_add_lus(struct ufs_hba *hba)
>  
>  static int ufshcd_alloc_mcq(struct ufs_hba *hba)
>  {
> -	return ufshcd_mcq_init(hba);
> +	int ret;
> +	int old_nutrs = hba->nutrs;
> +
> +	ret = ufshcd_mcq_decide_queue_depth(hba);
> +	if (ret < 0)
> +		return ret;
> +
> +	hba->nutrs = ret;
> +	ret = ufshcd_mcq_init(hba);
> +	if (ret) {
> +		hba->nutrs = old_nutrs;
> +		return ret;
> +	}
> +
> +	return 0;
>  }
>  
>  /**
> diff --git a/drivers/ufs/host/ufs-qcom.c b/drivers/ufs/host/ufs-qcom.c
> index 6bea541..ad7cde2 100644
> --- a/drivers/ufs/host/ufs-qcom.c
> +++ b/drivers/ufs/host/ufs-qcom.c
> @@ -1526,6 +1526,12 @@ static int ufs_qcom_mcq_config_resource(struct ufs_hba *hba)
>  	return ret;
>  }
>  
> +static int ufs_qcom_get_hba_mac(struct ufs_hba *hba)
> +{
> +	/* Qualcomm HC supports up to 64 */
> +	return MAX_SUPP_MAC;
> +}
> +
>  /*
>   * struct ufs_hba_qcom_vops - UFS QCOM specific variant operations
>   *
> @@ -1550,6 +1556,7 @@ static const struct ufs_hba_variant_ops ufs_hba_qcom_vops = {
>  	.config_scaling_param = ufs_qcom_config_scaling_param,
>  	.program_key		= ufs_qcom_ice_program_key,
>  	.mcq_config_resource	= ufs_qcom_mcq_config_resource,
> +	.get_hba_mac		= ufs_qcom_get_hba_mac,
>  };
>  
>  /**
> diff --git a/drivers/ufs/host/ufs-qcom.h b/drivers/ufs/host/ufs-qcom.h
> index 44466a3..f86e532 100644
> --- a/drivers/ufs/host/ufs-qcom.h
> +++ b/drivers/ufs/host/ufs-qcom.h
> @@ -16,6 +16,7 @@
>  #define HBRN8_POLL_TOUT_MS      100
>  #define DEFAULT_CLK_RATE_HZ     1000000
>  #define BUS_VECTOR_NAME_LEN     32
> +#define MAX_SUPP_MAC		64
>  
>  #define UFS_HW_VER_MAJOR_SHFT	(28)
>  #define UFS_HW_VER_MAJOR_MASK	(0x000F << UFS_HW_VER_MAJOR_SHFT)
> diff --git a/include/ufs/ufs.h b/include/ufs/ufs.h
> index ba2a1d8..5112418 100644
> --- a/include/ufs/ufs.h
> +++ b/include/ufs/ufs.h
> @@ -591,6 +591,8 @@ struct ufs_dev_info {
>  	u8	*model;
>  	u16	wspecversion;
>  	u32	clk_gating_wait_us;
> +	/* Stores the depth of queue in UFS device */
> +	u8	bqueuedepth;
>  
>  	/* UFS HPB related flag */
>  	bool	hpb_enabled;
> diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
> index 0e21a6a..9d7829a 100644
> --- a/include/ufs/ufshcd.h
> +++ b/include/ufs/ufshcd.h
> @@ -298,6 +298,7 @@ struct ufs_pwr_mode_info {
>   * @program_key: program or evict an inline encryption key
>   * @event_notify: called to notify important events
>   * @mcq_config_resource: called to configure MCQ platform resources
> + * @get_hba_mac: called to get vendor specific mac value, mandatory for mcq mode
>   */
>  struct ufs_hba_variant_ops {
>  	const char *name;
> @@ -337,6 +338,7 @@ struct ufs_hba_variant_ops {
>  	void	(*event_notify)(struct ufs_hba *hba,
>  				enum ufs_event_type evt, void *data);
>  	int	(*mcq_config_resource)(struct ufs_hba *hba);
> +	int	(*get_hba_mac)(struct ufs_hba *hba);
>  };
>  
>  /* clock gating state  */
> diff --git a/include/ufs/ufshci.h b/include/ufs/ufshci.h
> index 4d4da06..67fcebd 100644
> --- a/include/ufs/ufshci.h
> +++ b/include/ufs/ufshci.h
> @@ -57,6 +57,7 @@ enum {
>  	REG_UFS_CCAP				= 0x100,
>  	REG_UFS_CRYPTOCAP			= 0x104,
>  
> +	REG_UFS_MCQ_CFG				= 0x380,
>  	UFSHCI_CRYPTO_REG_SPACE_SIZE		= 0x400,
>  };
>  
> -- 
> 2.7.4
>
Asutosh Das Nov. 29, 2022, 4:30 p.m. UTC | #4
On Tue, Nov 29 2022 at 07:39 -0800, Manivannan Sadhasivam wrote:
>Hi Asutosh,
>
>On Mon, Nov 28, 2022 at 05:20:41PM -0800, Asutosh Das wrote:
>> UFS Multi-Circular Queue (MCQ) has been added in UFSHCI v4.0 to improve storage performance.
>> The implementation uses the shared tagging mechanism so that tags are shared
>> among the hardware queues. The number of hardware queues is configurable.
>> This series doesn't include the ESI implementation for completion handling.
>> This implementation has been verified by on a Qualcomm platform.
>>
>> Please take a look and let us know your thoughts.
>>
>> v5 -> v6:
>> - Addressed Mani's comments
>> - Addressed Bart's comments
>>
>
>Thanks for the continuous update of the series. In this version, you
>seem to have missed some review tags from myself and Bart. Please
>make sure to collect all tags in the next version if you happen to
>send one.
>
Thanks Mani. I will send an updated series with the tags added today.

-asd

>Thanks,
>Mani
>
>> v4 -> v5:
>> - Fixed failure to fallback to SDB during initialization
>> - Fixed failure when rpm-lvl=5 in the ufshcd_host_reset_and_restore() path
>> - Improved ufshcd_mcq_config_nr_queues() to handle different configurations
>> - Addressed Bart's comments
>> - Verified read/write using FIO, clock gating, runtime-pm[lvl=3, lvl=5]
>>
>> v3 -> v4:
>> - Added a kernel module parameter to disable MCQ mode
>> - Added Bart's reviewed-by tag for some patches
>> - Addressed Bart's comments
>>
>> v2 -> v3:
>> - Split ufshcd_config_mcq() into ufshcd_alloc_mcq() and ufshcd_config_mcq()
>> - Use devm_kzalloc() in ufshcd_mcq_init()
>> - Free memory and resource allocation on error paths
>> - Corrected typos in code comments
>>
>> v1 -> v2:
>> - Added a non MCQ related change to use a function to extrace ufs extended
>> feature
>> - Addressed Mani's comments
>> - Addressed Bart's comments
>>
>> v1:
>> - Split the changes
>> - Addressed Bart's comments
>> - Addressed Bean's comments
>>
>> * RFC versions:
>> v2 -> v3:
>> - Split the changes based on functionality
>> - Addressed queue configuration issues
>> - Faster SQE tail pointer increments
>> - Addressed comments from Bart and Manivannan
>>
>> v1 -> v2:
>> - Enabled host_tagset
>> - Added queue num configuration support
>> - Added one more vops to allow vendor provide the wanted MAC
>> - Determine nutrs and can_queue by considering both MAC, bqueuedepth and EXT_IID support
>> - Postponed MCQ initialization and scsi_add_host() to async probe
>> - Used (EXT_IID, Task Tag) tuple to support up to 4096 tasks (theoretically)
>>
>>
>> Asutosh Das (16):
>>   ufs: core: Optimize duplicate code to read extended feature
>>   ufs: core: Probe for ext_iid support
>>   ufs: core: Introduce Multi-circular queue capability
>>   ufs: core: Defer adding host to scsi if mcq is supported
>>   ufs: core: mcq: Add support to allocate multiple queues
>>   ufs: core: mcq: Configure resource regions
>>   ufs: core: mcq: Calculate queue depth
>>   ufs: core: mcq: Allocate memory for mcq mode
>>   ufs: core: mcq: Configure operation and runtime interface
>>   ufs: core: mcq: Use shared tags for MCQ mode
>>   ufs: core: Prepare ufshcd_send_command for mcq
>>   ufs: core: mcq: Find hardware queue to queue request
>>   ufs: core: Prepare for completion in mcq
>>   ufs: mcq: Add completion support of a cqe
>>   ufs: core: mcq: Add completion support in poll
>>   ufs: core: mcq: Enable Multi Circular Queue
>>
>>  drivers/ufs/core/Makefile      |   2 +-
>>  drivers/ufs/core/ufs-mcq.c     | 416 +++++++++++++++++++++++++++++++++++++++++
>>  drivers/ufs/core/ufshcd-priv.h |  92 ++++++++-
>>  drivers/ufs/core/ufshcd.c      | 395 +++++++++++++++++++++++++++++++-------
>>  drivers/ufs/host/ufs-qcom.c    | 148 +++++++++++++++
>>  drivers/ufs/host/ufs-qcom.h    |   5 +
>>  include/ufs/ufs.h              |   6 +
>>  include/ufs/ufshcd.h           | 128 +++++++++++++
>>  include/ufs/ufshci.h           |  64 +++++++
>>  9 files changed, 1189 insertions(+), 67 deletions(-)
>>  create mode 100644 drivers/ufs/core/ufs-mcq.c
>>
>> --
>> 2.7.4
>>