[V3,00/25] smartpqi updates

Message ID: 160763241302.26927.17487238067261230799.stgit@brunhilda
Don Brace Dec. 10, 2020, 8:34 p.m. UTC
These patches are based on Martin Petersen's 5.11/scsi-queue tree.

Note that these patches depend on the following three patches
applied to Martin Petersen's tree:
  https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git
  5.11/scsi-queue
Depends-on: 5443bdc4cc77 scsi: smartpqi: Update version to 1.2.16-012
Depends-on: 408bdd7e5845 scsi: smartpqi: Correct pqi_sas_smp_handler busy condition
Depends-on: 1bdf6e934387 scsi: smartpqi: Correct driver removal with HBA disks

This set of changes consists of:
  * Add support for newer controller hardware.
    * Refactor AIO and s/g processing code. (No functional changes.)
    * Add write support for the RAID 5/6/1 bypass path (or accelerated I/O path).
    * Add a check for sequential streaming.
    * Add new PCI IDs.
  * Format changes to re-align with our in-house driver. (No functional changes.)
  * Correct some issues relating to suspend/hibernation/OFA/shutdown.
    * Block I/O requests during these conditions.
  * Add a qdepth limit check to limit outstanding commands
    to the max values supported by the controller.
  * Correct some minor issues found during regression testing.
  * Update the driver version.
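For readers following along, the qdepth limit check amounts to gating submissions on a host-wide atomic counter. A minimal standalone sketch (illustration only, not driver code; the names only mirror the patch, and the limit is kept tiny so the gate is easy to exercise):

```c
#include <assert.h>
#include <stdatomic.h>

/* Host-wide queue-depth gate, sketched outside the kernel.
 * SCSI_MLQUEUE_HOST_BUSY is the mid-layer "retry later" code. */
#define SCSI_MLQUEUE_HOST_BUSY 0x1055

static atomic_int total_scmds_outstanding;
static int scsi_ml_can_queue = 2;   /* tiny limit for the example */

static int queue_command(void)
{
    if (atomic_fetch_add(&total_scmds_outstanding, 1) + 1 >
            scsi_ml_can_queue) {
        /* undo the increment and ask the mid-layer to retry */
        atomic_fetch_sub(&total_scmds_outstanding, 1);
        return SCSI_MLQUEUE_HOST_BUSY;
    }
    return 0;   /* command accepted */
}

static void complete_command(void)
{
    atomic_fetch_sub(&total_scmds_outstanding, 1);
}
```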

Changes since V1:
  * Re-added 32bit calculations to correct i386 compile issues
    to patch smartpqi-refactor-aio-submission-code 
    Reported-by: kernel test robot <lkp@intel.com>
    https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org/thread/VMBBGGGE5446SVEOQBRCKBTRRWTSH4AB/

Changes since V2:
  * Added 32bit division to correct i386 compile issues
    to patch smartpqi-add-support-for-raid5-and-raid6-writes
    Reported-by: kernel test robot <lkp@intel.com>
    https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org/thread/ZCXJJDGPPTTXLZCSCGWEY6VXPRB3IFOQ/

---

Don Brace (7):
      smartpqi: refactor aio submission code
      smartpqi: refactor build sg list code
      smartpqi: add support for raid5 and raid6 writes
      smartpqi: add support for raid1 writes
      smartpqi: add stream detection
      smartpqi: add host level stream detection enable
      smartpqi: update version to 2.1.6-005

Kevin Barnett (14):
      smartpqi: add support for product id
      smartpqi: add support for BMIC sense feature cmd and feature bits
      smartpqi: update AIO Sub Page 0x02 support
      smartpqi: add support for long firmware version
      smartpqi: align code with oob driver
      smartpqi: enable support for NVMe encryption
      smartpqi: disable write_same for nvme hba disks
      smartpqi: fix driver synchronization issues
      smartpqi: convert snprintf to scnprintf
      smartpqi: change timing of release of QRM memory during OFA
      smartpqi: return busy indication for IOCTLs when ofa is active
      smartpqi: add additional logging for LUN resets
      smartpqi: correct system hangs when resuming from hibernation
      smartpqi: add new pci ids

Mahesh Rajashekhara (1):
      smartpqi: fix host qdepth limit

Murthy Bhat (3):
      smartpqi: add phy id support for the physical drives
      smartpqi: update sas initiator_port_protocols and target_port_protocols
      smartpqi: update enclosure identifier in sysfs


 drivers/scsi/smartpqi/smartpqi.h              |  301 +-
 drivers/scsi/smartpqi/smartpqi_init.c         | 3123 ++++++++++-------
 .../scsi/smartpqi/smartpqi_sas_transport.c    |   39 +-
 drivers/scsi/smartpqi/smartpqi_sis.c          |    4 +-
 4 files changed, 2189 insertions(+), 1278 deletions(-)


Comments

Paul Menzel Dec. 14, 2020, 5:54 p.m. UTC | #1
Dear Don, dear Mahesh,


Am 10.12.20 um 21:35 schrieb Don Brace:
> From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
>
> * Correct scsi-mid-layer sending more requests than
>    exposed host Q depth causing firmware ASSERT issue.
>    * Add host Qdepth counter.


This supposedly fixes the regression between Linux 5.4 and 5.9, which we 
reported in [1].

     kernel: smartpqi 0000:89:00.0: controller is offline: status code 
0x6100c
     kernel: smartpqi 0000:89:00.0: controller offline

Thank you for looking into this issue and fixing it. We are going to 
test this.

For easily finding these things in the git history or the WWW, it would 
be great if these log messages could be included (in the future).

Also, that means that the regression is still present in Linux 5.10, 
released yesterday, and this commit does not apply to those versions.

Mahesh, do you have any idea what commit caused the regression and why 
the issue started to show up?

James, Martin, how are regressions handled for the SCSI subsystem?

Regarding the diff, personally, I find the commit message much too 
terse. `pqi_scsi_queue_command()` will return `SCSI_MLQUEUE_HOST_BUSY` 
for the case of too many requests. Will that be logged by Linux in some 
log level? In my opinion it points to a performance problem, and should 
be at least logged as a notice or warning.

Can `ctrl_info->scsi_ml_can_queue` be queried somehow maybe in the logs? 
`sudo find /sys -name queue` did not display something interesting.
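The standard per-host attributes can be listed directly from sysfs; a sketch, with example paths that may differ per system (`can_queue` and `cmd_per_lun` are standard scsi_host attributes, and the per-device depth lives under the block device; `nr_hw_queues` itself is not exported):

```shell
# List queue-depth-related SCSI attributes that sysfs does expose.
for f in /sys/class/scsi_host/host*/can_queue \
         /sys/block/sd*/device/queue_depth; do
    [ -e "$f" ] && echo "$f: $(cat "$f")"
done
true  # exit 0 even when no SCSI hosts are visible (e.g. in a VM)
```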

[1]: https://marc.info/?l=linux-scsi&m=160271263114829&w=2
      "Linux 5.9: smartpqi: controller is offline: status code 0x6100c"

> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>   drivers/scsi/smartpqi/smartpqi.h      |    2 ++
>   drivers/scsi/smartpqi/smartpqi_init.c |   19 ++++++++++++++++---
>   2 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
> index 0b94c755a74c..c3b103b15924 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -1345,6 +1345,8 @@ struct pqi_ctrl_info {
>   	struct work_struct ofa_quiesce_work;
>   	u32		ofa_bytes_requested;
>   	u16		ofa_cancel_reason;
> +
> +	atomic_t	total_scmds_outstanding;
>   };


What is the difference between the already existing

     atomic_t scsi_cmds_outstanding;

and the new counter?

     atomic_t	total_scmds_outstanding;

The names are quite similar, so different names or a comment might be 
useful.

>
>   enum pqi_ctrl_mode {
>
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 082b17e9bd80..4e088f47d95f 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -5578,6 +5578,8 @@ static inline bool pqi_is_bypass_eligible_request(struct scsi_cmnd *scmd)
>   void pqi_prep_for_scsi_done(struct scsi_cmnd *scmd)
>   {
>   	struct pqi_scsi_dev *device;
> +	struct pqi_ctrl_info *ctrl_info;
> +	struct Scsi_Host *shost;
>
>   	if (!scmd->device) {
>   		set_host_byte(scmd, DID_NO_CONNECT);
> @@ -5590,7 +5592,11 @@ void pqi_prep_for_scsi_done(struct scsi_cmnd *scmd)
>   		return;
>   	}
>
> +	shost = scmd->device->host;


The function already has a variable `device`, which is assigned 
“hostdata” though:

     device = scmd->device->hostdata;

This confuses me. Maybe this should be cleaned up in a followup commit, 
and the variable device be reused above in the `shost` assignment.

> +	ctrl_info = shost_to_hba(shost);
> +
>   	atomic_dec(&device->scsi_cmds_outstanding);
> +	atomic_dec(&ctrl_info->total_scmds_outstanding);
>   }
>
>   static bool pqi_is_parity_write_stream(struct pqi_ctrl_info *ctrl_info,
> @@ -5678,6 +5684,7 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
>   	bool raid_bypassed;
>
>   	device = scmd->device->hostdata;
> +	ctrl_info = shost_to_hba(shost);
>
>   	if (!device) {
>   		set_host_byte(scmd, DID_NO_CONNECT);
> @@ -5686,8 +5693,11 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
>   	}
>
>   	atomic_inc(&device->scsi_cmds_outstanding);
> -
> -	ctrl_info = shost_to_hba(shost);


I believe, style changes (re-ordering) in commits fixing regressions 
make it harder to backport it.

> +	if (atomic_inc_return(&ctrl_info->total_scmds_outstanding) >
> +		ctrl_info->scsi_ml_can_queue) {
> +		rc = SCSI_MLQUEUE_HOST_BUSY;
> +		goto out;
> +	}
>
>   	if (pqi_ctrl_offline(ctrl_info) || pqi_device_in_remove(device)) {
>   		set_host_byte(scmd, DID_NO_CONNECT);
> @@ -5730,8 +5740,10 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
>   	}
>
>   out:
> -	if (rc)
> +	if (rc) {
>   		atomic_dec(&device->scsi_cmds_outstanding);
> +		atomic_dec(&ctrl_info->total_scmds_outstanding);
> +	}
>
>   	return rc;
>   }
> @@ -8054,6 +8066,7 @@ static struct pqi_ctrl_info *pqi_alloc_ctrl_info(int numa_node)
>
>   	INIT_WORK(&ctrl_info->event_work, pqi_event_worker);
>   	atomic_set(&ctrl_info->num_interrupts, 0);
> +	atomic_set(&ctrl_info->total_scmds_outstanding, 0);
>
>   	INIT_DELAYED_WORK(&ctrl_info->rescan_work, pqi_rescan_worker);
>   	INIT_DELAYED_WORK(&ctrl_info->update_time_work, pqi_update_time_worker);



Kind regards,

Paul
Don Brace Dec. 15, 2020, 8:23 p.m. UTC | #2
Please see answers below. Hope this helps.

-----Original Message-----
From: Paul Menzel [mailto:pmenzel@molgen.mpg.de] 

Sent: Monday, December 14, 2020 11:54 AM
To: Don Brace - C33706 <Don.Brace@microchip.com>; Kevin Barnett - C33748 <Kevin.Barnett@microchip.com>; Scott Teel - C33730 <Scott.Teel@microchip.com>; Justin Lindley - C33718 <Justin.Lindley@microchip.com>; Scott Benesh - C33703 <Scott.Benesh@microchip.com>; Gerry Morong - C33720 <Gerry.Morong@microchip.com>; Mahesh Rajashekhara - I30583 <Mahesh.Rajashekhara@microchip.com>; hch@infradead.org; joseph.szczypek@hpe.com; POSWALD@suse.com; James E. J. Bottomley <jejb@linux.ibm.com>; Martin K. Petersen <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org; it+linux-scsi@molgen.mpg.de; Donald Buczek <buczek@molgen.mpg.de>; Greg KH <gregkh@linuxfoundation.org>
Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit


Dear Don, dear Mahesh,


Am 10.12.20 um 21:35 schrieb Don Brace:
> From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
>
> * Correct scsi-mid-layer sending more requests than
>    exposed host Q depth causing firmware ASSERT issue.
>    * Add host Qdepth counter.


This supposedly fixes the regression between Linux 5.4 and 5.9, which we reported in [1].

     kernel: smartpqi 0000:89:00.0: controller is offline: status code 0x6100c
     kernel: smartpqi 0000:89:00.0: controller offline

Thank you for looking into this issue and fixing it. We are going to test this.

For easily finding these things in the git history or the WWW, it would be great if these log messages could be included (in the future).
DON> Thanks for your suggestion. We'll add them next time.

Also, that means that the regression is still present in Linux 5.10, released yesterday, and this commit does not apply to those versions.

DON> They have started 5.10-RC7 now. So possibly 5.11 or 5.12, depending on when all of the patches are applied. The patch in question is among 28 other patches.

Mahesh, do you have any idea what commit caused the regression and why the issue started to show up?
DON> The smartpqi driver sets two scsi_host_template member fields: .can_queue and .nr_hw_queues. But we have not yet converted to host_tagset. So the queue_depth becomes nr_hw_queues * can_queue, which is more than the hw can support. That can be verified by looking at scsi_host.h.
        /*
         * In scsi-mq mode, the number of hardware queues supported by the LLD.
         *
         * Note: it is assumed that each hardware queue has a queue depth of
         * can_queue. In other words, the total queue depth per host
         * is nr_hw_queues * can_queue. However, for when host_tagset is set,
         * the total queue depth is can_queue.
         */

So, until we make this change, the queue_depth change prevents the above issue from happening.
Note: you will see better performance and more evenly distributed performance with this patch applied.
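Spelled out, the depth computation from the quoted scsi_host.h comment looks like this. A standalone sketch (the parameter names only mimic the template fields; this is not kernel code):

```c
#include <assert.h>
#include <stdbool.h>

/* Total queue depth the block layer derives from a scsi_host_template:
 * without host_tagset, each of the nr_hw_queues hardware queues gets
 * its own tag space of can_queue entries; with host_tagset, all hw
 * queues share a single tag space of can_queue entries. */
static int total_queue_depth(int can_queue, int nr_hw_queues,
                             bool host_tagset)
{
    return host_tagset ? can_queue : can_queue * nr_hw_queues;
}
```

This is why, before the host_tagset conversion, the mid-layer can issue far more commands than the controller's hardware limit, which the qdepth counter in this patch guards against.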

James, Martin, how are regressions handled for the SCSI subsystem?

Regarding the diff, personally, I find the commit message much too terse. `pqi_scsi_queue_command()` will return `SCSI_MLQUEUE_HOST_BUSY` for the case of too many requests. Will that be logged by Linux in some log level? In my opinion it points to a performance problem, and should be at least logged as a notice or warning.
DON> We could add a ratelimited print, but we did not want to interrupt the CPU for logging these messages.
Also, you should see better and more even performance.

Can `ctrl_info->scsi_ml_can_queue` be queried somehow maybe in the logs?
`sudo find /sys -name queue` did not display something interesting.
All I find is /sys/class/scsi_host/host<X>/{cmd_per_lun, can_queue}; nr_hw_queues is not exposed, but there is one queue for each CPU.

[1]: https://marc.info/?l=linux-scsi&m=160271263114829&w=2
      "Linux 5.9: smartpqi: controller is offline: status code 0x6100c"

> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>   drivers/scsi/smartpqi/smartpqi.h      |    2 ++
>   drivers/scsi/smartpqi/smartpqi_init.c |   19 ++++++++++++++++---
>   2 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
> index 0b94c755a74c..c3b103b15924 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -1345,6 +1345,8 @@ struct pqi_ctrl_info {
>       struct work_struct ofa_quiesce_work;
>       u32             ofa_bytes_requested;
>       u16             ofa_cancel_reason;
> +
> +     atomic_t        total_scmds_outstanding;
>   };


What is the difference between the already existing

     atomic_t scsi_cmds_outstanding;

and the new counter?

     atomic_t   total_scmds_outstanding;

The names are quite similar, so different names or a comment might be useful.
DON> total_scmds_outstanding tracks the queue_depth for the entire driver instance.


>
>   enum pqi_ctrl_mode {
>
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 082b17e9bd80..4e088f47d95f 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -5578,6 +5578,8 @@ static inline bool pqi_is_bypass_eligible_request(struct scsi_cmnd *scmd)
>   void pqi_prep_for_scsi_done(struct scsi_cmnd *scmd)
>   {
>       struct pqi_scsi_dev *device;
> +     struct pqi_ctrl_info *ctrl_info;
> +     struct Scsi_Host *shost;
>
>       if (!scmd->device) {
>               set_host_byte(scmd, DID_NO_CONNECT);
> @@ -5590,7 +5592,11 @@ void pqi_prep_for_scsi_done(struct scsi_cmnd *scmd)
>               return;
>       }
>
> +     shost = scmd->device->host;


The function already has a variable `device`, which is assigned “hostdata” though:

     device = scmd->device->hostdata;

This confuses me. Maybe this should be cleaned up in a followup commit, and the variable device be reused above in the `shost` assignment.
DON> host points back to the driver instance of our HW.
DON> hostdata is a driver usable field that points back to our internal device pointer <LUN or HBA>.
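The distinction can be sketched with stand-in types (the real kernel structures and the driver's actual helper differ; this is illustration only):

```c
#include <assert.h>

/* Stand-ins: scmd->device->host points at the controller-level
 * Scsi_Host (one per HBA instance), while scmd->device->hostdata is
 * the driver's private per-device (LUN or HBA disk) object. */
struct pqi_ctrl_info { int id;  };   /* per controller */
struct pqi_scsi_dev  { int lun; };   /* per device     */

struct Scsi_Host   { struct pqi_ctrl_info *hba; };
struct scsi_device {
    struct Scsi_Host *host;   /* controller the device hangs off */
    void *hostdata;           /* driver-private: pqi_scsi_dev    */
};

static struct pqi_ctrl_info *shost_to_hba(struct Scsi_Host *shost)
{
    return shost->hba;   /* stand-in for the driver's real helper */
}
```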


Donald Buczek Dec. 21, 2020, 2:31 p.m. UTC | #3
Dear Don,

just wanted to let you know that I've tested this series (plus the three Depends-on patches you mentioned) on top of Linux v5.10.1 with an Adaptec 1100-8e with fw 3.21.

After three hours of heavy operation (including raid scrubbing!) the driver seems to have lost some requests for the md0 member disks.

This is the static picture after all activity has ceased:

     root:deadbird:/scratch/local/# for f in /sys/devices/virtual/block/md?/md/rd*/block/inflight;do echo $f: $(cat $f);done
     /sys/devices/virtual/block/md0/md/rd0/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd1/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd10/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd11/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd12/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd13/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd14/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd15/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd2/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd3/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd4/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd5/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd6/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd7/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd8/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd9/block/inflight: 1 0
     /sys/devices/virtual/block/md1/md/rd0/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd1/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd10/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd11/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd12/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd13/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd14/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd15/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd2/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd3/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd4/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd5/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd6/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd7/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd8/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd9/block/inflight: 0 0

Best
   Donald

-- 
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433
Donald Buczek Dec. 22, 2020, 1:13 p.m. UTC | #4
On 22.12.20 00:30, Don.Brace@microchip.com wrote:
> Can you please post your hw configuration and the stress load that you used? Was it fio?


Testsystem is a Dell PowerEdge R730 with two 10-core Intel® Xeon® E5-2687W v3 processors and 200 GB memory.
Adapter is an Adaptec HBA 1100-8e, firmware 3.21.
Attached to it are two AIC J3016-01 enclosures with 16 8 TB disks each.
The disks of each JBOD are combined into a RAID6 software raid with XFS on it.
So I have two filesystems with ~100 TB (14 * 7.3 TB).

Unfortunately, for the time being, I was only able to reproduce this with a very complex load setup with both file system activity (two parallel `cp -a` of big directory trees on each filesystem) and switching raid scrubbing on and off at the same time. I'm currently trying to trigger the issue with less complex setups.

I'm not sure at all whether this is really a problem of the smartpqi driver; it's just that the frozen inflight counters seem to hint in the direction of the block layer.

Donald

Don Brace Dec. 28, 2020, 3:57 p.m. UTC | #5
Subject: Re: [PATCH V3 00/25] smartpqi updates

On 22.12.20 00:30, Don.Brace@microchip.com wrote:
> Can you please post your hw configuration and the stress load that you used? Was it fio?

Test system is a Dell PowerEdge R730 with two 10-core Intel® Xeon® E5-2687W v3 processors and 200 GB of memory.
The adapter is an Adaptec HBA 1100-8e, firmware 3.21. Attached to it are two AIC J3016-01 enclosures with 16 8 TB disks each. The disks of each JBOD are combined into a RAID6 software raid with xfs on it.
So I have two filesystems with ~100 TB (14 * 7.3 TB) each.

Unfortunately, for the time being, I was only able to reproduce this with a very complex load setup combining file system activity (two parallel `cp -a` of big directory trees on each filesystem) with switching raid scrubbing on and off at the same time. I'm currently trying to trigger the issue with less complex setups.

I'm not sure at all whether this is really a problem of the smartpqi driver. It's just that the frozen inflight counters seem to hint in the direction of the block layer.

Donald
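For reference, the array layout described above can be reproduced roughly as follows. This is a hedged sketch only: the device names, mount point, and md device number are assumptions for illustration, not taken from Donald's actual setup.

```shell
# One JBOD's 16 disks assembled into a RAID6 md array with xfs on top
# (device names /dev/sd[b-q] and the mount point are assumed).
mdadm --create /dev/md0 --level=6 --raid-devices=16 /dev/sd[b-q]
mkfs.xfs /dev/md0
mount /dev/md0 /scratch/local

# Toggling raid scrubbing, as done in the stress load:
echo check > /sys/block/md0/md/sync_action   # start scrubbing
echo idle  > /sys/block/md0/md/sync_action   # stop scrubbing
```

A 16-disk RAID6 leaves 14 data disks, which matches the ~100 TB (14 * 7.3 TB) figure quoted above.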

>> Thanks for sharing your HW setup.

>> I will also set up a similar system. I have two scripts that I run against the driver before I feel satisfied that it will hold up against extreme conditions. One script performs a list of I/O stress tests (to all presented disks, LVs and HBAs): 1) mkfs {xfs, ext4}, 2) mount, 3) test using rsync, 4) fio using file system, 5) umount, 6) fsck, 7) fio to raw disk.

>> The other script continuously issues resets to all of the disks in parallel. Normally any issues will show up within 20 iterations of my scripts. I wait for 50K before I'm happy.

>> I have not tried layering in the dm driver, but that will be added to my tests. There have been a few patches added to both the block layer and dm driver recently.

>> Thanks again,
>> Don.
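The seven-step cycle Don describes could be sketched as a small shell script. Everything concrete here (device names, mount point, fio options) is an assumption for illustration, not the actual Microchip test script; commands are only echoed unless STRESS_RUN=1 is set.

```shell
#!/bin/sh
# Sketch of the seven-step stress cycle described above. Device names,
# mount point, and fio options are illustrative assumptions; commands are
# echoed rather than executed unless STRESS_RUN=1 is set.

run() {
    if [ "${STRESS_RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

MNT=${MNT:-/mnt/stress}   # assumed scratch mount point

stress_cycle() {          # one full cycle against a single disk
    dev=$1
    run mkfs.xfs -f "$dev"                # 1) mkfs (xfs shown; ext4 analogous)
    run mount "$dev" "$MNT"               # 2) mount
    run rsync -a /usr "$MNT/rsync-test"   # 3) file-system test using rsync
    run fio --name=fs --directory="$MNT" --rw=randrw --size=1G --runtime=60 --time_based   # 4) fio through the fs
    run umount "$MNT"                     # 5) umount
    run xfs_repair -n "$dev"              # 6) consistency check (the fsck step)
    run fio --name=raw --filename="$dev" --rw=randwrite --size=1G --runtime=60 --time_based --direct=1   # 7) fio to the raw disk
}

# Dry-run demonstration against a hypothetical disk:
stress_cycle /dev/sdX
```

Run with `STRESS_RUN=1` on a disposable disk to execute for real; the dry-run default makes the sequence reviewable first.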




>
> Dear Don,
>
> just wanted to let you know that I've tested this series (plus the three Depends-on patches you mentioned) on top of Linux v5.10.1 with an Adaptec 1100-8e with fw 3.21.
>
> After three hours of heavy operation (including raid scrubbing!) the driver seems to have lost some requests for the md0 member disks.
>
> This is the static picture after all activity has ceased:
>
>       root:deadbird:/scratch/local/# for f in /sys/devices/virtual/block/md?/md/rd*/block/inflight;do echo $f: $(cat $f);done
>       /sys/devices/virtual/block/md0/md/rd0/block/inflight: 1 0
>       /sys/devices/virtual/block/md0/md/rd1/block/inflight: 1 0
>       /sys/devices/virtual/block/md0/md/rd10/block/inflight: 1 0
>       /sys/devices/virtual/block/md0/md/rd11/block/inflight: 1 0
>       /sys/devices/virtual/block/md0/md/rd12/block/inflight: 1 0
>       /sys/devices/virtual/block/md0/md/rd13/block/inflight: 1 0
>       /sys/devices/virtual/block/md0/md/rd14/block/inflight: 1 0
>       /sys/devices/virtual/block/md0/md/rd15/block/inflight: 1 0
>       /sys/devices/virtual/block/md0/md/rd2/block/inflight: 1 0
>       /sys/devices/virtual/block/md0/md/rd3/block/inflight: 1 0
>       /sys/devices/virtual/block/md0/md/rd4/block/inflight: 1 0
>       /sys/devices/virtual/block/md0/md/rd5/block/inflight: 1 0
>       /sys/devices/virtual/block/md0/md/rd6/block/inflight: 1 0
>       /sys/devices/virtual/block/md0/md/rd7/block/inflight: 1 0
>       /sys/devices/virtual/block/md0/md/rd8/block/inflight: 1 0
>       /sys/devices/virtual/block/md0/md/rd9/block/inflight: 1 0
>       /sys/devices/virtual/block/md1/md/rd0/block/inflight: 0 0
>       /sys/devices/virtual/block/md1/md/rd1/block/inflight: 0 0
>       /sys/devices/virtual/block/md1/md/rd10/block/inflight: 0 0
>       /sys/devices/virtual/block/md1/md/rd11/block/inflight: 0 0
>       /sys/devices/virtual/block/md1/md/rd12/block/inflight: 0 0
>       /sys/devices/virtual/block/md1/md/rd13/block/inflight: 0 0
>       /sys/devices/virtual/block/md1/md/rd14/block/inflight: 0 0
>       /sys/devices/virtual/block/md1/md/rd15/block/inflight: 0 0
>       /sys/devices/virtual/block/md1/md/rd2/block/inflight: 0 0
>       /sys/devices/virtual/block/md1/md/rd3/block/inflight: 0 0
>       /sys/devices/virtual/block/md1/md/rd4/block/inflight: 0 0
>       /sys/devices/virtual/block/md1/md/rd5/block/inflight: 0 0
>       /sys/devices/virtual/block/md1/md/rd6/block/inflight: 0 0
>       /sys/devices/virtual/block/md1/md/rd7/block/inflight: 0 0
>       /sys/devices/virtual/block/md1/md/rd8/block/inflight: 0 0
>       /sys/devices/virtual/block/md1/md/rd9/block/inflight: 0 0
>
> Best
>     Donald
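Building on the sysfs one-liner in the report, a small watcher can flag members whose inflight counters are non-zero and frozen between two samples. This is a sketch only; the default glob follows the sysfs layout shown above, and the interval is arbitrary.

```shell
#!/bin/sh
# Sample each md member's inflight counters twice and report devices whose
# counters are non-zero and unchanged between the samples (likely stuck I/O).
# Sketch only -- the default glob follows the sysfs layout shown above.

check_stuck() {   # usage: check_stuck [interval-seconds] [files...]
    interval=${1:-5}
    [ $# -gt 0 ] && shift
    [ $# -gt 0 ] || set -- /sys/devices/virtual/block/md*/md/rd*/block/inflight
    for f in "$@"; do
        [ -r "$f" ] || continue
        read r1 w1 < "$f"       # first sample: reads writes
        sleep "$interval"
        read r2 w2 < "$f"       # second sample
        if [ "$r1 $w1" = "$r2 $w2" ] && [ "$r1$w1" != "00" ]; then
            echo "stuck: $f: $r2 $w2"
        fi
    done
}

# check_stuck 5   # sample every md member once, 5 seconds apart
```

A frozen non-zero counter is only a hint, not proof of a lost request, but it narrows down which member disks to examine.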

Don Brace Dec. 28, 2020, 7:25 p.m. UTC | #6
Can you provide the base OS that you used to build the kernel.org kernel?

Thanks,
Don

Donald Buczek Dec. 28, 2020, 10:36 p.m. UTC | #7
On 28.12.20 20:25, Don.Brace@microchip.com wrote:
> Can you provide the base OS that you used to build the kernel.org kernel?

GNU/Linux built from scratch, no distribution. gcc-7.5.0.

However, after more testing, I'm no longer convinced that the deadlock is caused by the block layer.

I originally posted this as an xfs bug "v5.10.1 xfs deadlock" ( https://lkml.org/lkml/2020/12/17/608 ).

In that thread it was suggested that this might be caused by the block layer dropping I/Os. The non-zero inflight counters I observed seemed to confirm that, so I posted the same problem to linux-scsi and the relevant maintainers. However, since then I did a lot more tests, and while I am now able to reproduce the original deadlock, I am not able to reproduce the non-zero inflight counters and haven't seen them since.

It is quite clear now that the deadlock is in the xfs layer, and although I don't fully understand it, I can patch it away now.

I was very eager to test smartpqi 2.1.6-005, which you submitted for linux-next. As you know, we have severe problems with the in-tree 1.2.8-026 smartpqi driver, while the 2.1.8-005 OOT driver you provided us did work. However, smartpqi 2.1.6-005 on Linux 5.10 failed for us on the production system, too, and I couldn't continue the tests on this system, so we set up the test system to identify the (potential) problem.

Unfortunately, on this test system, two other problems got in our way, which could be related to the smartpqi driver but probably aren't. The first one was "md_raid: mdX_raid6 looping after sync_action "check" to "idle" transition" ( https://lkml.org/lkml/2020/11/28/165 ). When I tried to isolate that problem, the xfs problem "v5.10.1 xfs deadlock" ( https://lkml.org/lkml/2020/12/17/608 ) brought itself into the foreground and needed to be resolved first. My original goal of testing the smartpqi 2.1.6-005 driver and trying to reproduce the problem we've seen on the production system didn't make progress because of that.

On the other hand, all this time smartpqi 2.1.6-005 was used on the test system under high simulated load and didn't fail on me. So I kind of tested it without bad results. I'd appreciate it if it would go into the next Linux release anyway.

Best
   Donald

> 

> Thanks,

> Don

> 

> -----Original Message-----

> From: Don.Brace@microchip.com [mailto:Don.Brace@microchip.com]

> Sent: Monday, December 28, 2020 9:58 AM

> To: buczek@molgen.mpg.de; Kevin Barnett - C33748 <Kevin.Barnett@microchip.com>; Scott Teel - C33730 <Scott.Teel@microchip.com>; Justin Lindley - C33718 <Justin.Lindley@microchip.com>; Scott Benesh - C33703 <Scott.Benesh@microchip.com>; Gerry Morong - C33720 <Gerry.Morong@microchip.com>; Mahesh Rajashekhara - I30583 <Mahesh.Rajashekhara@microchip.com>; hch@infradead.org; jejb@linux.vnet.ibm.com; joseph.szczypek@hpe.com; POSWALD@suse.com

> Cc: linux-scsi@vger.kernel.org; it+linux@molgen.mpg.de

> Subject: RE: [PATCH V3 00/25] smartpqi updates

> 

> 

> Subject: Re: [PATCH V3 00/25] smartpqi updates

> 

> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe

> 

> On 22.12.20 00:30, Don.Brace@microchip.com wrote:

>> Can you please post your hw configuration and the stress load that you used? Was it fio?

> 

> Testsystem is a Dell PowerEdge R730. with two 10 core Intel® Xeon® Processor E5-2687W v3  and 200 GB memory.

> Adapter is Adaptec HBA 1100-8e, Firmware 3.21 On it two AIC J3016-01 Enclosures with 16 8TB disks each The disks of each jbod are a combined into a raid6 software raid with xfs on it.

> So I have two filesystems with ~100 TB ( 14 * 7.3 TB)

> 

> Unfortunately, for the time being, I was only able to reproduce this with a very complex load setup with both, file system activity (two parallel `cp -a` of big directory trees on each filesystem) and switching on and of raid scrubbing at the same time. I'm currently trigger the issue with less complex setups.

> 

> I'm not sure at all, whether this is really a problem of the smartpqi driver. Its just the frozen inflight counter seem to hint in the direction of the block layer.

> 

> Donald

> 

>>> Thanks for sharing your HW setup.

>>> I will also setup a similar system. I have two scripts that I run against the driver before I feel satisfied that it will hold up against extreme conditions. One script performs a list of I/O stress tests (to all presented disks (LVs and HBAs): 1) mkfs {xfs, ext4}, 2) mount, 3) test using rsync, 4) fio using file system, 5) umount, 6) fsck, 7) fio to raw disk.

> 

>>> The other script continuously issues resets to all of the disks in parallel. Normally any issues will show up within 20 iterations of my scripts. I wait for 50K before I'm happy.

> 

>>> I have not tried layering in the dm driver, but that will be added to my tests. There have been a few patches added to both the block layer and dm driver recently.

> 

>>> Thanks again,

>>> Don.

> 

> 

> 

>>

>> Dear Don,

>>

>> just wanted to let you know that I've tested this series (plus the three Depends-on patches you mentioned) on top of Linux v5.10.1 with an Adaptec 1100-8e with fw 3.21.

>>

>> After three hours of heavy operation (including raid scrubbing!) the

>> driver seems to have lost some requests for the md0 member disks

>>

>> This is the static picture after all activity has ceased:

>>

>>        root:deadbird:/scratch/local/# for f in /sys/devices/virtual/block/md?/md/rd*/block/inflight;do echo $f: $(cat $f);done

>>        /sys/devices/virtual/block/md0/md/rd0/block/inflight: 1 0

>>        /sys/devices/virtual/block/md0/md/rd1/block/inflight: 1 0

>>        /sys/devices/virtual/block/md0/md/rd10/block/inflight: 1 0

>>        /sys/devices/virtual/block/md0/md/rd11/block/inflight: 1 0

>>        /sys/devices/virtual/block/md0/md/rd12/block/inflight: 1 0

>>        /sys/devices/virtual/block/md0/md/rd13/block/inflight: 1 0

>>        /sys/devices/virtual/block/md0/md/rd14/block/inflight: 1 0

>>        /sys/devices/virtual/block/md0/md/rd15/block/inflight: 1 0

>>        /sys/devices/virtual/block/md0/md/rd2/block/inflight: 1 0

>>        /sys/devices/virtual/block/md0/md/rd3/block/inflight: 1 0

>>        /sys/devices/virtual/block/md0/md/rd4/block/inflight: 1 0

>>        /sys/devices/virtual/block/md0/md/rd5/block/inflight: 1 0

>>        /sys/devices/virtual/block/md0/md/rd6/block/inflight: 1 0

>>        /sys/devices/virtual/block/md0/md/rd7/block/inflight: 1 0

>>        /sys/devices/virtual/block/md0/md/rd8/block/inflight: 1 0

>>        /sys/devices/virtual/block/md0/md/rd9/block/inflight: 1 0

>>        /sys/devices/virtual/block/md1/md/rd0/block/inflight: 0 0

>>        /sys/devices/virtual/block/md1/md/rd1/block/inflight: 0 0

>>        /sys/devices/virtual/block/md1/md/rd10/block/inflight: 0 0

>>        /sys/devices/virtual/block/md1/md/rd11/block/inflight: 0 0

>>        /sys/devices/virtual/block/md1/md/rd12/block/inflight: 0 0

>>        /sys/devices/virtual/block/md1/md/rd13/block/inflight: 0 0

>>        /sys/devices/virtual/block/md1/md/rd14/block/inflight: 0 0

>>        /sys/devices/virtual/block/md1/md/rd15/block/inflight: 0 0

>>        /sys/devices/virtual/block/md1/md/rd2/block/inflight: 0 0

>>        /sys/devices/virtual/block/md1/md/rd3/block/inflight: 0 0

>>        /sys/devices/virtual/block/md1/md/rd4/block/inflight: 0 0

>>        /sys/devices/virtual/block/md1/md/rd5/block/inflight: 0 0

>>        /sys/devices/virtual/block/md1/md/rd6/block/inflight: 0 0

>>        /sys/devices/virtual/block/md1/md/rd7/block/inflight: 0 0

>>        /sys/devices/virtual/block/md1/md/rd8/block/inflight: 0 0

>>        /sys/devices/virtual/block/md1/md/rd9/block/inflight: 0 0

>>

>> Best

>>      Donald

>>

>> On 10.12.20 21:34, Don Brace wrote:

>>> These patches are based on Martin Peterson's 5.11/scsi-queue tree

>>>

>>> Note that these patches depend on the following three patches applied

>>> to Martin Peterson's tree:

>>>     https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git

>>>      5.11/scsi-queue

>>> Depends-on: 5443bdc4cc77 scsi: smartpqi: Update version to 1.2.16-012

>>> Depends-on: 408bdd7e5845 scsi: smartpqi: Correct pqi_sas_smp_handler

>>> busy condition

>>> Depends-on: 1bdf6e934387 scsi: smartpqi: Correct driver removal with

>>> HBA disks

>>>

>>> This set of changes consist of:

>>>      * Add support for newer controller hardware.

>>>        * Refactor AIO and s/g processing code. (No functional changes)

>>>        * Add write support for RAID 5/6/1 Raid bypass path (or accelerated I/O path).

>>>        * Add check for sequential streaming.

>>>        * Add in new PCI-IDs.

>>>      * Format changes to re-align with our in-house driver. (No

>>> functional changes.)

>>>      * Correct some issues relating to suspend/hibernation/OFA/shutdown.

>>>        * Block I/O requests during these conditions.

>>>      * Add in qdepth limit check to limit outstanding commands.

>>>        to the max values supported by the controller.

>>>      * Correct some minor issues found during regression testing.

>>>      * Update the driver version.

>>>

>>> Changes since V1:

>>>      * Re-added 32bit calculations to correct i386 compile issues

>>>        to patch smartpqi-refactor-aio-submission-code

>>>        Reported-by: kernel test robot <lkp@intel.com>

>>>       

>>> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org/thread/VM

>>> BBGGGE5446SVEOQBRCKBTRRWTSH4AB/

>>>

>>> Changes since V2:

>>>      * Added 32bit division to correct i386 compile issues

>>>        to patch smartpqi-add-support-for-raid5-and-raid6-writes

>>>        Reported-by: kernel test robot <lkp@intel.com>

>>>       

>>> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org/thread/ZC

>>> XJJDGPPTTXLZCSCGWEY6VXPRB3IFOQ/
>>>
>>> ---
>>>
>>> Don Brace (7):
>>>          smartpqi: refactor aio submission code
>>>          smartpqi: refactor build sg list code
>>>          smartpqi: add support for raid5 and raid6 writes
>>>          smartpqi: add support for raid1 writes
>>>          smartpqi: add stream detection
>>>          smartpqi: add host level stream detection enable
>>>          smartpqi: update version to 2.1.6-005
>>>
>>> Kevin Barnett (14):
>>>          smartpqi: add support for product id
>>>          smartpqi: add support for BMIC sense feature cmd and feature bits
>>>          smartpqi: update AIO Sub Page 0x02 support
>>>          smartpqi: add support for long firmware version
>>>          smartpqi: align code with oob driver
>>>          smartpqi: enable support for NVMe encryption
>>>          smartpqi: disable write_same for nvme hba disks
>>>          smartpqi: fix driver synchronization issues
>>>          smartpqi: convert snprintf to scnprintf
>>>          smartpqi: change timing of release of QRM memory during OFA
>>>          smartpqi: return busy indication for IOCTLs when ofa is active
>>>          smartpqi: add additional logging for LUN resets
>>>          smartpqi: correct system hangs when resuming from hibernation
>>>          smartpqi: add new pci ids
>>>
>>> Mahesh Rajashekhara (1):
>>>          smartpqi: fix host qdepth limit
>>>
>>> Murthy Bhat (3):
>>>          smartpqi: add phy id support for the physical drives
>>>          smartpqi: update sas initiator_port_protocols and target_port_protocols
>>>          smartpqi: update enclosure identifier in sysf
>>>
>>>     drivers/scsi/smartpqi/smartpqi.h              |  301 +-
>>>     drivers/scsi/smartpqi/smartpqi_init.c         | 3123 ++++++++++-------
>>>     .../scsi/smartpqi/smartpqi_sas_transport.c    |   39 +-
>>>     drivers/scsi/smartpqi/smartpqi_sis.c          |    4 +-
>>>     4 files changed, 2189 insertions(+), 1278 deletions(-)
>>>


-- 
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433
Martin Wilck Jan. 7, 2021, 4:44 p.m. UTC | #8
On Thu, 2020-12-10 at 14:34 -0600, Don Brace wrote:
> * Add in new IU definition.
> * Add in support raid5 and raid6 writes.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>  drivers/scsi/smartpqi/smartpqi.h      |   39 +++++
>  drivers/scsi/smartpqi/smartpqi_init.c |  247 ++++++++++++++++++++++++++++++++-
>  2 files changed, 278 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
> index d486a2ec3045..e9844210c4a0 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -257,6 +257,7 @@ struct pqi_device_capability {
>  };
>  
>  #define PQI_MAX_EMBEDDED_SG_DESCRIPTORS                4
> +#define PQI_MAX_EMBEDDED_R56_SG_DESCRIPTORS    3
>  
>  struct pqi_raid_path_request {
>         struct pqi_iu_header header;
> @@ -312,6 +313,39 @@ struct pqi_aio_path_request {
>                 sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
>  };
>  
> +#define PQI_RAID56_XFER_LIMIT_4K       0x1000 /* 4Kib */
> +#define PQI_RAID56_XFER_LIMIT_8K       0x2000 /* 8Kib */

You don't seem to use these, and you'll remove them again in patch
06/25.

> +struct pqi_aio_r56_path_request {
> +       struct pqi_iu_header header;
> +       __le16  request_id;
> +       __le16  volume_id;              /* ID of the RAID volume */
> +       __le32  data_it_nexus;          /* IT nexus for the data drive */
> +       __le32  p_parity_it_nexus;      /* IT nexus for the P parity drive */
> +       __le32  q_parity_it_nexus;      /* IT nexus for the Q parity drive */
> +       __le32  data_length;            /* total bytes to read/write */
> +       u8      data_direction : 2;
> +       u8      partial : 1;
> +       u8      mem_type : 1;           /* 0b: PCIe, 1b: DDR */
> +       u8      fence : 1;
> +       u8      encryption_enable : 1;
> +       u8      reserved : 2;
> +       u8      task_attribute : 3;
> +       u8      command_priority : 4;
> +       u8      reserved1 : 1;
> +       __le16  data_encryption_key_index;
> +       u8      cdb[16];
> +       __le16  error_index;
> +       u8      num_sg_descriptors;
> +       u8      cdb_length;
> +       u8      xor_multiplier;
> +       u8      reserved2[3];
> +       __le32  encrypt_tweak_lower;
> +       __le32  encrypt_tweak_upper;
> +       u8      row;                    /* row = logical lba/blocks per row */
> +       u8      reserved3[8];
> +       struct pqi_sg_descriptor sg_descriptors[PQI_MAX_EMBEDDED_R56_SG_DESCRIPTORS];
> +};
> +
>  struct pqi_io_response {
>         struct pqi_iu_header header;
>         __le16  request_id;
> @@ -484,6 +518,8 @@ struct pqi_raid_error_info {
>  #define PQI_REQUEST_IU_TASK_MANAGEMENT                 0x13
>  #define PQI_REQUEST_IU_RAID_PATH_IO                    0x14
>  #define PQI_REQUEST_IU_AIO_PATH_IO                     0x15
> +#define PQI_REQUEST_IU_AIO_PATH_RAID5_IO               0x18
> +#define PQI_REQUEST_IU_AIO_PATH_RAID6_IO               0x19
>  #define PQI_REQUEST_IU_GENERAL_ADMIN                   0x60
>  #define PQI_REQUEST_IU_REPORT_VENDOR_EVENT_CONFIG      0x72
>  #define PQI_REQUEST_IU_SET_VENDOR_EVENT_CONFIG         0x73
> @@ -1179,6 +1215,7 @@ struct pqi_ctrl_info {
>         u16             max_inbound_iu_length_per_firmware;
>         u16             max_inbound_iu_length;
>         unsigned int    max_sg_per_iu;
> +       unsigned int    max_sg_per_r56_iu;
>         void            *admin_queue_memory_base;
>         u32             admin_queue_memory_length;
>         dma_addr_t      admin_queue_memory_base_dma_handle;
> @@ -1210,6 +1247,8 @@ struct pqi_ctrl_info {
>         u8              soft_reset_handshake_supported : 1;
>         u8              raid_iu_timeout_supported: 1;
>         u8              tmf_iu_timeout_supported: 1;
> +       u8              enable_r5_writes : 1;
> +       u8              enable_r6_writes : 1;
>  
>         struct list_head scsi_device_list;
>         spinlock_t      scsi_device_list_lock;
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 6bcb037ae9d7..c813cec10003 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -67,6 +67,10 @@ static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,
>         struct scsi_cmnd *scmd, u32 aio_handle, u8 *cdb,
>         unsigned int cdb_length, struct pqi_queue_group *queue_group,
>         struct pqi_encryption_info *encryption_info, bool raid_bypass);
> +static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info *ctrl_info,
> +       struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
> +       struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
> +       struct pqi_scsi_dev_raid_map_data *rmd);
>  static void pqi_ofa_ctrl_quiesce(struct pqi_ctrl_info *ctrl_info);
>  static void pqi_ofa_ctrl_unquiesce(struct pqi_ctrl_info *ctrl_info);
>  static int pqi_ofa_ctrl_restart(struct pqi_ctrl_info *ctrl_info);
> @@ -2237,7 +2241,8 @@ static inline void pqi_set_encryption_info(
>   * Attempt to perform RAID bypass mapping for a logical volume I/O.
>   */
>  
> -static bool pqi_aio_raid_level_supported(struct pqi_scsi_dev_raid_map_data *rmd)
> +static bool pqi_aio_raid_level_supported(struct pqi_ctrl_info *ctrl_info,
> +       struct pqi_scsi_dev_raid_map_data *rmd)
>  {
>         bool is_supported = true;
>  
> @@ -2245,13 +2250,14 @@ static bool pqi_aio_raid_level_supported(struct pqi_scsi_dev_raid_map_data *rmd)
>         case SA_RAID_0:
>                 break;
>         case SA_RAID_1:
> -               if (rmd->is_write)
> -                       is_supported = false;
> +               is_supported = false;

You disable RAID1 READs with this patch. I can see you fix it again in
05/25, still it looks wrong.

>                 break;
>         case SA_RAID_5:
> -               fallthrough;
> +               if (rmd->is_write && !ctrl_info->enable_r5_writes)
> +                       is_supported = false;
> +               break;
>         case SA_RAID_6:
> -               if (rmd->is_write)
> +               if (rmd->is_write && !ctrl_info->enable_r6_writes)
>                         is_supported = false;
>                 break;
>         case SA_RAID_ADM:
> @@ -2526,6 +2532,26 @@ static int pqi_calc_aio_r5_or_r6(struct pqi_scsi_dev_raid_map_data *rmd,
>                 rmd->total_disks_per_row)) +
>                 (rmd->map_row * rmd->total_disks_per_row) + rmd->first_column;
>  
> +       if (rmd->is_write) {
> +               rmd->p_index = (rmd->map_row * rmd->total_disks_per_row) + rmd->data_disks_per_row;
> +               rmd->p_parity_it_nexus = raid_map->disk_data[rmd->p_index].aio_handle;

I suppose you have made sure rmd->p_index can't be larger than the
size of raid_map->disk_data. A comment explaining that would be helpful
for the reader though.


> +               if (rmd->raid_level == SA_RAID_6) {
> +                       rmd->q_index = (rmd->map_row * rmd->total_disks_per_row) +
> +                               (rmd->data_disks_per_row + 1);
> +                       rmd->q_parity_it_nexus = raid_map->disk_data[rmd->q_index].aio_handle;
> +                       rmd->xor_mult = raid_map->disk_data[rmd->map_index].xor_mult[1];

See above.

> +               }
> +               if (rmd->blocks_per_row == 0)
> +                       return PQI_RAID_BYPASS_INELIGIBLE;
> +#if BITS_PER_LONG == 32
> +               tmpdiv = rmd->first_block;
> +               do_div(tmpdiv, rmd->blocks_per_row);
> +               rmd->row = tmpdiv;
> +#else
> +               rmd->row = rmd->first_block / rmd->blocks_per_row;
> +#endif

Why not always use do_div()?

> +       }
> +
>         return 0;
>  }
>  
> @@ -2567,7 +2593,7 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>  
>         rmd.raid_level = device->raid_level;
>  
> -       if (!pqi_aio_raid_level_supported(&rmd))
> +       if (!pqi_aio_raid_level_supported(ctrl_info, &rmd))
>                 return PQI_RAID_BYPASS_INELIGIBLE;
>  
>         if (unlikely(rmd.block_cnt == 0))
> @@ -2587,7 +2613,8 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>         } else if (device->raid_level == SA_RAID_ADM) {
>                 rc = pqi_calc_aio_raid_adm(&rmd, device);
>         } else if ((device->raid_level == SA_RAID_5 ||
> -               device->raid_level == SA_RAID_6) && rmd.layout_map_count > 1) {
> +               device->raid_level == SA_RAID_6) &&
> +               (rmd.layout_map_count > 1 || rmd.is_write)) {
>                 rc = pqi_calc_aio_r5_or_r6(&rmd, raid_map);
>                 if (rc)
>                         return PQI_RAID_BYPASS_INELIGIBLE;
> @@ -2622,9 +2649,27 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>                 encryption_info_ptr = NULL;
>         }
>  
> -       return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
> +       if (rmd.is_write) {
> +               switch (device->raid_level) {
> +               case SA_RAID_0:
> +                       return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
>                                 rmd.cdb, rmd.cdb_length, queue_group,
>                                 encryption_info_ptr, true);
> +               case SA_RAID_5:
> +               case SA_RAID_6:
> +                       return pqi_aio_submit_r56_write_io(ctrl_info, scmd, queue_group,
> +                                       encryption_info_ptr, device, &rmd);
> +               default:
> +                       return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
> +                               rmd.cdb, rmd.cdb_length, queue_group,
> +                               encryption_info_ptr, true);
> +               }
> +       } else {
> +               return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
> +                       rmd.cdb, rmd.cdb_length, queue_group,
> +                       encryption_info_ptr, true);
> +       }
> +
>  }
>  
>  #define PQI_STATUS_IDLE                0x0
> @@ -4844,6 +4889,12 @@ static void pqi_calculate_queue_resources(struct pqi_ctrl_info *ctrl_info)
>                 PQI_OPERATIONAL_IQ_ELEMENT_LENGTH) /
>                 sizeof(struct pqi_sg_descriptor)) +
>                 PQI_MAX_EMBEDDED_SG_DESCRIPTORS;
> +
> +       ctrl_info->max_sg_per_r56_iu =
> +               ((ctrl_info->max_inbound_iu_length -
> +               PQI_OPERATIONAL_IQ_ELEMENT_LENGTH) /
> +               sizeof(struct pqi_sg_descriptor)) +
> +               PQI_MAX_EMBEDDED_R56_SG_DESCRIPTORS;
>  }
>  
>  static inline void pqi_set_sg_descriptor(
> @@ -4931,6 +4982,44 @@ static int pqi_build_raid_sg_list(struct pqi_ctrl_info *ctrl_info,
>         return 0;
>  }
>  
> +static int pqi_build_aio_r56_sg_list(struct pqi_ctrl_info *ctrl_info,
> +       struct pqi_aio_r56_path_request *request, struct scsi_cmnd *scmd,
> +       struct pqi_io_request *io_request)
> +{
> +       u16 iu_length;
> +       int sg_count;
> +       bool chained;
> +       unsigned int num_sg_in_iu;
> +       struct scatterlist *sg;
> +       struct pqi_sg_descriptor *sg_descriptor;
> +
> +       sg_count = scsi_dma_map(scmd);
> +       if (sg_count < 0)
> +               return sg_count;
> +
> +       iu_length = offsetof(struct pqi_aio_r56_path_request, sg_descriptors) -
> +               PQI_REQUEST_HEADER_LENGTH;
> +       num_sg_in_iu = 0;
> +
> +       if (sg_count == 0)
> +               goto out;

An if {} block would be better readable here.

> +
> +       sg = scsi_sglist(scmd);
> +       sg_descriptor = request->sg_descriptors;
> +
> +       num_sg_in_iu = pqi_build_sg_list(sg_descriptor, sg, sg_count, io_request,
> +               ctrl_info->max_sg_per_r56_iu, &chained);
> +
> +       request->partial = chained;
> +       iu_length += num_sg_in_iu * sizeof(*sg_descriptor);
> +
> +out:
> +       put_unaligned_le16(iu_length, &request->header.iu_length);
> +       request->num_sg_descriptors = num_sg_in_iu;
> +
> +       return 0;
> +}
> +
>  static int pqi_build_aio_sg_list(struct pqi_ctrl_info *ctrl_info,
>         struct pqi_aio_path_request *request, struct scsi_cmnd *scmd,
>         struct pqi_io_request *io_request)
> @@ -5335,6 +5424,88 @@ static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,
>         return 0;
>  }
>  
> +static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info *ctrl_info,
> +       struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
> +       struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
> +       struct pqi_scsi_dev_raid_map_data *rmd)
> +{
> +       int rc;
> +       struct pqi_io_request *io_request;
> +       struct pqi_aio_r56_path_request *r56_request;
> +
> +       io_request = pqi_alloc_io_request(ctrl_info);
> +       io_request->io_complete_callback = pqi_aio_io_complete;
> +       io_request->scmd = scmd;
> +       io_request->raid_bypass = true;
> +
> +       r56_request = io_request->iu;
> +       memset(r56_request, 0, offsetof(struct pqi_aio_r56_path_request, sg_descriptors));
> +
> +       if (device->raid_level == SA_RAID_5 || device->raid_level == SA_RAID_51)
> +               r56_request->header.iu_type = PQI_REQUEST_IU_AIO_PATH_RAID5_IO;
> +       else
> +               r56_request->header.iu_type = PQI_REQUEST_IU_AIO_PATH_RAID6_IO;
> +
> +       put_unaligned_le16(*(u16 *)device->scsi3addr & 0x3fff, &r56_request->volume_id);
> +       put_unaligned_le32(rmd->aio_handle, &r56_request->data_it_nexus);
> +       put_unaligned_le32(rmd->p_parity_it_nexus, &r56_request->p_parity_it_nexus);
> +       if (rmd->raid_level == SA_RAID_6) {
> +               put_unaligned_le32(rmd->q_parity_it_nexus, &r56_request->q_parity_it_nexus);
> +               r56_request->xor_multiplier = rmd->xor_mult;
> +       }
> +       put_unaligned_le32(scsi_bufflen(scmd), &r56_request->data_length);
> +       r56_request->task_attribute = SOP_TASK_ATTRIBUTE_SIMPLE;
> +       put_unaligned_le64(rmd->row, &r56_request->row);
> +
> +       put_unaligned_le16(io_request->index, &r56_request->request_id);
> +       r56_request->error_index = r56_request->request_id;
> +
> +       if (rmd->cdb_length > sizeof(r56_request->cdb))
> +               rmd->cdb_length = sizeof(r56_request->cdb);
> +       r56_request->cdb_length = rmd->cdb_length;
> +       memcpy(r56_request->cdb, rmd->cdb, rmd->cdb_length);
> +
> +       switch (scmd->sc_data_direction) {
> +       case DMA_TO_DEVICE:
> +               r56_request->data_direction = SOP_READ_FLAG;
> +               break;

I wonder how it would be possible that sc_data_direction is anything
else but DMA_TO_DEVICE here. AFAICS we will only reach this code for
WRITE commands. Add a comment, please.


> +       case DMA_FROM_DEVICE:
> +               r56_request->data_direction = SOP_WRITE_FLAG;
> +               break;
> +       case DMA_NONE:
> +               r56_request->data_direction = SOP_NO_DIRECTION_FLAG;
> +               break;
> +       case DMA_BIDIRECTIONAL:
> +               r56_request->data_direction = SOP_BIDIRECTIONAL;
> +               break;
> +       default:
> +               dev_err(&ctrl_info->pci_dev->dev,
> +                       "unknown data direction: %d\n",
> +                       scmd->sc_data_direction);
> +               break;
> +       }
> +
> +       if (encryption_info) {
> +               r56_request->encryption_enable = true;
> +               put_unaligned_le16(encryption_info->data_encryption_key_index,
> +                               &r56_request->data_encryption_key_index);
> +               put_unaligned_le32(encryption_info->encrypt_tweak_lower,
> +                               &r56_request->encrypt_tweak_lower);
> +               put_unaligned_le32(encryption_info->encrypt_tweak_upper,
> +                               &r56_request->encrypt_tweak_upper);
> +       }
> +
> +       rc = pqi_build_aio_r56_sg_list(ctrl_info, r56_request, scmd, io_request);
> +       if (rc) {
> +               pqi_free_io_request(io_request);
> +               return SCSI_MLQUEUE_HOST_BUSY;
> +       }
> +
> +       pqi_start_io(ctrl_info, queue_group, AIO_PATH, io_request);
> +
> +       return 0;
> +}
> +
>  static inline u16 pqi_get_hw_queue(struct pqi_ctrl_info *ctrl_info,
>         struct scsi_cmnd *scmd)
>  {
> @@ -6298,6 +6469,60 @@ static ssize_t pqi_lockup_action_store(struct device *dev,
>         return -EINVAL;
>  }
>  
> +static ssize_t pqi_host_enable_r5_writes_show(struct device *dev,
> +       struct device_attribute *attr, char *buffer)
> +{
> +       struct Scsi_Host *shost = class_to_shost(dev);
> +       struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
> +
> +       return scnprintf(buffer, 10, "%hhx\n", ctrl_info->enable_r5_writes);

"%hhx" is deprecated, see
https://lore.kernel.org/lkml/20190914015858.7c76e036@lwn.net/T/


> +}
> +
> +static ssize_t pqi_host_enable_r5_writes_store(struct device *dev,
> +       struct device_attribute *attr, const char *buffer, size_t count)
> +{
> +       struct Scsi_Host *shost = class_to_shost(dev);
> +       struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
> +       u8 set_r5_writes = 0;
> +
> +       if (kstrtou8(buffer, 0, &set_r5_writes))
> +               return -EINVAL;
> +
> +       if (set_r5_writes > 0)
> +               set_r5_writes = 1;
> +
> +       ctrl_info->enable_r5_writes = set_r5_writes;
> +
> +       return count;
> +}
> +
> +static ssize_t pqi_host_enable_r6_writes_show(struct device *dev,
> +       struct device_attribute *attr, char *buffer)
> +{
> +       struct Scsi_Host *shost = class_to_shost(dev);
> +       struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
> +
> +       return scnprintf(buffer, 10, "%hhx\n", ctrl_info->enable_r6_writes);

See above

> +}
> +
> +static ssize_t pqi_host_enable_r6_writes_store(struct device *dev,
> +       struct device_attribute *attr, const char *buffer, size_t count)
> +{
> +       struct Scsi_Host *shost = class_to_shost(dev);
> +       struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
> +       u8 set_r6_writes = 0;
> +
> +       if (kstrtou8(buffer, 0, &set_r6_writes))
> +               return -EINVAL;
> +
> +       if (set_r6_writes > 0)
> +               set_r6_writes = 1;
> +
> +       ctrl_info->enable_r6_writes = set_r6_writes;
> +
> +       return count;
> +}
> +
>  static DEVICE_ATTR(driver_version, 0444, pqi_driver_version_show, NULL);
>  static DEVICE_ATTR(firmware_version, 0444, pqi_firmware_version_show, NULL);
>  static DEVICE_ATTR(model, 0444, pqi_model_show, NULL);
> @@ -6306,6 +6531,10 @@ static DEVICE_ATTR(vendor, 0444, pqi_vendor_show, NULL);
>  static DEVICE_ATTR(rescan, 0200, NULL, pqi_host_rescan_store);
>  static DEVICE_ATTR(lockup_action, 0644, pqi_lockup_action_show,
>         pqi_lockup_action_store);
> +static DEVICE_ATTR(enable_r5_writes, 0644,
> +       pqi_host_enable_r5_writes_show, pqi_host_enable_r5_writes_store);
> +static DEVICE_ATTR(enable_r6_writes, 0644,
> +       pqi_host_enable_r6_writes_show, pqi_host_enable_r6_writes_store);
>  
>  static struct device_attribute *pqi_shost_attrs[] = {
>         &dev_attr_driver_version,
> @@ -6315,6 +6544,8 @@ static struct device_attribute *pqi_shost_attrs[] = {
>         &dev_attr_vendor,
>         &dev_attr_rescan,
>         &dev_attr_lockup_action,
> +       &dev_attr_enable_r5_writes,
> +       &dev_attr_enable_r6_writes,
>         NULL
>  };
>  
>
Martin Wilck Jan. 7, 2021, 4:44 p.m. UTC | #9
On Thu, 2020-12-10 at 14:34 -0600, Don Brace wrote:
> * Add raid1 write IU.
> * Add in raid1 write support.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>  drivers/scsi/smartpqi/smartpqi.h      |   37 +++++
>  drivers/scsi/smartpqi/smartpqi_init.c |  235 +++++++++++++++++++++++----------
>  2 files changed, 196 insertions(+), 76 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
> index e9844210c4a0..225ec6843c68 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -313,6 +313,36 @@ struct pqi_aio_path_request {
>                 sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
>  };
>  
> +#define PQI_RAID1_NVME_XFER_LIMIT      (32 * 1024)     /* 32 KiB */
> +struct pqi_aio_r1_path_request {
> +       struct pqi_iu_header header;
> +       __le16  request_id;
> +       __le16  volume_id;      /* ID of the RAID volume */
> +       __le32  it_nexus_1;     /* IT nexus of the 1st drive in the RAID volume */
> +       __le32  it_nexus_2;     /* IT nexus of the 2nd drive in the RAID volume */
> +       __le32  it_nexus_3;     /* IT nexus of the 3rd drive in the RAID volume */
> +       __le32  data_length;    /* total bytes to read/write */
> +       u8      data_direction : 2;
> +       u8      partial : 1;
> +       u8      memory_type : 1;
> +       u8      fence : 1;
> +       u8      encryption_enable : 1;
> +       u8      reserved : 2;
> +       u8      task_attribute : 3;
> +       u8      command_priority : 4;
> +       u8      reserved2 : 1;
> +       __le16  data_encryption_key_index;
> +       u8      cdb[16];
> +       __le16  error_index;
> +       u8      num_sg_descriptors;
> +       u8      cdb_length;
> +       u8      num_drives;     /* number of drives in the RAID volume (2 or 3) */
> +       u8      reserved3[3];
> +       __le32  encrypt_tweak_lower;
> +       __le32  encrypt_tweak_upper;
> +       struct pqi_sg_descriptor sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
> +};
> +
>  #define PQI_RAID56_XFER_LIMIT_4K       0x1000 /* 4Kib */
>  #define PQI_RAID56_XFER_LIMIT_8K       0x2000 /* 8Kib */
>  struct pqi_aio_r56_path_request {
> @@ -520,6 +550,7 @@ struct pqi_raid_error_info {
>  #define PQI_REQUEST_IU_AIO_PATH_IO                     0x15
>  #define PQI_REQUEST_IU_AIO_PATH_RAID5_IO               0x18
>  #define PQI_REQUEST_IU_AIO_PATH_RAID6_IO               0x19
> +#define PQI_REQUEST_IU_AIO_PATH_RAID1_IO               0x1A
>  #define PQI_REQUEST_IU_GENERAL_ADMIN                   0x60
>  #define PQI_REQUEST_IU_REPORT_VENDOR_EVENT_CONFIG      0x72
>  #define PQI_REQUEST_IU_SET_VENDOR_EVENT_CONFIG         0x73
> @@ -972,14 +1003,12 @@ struct pqi_scsi_dev_raid_map_data {
>         u16     strip_size;
>         u32     first_group;
>         u32     last_group;
> -       u32     current_group;
>         u32     map_row;
>         u32     aio_handle;
>         u64     disk_block;
>         u32     disk_block_cnt;
>         u8      cdb[16];
>         u8      cdb_length;
> -       int     offload_to_mirror;
>  
>         /* RAID1 specific */
>  #define NUM_RAID1_MAP_ENTRIES 3
> @@ -1040,8 +1069,7 @@ struct pqi_scsi_dev {
>         u16     phys_connector[8];
>         bool    raid_bypass_configured; /* RAID bypass configured */
>         bool    raid_bypass_enabled;    /* RAID bypass enabled */
> -       int     offload_to_mirror;      /* Send next RAID bypass request */
> -                                       /* to mirror drive. */
> +       u32     next_bypass_group;
>         struct raid_map *raid_map;      /* RAID bypass map */
>  
>         struct pqi_sas_port *sas_port;
> @@ -1247,6 +1275,7 @@ struct pqi_ctrl_info {
>         u8              soft_reset_handshake_supported : 1;
>         u8              raid_iu_timeout_supported: 1;
>         u8              tmf_iu_timeout_supported: 1;
> +       u8              enable_r1_writes : 1;
>         u8              enable_r5_writes : 1;
>         u8              enable_r6_writes : 1;
>  
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index c813cec10003..8da9031c9c0b 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -67,6 +67,10 @@ static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,
>         struct scsi_cmnd *scmd, u32 aio_handle, u8 *cdb,
>         unsigned int cdb_length, struct pqi_queue_group *queue_group,
>         struct pqi_encryption_info *encryption_info, bool raid_bypass);
> +static  int pqi_aio_submit_r1_write_io(struct pqi_ctrl_info *ctrl_info,
> +       struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
> +       struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
> +       struct pqi_scsi_dev_raid_map_data *rmd);
>  static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info *ctrl_info,
>         struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
>         struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
> @@ -1717,7 +1721,7 @@ static void pqi_scsi_update_device(struct pqi_scsi_dev *existing_device,
>                 sizeof(existing_device->box));
>         memcpy(existing_device->phys_connector, new_device->phys_connector,
>                 sizeof(existing_device->phys_connector));
> -       existing_device->offload_to_mirror = 0;
> +       existing_device->next_bypass_group = 0;
>         kfree(existing_device->raid_map);
>         existing_device->raid_map = new_device->raid_map;
>         existing_device->raid_bypass_configured =
> @@ -2250,7 +2254,10 @@ static bool pqi_aio_raid_level_supported(struct pqi_ctrl_info *ctrl_info,
>         case SA_RAID_0:
>                 break;
>         case SA_RAID_1:
> -               is_supported = false;
> +               fallthrough;

Nit: fallthrough isn't necessary here.

> +       case SA_RAID_ADM:

> +               if (rmd->is_write && !ctrl_info->enable_r1_writes)

> +                       is_supported = false;

>                 break;

>         case SA_RAID_5:

>                 if (rmd->is_write && !ctrl_info->enable_r5_writes)

> @@ -2260,10 +2267,6 @@ static bool

> pqi_aio_raid_level_supported(struct pqi_ctrl_info *ctrl_info,

>                 if (rmd->is_write && !ctrl_info->enable_r6_writes)

>                         is_supported = false;

>                 break;

> -       case SA_RAID_ADM:

> -               if (rmd->is_write)

> -                       is_supported = false;

> -               break;

>         default:

>                 is_supported = false;

>         }

> @@ -2385,64 +2388,6 @@ static int

> pci_get_aio_common_raid_map_values(struct pqi_ctrl_info *ctrl_info,

>         return 0;

>  }

>  

> -static int pqi_calc_aio_raid_adm(struct pqi_scsi_dev_raid_map_data

> *rmd,

> -                               struct pqi_scsi_dev *device)

> -{

> -       /* RAID ADM */

> -       /*

> -        * Handles N-way mirrors  (R1-ADM) and R10 with # of drives

> -        * divisible by 3.

> -        */

> -       rmd->offload_to_mirror = device->offload_to_mirror;

> -

> -       if (rmd->offload_to_mirror == 0)  {

> -               /* use physical disk in the first mirrored group. */

> -               rmd->map_index %= rmd->data_disks_per_row;

> -       } else {

> -               do {

> -                       /*

> -                        * Determine mirror group that map_index

> -                        * indicates.

> -                        */

> -                       rmd->current_group =

> -                               rmd->map_index / rmd-

> >data_disks_per_row;

> -

> -                       if (rmd->offload_to_mirror !=

> -                                       rmd->current_group) {

> -                               if (rmd->current_group <

> -                                       rmd->layout_map_count - 1) {

> -                                       /*

> -                                        * Select raid index from

> -                                        * next group.

> -                                        */

> -                                       rmd->map_index += rmd-

> >data_disks_per_row;

> -                                       rmd->current_group++;

> -                               } else {

> -                                       /*

> -                                        * Select raid index from

> first

> -                                        * group.

> -                                        */

> -                                       rmd->map_index %= rmd-

> >data_disks_per_row;

> -                                       rmd->current_group = 0;

> -                               }

> -                       }

> -               } while (rmd->offload_to_mirror != rmd-

> >current_group);

> -       }

> -

> -       /* Set mirror group to use next time. */

> -       rmd->offload_to_mirror =

> -               (rmd->offload_to_mirror >= rmd->layout_map_count - 1)

> ?

> -                       0 : rmd->offload_to_mirror + 1;

> -       device->offload_to_mirror = rmd->offload_to_mirror;

> -       /*

> -        * Avoid direct use of device->offload_to_mirror within this

> -        * function since multiple threads might simultaneously

> -        * increment it beyond the range of device->layout_map_count

> -1.

> -        */

> -

> -       return 0;

> -}

> -

>  static int pqi_calc_aio_r5_or_r6(struct pqi_scsi_dev_raid_map_data

> *rmd,

>                                 struct raid_map *raid_map)

>  {

> @@ -2577,12 +2522,34 @@ static void pqi_set_aio_cdb(struct

> pqi_scsi_dev_raid_map_data *rmd)

>         }

>  }

>  

> +static void pqi_calc_aio_r1_nexus(struct raid_map *raid_map,

> +                               struct pqi_scsi_dev_raid_map_data

> *rmd)

> +{

> +       u32 index;

> +       u32 group;

> +

> +       group = rmd->map_index / rmd->data_disks_per_row;

> +

> +       index = rmd->map_index - (group * rmd->data_disks_per_row);

> +       rmd->it_nexus[0] = raid_map->disk_data[index].aio_handle;

> +       index += rmd->data_disks_per_row;

> +       rmd->it_nexus[1] = raid_map->disk_data[index].aio_handle;

> +       if (rmd->layout_map_count > 2) {

> +               index += rmd->data_disks_per_row;

> +               rmd->it_nexus[2] = raid_map-

> >disk_data[index].aio_handle;

> +       }

> +

> +       rmd->num_it_nexus_entries = rmd->layout_map_count;

> +}

> +

>  static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info

> *ctrl_info,

>         struct pqi_scsi_dev *device, struct scsi_cmnd *scmd,

>         struct pqi_queue_group *queue_group)

>  {

> -       struct raid_map *raid_map;

>         int rc;

> +       struct raid_map *raid_map;

> +       u32 group;

> +       u32 next_bypass_group;

>         struct pqi_encryption_info *encryption_info_ptr;

>         struct pqi_encryption_info encryption_info;

>         struct pqi_scsi_dev_raid_map_data rmd = {0};

> @@ -2605,13 +2572,18 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,

>         if (rc)

>                 return PQI_RAID_BYPASS_INELIGIBLE;

>  

> -       /* RAID 1 */

> -       if (device->raid_level == SA_RAID_1) {

> -               if (device->offload_to_mirror)

> -                       rmd.map_index += rmd.data_disks_per_row;

> -               device->offload_to_mirror = !device->offload_to_mirror;

> -       } else if (device->raid_level == SA_RAID_ADM) {

> -               rc = pqi_calc_aio_raid_adm(&rmd, device);

> +       if (device->raid_level == SA_RAID_1 ||

> +               device->raid_level == SA_RAID_ADM) {

> +               if (rmd.is_write) {

> +                       pqi_calc_aio_r1_nexus(raid_map, &rmd);

> +               } else {

> +                       group = device->next_bypass_group;

> +                       next_bypass_group = group + 1;

> +                       if (next_bypass_group >= rmd.layout_map_count)

> +                               next_bypass_group = 0;

> +                       device->next_bypass_group = next_bypass_group;

> +                       rmd.map_index += group * rmd.data_disks_per_row;

> +               }
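[Editor's note: the read-path rotation in the hunk above can be exercised in isolation. This is a hedged sketch, not driver code; `next_group` and the bare integers are illustrative stand-ins for the `next_bypass_group` / `layout_map_count` fields.]

```c
/* Stand-alone model of the next_bypass_group rotation quoted above:
 * RAID 1 / RAID ADM reads alternate across mirror groups, wrapping
 * back to group 0 once layout_map_count is reached. */
static unsigned int next_group(unsigned int group,
			       unsigned int layout_map_count)
{
	unsigned int next = group + 1;

	if (next >= layout_map_count)
		next = 0;
	return next;
}
```

With a two-way mirror this alternates 0, 1, 0, ...; with a three-way (RAID ADM) map it cycles 0, 1, 2, 0, ...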

>         } else if ((device->raid_level == SA_RAID_5 ||

>                 device->raid_level == SA_RAID_6) &&

>                 (rmd.layout_map_count > 1 || rmd.is_write)) {

> @@ -2655,6 +2627,10 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,

>                         return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,

>                                 rmd.cdb, rmd.cdb_length, queue_group,

>                                 encryption_info_ptr, true);

> +               case SA_RAID_1:

> +               case SA_RAID_ADM:

> +                       return pqi_aio_submit_r1_write_io(ctrl_info, scmd, queue_group,

> +                               encryption_info_ptr, device, &rmd);

>                 case SA_RAID_5:

>                 case SA_RAID_6:

>                         return pqi_aio_submit_r56_write_io(ctrl_info, scmd, queue_group,

> @@ -4982,6 +4958,44 @@ static int pqi_build_raid_sg_list(struct pqi_ctrl_info *ctrl_info,

>         return 0;

>  }

>  

> +static int pqi_build_aio_r1_sg_list(struct pqi_ctrl_info *ctrl_info,

> +       struct pqi_aio_r1_path_request *request, struct scsi_cmnd *scmd,

> +       struct pqi_io_request *io_request)

> +{

> +       u16 iu_length;

> +       int sg_count;

> +       bool chained;

> +       unsigned int num_sg_in_iu;

> +       struct scatterlist *sg;

> +       struct pqi_sg_descriptor *sg_descriptor;

> +

> +       sg_count = scsi_dma_map(scmd);

> +       if (sg_count < 0)

> +               return sg_count;

> +

> +       iu_length = offsetof(struct pqi_aio_r1_path_request, sg_descriptors) -

> +               PQI_REQUEST_HEADER_LENGTH;

> +       num_sg_in_iu = 0;

> +

> +       if (sg_count == 0)

> +               goto out;

> +

> +       sg = scsi_sglist(scmd);

> +       sg_descriptor = request->sg_descriptors;

> +

> +       num_sg_in_iu = pqi_build_sg_list(sg_descriptor, sg, sg_count, io_request,

> +               ctrl_info->max_sg_per_iu, &chained);

> +

> +       request->partial = chained;

> +       iu_length += num_sg_in_iu * sizeof(*sg_descriptor);

> +

> +out:

> +       put_unaligned_le16(iu_length, &request->header.iu_length);

> +       request->num_sg_descriptors = num_sg_in_iu;

> +

> +       return 0;

> +}

> +
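[Editor's note: the `iu_length` bookkeeping in `pqi_build_aio_r1_sg_list()` above reduces to simple arithmetic. The sketch below restates it with made-up sizes; the real values come from `offsetof()` on the firmware request structure, so every constant here is an assumption.]

```c
#include <stdint.h>

/* Illustrative sizes only -- the driver derives these from
 * struct pqi_aio_r1_path_request and PQI_REQUEST_HEADER_LENGTH. */
#define FIXED_PART_UP_TO_SG	64	/* assumed offsetof(..., sg_descriptors) */
#define REQUEST_HEADER_LEN	4	/* assumed PQI_REQUEST_HEADER_LENGTH */
#define SG_DESCRIPTOR_LEN	16	/* assumed sizeof(struct pqi_sg_descriptor) */

/* iu_length = fixed part of the request minus the common header,
 * plus one descriptor per SG element actually placed in the IU. */
static uint16_t r1_iu_length(unsigned int num_sg_in_iu)
{
	return (FIXED_PART_UP_TO_SG - REQUEST_HEADER_LEN) +
		num_sg_in_iu * SG_DESCRIPTOR_LEN;
}
```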

>  static int pqi_build_aio_r56_sg_list(struct pqi_ctrl_info *ctrl_info,

>         struct pqi_aio_r56_path_request *request, struct scsi_cmnd *scmd,

>         struct pqi_io_request *io_request)

> @@ -5424,6 +5438,83 @@ static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,

>         return 0;

>  }

>  

> +static  int pqi_aio_submit_r1_write_io(struct pqi_ctrl_info *ctrl_info,

> +       struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,

> +       struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,

> +       struct pqi_scsi_dev_raid_map_data *rmd)

> +

> +{

> +       int rc;

> +       struct pqi_io_request *io_request;

> +       struct pqi_aio_r1_path_request *r1_request;

> +

> +       io_request = pqi_alloc_io_request(ctrl_info);

> +       io_request->io_complete_callback = pqi_aio_io_complete;

> +       io_request->scmd = scmd;

> +       io_request->raid_bypass = true;

> +

> +       r1_request = io_request->iu;

> +       memset(r1_request, 0, offsetof(struct pqi_aio_r1_path_request, sg_descriptors));

> +

> +       r1_request->header.iu_type = PQI_REQUEST_IU_AIO_PATH_RAID1_IO;

> +

> +       put_unaligned_le16(*(u16 *)device->scsi3addr & 0x3fff, &r1_request->volume_id);

> +       r1_request->num_drives = rmd->num_it_nexus_entries;

> +       put_unaligned_le32(rmd->it_nexus[0], &r1_request->it_nexus_1);

> +       put_unaligned_le32(rmd->it_nexus[1], &r1_request->it_nexus_2);

> +       if (rmd->num_it_nexus_entries == 3)

> +               put_unaligned_le32(rmd->it_nexus[2], &r1_request->it_nexus_3);

> +

> +       put_unaligned_le32(scsi_bufflen(scmd), &r1_request->data_length);

> +       r1_request->task_attribute = SOP_TASK_ATTRIBUTE_SIMPLE;

> +       put_unaligned_le16(io_request->index, &r1_request->request_id);

> +       r1_request->error_index = r1_request->request_id;

> +       if (rmd->cdb_length > sizeof(r1_request->cdb))

> +               rmd->cdb_length = sizeof(r1_request->cdb);

> +       r1_request->cdb_length = rmd->cdb_length;

> +       memcpy(r1_request->cdb, rmd->cdb, rmd->cdb_length);

> +

> +       switch (scmd->sc_data_direction) {

> +       case DMA_TO_DEVICE:

> +               r1_request->data_direction = SOP_READ_FLAG;

> +               break;


Same question as for the previous patch: how could anything other than
DMA_TO_DEVICE be possible here?

> +       case DMA_FROM_DEVICE:

> +               r1_request->data_direction = SOP_WRITE_FLAG;

> +               break;

> +       case DMA_NONE:

> +               r1_request->data_direction = SOP_NO_DIRECTION_FLAG;

> +               break;

> +       case DMA_BIDIRECTIONAL:

> +               r1_request->data_direction = SOP_BIDIRECTIONAL;

> +               break;

> +       default:

> +               dev_err(&ctrl_info->pci_dev->dev,

> +                       "unknown data direction: %d\n",

> +                       scmd->sc_data_direction);

> +               break;

> +       }

> +

> +       if (encryption_info) {

> +               r1_request->encryption_enable = true;

> +               put_unaligned_le16(encryption_info->data_encryption_key_index,

> +                               &r1_request->data_encryption_key_index);

> +               put_unaligned_le32(encryption_info->encrypt_tweak_lower,

> +                               &r1_request->encrypt_tweak_lower);

> +               put_unaligned_le32(encryption_info->encrypt_tweak_upper,

> +                               &r1_request->encrypt_tweak_upper);

> +       }

> +

> +       rc = pqi_build_aio_r1_sg_list(ctrl_info, r1_request, scmd, io_request);

> +       if (rc) {

> +               pqi_free_io_request(io_request);

> +               return SCSI_MLQUEUE_HOST_BUSY;

> +       }

> +

> +       pqi_start_io(ctrl_info, queue_group, AIO_PATH, io_request);

> +

> +       return 0;

> +}

> +

>  static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info *ctrl_info,

>         struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,

>         struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,

>
Martin Wilck Jan. 7, 2021, 4:44 p.m. UTC | #10
On Thu, 2020-12-10 at 14:34 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>

> 

> * Determine support for supported features from

>   BMIC sense feature command instead of config table.

> 

> Reviewed-by: Scott Teel <scott.teel@microchip.com>

> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>

> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>

> Signed-off-by: Don Brace <don.brace@microchip.com>

> ---

>  drivers/scsi/smartpqi/smartpqi.h      |   77 +++++++-

>  drivers/scsi/smartpqi/smartpqi_init.c |  328 +++++++++++++++++++++++----

>  2 files changed, 363 insertions(+), 42 deletions(-)

> 


In general: This patch contains a lot of whitespace, indentation, and
minor comment formatting changes which should rather go into a separate
patch IMHO. This one is big enough without them.

Further remarks below.

> [...]

> 

> @@ -2552,7 +2686,7 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,

>         u32 next_bypass_group;

>         struct pqi_encryption_info *encryption_info_ptr;

>         struct pqi_encryption_info encryption_info;

> -       struct pqi_scsi_dev_raid_map_data rmd = {0};

> +       struct pqi_scsi_dev_raid_map_data rmd = { 0 };

>  

>         rc = pqi_get_aio_lba_and_block_count(scmd, &rmd);

>         if (rc)

> @@ -2613,7 +2747,9 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,

>         pqi_set_aio_cdb(&rmd);

>  

>         if (get_unaligned_le16(&raid_map->flags) &

> -               RAID_MAP_ENCRYPTION_ENABLED) {

> +                       RAID_MAP_ENCRYPTION_ENABLED) {

> +               if (rmd.data_length > device->max_transfer_encrypted)

> +                       return PQI_RAID_BYPASS_INELIGIBLE;

>                 pqi_set_encryption_info(&encryption_info, raid_map,

>                         rmd.first_block);

>                 encryption_info_ptr = &encryption_info;

> @@ -2623,10 +2759,6 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,

>  


This hunk is fine, but AFAICS it doesn't belong here logically; it
should rather be part of patches 04 and 05.

>         if (rmd.is_write) {

>                 switch (device->raid_level) {

> -               case SA_RAID_0:

> -                       return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,

> -                               rmd.cdb, rmd.cdb_length, queue_group,

> -                               encryption_info_ptr, true);

>                 case SA_RAID_1:

>                 case SA_RAID_ADM:

>                         return pqi_aio_submit_r1_write_io(ctrl_info, scmd, queue_group,

> @@ -2635,17 +2767,12 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,

>                 case SA_RAID_6:

>                         return pqi_aio_submit_r56_write_io(ctrl_info, scmd, queue_group,

>                                         encryption_info_ptr, device, &rmd);

> -               default:

> -                       return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,

> -                               rmd.cdb, rmd.cdb_length, queue_group,

> -                               encryption_info_ptr, true);

>                 }

> -       } else {

> -               return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,

> -                       rmd.cdb, rmd.cdb_length, queue_group,

> -                       encryption_info_ptr, true);

>         }

>  

> +       return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,

> +               rmd.cdb, rmd.cdb_length, queue_group,

> +               encryption_info_ptr, true);

>  }

>  

>  #define PQI_STATUS_IDLE                0x0

> @@ -7209,6 +7336,7 @@ static int pqi_enable_firmware_features(struct pqi_ctrl_info *ctrl_info,

>  {

>         void *features_requested;

>         void __iomem *features_requested_iomem_addr;

> +       void __iomem *host_max_known_feature_iomem_addr;

>  

>         features_requested = firmware_features->features_supported +

>                 le16_to_cpu(firmware_features->num_elements);

> @@ -7219,6 +7347,16 @@ static int pqi_enable_firmware_features(struct pqi_ctrl_info *ctrl_info,

>         memcpy_toio(features_requested_iomem_addr, features_requested,

>                 le16_to_cpu(firmware_features->num_elements));

>  

> +       if (pqi_is_firmware_feature_supported(firmware_features,

> +               PQI_FIRMWARE_FEATURE_MAX_KNOWN_FEATURE)) {

> +               host_max_known_feature_iomem_addr =

> +                       features_requested_iomem_addr +

> +                       (le16_to_cpu(firmware_features->num_elements) * 2) +

> +                       sizeof(__le16);

> +               writew(PQI_FIRMWARE_FEATURE_MAXIMUM,

> +                       host_max_known_feature_iomem_addr);

> +       }

> +
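[Editor's note: the address arithmetic in the hunk above can be restated as a plain byte offset relative to `features_requested_iomem_addr`. The `* 2` and the trailing `__le16` mirror the quoted code; the interpretation of those terms is my assumption, not taken from the PQI spec.]

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset, relative to the features-requested area, at which the
 * quoted hunk writes PQI_FIRMWARE_FEATURE_MAXIMUM: two bytes per
 * feature element plus one __le16-sized field. */
static size_t host_max_known_feature_offset(size_t num_elements)
{
	return num_elements * 2 + sizeof(uint16_t);
}
```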

>         return pqi_config_table_update(ctrl_info,

>                 PQI_CONFIG_TABLE_SECTION_FIRMWARE_FEATURES,

>                 PQI_CONFIG_TABLE_SECTION_FIRMWARE_FEATURES);

> @@ -7256,6 +7394,15 @@ static void pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,

>         struct pqi_firmware_feature *firmware_feature)

>  {

>         switch (firmware_feature->feature_bit) {

> +       case PQI_FIRMWARE_FEATURE_RAID_1_WRITE_BYPASS:

> +               ctrl_info->enable_r1_writes = firmware_feature->enabled;

> +               break;

> +       case PQI_FIRMWARE_FEATURE_RAID_5_WRITE_BYPASS:

> +               ctrl_info->enable_r5_writes = firmware_feature->enabled;

> +               break;

> +       case PQI_FIRMWARE_FEATURE_RAID_6_WRITE_BYPASS:

> +               ctrl_info->enable_r6_writes = firmware_feature->enabled;

> +               break;

>         case PQI_FIRMWARE_FEATURE_SOFT_RESET_HANDSHAKE:

>                 ctrl_info->soft_reset_handshake_supported =

>                         firmware_feature->enabled;

> @@ -7293,6 +7440,51 @@ static struct pqi_firmware_feature pqi_firmware_features[] = {

>                 .feature_bit = PQI_FIRMWARE_FEATURE_SMP,

>                 .feature_status = pqi_firmware_feature_status,

>         },

> +       {

> +               .feature_name = "Maximum Known Feature",

> +               .feature_bit = PQI_FIRMWARE_FEATURE_MAX_KNOWN_FEATURE,

> +               .feature_status = pqi_firmware_feature_status,

> +       },

> +       {

> +               .feature_name = "RAID 0 Read Bypass",

> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_0_READ_BYPASS,

> +               .feature_status = pqi_firmware_feature_status,

> +       },

> +       {

> +               .feature_name = "RAID 1 Read Bypass",

> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_1_READ_BYPASS,

> +               .feature_status = pqi_firmware_feature_status,

> +       },

> +       {

> +               .feature_name = "RAID 5 Read Bypass",

> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_5_READ_BYPASS,

> +               .feature_status = pqi_firmware_feature_status,

> +       },

> +       {

> +               .feature_name = "RAID 6 Read Bypass",

> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_6_READ_BYPASS,

> +               .feature_status = pqi_firmware_feature_status,

> +       },

> +       {

> +               .feature_name = "RAID 0 Write Bypass",

> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_0_WRITE_BYPASS,

> +               .feature_status = pqi_firmware_feature_status,

> +       },

> +       {

> +               .feature_name = "RAID 1 Write Bypass",

> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_1_WRITE_BYPASS,

> +               .feature_status = pqi_ctrl_update_feature_flags,

> +       },

> +       {

> +               .feature_name = "RAID 5 Write Bypass",

> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_5_WRITE_BYPASS,

> +               .feature_status = pqi_ctrl_update_feature_flags,

> +       },

> +       {

> +               .feature_name = "RAID 6 Write Bypass",

> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_6_WRITE_BYPASS,

> +               .feature_status = pqi_ctrl_update_feature_flags,

> +       },

>         {

>                 .feature_name = "New Soft Reset Handshake",

>                 .feature_bit = PQI_FIRMWARE_FEATURE_SOFT_RESET_HANDSHAKE,

> @@ -7667,6 +7859,17 @@ static int pqi_ctrl_init(struct pqi_ctrl_info *ctrl_info)

>  

>         pqi_start_heartbeat_timer(ctrl_info);

>  

> +       if (ctrl_info->enable_r5_writes || ctrl_info->enable_r6_writes) {

> +               rc = pqi_get_advanced_raid_bypass_config(ctrl_info);

> +               if (rc) {

> +                       dev_err(&ctrl_info->pci_dev->dev,

> +                               "error obtaining advanced RAID bypass configuration\n");

> +                       return rc;

> +               }

> +               ctrl_info->ciss_report_log_flags |=

> +                       CISS_REPORT_LOG_FLAG_DRIVE_TYPE_MIX;

> +       }

> +

>         rc = pqi_enable_events(ctrl_info);

>         if (rc) {

>                 dev_err(&ctrl_info->pci_dev->dev,

> @@ -7822,6 +8025,17 @@ static int pqi_ctrl_init_resume(struct pqi_ctrl_info *ctrl_info)

>  

>         pqi_start_heartbeat_timer(ctrl_info);

>  

> +       if (ctrl_info->enable_r5_writes || ctrl_info->enable_r6_writes) {

> +               rc = pqi_get_advanced_raid_bypass_config(ctrl_info);

> +               if (rc) {

> +                       dev_err(&ctrl_info->pci_dev->dev,

> +                               "error obtaining advanced RAID bypass configuration\n");

> +                       return rc;


Do you need to error out here? Can't you simply unset the
enable_rX_writes feature?


Regards
Martin
Martin Wilck Jan. 7, 2021, 4:44 p.m. UTC | #11
On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>

> 

> The specification for AIO Sub-Page (0x02) has changed slightly.

> * bring the driver into conformance with the spec.

> 

> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>

> Reviewed-by: Scott Teel <scott.teel@microchip.com>

> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>

> Signed-off-by: Don Brace <don.brace@microchip.com>

> ---

>  drivers/scsi/smartpqi/smartpqi.h      |   12 ++++---

>  drivers/scsi/smartpqi/smartpqi_init.c |   60 +++++++++++++++++++++------------

>  2 files changed, 47 insertions(+), 25 deletions(-)

> 

> diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h

> index 31281cddadfe..eb23c3cf59c0 100644

> --- a/drivers/scsi/smartpqi/smartpqi.h

> +++ b/drivers/scsi/smartpqi/smartpqi.h

> @@ -1028,13 +1028,13 @@ struct pqi_scsi_dev_raid_map_data {

>         u8      cdb_length;

>  

>         /* RAID 1 specific */

> -#define NUM_RAID1_MAP_ENTRIES 3

> +#define NUM_RAID1_MAP_ENTRIES  3

>         u32     num_it_nexus_entries;

>         u32     it_nexus[NUM_RAID1_MAP_ENTRIES];

>  

>         /* RAID 5 / RAID 6 specific */

> -       u32     p_parity_it_nexus; /* aio_handle */

> -       u32     q_parity_it_nexus; /* aio_handle */

> +       u32     p_parity_it_nexus;      /* aio_handle */

> +       u32     q_parity_it_nexus;      /* aio_handle */

>         u8      xor_mult;

>         u64     row;

>         u64     stripe_lba;

> @@ -1044,6 +1044,7 @@ struct pqi_scsi_dev_raid_map_data {

>  

>  #define RAID_CTLR_LUNID                "\0\0\0\0\0\0\0\0"

>  

> +

>  struct pqi_scsi_dev {

>         int     devtype;                /* as reported by INQUIRY

> commmand */

>         u8      device_type;            /* as reported by */

> @@ -1302,7 +1303,8 @@ struct pqi_ctrl_info {

>         u32             max_transfer_encrypted_sas_sata;

>         u32             max_transfer_encrypted_nvme;

>         u32             max_write_raid_5_6;

> -

> +       u32             max_write_raid_1_10_2drive;

> +       u32             max_write_raid_1_10_3drive;

>  

>         struct list_head scsi_device_list;

>         spinlock_t      scsi_device_list_lock;

> @@ -1533,6 +1535,8 @@ struct bmic_sense_feature_io_page_aio_subpage {

>         __le16  max_transfer_encrypted_sas_sata;

>         __le16  max_transfer_encrypted_nvme;

>         __le16  max_write_raid_5_6;

> +       __le16  max_write_raid_1_10_2drive;

> +       __le16  max_write_raid_1_10_3drive;

>  };

>  

>  struct bmic_smp_request {

> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c

> index aa21c1cd2cac..419887aa8ff3 100644

> --- a/drivers/scsi/smartpqi/smartpqi_init.c

> +++ b/drivers/scsi/smartpqi/smartpqi_init.c

> @@ -696,6 +696,19 @@ static int pqi_identify_physical_device(struct pqi_ctrl_info *ctrl_info,

>         return rc;

>  }

>  

> +static inline u32 pqi_aio_limit_to_bytes(__le16 *limit)

> +{

> +       u32 bytes;

> +

> +       bytes = get_unaligned_le16(limit);

> +       if (bytes == 0)

> +               bytes = ~0;

> +       else

> +               bytes *= 1024;

> +

> +       return bytes;

> +}


Nice, but this function and its callers belong in patch 06/25.
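[Editor's note: for reference, the conversion being reviewed can be exercised stand-alone. This sketch mirrors the quoted `pqi_aio_limit_to_bytes()` minus the kernel's unaligned-access wrapper; it is illustrative, not the driver's code verbatim.]

```c
#include <stdint.h>

/* The controller reports each transfer limit in KiB; a value of zero
 * means "no limit", which the helper encodes as all ones. */
static uint32_t aio_limit_to_bytes(uint16_t limit_kib)
{
	uint32_t bytes = limit_kib;

	if (bytes == 0)
		bytes = ~0U;	/* unlimited */
	else
		bytes *= 1024;	/* KiB -> bytes */

	return bytes;
}
```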


> +

>  #pragma pack(1)

>  

>  struct bmic_sense_feature_buffer {

> @@ -707,11 +720,11 @@ struct bmic_sense_feature_buffer {

>  

>  #define MINIMUM_AIO_SUBPAGE_BUFFER_LENGTH      \

>         offsetofend(struct bmic_sense_feature_buffer, \

> -               aio_subpage.max_write_raid_5_6)

> +               aio_subpage.max_write_raid_1_10_3drive)

>  

>  #define MINIMUM_AIO_SUBPAGE_LENGTH     \

>         (offsetofend(struct bmic_sense_feature_io_page_aio_subpage, \

> -               max_write_raid_5_6) - \

> +               max_write_raid_1_10_3drive) - \

>                 sizeof_field(struct bmic_sense_feature_io_page_aio_subpage, header))

>  

>  static int pqi_get_advanced_raid_bypass_config(struct pqi_ctrl_info *ctrl_info)

> @@ -753,33 +766,28 @@ static int pqi_get_advanced_raid_bypass_config(struct pqi_ctrl_info *ctrl_info)

>                         BMIC_SENSE_FEATURE_IO_PAGE_AIO_SUBPAGE ||

>                 get_unaligned_le16(&buffer->aio_subpage.header.page_length) <

>                         MINIMUM_AIO_SUBPAGE_LENGTH) {

> -               rc = -EINVAL;


This should be changed in 06/25.

>                 goto error;

>         }

>  


Regards
Martin
Martin Wilck Jan. 7, 2021, 11:43 p.m. UTC | #12
On Tue, 2020-12-15 at 20:23 +0000, Don.Brace@microchip.com wrote:
> Please see answers below. Hope this helps.

> 

> -----Original Message-----

> From: Paul Menzel [mailto:pmenzel@molgen.mpg.de] 

> Sent: Monday, December 14, 2020 11:54 AM

> To: Don Brace - C33706 <Don.Brace@microchip.com>; Kevin Barnett -

> C33748 <Kevin.Barnett@microchip.com>; Scott Teel - C33730 <

> Scott.Teel@microchip.com>; Justin Lindley - C33718 <

> Justin.Lindley@microchip.com>; Scott Benesh - C33703 <

> Scott.Benesh@microchip.com>; Gerry Morong - C33720 <

> Gerry.Morong@microchip.com>; Mahesh Rajashekhara - I30583 <

> Mahesh.Rajashekhara@microchip.com>; hch@infradead.org; 

> joseph.szczypek@hpe.com; POSWALD@suse.com; James E. J. Bottomley <

> jejb@linux.ibm.com>; Martin K. Petersen <martin.petersen@oracle.com>

> Cc: linux-scsi@vger.kernel.org; it+linux-scsi@molgen.mpg.de; Donald

> Buczek <buczek@molgen.mpg.de>; Greg KH <gregkh@linuxfoundation.org>

> Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit

> 

> EXTERNAL EMAIL: Do not click links or open attachments unless you

> know the content is safe

> 

> Dear Don, dear Mahesh,

> 

> 

> Am 10.12.20 um 21:35 schrieb Don Brace:

> > From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>

> > 

> > * Correct scsi-mid-layer sending more requests than

> >    exposed host Q depth causing firmware ASSERT issue.

> >    * Add host Qdepth counter.

> 

> This supposedly fixes the regression between Linux 5.4 and 5.9, which

> we reported in [1].

> 

>      kernel: smartpqi 0000:89:00.0: controller is offline: status

> code 0x6100c

>      kernel: smartpqi 0000:89:00.0: controller offline

> 

> Thank you for looking into this issue and fixing it. We are going to

> test this.

> 

> For easily finding these things in the git history or the WWW, it

> would be great if these log messages could be included (in the

> future).

> DON> Thanks for your suggestion. Well add them in the next time.

> 

> Also, that means, that the regression is still present in Linux 5.10,

> released yesterday, and this commit does not apply to these versions.

> 

> DON> They have started 5.10-RC7 now. So possibly 5.11 or 5.12

> depending when all of the patches are applied. The patch in question

> is among 28 other patches.

> 

> Mahesh, do you have any idea, what commit caused the regression and

> why the issue started to show up?

> DON> The smartpqi driver sets two scsi_host_template member fields:

> .can_queue and .nr_hw_queues. But we have not yet converted to

> host_tagset. So the queue_depth becomes nr_hw_queues * can_queue,

> which is more than the hw can support. That can be verified by

> looking at scsi_host.h.

>         /*

>          * In scsi-mq mode, the number of hardware queues supported by the LLD.

>          *

>          * Note: it is assumed that each hardware queue has a queue depth of

>          * can_queue. In other words, the total queue depth per host

>          * is nr_hw_queues * can_queue. However, for when host_tagset is set,

>          * the total queue depth is can_queue.

>          */

> 

> So, until we make this change, the queue_depth change prevents the

> above issue from happening.


can_queue and nr_hw_queues have been set like this for as long as the
driver has existed. Why did Paul observe a regression with 5.9?

And why can't you simply set can_queue to 
(ctrl_info->scsi_ml_can_queue / nr_hw_queues)?

Regards,
Martin
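[Editor's note: to make the arithmetic in this exchange concrete, here is a sketch of the tag-count model being discussed. The numbers are illustrative; `scsi_ml_can_queue` stands in for the controller's real command limit.]

```c
/* Without host_tagset, blk-mq gives each of the nr_hw_queues hardware
 * queues its own tag set of can_queue tags, so the controller can see
 * up to the product in flight. Dividing, as Martin suggests, caps the
 * product at the controller limit. */
static int total_possible_tags(int can_queue, int nr_hw_queues)
{
	return can_queue * nr_hw_queues;
}

static int capped_can_queue(int scsi_ml_can_queue, int nr_hw_queues)
{
	return scsi_ml_can_queue / nr_hw_queues;
}
```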
Martin Wilck Jan. 8, 2021, 12:13 a.m. UTC | #13
On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>

> 

> * Non-functional changes.

> * Reduce differences between out-of-box driver and

>   kernel.org driver.

> 

> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>

> Reviewed-by: Scott Teel <scott.teel@microchip.com>

> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>

> Signed-off-by: Don Brace <don.brace@microchip.com>


Reviewed-by: Martin Wilck <mwilck@suse.com>
Martin Wilck Jan. 8, 2021, 12:13 a.m. UTC | #14
On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> * Allow R5/R6 stream detection to be disabled/enabled.

>   using sysfs entry enable_stream_detection.

> 

> Example usage:

> 

> lsscsi

> [2:2:0:0]    storage Adaptec  3258P-32i /e     0010

>  ^

>  |

>  +---- NOTE: here host is host2

> 

> find /sys -name \*enable_stream\*

> /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection

> /sys/devices/pci0000:5b/0000:5b:00.0/0000:5c:00.0/host3/scsi_host/host3/enable_stream_detection

> 

> Current stream detection:

> cat /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection

> 1

> 

> Turn off stream detection:

> echo 0 > /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection

> 

> Turn on stream detection:

> echo 1 > /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection

> 

> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>

> Reviewed-by: Scott Teel <scott.teel@microchip.com>

> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>

> Signed-off-by: Don Brace <don.brace@microchip.com>


Nitpick below, but

Reviewed-by: Martin Wilck <mwilck@suse.com>


> ---

>  drivers/scsi/smartpqi/smartpqi_init.c |   32 ++++++++++++++++++++++++++++++++

>  1 file changed, 32 insertions(+)

> 

> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c

> index 96383d047a88..9a449bbc1898 100644

> --- a/drivers/scsi/smartpqi/smartpqi_init.c

> +++ b/drivers/scsi/smartpqi/smartpqi_init.c

> @@ -6724,6 +6724,34 @@ static ssize_t pqi_lockup_action_store(struct device *dev,

>         return -EINVAL;

>  }

>  

> +static ssize_t pqi_host_enable_stream_detection_show(struct device *dev,

> +       struct device_attribute *attr, char *buffer)

> +{

> +       struct Scsi_Host *shost = class_to_shost(dev);

> +       struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);

> +

> +       return scnprintf(buffer, 10, "%hhx\n",

> +                       ctrl_info->enable_stream_detection);


Nitpick: As noted before, %hhx is discouraged.

Regards,
Martin
Martin Wilck Jan. 8, 2021, 12:14 a.m. UTC | #15
On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>

> 

> * Release QRM memory (OFA buffer) on OFA error conditions.

> * Controller is left in a bad state which can cause a kernel panic

>     upon reboot after an unsuccessful OFA.

> 

> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>

> Reviewed-by: Scott Teel <scott.teel@microchip.com>

> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>

> Signed-off-by: Don Brace <don.brace@microchip.com>


I don't understand how the patch description relates to the actual
change. With the patch, the buffers are released just like before,
only some instructions later. So apparently, without this patch, the
OFA memory had been released prematurely?

Anyway,

Reviewed-by: Martin Wilck <mwilck@suse.com>
Martin Wilck Jan. 8, 2021, 12:27 a.m. UTC | #16
On Thu, 2020-12-10 at 14:36 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>

> 

> * Add additional logging to help in debugging issues

>   with LUN resets.

> 

> Reviewed-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>

> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>

> Reviewed-by: Scott Teel <scott.teel@microchip.com>

> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>

> Signed-off-by: Don Brace <don.brace@microchip.com>


The patch description is not complete, as the patch also changes
some timings. Two remarks below.

Cheers,
Martin

> ---

>  drivers/scsi/smartpqi/smartpqi_init.c |  125 +++++++++++++++++++++++----------

>  1 file changed, 89 insertions(+), 36 deletions(-)

> 

> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c

> index 6b624413c8e6..1c51a59f1da6 100644

> --- a/drivers/scsi/smartpqi/smartpqi_init.c

> +++ b/drivers/scsi/smartpqi/smartpqi_init.c

> @@ -84,7 +84,7 @@ static void pqi_ofa_setup_host_buffer(struct pqi_ctrl_info *ctrl_info);

>  static void pqi_ofa_free_host_buffer(struct pqi_ctrl_info *ctrl_info);

>  static int pqi_ofa_host_memory_update(struct pqi_ctrl_info *ctrl_info);

>  static int pqi_device_wait_for_pending_io(struct pqi_ctrl_info *ctrl_info,

> -       struct pqi_scsi_dev *device, unsigned long timeout_secs);

> +       struct pqi_scsi_dev *device, unsigned long timeout_msecs);

>  

>  /* for flags argument to pqi_submit_raid_request_synchronous() */

>  #define PQI_SYNC_FLAGS_INTERRUPTABLE   0x1

> @@ -335,11 +335,34 @@ static void pqi_wait_if_ctrl_blocked(struct pqi_ctrl_info *ctrl_info)

>         atomic_dec(&ctrl_info->num_blocked_threads);

>  }

>  

> +#define PQI_QUIESE_WARNING_TIMEOUT_SECS                10


Did you mean QUIESCE ?

> +

>  static inline void pqi_ctrl_wait_until_quiesced(struct pqi_ctrl_info *ctrl_info)

>  {

> +       unsigned long start_jiffies;

> +       unsigned long warning_timeout;

> +       bool displayed_warning;

> +

> +       displayed_warning = false;

> +       start_jiffies = jiffies;

> +       warning_timeout = (PQI_QUIESE_WARNING_TIMEOUT_SECS * PQI_HZ) + start_jiffies;

> +

>         while (atomic_read(&ctrl_info->num_busy_threads) >

> -               atomic_read(&ctrl_info->num_blocked_threads))

> +               atomic_read(&ctrl_info->num_blocked_threads)) {

> +               if (time_after(jiffies, warning_timeout)) {

> +                       dev_warn(&ctrl_info->pci_dev->dev,

> +                               "waiting %u seconds for driver

> activity to quiesce\n",

> +                               jiffies_to_msecs(jiffies -

> start_jiffies) / 1000);

> +                       displayed_warning = true;

> +                       warning_timeout =

> (PQI_QUIESE_WARNING_TIMEOUT_SECS * PQI_HZ) + jiffies;

> +               }

>                 usleep_range(1000, 2000);

> +       }

> +

> +       if (displayed_warning)

> +               dev_warn(&ctrl_info->pci_dev->dev,

> +                       "driver activity quiesced after waiting for

> %u seconds\n",

> +                       jiffies_to_msecs(jiffies - start_jiffies) /

> 1000);

>  }

>  

>  static inline bool pqi_device_offline(struct pqi_scsi_dev *device)

> @@ -1670,7 +1693,7 @@ static int pqi_add_device(struct pqi_ctrl_info

> *ctrl_info,

>         return rc;

>  }

>  

> -#define PQI_PENDING_IO_TIMEOUT_SECS    20

> +#define PQI_REMOVE_DEVICE_PENDING_IO_TIMEOUT_MSECS     (20 * 1000)

>  

>  static inline void pqi_remove_device(struct pqi_ctrl_info

> *ctrl_info, struct pqi_scsi_dev *device)

>  {

> @@ -1678,7 +1701,8 @@ static inline void pqi_remove_device(struct

> pqi_ctrl_info *ctrl_info, struct pqi

>  

>         pqi_device_remove_start(device);

>  

> -       rc = pqi_device_wait_for_pending_io(ctrl_info, device,

> PQI_PENDING_IO_TIMEOUT_SECS);

> +       rc = pqi_device_wait_for_pending_io(ctrl_info, device,

> +               PQI_REMOVE_DEVICE_PENDING_IO_TIMEOUT_MSECS);

>         if (rc)

>                 dev_err(&ctrl_info->pci_dev->dev,

>                         "scsi %d:%d:%d:%d removing device with %d

> outstanding command(s)\n",

> @@ -3070,7 +3094,7 @@ static void pqi_process_io_error(unsigned int

> iu_type,

>         }

>  }

>  

> -static int pqi_interpret_task_management_response(

> +static int pqi_interpret_task_management_response(struct

> pqi_ctrl_info *ctrl_info,

>         struct pqi_task_management_response *response)

>  {

>         int rc;

> @@ -3088,6 +3112,10 @@ static int

> pqi_interpret_task_management_response(

>                 break;

>         }

>  

> +       if (rc)

> +               dev_err(&ctrl_info->pci_dev->dev,

> +                       "Task Management Function error: %d (response

> code: %u)\n", rc, response->response_code);

> +

>         return rc;

>  }

>  

> @@ -3156,9 +3184,8 @@ static int pqi_process_io_intr(struct

> pqi_ctrl_info *ctrl_info, struct pqi_queue

>                                 &((struct pqi_vendor_general_response

> *)response)->status);

>                         break;

>                 case PQI_RESPONSE_IU_TASK_MANAGEMENT:

> -                       io_request->status =

> -

>                                pqi_interpret_task_management_response(

> -                                       (void *)response);

> +                       io_request->status =

> pqi_interpret_task_management_response(ctrl_info,

> +                               (void *)response);

>                         break;

>                 case PQI_RESPONSE_IU_AIO_PATH_DISABLED:

>                         pqi_aio_path_disabled(io_request);

> @@ -5862,24 +5889,37 @@ static void

> pqi_fail_io_queued_for_device(struct pqi_ctrl_info *ctrl_info,

>         }

>  }

>  

> +#define PQI_PENDING_IO_WARNING_TIMEOUT_SECS    10

> +

>  static int pqi_device_wait_for_pending_io(struct pqi_ctrl_info

> *ctrl_info,

> -       struct pqi_scsi_dev *device, unsigned long timeout_secs)

> +       struct pqi_scsi_dev *device, unsigned long timeout_msecs)

>  {

> -       unsigned long timeout;

> +       int cmds_outstanding;

> +       unsigned long start_jiffies;

> +       unsigned long warning_timeout;

> +       unsigned long msecs_waiting;

>  

> +       start_jiffies = jiffies;

> +       warning_timeout = (PQI_PENDING_IO_WARNING_TIMEOUT_SECS *

> PQI_HZ) + start_jiffies;

>  

> -       timeout = (timeout_secs * PQI_HZ) + jiffies;

> -

> -       while (atomic_read(&device->scsi_cmds_outstanding)) {

> +       while ((cmds_outstanding = atomic_read(&device-

> >scsi_cmds_outstanding)) > 0) {

>                 pqi_check_ctrl_health(ctrl_info);

>                 if (pqi_ctrl_offline(ctrl_info))

>                         return -ENXIO;

> -               if (timeout_secs != NO_TIMEOUT) {

> -                       if (time_after(jiffies, timeout)) {

> -                               dev_err(&ctrl_info->pci_dev->dev,

> -                                       "timed out waiting for

> pending I/O\n");

> -                               return -ETIMEDOUT;

> -                       }

> +               msecs_waiting = jiffies_to_msecs(jiffies -

> start_jiffies);

> +               if (msecs_waiting > timeout_msecs) {

> +                       dev_err(&ctrl_info->pci_dev->dev,

> +                               "scsi %d:%d:%d:%d: timed out after

> %lu seconds waiting for %d outstanding command(s)\n",

> +                               ctrl_info->scsi_host->host_no,

> device->bus, device->target,

> +                               device->lun, msecs_waiting / 1000,

> cmds_outstanding);

> +                       return -ETIMEDOUT;

> +               }

> +               if (time_after(jiffies, warning_timeout)) {

> +                       dev_warn(&ctrl_info->pci_dev->dev,

> +                               "scsi %d:%d:%d:%d: waiting %lu

> seconds for %d outstanding command(s)\n",

> +                               ctrl_info->scsi_host->host_no,

> device->bus, device->target,

> +                               device->lun, msecs_waiting / 1000,

> cmds_outstanding);

> +                       warning_timeout =

> (PQI_PENDING_IO_WARNING_TIMEOUT_SECS * PQI_HZ) + jiffies;

>                 }

>                 usleep_range(1000, 2000);

>         }

> @@ -5895,13 +5935,15 @@ static void pqi_lun_reset_complete(struct

> pqi_io_request *io_request,

>         complete(waiting);

>  }

>  

> -#define PQI_LUN_RESET_TIMEOUT_SECS             30

>  #define PQI_LUN_RESET_POLL_COMPLETION_SECS     10

>  

>  static int pqi_wait_for_lun_reset_completion(struct pqi_ctrl_info

> *ctrl_info,

>         struct pqi_scsi_dev *device, struct completion *wait)

>  {

>         int rc;

> +       unsigned int wait_secs;

> +

> +       wait_secs = 0;

>  

>         while (1) {

>                 if (wait_for_completion_io_timeout(wait,

> @@ -5915,13 +5957,21 @@ static int

> pqi_wait_for_lun_reset_completion(struct pqi_ctrl_info *ctrl_info,

>                         rc = -ENXIO;

>                         break;

>                 }

> +

> +               wait_secs += PQI_LUN_RESET_POLL_COMPLETION_SECS;

> +

> +               dev_warn(&ctrl_info->pci_dev->dev,

> +                       "scsi %d:%d:%d:%d: waiting %u seconds for LUN

> reset to complete\n",

> +                       ctrl_info->scsi_host->host_no, device->bus,

> device->target, device->lun,

> +                       wait_secs);

>         }

>  

>         return rc;

>  }

>  

> -static int pqi_lun_reset(struct pqi_ctrl_info *ctrl_info,

> -       struct pqi_scsi_dev *device)

> +#define PQI_LUN_RESET_FIRMWARE_TIMEOUT_SECS    30

> +

> +static int pqi_lun_reset(struct pqi_ctrl_info *ctrl_info, struct

> pqi_scsi_dev *device)

>  {

>         int rc;

>         struct pqi_io_request *io_request;

> @@ -5943,8 +5993,7 @@ static int pqi_lun_reset(struct pqi_ctrl_info

> *ctrl_info,

>                 sizeof(request->lun_number));

>         request->task_management_function =

> SOP_TASK_MANAGEMENT_LUN_RESET;

>         if (ctrl_info->tmf_iu_timeout_supported)

> -               put_unaligned_le16(PQI_LUN_RESET_TIMEOUT_SECS,

> -                                       &request->timeout);

> +               put_unaligned_le16(PQI_LUN_RESET_FIRMWARE_TIMEOUT_SEC

> S, &request->timeout);

>  

>         pqi_start_io(ctrl_info, &ctrl_info-

> >queue_groups[PQI_DEFAULT_QUEUE_GROUP], RAID_PATH,

>                 io_request);

> @@ -5958,29 +6007,33 @@ static int pqi_lun_reset(struct pqi_ctrl_info

> *ctrl_info,

>         return rc;

>  }

>  

> -#define PQI_LUN_RESET_RETRIES                  3

> -#define PQI_LUN_RESET_RETRY_INTERVAL_MSECS     10000

> -#define PQI_LUN_RESET_PENDING_IO_TIMEOUT_SECS  120

> +#define PQI_LUN_RESET_RETRIES                          3

> +#define PQI_LUN_RESET_RETRY_INTERVAL_MSECS             (10 * 1000)

> +#define PQI_LUN_RESET_PENDING_IO_TIMEOUT_MSECS         (10 * 60 *

> 1000)


10 minutes? Isn't that a bit much?

> +#define PQI_LUN_RESET_FAILED_PENDING_IO_TIMEOUT_MSECS  (2 * 60 *

> 1000)


Why wait less long after a failure?



>  

> -static int pqi_lun_reset_with_retries(struct pqi_ctrl_info

> *ctrl_info,

> -       struct pqi_scsi_dev *device)

> +static int pqi_lun_reset_with_retries(struct pqi_ctrl_info

> *ctrl_info, struct pqi_scsi_dev *device)

>  {

> -       int rc;

> +       int reset_rc;

> +       int wait_rc;

>         unsigned int retries;

> -       unsigned long timeout_secs;

> +       unsigned long timeout_msecs;

>  

>         for (retries = 0;;) {

> -               rc = pqi_lun_reset(ctrl_info, device);

> -               if (rc == 0 || ++retries > PQI_LUN_RESET_RETRIES)

> +               reset_rc = pqi_lun_reset(ctrl_info, device);

> +               if (reset_rc == 0 || ++retries >

> PQI_LUN_RESET_RETRIES)

>                         break;

>                 msleep(PQI_LUN_RESET_RETRY_INTERVAL_MSECS);

>         }

>  

> -       timeout_secs = rc ? PQI_LUN_RESET_PENDING_IO_TIMEOUT_SECS :

> NO_TIMEOUT;

> +       timeout_msecs = reset_rc ?

> PQI_LUN_RESET_FAILED_PENDING_IO_TIMEOUT_MSECS :

> +               PQI_LUN_RESET_PENDING_IO_TIMEOUT_MSECS;

>  

> -       rc |= pqi_device_wait_for_pending_io(ctrl_info, device,

> timeout_secs);

> +       wait_rc = pqi_device_wait_for_pending_io(ctrl_info, device,

> timeout_msecs);

> +       if (wait_rc && reset_rc == 0)

> +               reset_rc = wait_rc;

>  

> -       return rc == 0 ? SUCCESS : FAILED;

> +       return reset_rc == 0 ? SUCCESS : FAILED;

>  }

>  

>  static int pqi_device_reset(struct pqi_ctrl_info *ctrl_info,

>
Martin Wilck Jan. 8, 2021, 12:30 a.m. UTC | #17
On Thu, 2020-12-10 at 14:36 -0600, Don Brace wrote:
> From: Murthy Bhat <Murthy.Bhat@microchip.com>
> 
> * Update enclosure identifier field corresponding to
>   physical devices in lsscsi/sysfs.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>  drivers/scsi/smartpqi/smartpqi_init.c |    1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 1c51a59f1da6..40ae82470d8c 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -1841,7 +1841,6 @@ static void pqi_dev_info(struct pqi_ctrl_info *ctrl_info,
>  static void pqi_scsi_update_device(struct pqi_scsi_dev *existing_device,
>         struct pqi_scsi_dev *new_device)
>  {
> -       existing_device->devtype = new_device->devtype;
>         existing_device->device_type = new_device->device_type;
>         existing_device->bus = new_device->bus;
>         if (new_device->target_lun_valid) {
> 

I don't get this. Why was it wrong to update the devtype field?
Martin Wilck Jan. 8, 2021, 12:34 a.m. UTC | #18
On Thu, 2020-12-10 at 14:36 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> * Correct system hangs when resuming from hibernation after
>   first successful hibernation/resume cycle.
>   * Rare condition involving OFA.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>  drivers/scsi/smartpqi/smartpqi_init.c |    5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 40ae82470d8c..5ca265babaa2 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -8688,6 +8688,11 @@ static __maybe_unused int pqi_resume(struct pci_dev *pci_dev)
>         pci_set_power_state(pci_dev, PCI_D0);
>         pci_restore_state(pci_dev);
>  
> +       pqi_ctrl_unblock_device_reset(ctrl_info);
> +       pqi_ctrl_unblock_requests(ctrl_info);
> +       pqi_scsi_unblock_requests(ctrl_info);
> +       pqi_ctrl_unblock_scan(ctrl_info);
> +
>         return pqi_ctrl_init_resume(ctrl_info);
>  }

Like I said in my comments on 14/25:

pqi_ctrl_unblock_scan() and pqi_ctrl_unblock_device_reset() expand
to mutex_unlock(). Unlocking an already-unlocked mutex is wrong, and
a mutex has to be unlocked by the task that owns the lock. How
can you be sure that these conditions are met here?

Regards
Martin
Martin Wilck Jan. 8, 2021, 12:35 a.m. UTC | #19
On Thu, 2020-12-10 at 14:36 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> * Add support for newer HW.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Looks good (I haven't verified the IDs).

Acked-by: Martin Wilck <mwilck@suse.com>
Don Brace Jan. 8, 2021, 10:56 p.m. UTC | #20
Subject: Re: [PATCH V3 04/25] smartpqi: add support for raid5 and raid6 writes

>  struct pqi_raid_path_request {
>         struct pqi_iu_header header;
> @@ -312,6 +313,39 @@ struct pqi_aio_path_request {
>                 sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
>  };
>
> +#define PQI_RAID56_XFER_LIMIT_4K       0x1000 /* 4Kib */
> +#define PQI_RAID56_XFER_LIMIT_8K       0x2000 /* 8Kib */

You don't seem to use these, and you'll remove them again in patch 06/25.

Don: Removed these definitions.

> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 6bcb037ae9d7..c813cec10003 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -2245,13 +2250,14 @@ static bool pqi_aio_raid_level_supported(struct pqi_scsi_dev_raid_map_data *rmd)
>         case SA_RAID_0:
>                 break;
>         case SA_RAID_1:
> -               if (rmd->is_write)
> -                       is_supported = false;
> +               is_supported = false;

You disable RAID1 READs with this patch. I can see you fix it again in 05/25, still it looks wrong.

Don: Corrected

>                 break;
>         case SA_RAID_5:
> -               fallthrough;
> +               if (rmd->is_write && !ctrl_info->enable_r5_writes)
> +                       is_supported = false;
> +               break;
>         case SA_RAID_6:
> -               if (rmd->is_write)
> +               if (rmd->is_write && !ctrl_info->enable_r6_writes)
>                         is_supported = false;
>                 break;
>         case SA_RAID_ADM:
> @@ -2526,6 +2532,26 @@ static int pqi_calc_aio_r5_or_r6(struct pqi_scsi_dev_raid_map_data *rmd,
>                 rmd->total_disks_per_row)) +
>                 (rmd->map_row * rmd->total_disks_per_row) + rmd->first_column;
>
> +       if (rmd->is_write) {
> +               rmd->p_index = (rmd->map_row * rmd->total_disks_per_row) + rmd->data_disks_per_row;
> +               rmd->p_parity_it_nexus = raid_map->disk_data[rmd->p_index].aio_handle;

I suppose you have made sure rmd->p_index can't be larger than the size of raid_map->disk_data. A comment explaining that would be helpful for the reader though.

Don: Added a comment for p_index.

> +               if (rmd->raid_level == SA_RAID_6) {
> +                       rmd->q_index = (rmd->map_row * rmd->total_disks_per_row) +
> +                               (rmd->data_disks_per_row + 1);
> +                       rmd->q_parity_it_nexus = raid_map->disk_data[rmd->q_index].aio_handle;
> +                       rmd->xor_mult = raid_map->disk_data[rmd->map_index].xor_mult[1];

See above.

Don: Comment updated to include q_index.

> +               }
> +               if (rmd->blocks_per_row == 0)
> +                       return PQI_RAID_BYPASS_INELIGIBLE;
> +#if BITS_PER_LONG == 32
> +               tmpdiv = rmd->first_block;
> +               do_div(tmpdiv, rmd->blocks_per_row);
> +               rmd->row = tmpdiv;
> +#else
> +               rmd->row = rmd->first_block / rmd->blocks_per_row;
> +#endif

Why not always use do_div()?

Don: I had removed the BITS_PER_LONG check, was an attempt to clean up the code, but forgot we still need to support 32bit and I just re-added BITS_PER_LONG HUNKS. These HUNKS were there before I refactored the code so it predates me. Any chance I can leave this in? It's been through a lot of regression testing already...

> @@ -4844,6 +4889,12 @@ static void pqi_calculate_queue_resources(struct pqi_ctrl_info *ctrl_info)
>                 PQI_OPERATIONAL_IQ_ELEMENT_LENGTH) /
>                 sizeof(struct pqi_sg_descriptor)) +
>                 PQI_MAX_EMBEDDED_SG_DESCRIPTORS;
> +
> +       ctrl_info->max_sg_per_r56_iu =
> +               ((ctrl_info->max_inbound_iu_length -
> +               PQI_OPERATIONAL_IQ_ELEMENT_LENGTH) /
> +               sizeof(struct pqi_sg_descriptor)) +
> +               PQI_MAX_EMBEDDED_R56_SG_DESCRIPTORS;
>  }
>
>  static inline void pqi_set_sg_descriptor(
> @@ -4931,6 +4982,44 @@ static int pqi_build_raid_sg_list(struct pqi_ctrl_info *ctrl_info,
>         return 0;
>  }
>
> +static int pqi_build_aio_r56_sg_list(struct pqi_ctrl_info *ctrl_info,
> +       struct pqi_aio_r56_path_request *request, struct scsi_cmnd *scmd,
> +       struct pqi_io_request *io_request) {
> +       u16 iu_length;
> +       int sg_count;
> +       bool chained;
> +       unsigned int num_sg_in_iu;
> +       struct scatterlist *sg;
> +       struct pqi_sg_descriptor *sg_descriptor;
> +
> +       sg_count = scsi_dma_map(scmd);
> +       if (sg_count < 0)
> +               return sg_count;
> +
> +       iu_length = offsetof(struct pqi_aio_r56_path_request, sg_descriptors) -
> +               PQI_REQUEST_HEADER_LENGTH;
> +       num_sg_in_iu = 0;
> +
> +       if (sg_count == 0)
> +               goto out;

An if {} block would be better readable here.
Don: done.

>  }
>
> +static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info *ctrl_info,
> +       struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
> +       struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
> +       struct pqi_scsi_dev_raid_map_data *rmd) {
> +
> +       switch (scmd->sc_data_direction) {
> +       case DMA_TO_DEVICE:
> +               r56_request->data_direction = SOP_READ_FLAG;
> +               break;

I wonder how it would be possible that sc_data_direction is anything else but DMA_TO_DEVICE here. AFAICS we will only reach this code for WRITE commands. Add a comment, please.

Don: Great observation, removed switch block and added a comment. Set direction to write.

> +static ssize_t pqi_host_enable_r5_writes_show(struct device *dev,
> +       struct device_attribute *attr, char *buffer) {
> +       struct Scsi_Host *shost = class_to_shost(dev);
> +       struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
> +
> +       return scnprintf(buffer, 10, "%hhx\n", ctrl_info->enable_r5_writes);

"%hhx" is deprecated, see
https://lore.kernel.org/lkml/20190914015858.7c76e036@lwn.net/T/

Don: done

> +static ssize_t pqi_host_enable_r6_writes_show(struct device *dev,
> +       struct device_attribute *attr, char *buffer) {
> +       struct Scsi_Host *shost = class_to_shost(dev);
> +       struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
> +
> +       return scnprintf(buffer, 10, "%hhx\n", ctrl_info->enable_r6_writes);

See above

Don: done

Don: Thanks for all of your great effort on this patch. I'll upload a V4 with updates to this patch and the rest of your reviews soon.

Thanks,
Don Brace
Don Brace Jan. 9, 2021, 4:56 p.m. UTC | #21
Subject: Re: [PATCH V3 05/25] smartpqi: add support for raid1 writes
> @@ static bool pqi_aio_raid_level_supported(struct pqi_ctrl_info *ctrl_info,
>         case SA_RAID_0:
>                 break;
>         case SA_RAID_1:
> -               is_supported = false;
> +               fallthrough;

Nit: fallthrough isn't necessary here.
Don: removed fallthrough
>
> +static  int pqi_aio_submit_r1_write_io(struct pqi_ctrl_info *ctrl_info,
> +       struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
> +       struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
> +       struct pqi_scsi_dev_raid_map_data *rmd)
> +
> +       switch (scmd->sc_data_direction) {
> +       case DMA_TO_DEVICE:
> +               r1_request->data_direction = SOP_READ_FLAG;
> +               break;

Same question as for previous patch, how would anything else than DMA_TO_DEVICE be possible here?

Don: changed direction to write, added comment.

Thank you Martin for your review. I'll upload a V4 after I complete the other reviews.

Don Brace
Don Brace Jan. 11, 2021, 5:22 p.m. UTC | #22
Subject: Re: [PATCH V3 06/25] smartpqi: add support for BMIC sense feature cmd and feature bits

In general: This patch contains a lot of whitespace, indentation, and minor comment formatting changes which should rather go into a separate patch IMHO. This one is big enough without them.

Don: Moved formatting changes into patch smartpqi-align-code-with-oob-driver

Further remarks below.

> [...]
>
> @@ -2552,7 +2686,7 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>         u32 next_bypass_group;
>         struct pqi_encryption_info *encryption_info_ptr;
>         struct pqi_encryption_info encryption_info;
> -       struct pqi_scsi_dev_raid_map_data rmd = {0};
> +       struct pqi_scsi_dev_raid_map_data rmd = { 0 };
>
>
>         if (get_unaligned_le16(&raid_map->flags) &
> -               RAID_MAP_ENCRYPTION_ENABLED) {
> +                       RAID_MAP_ENCRYPTION_ENABLED) {
> +               if (rmd.data_length > device->max_transfer_encrypted)
> +                       return PQI_RAID_BYPASS_INELIGIBLE;
>                 pqi_set_encryption_info(&encryption_info, raid_map,
>                         rmd.first_block);
>                 encryption_info_ptr = &encryption_info;
> @@ -2623,10 +2759,6 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>

This hunk is fine, but AFAICS it doesn't belong here logically, it should rather be part of patch 04 and 05.

Don: The patch adds max_transfer_encrypted field as part of new feature support. We would like to leave the update in this patch.

> @@ static int pqi_ctrl_init_resume(struct pqi_ctrl_info *ctrl_info)
>
>         pqi_start_heartbeat_timer(ctrl_info);
>
> +       if (ctrl_info->enable_r5_writes || ctrl_info->enable_r6_writes) {
> +               rc = pqi_get_advanced_raid_bypass_config(ctrl_info);
> +               if (rc) {
> +                       dev_err(&ctrl_info->pci_dev->dev,
> +                               "error obtaining advanced RAID bypass configuration\n");
> +                       return rc;

Do you need to error out here? Can't you simply unset the enable_rX_writes feature?

Don: This function should never fail, so a failure indicates a serious problem. But we're considering some changes in that area that we may push up at a later date.

Regards
Martin
Don Brace Jan. 11, 2021, 8:53 p.m. UTC | #23
-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com]

Subject: Re: [PATCH V3 07/25] smartpqi: update AIO Sub Page 0x02 support

On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
>
> The specification for AIO Sub-Page (0x02) has changed slightly.
> * bring the driver into conformance with the spec.
>
> +static inline u32 pqi_aio_limit_to_bytes(__le16 *limit) {
> +       u32 bytes;
> +
> +       bytes = get_unaligned_le16(limit);
> +       if (bytes == 0)
> +               bytes = ~0;
> +       else
> +               bytes *= 1024;
> +
> +       return bytes;
> +}

Nice, but this function and its callers belong into patch 06/25.

Don:
      * Squashed smartpqi-update-AIO-Sub-Page-0x02-support
      * Moved formatting HUNK for pqi_scsi_dev_raid_map_data into
        smartpqi-refactor-aio-submission-code
      * Moved structure pqi_aio_r56_path_request formatting HUNKS into
        smartpqi-add-support-for-raid5-and-raid6-writes
      * Moved remaining formatting HUNKs into
        smartpqi-align-code-with-oob-driver

Thanks for all of your attention to detail,
Don

> +
>  #pragma pack(1)
>
>  static int pqi_get_advanced_raid_bypass_config(struct pqi_ctrl_info *ctrl_info)
> @@ -753,33 +766,28 @@ static int pqi_get_advanced_raid_bypass_config(struct pqi_ctrl_info *ctrl_info)
>                         BMIC_SENSE_FEATURE_IO_PAGE_AIO_SUBPAGE ||
>                 get_unaligned_le16(&buffer->aio_subpage.header.page_length) <
>                         MINIMUM_AIO_SUBPAGE_LENGTH) {
> -               rc = -EINVAL;

This should be changed in 06/25.

>                 goto error;
>         }
>

Regards
Martin
Don Brace Jan. 12, 2021, 8:28 p.m. UTC | #24
From: Martin Wilck [mailto:mwilck@suse.com] 

Subject: Re: [PATCH V3 11/25] smartpqi: add host level stream detection enable


Nitpick: As noted before, %hhx is discouraged.

Regards,
Martin

Don: Changed to %x
Thanks for your hard work.
Don
Martin Wilck Jan. 13, 2021, 10:26 a.m. UTC | #25
On Fri, 2021-01-08 at 22:56 +0000, Don.Brace@microchip.com wrote:
> 
> > +               }
> > +               if (rmd->blocks_per_row == 0)
> > +                       return PQI_RAID_BYPASS_INELIGIBLE;
> > +#if BITS_PER_LONG == 32
> > +               tmpdiv = rmd->first_block;
> > +               do_div(tmpdiv, rmd->blocks_per_row);
> > +               rmd->row = tmpdiv;
> > +#else
> > +               rmd->row = rmd->first_block / rmd->blocks_per_row;
> > +#endif
> 
> Why not always use do_div()?
> 
> Don: I had removed the BITS_PER_LONG check, was an attempt to clean
> up the code, but forgot we still need to support 32bit and I just re-
> added BITS_PER_LONG HUNKS. These HUNKS were there before I refactored
> the code so it predates me. Any chance I can leave this in? It's been
> through a lot of regression testing already...

My suggestion was to rather do the opposite, use the 32bit code (with
do_div()) for both 32bit and 64bit. AFAIK, this would work just fine
(but not vice-versa).

You can leave this in. It was just a suggestion how to improve
readability. Perhaps consider cleaning it up sometime later.

Regards,
Martin
Don Brace Jan. 15, 2021, 9:17 p.m. UTC | #26
-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com]

Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit

EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe

On Tue, 2020-12-15 at 20:23 +0000, Don.Brace@microchip.com wrote:
> Please see answers below. Hope this helps.
>
> -----Original Message-----
> From: Paul Menzel [mailto:pmenzel@molgen.mpg.de]
> Sent: Monday, December 14, 2020 11:54 AM
> To: Don Brace - C33706 <Don.Brace@microchip.com>; Kevin Barnett - C33748 <Kevin.Barnett@microchip.com>; Scott Teel - C33730 <Scott.Teel@microchip.com>; Justin Lindley - C33718 <Justin.Lindley@microchip.com>; Scott Benesh - C33703 <Scott.Benesh@microchip.com>; Gerry Morong - C33720 <Gerry.Morong@microchip.com>; Mahesh Rajashekhara - I30583 <Mahesh.Rajashekhara@microchip.com>; hch@infradead.org; joseph.szczypek@hpe.com; POSWALD@suse.com; James E. J. Bottomley <jejb@linux.ibm.com>; Martin K. Petersen <martin.petersen@oracle.com>
> Cc: linux-scsi@vger.kernel.org; it+linux-scsi@molgen.mpg.de; Donald Buczek <buczek@molgen.mpg.de>; Greg KH <gregkh@linuxfoundation.org>
> Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
>
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
>
> Dear Don, dear Mahesh,
>
>
> Am 10.12.20 um 21:35 schrieb Don Brace:
> > From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
> >
> > * Correct scsi-mid-layer sending more requests than
> >    exposed host Q depth causing firmware ASSERT issue.
> >    * Add host Qdepth counter.
>
> This supposedly fixes the regression between Linux 5.4 and 5.9, which we reported in [1].
>
>      kernel: smartpqi 0000:89:00.0: controller is offline: status code 0x6100c
>      kernel: smartpqi 0000:89:00.0: controller offline
>
> Thank you for looking into this issue and fixing it. We are going to test this.
>
> For easily finding these things in the git history or the WWW, it would be great if these log messages could be included (in the future).
> DON> Thanks for your suggestion. We'll add them next time.
>
> Also, that means, that the regression is still present in Linux 5.10, released yesterday, and this commit does not apply to these versions.
>
> DON> They have started 5.10-RC7 now. So possibly 5.11 or 5.12 depending when all of the patches are applied. The patch in question is among 28 other patches.
>
> Mahesh, do you have any idea, what commit caused the regression and why the issue started to show up?
> DON> The smartpqi driver sets two scsi_host_template member fields:
> .can_queue and .nr_hw_queues. But we have not yet converted to host_tagset. So the queue_depth becomes nr_hw_queues * can_queue, which is more than the hw can support. That can be verified by looking at scsi_host.h.
>         /*
>          * In scsi-mq mode, the number of hardware queues supported by the LLD.
>          *
>          * Note: it is assumed that each hardware queue has a queue depth of
>          * can_queue. In other words, the total queue depth per host
>          * is nr_hw_queues * can_queue. However, for when host_tagset is set,
>          * the total queue depth is can_queue.
>          */
>
> So, until we make this change, the queue_depth change prevents the above issue from happening.

can_queue and nr_hw_queues have been set like this as long as the driver existed. Why did Paul observe a regression with 5.9?

And why can't you simply set can_queue to (ctrl_info->scsi_ml_can_queue / nr_hw_queues)?

Don: I did this in an internal patch, but this patch seemed to work the best for our driver. HBA performance remained steady when running benchmarks.

Regards,
Martin
John Garry Jan. 19, 2021, 10:33 a.m. UTC | #27
>>

>> Am 10.12.20 um 21:35 schrieb Don Brace:

>>> From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>

>>>

>>> * Correct scsi-mid-layer sending more requests than

>>>     exposed host Q depth causing firmware ASSERT issue.

>>>     * Add host Qdepth counter.

>>

>> This supposedly fixes the regression between Linux 5.4 and 5.9, which

>> we reported in [1].

>>

>>       kernel: smartpqi 0000:89:00.0: controller is offline: status code

>> 0x6100c

>>       kernel: smartpqi 0000:89:00.0: controller offline

>>

>> Thank you for looking into this issue and fixing it. We are going to

>> test this.

>>

>> For easily finding these things in the git history or the WWW, it

>> would be great if these log messages could be included (in the

>> future).

>> DON> Thanks for your suggestion. We'll add them next time.

>>

>> Also, that means, that the regression is still present in Linux 5.10,

>> released yesterday, and this commit does not apply to these versions.

>>

>> DON> They have started 5.10-RC7 now. So possibly 5.11 or 5.12

>> depending when all of the patches are applied. The patch in question

>> is among 28 other patches.

>>

>> Mahesh, do you have any idea, what commit caused the regression and

>> why the issue started to show up?

>> DON> The smartpqi driver sets two scsi_host_template member fields:

>> .can_queue and .nr_hw_queues. But we have not yet converted to

>> host_tagset. So the queue_depth becomes nr_hw_queues * can_queue,

>> which is more than the hw can support. That can be verified by looking

>> at scsi_host.h.

>>          /*
>>           * In scsi-mq mode, the number of hardware queues supported by the LLD.
>>           *
>>           * Note: it is assumed that each hardware queue has a queue depth of
>>           * can_queue. In other words, the total queue depth per host
>>           * is nr_hw_queues * can_queue. However, for when host_tagset is set,
>>           * the total queue depth is can_queue.
>>           */
>>
>> So, until we make this change, the queue_depth change prevents the
>> above issue from happening.

> 

> can_queue and nr_hw_queues have been set like this as long as the driver existed. Why did Paul observe a regression with 5.9?

> 

> And why can't you simply set can_queue to (ctrl_info->scsi_ml_can_queue / nr_hw_queues)?

> 

> Don: I did this in an internal patch, but this patch seemed to work the best for our driver. HBA performance remained steady when running benchmarks.

> 


I guess that this is a fallout from commit 6eb045e092ef ("scsi: core:
avoid host-wide host_busy counter for scsi_mq"). But that commit is
correct.

If .can_queue is set to (ctrl_info->scsi_ml_can_queue / nr_hw_queues), 
then blk-mq can send each hw queue only (ctrl_info->scsi_ml_can_queue / 
nr_hw_queues) commands, while it should be possible to send 
ctrl_info->scsi_ml_can_queue commands.

I think that this can alternatively be solved by setting the .host_tagset flag.

Thanks,
John
Martin Wilck Jan. 19, 2021, 2:12 p.m. UTC | #28
On Tue, 2021-01-19 at 10:33 +0000, John Garry wrote:
> > > 

> > > Am 10.12.20 um 21:35 schrieb Don Brace:

> > > > From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>

> > > > 

> > > > * Correct scsi-mid-layer sending more requests than

> > > >     exposed host Q depth causing firmware ASSERT issue.

> > > >     * Add host Qdepth counter.

> > > 

> > > This supposedly fixes the regression between Linux 5.4 and 5.9,

> > > which

> > > we reported in [1].

> > > 

> > >       kernel: smartpqi 0000:89:00.0: controller is offline:

> > > status code

> > > 0x6100c

> > >       kernel: smartpqi 0000:89:00.0: controller offline

> > > 

> > > Thank you for looking into this issue and fixing it. We are going

> > > to

> > > test this.

> > > 

> > > For easily finding these things in the git history or the WWW, it

> > > would be great if these log messages could be included (in the

> > > future).

>>> DON> Thanks for your suggestion. We'll add them next time.

> > > 

> > > Also, that means, that the regression is still present in Linux

> > > 5.10,

> > > released yesterday, and this commit does not apply to these

> > > versions.

> > > 

> > > DON> They have started 5.10-RC7 now. So possibly 5.11 or 5.12

> > > depending when all of the patches are applied. The patch in

> > > question

> > > is among 28 other patches.

> > > 

> > > Mahesh, do you have any idea, what commit caused the regression

> > > and

> > > why the issue started to show up?

> > > DON> The smartpqi driver sets two scsi_host_template member

> > > fields:

> > > .can_queue and .nr_hw_queues. But we have not yet converted to

> > > host_tagset. So the queue_depth becomes nr_hw_queues * can_queue,

> > > which is more than the hw can support. That can be verified by

> > > looking

> > > at scsi_host.h.

> > >          /*
> > >           * In scsi-mq mode, the number of hardware queues supported by the LLD.
> > >           *
> > >           * Note: it is assumed that each hardware queue has a queue depth of
> > >           * can_queue. In other words, the total queue depth per host
> > >           * is nr_hw_queues * can_queue. However, for when host_tagset is set,
> > >           * the total queue depth is can_queue.
> > >           */

> > > 

> > > So, until we make this change, the queue_depth change prevents

> > > the

> > > above issue from happening.

> > 

> > can_queue and nr_hw_queues have been set like this as long as the

> > driver existed. Why did Paul observe a regression with 5.9?

> > 

> > And why can't you simply set can_queue to (ctrl_info-

> > >scsi_ml_can_queue / nr_hw_queues)?

> > 

> > Don: I did this in an internal patch, but this patch seemed to work

> > the best for our driver. HBA performance remained steady when

> > running benchmarks.


That was a stupid suggestion on my part. Sorry.

> I guess that this is a fallout from commit 6eb045e092ef ("scsi:
> core: avoid host-wide host_busy counter for scsi_mq"). But that
> commit is correct.

It would be good if someone (Paul?) could verify whether that commit
actually caused the regression they saw.

Looking at that 6eb045e092ef, I notice this hunk:

-       busy = atomic_inc_return(&shost->host_busy) - 1;
        if (atomic_read(&shost->host_blocked) > 0) {
-               if (busy)
+               if (scsi_host_busy(shost) > 0)
                        goto starved;

Before 6eb045e092ef, the busy count was incremented with a memory
barrier before looking at "host_blocked". The new code does this
instead:

@@ -1403,6 +1400,8 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
                spin_unlock_irq(shost->host_lock);
        }

+       __set_bit(SCMD_STATE_INFLIGHT, &cmd->state);
+

but it happens *after* the "host_blocked" check. Could that perhaps
have caused the regression?

Thanks
Martin
Paul Menzel Jan. 19, 2021, 5:43 p.m. UTC | #29
Dear Martin, dear John, dear Don, dear Linux folks,


Am 19.01.21 um 15:12 schrieb Martin Wilck:
> On Tue, 2021-01-19 at 10:33 +0000, John Garry wrote:

>>>>

>>>> Am 10.12.20 um 21:35 schrieb Don Brace:

>>>>> From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>

>>>>>

>>>>> * Correct scsi-mid-layer sending more requests than

>>>>>      exposed host Q depth causing firmware ASSERT issue.

>>>>>      * Add host Qdepth counter.

>>>>

>>>> This supposedly fixes the regression between Linux 5.4 and 5.9,

>>>> which we reported in [1].

>>>>

>>>>        kernel: smartpqi 0000:89:00.0: controller is offline: status code 0x6100c

>>>>        kernel: smartpqi 0000:89:00.0: controller offline

>>>>

>>>> Thank you for looking into this issue and fixing it. We are going

>>>> to test this.

>>>>

>>>> For easily finding these things in the git history or the WWW, it

>>>> would be great if these log messages could be included (in the

>>>> future).

>>>> DON> Thanks for your suggestion. We'll add them next time.

>>>>

>>>> Also, that means, that the regression is still present in Linux

>>>> 5.10, released yesterday, and this commit does not apply to these

>>>> versions.

>>>>

>>>> DON> They have started 5.10-RC7 now. So possibly 5.11 or 5.12

>>>> depending when all of the patches are applied. The patch in

>>>> question is among 28 other patches.

>>>>

>>>> Mahesh, do you have any idea, what commit caused the regression

>>>> and why the issue started to show up?

>>>> DON> The smartpqi driver sets two scsi_host_template member

>>>> fields:

>>>> .can_queue and .nr_hw_queues. But we have not yet converted to

>>>> host_tagset. So the queue_depth becomes nr_hw_queues * can_queue,

>>>> which is more than the hw can support. That can be verified by

>>>> looking at scsi_host.h.

>>>>           /*
>>>>            * In scsi-mq mode, the number of hardware queues supported by the LLD.
>>>>            *
>>>>            * Note: it is assumed that each hardware queue has a queue depth of
>>>>            * can_queue. In other words, the total queue depth per host
>>>>            * is nr_hw_queues * can_queue. However, for when host_tagset is set,
>>>>            * the total queue depth is can_queue.
>>>>            */

>>>>

>>>> So, until we make this change, the queue_depth change prevents

>>>> the above issue from happening.

>>>

>>> can_queue and nr_hw_queues have been set like this as long as the

>>> driver existed. Why did Paul observe a regression with 5.9?

>>>

>>> And why can't you simply set can_queue to (ctrl_info-

>>>> scsi_ml_can_queue / nr_hw_queues)?

>>>

>>> Don: I did this in an internal patch, but this patch seemed to work

>>> the best for our driver. HBA performance remained steady when

>>> running benchmarks.

> 

> That was a stupid suggestion on my part. Sorry.

> 

>> I guess that this is a fallout from commit 6eb045e092ef ("scsi:

>>    core: avoid host-wide host_busy counter for scsi_mq"). But that

>> commit is correct.


John, thank you very much for taking the time to point this out. The 
commit first showed up in Linux 5.5-rc1. (The host template flag 
`host_tagset` was introduced in Linux 5.10-rc1.)

> It would be good if someone (Paul?) could verify whether that commit

> actually caused the regression they saw.

> 

> Looking at that 6eb045e092ef, I notice this hunk:
> 
> -       busy = atomic_inc_return(&shost->host_busy) - 1;
>          if (atomic_read(&shost->host_blocked) > 0) {
> -               if (busy)
> +               if (scsi_host_busy(shost) > 0)
>                          goto starved;
> 
> Before 6eb045e092ef, the busy count was incremented with a memory barrier
> before looking at "host_blocked". The new code does this instead:
> 
> @@ -1403,6 +1400,8 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
>                  spin_unlock_irq(shost->host_lock);
>          }
> 
> +       __set_bit(SCMD_STATE_INFLIGHT, &cmd->state);
> +
> 
> but it happens *after* the "host_blocked" check. Could that perhaps
> have caused the regression?


As we only have production systems with this issue, and Don wrote the 
Microchip team was able to reproduce the issue, it’d be great, if Don 
and his team, could test, if commit 6eb045e092ef introduced the regression.

Also, we still need a path forward how to fix this for the Linux 5.10 
series. Due to the issue dragging on for so long, the 5.9 series has 
reached end of life already.


Kind regards,

Paul
Donald Buczek Jan. 20, 2021, 4:42 p.m. UTC | #30
On 19.01.21 15:12, Martin Wilck wrote:
> On Tue, 2021-01-19 at 10:33 +0000, John Garry wrote:

>>>>

>>>> Am 10.12.20 um 21:35 schrieb Don Brace:

>>>>> From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>

>>>>>

>>>>> * Correct scsi-mid-layer sending more requests than

>>>>>      exposed host Q depth causing firmware ASSERT issue.

>>>>>      * Add host Qdepth counter.

>>>>

>>>> This supposedly fixes the regression between Linux 5.4 and 5.9,

>>>> which

>>>> we reported in [1].

>>>>

>>>>        kernel: smartpqi 0000:89:00.0: controller is offline:

>>>> status code

>>>> 0x6100c

>>>>        kernel: smartpqi 0000:89:00.0: controller offline

>>>>

>>>> Thank you for looking into this issue and fixing it. We are going

>>>> to

>>>> test this.

>>>>

>>>> For easily finding these things in the git history or the WWW, it

>>>> would be great if these log messages could be included (in the

>>>> future).

>>>> DON> Thanks for your suggestion. We'll add them next time.

>>>>

>>>> Also, that means, that the regression is still present in Linux

>>>> 5.10,

>>>> released yesterday, and this commit does not apply to these

>>>> versions.

>>>>

>>>> DON> They have started 5.10-RC7 now. So possibly 5.11 or 5.12

>>>> depending when all of the patches are applied. The patch in

>>>> question

>>>> is among 28 other patches.

>>>>

>>>> Mahesh, do you have any idea, what commit caused the regression

>>>> and

>>>> why the issue started to show up?

>>>> DON> The smartpqi driver sets two scsi_host_template member

>>>> fields:

>>>> .can_queue and .nr_hw_queues. But we have not yet converted to

>>>> host_tagset. So the queue_depth becomes nr_hw_queues * can_queue,

>>>> which is more than the hw can support. That can be verified by

>>>> looking

>>>> at scsi_host.h.

>>>>           /*
>>>>            * In scsi-mq mode, the number of hardware queues supported by the LLD.
>>>>            *
>>>>            * Note: it is assumed that each hardware queue has a queue depth of
>>>>            * can_queue. In other words, the total queue depth per host
>>>>            * is nr_hw_queues * can_queue. However, for when host_tagset is set,
>>>>            * the total queue depth is can_queue.
>>>>            */

>>>>

>>>> So, until we make this change, the queue_depth change prevents

>>>> the

>>>> above issue from happening.

>>>

>>> can_queue and nr_hw_queues have been set like this as long as the

>>> driver existed. Why did Paul observe a regression with 5.9?

>>>

>>> And why can't you simply set can_queue to (ctrl_info-

>>>> scsi_ml_can_queue / nr_hw_queues)?

>>>

>>> Don: I did this in an internal patch, but this patch seemed to work

>>> the best for our driver. HBA performance remained steady when

>>> running benchmarks.

> 

> That was a stupid suggestion on my part. Sorry.

> 

>> I guess that this is a fallout from commit 6eb045e092ef ("scsi:

>>    core: avoid host-wide host_busy counter for scsi_mq"). But that

>> commit

>> is correct.

> 

> It would be good if someone (Paul?) could verify whether that commit

> actually caused the regression they saw.


We can reliably trigger the issue with a certain load pattern on certain hardware.

I've compiled 6eb045e092ef and got (as with other affected kernels) "controller is offline: status code 0x6100c" after 15 minutes of the test load.
I've compiled 6eb045e092ef^ and the load has been running for 3 1/2 hours now.

So you hit it.

> Looking at that 6eb045e092ef, I notice this hunk:
> 
> -       busy = atomic_inc_return(&shost->host_busy) - 1;
>          if (atomic_read(&shost->host_blocked) > 0) {
> -               if (busy)
> +               if (scsi_host_busy(shost) > 0)
>                          goto starved;
> 
> Before 6eb045e092ef, the busy count was incremented with a memory barrier
> before looking at "host_blocked". The new code does this instead:
> 
> @@ -1403,6 +1400,8 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
>                  spin_unlock_irq(shost->host_lock);
>          }
> 
> +       __set_bit(SCMD_STATE_INFLIGHT, &cmd->state);
> +
> 
> but it happens *after* the "host_blocked" check. Could that perhaps
> have caused the regression?


I'm not into this and can't comment on that. But if you need me to test any patch for verification, I certainly can do that.

Best
   Donald

>

> 

> Thanks

> Martin

>
Don Brace Jan. 20, 2021, 5:03 p.m. UTC | #31
-----Original Message-----
From: Donald Buczek [mailto:buczek@molgen.mpg.de] 

Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit

>

> It would be good if someone (Paul?) could verify whether that commit 

> actually caused the regression they saw.


We can reliably trigger the issue with a certain load pattern on certain hardware.

I've compiled 6eb045e092ef and got (as with other affected kernels) "controller is offline: status code 0x6100c" after 15 minutes of the test load.
I've compiled 6eb045e092ef^ and the load has been running for 3 1/2 hours now.

So you hit it.

Don: good news, I was starting my own testing.
Thanks for your help

> Looking at that 6eb045e092ef, I notice this hunk:
> 
> -       busy = atomic_inc_return(&shost->host_busy) - 1;
>          if (atomic_read(&shost->host_blocked) > 0) {
> -               if (busy)
> +               if (scsi_host_busy(shost) > 0)
>                          goto starved;
> 
> Before 6eb045e092ef, the busy count was incremented with a memory barrier
> before looking at "host_blocked". The new code does this instead:
> 
> @@ -1403,6 +1400,8 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
>                  spin_unlock_irq(shost->host_lock);
>          }
> 
> +       __set_bit(SCMD_STATE_INFLIGHT, &cmd->state);
> +
> 
> but it happens *after* the "host_blocked" check. Could that perhaps
> have caused the regression?


I'm not into this and can't comment on that. But if you need me to test any patch for verification, I certainly can do that.

Best
   Donald

>

>

> Thanks

> Martin

>
Martin Wilck Jan. 20, 2021, 6:35 p.m. UTC | #32
On Wed, 2021-01-20 at 17:42 +0100, Donald Buczek wrote:
> 

> I'm not into this and can't comment on that. But if you need me to

> test any patch for verification, I'll certainly can do that.


I'll send an *experimental* patch.

Thanks
Martin
Don Brace Jan. 22, 2021, 4:45 p.m. UTC | #33
-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 

Subject: Re: [PATCH V3 06/25] smartpqi: add support for BMIC sense feature cmd and feature bits


In general: This patch contains a lot of whitespace, indentation, and minor comment formatting changes which should rather go into a separate patch IMHO. This one is big enough without them.

Don: Moved formatting HUNKs to smartpqi-align-code-with-oob-driver.

Further remarks below.

> @@ static int pqi_ctrl_init_resume(struct pqi_ctrl_info *ctrl_info)
>
>         pqi_start_heartbeat_timer(ctrl_info);
>
> +       if (ctrl_info->enable_r5_writes || ctrl_info->enable_r6_writes) {
> +               rc = pqi_get_advanced_raid_bypass_config(ctrl_info);
> +               if (rc) {
> +                       dev_err(&ctrl_info->pci_dev->dev,
> +                               "error obtaining advanced RAID bypass configuration\n");
> +                       return rc;

Do you need to error out here? Can't you simply unset the enable_rX_writes feature?

Don: If the call to pqi_get_advanced_raid_bypass_config fails, there are most likely some serious issues, so we abandon the initialization process.


Regards
Martin
Martin Wilck Jan. 22, 2021, 7:04 p.m. UTC | #34
On Fri, 2021-01-22 at 16:45 +0000, Don.Brace@microchip.com wrote:
> 
> > @@ static int pqi_ctrl_init_resume(struct pqi_ctrl_info *ctrl_info)
> > 
> >         pqi_start_heartbeat_timer(ctrl_info);
> > 
> > +       if (ctrl_info->enable_r5_writes || ctrl_info->enable_r6_writes) {
> > +               rc = pqi_get_advanced_raid_bypass_config(ctrl_info);
> > +               if (rc) {
> > +                       dev_err(&ctrl_info->pci_dev->dev,
> > +                               "error obtaining advanced RAID bypass configuration\n");
> > +                       return rc;
> 
> Do you need to error out here? Can't you simply unset the
> enable_rX_writes feature?
> 
> Don: If the call to pqi_get_advanced_raid_bypass_config fails, there
> are most likely some serious issues, so we abandon the
> initialization process.


Ok, understood. A reader who isn't fully familiar with the HW
properties (like myself) may think that "advanced_raid_bypass"
is an optional performance-related feature and that the controller
could be operational without it. If that's not the case, fine.

Martin
Don Brace Jan. 25, 2021, 5:09 p.m. UTC | #35
From: Martin Wilck [mailto:mwilck@suse.com] 

Subject: Re: [PATCH V3 21/25] smartpqi: add additional logging for LUN resets


The patch description is not complete, as the patch also changes some timings. Two remarks below.

Cheers,
Martin

> ---
>  drivers/scsi/smartpqi/smartpqi_init.c |  125 +++++++++++++++++++++++----------
>  1 file changed, 89 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 6b624413c8e6..1c51a59f1da6 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -84,7 +84,7 @@ static void pqi_ofa_setup_host_buffer(struct pqi_ctrl_info *ctrl_info);
>  static void pqi_ofa_free_host_buffer(struct pqi_ctrl_info *ctrl_info);
>  static int pqi_ofa_host_memory_update(struct pqi_ctrl_info *ctrl_info);
>  static int pqi_device_wait_for_pending_io(struct pqi_ctrl_info *ctrl_info,
> -       struct pqi_scsi_dev *device, unsigned long timeout_secs);
> +       struct pqi_scsi_dev *device, unsigned long timeout_msecs);
>
>  /* for flags argument to pqi_submit_raid_request_synchronous() */
>  #define PQI_SYNC_FLAGS_INTERRUPTABLE   0x1
> @@ -335,11 +335,34 @@ static void pqi_wait_if_ctrl_blocked(struct pqi_ctrl_info *ctrl_info)
>         atomic_dec(&ctrl_info->num_blocked_threads);
>  }
>
> +#define PQI_QUIESE_WARNING_TIMEOUT_SECS                10

Did you mean QUIESCE?

Don: Yes, corrected. Thank you.

>
>         pqi_start_io(ctrl_info, &ctrl_info->queue_groups[PQI_DEFAULT_QUEUE_GROUP], RAID_PATH,
>                 io_request);
> @@ -5958,29 +6007,33 @@ static int pqi_lun_reset(struct pqi_ctrl_info *ctrl_info,
>         return rc;
>  }
>
> -#define PQI_LUN_RESET_RETRIES                  3
> -#define PQI_LUN_RESET_RETRY_INTERVAL_MSECS     10000
> -#define PQI_LUN_RESET_PENDING_IO_TIMEOUT_SECS  120
> +#define PQI_LUN_RESET_RETRIES                          3
> +#define PQI_LUN_RESET_RETRY_INTERVAL_MSECS             (10 * 1000)
> +#define PQI_LUN_RESET_PENDING_IO_TIMEOUT_MSECS         (10 * 60 * 1000)

10 minutes? Isn't that a bit much?

Don: 10 minutes seems to hold true for our worst-case scenarios.

> +#define PQI_LUN_RESET_FAILED_PENDING_IO_TIMEOUT_MSECS  (2 * 60 * 1000)

Why wait less long after a failure?

Don: If the reset TMF fails, the driver waits two more minutes for I/O to be flushed out. After this timeout, a failure is returned if the I/O has not been flushed. In many tests, the I/O eventually returns.
Don Brace Jan. 25, 2021, 5:13 p.m. UTC | #36
From: Martin Wilck [mailto:mwilck@suse.com] 

Subject: Re: [PATCH V3 22/25] smartpqi: update enclosure identifier in sysf

> @@ -1841,7 +1841,6 @@ static void pqi_dev_info(struct pqi_ctrl_info *ctrl_info,
>  static void pqi_scsi_update_device(struct pqi_scsi_dev *existing_device,
>         struct pqi_scsi_dev *new_device)
>  {
> -       existing_device->devtype = new_device->devtype;
>         existing_device->device_type = new_device->device_type;
>         existing_device->bus = new_device->bus;
>         if (new_device->target_lun_valid) {
>

I don't get this. Why was it wrong to update the devtype field?

Don: From the patch author...
If we don't remove that statement, the following issue will crop up.

During initial device enumeration, the devtype attribute of the device (in this case, an enclosure device) is filled in slave_configure.
But whenever a rescan occurs, the f/w returns zero for this field, and the valid devtype is overwritten by zero when device attributes are updated in pqi_scsi_update_device. Due to this, lsscsi output shows wrong values.
Martin Wilck Jan. 25, 2021, 7:44 p.m. UTC | #37
On Mon, 2021-01-25 at 17:13 +0000, Don.Brace@microchip.com wrote:
> 

> From: Martin Wilck [mailto:mwilck@suse.com] 

> Subject: Re: [PATCH V3 22/25] smartpqi: update enclosure identifier

> in sysf

> 

> > @@ -1841,7 +1841,6 @@ static void pqi_dev_info(struct pqi_ctrl_info

> > *ctrl_info,  static void pqi_scsi_update_device(struct pqi_scsi_dev

> > *existing_device,

> >         struct pqi_scsi_dev *new_device)  {

> > -       existing_device->devtype = new_device->devtype;

> >         existing_device->device_type = new_device->device_type;

> >         existing_device->bus = new_device->bus;

> >         if (new_device->target_lun_valid) {

> > 

> 

> I don't get this. Why was it wrong to update the devtype field?

> 

> Don: From patch Author...

> If we don't remove that statement, following issue will crop up.

> 

> During initial device enumeration, the devtype attribute of the

> device(in current case enclosure device) is filled in

> slave_configure.

> But whenever a rescan occurs, the f/w would return zero for this

> field,  and valid devtype is overwritten by zero, when device

> attributes are updated in pqi_scsi_update_device. Due to this lsscsi

> output shows wrong values.


Thanks. It would be very helpful for reviewers to add comments in
cases like this.

Regards,
Martin

> 

>
Don Brace Jan. 25, 2021, 8:36 p.m. UTC | #38
-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 

Subject: Re: [PATCH V3 22/25] smartpqi: update enclosure identifier in sysf

> From: Martin Wilck [mailto:mwilck@suse.com]

> Subject: Re: [PATCH V3 22/25] smartpqi: update enclosure identifier in 

> sysf

>

> > @@ -1841,7 +1841,6 @@ static void pqi_dev_info(struct pqi_ctrl_info 

> > *ctrl_info,  static void pqi_scsi_update_device(struct pqi_scsi_dev 

> > *existing_device,

> >         struct pqi_scsi_dev *new_device)  {

> > -       existing_device->devtype = new_device->devtype;

> >         existing_device->device_type = new_device->device_type;

> >         existing_device->bus = new_device->bus;

> >         if (new_device->target_lun_valid) {

> >

>

> I don't get this. Why was it wrong to update the devtype field?

>

> Don: From patch Author...

> If we don't remove that statement, following issue will crop up.

>

> During initial device enumeration, the devtype attribute of the 

> device(in current case enclosure device) is filled in slave_configure.

> But whenever a rescan occurs, the f/w would return zero for this 

> field,  and valid devtype is overwritten by zero, when device 

> attributes are updated in pqi_scsi_update_device. Due to this lsscsi 

> output shows wrong values.


Thanks. It would be very helpful for reviewers to add comments in cases like this.

Regards,
Martin

Don: I updated the patch description accordingly.
Thanks for all your hard work.
>

>
Don Brace Jan. 27, 2021, 5:39 p.m. UTC | #39
-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 

Subject: Re: [PATCH V3 23/25] smartpqi: correct system hangs when resuming from hibernation

> @@ -8688,6 +8688,11 @@ static __maybe_unused int pqi_resume(struct pci_dev *pci_dev)
>         pci_set_power_state(pci_dev, PCI_D0);
>         pci_restore_state(pci_dev);
>
> +       pqi_ctrl_unblock_device_reset(ctrl_info);
> +       pqi_ctrl_unblock_requests(ctrl_info);
> +       pqi_scsi_unblock_requests(ctrl_info);
> +       pqi_ctrl_unblock_scan(ctrl_info);
> +
>         return pqi_ctrl_init_resume(ctrl_info);
> }


Like I said in my comments on 14/25:

pqi_ctrl_unblock_scan() and pqi_ctrl_unblock_device_reset() expand to mutex_unlock(). Unlocking an already-unlocked mutex is wrong, and a mutex has to be unlocked by the task that owns the lock. How can you be sure that these conditions are met here?

Don: I updated this patch to:
@@ -8661,9 +8661,17 @@ static __maybe_unused int pqi_resume(struct pci_dev *pci_dev)
                return 0;
        }
 
+       pqi_ctrl_block_device_reset(ctrl_info);
+       pqi_ctrl_block_scan(ctrl_info);
+
        pci_set_power_state(pci_dev, PCI_D0);
        pci_restore_state(pci_dev);
 
+       pqi_ctrl_unblock_device_reset(ctrl_info);
+       pqi_ctrl_unblock_requests(ctrl_info);
+       pqi_scsi_unblock_requests(ctrl_info);
+       pqi_ctrl_unblock_scan(ctrl_info);
+
        return pqi_ctrl_init_resume(ctrl_info);
 }
Don: So the mutexes are locked and unlocked in the same task. I updated the other patch (14) accordingly, but I'll reply in that patch also. Is there a specific driver that initiates suspend/resume? Like ACPI? Or some other PCI driver?

Thanks for your hard work on these patches
Don
Martin Wilck Jan. 27, 2021, 5:45 p.m. UTC | #40
On Wed, 2021-01-27 at 17:39 +0000, Don.Brace@microchip.com wrote:
> -----Original Message-----

> From: Martin Wilck [mailto:mwilck@suse.com] 

> Subject: Re: [PATCH V3 23/25] smartpqi: correct system hangs when

> resuming from hibernation

> 

> > @@ -8688,6 +8688,11 @@ static __maybe_unused int pqi_resume(struct 

> > pci_dev *pci_dev)

> >         pci_set_power_state(pci_dev, PCI_D0);

> >         pci_restore_state(pci_dev);

> > 

> > +       pqi_ctrl_unblock_device_reset(ctrl_info);

> > +       pqi_ctrl_unblock_requests(ctrl_info);

> > +       pqi_scsi_unblock_requests(ctrl_info);

> > +       pqi_ctrl_unblock_scan(ctrl_info);

> > +

> >         return pqi_ctrl_init_resume(ctrl_info);  }

> 

> Like I said in my comments on 14/25:

> 

> pqi_ctrl_unblock_scan() and pqi_ctrl_unblock_device_reset() expand to

> mutex_unlock(). Unlocking an already-unlocked mutex is wrong, and a

> mutex has to be unlocked by the task that owns the lock. How can you

> be sure that these conditions are met here?

> 

> Don: I updated this patch to:

> @@ -8661,9 +8661,17 @@ static __maybe_unused int pqi_resume(struct

> pci_dev *pci_dev)

>                 return 0;

>         }

>  

> +       pqi_ctrl_block_device_reset(ctrl_info);

> +       pqi_ctrl_block_scan(ctrl_info);

> +

>         pci_set_power_state(pci_dev, PCI_D0);

>         pci_restore_state(pci_dev);

>  

> +       pqi_ctrl_unblock_device_reset(ctrl_info);

> +       pqi_ctrl_unblock_requests(ctrl_info);

> +       pqi_scsi_unblock_requests(ctrl_info);

> +       pqi_ctrl_unblock_scan(ctrl_info);

> +

>         return pqi_ctrl_init_resume(ctrl_info);

>  }

> Don: So the mutexes are set and unset in the same task.


Yes, that looks much better to me.

>  I updated the other patch 14 accordingly, but I'll reply in that

> patch also. Is there a specific driver that initiates suspend/resume?

> Like acpi? Or some other pci driver?


I'm no expert on suspend/resume. I think it's platform-dependent, you
shouldn't make any specific assumptions what the platform actually
does.

Regards
Martin
Don Brace Jan. 27, 2021, 5:46 p.m. UTC | #41
-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 

Subject: Re: [PATCH V3 17/25] smartpqi: change timing of release of QRM memory during OFA


I don't understand how the patch description relates to the actual change. With the patch, the buffers are released just like before, only some instructions later. So apparently, without this patch, the OFA memory had been released prematurely?
Don: Yes. So when I broke up patch smartpqi-fix-driver-synchronization-issues, I ended up squashing this patch into a new patch called smartpqi-update-ofa-management. Thanks for your review.

Anyway,

Reviewed-by: Martin Wilck <mwilck@suse.com>
Don Brace Feb. 10, 2021, 3:27 p.m. UTC | #42
-----Original Message-----
From: John Garry [mailto:john.garry@huawei.com] 

Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit


I think that this can alternatively be solved by setting .host_tagset flag.

Thanks,
John

Don: John, can I add a Suggested-by tag for you for my new patch smartpqi-use-host-wide-tagspace?
John Garry Feb. 10, 2021, 3:42 p.m. UTC | #43
On 10/02/2021 15:27, Don.Brace@microchip.com wrote:
> -----Original Message-----

> From: John Garry [mailto:john.garry@huawei.com]

> Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit

> 

> 

> I think that this can alternatively be solved by setting .host_tagset flag.

> 

> Thanks,

> John

> 

> Don: John, can I add a Suggested-by tag for you for my new patch smartpqi-use-host-wide-tagspace?


I don't mind. And I think that Ming had the same idea.

Thanks,
John
Don Brace Feb. 10, 2021, 4:29 p.m. UTC | #44
-----Original Message-----
From: John Garry [mailto:john.garry@huawei.com] 

Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit

>
> I think that this can alternatively be solved by setting .host_tagset flag.
>
> Thanks,
> John
>
> Don: John, can I add a Suggested-by tag for you for my new patch smartpqi-use-host-wide-tagspace?

I don't mind. And I think that Ming had the same idea.

Thanks,
John

Don: Thanks for reminding me. Ming, can I add your Suggested-by tag?
Paul Menzel March 29, 2021, 9:15 p.m. UTC | #45
Dear Linux folks,


Am 10.02.21 um 17:29 schrieb Don.Brace@microchip.com:
> -----Original Message-----
> From: John Garry [mailto:john.garry@huawei.com]
> Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
>
>> I think that this can alternatively be solved by setting .host_tagset flag.
>>
>> Thanks,
>> John
>>
>> Don: John, can I add a Suggested-by tag for you for my new patch smartpqi-use-host-wide-tagspace?
>
> I don't mind. And I think that Ming had the same idea.
>
> Don: Thanks for reminding me. Ming, can I add your Suggested-by tag?


It looks like iterations 4 and 5 of the patch series have been posted
in the meantime [1][2], unfortunately without the reporters and discussion
participants in Cc. Linux upstream has been broken since version 5.5.


Kind regards,

Paul


[1]: https://lore.kernel.org/linux-scsi/161540568064.19430.11157730901022265360.stgit@brunhilda/
[2]: https://lore.kernel.org/linux-scsi/161549045434.25025.17473629602756431540.stgit@brunhilda/
Paul Menzel March 29, 2021, 9:16 p.m. UTC | #46
[Resent from correct address.]

Am 29.03.21 um 23:15 schrieb Paul Menzel:
> Dear Linux folks,
>
>
> Am 10.02.21 um 17:29 schrieb Don.Brace@microchip.com:
>> -----Original Message-----
>> From: John Garry [mailto:john.garry@huawei.com]
>> Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
>>
>>> I think that this can alternatively be solved by setting .host_tagset flag.
>>>
>>> Thanks,
>>> John
>>>
>>> Don: John, can I add a Suggested-by tag for you for my new patch smartpqi-use-host-wide-tagspace?
>>
>> I don't mind. And I think that Ming had the same idea.
>
>> Don: Thanks for reminding me. Ming, can I add your Suggested-by tag?
>
> It looks like iterations 4 and 5 of the patch series have been posted
> in the meantime. Unfortunately without the reporters and discussion
> participants in Cc. Linux upstream is still broken since version 5.5.
>
>
> Kind regards,
>
> Paul
>
>
> [1]: https://lore.kernel.org/linux-scsi/161540568064.19430.11157730901022265360.stgit@brunhilda/
> [2]: https://lore.kernel.org/linux-scsi/161549045434.25025.17473629602756431540.stgit@brunhilda/
Donald Buczek March 30, 2021, 2:37 p.m. UTC | #47
Dear Paul,

On 29.03.21 23:16, Paul Menzel wrote:
> [Resent from correct address.]
>
> Am 29.03.21 um 23:15 schrieb Paul Menzel:
>> Dear Linux folks,
>>
>>
>> Am 10.02.21 um 17:29 schrieb Don.Brace@microchip.com:
>>> -----Original Message-----
>>> From: John Garry [mailto:john.garry@huawei.com]
>>> Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
>>>
>>>> I think that this can alternatively be solved by setting .host_tagset flag.
>>>>
>>>> Thanks,
>>>> John
>>>>
>>>> Don: John, can I add a Suggested-by tag for you for my new patch smartpqi-use-host-wide-tagspace?
>>>
>>> I don't mind. And I think that Ming had the same idea.
>>
>>> Don: Thanks for reminding me. Ming, can I add your Suggested-by tag?
>>
>> It looks like iterations 4 and 5 of the patch series have been posted in the meantime. Unfortunately without the reporters and discussion participants in Cc. Linux upstream is still broken since version 5.5.

When "smartpqi: use host wide tagspace" [1] goes into mainline, we can submit it to stable if nobody else does. It fixes the original problem, and we have a patch with the same code change running in our 5.10 kernels.

Best
   Donald

[1]: https://lore.kernel.org/linux-scsi/161549369787.25025.8975999483518581619.stgit@brunhilda/

>>
>> Kind regards,
>>
>> Paul
>>
>> [1]: https://lore.kernel.org/linux-scsi/161540568064.19430.11157730901022265360.stgit@brunhilda/
>> [2]: https://lore.kernel.org/linux-scsi/161549045434.25025.17473629602756431540.stgit@brunhilda/