mbox series

[00/20] libsas and pm8001 fixes

Message ID 20220210114218.632725-1-damien.lemoal@opensource.wdc.com
Headers show
Series libsas and pm8001 fixes | expand

Message

Damien Le Moal Feb. 10, 2022, 11:41 a.m. UTC
The first 3 patches fix a problem with libsas handling of NCQ NON DATA
commands and simplify libsas code in a couple of places.
The remaining patches are fixes for the pm8001 driver:
* All sparse warnings are addressed, fixing along the way many le32
  handling bugs for big-endian architectures
* Fix handling of NCQ NON DATA commands
* Fix NCQ error recovery (abort all task execution) that was causing a
  crash
* Simplify the code in many places

With these fixes, libzbc test suite passes all test case. This test
suite was used with an SMR drive for testing because it generates many
NCQ NON DATA commands (for zone management commands) and also generates
many NCQ command errors to check ASC/ASCQ returned by the device. With
the test suite, the error recovery path was extensively exercised.

Note that without these patches, libzbc test suite result in the
controller hanging, or in kernel crashes.

Damien Le Moal (20):
  scsi: libsas: fix sas_ata_qc_issue() handling of NCQ NON DATA commands
  scsi: libsas: simplify sas_ata_qc_issue() detection of NCQ commands
  scsi: libsas: Remove unnecessary initialization in sas_ata_qc_issue()
  scsi: pm8001: fix __iomem pointer use in pm8001_phy_control()
  scsi: pm8001: Remove local variable in pm8001_pci_resume()
  scsi: pm8001: Fix pm8001_update_flash() local variable type
  scsi: pm8001: Fix command initialization in pm80XX_send_read_log()
  scsi: pm8001: Fix local variable declaration in pm80xx_pci_mem_copy()
  scsi: pm8001: Fix command initialization in pm8001_chip_ssp_tm_req()
  scsi: pm8001: fix payload initialization in
    pm80xx_set_thermal_config()
  scsi: pm8001: fix le32 values handling in
    pm80xx_set_sas_protocol_timer_config()
  scsi: pm8001: fix payload initialization in pm80xx_encrypt_update()
  scsi: pm8001: fix le32 values handling in pm80xx_chip_ssp_io_req()
  scsi: pm8001: fix le32 values handling in pm80xx_chip_sata_req()
  scsi: pm8001: fix use of struct set_phy_profile_req fields
  scsi: pm8001: simplify pm8001_get_ncq_tag()
  scsi: pm8001: fix NCQ NON DATA command task initialization
  scsi: pm8001: fix NCQ NON DATA command completion handling
  scsi: pm8001: cleanup pm8001_queue_command()
  scsi: pm8001: fix abort all task initialization

 drivers/scsi/libsas/sas_ata.c     |  12 +-
 drivers/scsi/pm8001/pm8001_ctl.c  |   5 +-
 drivers/scsi/pm8001/pm8001_hwi.c  |  23 +--
 drivers/scsi/pm8001/pm8001_init.c |   8 +-
 drivers/scsi/pm8001/pm8001_sas.c  |  48 +++----
 drivers/scsi/pm8001/pm80xx_hwi.c  | 226 ++++++++++++++++--------------
 drivers/scsi/pm8001/pm80xx_hwi.h  |   2 +-
 7 files changed, 169 insertions(+), 155 deletions(-)

Comments

John Garry Feb. 10, 2022, 12:04 p.m. UTC | #1
On 10/02/2022 11:42, Damien Le Moal wrote:
> In pm8001_pci_resume(), the use of the u32 type for the local variable
> device_state causes a sparse warning:
> 
> warning: incorrect type in assignment (different base types)
>      expected unsigned int [usertype] device_state
>      got restricted pci_power_t [usertype] current_state
> 
> Since this variable is used only once in the function, remove it and
> use pdev->current_state directly. While at it, also add a blank line
> after the last local variable declaration.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>

Regardless of a couple of comments:
Reviewed-by: John Garry <john.garry@huawei.com>

> ---
>   drivers/scsi/pm8001/pm8001_init.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/pm8001/pm8001_init.c b/drivers/scsi/pm8001/pm8001_init.c
> index d8a2121cb8d9..4b9a26f008a9 100644
> --- a/drivers/scsi/pm8001/pm8001_init.c
> +++ b/drivers/scsi/pm8001/pm8001_init.c
> @@ -1335,13 +1335,13 @@ static int __maybe_unused pm8001_pci_resume(struct device *dev)
>   	struct pm8001_hba_info *pm8001_ha;
>   	int rc;
>   	u8 i = 0, j;
> -	u32 device_state;
>   	DECLARE_COMPLETION_ONSTACK(completion);
> +
>   	pm8001_ha = sha->lldd_ha;
> -	device_state = pdev->current_state;
>   
> -	pm8001_info(pm8001_ha, "pdev=0x%p, slot=%s, resuming from previous operating state [D%d]\n",
> -		      pdev, pm8001_ha->name, device_state);
> +	pm8001_info(pm8001_ha,
> +		    "pdev=0x%p, slot=%s, resuming from previous operating state [D%d]\n",

I think that we may put this on the same line as pm8001_info

Feel free to ignore this: if we're ok with changing logs, I am not sure 
on the "slot" value - it is already printed with pm8001_info. And 
printing pdev is suspect, since we should really be using dev_info or 
pci_info() and friends - but that is a bigger job.

Thanks,
John

> +		    pdev, pm8001_ha->name, pdev->current_state);
>   
>   	rc = pci_go_44(pdev);
>   	if (rc)
Damien Le Moal Feb. 10, 2022, 10:43 p.m. UTC | #2
On 2/10/22 23:28, John Garry wrote:
> On 10/02/2022 11:42, Damien Le Moal wrote:
>> In pm80xx_send_abort_all(), the n_elem field of the ccb used is not
>> initialized to 0. This missing initialization sometimes lead to the
>> task completion path seeing the ccb with a non-zero n_elem resulting in
>> the execution of invalid dma_unmap_sg() calls in pm8001_ccb_task_free(),
>> causing a crash such as:
>>
>> [  197.676341] RIP: 0010:iommu_dma_unmap_sg+0x6d/0x280
>> [  197.700204] RSP: 0018:ffff889bbcf89c88 EFLAGS: 00010012
>> [  197.705485] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff83d0bda0
>> [  197.712687] RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff88810dffc0d0
>> [  197.719887] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff8881c790098b
>> [  197.727089] R10: ffffed1038f20131 R11: 0000000000000001 R12: 0000000000000000
>> [  197.734296] R13: ffff88810dffc0d0 R14: 0000000000000010 R15: 0000000000000000
>> [  197.741493] FS:  0000000000000000(0000) GS:ffff889bbcf80000(0000) knlGS:0000000000000000
>> [  197.749659] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  197.755459] CR2: 00007f16c1b42734 CR3: 0000000004814000 CR4: 0000000000350ee0
>> [  197.762656] Call Trace:
>> [  197.765127]  <IRQ>
>> [  197.767162]  pm8001_ccb_task_free+0x5f1/0x820 [pm80xx]
>> [  197.772364]  ? do_raw_spin_unlock+0x54/0x220
>> [  197.776680]  pm8001_mpi_task_abort_resp+0x2ce/0x4f0 [pm80xx]
>> [  197.782406]  process_oq+0xe85/0x7890 [pm80xx]
>> [  197.786817]  ? lock_acquire+0x194/0x490
>> [  197.790697]  ? handle_irq_event+0x10e/0x1b0
>> [  197.794920]  ? mpi_sata_completion+0x2d70/0x2d70 [pm80xx]
>> [  197.800378]  ? __wake_up_bit+0x100/0x100
>> [  197.804340]  ? lock_is_held_type+0x98/0x110
>> [  197.808565]  pm80xx_chip_isr+0x94/0x130 [pm80xx]
>> [  197.813243]  tasklet_action_common.constprop.0+0x24b/0x2f0
>> [  197.818785]  __do_softirq+0x1b5/0x82d
>> [  197.822485]  ? do_raw_spin_unlock+0x54/0x220
>> [  197.826799]  __irq_exit_rcu+0x17e/0x1e0
>> [  197.830678]  irq_exit_rcu+0xa/0x20
>> [  197.834114]  common_interrupt+0x78/0x90
>> [  197.840051]  </IRQ>
>> [  197.844236]  <TASK>
>> [  197.848397]  asm_common_interrupt+0x1e/0x40
>>
> 
> That's nasty.
> 
>> Avoid this issue by always initializing the ccb n_elem field to 0 in
>> pm8001_send_abort_all(), pm8001_send_read_log() and
>> pm80xx_send_abort_all().
>>
>> Fixes: c6b9ef5779c3 ("[SCSI] pm80xx: NCQ error handling changes")
>> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
>> ---
>>   drivers/scsi/pm8001/pm8001_hwi.c | 2 ++
>>   drivers/scsi/pm8001/pm80xx_hwi.c | 1 +
>>   2 files changed, 3 insertions(+)
>>
>> diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
>> index 8095eb0b04f7..d853e8d0195a 100644
>> --- a/drivers/scsi/pm8001/pm8001_hwi.c
>> +++ b/drivers/scsi/pm8001/pm8001_hwi.c
>> @@ -1788,6 +1788,7 @@ static void pm8001_send_abort_all(struct pm8001_hba_info *pm8001_ha,
>>   	ccb->device = pm8001_ha_dev;
>>   	ccb->ccb_tag = ccb_tag;
>>   	ccb->task = task;
>> +	ccb->n_elem = 0;
> 
> Do you think that it would be better to clear this field when we free 
> the tag/ccb in pm8001_ccb_task_free()? I will note that there are error 
> paths whch only free the tag (not ccb), so need to be careful there.

I am thinking that hunk like this one:

        ccb = &pm8001_ha->ccb_info[ccb_tag];

        ccb->device = pm8001_ha_dev;

        ccb->ccb_tag = ccb_tag;

        ccb->task = task;

To initialize a new ccb should be wrapped into a helper function, which
would also add initialization for the other ccb fields. There are many
places that have code like this, so that will also nicely cleanup the
code. Then the free path can be left alone. Hmm ?

> 
> BTW, I see that this never landed:
> https://lore.kernel.org/lkml/20211214090337.29156-1-niejianglei2021@163.com/

Repost !

> 
> Though alloc'ing a domain_device in pm8001_send_read_log() is questionable.

Yes, and the messages say "device not found" when the task completes so
I think it is 100% useless. But I did not touch that since it did not
seem to cause troubles.

> 
> Thanks,
> John
> 
>>   
>>   	circularQ = &pm8001_ha->inbnd_q_tbl[0];
>>   
>> @@ -1849,6 +1850,7 @@ static void pm8001_send_read_log(struct pm8001_hba_info *pm8001_ha,
>>   	ccb->device = pm8001_ha_dev;
>>   	ccb->ccb_tag = ccb_tag;
>>   	ccb->task = task;
>> +	ccb->n_elem = 0;
>>   	pm8001_ha_dev->id |= NCQ_READ_LOG_FLAG;
>>   	pm8001_ha_dev->id |= NCQ_2ND_RLE_FLAG;
>>   
>> diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
>> index 4d88c0dbcefc..902af4eefa26 100644
>> --- a/drivers/scsi/pm8001/pm80xx_hwi.c
>> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
>> @@ -1801,6 +1801,7 @@ static void pm80xx_send_abort_all(struct pm8001_hba_info *pm8001_ha,
>>   	ccb->device = pm8001_ha_dev;
>>   	ccb->ccb_tag = ccb_tag;
>>   	ccb->task = task;
>> +	ccb->n_elem = 0;
>>   
>>   	circularQ = &pm8001_ha->inbnd_q_tbl[0];
>>   
>
Damien Le Moal Feb. 10, 2022, 10:44 p.m. UTC | #3
On 2/11/22 00:35, John Garry wrote:
> On 10/02/2022 11:41, Damien Le Moal wrote:
>> The first 3 patches fix a problem with libsas handling of NCQ NON DATA
>> commands and simplify libsas code in a couple of places.
>> The remaining patches are fixes for the pm8001 driver:
>> * All sparse warnings are addressed, fixing along the way many le32
>>    handling bugs for big-endian architectures
>> * Fix handling of NCQ NON DATA commands
>> * Fix NCQ error recovery (abort all task execution) that was causing a
>>    crash
>> * Simplify the code in many places
>>
>> With these fixes, libzbc test suite passes all test case. This test
>> suite was used with an SMR drive for testing because it generates many
>> NCQ NON DATA commands (for zone management commands) and also generates
>> many NCQ command errors to check ASC/ASCQ returned by the device. With
>> the test suite, the error recovery path was extensively exercised.
>>
>> Note that without these patches, libzbc test suite result in the
>> controller hanging, or in kernel crashes.
> 
> Unfortunately I still see the hang on my arm64 system with this series :(

That is unfortunate. Any particular command sequence triggering the hang
? Or is it random ? What workload are you running ?

> 
> Thanks,
> John
Christoph Hellwig Feb. 11, 2022, 6:14 a.m. UTC | #4
On Thu, Feb 10, 2022 at 08:42:06PM +0900, Damien Le Moal wrote:
> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
> @@ -71,14 +71,13 @@ static void pm80xx_pci_mem_copy(struct pm8001_hba_info  *pm8001_ha, u32 soffset,
>  				u32 dw_count, u32 bus_base_number)
>  {
>  	u32 index, value, offset;
> -	u32 *destination1;
> -	destination1 = (u32 *)destination;
> +	__le32 *destination1 = (__le32 *)destination;

I think the right fix here is to declare the destination argument as a
le32 pointer without the incorrect const attribute.
Damien Le Moal Feb. 11, 2022, 7:18 a.m. UTC | #5
On 2/11/22 15:14, Christoph Hellwig wrote:
> On Thu, Feb 10, 2022 at 08:42:06PM +0900, Damien Le Moal wrote:
>> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
>> @@ -71,14 +71,13 @@ static void pm80xx_pci_mem_copy(struct pm8001_hba_info  *pm8001_ha, u32 soffset,
>>  				u32 dw_count, u32 bus_base_number)
>>  {
>>  	u32 index, value, offset;
>> -	u32 *destination1;
>> -	destination1 = (u32 *)destination;
>> +	__le32 *destination1 = (__le32 *)destination;
> 
> I think the right fix here is to declare the destination argument as a
> le32 pointer without the incorrect const attribute.

Yes. Much cleaner :)
John Garry Feb. 11, 2022, 9:24 a.m. UTC | #6
On 10/02/2022 22:44, Damien Le Moal wrote:

Hi Damien,

>>> Note that without these patches, libzbc test suite result in the
>>> controller hanging, or in kernel crashes.
>> Unfortunately I still see the hang on my arm64 system with this series:(
> That is unfortunate. Any particular command sequence triggering the hang
> ? Or is it random ? What workload are you running ?
> 

mount/unmount fails mostly even after as few as one attempt, but then 
even fdisk -l fails sometimes:

root@(none)$ fdisk -l
[   97.924789] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[   97.930652] sas: sas_scsi_find_task: aborting task 0x(____ptrval____)
[   97.937149] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   97.943232] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   97.952020] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   97.958099] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   97.966881] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   97.972961] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   97.981737] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   97.991241] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.000752] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.010223] pm80xx0:: pm8001_abort_task  1345:rc= -5
[   98.015180] sas: sas_scsi_find_task: querying task 0x(____ptrval____)
[   98.021645] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.027725] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.036495] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.042574] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.051344] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.057423] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.066154] pm80xx: rc= -5
[   98.068854] sas: sas_scsi_find_task: aborting task 0x(____ptrval____)
[   98.075331] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.081411] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.090192] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.096271] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.105053] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.111132] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.119909] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.129414] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.138919] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.148388] pm80xx0:: pm8001_abort_task  1345:rc= -5
[   98.153345] sas: sas_scsi_find_task: querying task 0x(____ptrval____)
[   98.159812] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.165892] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.174661] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.180741] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.189511] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.195590] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.204320] pm80xx: rc= -5
[   98.207019] sas: sas_scsi_find_task: aborting task 0x(____ptrval____)
[   98.213497] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.219577] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.228359] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.234438] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.243220] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.249299] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.258075] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.267580] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.277085] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.286554] pm80xx0:: pm8001_abort_task  1345:rc= -5
[   98.291510] sas: sas_scsi_find_task: querying task 0x(____ptrval____)
[   98.297978] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.304059] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.312827] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.318906] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.327677] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.333756] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.342487] pm80xx: rc= -5
[   98.345186] sas: sas_scsi_find_task: aborting task 0x(____ptrval____)
[   98.351664] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.357743] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.366525] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.372604] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.381386] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.387465] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.396230] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.405734] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.415239] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.424709] pm80xx0:: pm8001_abort_task  1345:rc= -5
[   98.429665] sas: sas_scsi_find_task: querying task 0x(____ptrval____)
[   98.436133] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.442213] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.450982] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.457061] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.465825] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.471904] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.480629] pm80xx: rc= -5
[   98.483327] sas: sas_scsi_find_task: aborting task 0x(____ptrval____)
[   98.489800] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.495880] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.504661] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.510740] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.519523] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.525602] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.534372] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.543877] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.553382] pm80xx0:: pm8001_mpi_task_abort_resp  3682:task abort 
failed status 0x6 ,tag = 0x2, scp= 0x0
[   98.562851] pm80xx0:: pm8001_abort_task  1345:rc= -5
[   98.567807] sas: sas_scsi_find_task: querying task 0x(____ptrval____)
[   98.574275] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.580355] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.589124] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.595203] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.603968] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
[   98.610047] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
Failure Drive:5000c500a7babc61
[   98.618778] pm80xx: rc= -5
[   98.621505] sas: --- Exit sas_scsi_recover_host: busy: 1 failed: 1 
tries: 1

Sometimes I get TMF timeouts, which is a bad situation. I guess it's a 
subtle driver bug, but where ....?

BTW, this following log needs removal/fixing at some stage by someone:

[   98.480629] pm80xx: rc= -5

It's from pm8001_query_task().

Thanks,
John
Damien Le Moal Feb. 11, 2022, 12:37 p.m. UTC | #7
On 2/11/22 18:24, John Garry wrote:
> On 10/02/2022 22:44, Damien Le Moal wrote:
> 
> Hi Damien,
> 
>>>> Note that without these patches, libzbc test suite result in the
>>>> controller hanging, or in kernel crashes.
>>> Unfortunately I still see the hang on my arm64 system with this series:(
>> That is unfortunate. Any particular command sequence triggering the hang
>> ? Or is it random ? What workload are you running ?
>>
> 
> mount/unmount fails mostly even after as few as one attempt, but then 
> even fdisk -l fails sometimes:

Try with patch 21 of my v2. It does fix a bug for scsi/sas case. That
problem would likely lead to a crash though, but never know...

> root@(none)$ fdisk -l
> [   97.924789] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
> [   97.930652] sas: sas_scsi_find_task: aborting task 0x(____ptrval____)
> [   97.937149] pm80xx0:: mpi_ssp_completion  1937:sas IO status 0x3b
> [   97.943232] pm80xx0:: mpi_ssp_completion  1948:SAS Address of IO 
> Failure Drive:5000c500a7babc61
[...]
> 
> Sometimes I get TMF timeouts, which is a bad situation. I guess it's a 
> subtle driver bug, but where ....?

What is the command failing ? Always the same ? Can you try adding scsi
trace to see the commands ?

If you are "lucky", it is always the same type of command like for the
NCQ NON DATA in my case. Though on mount, I would only expect a lot of
read commands and not much else. There may be some writes and a flush
too, so there will be "data" commands and "non data" commands. It may be
an issue with non-data commands too ?

> BTW, this following log needs removal/fixing at some stage by someone:
> 
> [   98.480629] pm80xx: rc= -5
> 
> It's from pm8001_query_task().
> 
> Thanks,
> John
John Garry Feb. 11, 2022, 1:08 p.m. UTC | #8
On 11/02/2022 12:37, Damien Le Moal wrote:

Hi Damien,

>> Sometimes I get TMF timeouts, which is a bad situation. I guess it's a
>> subtle driver bug, but where ....?
> What is the command failing ? Always the same ? Can you try adding scsi
> trace to see the commands ?

This is the same issue I have had since day #1.

Generally mount/unmount or even fdisk -l fails after booting into 
miniramfs. I wouldn't ever try to boot a distro.

> 
> If you are "lucky", it is always the same type of command like for the
> NCQ NON DATA in my case.

I'm just trying SAS disks to start with - so it's an SCSI READ command. 
SATA/STP support is generally never as robust for SAS HBAs (HW and LLD 
bugs are common - as this series is evidence) so I start on something 
more basic - however SATA/STP also has this issue.

The command is sent successfully but just never completes. Then 
sometimes the TMFs for error handling timeout and sometimes succeed. I 
don't have much to do on....

> Though on mount, I would only expect a lot of
> read commands and not much else.

Yes, and it is commonly the first SCSI read command which times out. It 
reliably breaks quite early. So I can think we can rule out issues like 
memory barriers/timing.

  There may be some writes and a flush
> too, so there will be "data" commands and "non data" commands. It may be
> an issue with non-data commands too ?
> 

Not sure on that. I guess it isn't.

Thanks,
John
Damien Le Moal Feb. 11, 2022, 1:14 p.m. UTC | #9
On 2/11/22 22:08, John Garry wrote:
> On 11/02/2022 12:37, Damien Le Moal wrote:
> 
> Hi Damien,
> 
>>> Sometimes I get TMF timeouts, which is a bad situation. I guess it's a
>>> subtle driver bug, but where ....?
>> What is the command failing ? Always the same ? Can you try adding scsi
>> trace to see the commands ?
> 
> This is the same issue I have had since day #1.
> 
> Generally mount/unmount or even fdisk -l fails after booting into 
> miniramfs. I wouldn't ever try to boot a distro.

busybox ?

> 
>>
>> If you are "lucky", it is always the same type of command like for the
>> NCQ NON DATA in my case.
> 
> I'm just trying SAS disks to start with - so it's an SCSI READ command. 
> SATA/STP support is generally never as robust for SAS HBAs (HW and LLD 
> bugs are common - as this series is evidence) so I start on something 
> more basic - however SATA/STP also has this issue.
> 
> The command is sent successfully but just never completes. Then 
> sometimes the TMFs for error handling timeout and sometimes succeed. I 
> don't have much to do on....

No SAS bus analyzer lying in a corner of the office ? :)
That could help...

I will go to the office Monday. So I will get a chance to add SAS drives
to my setup to see what I get. I have only tested with SATA until now.
My controller is not the same chip as yours though.

> 
>> Though on mount, I would only expect a lot of
>> read commands and not much else.
> 
> Yes, and it is commonly the first SCSI read command which times out. It 
> reliably breaks quite early. So I can think we can rule out issues like 
> memory barriers/timing.
> 
>   There may be some writes and a flush
>> too, so there will be "data" commands and "non data" commands. It may be
>> an issue with non-data commands too ?
>>
> 
> Not sure on that. I guess it isn't.

Anything special with the drives you are using ? Have you tried other
drives to see if you get lucky ?

> 
> Thanks,
> John
John Garry Feb. 11, 2022, 1:54 p.m. UTC | #10
Hi Damien,

>>
>>>> Sometimes I get TMF timeouts, which is a bad situation. I guess it's a
>>>> subtle driver bug, but where ....?
>>> What is the command failing ? Always the same ? Can you try adding scsi
>>> trace to see the commands ?
>>
>> This is the same issue I have had since day #1.
>>
>> Generally mount/unmount or even fdisk -l fails after booting into
>> miniramfs. I wouldn't ever try to boot a distro.
> 
> busybox ?
> 

Yes

>>
>>>
>>> If you are "lucky", it is always the same type of command like for the
>>> NCQ NON DATA in my case.
>>
>> I'm just trying SAS disks to start with - so it's an SCSI READ command.
>> SATA/STP support is generally never as robust for SAS HBAs (HW and LLD
>> bugs are common - as this series is evidence) so I start on something
>> more basic - however SATA/STP also has this issue.
>>
>> The command is sent successfully but just never completes. Then
>> sometimes the TMFs for error handling timeout and sometimes succeed. I
>> don't have much to do on....
> 
> No SAS bus analyzer lying in a corner of the office ? :)
> That could help...

None unfortunately

> 
> I will go to the office Monday. So I will get a chance to add SAS drives
> to my setup to see what I get. I have only tested with SATA until now.
> My controller is not the same chip as yours though.

jfyi, Ajish, cc'ed, from microchip says that they see the same issue on 
their arm64 system. Unfortunately fixing it is not a priority for them. 
So it is something which arm64 is exposing.

And I tried an old kernel - like 4.10 - on the same board and the pm8001 
driver was working somewhat reliably (no hangs). It just looks like a 
driver issue.

I'll have a look at the driver code again if I get a chance. It might be 
a DMA issue.

> 
>>
>>> Though on mount, I would only expect a lot of
>>> read commands and not much else.
>>
>> Yes, and it is commonly the first SCSI read command which times out. It
>> reliably breaks quite early. So I can think we can rule out issues like
>> memory barriers/timing.
>>
>>    There may be some writes and a flush
>>> too, so there will be "data" commands and "non data" commands. It may be
>>> an issue with non-data commands too ?
>>>
>>
>> Not sure on that. I guess it isn't.
> 
> Anything special with the drives you are using ? Have you tried other
> drives to see if you get lucky ?
> 

I think that the drives are ok. On the same board I originally had them 
plugged in the hisi_sas HBA and no issues.

thanks,
john
Damien Le Moal Feb. 12, 2022, 6:19 a.m. UTC | #11
On 2/11/22 22:54, John Garry wrote:
> Hi Damien,
> 
>>>
>>>>> Sometimes I get TMF timeouts, which is a bad situation. I guess it's a
>>>>> subtle driver bug, but where ....?
>>>> What is the command failing ? Always the same ? Can you try adding scsi
>>>> trace to see the commands ?
>>>
>>> This is the same issue I have had since day #1.
>>>
>>> Generally mount/unmount or even fdisk -l fails after booting into
>>> miniramfs. I wouldn't ever try to boot a distro.
>>
>> busybox ?
>>
> 
> Yes
> 
>>>
>>>>
>>>> If you are "lucky", it is always the same type of command like for the
>>>> NCQ NON DATA in my case.
>>>
>>> I'm just trying SAS disks to start with - so it's an SCSI READ command.
>>> SATA/STP support is generally never as robust for SAS HBAs (HW and LLD
>>> bugs are common - as this series is evidence) so I start on something
>>> more basic - however SATA/STP also has this issue.
>>>
>>> The command is sent successfully but just never completes. Then
>>> sometimes the TMFs for error handling timeout and sometimes succeed. I
>>> don't have much to do on....
>>
>> No SAS bus analyzer lying in a corner of the office ? :)
>> That could help...
> 
> None unfortunately
> 
>>
>> I will go to the office Monday. So I will get a chance to add SAS drives
>> to my setup to see what I get. I have only tested with SATA until now.
>> My controller is not the same chip as yours though.
> 
> jfyi, Ajish, cc'ed, from microchip says that they see the same issue on 
> their arm64 system. Unfortunately fixing it is not a priority for them. 
> So it is something which arm64 is exposing.
> 
> And I tried an old kernel - like 4.10 - on the same board and the pm8001 
> driver was working somewhat reliably (no hangs). It just looks like a 
> driver issue.
> 
> I'll have a look at the driver code again if I get a chance. It might be 
> a DMA issue.

There is one more thing that I find strange in the driver and that may
cause problems: tag 0 is a perfectly valid tag value that can be
returned by pm8001_tag_alloc() since find_first_zero_bit() will return 0
if the first bit is 0. And yet, there are many places in the driver that
treat !tag as an error. Extremely weird, if not outright broken...

I patched that and tested and everything seems OK... Could it be that
you are not seeing some completions because of that ?

I added the patch to my v3. Will send Monday.