From patchwork Sun Oct 30 18:43:04 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Hannes Reinecke
X-Patchwork-Id: 80088
Subject: Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination.
To: Andrey Grodzovsky , MPT-FusionLinux.pdl@broadcom.com
References: <1477831417-25655-1-git-send-email-andrey2805@gmail.com>
Cc: linux-scsi@vger.kernel.org, Sathya Prakash , Chaitra P B ,
 Suganath Prabu Subramani , Sreekanth Reddy , stable@vger.kernel.org
From: Hannes Reinecke
Message-ID: <2179ecb8-183f-a500-4d65-10f64f0f43cc@suse.de>
Date: Sun, 30 Oct 2016 19:43:04 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0
MIME-Version: 1.0
In-Reply-To: <1477831417-25655-1-git-send-email-andrey2805@gmail.com>
Sender: stable-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: stable@vger.kernel.org

On 10/30/2016 01:43 PM, Andrey Grodzovsky wrote:
> Problem:
> This is a work around for a bug with LSI Fusion MPT SAS2 when
> performing secure erase. Due to the very long time the operation
> takes commands issued during the erase will time out and will trigger
> execution of abort hook. Even though the abort hook is called for
> the specific command which timed out this leads to entire device halt
> (scsi_state terminated) and premature termination of the secured erase.
>
Actually, it is _not_ the erase command which times out; it's the
subsequent commands which time out, as the controller is unable to
process them while the erase is running.
I suspect a bug in the SAT layer of the mpt3sas firmware, which simply
does not return 'busy' for additional commands while an erase is in
progress.

That being said, this issue was obscured prior to the introduction of
asynchronous aborts, as originally a timeout would invoke SCSI EH,
which would wait for all outstanding commands to complete.
So by the time SCSI EH was invoked the erase command had already
completed, allowing for a successful retry of the failing command.

With asynchronous aborts we don't have this option, as the abort will
succeed, but the command cannot be retried as the original erase
command is still running.

In the light of the above I guess we need something like the attached
patch.
I'm not utterly proud of it, but I guess it's the best we can do for
the moment.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   zSeries & Storage
hare@suse.de			   +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

Reviewed-by: Hannes Reinecke

>From 1556746987c3b4c1a1a4705625280b1136554f89 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke
Date: Sun, 30 Oct 2016 14:24:44 +0100
Subject: [PATCH] mpt3sas: hack: disable concurrent commands for ATA_16/ATA_12

There's a bug in the mpt3sas driver/firmware which would not return
BUSY if it's busy processing requests (e.g. 'erase') and cannot respond
to other commands. Hence these commands will time out and eventually
start the error handler.
This patch disallows request processing whenever an ATA_12 or ATA_16
command is received, thereby avoiding this problem.

Signed-off-by: Hannes Reinecke
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 97987e7..18b9f09 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -4096,6 +4096,13 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
 	    sas_device_priv_data->block)
 		return SCSI_MLQUEUE_DEVICE_BUSY;
 
+	/*
+	 * Hack: block the device for any ATA_12/ATA_16 command
+	 */
+	if (scmd->cmnd[0] == 0xa1 || scmd->cmnd[0] == 0x85) {
+		sas_device_priv_data = scmd->device->hostdata;
+		_scsih_internal_device_block(scmd->device, sas_device_priv_data);
+	}
 	if (scmd->sc_data_direction == DMA_FROM_DEVICE)
 		mpi_control = MPI2_SCSIIO_CONTROL_READ;
 	else if (scmd->sc_data_direction == DMA_TO_DEVICE)
@@ -4835,6 +4842,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
 
  out:
 
+	if (scmd->cmnd[0] == 0xa1 || scmd->cmnd[0] == 0x85) {
+		sas_device_priv_data = scmd->device->hostdata;
+		_scsih_internal_device_unblock(scmd->device, sas_device_priv_data);
+	}
 	scsi_dma_unmap(scmd);
 
 	scmd->scsi_done(scmd);
-- 
2.6.6