From patchwork Thu Oct 22 12:34:46 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Muneendra Kumar
X-Patchwork-Id: 287079
From: Muneendra
To: linux-scsi@vger.kernel.org, michael.christie@oracle.com, hare@suse.de
Cc: jsmart2021@gmail.com, emilne@redhat.com, mkumar@redhat.com, Muneendra
Subject: [patch v4 0/5] scsi: Support to handle Intermittent errors
Date: Thu, 22 Oct 2020 18:04:46 +0530
Message-Id: <1603370091-9337-1-git-send-email-muneendra.kumar@broadcom.com>
X-Mailer: git-send-email 1.8.3.1
Precedence: bulk
X-Mailing-List: linux-scsi@vger.kernel.org

This patch series adds support for preventing retries of all I/Os after
an abort succeeds on a particular device when transport
connectivity to the device is encountering intermittent errors.

Intermittent connectivity is a condition that can be detected through
transport fabric notifications. A service can monitor the ELS
notifications and take action on all the outstanding I/Os of a SCSI
device at that instant.

This feature is intended to be used when the device is part of a
multipath environment. When the service detects poor connectivity, the
multipath path can be placed in a marginal path group and ignored for
further I/O operations.

After placing a path in the marginal path group, the daemon sets the
port_state to Marginal through the new sysfs interface provided in this
series, which sets a bit in scmd->state for all the I/Os on that
particular device. This prevents retries of all the I/Os if an I/O hits
a SCSI timeout, which in turn issues an abort.

If the abort succeeds on a marginal path, the I/O is immediately retried
on another active path. If the abort fails, recovery escalates to the
existing target reset sg interface recovery process.

Below is the interface provided to set the port state to Marginal and
Online:

  echo "Marginal" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state
  echo "Online" >> /sys/class/fc_remote_ports/rport-X\:Y-Z/port_state

The patches were cut against the 5.10/scsi-queue tree.

---
v4:
Made changes in the fc_eh_timed_out callout to set
SCMD_NORETRIES_ABORT if the port state is marginal.
With this change, we removed the code that looped over running commands
and the fc_remote_port_chkready changes that set SCMD_NORETRIES_ABORT.
Removed the scsi_cmd argument from fc_remote_port_chkready and reverted
the patches that added this argument.
Removed unnecessary comments.
Handle the return of errors on failure.

v3:
Removed port_state from the starget attributes.
Enabled the store functionality for port_state under the remote port.
Added a new scsi_cmd argument to fc_remote_port_chkready.
Used the existing SCSI command iterator scsi_host_busy_iter.
Rearranged the patches.
Added new patches to add the new argument to fc_remote_port_chkready.

v2:
Added a new error code DID_TRANSPORT_MARGINAL to handle marginal errors.
Added a new rport_state FC_PORTSTATE_MARGINAL and a new sysfs
interface port_state to set the port_state to marginal.
Added support in lpfc to handle the marginal state.

Muneendra (5):
  scsi: Added a new definition in scsi_cmnd.h
  scsi: Added a new error code in scsi.h
  scsi: No retries on abort success
  scsi_transport_fc: Added a new rport state FC_PORTSTATE_MARGINAL
  scsi_transport_fc: Added store fucntionality to set the rport
    port_state using sysfs

 drivers/scsi/scsi_error.c        | 10 ++++
 drivers/scsi/scsi_lib.c          |  2 +
 drivers/scsi/scsi_transport_fc.c | 97 ++++++++++++++++++++++++++------
 include/scsi/scsi.h              |  1 +
 include/scsi/scsi_cmnd.h         |  3 +
 include/scsi/scsi_transport_fc.h | 19 +++++++
 6 files changed, 114 insertions(+), 18 deletions(-)
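
For illustration only, a monitoring service could wrap the port_state
interface described in this cover letter in a small helper such as the
one below. This is a minimal sketch and not part of the series: the
function name set_rport_state and the FC_RPORT_SYSFS override variable
are hypothetical, the latter added only so the helper can be tried
against a scratch directory instead of the real sysfs tree. The sketch
accepts only the two values the cover letter documents for writes.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch set): write "Marginal" or
# "Online" to the port_state attribute of one FC remote port.
# FC_RPORT_SYSFS is an assumed override for testing; by default the path
# from the cover letter is used.
set_rport_state() {
    rport=$1    # e.g. rport-X:Y-Z
    state=$2    # Marginal or Online
    base=${FC_RPORT_SYSFS:-/sys/class/fc_remote_ports}
    attr="$base/$rport/port_state"

    # Accept only the two states the cover letter documents for writes.
    case "$state" in
        Marginal|Online) ;;
        *) echo "unsupported state: $state" >&2; return 2 ;;
    esac

    # The attribute must already exist and be writable; a write should
    # never create a new file under sysfs.
    if [ ! -w "$attr" ]; then
        echo "cannot write $attr" >&2
        return 1
    fi

    printf '%s\n' "$state" > "$attr"
}
```

A service would call it as "set_rport_state rport-X:Y-Z Marginal" after
moving the path into the marginal path group, and again with "Online"
once connectivity is judged healthy.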