From patchwork Fri Sep 6 12:55:37 2019
X-Patchwork-Submitter: John Garry
X-Patchwork-Id: 173233
From: John Garry
CC: Xiang Chen, John Garry
Subject: [PATCH 13/13] scsi: hisi_sas: Fix the conflict between device gone and host reset
Date: Fri, 6 Sep 2019 20:55:37 +0800
Message-ID: <1567774537-20003-14-git-send-email-john.garry@huawei.com>
In-Reply-To: <1567774537-20003-1-git-send-email-john.garry@huawei.com>
References: <1567774537-20003-1-git-send-email-john.garry@huawei.com>
X-Mailing-List: linux-scsi@vger.kernel.org

From: Xiang Chen

When a device is gone, hisi_sas_dev_gone() checks whether a host reset is
in progress and, if not, sends an internal task abort.
Before the internal task abort returns, a host reset may begin. The reset
checks whether dev_type is SAS_PHY_UNUSED and, if it is not, calls
hisi_sas_init_device(). By that time the domain_device may already be
freed, or partly freed, so hisi_sas_init_device() may dereference a NULL
pointer. It may occur as follows:

    thread0                         thread1
hisi_sas_dev_gone()
    check whether in RESET (no)
    internal task abort
                                    reset prep
                                    soft_reset
                                    ... (part of reset_done)
    internal task abort failed
    release resource anyway
    clear_itct
    device->lldd_dev=NULL
                                    hisi_sas_reset_init_all_devices
                                    check sas_dev->dev_type is
                                    SAS_PHY_UNUSED and !device
    set dev_type SAS_PHY_UNUSED
    sas_free_device
                                    hisi_sas_init_device
                                    ...

The semaphore hisi_hba->sem is used to synchronize device gone and host
reset. To solve the issue, expand the scope that the semaphore protects so
that the two can never run concurrently.

Some places also check whether domain_device is NULL to judge whether the
device is gone, so clear sas_dev->sas_device when the device is gone.

Signed-off-by: Xiang Chen
Signed-off-by: John Garry
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 04cbc54be387..a7b3d9d38fdc 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -1049,21 +1049,22 @@ static void hisi_sas_dev_gone(struct domain_device *device)
 	dev_info(dev, "dev[%d:%x] is gone\n",
 		 sas_dev->device_id, sas_dev->dev_type);
 
+	down(&hisi_hba->sem);
 	if (!test_bit(HISI_SAS_RESET_BIT, &hisi_hba->flags)) {
 		hisi_sas_internal_task_abort(hisi_hba, device,
 					     HISI_SAS_INT_ABT_DEV, 0);
 
 		hisi_sas_dereg_device(hisi_hba, device);
 
-		down(&hisi_hba->sem);
 		hisi_hba->hw->clear_itct(hisi_hba, sas_dev);
-		up(&hisi_hba->sem);
 		device->lldd_dev = NULL;
 	}
 
 	if (hisi_hba->hw->free_device)
 		hisi_hba->hw->free_device(sas_dev);
 	sas_dev->dev_type = SAS_PHY_UNUSED;
+	sas_dev->sas_device = NULL;
+	up(&hisi_hba->sem);
 }
 
 static int hisi_sas_queue_command(struct sas_task *task, gfp_t gfp_flags)
@@ -1543,11 +1544,11 @@ void hisi_sas_controller_reset_done(struct hisi_hba *hisi_hba)
 	msleep(1000);
 	hisi_sas_refresh_port_id(hisi_hba);
 	clear_bit(HISI_SAS_REJECT_CMD_BIT, &hisi_hba->flags);
-	up(&hisi_hba->sem);
 
 	if (hisi_hba->reject_stp_links_msk)
 		hisi_sas_terminate_stp_reject(hisi_hba);
 	hisi_sas_reset_init_all_devices(hisi_hba);
+	up(&hisi_hba->sem);
 	scsi_unblock_requests(shost);
 	clear_bit(HISI_SAS_RESET_BIT, &hisi_hba->flags);
 
-- 
2.17.1
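
For readers following the locking change, the interleaving described in the
commit message can be replayed in a small userspace model. This is only a
sketch under stated assumptions, not driver code: it is single-threaded, an
int stands in for the semaphore, a down() that would sleep is modeled by
deferring the re-init pass until up(), and every name in it (struct
hba_model, replay(), the model_* helpers) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the driver's real structures. */
enum model_dev_type { MODEL_UNUSED = 0, MODEL_END_DEVICE = 1 };

struct model_domain_device {
	void *lldd_dev;
};

struct model_sas_device {
	enum model_dev_type dev_type;
	struct model_domain_device *sas_device;
};

struct hba_model {
	int sem;		/* 1 = free, 0 = held (binary semaphore stand-in) */
	bool reinit_pending;	/* reset path is "sleeping" in down() */
	int unsafe_inits;	/* re-inits that saw a half torn-down device */
	struct model_domain_device domain;
	struct model_sas_device sas_dev;
};

/* Reset side: what the re-init pass does for one device slot. */
static void model_reinit_pass(struct hba_model *h)
{
	struct model_sas_device *sd = &h->sas_dev;

	if (sd->dev_type == MODEL_UNUSED || !sd->sas_device)
		return;			/* device is gone: skip it */
	if (!sd->sas_device->lldd_dev)
		h->unsafe_inits++;	/* would init a device mid-teardown */
}

static void model_down(struct hba_model *h)
{
	assert(h->sem == 1);	/* the model never takes a held semaphore */
	h->sem = 0;
}

static void model_up(struct hba_model *h)
{
	h->sem = 1;
	if (h->reinit_pending) {	/* wake the deferred reset path */
		h->reinit_pending = false;
		model_reinit_pass(h);
	}
}

/* Reset reaches its re-init pass while device-gone is still running. */
static void model_reset_arrives(struct hba_model *h, bool reinit_under_sem)
{
	if (reinit_under_sem && h->sem == 0)
		h->reinit_pending = true;	/* a real down() would sleep */
	else
		model_reinit_pass(h);
}

/*
 * Replay the interleaving from the commit message.
 * wide_scope == false: semaphore held only around clear_itct (before).
 * wide_scope == true:  semaphore held across the whole teardown (after).
 * Returns the number of unsafe re-init attempts observed.
 */
int replay(bool wide_scope)
{
	struct hba_model h = { .sem = 1 };

	h.domain.lldd_dev = &h.sas_dev;
	h.sas_dev.dev_type = MODEL_END_DEVICE;
	h.sas_dev.sas_device = &h.domain;

	if (wide_scope)
		model_down(&h);
	/* internal task abort fails here; resources are released anyway */
	if (!wide_scope)
		model_down(&h);
	/* clear_itct would run here */
	if (!wide_scope)
		model_up(&h);
	h.domain.lldd_dev = NULL;

	model_reset_arrives(&h, wide_scope);	/* the racy point */

	h.sas_dev.dev_type = MODEL_UNUSED;
	h.sas_dev.sas_device = NULL;
	if (wide_scope)
		model_up(&h);
	return h.unsafe_inits;
}
```

Under these assumptions, replay(false) reports one unsafe re-init and
replay(true) reports none, because the deferred pass only runs after
dev_type has been set to the unused value and sas_device has been cleared.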