From patchwork Wed Aug 15 10:23:01 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhen Lei
X-Patchwork-Id: 144276
From: Zhen Lei
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
 linux-kernel
CC: Zhen Lei, LinuxArm, Hanjun Guo, Libin, "John Garry"
Subject: [PATCH v3 1/2] iommu/arm-smmu-v3: fix unexpected CMD_SYNC timeout
Date: Wed, 15 Aug 2018 18:23:01 +0800
Message-ID: <1534328582-17664-2-git-send-email-thunder.leizhen@huawei.com>
X-Mailer: git-send-email 1.9.5.msysgit.0
In-Reply-To: <1534328582-17664-1-git-send-email-thunder.leizhen@huawei.com>
References: <1534328582-17664-1-git-send-email-thunder.leizhen@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The condition "(int)(VAL - sync_idx) >= 0" to break loop in function
__arm_smmu_sync_poll_msi requires that sync_idx increase monotonically,
following the order of the CMDs in the cmdq.

But ".msidata = atomic_inc_return_relaxed(&smmu->sync_nr)" is not
protected by the spinlock, so the following scenario may occur:

cpu0                    cpu1
msidata=0
                        msidata=1
                        insert cmd1
insert cmd0
                        smmu execute cmd1
smmu execute cmd0
                        poll timeout, because msidata=1 is overridden by
                        cmd0, that means VAL=0, sync_idx=1.

This is not a functional problem; it only makes the caller wait until
TIMEOUT. It rarely happens, because any other CMD_SYNC issued during the
waiting period will end the wait.

Signed-off-by: Zhen Lei
---
 drivers/iommu/arm-smmu-v3.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

-- 
1.8.3

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 1d64710..3f5c236 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -566,7 +566,7 @@ struct arm_smmu_device {
 
 	int				gerr_irq;
 	int				combined_irq;
-	atomic_t			sync_nr;
+	u32				sync_nr;
 
 	unsigned long			ias; /* IPA */
 	unsigned long			oas; /* PA */
@@ -775,6 +775,11 @@ static int queue_remove_raw(struct arm_smmu_queue *q, u64 *ent)
 	return 0;
 }
 
+static inline void arm_smmu_cmdq_sync_set_msidata(u64 *cmd, u32 msidata)
+{
+	cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, msidata);
+}
+
 /* High-level queue accessors */
 static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 {
@@ -836,7 +841,6 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 			cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_SEV);
 		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSH, ARM_SMMU_SH_ISH);
 		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIATTR, ARM_SMMU_MEMATTR_OIWB);
-		cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, ent->sync.msidata);
 		cmd[1] |= ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
 		break;
 	default:
@@ -947,7 +951,6 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
 	struct arm_smmu_cmdq_ent ent = {
 		.opcode = CMDQ_OP_CMD_SYNC,
 		.sync	= {
-			.msidata = atomic_inc_return_relaxed(&smmu->sync_nr),
 			.msiaddr = virt_to_phys(&smmu->sync_count),
 		},
 	};
@@ -955,6 +958,8 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
 	arm_smmu_cmdq_build_cmd(cmd, &ent);
 
 	spin_lock_irqsave(&smmu->cmdq.lock, flags);
+	ent.sync.msidata = ++smmu->sync_nr;
+	arm_smmu_cmdq_sync_set_msidata(cmd, ent.sync.msidata);
 	arm_smmu_cmdq_insert_cmd(smmu, cmd);
 	spin_unlock_irqrestore(&smmu->cmdq.lock, flags);
 
@@ -2179,7 +2184,6 @@ static int arm_smmu_init_structures(struct arm_smmu_device *smmu)
 {
 	int ret;
 
-	atomic_set(&smmu->sync_nr, 0);
 	ret = arm_smmu_init_queues(smmu);
 	if (ret)
 		return ret;

From patchwork Wed Aug 15 10:23:02 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhen Lei
X-Patchwork-Id: 144277
From: Zhen Lei
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
 linux-kernel
CC: Zhen Lei, LinuxArm, Hanjun Guo, Libin, "John Garry"
Subject: [PATCH v3 2/2] iommu/arm-smmu-v3: avoid redundant
 CMD_SYNCs if possible
Date: Wed, 15 Aug 2018 18:23:02 +0800
Message-ID: <1534328582-17664-3-git-send-email-thunder.leizhen@huawei.com>
X-Mailer: git-send-email 1.9.5.msysgit.0
In-Reply-To: <1534328582-17664-1-git-send-email-thunder.leizhen@huawei.com>
References: <1534328582-17664-1-git-send-email-thunder.leizhen@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Two or more CMD_SYNCs may be adjacent in the command queue, in which
case the first one has already done what the others want to do. Dropping
the redundant CMD_SYNCs can improve IO performance, especially under
heavy load.

In my test environment, the number of CMD_SYNCs was reduced by about
1/3. See below:
CMD_SYNCs reduced:	19542181
CMD_SYNCs total:	58098548  (include reduced)
CMDs total:		116197099 (TLBI:SYNC about 1:1)

Signed-off-by: Zhen Lei
---
 drivers/iommu/arm-smmu-v3.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

-- 
1.8.3

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 3f5c236..ee0219b 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -567,6 +567,7 @@ struct arm_smmu_device {
 	int				gerr_irq;
 	int				combined_irq;
 	u32				sync_nr;
+	u8				prev_cmd_opcode;
 
 	unsigned long			ias; /* IPA */
 	unsigned long			oas; /* PA */
@@ -780,6 +781,11 @@ static inline void arm_smmu_cmdq_sync_set_msidata(u64 *cmd, u32 msidata)
 	cmd[0] |= FIELD_PREP(CMDQ_SYNC_0_MSIDATA, msidata);
 }
 
+static inline u8 arm_smmu_cmd_opcode_get(u64 *cmd)
+{
+	return cmd[0] & CMDQ_0_OP;
+}
+
 /* High-level queue accessors */
 static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 {
@@ -904,6 +910,8 @@ static void arm_smmu_cmdq_insert_cmd(struct arm_smmu_device *smmu, u64 *cmd)
 	struct arm_smmu_queue *q = &smmu->cmdq.q;
 	bool wfe = !!(smmu->features & ARM_SMMU_FEAT_SEV);
 
+	smmu->prev_cmd_opcode = arm_smmu_cmd_opcode_get(cmd);
+
 	while (queue_insert_raw(q, cmd) == -ENOSPC) {
 		if (queue_poll_cons(q, false, wfe))
 			dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
@@ -958,9 +966,17 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
 	arm_smmu_cmdq_build_cmd(cmd, &ent);
 
 	spin_lock_irqsave(&smmu->cmdq.lock, flags);
-	ent.sync.msidata = ++smmu->sync_nr;
-	arm_smmu_cmdq_sync_set_msidata(cmd, ent.sync.msidata);
-	arm_smmu_cmdq_insert_cmd(smmu, cmd);
+	if (smmu->prev_cmd_opcode == CMDQ_OP_CMD_SYNC) {
+		/*
+		 * Previous command is CMD_SYNC also, there is no need to add
+		 * one more. Just poll it.
+		 */
+		ent.sync.msidata = smmu->sync_nr;
+	} else {
+		ent.sync.msidata = ++smmu->sync_nr;
+		arm_smmu_cmdq_sync_set_msidata(cmd, ent.sync.msidata);
+		arm_smmu_cmdq_insert_cmd(smmu, cmd);
+	}
 	spin_unlock_irqrestore(&smmu->cmdq.lock, flags);
 
 	return __arm_smmu_sync_poll_msi(smmu, ent.sync.msidata);