From patchwork Thu May 18 07:59:55 2017
X-Patchwork-Submitter: Zhen Lei
X-Patchwork-Id: 100050
From: Zhen Lei
To: Joerg Roedel, iommu, Robin Murphy, David Woodhouse, Sudeep Dutt,
	Ashutosh Dixit, linux-kernel
CC: Zefan Li, Xinwei Hu, Tianhong Ding, Hanjun Guo, Zhen Lei
Subject: [PATCH v3 4/6] iommu/iova: to optimize the allocation performance of dma64
Date: Thu, 18 May 2017 15:59:55 +0800
Message-ID: <1495094397-9132-5-git-send-email-thunder.leizhen@huawei.com>
X-Mailer: git-send-email 1.9.5.msysgit.0
In-Reply-To: <1495094397-9132-1-git-send-email-thunder.leizhen@huawei.com>
References: <1495094397-9132-1-git-send-email-thunder.leizhen@huawei.com>

Currently, for dma64 we always start searching for free iova space at the
last node of the iovad rb-tree. In the worst case there may be many nodes
at the tail, so the first loop in __alloc_and_insert_iova_range has to
traverse a large number of them; in our tracing this exceeded 10K
iterations for an iperf test case.

__alloc_and_insert_iova_range:
	......
	curr = __get_cached_rbnode(iovad, &limit_pfn);
		//--> return rb_last(&iovad->rbroot);
	while (curr) {
		......
		curr = rb_prev(curr);
	}

So add cached64_node, which plays the same role for dma64 that cached32_node
plays for dma32, and add a boundary at the start pfn of the dma64 range, to
prevent an iova from crossing both the dma32 and dma64 areas.

	|-------------------|------------------------------|
	|<--cached32_node-->|<--------cached64_node------->|
	|                   |
	start_pfn      dma_32bit_pfn + 1

Signed-off-by: Zhen Lei
---
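Not part of the patch, for illustration only: a minimal user-space sketch of
the caching idea described above, assuming a toy array model in place of the
rb-tree. Keep one cursor per range and start the top-down search from the
cursor that matches limit_pfn instead of always starting at the highest node.
All names below (toy_domain, toy_alloc, the pfn constants) are invented for
this sketch and are not the kernel API; the real logic is in
__get_cached_rbnode() and __cached_rbnode_insert_update() in the diff below.

/*
 * toy_iova.c - user-space toy model of the per-range cached-node idea.
 * Everything here (toy_domain, toy_alloc, the boundary values) is invented
 * for illustration; it is not the kernel code or API.
 */
#include <stdbool.h>
#include <stdio.h>

#define DMA_32BIT_PFN	0xfffffUL	/* pretend dma32/dma64 boundary */
#define TOTAL_PFNS	0x200000UL	/* pretend top of the iova space */

struct toy_domain {
	bool used[TOTAL_PFNS];		/* one "pfn" per slot, true = allocated */
	unsigned long cached32;		/* cursor for allocations <= DMA_32BIT_PFN */
	unsigned long cached64;		/* cursor for allocations above it */
};

/* Choose the search start by limit_pfn, as __get_cached_rbnode() now does. */
static unsigned long start_for(struct toy_domain *d, unsigned long limit_pfn)
{
	return (limit_pfn <= DMA_32BIT_PFN) ? d->cached32 : d->cached64;
}

/* Top-down search for one free pfn at or below limit_pfn. */
static long toy_alloc(struct toy_domain *d, unsigned long limit_pfn,
		      unsigned long *probes)
{
	unsigned long pfn = start_for(d, limit_pfn);

	if (pfn > limit_pfn)
		pfn = limit_pfn;
	for (;; pfn--) {
		(*probes)++;
		if (!d->used[pfn]) {
			d->used[pfn] = true;
			/* remember where this allocation landed, per range,
			 * roughly what __cached_rbnode_insert_update() does */
			if (pfn <= DMA_32BIT_PFN)
				d->cached32 = pfn;
			else
				d->cached64 = pfn;
			return (long)pfn;
		}
		if (pfn == 0)
			return -1;
	}
}

int main(void)
{
	static struct toy_domain d = {
		.cached32 = DMA_32BIT_PFN,
		.cached64 = TOTAL_PFNS - 1,
	};
	unsigned long probes = 0;
	int i;

	for (i = 0; i < 10000; i++)
		toy_alloc(&d, TOTAL_PFNS - 1, &probes);

	/*
	 * With the per-range cursor this prints roughly 2 * 10000 probes.
	 * Always restarting the search from the topmost slot would cost
	 * about 10000^2 / 2 probes, which is the behaviour the patch avoids.
	 */
	printf("10000 dma64 allocations took %lu probes\n", probes);
	return 0;
}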
 drivers/iommu/iova.c | 46 +++++++++++++++++++++++++++-------------------
 include/linux/iova.h |  5 +++--
 2 files changed, 30 insertions(+), 21 deletions(-)

-- 
2.5.0

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 1b8e136..711b10a 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -37,10 +37,15 @@ insert_iova_boundary(struct iova_domain *iovad)
 {
 	struct iova *iova;
 	unsigned long start_pfn_32bit = iovad->start_pfn;
+	unsigned long start_pfn_64bit = iovad->dma_32bit_pfn + 1;
 
 	iova = reserve_iova(iovad, start_pfn_32bit, start_pfn_32bit);
 	BUG_ON(!iova);
 	iovad->cached32_node = &iova->node;
+
+	iova = reserve_iova(iovad, start_pfn_64bit, start_pfn_64bit);
+	BUG_ON(!iova);
+	iovad->cached64_node = &iova->node;
 }
 
 void
@@ -62,8 +67,8 @@ init_iova_domain(struct iova_domain *iovad, unsigned long granule,
 	init_iova_rcaches(iovad);
 
 	/*
-	 * Insert boundary nodes for dma32. So cached32_node can not be NULL in
-	 * future.
+	 * Insert boundary nodes for dma32 and dma64. So cached32_node and
+	 * cached64_node can not be NULL in future.
 	 */
 	insert_iova_boundary(iovad);
 }
 
@@ -75,10 +80,10 @@ __get_cached_rbnode(struct iova_domain *iovad, unsigned long *limit_pfn)
 	struct rb_node *cached_node;
 	struct rb_node *next_node;
 
-	if (*limit_pfn > iovad->dma_32bit_pfn)
-		return rb_last(&iovad->rbroot);
-	else
+	if (*limit_pfn <= iovad->dma_32bit_pfn)
 		cached_node = iovad->cached32_node;
+	else
+		cached_node = iovad->cached64_node;
 
 	next_node = rb_next(cached_node);
 	if (next_node) {
@@ -94,29 +99,32 @@ static void
 __cached_rbnode_insert_update(struct iova_domain *iovad, struct iova *new)
 {
 	struct iova *cached_iova;
+	struct rb_node **cached_node;
 
-	if (new->pfn_hi > iovad->dma_32bit_pfn)
-		return;
+	if (new->pfn_hi <= iovad->dma_32bit_pfn)
+		cached_node = &iovad->cached32_node;
+	else
+		cached_node = &iovad->cached64_node;
 
-	cached_iova = rb_entry(iovad->cached32_node, struct iova, node);
+	cached_iova = rb_entry(*cached_node, struct iova, node);
 	if (new->pfn_lo <= cached_iova->pfn_lo)
-		iovad->cached32_node = rb_prev(&new->node);
+		*cached_node = rb_prev(&new->node);
 }
 
 static void
 __cached_rbnode_delete_update(struct iova_domain *iovad, struct iova *free)
 {
 	struct iova *cached_iova;
-	struct rb_node *curr;
+	struct rb_node **cached_node;
 
-	curr = iovad->cached32_node;
-	cached_iova = rb_entry(curr, struct iova, node);
+	if (free->pfn_hi <= iovad->dma_32bit_pfn)
+		cached_node = &iovad->cached32_node;
+	else
+		cached_node = &iovad->cached64_node;
 
-	if (free->pfn_lo >= cached_iova->pfn_lo) {
-		/* only cache if it's below 32bit pfn */
-		if (free->pfn_hi <= iovad->dma_32bit_pfn)
-			iovad->cached32_node = rb_prev(&free->node);
-	}
+	cached_iova = rb_entry(*cached_node, struct iova, node);
+	if (free->pfn_lo >= cached_iova->pfn_lo)
+		*cached_node = rb_prev(&free->node);
 }
 
 /* Insert the iova into domain rbtree by holding writer lock */
@@ -262,7 +270,7 @@ EXPORT_SYMBOL_GPL(iova_cache_put);
  * alloc_iova - allocates an iova
  * @iovad: - iova domain in question
  * @size: - size of page frames to allocate
- * @limit_pfn: - max limit address
+ * @limit_pfn: - max limit address(included)
  * @size_aligned: - set if size_aligned address range is required
  * This function allocates an iova in the range iovad->start_pfn to limit_pfn,
  * searching top-down from limit_pfn to iovad->start_pfn. If the size_aligned
@@ -381,7 +389,7 @@ EXPORT_SYMBOL_GPL(free_iova);
  * alloc_iova_fast - allocates an iova from rcache
  * @iovad: - iova domain in question
  * @size: - size of page frames to allocate
- * @limit_pfn: - max limit address
+ * @limit_pfn: - max limit address(included)
  * This function tries to satisfy an iova allocation from the rcache,
  * and falls back to regular allocation on failure.
  */
diff --git a/include/linux/iova.h b/include/linux/iova.h
index e0a892a..2d34112 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -40,10 +40,11 @@ struct iova_rcache {
 
 struct iova_domain {
 	spinlock_t	iova_rbtree_lock; /* Lock to protect update of rbtree */
 	struct rb_root	rbroot;		/* iova domain rbtree root */
-	struct rb_node	*cached32_node; /* Save last alloced node */
+	struct rb_node	*cached32_node; /* Save last alloced node, 32bits */
+	struct rb_node	*cached64_node; /* Save last alloced node, 64bits */
 	unsigned long	granule;	/* pfn granularity for this domain */
 	unsigned long	start_pfn;	/* Lower limit for this domain */
-	unsigned long	dma_32bit_pfn;
+	unsigned long	dma_32bit_pfn;	/* max dma32 limit address(included) */
 	struct iova_rcache rcaches[IOVA_RANGE_CACHE_MAX_SIZE];	/* IOVA range caches */
 };