From patchwork Thu Apr 10 16:48:57 2014
X-Patchwork-Submitter: Wei Huang
X-Patchwork-Id: 28232
From: Wei Huang <w1.huang@samsung.com>
To: xen-devel@lists.xen.org
Date: Thu, 10 Apr 2014 16:48:57 +0000
Message-id: <1397148539-19084-5-git-send-email-w1.huang@samsung.com>
X-Mailer: git-send-email 1.8.3.2
In-reply-to: <1397148539-19084-1-git-send-email-w1.huang@samsung.com>
References: <1397148539-19084-1-git-send-email-w1.huang@samsung.com>
Cc: w1.huang@samsung.com, ian.campbell@citrix.com,
    stefano.stabellini@eu.citrix.com, julien.grall@linaro.org,
    jaeyong.yoo@samsung.com, yjhyun.yoo@samsung.com
Subject: [Xen-devel] [PATCH 4/6] xen/arm: Implement VLPT for guest p2m mapping in live migration

From: Jaeyong Yoo <jaeyong.yoo@samsung.com>

This patch implements VLPT (virtual-linear page table) for fast access to
the 3rd-level PTEs of the guest P2M. For more information about VLPT, see
http://www.technovelty.org/linux/virtual-linear-page-table.html.

When creating a mapping for the VLPT, we simply copy the 1st-level PTEs of
the guest p2m into Xen's 2nd-level PTEs. The mapping then becomes:

    Xen's 1st-level PTE
      --> Xen's 2nd-level PTE (identical to the guest p2m's 1st-level PTE)
        --> guest p2m's 2nd-level PTE
          --> guest p2m's 3rd-level PTE (the memory the VLPT points to)

This is used in dirty-page tracing: when a domU write fault is trapped by
Xen, Xen can immediately locate the 3rd-level PTE of the guest p2m.
The following link shows a performance comparison between the VLPT and
typical page-table walking when handling a dirty page:
http://lists.xen.org/archives/html/xen-devel/2013-08/msg01503.html

Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
---
 xen/arch/arm/domain.c            |   5 ++
 xen/arch/arm/mm.c                | 116 +++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/arm32/page.h |  23 ++++----
 xen/include/asm-arm/config.h     |   9 +++
 xen/include/asm-arm/domain.h     |   7 +++
 xen/include/asm-arm/mm.h         |  16 ++++++
 6 files changed, 166 insertions(+), 10 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index b125857..3f04a77 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -502,6 +502,11 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags)
     /* Default the virtual ID to match the physical */
     d->arch.vpidr = boot_cpu_data.midr.bits;
 
+    d->arch.dirty.second_lvl_start = 0;
+    d->arch.dirty.second_lvl_end = 0;
+    d->arch.dirty.second_lvl[0] = NULL;
+    d->arch.dirty.second_lvl[1] = NULL;
+
     clear_page(d->shared_info);
     share_xen_page_with_guest(
         virt_to_page(d->shared_info), d, XENSHARE_writable);
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 14b4686..df9d428 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1252,6 +1252,122 @@ void get_gma_start_end(struct domain *d, paddr_t *start, paddr_t *end)
     *end = GUEST_RAM_BASE + ((paddr_t) d->max_pages << PAGE_SHIFT);
 }
 
+/* Flush the VLPT area */
+void flush_vlpt(struct domain *d)
+{
+    int flush_size;
+    flush_size = (d->arch.dirty.second_lvl_end -
+                  d->arch.dirty.second_lvl_start) << SECOND_SHIFT;
+
+    /* flush the 3rd-level mapping */
+    flush_xen_data_tlb_range_va(d->arch.dirty.second_lvl_start << SECOND_SHIFT,
+                                flush_size);
+}
+
+/* Restore the Xen page table for the VLPT mapping of domain d */
+void restore_vlpt(struct domain *d)
+{
+    int i;
+
+    dsb(sy);
+
+    for ( i = d->arch.dirty.second_lvl_start;
+          i < d->arch.dirty.second_lvl_end;
+          ++i )
+    {
+        int k = i % LPAE_ENTRIES;
+        int l = i / LPAE_ENTRIES;
+
+        if ( xen_second[i].bits != d->arch.dirty.second_lvl[l][k].bits )
+        {
+            write_pte(&xen_second[i], d->arch.dirty.second_lvl[l][k]);
+            flush_xen_data_tlb_range_va(i << SECOND_SHIFT, 1 << SECOND_SHIFT);
+        }
+    }
+
+    dsb(sy);
+    isb();
+}
+
+/* Set up the Xen page table for the VLPT mapping of domain d */
+int prepare_vlpt(struct domain *d)
+{
+    int xen_second_linear_base;
+    int gp2m_start_index, gp2m_end_index;
+    struct p2m_domain *p2m = &d->arch.p2m;
+    struct page_info *second_lvl_page;
+    paddr_t gma_start = 0;
+    paddr_t gma_end = 0;
+    lpae_t *first[2];
+    int i;
+    uint64_t required, avail = VIRT_LIN_P2M_END - VIRT_LIN_P2M_START;
+
+    get_gma_start_end(d, &gma_start, &gma_end);
+    required = (gma_end - gma_start) >> LPAE_SHIFT;
+
+    if ( required > avail )
+    {
+        dprintk(XENLOG_ERR, "Available VLPT is too small for domU guest"
+                "(avail: %llx, required: %llx)\n", (unsigned long long)avail,
+                (unsigned long long)required);
+        return -ENOMEM;
+    }
+
+    xen_second_linear_base = second_linear_offset(VIRT_LIN_P2M_START);
+
+    gp2m_start_index = gma_start >> FIRST_SHIFT;
+    gp2m_end_index = (gma_end >> FIRST_SHIFT) + 1;
+
+    if ( xen_second_linear_base + gp2m_end_index >= LPAE_ENTRIES * 2 )
+    {
+        dprintk(XENLOG_ERR, "xen second page is too small for the VLPT for domU");
+        return -ENOMEM;
+    }
+
+    second_lvl_page = alloc_domheap_pages(NULL, 1, 0);
+    if ( second_lvl_page == NULL )
+        return -ENOMEM;
+
+    /* First-level p2m is 2 consecutive pages */
+    d->arch.dirty.second_lvl[0] = map_domain_page_global(
+        page_to_mfn(second_lvl_page) );
+    d->arch.dirty.second_lvl[1] = map_domain_page_global(
+        page_to_mfn(second_lvl_page+1) );
+
+    first[0] = __map_domain_page(p2m->first_level);
+    first[1] = __map_domain_page(p2m->first_level+1);
+
+    for ( i = gp2m_start_index; i < gp2m_end_index; ++i )
+    {
+        int k = i % LPAE_ENTRIES;
+        int l = i / LPAE_ENTRIES;
+        int k2 = (xen_second_linear_base + i) % LPAE_ENTRIES;
+        int l2 = (xen_second_linear_base + i) / LPAE_ENTRIES;
+
+        write_pte(&xen_second[xen_second_linear_base+i], first[l][k]);
+
+        /* We copy the mapping into the domain's structure as a reference,
+         * for use across context switches (see restore_vlpt). */
+        d->arch.dirty.second_lvl[l2][k2] = first[l][k];
+    }
+    unmap_domain_page(first[0]);
+    unmap_domain_page(first[1]);
+
+    /* Store the start and end index */
+    d->arch.dirty.second_lvl_start = xen_second_linear_base + gp2m_start_index;
+    d->arch.dirty.second_lvl_end = xen_second_linear_base + gp2m_end_index;
+
+    flush_vlpt(d);
+
+    return 0;
+}
+
+void cleanup_vlpt(struct domain *d)
+{
+    /* First-level p2m is 2 consecutive pages */
+    unmap_domain_page_global(d->arch.dirty.second_lvl[0]);
+    unmap_domain_page_global(d->arch.dirty.second_lvl[1]);
+}
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-arm/arm32/page.h b/xen/include/asm-arm/arm32/page.h
index 4abb281..feca06c 100644
--- a/xen/include/asm-arm/arm32/page.h
+++ b/xen/include/asm-arm/arm32/page.h
@@ -3,22 +3,25 @@
 
 #ifndef __ASSEMBLY__
 
-/* Write a pagetable entry.
- *
- * If the table entry is changing a text mapping, it is responsibility
- * of the caller to issue an ISB after write_pte.
- */
-static inline void write_pte(lpae_t *p, lpae_t pte)
+/* Write a pagetable entry. All necessary barriers are the responsibility
+ * of the caller. */
+static inline void __write_pte(lpae_t *p, lpae_t pte)
 {
     asm volatile (
-        /* Ensure any writes have completed with the old mappings. */
-        "dsb;"
-        /* Safely write the entry (STRD is atomic on CPUs that support LPAE) */
+        /* Safely write the entry (STRD is atomic on CPUs that support LPAE) */
         "strd %0, %H0, [%1];"
-        "dsb;"
         : : "r" (pte.bits), "r" (p) : "memory");
 }
 
+/* Write a pagetable entry surrounded by dsb barriers. If the entry changes
+ * a text mapping, issuing an ISB afterwards remains the caller's
+ * responsibility. */
+static inline void write_pte(lpae_t *p, lpae_t pte)
+{
+    dsb();
+    __write_pte(p, pte);
+    dsb();
+}
+
 /* Inline ASM to flush dcache on register R (may be an inline asm operand) */
 #define __clean_xen_dcache_one(R) STORE_CP32(R, DCCMVAC)
 
diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
index 5b7b1a8..95e84bd 100644
--- a/xen/include/asm-arm/config.h
+++ b/xen/include/asm-arm/config.h
@@ -87,6 +87,7 @@
  *   0  -   8M
  *
  *  32M - 128M   Frametable: 24 bytes per page for 16GB of RAM
+ * 128M - 256M   Virtual-linear mapping to P2M table
  * 256M -   1G   VMAP: ioremap and early_ioremap use this virtual address
  *               space
  *
@@ -124,7 +125,9 @@
 #define CONFIG_SEPARATE_XENHEAP 1
 
 #define FRAMETABLE_VIRT_START  _AT(vaddr_t,0x02000000)
+#define VIRT_LIN_P2M_START     _AT(vaddr_t,0x08000000)
 #define VMAP_VIRT_START  _AT(vaddr_t,0x10000000)
+#define VIRT_LIN_P2M_END       VMAP_VIRT_START
 #define XENHEAP_VIRT_START     _AT(vaddr_t,0x40000000)
 #define XENHEAP_VIRT_END       _AT(vaddr_t,0x7fffffff)
 #define DOMHEAP_VIRT_START     _AT(vaddr_t,0x80000000)
@@ -157,6 +160,12 @@
 
 #define HYPERVISOR_VIRT_END    DIRECTMAP_VIRT_END
 
+/* Temporary definition for VIRT_LIN_P2M_START and VIRT_LIN_P2M_END
+ * TODO: Needs evaluation!
+ */
+#define VIRT_LIN_P2M_START     _AT(vaddr_t, 0x08000000)
+#define VIRT_LIN_P2M_END       VMAP_VIRT_START
+
 #endif
 
 /* Fixmap slots */
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 28c359a..5321bd6 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -161,6 +161,13 @@ struct arch_domain
         spinlock_t lock;
     } vuart;
 
+    /* dirty-page tracing */
+    struct {
+        volatile int second_lvl_start;   /* for context switch */
+        volatile int second_lvl_end;
+        lpae_t *second_lvl[2];           /* copy of guest p2m's first */
+    } dirty;
+
     unsigned int evtchn_irq;
 } __cacheline_aligned;
 
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 341493a..75c27fb 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -4,6 +4,7 @@
 #include
 #include
 #include
+#include
 #include
 
 /* Align Xen to a 2 MiB boundary. */
@@ -342,6 +343,21 @@ static inline void put_page_and_type(struct page_info *page)
 }
 
 void get_gma_start_end(struct domain *d, paddr_t *start, paddr_t *end);
+int prepare_vlpt(struct domain *d);
+void cleanup_vlpt(struct domain *d);
+void restore_vlpt(struct domain *d);
+
+/* Calculate Xen's virtual address for accessing the leaf PTE of
+ * a given guest physical address (GPA) */
+static inline lpae_t * get_vlpt_3lvl_pte(paddr_t addr)
+{
+    lpae_t *table = (lpae_t *)VIRT_LIN_P2M_START;
+
+    /* Since we slotted the guest's first-level p2m page table into Xen's
+     * second-level page table, one shift is enough to calculate the
+     * index of the guest p2m table entry. */
+    return &table[addr >> PAGE_SHIFT];
+}
 
 #endif /*  __ARCH_ARM_MM__ */
 /*