From patchwork Tue Apr 12 14:06:49 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tero Kristo
X-Patchwork-Id: 65628
Subject: Re: [RFC 0/1] ARM: mm: cache shareability tweak
To: Mark Rutland
References: <1460448880-5677-1-git-send-email-t-kristo@ti.com>
 <20160412132505.GG28057@leverpostej>
From: Tero Kristo
Message-ID: <570D00F9.4070406@ti.com>
Date: Tue, 12 Apr 2016 17:06:49 +0300
In-Reply-To: <20160412132505.GG28057@leverpostej>
Cc: nm@ti.com, santosh.shilimkar@oracle.com, linux@arm.linux.org.uk,
 "Karicheri, Muralidharan", linux-arm-kernel@lists.infradead.org

On 04/12/2016 04:25 PM, Mark Rutland wrote:
> On
> Tue, Apr 12, 2016 at 11:14:39AM +0300, Tero Kristo wrote:
>> Hi,
>>
>> This RFC patch attempts to implement support for specifying cache
>> shareability setting via kernel cmdline. This is required at least
>> for TI keystone2 generation SoCs, where DMA masters are snooping on
>> the cache maintenance messages to maintain coherency. Currently we
>> are carrying an internal hack that modifies the macros via #ifdefs,
>> this is obviously bad as the same kernel image can only work with
>> keystone2 (or at least might be causing problems with other SoCs.)
>
> The de-facto semantics (which we should codify) for dma-coherent with
> ARMv7 is that a device makes accesses which are coherent with Normal,
> Inner Shareable, Inner Write-Back, Outer Write-Back.
>
> In arch/arm/boot/dts/keystone.dtsi I see that /soc/usb@2680000 has a
> dma-coherent flag. Is that device coherent today with upstream? Or is
> that misleading currently?

Good question. Murali, can you comment on this? Which peripherals
actually require DMA coherency on K2?

>
> If the device isn't coherent with that, then dma-coherent isn't strictly
> true (and should go), and we need additional properties to correctly
> describe this case.
>
>> It would be very much preferred to replace this hardcoded
>> implementation with a runtime solution.
>>
>> Some obvious holes in this implementation:
>>
>> 1) during execution of arch/arm/kernel/head.S, the tweaked MMU shareability
>>    settings are not in place. However, I am not too sure how much that
>>    matters, as I am not sure what is mapped at this point. Kernel image
>>    mapping should not matter at least, as we typically should not be doing
>>    any DMA transfers from the kernel image.
>
> Strictly speaking, changing the shareability can result in a loss of
> coherency, even if all accesses are made by the same CPU. See
> "Mismatched memory attributes" in section A3.5.7 of the ARMv7-AR
> Reference Manual (ARM DDI 0406C.c).

Basically we are not attempting to change shareability on the fly, but
instead to configure a different shareability value that is then always
used.

>
> It's not just DMA that matters. I believe we may have page tables as
> part of the kernel image, for instance, and those need to be accessed
> with consistent attributes by the MMU when doing page table walks.
>
> You can avoid issues so long as you have appropriate cache maintenance,
> but that's both expensive (all memory previously mapped must be
> Clean+Invalidated by VA) and painful (as you can't reliably use any of
> said memory until after the maintenance).

The hack we have internally just maps all the DMA pages as outer
shareable. Seeing the original hack might help in understanding the
issue, so I have added it inline at the end for reference. We just
attempt to change the shareability value from 3 (the current value) to 2.

>
>> I would like some comments on this, if handling during head.S
>> should be fixed also, how can this be done? Some hack under
>> compressed/keystone-head.S?
>
> If you need to do this, you need consistent attributes from the outset,
> or you need to disable the MMU, perform cache maintenance, and re-enter
> the kernel.
>
>> 2) the cmdline parameter could be something more descriptive
>>
>> 3) The single RFC patch should probably be split up a bit
>
> 4) It isn't possible to use dma-coherent to describe this without
>    weakening the semantics so as to be meaningless in general. So if we
>    go for this approach we need a mechanism to accurately describe the
>    coherency guarantees of masters in the system beyond a boolean.
>
> Thanks,
> Mark.
>

diff --git a/arch/arm/include/asm/pgtable-3level-hwdef.h b/arch/arm/include/asm/pgtable-3level-hwdef.h
index f8f1cff..62adf21 100644
--- a/arch/arm/include/asm/pgtable-3level-hwdef.h
+++ b/arch/arm/include/asm/pgtable-3level-hwdef.h
@@ -44,7 +44,11 @@
 #define PMD_SECT_CACHEABLE	(_AT(pmdval_t, 1) << 3)
 #define PMD_SECT_USER		(_AT(pmdval_t, 1) << 6)		/* AP[1] */
 #define PMD_SECT_AP2		(_AT(pmdval_t, 1) << 7)		/* read only */
+#ifdef CONFIG_KEYSTONE2_DMA_COHERENT
+#define PMD_SECT_S		(_AT(pmdval_t, 2) << 8)
+#else
 #define PMD_SECT_S		(_AT(pmdval_t, 3) << 8)
+#endif
 #define PMD_SECT_AF		(_AT(pmdval_t, 1) << 10)
 #define PMD_SECT_nG		(_AT(pmdval_t, 1) << 11)
 #define PMD_SECT_PXN		(_AT(pmdval_t, 1) << 53)
@@ -73,7 +77,12 @@
 #define PTE_BUFFERABLE		(_AT(pteval_t, 1) << 2)		/* AttrIndx[0] */
 #define PTE_CACHEABLE		(_AT(pteval_t, 1) << 3)		/* AttrIndx[1] */
 #define PTE_AP2			(_AT(pteval_t, 1) << 7)		/* AP[2] */
+#ifdef CONFIG_KEYSTONE2_DMA_COHERENT
+/* SH[1:0], outer shareable */
+#define PTE_EXT_SHARED		(_AT(pteval_t, 2) << 8)
+#else
 #define PTE_EXT_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
+#endif
 #define PTE_EXT_AF		(_AT(pteval_t, 1) << 10)	/* Access Flag */
 #define PTE_EXT_NG		(_AT(pteval_t, 1) << 11)	/* nG */
 #define PTE_EXT_PXN		(_AT(pteval_t, 1) << 53)	/* PXN */
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index a745a2a..b4090b1 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -78,7 +78,12 @@
 #define L_PTE_VALID		(_AT(pteval_t, 1) << 0)		/* Valid */
 #define L_PTE_PRESENT		(_AT(pteval_t, 3) << 0)		/* Present */
 #define L_PTE_USER		(_AT(pteval_t, 1) << 6)		/* AP[1] */
+#ifdef CONFIG_KEYSTONE2_DMA_COHERENT
+/* SH[1:0], outer shareable */
+#define L_PTE_SHARED		(_AT(pteval_t, 2) << 8)
+#else
 #define L_PTE_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
+#endif
 #define L_PTE_YOUNG		(_AT(pteval_t, 1) << 10)	/* AF */
 #define L_PTE_XN		(_AT(pteval_t, 1) << 54)	/* XN */
 #define L_PTE_DIRTY		(_AT(pteval_t, 1) << 55)
diff --git a/arch/arm/mach-keystone/Kconfig b/arch/arm/mach-keystone/Kconfig
index ea955f6db..558385e 100644
--- a/arch/arm/mach-keystone/Kconfig
+++ b/arch/arm/mach-keystone/Kconfig
@@ -11,6 +11,10 @@ config ARCH_KEYSTONE
 	select ZONE_DMA if ARM_LPAE
 	select MIGHT_HAVE_PCI
 	select PCI_DOMAINS if PCI
+	select KEYSTONE2_DMA_COHERENT
 	help
 	  Support for boards based on the Texas Instruments
 	  Keystone family of SoCs.
+
+config KEYSTONE2_DMA_COHERENT
+	bool
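
For readers unfamiliar with the field being changed: the macros touched by
the diff above all set SH[1:0], which sits at bits [9:8] of an LPAE
long-descriptor entry. The following is only a rough userspace sketch of
that encoding, not kernel code. The names SH_SHIFT, SH_MASK, enum sh_attr,
desc_get_sh() and desc_set_sh() are made up for illustration and do not
exist in the kernel; plain uint64_t stands in for pteval_t/pmdval_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not kernel identifiers. */
#define SH_SHIFT 8                           /* SH[1:0] occupies bits [9:8] */
#define SH_MASK  ((uint64_t)3 << SH_SHIFT)

enum sh_attr {
	SH_NON_SHAREABLE   = 0,
	/* encoding 1 is reserved in the ARMv7 long-descriptor format */
	SH_OUTER_SHAREABLE = 2,              /* value selected by the hack */
	SH_INNER_SHAREABLE = 3,              /* value in the unpatched headers */
};

/* Extract the SH[1:0] field from a descriptor value. */
static inline enum sh_attr desc_get_sh(uint64_t desc)
{
	return (enum sh_attr)((desc & SH_MASK) >> SH_SHIFT);
}

/* Return a copy of desc with SH[1:0] replaced; other bits are untouched. */
static inline uint64_t desc_set_sh(uint64_t desc, enum sh_attr sh)
{
	assert(sh == SH_NON_SHAREABLE ||
	       sh == SH_OUTER_SHAREABLE ||
	       sh == SH_INNER_SHAREABLE);
	return (desc & ~SH_MASK) | ((uint64_t)sh << SH_SHIFT);
}
```

In these terms, the #ifdef hack amounts to building every shared mapping
with SH_OUTER_SHAREABLE instead of SH_INNER_SHAREABLE from the start. It
does not rewrite descriptors that are already live, which is the case the
"Mismatched memory attributes" concern quoted above applies to.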