diff mbox

[RFC,1/2] memremap: add arch specific hook for MEMREMAP_WB mappings

Message ID 1456149728-16706-2-git-send-email-ard.biesheuvel@linaro.org
State New
Headers show

Commit Message

Ard Biesheuvel Feb. 22, 2016, 2:02 p.m. UTC
Currently, the memremap code serves MEMREMAP_WB mappings directly from
the kernel direct mapping, unless the region is in high memory, in which
case it falls back to using ioremap_cache(). However, the semantics of
ioremap_cache() are not unambiguously defined, and on ARM, it will
actually result in a mapping type that differs from the attributes used
for the linear mapping, and for this reason, the ioremap_cache() call
fails if the region is part of the memory managed by the kernel.

So instead, implement an optional hook 'arch_memremap_wb' whose default
implementation calls ioremap_cache() as before, but which can be
overridden by the architecture to do what is appropriate for it.

Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

---
 kernel/memremap.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

-- 
2.5.0

Comments

Ard Biesheuvel Feb. 22, 2016, 7:17 p.m. UTC | #1
On 22 February 2016 at 20:05, Dan Williams <dan.j.williams@intel.com> wrote:
> On Mon, Feb 22, 2016 at 6:02 AM, Ard Biesheuvel

> <ard.biesheuvel@linaro.org> wrote:

>> Currently, the memremap code serves MEMREMAP_WB mappings directly from

>> the kernel direct mapping, unless the region is in high memory, in which

>> case it falls back to using ioremap_cache(). However, the semantics of

>> ioremap_cache() are not unambiguously defined, and on ARM, it will

>> actually result in a mapping type that differs from the attributes used

>> for the linear mapping, and for this reason, the ioremap_cache() call

>> fails if the region is part of the memory managed by the kernel.

>>

>> So instead, implement an optional hook 'arch_memremap_wb' whose default

>> implementation calls ioremap_cache() as before, but which can be

>> overridden by the architecture to do what is appropriate for it.

>>

>

> Acked-by: Dan Williams <dan.j.williams@intel.com>

>

> I still have patches pending to delete ioremap_cache() from ARM and

> require memremap() to be used for cacheable mappings.  Do you see any

> use for ioremap_cache() on ARM after this change?


I am not exactly sure why ioremap_cache() does not use MT_MEMORY_RW
attributes, but the ARM architecture simply does not allow mismatched
attributes, so we cannot simply replace each instance of
ioremap_cache() with memremap()

Perhaps Russell can explain?
Ard Biesheuvel Feb. 22, 2016, 8:35 p.m. UTC | #2
On 22 February 2016 at 21:02, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Feb 22, 2016 at 08:17:11PM +0100, Ard Biesheuvel wrote:

>> I am not exactly sure why ioremap_cache() does not use MT_MEMORY_RW

>> attributes, but the ARM architecture simply does not allow mismatched

>> attributes, so we cannot simply replace each instance of

>> ioremap_cache() with memremap()

>>

>> Perhaps Russell can explain?

>

> ARM has had ioremap_cached() for a while - it was introduced in the

> bitkeeper times of 2.6, so pre-git.  In those kernels, and into the

> git era, pre-dating ARMv6 support, it was merely:

>

> +#define ioremap_cached(cookie,size)    __arch_ioremap((cookie),(size),L_PTE_CACHEABLE)

>

> which means that we got write-through cache behaviour for mappings

> created by this, where supported, or if not, read-allocate writeback.

> This was completely independent of the system memory mapping

> attributes, which could be specified on the kernel command line.

>

> This was originally used by pxa2xx-flash to provide faster flash

> access on those systems - in other words, it's created to remap

> devices with cacheable attributes.

>

> When creating ARMv6 support, I ended up completely rewriting how

> the memory attributes were handled, and so it then became this:

>

> +#define ioremap_cached(cookie,size)    __arch_ioremap((cookie), (size), MT_DEVICE_CACHED)

>

> which gives very similar behaviour, though we now default to RAWB

> mappings, which fall back to WT on CPUs that don't support RAWB.

> Again, independent of the system memory mapping.

>

> Then, in 2013, with the advent of Xen, ioremap_cached() became

> ioremap_cache() so that Xen would build on ARM, and to align it

> with other architectures.  Whether ioremap_cached() actually was

> suitable to become ioremap_cache(), I'm not sure, but that's

> what happened.

>

> Since it was just renamed, it preserves the original goal which is

> to remap device memory with cacheable attributes, which may differ

> from the cacheable attributes of the system RAM.  It has never

> been intended for remapping system memory: none of the ioremap_*

> family of functions on ARM were ever intended for that purpose.

>

> However, some people did use it for that purpose on ARMv5 and

> earlier architectures, where, due to the virtual cache architecture,

> you could get away with remapping the same memory with differing

> attributes without any problem.  With the advent of ARMv6

> (pre-dating 2013), this was clearly stated as being illegal at

> architecture level, but people were married to the idea - despite

> me telling them not to.

>

> So eventually, I had no other option than to add a code check to

> ioremap*() which prevents any ioremap*() function from being used

> on system memory - iow, memory that Linux maps itself either as

> part of lowmem or via the kmap*() API - since an ioremap*()

> mapping would conflict with those.

>

> That's basically where we are today: ioremap*() does not permit

> system memory to be remapped, even ioremap_cache().

>


OK, thanks for the historical context.

So what is your opinion on this series, i.e., to wire up memremap() to
remap arbitrary memory regions into the vmalloc area with MT_MEMORY_RW
attributes, and at the same time lift the restriction that the region
must be disjoint from memory covered by lowmem or kmap?

It would make my life a lot easier, since we can more easily share
code between x86, arm64 and ARM to permanently map memory regions that
have been populated by the firmware. As I noted in the commit log,
memremap() already does the right thing wrt lowmem, i.e., it returns
the existing mapping rather than creating a new one. For highmem, I
don't think kmap() is the way to go considering the unknown size and
the potentially permanent nature of the mappings (which resemble
ioremap more than they resemble kmap)

-- 
Ard.
Ard Biesheuvel Feb. 23, 2016, 12:03 p.m. UTC | #3
On 23 February 2016 at 12:58, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Feb 22, 2016 at 09:35:24PM +0100, Ard Biesheuvel wrote:

>> OK, thanks for the historical context.

>>

>> So what is your opinion on this series, i.e., to wire up memremap() to

>> remap arbitrary memory regions into the vmalloc area with MT_MEMORY_RW

>> attributes, and at the same time lift the restriction that the region

>> must be disjoint from memory covered by lowmem or kmap?

>

> The historical context is still present, because pxa2xx-flash has

> been converted to use memremap() from ioremap_cache() - possibly

> inappropriately.

>

> I've already described the semantics of ioremap_cache(), which are

> to always create a cacheable mapping irrespective of the system

> memory mapping type.  However, memremap() says that MEMREMAP_WB

> matches system RAM, which on ARM it doesn't right now.

>


Indeed. Hence this series, to decouple memremap(MEMREMAP_WB) from
ioremap_cache() for ARM

> Changing it to MT_MEMORY_RW would satisfy that comment against

> memremap(), but at the same time changes what happens with

> pxa2xx-flash - the memory region (which is not system RAM) then

> changes with the cache status of system RAM.

>

> So, I'm not that happy about the memremap() stuff right now, and

> I don't like the idea of making memremap() conform to its stated

> requirements without first preventing pxa2xx-flash being affected

> by such a change.

>


Actually, my change fixes this issue, since it will cause memremap()
to always create MT_MEMORY_RW mappings, and not fallback to
ioremap_cache() for ranges that are not covered by lowmem.

> Perhaps we need to reinstate the original ioremap_cached() API for

> pxa2xx-flash, and then switch memremap() to MT_MEMORY_RW - that

> would seem to result in the expected behaviour by all parties.

>


I think we can simply revert the change to pxa2xx-flash if it is
deemed inappropriate.
Ard Biesheuvel Feb. 23, 2016, 12:26 p.m. UTC | #4
On 23 February 2016 at 13:03, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> On 23 February 2016 at 12:58, Russell King - ARM Linux

> <linux@arm.linux.org.uk> wrote:

>> On Mon, Feb 22, 2016 at 09:35:24PM +0100, Ard Biesheuvel wrote:

>>> OK, thanks for the historical context.

>>>

>>> So what is your opinion on this series, i.e., to wire up memremap() to

>>> remap arbitrary memory regions into the vmalloc area with MT_MEMORY_RW

>>> attributes, and at the same time lift the restriction that the region

>>> must be disjoint from memory covered by lowmem or kmap?

>>

>> The historical context is still present, because pxa2xx-flash has

>> been converted to use memremap() from ioremap_cache() - possibly

>> inappropriately.

>>

>> I've already described the semantics of ioremap_cache(), which are

>> to always create a cacheable mapping irrespective of the system

>> memory mapping type.  However, memremap() says that MEMREMAP_WB

>> matches system RAM, which on ARM it doesn't right now.

>>

>

> Indeed. Hence this series, to decouple memremap(MEMREMAP_WB) from

> ioremap_cache() for ARM

>

>> Changing it to MT_MEMORY_RW would satisfy that comment against

>> memremap(), but at the same time changes what happens with

>> pxa2xx-flash - the memory region (which is not system RAM) then

>> changes with the cache status of system RAM.

>>

>> So, I'm not that happy about the memremap() stuff right now, and

>> I don't like the idea of making memremap() conform to its stated

>> requirements without first preventing pxa2xx-flash being affected

>> by such a change.

>>

>

> Actually, my change fixes this issue, since it will cause memremap()

> to always create MT_MEMORY_RW mappings, and not fallback to

> ioremap_cache() for ranges that are not covered by lowmem.

>

>> Perhaps we need to reinstate the original ioremap_cached() API for

>> pxa2xx-flash, and then switch memremap() to MT_MEMORY_RW - that

>> would seem to result in the expected behaviour by all parties.

>>

>

> I think we can simply revert the change to pxa2xx-flash if it is

> deemed inappropriate.


OK, I see what you mean. I find it unfortunate that ioremap_cache()
instances are blindly being replaced with memremap(), and I wonder if
this wasted test by and/or cc'ed to people who can actually test this
driver. Dan?

Anyway, I don't think it makes sense to stipulate at the generic level
that ioremap_cache() and memremap(MEMREMAP_WB) shall be the same, and
deprecating it is a bit premature since the cross-architecturally
loosely defined semantics of ioremap_cache() can never be replaced 1:1
with what memremap() promises.

So what I suggest is that I revert the change to pxa2xx-flash as a new
1/3 in this series, and put these existing two on top to decouple
memremap(MEMREMAP_WB) from ioremap_cache() entirely.

Thanks,
Ard.
Ard Biesheuvel Feb. 25, 2016, 7:49 a.m. UTC | #5
On 23 February 2016 at 23:23, Robert Jarzmik <robert.jarzmik@free.fr> wrote:
> Dan Williams <dan.j.williams@intel.com> writes:

>

>> On Tue, Feb 23, 2016 at 4:26 AM, Ard Biesheuvel

>> <ard.biesheuvel@linaro.org> wrote:

>>> On 23 February 2016 at 13:03, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:

>>>> On 23 February 2016 at 12:58, Russell King - ARM Linux

>>>> <linux@arm.linux.org.uk> wrote:

>>>>> On Mon, Feb 22, 2016 at 09:35:24PM +0100, Ard Biesheuvel wrote:

>>> OK, I see what you mean. I find it unfortunate that ioremap_cache()

>>> instances are blindly being replaced with memremap(), and I wonder if

>>> this wasted test by and/or cc'ed to people who can actually test this

>>> driver. Dan?

>

> Actually I have the hardware to test it.

>

> And I also know what is behind :

>  - it's a CFI NOR based memory

>  - these are Intel StrataFlash 28F128J3A chips

>  - as a CFI memory it is mapped on the system bus

>  - from a read perspective, it behaves like a normal memory

>  - but once the first write reaches the CFI, everything changes (the address

>    space layout doesn't have the same meaning, be that becoming a status code or

>    something else).

>    In these conditions reordering of writes versus reads, merging reads after

>    a write or coalescing writes is a recipe for disaster.

>

> All of this to say I can make a small discrete number of tests (less than 10

> write or erase ones to preserve the precious NOR).

>


Thanks Robert.

But to be honest, I think we should simply revert the change, after
which we can wire up memremap() for ARM properly. And while I agree
that ioremap_cache() is often abused for mapping things like ACPI
tables in RAM (which forces you to cast away the __iomem annotation),
using ioremap_cache() to map NOR flash is totally different IMO, even
if it has memory semantics while in array mode.
diff mbox

Patch

diff --git a/kernel/memremap.c b/kernel/memremap.c
index 7a1b5c3ef14e..77c41648ba16 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -27,6 +27,13 @@  __weak void __iomem *ioremap_cache(resource_size_t offset, unsigned long size)
 }
 #endif
 
+#ifndef arch_memremap_wb
+static void *arch_memremap_wb(resource_size_t offset, unsigned long size)
+{
+	return (__force void *)ioremap_cache(offset, size);
+}
+#endif
+
 static void *try_ram_remap(resource_size_t offset, size_t size)
 {
 	struct page *page = pfn_to_page(offset >> PAGE_SHIFT);
@@ -34,7 +41,7 @@  static void *try_ram_remap(resource_size_t offset, size_t size)
 	/* In the simple case just return the existing linear address */
 	if (!PageHighMem(page))
 		return __va(offset);
-	return NULL; /* fallback to ioremap_cache */
+	return arch_memremap_wb(offset, size);
 }
 
 /**
@@ -80,8 +87,6 @@  void *memremap(resource_size_t offset, size_t size, unsigned long flags)
 		 */
 		if (is_ram == REGION_INTERSECTS)
 			addr = try_ram_remap(offset, size);
-		if (!addr)
-			addr = ioremap_cache(offset, size);
 	}
 
 	/*