diff mbox

[1/1] arm64: fix flush_cache_range

Message ID 1464088597-8820-1-git-send-email-thunder.leizhen@huawei.com
State New
Headers show

Commit Message

Zhen Lei May 24, 2016, 11:16 a.m. UTC
When we ran mprotect04(a test case in LTP) infinitely, it would always
failed after a few seconds. The case can be described briefly that: copy
a empty function from code area into a new memory area(created by mmap),
then call mprotect to change the protection to PROT_EXEC. The syscall
sys_mprotect will finally invoke flush_cache_range, but this function
currently only invalid icache, the operation of flush dcache is missed.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>

---
 arch/arm64/mm/flush.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--
2.5.0

Comments

Mark Rutland May 24, 2016, 11:37 a.m. UTC | #1
On Tue, May 24, 2016 at 07:16:37PM +0800, Zhen Lei wrote:
> When we ran mprotect04(a test case in LTP) infinitely, it would always

> failed after a few seconds. The case can be described briefly that: copy

> a empty function from code area into a new memory area(created by mmap),

> then call mprotect to change the protection to PROT_EXEC. The syscall

> sys_mprotect will finally invoke flush_cache_range, but this function

> currently only invalid icache, the operation of flush dcache is missed.


In the LTP code I see powerpc-specific D-cache / I-cache synchronisation
(i.e. d-cache cleaning followed by I-cache invalidation), so there
appears to be some expectation of userspace maintenance. Hoever, there
is no such ARM-specific I-cache maintenance.

It looks like the test may be missing I-cache maintenance regardless of
the semantics of mprotect in this case.

I have not yet devled into flush_cache_range and how it is called.

Thanks,
Mark.

> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>

> ---

>  arch/arm64/mm/flush.c | 2 +-

>  1 file changed, 1 insertion(+), 1 deletion(-)

> 

> diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c

> index dbd12ea..eda4124 100644

> --- a/arch/arm64/mm/flush.c

> +++ b/arch/arm64/mm/flush.c

> @@ -31,7 +31,7 @@ void flush_cache_range(struct vm_area_struct *vma, unsigned long start,

>  		       unsigned long end)

>  {

>  	if (vma->vm_flags & VM_EXEC)

> -		__flush_icache_all();

> +		flush_icache_range(start, end);

>  }

> 

>  static void sync_icache_aliases(void *kaddr, unsigned long len)

> --

> 2.5.0

> 

> 

> 

> _______________________________________________

> linux-arm-kernel mailing list

> linux-arm-kernel@lists.infradead.org

> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

>
Zhen Lei May 24, 2016, 12:19 p.m. UTC | #2
On 2016/5/24 19:37, Mark Rutland wrote:
> On Tue, May 24, 2016 at 07:16:37PM +0800, Zhen Lei wrote:

>> When we ran mprotect04(a test case in LTP) infinitely, it would always

>> failed after a few seconds. The case can be described briefly that: copy

>> a empty function from code area into a new memory area(created by mmap),

>> then call mprotect to change the protection to PROT_EXEC. The syscall

>> sys_mprotect will finally invoke flush_cache_range, but this function

>> currently only invalid icache, the operation of flush dcache is missed.

> 

> In the LTP code I see powerpc-specific D-cache / I-cache synchronisation

> (i.e. d-cache cleaning followed by I-cache invalidation), so there

> appears to be some expectation of userspace maintenance. Hoever, there

> is no such ARM-specific I-cache maintenance.

But I see some other platforms have D-cache maintenance, like: arch/nios2/mm/cacheflush.c
And according to the name of flush_cache_range, it should do this, I judged. Otherwise,
mprotect04 will be failed on more platforms, it's easy to discover. Only PPC have specific
cache synchronization, maybe it meets some hardware limitation. It's impossible a programmer
fixed a common bug on only one platform but leave others unchanged.

> 

> It looks like the test may be missing I-cache maintenance regardless of

> the semantics of mprotect in this case.

> 

> I have not yet devled into flush_cache_range and how it is called.


SYSCALL_DEFINE3(mprotect ---> mprotect_fixup ---> change_protection ---> change_protection_range --> flush_cache_range

> 

> Thanks,

> Mark.

> 

>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>

>> ---

>>  arch/arm64/mm/flush.c | 2 +-

>>  1 file changed, 1 insertion(+), 1 deletion(-)

>>

>> diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c

>> index dbd12ea..eda4124 100644

>> --- a/arch/arm64/mm/flush.c

>> +++ b/arch/arm64/mm/flush.c

>> @@ -31,7 +31,7 @@ void flush_cache_range(struct vm_area_struct *vma, unsigned long start,

>>  		       unsigned long end)

>>  {

>>  	if (vma->vm_flags & VM_EXEC)

>> -		__flush_icache_all();

>> +		flush_icache_range(start, end);

>>  }

>>

>>  static void sync_icache_aliases(void *kaddr, unsigned long len)

>> --

>> 2.5.0

>>

>>

>>

>> _______________________________________________

>> linux-arm-kernel mailing list

>> linux-arm-kernel@lists.infradead.org

>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

>>

> 

> .

>
Catalin Marinas May 24, 2016, 1:02 p.m. UTC | #3
On Tue, May 24, 2016 at 08:19:05PM +0800, Leizhen (ThunderTown) wrote:
> On 2016/5/24 19:37, Mark Rutland wrote:

> > On Tue, May 24, 2016 at 07:16:37PM +0800, Zhen Lei wrote:

> >> When we ran mprotect04(a test case in LTP) infinitely, it would always

> >> failed after a few seconds. The case can be described briefly that: copy

> >> a empty function from code area into a new memory area(created by mmap),

> >> then call mprotect to change the protection to PROT_EXEC. The syscall

> >> sys_mprotect will finally invoke flush_cache_range, but this function

> >> currently only invalid icache, the operation of flush dcache is missed.

> > 

> > In the LTP code I see powerpc-specific D-cache / I-cache synchronisation

> > (i.e. d-cache cleaning followed by I-cache invalidation), so there

> > appears to be some expectation of userspace maintenance. Hoever, there

> > is no such ARM-specific I-cache maintenance.

> 

> But I see some other platforms have D-cache maintenance, like: arch/nios2/mm/cacheflush.c

> And according to the name of flush_cache_range, it should do this, I judged. Otherwise,

> mprotect04 will be failed on more platforms, it's easy to discover. Only PPC have specific

> cache synchronization, maybe it meets some hardware limitation. It's impossible a programmer

> fixed a common bug on only one platform but leave others unchanged.


flush_cache_range() is primarily used on VIVT caches before changing the
mapping and should not really be implemented on arm64. I don't recall
why we still have the I-cache invalidation, possibly for the ASID-tagged
VIVT I-cache case, though we should have a specific check for this.

There are some other cases where flush_cache_range() is called and no
D-cache maintenance is necessary on arm64, so I don't want to penalise
them by implementing flush_cache_range().

> > It looks like the test may be missing I-cache maintenance regardless of

> > the semantics of mprotect in this case.

> > 

> > I have not yet devled into flush_cache_range and how it is called.

> 

> SYSCALL_DEFINE3(mprotect ---> mprotect_fixup ---> change_protection ---> change_protection_range --> flush_cache_range


The change_protection() shouldn't need to flush the caches in
flush_cache_range(). The change_pte_range() function eventually ends up
calling set_pte_at() which calls __sync_icache_dcache() if the mapping
is executable.

Can you be more specific about the kernel version you are using, its
configuration?

-- 
Catalin
Mark Rutland May 24, 2016, 3:12 p.m. UTC | #4
On Tue, May 24, 2016 at 08:19:05PM +0800, Leizhen (ThunderTown) wrote:
> 

> 

> On 2016/5/24 19:37, Mark Rutland wrote:

> > On Tue, May 24, 2016 at 07:16:37PM +0800, Zhen Lei wrote:

> >> When we ran mprotect04(a test case in LTP) infinitely, it would always

> >> failed after a few seconds. The case can be described briefly that: copy

> >> a empty function from code area into a new memory area(created by mmap),

> >> then call mprotect to change the protection to PROT_EXEC. The syscall

> >> sys_mprotect will finally invoke flush_cache_range, but this function

> >> currently only invalid icache, the operation of flush dcache is missed.

> > 

> > In the LTP code I see powerpc-specific D-cache / I-cache synchronisation

> > (i.e. d-cache cleaning followed by I-cache invalidation), so there

> > appears to be some expectation of userspace maintenance. Hoever, there

> > is no such ARM-specific I-cache maintenance.

> But I see some other platforms have D-cache maintenance, like: arch/nios2/mm/cacheflush.c

> And according to the name of flush_cache_range, it should do this, I judged. Otherwise,

> mprotect04 will be failed on more platforms, it's easy to discover.


From my experience with the LTP, it's likely that the test was written
and tested on very few architectures and kernel configurations, and has
seen very little testing on others.

> Only PPC have specific cache synchronization, maybe it meets some

> hardware limitation. 


Per [1] that "hardware limitation" is  simply the presence of a Harvard
cache architecture, similar to that of ARM.

> It's impossible a programmer fixed a common bug on only one platform

> but leave others unchanged.


The common problem here is I-cache coherency, but that must be managed
in an architecture-specific way. The patch for PPC [1] added
PPC-specific assembly to achieve that, and hence doesn't apply to other
platforms.

Jumping back to the problem at hand:

It looks like we inherited this from ARM, which has done this for all
executable mappings since commit  6060e8df517847bf ("ARM: I-cache: flush
executable mappings in flush_cache_range()"). The commit message refers
to a4db94d, which doesn't seem to exist, and I assume is
115b22474eb1905d ("ARM: 5794/1: Flush the D-cache during
copy_user_highpage()") which happened to get rebased at some point.

That all implies that the icache maintenance is only intended to ensure
that CoW does not result in unexpected incoherency. Which in turn
implies that flipping permissions with mprotect is not expected to
synchronise the D and I caches.

Russell, does that analysis make sense to you?

Ben, Paul, Michael, I take it that the LTP commit for PPC [1] is as
expected? i.e. you similarly do not expect flipping execute permission
with mprotect to imply that the kernel must synchronise the D & I
caches?

Thanks,
Mark.

> > It looks like the test may be missing I-cache maintenance regardless of

> > the semantics of mprotect in this case.

> > 

> > I have not yet devled into flush_cache_range and how it is called.

> 

> SYSCALL_DEFINE3(mprotect ---> mprotect_fixup ---> change_protection ---> change_protection_range --> flush_cache_range

> 

> > 

> > Thanks,

> > Mark.

> > 

> >> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>

> >> ---

> >>  arch/arm64/mm/flush.c | 2 +-

> >>  1 file changed, 1 insertion(+), 1 deletion(-)

> >>

> >> diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c

> >> index dbd12ea..eda4124 100644

> >> --- a/arch/arm64/mm/flush.c

> >> +++ b/arch/arm64/mm/flush.c

> >> @@ -31,7 +31,7 @@ void flush_cache_range(struct vm_area_struct *vma, unsigned long start,

> >>  		       unsigned long end)

> >>  {

> >>  	if (vma->vm_flags & VM_EXEC)

> >> -		__flush_icache_all();

> >> +		flush_icache_range(start, end);

> >>  }

> >>

> >>  static void sync_icache_aliases(void *kaddr, unsigned long len)

> >> --

> >> 2.5.0

> >>

> >>

> >>

> >> _______________________________________________

> >> linux-arm-kernel mailing list

> >> linux-arm-kernel@lists.infradead.org

> >> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

> >>

> > 

> > .

> > 

> 


[1] https://github.com/linux-test-project/ltp/commit/cf9a0800cd0d71c8c471adc5631216f995ab7e02
Zhen Lei May 25, 2016, 1:20 a.m. UTC | #5
On 2016/5/24 21:02, Catalin Marinas wrote:
> On Tue, May 24, 2016 at 08:19:05PM +0800, Leizhen (ThunderTown) wrote:

>> On 2016/5/24 19:37, Mark Rutland wrote:

>>> On Tue, May 24, 2016 at 07:16:37PM +0800, Zhen Lei wrote:

>>>> When we ran mprotect04(a test case in LTP) infinitely, it would always

>>>> failed after a few seconds. The case can be described briefly that: copy

>>>> a empty function from code area into a new memory area(created by mmap),

>>>> then call mprotect to change the protection to PROT_EXEC. The syscall

>>>> sys_mprotect will finally invoke flush_cache_range, but this function

>>>> currently only invalid icache, the operation of flush dcache is missed.

>>>

>>> In the LTP code I see powerpc-specific D-cache / I-cache synchronisation

>>> (i.e. d-cache cleaning followed by I-cache invalidation), so there

>>> appears to be some expectation of userspace maintenance. Hoever, there

>>> is no such ARM-specific I-cache maintenance.

>>

>> But I see some other platforms have D-cache maintenance, like: arch/nios2/mm/cacheflush.c

>> And according to the name of flush_cache_range, it should do this, I judged. Otherwise,

>> mprotect04 will be failed on more platforms, it's easy to discover. Only PPC have specific

>> cache synchronization, maybe it meets some hardware limitation. It's impossible a programmer

>> fixed a common bug on only one platform but leave others unchanged.

> 

> flush_cache_range() is primarily used on VIVT caches before changing the

> mapping and should not really be implemented on arm64. I don't recall

> why we still have the I-cache invalidation, possibly for the ASID-tagged

> VIVT I-cache case, though we should have a specific check for this.

> 

> There are some other cases where flush_cache_range() is called and no

> D-cache maintenance is necessary on arm64, so I don't want to penalise

> them by implementing flush_cache_range().

> 

>>> It looks like the test may be missing I-cache maintenance regardless of

>>> the semantics of mprotect in this case.

>>>

>>> I have not yet devled into flush_cache_range and how it is called.

>>

>> SYSCALL_DEFINE3(mprotect ---> mprotect_fixup ---> change_protection ---> change_protection_range --> flush_cache_range

> 

> The change_protection() shouldn't need to flush the caches in

> flush_cache_range(). The change_pte_range() function eventually ends up

> calling set_pte_at() which calls __sync_icache_dcache() if the mapping

> is executable.


OK, I see.
But I'm afraid it entered the "if (pte_present(oldpte))" branch in function change_pte_range.
Because the test case called mmap to create pte first, then called pte_modify.
I will check it later.

> 

> Can you be more specific about the kernel version you are using, its

> configuration?

> 

I used the latest mainline kernel version, and built with arch/arm64/configs/defconfig, ran on our D02 board.
I have attached the testcase, you can simply run: sh test.sh
#!/bin/bash

i=0
while [ 1 ]
do
	./mprotect04
	if [ $? -ne 0 ]; then
		echo "i=$i, failed"
		exit 1
	fi

	i=$(($i+1))

done
Zhen Lei May 25, 2016, 3:36 a.m. UTC | #6
On 2016/5/25 9:20, Leizhen (ThunderTown) wrote:
> 

> 

> On 2016/5/24 21:02, Catalin Marinas wrote:

>> On Tue, May 24, 2016 at 08:19:05PM +0800, Leizhen (ThunderTown) wrote:

>>> On 2016/5/24 19:37, Mark Rutland wrote:

>>>> On Tue, May 24, 2016 at 07:16:37PM +0800, Zhen Lei wrote:

>>>>> When we ran mprotect04(a test case in LTP) infinitely, it would always

>>>>> failed after a few seconds. The case can be described briefly that: copy

>>>>> a empty function from code area into a new memory area(created by mmap),

>>>>> then call mprotect to change the protection to PROT_EXEC. The syscall

>>>>> sys_mprotect will finally invoke flush_cache_range, but this function

>>>>> currently only invalid icache, the operation of flush dcache is missed.

>>>>

>>>> In the LTP code I see powerpc-specific D-cache / I-cache synchronisation

>>>> (i.e. d-cache cleaning followed by I-cache invalidation), so there

>>>> appears to be some expectation of userspace maintenance. Hoever, there

>>>> is no such ARM-specific I-cache maintenance.

>>>

>>> But I see some other platforms have D-cache maintenance, like: arch/nios2/mm/cacheflush.c

>>> And according to the name of flush_cache_range, it should do this, I judged. Otherwise,

>>> mprotect04 will be failed on more platforms, it's easy to discover. Only PPC have specific

>>> cache synchronization, maybe it meets some hardware limitation. It's impossible a programmer

>>> fixed a common bug on only one platform but leave others unchanged.

>>

>> flush_cache_range() is primarily used on VIVT caches before changing the

>> mapping and should not really be implemented on arm64. I don't recall

>> why we still have the I-cache invalidation, possibly for the ASID-tagged

>> VIVT I-cache case, though we should have a specific check for this.

>>

>> There are some other cases where flush_cache_range() is called and no

>> D-cache maintenance is necessary on arm64, so I don't want to penalise

>> them by implementing flush_cache_range().

>>

>>>> It looks like the test may be missing I-cache maintenance regardless of

>>>> the semantics of mprotect in this case.

>>>>

>>>> I have not yet devled into flush_cache_range and how it is called.

>>>

>>> SYSCALL_DEFINE3(mprotect ---> mprotect_fixup ---> change_protection ---> change_protection_range --> flush_cache_range

>>

>> The change_protection() shouldn't need to flush the caches in

>> flush_cache_range(). The change_pte_range() function eventually ends up

>> calling set_pte_at() which calls __sync_icache_dcache() if the mapping

>> is executable.

> 

> OK, I see.

> But I'm afraid it entered the "if (pte_present(oldpte))" branch in function change_pte_range.

> Because the test case called mmap to create pte first, then called pte_modify.

> I will check it later.


I have checked that it entered "if (pte_present(oldpte))" branch.

But I don't known why I add flush_icache_range is OK, but add __sync_icache_dcache have no effect.

> 

>>

>> Can you be more specific about the kernel version you are using, its

>> configuration?

>>

> I used the latest mainline kernel version, and built with arch/arm64/configs/defconfig, ran on our D02 board.

> I have attached the testcase, you can simply run: sh test.sh

>
Catalin Marinas May 25, 2016, 10:50 a.m. UTC | #7
On Wed, May 25, 2016 at 11:36:38AM +0800, Leizhen (ThunderTown) wrote:
> On 2016/5/25 9:20, Leizhen (ThunderTown) wrote:

> > On 2016/5/24 21:02, Catalin Marinas wrote:

> >> On Tue, May 24, 2016 at 08:19:05PM +0800, Leizhen (ThunderTown) wrote:

> >>> On 2016/5/24 19:37, Mark Rutland wrote:

> >>>> It looks like the test may be missing I-cache maintenance regardless of

> >>>> the semantics of mprotect in this case.

> >>>>

> >>>> I have not yet devled into flush_cache_range and how it is called.

> >>>

> >>> SYSCALL_DEFINE3(mprotect ---> mprotect_fixup ---> change_protection ---> change_protection_range --> flush_cache_range

> >>

> >> The change_protection() shouldn't need to flush the caches in

> >> flush_cache_range(). The change_pte_range() function eventually ends up

> >> calling set_pte_at() which calls __sync_icache_dcache() if the mapping

> >> is executable.

> > 

> > OK, I see.

> > But I'm afraid it entered the "if (pte_present(oldpte))" branch in

> > function change_pte_range. Because the test case called mmap to

> > create pte first, then called pte_modify. I will check it later.

> 

> I have checked that it entered "if (pte_present(oldpte))" branch.


This path eventually calls set_pte_at() via ptep_modify_prot_commit().

> But I don't known why I add flush_icache_range is OK, but add

> __sync_icache_dcache have no effect.


Do you mean you modified set_pte_at() to use flush_icache_range()
instead of __sync_icache_dcache() and it works?

What happens is that __sync_icache_dcache() only takes care of the first
time a page is mapped in user space and flushes the caches, marking it
as "clean" (PG_dcache_clean) afterwards. Subsequent changes to this
mapping or writes to it are entirely the responsibility of the user. So
if the user plans to execute instructions, it better explicitly flush
the caches (as Mark Rutland already stated in a previous reply).

I ran our internal LTP version yesterday and it was fine but didn't
realise that we actually patched mprotect04.c to include:

	__clear_cache((char *)func, (char *)func + page_sz);

just after memcpy().

(we still need to investigate whether the I-cache invalidation is
actually needed in flush_cache_range() or it's just something we forgot
to remove)

-- 
Catalin
Catalin Marinas May 25, 2016, 3:22 p.m. UTC | #8
On Tue, May 24, 2016 at 04:12:35PM +0100, Mark Rutland wrote:
> On Tue, May 24, 2016 at 08:19:05PM +0800, Leizhen (ThunderTown) wrote:

> > On 2016/5/24 19:37, Mark Rutland wrote:

> > > On Tue, May 24, 2016 at 07:16:37PM +0800, Zhen Lei wrote:

> > >> When we ran mprotect04(a test case in LTP) infinitely, it would always

> > >> failed after a few seconds. The case can be described briefly that: copy

> > >> a empty function from code area into a new memory area(created by mmap),

> > >> then call mprotect to change the protection to PROT_EXEC. The syscall

> > >> sys_mprotect will finally invoke flush_cache_range, but this function

> > >> currently only invalid icache, the operation of flush dcache is missed.

[...]
> Jumping back to the problem at hand:

> 

> It looks like we inherited this from ARM, which has done this for all

> executable mappings since commit  6060e8df517847bf ("ARM: I-cache: flush

> executable mappings in flush_cache_range()"). The commit message refers

> to a4db94d, which doesn't seem to exist, and I assume is

> 115b22474eb1905d ("ARM: 5794/1: Flush the D-cache during

> copy_user_highpage()") which happened to get rebased at some point.

> 

> That all implies that the icache maintenance is only intended to ensure

> that CoW does not result in unexpected incoherency. Which in turn

> implies that flipping permissions with mprotect is not expected to

> synchronise the D and I caches.


Looking through the arm32 history:

2.6.33:
  115b22474eb1 ("ARM: 5794/1: Flush the D-cache during copy_user_highpage()")
  6060e8df5178 ("ARM: I-cache: flush executable mappings in flush_cache_range()")

These both take care of the CoW issue with executable pages.

2.6.37:
  6012191aa9c6 ("ARM: 6380/1: Introduce __sync_icache_dcache() for VIPT caches")

The above is a fix for SMP systems where update_mmu_cache() happens
after the PTE was actually written, so slight chance of race with
another CPU. We generalised it to all ARMv6+ systems (so VIPT caches).

3.1:
  c7e89b16eb90 ("ARM: 6995/2: mm: remove unnecessary cache flush on v6 copypage")

That's when we realised that the CoW problem no longer exists for
non-aliasing VIPT caches. However, the I-cache counterpart 6060e8df5178
has not been reverted.

So I think that for arm64 and arm32 with non-aliasing VIPT,
flush_cache_range() can safely be a no-op.

-- 
Catalin
Zhen Lei May 26, 2016, 3:40 a.m. UTC | #9
On 2016/5/25 18:50, Catalin Marinas wrote:
> On Wed, May 25, 2016 at 11:36:38AM +0800, Leizhen (ThunderTown) wrote:

>> On 2016/5/25 9:20, Leizhen (ThunderTown) wrote:

>>> On 2016/5/24 21:02, Catalin Marinas wrote:

>>>> On Tue, May 24, 2016 at 08:19:05PM +0800, Leizhen (ThunderTown) wrote:

>>>>> On 2016/5/24 19:37, Mark Rutland wrote:

>>>>>> It looks like the test may be missing I-cache maintenance regardless of

>>>>>> the semantics of mprotect in this case.

>>>>>>

>>>>>> I have not yet devled into flush_cache_range and how it is called.

>>>>>

>>>>> SYSCALL_DEFINE3(mprotect ---> mprotect_fixup ---> change_protection ---> change_protection_range --> flush_cache_range

>>>>

>>>> The change_protection() shouldn't need to flush the caches in

>>>> flush_cache_range(). The change_pte_range() function eventually ends up

>>>> calling set_pte_at() which calls __sync_icache_dcache() if the mapping

>>>> is executable.

>>>

>>> OK, I see.

>>> But I'm afraid it entered the "if (pte_present(oldpte))" branch in

>>> function change_pte_range. Because the test case called mmap to

>>> create pte first, then called pte_modify. I will check it later.

>>

>> I have checked that it entered "if (pte_present(oldpte))" branch.

> 

> This path eventually calls set_pte_at() via ptep_modify_prot_commit().

OK, I see.

> 

>> But I don't known why I add flush_icache_range is OK, but add

>> __sync_icache_dcache have no effect.

> 

> Do you mean you modified set_pte_at() to use flush_icache_range()

Just about. I added in change_pte_range after below statement.
ptent = pte_modify(ptent, newprot);

> instead of __sync_icache_dcache() and it works?

Yes.

> 

> What happens is that __sync_icache_dcache() only takes care of the first

> time a page is mapped in user space and flushes the caches, marking it

> as "clean" (PG_dcache_clean) afterwards. Subsequent changes to this

> mapping or writes to it are entirely the responsibility of the user. So

> if the user plans to execute instructions, it better explicitly flush

> the caches (as Mark Rutland already stated in a previous reply).

> 

> I ran our internal LTP version yesterday and it was fine but didn't

> realise that we actually patched mprotect04.c to include:

> 

> 	__clear_cache((char *)func, (char *)func + page_sz);

> 

> just after memcpy().

Yes, I aslo tried this before I sent this patch. Flush dcache in userspace
or kernel can both fixs this problem.

> 

> (we still need to investigate whether the I-cache invalidation is

> actually needed in flush_cache_range() or it's just something we forgot

> to remove)

>
Zhen Lei May 26, 2016, 11:46 a.m. UTC | #10
On 2016/5/25 18:50, Catalin Marinas wrote:
> On Wed, May 25, 2016 at 11:36:38AM +0800, Leizhen (ThunderTown) wrote:

>> On 2016/5/25 9:20, Leizhen (ThunderTown) wrote:

>>> On 2016/5/24 21:02, Catalin Marinas wrote:

>>>> On Tue, May 24, 2016 at 08:19:05PM +0800, Leizhen (ThunderTown) wrote:

>>>>> On 2016/5/24 19:37, Mark Rutland wrote:

>>>>>> It looks like the test may be missing I-cache maintenance regardless of

>>>>>> the semantics of mprotect in this case.

>>>>>>

>>>>>> I have not yet devled into flush_cache_range and how it is called.

>>>>>

>>>>> SYSCALL_DEFINE3(mprotect ---> mprotect_fixup ---> change_protection ---> change_protection_range --> flush_cache_range

>>>>

>>>> The change_protection() shouldn't need to flush the caches in

>>>> flush_cache_range(). The change_pte_range() function eventually ends up

>>>> calling set_pte_at() which calls __sync_icache_dcache() if the mapping

>>>> is executable.

>>>

>>> OK, I see.

>>> But I'm afraid it entered the "if (pte_present(oldpte))" branch in

>>> function change_pte_range. Because the test case called mmap to

>>> create pte first, then called pte_modify. I will check it later.

>>

>> I have checked that it entered "if (pte_present(oldpte))" branch.

> 

> This path eventually calls set_pte_at() via ptep_modify_prot_commit().

> 

>> But I don't known why I add flush_icache_range is OK, but add

>> __sync_icache_dcache have no effect.

> 

> Do you mean you modified set_pte_at() to use flush_icache_range()

> instead of __sync_icache_dcache() and it works?

> 

> What happens is that __sync_icache_dcache() only takes care of the first

> time a page is mapped in user space and flushes the caches, marking it

> as "clean" (PG_dcache_clean) afterwards. Subsequent changes to this


Hi,
As my tracing, it is returned by "if (!page_mapping(page))", because "mmap" are anonymous pages. I commented below code lines, it works well.
	
	/* no flushing needed for anonymous pages */
	if (!page_mapping(page))
		return;


I printed the page information three times, as below:
page->mapping=ffff8017baf36961, page->flags=0x1000000000040048
page->mapping=ffff8017b265bf51, page->flags=0x1000000000040048
page->mapping=ffff8017b94fc5a1, page->flags=0x1000000000040048

PG_slab=7, PG_arch_1=9, PG_swapcache=15

> mapping or writes to it are entirely the responsibility of the user. So

> if the user plans to execute instructions, it better explicitly flush

> the caches (as Mark Rutland already stated in a previous reply).

> 

> I ran our internal LTP version yesterday and it was fine but didn't

> realise that we actually patched mprotect04.c to include:

> 

> 	__clear_cache((char *)func, (char *)func + page_sz);

> 

> just after memcpy().

> 

> (we still need to investigate whether the I-cache invalidation is

> actually needed in flush_cache_range() or it's just something we forgot

> to remove)

>
Catalin Marinas May 26, 2016, 4:36 p.m. UTC | #11
On Thu, May 26, 2016 at 07:46:11PM +0800, Leizhen (ThunderTown) wrote:
> On 2016/5/25 18:50, Catalin Marinas wrote:

> > What happens is that __sync_icache_dcache() only takes care of the first

> > time a page is mapped in user space and flushes the caches, marking it

> > as "clean" (PG_dcache_clean) afterwards. Subsequent changes to this

> 

> As my tracing, it is returned by "if (!page_mapping(page))", because

> "mmap" are anonymous pages. I commented below code lines, it works

> well.

> 	

> 	/* no flushing needed for anonymous pages */

> 	if (!page_mapping(page))

> 		return;


I think it only works by luck. As I said above, even with your
modification for anonymous pages, the first time set_pte_at() is called,
__sync_icache_dcache() would set the PG_dcache_clean bit. Subsequent
set_pte_at() calls for changing the attributes would ignore the D-cache
invalidation as the page seems clean (unless there is a call to
flush_dcache_page() but this shouldn't be done on this path). What
probably happens is that memcpy() for copying the code triggers some
write streaming mode in the CPU and the information makes its way to the
PoU. The I-cache invalidation only removes the stale instructions.

-- 
Catalin
Catalin Marinas May 26, 2016, 4:44 p.m. UTC | #12
On Wed, May 25, 2016 at 06:27:07PM +0100, Russell King - ARM Linux wrote:
> On Wed, May 25, 2016 at 04:22:55PM +0100, Catalin Marinas wrote:

> > That's when we realised that the CoW problem no longer exists for

> > non-aliasing VIPT caches. However, the I-cache counterpart 6060e8df5178

> > has not been reverted.

> 

> I think I mostly agree, except for reverting 6060e8df5178, which I don't

> think would be correct.  That reintroduces the possibility of flushing

> the I-cache twice in that path, once for aliasing vipt dcaches, and again

> for asid tagged vivt icaches.  I'd rather have the code structured so we

> only do this once.


__sync_icache_dcache() seems to take care of AIVIVT caches as well (i.e.
it invalidates the whole I-cache) so I think we could just remove the
__flush_icache_all() from flush_cache_range().

I can't find a scenario where we still need D-cache clean+invalidate in
flush_cache_range() for aliasing VIPT. As for the I-cache,
__sync_icache_dcache() takes care of all the aliases.

-- 
Catalin
diff mbox

Patch

diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
index dbd12ea..eda4124 100644
--- a/arch/arm64/mm/flush.c
+++ b/arch/arm64/mm/flush.c
@@ -31,7 +31,7 @@  void flush_cache_range(struct vm_area_struct *vma, unsigned long start,
 		       unsigned long end)
 {
 	if (vma->vm_flags & VM_EXEC)
-		__flush_icache_all();
+		flush_icache_range(start, end);
 }

 static void sync_icache_aliases(void *kaddr, unsigned long len)