diff mbox

arm64: KVM: fix unmapping with 48-bit VAs

Message ID 1414525005-21080-1-git-send-email-mark.rutland@arm.com
State Superseded
Headers show

Commit Message

Mark Rutland Oct. 28, 2014, 7:36 p.m. UTC
Hi All,

I'm not at all familiar with this code, so my analysis below might be off. Does
this look right or am I just masking a more fundamental issue?

I've given this a go with 4KiB pages on Juno with {39,48}-bit VAs and the
system seems to boot fine.

Thanks,
Mark.

---->8----
Currently if using a 48-bit VA, tearing down the hyp page tables (which
can happen in the absence of a GICH or GICV resource) results in the
rather nasty splat below, evidently becasue we access a table that
doesn't actually exist.

Commit 38f791a4e499792e (arm64: KVM: Implement 48 VA support for KVM EL2
and Stage-2) added a pgd_none check to __create_hyp_mappings to account
for the additional level of tables, but didn't add a corresponding check
to unmap_range, and this seems to be the source of the problem.

This patch adds the missing pgd_none check, ensuring we don't try to
access tables that don't exist.

Original splat below:

kvm [1]: Using HYP init bounce page @83fe94a000
kvm [1]: Cannot obtain GICH resource
Unable to handle kernel paging request at virtual address ffff7f7fff000000
pgd = ffff800000770000
[ffff7f7fff000000] *pgd=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc2+ #89
task: ffff8003eb500000 ti: ffff8003eb45c000 task.ti: ffff8003eb45c000
PC is at unmap_range+0x120/0x580
LR is at free_hyp_pgds+0xac/0xe4
pc : [<ffff80000009b768>] lr : [<ffff80000009cad8>] pstate: 80000045
sp : ffff8003eb45fbf0
x29: ffff8003eb45fbf0 x28: ffff800000736000
x27: ffff800000735000 x26: ffff7f7fff000000
x25: 0000000040000000 x24: ffff8000006f5000
x23: 0000000000000000 x22: 0000007fffffffff
x21: 0000800000000000 x20: 0000008000000000
x19: 0000000000000000 x18: ffff800000648000
x17: ffff800000537228 x16: 0000000000000000
x15: 000000000000001f x14: 0000000000000000
x13: 0000000000000001 x12: 0000000000000020
x11: 0000000000000062 x10: 0000000000000006
x9 : 0000000000000000 x8 : 0000000000000063
x7 : 0000000000000018 x6 : 00000003ff000000
x5 : ffff800000744188 x4 : 0000000000000001
x3 : 0000000040000000 x2 : ffff800000000000
x1 : 0000007fffffffff x0 : 000000003fffffff

Process swapper/0 (pid: 1, stack limit = 0xffff8003eb45c058)
Stack: (0xffff8003eb45fbf0 to 0xffff8003eb460000)
fbe0:                                     eb45fcb0 ffff8003 0009cad8 ffff8000
fc00: 00000000 00000080 00736140 ffff8000 00736000 ffff8000 00000000 00007c80
fc20: 00000000 00000080 006f5000 ffff8000 00000000 00000080 00743000 ffff8000
fc40: 00735000 ffff8000 006d3030 ffff8000 006fe7b8 ffff8000 00000000 00000080
fc60: ffffffff 0000007f fdac1000 ffff8003 fd94b000 ffff8003 fda47000 ffff8003
fc80: 00502b40 ffff8000 ff000000 ffff7f7f fdec6000 00008003 fdac1630 ffff8003
fca0: eb45fcb0 ffff8003 ffffffff 0000007f eb45fd00 ffff8003 0009b378 ffff8000
fcc0: ffffffea 00000000 006fe000 ffff8000 00736728 ffff8000 00736120 ffff8000
fce0: 00000040 00000000 00743000 ffff8000 006fe7b8 ffff8000 0050cd48 00000000
fd00: eb45fd60 ffff8003 00096070 ffff8000 006f06e0 ffff8000 006f06e0 ffff8000
fd20: fd948b40 ffff8003 0009a320 ffff8000 00000000 00000000 00000000 00000000
fd40: 00000ae0 00000000 006aa25c ffff8000 eb45fd60 ffff8003 0017ca44 00000002
fd60: eb45fdc0 ffff8003 0009a33c ffff8000 006f06e0 ffff8000 006f06e0 ffff8000
fd80: fd948b40 ffff8003 0009a320 ffff8000 00000000 00000000 00735000 ffff8000
fda0: 006d3090 ffff8000 006aa25c ffff8000 00735000 ffff8000 006d3030 ffff8000
fdc0: eb45fdd0 ffff8003 000814c0 ffff8000 eb45fe50 ffff8003 006aaac4 ffff8000
fde0: 006ddd90 ffff8000 00000006 00000000 006d3000 ffff8000 00000095 00000000
fe00: 006a1e90 ffff8000 00735000 ffff8000 006d3000 ffff8000 006aa25c ffff8000
fe20: 00735000 ffff8000 006d3030 ffff8000 eb45fe50 ffff8003 006fac68 ffff8000
fe40: 00000006 00000006 fe293ee6 ffff8003 eb45feb0 ffff8003 004f8ee8 ffff8000
fe60: 004f8ed4 ffff8000 00735000 ffff8000 00000000 00000000 00000000 00000000
fe80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
fea0: 00000000 00000000 00000000 00000000 00000000 00000000 000843d0 ffff8000
fec0: 004f8ed4 ffff8000 00000000 00000000 00000000 00000000 00000000 00000000
fee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffa0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000005 00000000
ffe0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Call trace:
[<ffff80000009b768>] unmap_range+0x120/0x580
[<ffff80000009cad4>] free_hyp_pgds+0xa8/0xe4
[<ffff80000009b374>] kvm_arch_init+0x268/0x44c
[<ffff80000009606c>] kvm_init+0x24/0x260
[<ffff80000009a338>] arm_init+0x18/0x24
[<ffff8000000814bc>] do_one_initcall+0x88/0x1a0
[<ffff8000006aaac0>] kernel_init_freeable+0x148/0x1e8
[<ffff8000004f8ee4>] kernel_init+0x10/0xd4
Code: 8b000263 92628479 d1000720 eb01001f (f9400340)
---[ end trace 3bc230562e926fa4 ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Jungseok Lee <jungseoklee85@gmail.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/kvm/mmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Marc Zyngier Oct. 29, 2014, 10:39 a.m. UTC | #1
Hi Mark,

On 28/10/14 19:36, Mark Rutland wrote:
> Hi All,
> 
> I'm not at all familiar with this code, so my analysis below might be off. Does
> this look right or am I just masking a more fundamental issue?
> 
> I've given this a go with 4KiB pages on Juno with {39,48}-bit VAs and the
> system seems to boot fine.

Thanks for looking into this.

> Thanks,
> Mark.
> 
> ---->8----
> Currently if using a 48-bit VA, tearing down the hyp page tables (which
> can happen in the absence of a GICH or GICV resource) results in the
> rather nasty splat below, evidently becasue we access a table that
> doesn't actually exist.
> 
> Commit 38f791a4e499792e (arm64: KVM: Implement 48 VA support for KVM EL2
> and Stage-2) added a pgd_none check to __create_hyp_mappings to account
> for the additional level of tables, but didn't add a corresponding check
> to unmap_range, and this seems to be the source of the problem.
> 
> This patch adds the missing pgd_none check, ensuring we don't try to
> access tables that don't exist.
> 
> Original splat below:
> 
> kvm [1]: Using HYP init bounce page @83fe94a000
> kvm [1]: Cannot obtain GICH resource
> Unable to handle kernel paging request at virtual address ffff7f7fff000000
> pgd = ffff800000770000
> [ffff7f7fff000000] *pgd=0000000000000000
> Internal error: Oops: 96000004 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc2+ #89
> task: ffff8003eb500000 ti: ffff8003eb45c000 task.ti: ffff8003eb45c000
> PC is at unmap_range+0x120/0x580
> LR is at free_hyp_pgds+0xac/0xe4
> pc : [<ffff80000009b768>] lr : [<ffff80000009cad8>] pstate: 80000045
> sp : ffff8003eb45fbf0
> x29: ffff8003eb45fbf0 x28: ffff800000736000
> x27: ffff800000735000 x26: ffff7f7fff000000
> x25: 0000000040000000 x24: ffff8000006f5000
> x23: 0000000000000000 x22: 0000007fffffffff
> x21: 0000800000000000 x20: 0000008000000000
> x19: 0000000000000000 x18: ffff800000648000
> x17: ffff800000537228 x16: 0000000000000000
> x15: 000000000000001f x14: 0000000000000000
> x13: 0000000000000001 x12: 0000000000000020
> x11: 0000000000000062 x10: 0000000000000006
> x9 : 0000000000000000 x8 : 0000000000000063
> x7 : 0000000000000018 x6 : 00000003ff000000
> x5 : ffff800000744188 x4 : 0000000000000001
> x3 : 0000000040000000 x2 : ffff800000000000
> x1 : 0000007fffffffff x0 : 000000003fffffff
> 
> Process swapper/0 (pid: 1, stack limit = 0xffff8003eb45c058)
> Stack: (0xffff8003eb45fbf0 to 0xffff8003eb460000)
> fbe0:                                     eb45fcb0 ffff8003 0009cad8 ffff8000
> fc00: 00000000 00000080 00736140 ffff8000 00736000 ffff8000 00000000 00007c80
> fc20: 00000000 00000080 006f5000 ffff8000 00000000 00000080 00743000 ffff8000
> fc40: 00735000 ffff8000 006d3030 ffff8000 006fe7b8 ffff8000 00000000 00000080
> fc60: ffffffff 0000007f fdac1000 ffff8003 fd94b000 ffff8003 fda47000 ffff8003
> fc80: 00502b40 ffff8000 ff000000 ffff7f7f fdec6000 00008003 fdac1630 ffff8003
> fca0: eb45fcb0 ffff8003 ffffffff 0000007f eb45fd00 ffff8003 0009b378 ffff8000
> fcc0: ffffffea 00000000 006fe000 ffff8000 00736728 ffff8000 00736120 ffff8000
> fce0: 00000040 00000000 00743000 ffff8000 006fe7b8 ffff8000 0050cd48 00000000
> fd00: eb45fd60 ffff8003 00096070 ffff8000 006f06e0 ffff8000 006f06e0 ffff8000
> fd20: fd948b40 ffff8003 0009a320 ffff8000 00000000 00000000 00000000 00000000
> fd40: 00000ae0 00000000 006aa25c ffff8000 eb45fd60 ffff8003 0017ca44 00000002
> fd60: eb45fdc0 ffff8003 0009a33c ffff8000 006f06e0 ffff8000 006f06e0 ffff8000
> fd80: fd948b40 ffff8003 0009a320 ffff8000 00000000 00000000 00735000 ffff8000
> fda0: 006d3090 ffff8000 006aa25c ffff8000 00735000 ffff8000 006d3030 ffff8000
> fdc0: eb45fdd0 ffff8003 000814c0 ffff8000 eb45fe50 ffff8003 006aaac4 ffff8000
> fde0: 006ddd90 ffff8000 00000006 00000000 006d3000 ffff8000 00000095 00000000
> fe00: 006a1e90 ffff8000 00735000 ffff8000 006d3000 ffff8000 006aa25c ffff8000
> fe20: 00735000 ffff8000 006d3030 ffff8000 eb45fe50 ffff8003 006fac68 ffff8000
> fe40: 00000006 00000006 fe293ee6 ffff8003 eb45feb0 ffff8003 004f8ee8 ffff8000
> fe60: 004f8ed4 ffff8000 00735000 ffff8000 00000000 00000000 00000000 00000000
> fe80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> fea0: 00000000 00000000 00000000 00000000 00000000 00000000 000843d0 ffff8000
> fec0: 004f8ed4 ffff8000 00000000 00000000 00000000 00000000 00000000 00000000
> fee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ff00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ff20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ff40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ff60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ff80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ffa0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000005 00000000
> ffe0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> Call trace:
> [<ffff80000009b768>] unmap_range+0x120/0x580
> [<ffff80000009cad4>] free_hyp_pgds+0xa8/0xe4
> [<ffff80000009b374>] kvm_arch_init+0x268/0x44c
> [<ffff80000009606c>] kvm_init+0x24/0x260
> [<ffff80000009a338>] arm_init+0x18/0x24
> [<ffff8000000814bc>] do_one_initcall+0x88/0x1a0
> [<ffff8000006aaac0>] kernel_init_freeable+0x148/0x1e8
> [<ffff8000004f8ee4>] kernel_init+0x10/0xd4
> Code: 8b000263 92628479 d1000720 eb01001f (f9400340)
> ---[ end trace 3bc230562e926fa4 ]---
> Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> Cc: Jungseok Lee <jungseoklee85@gmail.com>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm/kvm/mmu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 57a403a..79d3fbf 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -197,7 +197,8 @@ static void unmap_range(struct kvm *kvm, pgd_t *pgdp,
>  	pgd = pgdp + pgd_index(addr);
>  	do {
>  		next = kvm_pgd_addr_end(addr, end);
> -		unmap_puds(kvm, pgd, addr, next);
> +		if (!pgd_none(*pgd))
> +			unmap_puds(kvm, pgd, addr, next);
>  	} while (pgd++, addr = next, addr != end);
>  }

Looks good to me!

Acked-by: Marc Zyngier <marc.zyngier@arm.com>

Christoffer, if you're OK with that one, I'll queue it up for the next -rc.

	M.
Christoffer Dall Oct. 30, 2014, 9 a.m. UTC | #2
On Wed, Oct 29, 2014 at 10:39:45AM +0000, Marc Zyngier wrote:
> Hi Mark,
> 
> On 28/10/14 19:36, Mark Rutland wrote:
> > Hi All,
> > 
> > I'm not at all familiar with this code, so my analysis below might be off. Does
> > this look right or am I just masking a more fundamental issue?
> > 
> > I've given this a go with 4KiB pages on Juno with {39,48}-bit VAs and the
> > system seems to boot fine.
> 
> Thanks for looking into this.
> 

yeah, sorry for missing this!

[...]

> 
> Looks good to me!
> 
> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
> 
> Christoffer, if you're OK with that one, I'll queue it up for the next -rc.
> 
Yup (for the record we're not breaking anything because 48 bit VA still
depends on broken, right?):

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

-Christoffer
Marc Zyngier Oct. 30, 2014, 9:08 a.m. UTC | #3
On 30/10/14 09:00, Christoffer Dall wrote:
> On Wed, Oct 29, 2014 at 10:39:45AM +0000, Marc Zyngier wrote:
>> Hi Mark,
>>
>> On 28/10/14 19:36, Mark Rutland wrote:
>>> Hi All,
>>>
>>> I'm not at all familiar with this code, so my analysis below might be off. Does
>>> this look right or am I just masking a more fundamental issue?
>>>
>>> I've given this a go with 4KiB pages on Juno with {39,48}-bit VAs and the
>>> system seems to boot fine.
>>
>> Thanks for looking into this.
>>
> 
> yeah, sorry for missing this!
> 
> [...]
> 
>>
>> Looks good to me!
>>
>> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
>>
>> Christoffer, if you're OK with that one, I'll queue it up for the next -rc.
>>
> Yup (for the record we're not breaking anything because 48 bit VA still
> depends on broken, right?):

Since 04f905a (arm64: Allow 48-bits VA space without ARM_SMMU), we can
select 48bit VA if the SMMU is not selected.

	M.
Christoffer Dall Oct. 30, 2014, 9:15 a.m. UTC | #4
On Thu, Oct 30, 2014 at 10:08 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 30/10/14 09:00, Christoffer Dall wrote:
>> On Wed, Oct 29, 2014 at 10:39:45AM +0000, Marc Zyngier wrote:
>>> Hi Mark,
>>>
>>> On 28/10/14 19:36, Mark Rutland wrote:
>>>> Hi All,
>>>>
>>>> I'm not at all familiar with this code, so my analysis below might be off. Does
>>>> this look right or am I just masking a more fundamental issue?
>>>>
>>>> I've given this a go with 4KiB pages on Juno with {39,48}-bit VAs and the
>>>> system seems to boot fine.
>>>
>>> Thanks for looking into this.
>>>
>>
>> yeah, sorry for missing this!
>>
>> [...]
>>
>>>
>>> Looks good to me!
>>>
>>> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
>>>
>>> Christoffer, if you're OK with that one, I'll queue it up for the next -rc.
>>>
>> Yup (for the record we're not breaking anything because 48 bit VA still
>> depends on broken, right?):
>
> Since 04f905a (arm64: Allow 48-bits VA space without ARM_SMMU), we can
> select 48bit VA if the SMMU is not selected.
>
Did Catlin merge this separately or did I accidentally do that?  I
thought we were holding off on that until we had done more testing...

-Christoffer
Marc Zyngier Oct. 30, 2014, 9:23 a.m. UTC | #5
On 30/10/14 09:15, Christoffer Dall wrote:
> On Thu, Oct 30, 2014 at 10:08 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>> On 30/10/14 09:00, Christoffer Dall wrote:
>>> On Wed, Oct 29, 2014 at 10:39:45AM +0000, Marc Zyngier wrote:
>>>> Hi Mark,
>>>>
>>>> On 28/10/14 19:36, Mark Rutland wrote:
>>>>> Hi All,
>>>>>
>>>>> I'm not at all familiar with this code, so my analysis below might be off. Does
>>>>> this look right or am I just masking a more fundamental issue?
>>>>>
>>>>> I've given this a go with 4KiB pages on Juno with {39,48}-bit VAs and the
>>>>> system seems to boot fine.
>>>>
>>>> Thanks for looking into this.
>>>>
>>>
>>> yeah, sorry for missing this!
>>>
>>> [...]
>>>
>>>>
>>>> Looks good to me!
>>>>
>>>> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>
>>>> Christoffer, if you're OK with that one, I'll queue it up for the next -rc.
>>>>
>>> Yup (for the record we're not breaking anything because 48 bit VA still
>>> depends on broken, right?):
>>
>> Since 04f905a (arm64: Allow 48-bits VA space without ARM_SMMU), we can
>> select 48bit VA if the SMMU is not selected.
>>
> Did Catlin merge this separately or did I accidentally do that?  I
> thought we were holding off on that until we had done more testing...

Catalin felt very brave, did some testing (LTP and stuff), fixed another
nit (3dec0fe), sent a pull request to Linus and went on holiday!

So far, it looks good. :-)

	M.
Christoffer Dall Oct. 30, 2014, 9:29 a.m. UTC | #6
On Thu, Oct 30, 2014 at 10:23 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 30/10/14 09:15, Christoffer Dall wrote:
>> On Thu, Oct 30, 2014 at 10:08 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>> On 30/10/14 09:00, Christoffer Dall wrote:
>>>> On Wed, Oct 29, 2014 at 10:39:45AM +0000, Marc Zyngier wrote:
>>>>> Hi Mark,
>>>>>
>>>>> On 28/10/14 19:36, Mark Rutland wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> I'm not at all familiar with this code, so my analysis below might be off. Does
>>>>>> this look right or am I just masking a more fundamental issue?
>>>>>>
>>>>>> I've given this a go with 4KiB pages on Juno with {39,48}-bit VAs and the
>>>>>> system seems to boot fine.
>>>>>
>>>>> Thanks for looking into this.
>>>>>
>>>>
>>>> yeah, sorry for missing this!
>>>>
>>>> [...]
>>>>
>>>>>
>>>>> Looks good to me!
>>>>>
>>>>> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>>
>>>>> Christoffer, if you're OK with that one, I'll queue it up for the next -rc.
>>>>>
>>>> Yup (for the record we're not breaking anything because 48 bit VA still
>>>> depends on broken, right?):
>>>
>>> Since 04f905a (arm64: Allow 48-bits VA space without ARM_SMMU), we can
>>> select 48bit VA if the SMMU is not selected.
>>>
>> Did Catlin merge this separately or did I accidentally do that?  I
>> thought we were holding off on that until we had done more testing...
>
> Catalin felt very brave, did some testing (LTP and stuff), fixed another
> nit (3dec0fe), sent a pull request to Linus and went on holiday!
>
> So far, it looks good. :-)
>
great, hellelujah for progress.
diff mbox

Patch

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 57a403a..79d3fbf 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -197,7 +197,8 @@  static void unmap_range(struct kvm *kvm, pgd_t *pgdp,
 	pgd = pgdp + pgd_index(addr);
 	do {
 		next = kvm_pgd_addr_end(addr, end);
-		unmap_puds(kvm, pgd, addr, next);
+		if (!pgd_none(*pgd))
+			unmap_puds(kvm, pgd, addr, next);
 	} while (pgd++, addr = next, addr != end);
 }