diff mbox series

[Xen-devel] xen/page_alloc: Keep away MFN 0 from the buddy allocator

Message ID 20190809121440.5668-1-julien.grall@arm.com
State New
Headers show
Series [Xen-devel] xen/page_alloc: Keep away MFN 0 from the buddy allocator | expand

Commit Message

Julien Grall Aug. 9, 2019, 12:14 p.m. UTC
Combining of buddies happens only such that the resulting larger buddy
is still order-aligned. To cross a zone boundary while merging, the
implication is that both the buddy [0, 2^n-1] and the buddy
[2^n, 2^(n+1)] are free.

Ideally we want to fix the allocator, but for now we can just prevent
adding the MFN 0 in the allocator.

On x86, the MFN 0 is already kept away from the buddy allocator. So the
bug can only happen on Arm platform where the first memory bank is
starting at 0.

As this is a specific to the allocator, the MFN 0 is removed in the common code
to cater all the architectures (current and future).

Reported-by: Jeff Kubascik <jeff.kubascik@dornerworks.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>

---
Cc: Jarvis.Roach@dornerworks.com
Cc: Stewart.Hildebrand@dornerworks.com

    I am not sure I fully understood the exact problem when MFN 0 is
    given to the allocator. Feel free to suggest a better explanation.

    Can anyone able to reproduce the bug test where the patch
    effectively solve the crash?
---
 xen/common/page_alloc.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Jan Beulich Aug. 9, 2019, 1:38 p.m. UTC | #1
On 09.08.2019 14:14, Julien Grall wrote:
> Combining of buddies happens only such that the resulting larger buddy
> is still order-aligned. To cross a zone boundary while merging, the
> implication is that both the buddy [0, 2^n-1] and the buddy
> [2^n, 2^(n+1)] are free.

[2^n, 2^(n+1)-1]

You may want to add that merging across zone boundaries is what we
need to prevent.

> Ideally we want to fix the allocator, but for now we can just prevent
> adding the MFN 0 in the allocator.
> 
> On x86, the MFN 0 is already kept away from the buddy allocator. So the
> bug can only happen on Arm platform where the first memory bank is
> starting at 0.
> 
> As this is a specific to the allocator, the MFN 0 is removed in the common code
> to cater all the architectures (current and future).
> 
> Reported-by: Jeff Kubascik <jeff.kubascik@dornerworks.com>
> Signed-off-by: Julien Grall <julien.grall@arm.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
Stefano Stabellini Aug. 9, 2019, 5:55 p.m. UTC | #2
On Fri, 9 Aug 2019, Julien Grall wrote:
> Combining of buddies happens only such that the resulting larger buddy
> is still order-aligned. To cross a zone boundary while merging, the
> implication is that both the buddy [0, 2^n-1] and the buddy
> [2^n, 2^(n+1)] are free.
> 
> Ideally we want to fix the allocator, but for now we can just prevent
> adding the MFN 0 in the allocator.
> 
> On x86, the MFN 0 is already kept away from the buddy allocator. So the
> bug can only happen on Arm platform where the first memory bank is
> starting at 0.
> 
> As this is a specific to the allocator, the MFN 0 is removed in the common code
> to cater all the architectures (current and future).
> 
> Reported-by: Jeff Kubascik <jeff.kubascik@dornerworks.com>
> Signed-off-by: Julien Grall <julien.grall@arm.com>

I could reproduce the problem and I confirm that this patch fixes it:

Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>



> ---
> Cc: Jarvis.Roach@dornerworks.com
> Cc: Stewart.Hildebrand@dornerworks.com
> 
>     I am not sure I fully understood the exact problem when MFN 0 is
>     given to the allocator. Feel free to suggest a better explanation.
> 
>     Can anyone able to reproduce the bug test where the patch
>     effectively solve the crash?
> ---
>  xen/common/page_alloc.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index 453b303b5b..42b8a8ce23 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -1759,6 +1759,18 @@ static void init_heap_pages(
>      bool idle_scrub = false;
>  
>      /*
> +     * Keep MFN 0 away from the buddy allocator to avoid crossing zone
> +     * boundary when merging two buddies.
> +     */
> +    if ( !mfn_x(page_to_mfn(pg)) )
> +    {
> +        if ( nr_pages-- <= 1 )
> +            return;
> +        pg++;
> +    }
> +
> +
> +    /*
>       * Some pages may not go through the boot allocator (e.g reserved
>       * memory at boot but released just after --- kernel, initramfs,
>       * etc.).
> -- 
> 2.11.0
>
Stewart Hildebrand Aug. 9, 2019, 6:15 p.m. UTC | #3
On Friday, August 9, 2019 9:39 AM, Jan Beulich <jbeulich@suse.com> wrote:
>On 09.08.2019 14:14, Julien Grall wrote:

>> Combining of buddies happens only such that the resulting larger buddy

>> is still order-aligned. To cross a zone boundary while merging, the

>> implication is that both the buddy [0, 2^n-1] and the buddy

>> [2^n, 2^(n+1)] are free.

>

>[2^n, 2^(n+1)-1]

>

>You may want to add that merging across zone boundaries is what we

>need to prevent.

>

>> Ideally we want to fix the allocator, but for now we can just prevent

>> adding the MFN 0 in the allocator.

>>

>> On x86, the MFN 0 is already kept away from the buddy allocator. So the

>> bug can only happen on Arm platform where the first memory bank is

>> starting at 0.

>>

>> As this is a specific to the allocator, the MFN 0 is removed in the common code

>> to cater all the architectures (current and future).

>>

>> Reported-by: Jeff Kubascik <jeff.kubascik@dornerworks.com>

>> Signed-off-by: Julien Grall <julien.grall@arm.com>

>

>Reviewed-by: Jan Beulich <jbeulich@suse.com>


Here is Jeff's initial patch for the issue.

From: Jeff Kubascik <jeff.kubascik@dornerworks.com>

Date: Mon, 4 Mar 2019 14:14:05 -0500
Subject: [PATCH] Check zone before merging adjacent blocks in heap

The Xen heap is split up into nodes and zones. Each node + zone is
managed as a separate pool of memory.

When returning pages to the heap, free_heap_pages will check adjacent
blocks to see if they can be combined into a larger block. However, the
zone of the adjacent block is not checked. This results in blocks that
migrate from one zone to another.

When a block migrates to the adjacent zone, the avail counters for the
old and new node + zone is not updated accordingly. The avail counter
is used when allocating pages to determine whether to skip over a zone.
With this behavior, it is possible for free pages to collect in a zone
with the avail counter smaller than the actual page count, resulting
in free pages that are not allocable.

This commit adds a check to compare the adjacent block's zone with the
current zone before merging them.

Signed-off-by: Jeff Kubascik <Jeff.Kubascik@dornerworks.com>

Tested-by: Stewart Hildebrand <stewart.hildebrand@dornerworks.com>

---
 xen/common/page_alloc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 482f0988f7..a92268cc67 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1419,6 +1419,7 @@ static void free_heap_pages(
             if ( !mfn_valid(page_to_mfn(predecessor)) ||
                  !page_state_is(predecessor, free) ||
                  (PFN_ORDER(predecessor) != order) ||
+                 (page_to_zone(pg-mask) != zone) ||
                  (phys_to_nid(page_to_maddr(predecessor)) != node) )
                 break;
 
@@ -1442,6 +1443,7 @@ static void free_heap_pages(
             if ( !mfn_valid(page_to_mfn(successor)) ||
                  !page_state_is(successor, free) ||
                  (PFN_ORDER(successor) != order) ||
+                 (page_to_zone(pg+mask) != zone) ||
                  (phys_to_nid(page_to_maddr(successor)) != node) )
                 break;
 
-- 
2.22.0
Stefano Stabellini Aug. 9, 2019, 6:23 p.m. UTC | #4
On Fri, 9 Aug 2019, Stewart Hildebrand wrote:
> On Friday, August 9, 2019 9:39 AM, Jan Beulich <jbeulich@suse.com> wrote:
> >On 09.08.2019 14:14, Julien Grall wrote:
> >> Combining of buddies happens only such that the resulting larger buddy
> >> is still order-aligned. To cross a zone boundary while merging, the
> >> implication is that both the buddy [0, 2^n-1] and the buddy
> >> [2^n, 2^(n+1)] are free.
> >
> >[2^n, 2^(n+1)-1]
> >
> >You may want to add that merging across zone boundaries is what we
> >need to prevent.
> >
> >> Ideally we want to fix the allocator, but for now we can just prevent
> >> adding the MFN 0 in the allocator.
> >>
> >> On x86, the MFN 0 is already kept away from the buddy allocator. So the
> >> bug can only happen on Arm platform where the first memory bank is
> >> starting at 0.
> >>
> >> As this is a specific to the allocator, the MFN 0 is removed in the common code
> >> to cater all the architectures (current and future).
> >>
> >> Reported-by: Jeff Kubascik <jeff.kubascik@dornerworks.com>
> >> Signed-off-by: Julien Grall <julien.grall@arm.com>
> >
> >Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> Here is Jeff's initial patch for the issue.

I committed Julien's patch for now, but if we need to make any changes
or decide for a better alternative, we can always revert it.
Stewart Hildebrand Aug. 9, 2019, 6:34 p.m. UTC | #5
On Friday, August 9, 2019 2:24 PM, Stefano Stabellini <sstabellini@kernel.org>
>On Fri, 9 Aug 2019, Stewart Hildebrand wrote:
>> Here is Jeff's initial patch for the issue.
>
>I committed Julien's patch for now,

Great! Thanks!

>but if we need to make any changes
>or decide for a better alternative, we can always revert it.

Can we entertain committing both patches?
To paraphrase George from an earlier discussion: Removing MFN 0 fixes the issue by relying on side effects. Adding an explicit check in the merging logic directly fixes the issue.
Julien Grall Aug. 9, 2019, 6:35 p.m. UTC | #6
Hi Stewart,

On 09/08/2019 19:34, Stewart Hildebrand wrote:
> On Friday, August 9, 2019 2:24 PM, Stefano Stabellini <sstabellini@kernel.org>
>> On Fri, 9 Aug 2019, Stewart Hildebrand wrote:
>>> Here is Jeff's initial patch for the issue.
>>
>> I committed Julien's patch for now,
> 
> Great! Thanks!
> 
>> but if we need to make any changes
>> or decide for a better alternative, we can always revert it.
> 
> Can we entertain committing both patches?

That's you to post formally the patch on the ML ;). The way you posted is likely 
going to be unnoticed as you hijack the discussion (subject title as not be 
changed).
Jan Beulich Aug. 12, 2019, 6:01 a.m. UTC | #7
On 09.08.2019 20:15, Stewart Hildebrand wrote:
> On Friday, August 9, 2019 9:39 AM, Jan Beulich <jbeulich@suse.com> wrote:
>> On 09.08.2019 14:14, Julien Grall wrote:
>>> Combining of buddies happens only such that the resulting larger buddy
>>> is still order-aligned. To cross a zone boundary while merging, the
>>> implication is that both the buddy [0, 2^n-1] and the buddy
>>> [2^n, 2^(n+1)] are free.
>>
>> [2^n, 2^(n+1)-1]
>>
>> You may want to add that merging across zone boundaries is what we
>> need to prevent.
>>
>>> Ideally we want to fix the allocator, but for now we can just prevent
>>> adding the MFN 0 in the allocator.
>>>
>>> On x86, the MFN 0 is already kept away from the buddy allocator. So the
>>> bug can only happen on Arm platform where the first memory bank is
>>> starting at 0.
>>>
>>> As this is a specific to the allocator, the MFN 0 is removed in the common code
>>> to cater all the architectures (current and future).
>>>
>>> Reported-by: Jeff Kubascik <jeff.kubascik@dornerworks.com>
>>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>>
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> Here is Jeff's initial patch for the issue.

To be honest, it would have been nice if you had clarified _why_
you sent this in reply here.

Jan
Jan Beulich Aug. 12, 2019, 6:16 a.m. UTC | #8
On 09.08.2019 20:34, Stewart Hildebrand wrote:
> On Friday, August 9, 2019 2:24 PM, Stefano Stabellini <sstabellini@kernel.org>
>> On Fri, 9 Aug 2019, Stewart Hildebrand wrote:
>>> Here is Jeff's initial patch for the issue.
>>
>> I committed Julien's patch for now,
> 
> Great! Thanks!
> 
>> but if we need to make any changes
>> or decide for a better alternative, we can always revert it.
> 
> Can we entertain committing both patches?
> To paraphrase George from an earlier discussion: Removing MFN 0 fixes the issue by relying on side effects. Adding an explicit check in the merging logic directly fixes the issue.
> 

I thought previous discussion (when you had first posted you variant
of the fix) had clarified that there are objections to you modifying
an often executed code path when the same effect can be achieved by
modifying an infrequently executed one.

Jan
diff mbox series

Patch

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 453b303b5b..42b8a8ce23 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1759,6 +1759,18 @@  static void init_heap_pages(
     bool idle_scrub = false;
 
     /*
+     * Keep MFN 0 away from the buddy allocator to avoid crossing zone
+     * boundary when merging two buddies.
+     */
+    if ( !mfn_x(page_to_mfn(pg)) )
+    {
+        if ( nr_pages-- <= 1 )
+            return;
+        pg++;
+    }
+
+
+    /*
      * Some pages may not go through the boot allocator (e.g reserved
      * memory at boot but released just after --- kernel, initramfs,
      * etc.).