[v2] util: align memory allocations to 2M on AArch64

Message ID 1461323529-30724-1-git-send-email-christoffer.dall@linaro.org
State Accepted
Commit ee1e0f8e5d3682c561edcdceccff72b9d9b16d8b
Commit Message

Christoffer Dall April 22, 2016, 11:12 a.m. UTC
For KVM to use Transparent Huge Pages (THP) we have to ensure that the
alignment of the userspace address of the KVM memory slot and the IPA
that the guest sees for a memory region have the same offset from the 2M
huge page size boundary.
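
The "same offset" requirement above can be sketched as a small check. This is only an illustration: the helper name same_offset and its parameters (hva for the userspace address, ipa for the guest address) are hypothetical and do not come from QEMU or KVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a QEMU or KVM function.
 * Returns true when hva and ipa sit at the same offset from the
 * preceding 'align' boundary, i.e. they are congruent modulo 'align'.
 * 'align' must be a power of two (2 MiB for the case discussed here).
 */
static inline bool same_offset(uint64_t hva, uint64_t ipa, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* The low bits of both addresses match iff their XOR has none set. */
    return ((hva ^ ipa) & (align - 1)) == 0;
}
```

With a 2 MiB alignment, a userspace address of 0x7f0000201000 and an IPA of 0x40001000 pass the check (both are 0x1000 past a 2M boundary), while the same userspace address paired with an IPA of 0x40000000 does not.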

One way to achieve this is to always align the IPA region at a 2M
boundary and ensure that the mmap alignment is also at 2M.
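
As a rough sketch of what "ensure that the mmap alignment is also at 2M" can look like, the snippet below over-reserves an anonymous mapping and trims it to an aligned start. It is not the code in util/oslib-posix.c; the function name mmap_aligned and the macro ALIGN_2M are made up for this example, and it assumes a Linux-like system where MAP_ANONYMOUS is available and 'align' is at least the page size.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define ALIGN_2M ((size_t)2 << 20) /* hypothetical name for the 2 MiB alignment */

/*
 * Illustrative only: map 'size' bytes of anonymous memory whose start
 * address is a multiple of 'align' (a power of two, >= the page size).
 * Returns NULL on failure.
 */
static void *mmap_aligned(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    /* Round the size up to whole pages so the trimmed tail stays page-aligned. */
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size = (size + page - 1) & ~(page - 1);

    /* Over-reserve by 'align' so an aligned start must exist in the range. */
    size_t total = size + align;
    if (total < size) {
        return NULL; /* size_t overflow */
    }

    uint8_t *raw = mmap(NULL, total, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) {
        return NULL;
    }

    uintptr_t start = ((uintptr_t)raw + align - 1) & ~((uintptr_t)align - 1);
    size_t head = start - (uintptr_t)raw; /* unused bytes before the aligned start */
    size_t tail = total - head - size;    /* unused bytes after the aligned region */

    if (head != 0) {
        munmap(raw, head);
    }
    if (tail != 0) {
        munmap((uint8_t *)start + size, tail);
    }
    return (void *)start;
}
```

A caller would request guest RAM with mmap_aligned(ram_size, ALIGN_2M) and, provided the IPA of the region is also 2M-aligned, both addresses then share offset zero from the huge page boundary.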

Unfortunately, we were only doing this for __arm__, not for __aarch64__,
so add this simple condition.

This fixes a performance regression using KVM/ARM on AArch64 platforms
that showed a performance penalty of more than 50%, introduced by the
following commit:

9fac18f (oslib: allocate PROT_NONE pages on top of RAM, 2015-09-10)

We were only lucky before the above commit, because we were allocating
large regions and naturally getting a 2M alignment on those allocations
then.

Reported-by: Shih-Wei Li <shihwei@cs.columbia.edu>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

---
The first version of this patch was accidentally made against the v2.5.0
release instead of master, so this is a rebased version.

 util/oslib-posix.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.1.2.330.g565301e.dirty

Comments

Peter Maydell April 22, 2016, 11:58 a.m. UTC | #1
On 22 April 2016 at 12:12, Christoffer Dall <christoffer.dall@linaro.org> wrote:
> For KVM to use Transparent Huge Pages (THP) we have to ensure that the
> alignment of the userspace address of the KVM memory slot and the IPA
> that the guest sees for a memory region have the same offset from the 2M
> huge page size boundary.
>
> One way to achieve this is to always align the IPA region at a 2M
> boundary and ensure that the mmap alignment is also at 2M.
>
> Unfortunately, we were only doing this for __arm__, not for __aarch64__,
> so add this simple condition.
>
> This fixes a performance regression using KVM/ARM on AArch64 platforms
> that showed a performance penalty of more than 50%, introduced by the
> following commit:
>
> 9fac18f (oslib: allocate PROT_NONE pages on top of RAM, 2015-09-10)
>
> We were only lucky before the above commit, because we were allocating
> large regions and naturally getting a 2M alignment on those allocations
> then.
>
> Reported-by: Shih-Wei Li <shihwei@cs.columbia.edu>
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> ---
> The first version of this patch was accidentally made against the v2.5.0
> release instead of master, so this is a rebased version.


Thanks; applied to master (with the long line wrapped).

-- PMM
Patch

diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 20ca141..a0c5b91 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -26,7 +26,7 @@ 
  * THE SOFTWARE.
  */
 
-#if defined(__linux__) && (defined(__x86_64__) || defined(__arm__))
+#if defined(__linux__) && (defined(__x86_64__) || defined(__arm__) || defined(__aarch64__))
    /* Use 2 MiB alignment so transparent hugepages can be used by KVM.
       Valgrind does not support alignments larger than 1 MiB,
       therefore we need special code which handles running on Valgrind. */