From patchwork Tue Aug 28 21:00:06 2018
X-Patchwork-Submitter: Github ODP bot
X-Patchwork-Id: 145371
From: Github ODP bot
To: lng-odp@lists.linaro.org
Date: Tue, 28 Aug 2018 21:00:06 +0000
Message-Id: <1535490006-30683-2-git-send-email-odpbot@yandex.ru>
In-Reply-To: <1535490006-30683-1-git-send-email-odpbot@yandex.ru>
References: <1535490006-30683-1-git-send-email-odpbot@yandex.ru>
Github-pr-num: 685
Subject: [lng-odp] [PATCH v2 1/1] linux-gen: ishm: implement huge page cache

From: Josep Puigdemont

With this patch, ODP will pre-allocate several huge pages at init time.
When memory is to be mapped into a huge page, a pre-allocated one will
be used if available, so ODP won't have to trap into the kernel to
allocate huge pages.

The idea with this implementation is to trick ishm into thinking that a
file descriptor to map the memory into was provided, so that it won't
try to allocate one itself. This file descriptor is one of those
previously allocated at init time. When the system is done with this
file descriptor, instead of closing it, it is put back into the list of
available huge pages, ready to be reused.

A collateral effect of this patch is that memory is not zeroed out when
it is reused.

WARNING: This patch will not work when using process mode threads. For
several reasons, this may not work when using ODP_ISHM_SINGLE_VA
either, so for this case the list of pre-allocated files is not used.

This patch should mitigate, if not solve, bug #3774:
https://bugs.linaro.org/show_bug.cgi?id=3774

To pre-allocate huge pages, define the environment variable
ODP_HP_CACHE and, optionally, set it to the number of huge pages that
should be pre-allocated. Setting it to -1 will reserve up to 32 huge
pages, which is currently a hard-coded limit.

Example usage:

  ODP_HP_CACHE=-1 ./test/validation/api/shmem/shmem_main

Signed-off-by: Josep Puigdemont
---
/** Email created from pull request 685 (joseppc:fix/cache_huge_pages)
 ** https://github.com/Linaro/odp/pull/685
 ** Patch: https://github.com/Linaro/odp/pull/685.patch
 ** Base sha: 96e6c6409bfe8e5f276f136c4e10454f112cd662
 ** Merge commit sha: fef12aa722655975b08b8f82c4cf4081abc6c072
 **/
 platform/linux-generic/odp_ishm.c | 206 ++++++++++++++++++++++++++++--
 1 file changed, 192 insertions(+), 14 deletions(-)

diff --git a/platform/linux-generic/odp_ishm.c b/platform/linux-generic/odp_ishm.c
index fc2f948cc..1de116ed8 100644
--- a/platform/linux-generic/odp_ishm.c
+++ b/platform/linux-generic/odp_ishm.c
@@ -164,7 +164,7 @@ typedef struct ishm_fragment {
  * will allocate both a block and a fragment.
  * Blocks contain only global data common to all processes.
  */
-typedef enum {UNKNOWN, HUGE, NORMAL, EXTERNAL} huge_flag_t;
+typedef enum {UNKNOWN, HUGE, NORMAL, EXTERNAL, CACHED} huge_flag_t;
 typedef struct ishm_block {
 	char name[ISHM_NAME_MAXLEN];    /* name for the ishm block (if any) */
 	char filename[ISHM_FILENAME_MAXLEN]; /* name of the .../odp-* file  */
@@ -238,6 +238,17 @@ typedef struct {
 } ishm_ftable_t;
 static ishm_ftable_t *ishm_ftbl;
 
+#define HP_CACHE_SIZE 32
+struct huge_page_cache {
+	uint64_t len;
+	int total;     /* index in fd that's the highest allocated */
+	int idx;       /* retrieve fd[idx] to get a free file descriptor */
+	unsigned int seq_num;
+	int fd[HP_CACHE_SIZE]; /* list of file descriptors */
+};
+
+static struct huge_page_cache hpc;
+
 #ifndef MAP_ANONYMOUS
 #define MAP_ANONYMOUS MAP_ANON
 #endif
@@ -245,6 +256,132 @@ static ishm_ftable_t *ishm_ftbl;
 /* prototypes: */
 static void procsync(void);
 
+static int hp_create_file(uint64_t len, unsigned int seq_num)
+{
+	char filename[ISHM_FILENAME_MAXLEN];
+	char dir[ISHM_FILENAME_MAXLEN];
+	int fd;
+	void *addr;
+
+	if (len <= 0) {
+		ODP_ERR("Length is wrong\n");
+		return -1;
+	}
+
+	if (!odp_global_data.hugepage_info.default_huge_page_dir) {
+		ODP_ERR("No huge page dir\n");
+		return -1;
+	}
+
+	snprintf(dir, ISHM_FILENAME_MAXLEN, "%s/%s",
+		 odp_global_data.hugepage_info.default_huge_page_dir,
+		 odp_global_data.uid);
+
+	if (mkdir(dir, 0744) != 0) {
+		if (errno != EEXIST) {
+			ODP_ERR("Failed to create dir: %s\n", strerror(errno));
+			return -1;
+		}
+	}
+
+	snprintf(filename, ISHM_FILENAME_MAXLEN,
+		 "%s/odp-%d-ishm_cached-%04x",
+		 dir,
+		 odp_global_data.main_pid,
+		 seq_num++);
+
+	fd = open(filename, O_RDWR | O_CREAT | O_TRUNC,
+		  S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
+	if (fd < 0) {
+		ODP_ERR("Could not create cache file %s\n", filename);
+		return -1;
+	}
+
+	/* remove file from file system */
+	unlink(filename);
+
+	if (ftruncate(fd, len) == -1) {
+		ODP_ERR("Could not truncate file: %s\n", strerror(errno));
+		close(fd);
+		return -1;
+	}
+
+	/* commit huge page */
+	addr = _odp_ishmphy_map(fd, NULL, len, 0);
+	if (addr == NULL) {
+		/* no more pages available */
+		close(fd);
+		return -1;
+	}
+	_odp_ishmphy_unmap(addr, len, 0);
+
+	ODP_DBG("Created HP cache file %s, fd: %d\n", filename, fd);
+
+	return fd;
+}
+
+static void hp_init(void)
+{
+	char *odp_hp_env;
+	char *ptr;
+	int count;
+
+	hpc.seq_num = 0;
+	hpc.total = -1;
+	hpc.idx = -1;
+	hpc.len = odp_sys_huge_page_size();
+
+	odp_hp_env = getenv("ODP_HP_CACHE");
+	if (odp_hp_env == NULL)
+		return;
+
+	ODP_DBG("Init HP cache\n");
+
+	count = strtol(odp_hp_env, &ptr, 10);
+	if (ptr == odp_hp_env || *ptr != '\0' ||
+	    count < 0 || count > HP_CACHE_SIZE)
+		count = HP_CACHE_SIZE;
+
+	for (int i = 0; i < count; ++i) {
+		int fd;
+
+		fd = hp_create_file(hpc.len, hpc.seq_num++);
+		if (fd == -1)
+			break;
+		hpc.total++;
+		hpc.idx++;
+		hpc.fd[hpc.idx] = fd;
+	}
+
+	ODP_DBG("HP cache has %d huge pages of size 0x%08" PRIx64 "\n",
+		hpc.total + 1, hpc.len);
+}
+
+static int hp_get_cached(uint64_t len)
+{
+	int fd;
+
+	if (hpc.idx < 0 || len != hpc.len)
+		return -1;
+
+	fd = hpc.fd[hpc.idx];
+	hpc.fd[hpc.idx--] = -1;
+
+	return fd;
+}
+
+static int hp_put_cached(int fd)
+{
+	if (hpc.idx > hpc.total) {
+		ODP_ERR("Trying to put more FD than allowed: %d\n", fd);
+		return -1;
+	}
+
+	hpc.fd[++hpc.idx] = fd;
+
+	return 0;
+}
+
 /*
  * Take a piece of the preallocated virtual space to fit "size" bytes.
  * (best fit). Size must be rounded up to an integer number of pages size.
@@ -798,8 +935,14 @@ static int block_free_internal(int block_index, int close_fd, int deregister)
 			block_index);
 
 	/* close the related fd */
-	if (close_fd)
-		close(ishm_proctable->entry[proc_index].fd);
+	if (close_fd) {
+		int fd = ishm_proctable->entry[proc_index].fd;
+
+		if (block->huge == CACHED)
+			hp_put_cached(fd);
+		else
+			close(fd);
+	}
 
 	/* remove entry from process local table: */
 	last = ishm_proctable->nb_entries - 1;
@@ -910,6 +1053,7 @@ int _odp_ishm_reserve(const char *name, uint64_t size, int fd,
 		new_block->huge = EXTERNAL;
 	} else {
 		new_block->external_fd = 0;
+		new_block->huge = UNKNOWN;
 	}
 
 	/* Otherwise, Try first huge pages when possible and needed: */
@@ -927,17 +1071,38 @@ int _odp_ishm_reserve(const char *name, uint64_t size, int fd,
 		/* roundup to page size */
 		len = (size + (page_hp_size - 1)) & (-page_hp_size);
 
-		addr = do_map(new_index, len, hp_align, flags, HUGE, &fd);
-
-		if (addr == NULL) {
-			if (!huge_error_printed) {
-				ODP_ERR("No huge pages, fall back to normal "
-					"pages. "
-					"check: /proc/sys/vm/nr_hugepages.\n");
-				huge_error_printed = 1;
+		if (!(flags & _ODP_ISHM_SINGLE_VA)) {
+			/* try pre-allocated pages */
+			fd = hp_get_cached(len);
+			if (fd != -1) {
+				/* do as if user provided a fd */
+				new_block->external_fd = 1;
+				addr = do_map(new_index, len, hp_align, flags,
+					      CACHED, &fd);
+				if (addr == NULL) {
+					ODP_ERR("Could not use cached hp %d\n",
+						fd);
+					hp_put_cached(fd);
+					fd = -1;
+				} else {
+					new_block->huge = CACHED;
+				}
+			}
+		}
+		if (fd == -1) {
+			addr = do_map(new_index, len, hp_align, flags, HUGE,
+				      &fd);
+
+			if (addr == NULL) {
+				if (!huge_error_printed) {
+					ODP_ERR("No huge pages, fall back to "
+						"normal pages. Check: "
+						"/proc/sys/vm/nr_hugepages.\n");
+					huge_error_printed = 1;
+				}
+			} else {
+				new_block->huge = HUGE;
 			}
-		} else {
-			new_block->huge = HUGE;
 		}
 	}
 
@@ -961,8 +1126,12 @@ int _odp_ishm_reserve(const char *name, uint64_t size, int fd,
 
 	/* if neither huge pages or normal pages works, we cannot proceed: */
 	if ((fd < 0) || (addr == NULL) || (len == 0)) {
-		if ((!new_block->external_fd) && (fd >= 0))
+		if (new_block->external_fd) {
+			if (new_block->huge == CACHED)
+				hp_put_cached(fd);
+		} else if (fd >= 0) {
 			close(fd);
+		}
 		delete_file(new_block);
 		odp_spinlock_unlock(&ishm_tbl->lock);
 		ODP_ERR("_ishm_reserve failed.\n");
@@ -1564,6 +1733,9 @@ int _odp_ishm_init_global(const odp_init_t *init)
 	/* get ready to create pools: */
 	_odp_ishm_pool_init();
 
+	/* init cache files */
+	hp_init();
+
 	return 0;
 
 init_glob_err4:
@@ -1776,6 +1948,9 @@ int _odp_ishm_status(const char *title)
 		case EXTERNAL:
 			huge = 'E';
 			break;
+		case CACHED:
+			huge = 'C';
+			break;
 		default:
 			huge = '?';
 		}
@@ -1896,6 +2071,9 @@ void _odp_ishm_print(int block_index)
 	case EXTERNAL:
 		str = "external";
 		break;
+	case CACHED:
+		str = "cached";
+		break;
 	default:
 		str = "??";
 	}
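
For reference, the descriptor cache added by this patch behaves like a small LIFO stack: hp_get_cached() pops a descriptor and hp_put_cached() pushes it back. The following is a minimal standalone sketch of that behaviour, not the patch code itself. The names struct fd_cache, parse_count, cache_init, cache_get, cache_put and cache_demo are hypothetical, the descriptors are plain integers rather than real huge page files, and total is kept as a count instead of the highest allocated index. The sketch also assumes that a put should be refused once every slot is occupied, which is a stricter check than the one in hp_put_cached().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_SIZE 32 /* same limit as HP_CACHE_SIZE in the patch */

/* Hypothetical stand-in for struct huge_page_cache. */
struct fd_cache {
	uint64_t len;       /* size of every cached mapping */
	int total;          /* number of descriptors owned by the cache */
	int idx;            /* top of the stack, -1 when empty */
	int fd[CACHE_SIZE];
};

/* Same rule as hp_init(): unparsable, negative or too large selects the maximum. */
int parse_count(const char *s)
{
	char *end;
	long count = strtol(s, &end, 10);

	if (end == s || *end != '\0' || count < 0 || count > CACHE_SIZE)
		return CACHE_SIZE;
	return (int)count;
}

/* Fill the cache with fake descriptors 100, 101, ... (no real files). */
void cache_init(struct fd_cache *c, uint64_t len, int count)
{
	if (count < 0)
		count = 0;
	if (count > CACHE_SIZE)
		count = CACHE_SIZE;

	c->len = len;
	c->total = count;
	c->idx = -1;
	for (int i = 0; i < count; i++)
		c->fd[++c->idx] = 100 + i;
}

/* Pop a descriptor; only requests of exactly one page can be served. */
int cache_get(struct fd_cache *c, uint64_t len)
{
	int fd;

	if (c->idx < 0 || len != c->len)
		return -1;

	fd = c->fd[c->idx];
	c->fd[c->idx--] = -1;
	return fd;
}

/* Push a descriptor back; refuse it when the cache is already full. */
int cache_put(struct fd_cache *c, int fd)
{
	if (c->idx + 1 >= c->total)
		return -1;

	c->fd[++c->idx] = fd;
	return 0;
}

/* Walk through one reserve/free cycle with two cached pages. */
int cache_demo(void)
{
	struct fd_cache c;
	const uint64_t page = 2 * 1024 * 1024; /* assumed 2 MB huge page */

	cache_init(&c, page, parse_count("2"));

	int a = cache_get(&c, page);         /* last descriptor added */
	int b = cache_get(&c, page);         /* first descriptor added */

	assert(a == 101 && b == 100);
	assert(cache_get(&c, page) == -1);   /* cache is now empty */
	assert(cache_put(&c, b) == 0);
	assert(cache_put(&c, a) == 0);
	assert(cache_put(&c, 999) == -1);    /* full: nothing left to return */
	assert(cache_get(&c, 2 * page) == -1); /* size mismatch is a miss */
	return 0;
}
```

A cache miss (return value -1) corresponds to the path in _odp_ishm_reserve() that falls back to do_map() with HUGE, and a request whose rounded length differs from one huge page always misses.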