From patchwork Thu Jul 20 11:40:00 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 108325
Delivered-To: patch@linaro.org
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: herbert@gondor.apana.org.au, giovanni.cabiddu@intel.com,
 Ard Biesheuvel
Subject: [PATCH] crypto: scompress - eliminate percpu scratch buffers
Date: Thu, 20 Jul 2017 12:40:00 +0100
Message-Id: <20170720114000.13840-1-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.9.3
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

The scompress code unconditionally allocates 2 per-CPU scratch buffers
of 128 KB each, in order to avoid allocation overhead in the async
wrapper that encapsulates the synchronous compression algorithm, since
it may execute in atomic context.

There are a couple of issues with this:

- The code may be invoked in process context with
  CRYPTO_TFM_REQ_MAY_SLEEP set in the request flags. If the
  CRYPTO_ACOMP_ALLOC_OUTPUT flag is also set, we end up calling
  kmalloc_array() and alloc_page() with the GFP flags set to GFP_KERNEL,
  and so we may attempt to sleep even though we are running with
  preemption disabled (due to the use of get_cpu())

- On a system such as Cavium ThunderX, which has 96 cores, the per-CPU
  buffers waste 24 MB of memory.

- On 32-bit systems, claiming vmalloc space of this order (e.g., 1 MB on
  a 4-core system) is undesirable as well, given how small the vmalloc
  space typically is on such systems.

- Lengthy CPU bound tasks such as compression should not be run with
  preemption disabled if we can avoid it.

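For reference, the footprint figures quoted in the list above follow from
two 128 KB scratch buffers (one source, one destination) per possible CPU.
A minimal userspace sketch of that arithmetic; the helper name
scratch_footprint() and the two macros are made up for illustration and
do not exist in the kernel:

```c
#include <assert.h>
#include <stddef.h>

/* Size of one scratch buffer: the 128 KB figure quoted above. */
#define SCRATCH_SIZE (128u * 1024u)
/* One source and one destination scratch buffer per CPU. */
#define BUFFERS_PER_CPU 2u

/*
 * Hypothetical helper (not a kernel function): total number of bytes
 * held by the per-CPU scratch buffers for a given possible-CPU count.
 */
static size_t scratch_footprint(unsigned int possible_cpus)
{
	assert(possible_cpus > 0);
	return (size_t)possible_cpus * BUFFERS_PER_CPU * SCRATCH_SIZE;
}
```

Under these assumptions, scratch_footprint(96) gives 24 MB and
scratch_footprint(4) gives 1 MB, matching the numbers above.
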
So replace the per-CPU buffers with on-demand page allocations in the
handler. This reduces the memory and vmalloc space footprint of this
driver to ~0 unless you actually use it.

Signed-off-by: Ard Biesheuvel
---
The req->dst case could be further optimized, and reuse the allocated
scratch_dst buffer as the buffer that is returned to the caller. Given
that there are no occurrences of crypto_alloc_acomp() anywhere in the
kernel, I will leave that to the next person.

 crypto/scompress.c | 126 ++++----------------
 1 file changed, 22 insertions(+), 104 deletions(-)

-- 
2.9.3

diff --git a/crypto/scompress.c b/crypto/scompress.c
index ae1d3cf209e4..3d4c35a4c5a8 100644
--- a/crypto/scompress.c
+++ b/crypto/scompress.c
@@ -30,10 +30,6 @@
 #include "internal.h"
 
 static const struct crypto_type crypto_scomp_type;
-static void * __percpu *scomp_src_scratches;
-static void * __percpu *scomp_dst_scratches;
-static int scomp_scratch_users;
-static DEFINE_MUTEX(scomp_lock);
 
 #ifdef CONFIG_NET
 static int crypto_scomp_report(struct sk_buff *skb, struct crypto_alg *alg)
@@ -70,67 +66,6 @@ static int crypto_scomp_init_tfm(struct crypto_tfm *tfm)
 	return 0;
 }
 
-static void crypto_scomp_free_scratches(void * __percpu *scratches)
-{
-	int i;
-
-	if (!scratches)
-		return;
-
-	for_each_possible_cpu(i)
-		vfree(*per_cpu_ptr(scratches, i));
-
-	free_percpu(scratches);
-}
-
-static void * __percpu *crypto_scomp_alloc_scratches(void)
-{
-	void * __percpu *scratches;
-	int i;
-
-	scratches = alloc_percpu(void *);
-	if (!scratches)
-		return NULL;
-
-	for_each_possible_cpu(i) {
-		void *scratch;
-
-		scratch = vmalloc_node(SCOMP_SCRATCH_SIZE, cpu_to_node(i));
-		if (!scratch)
-			goto error;
-		*per_cpu_ptr(scratches, i) = scratch;
-	}
-
-	return scratches;
-
-error:
-	crypto_scomp_free_scratches(scratches);
-	return NULL;
-}
-
-static void crypto_scomp_free_all_scratches(void)
-{
-	if (!--scomp_scratch_users) {
-		crypto_scomp_free_scratches(scomp_src_scratches);
-		crypto_scomp_free_scratches(scomp_dst_scratches);
-		scomp_src_scratches = NULL;
-		scomp_dst_scratches = NULL;
-	}
-}
-
-static int crypto_scomp_alloc_all_scratches(void)
-{
-	if (!scomp_scratch_users++) {
-		scomp_src_scratches = crypto_scomp_alloc_scratches();
-		if (!scomp_src_scratches)
-			return -ENOMEM;
-		scomp_dst_scratches = crypto_scomp_alloc_scratches();
-		if (!scomp_dst_scratches)
-			return -ENOMEM;
-	}
-	return 0;
-}
-
 static void crypto_scomp_sg_free(struct scatterlist *sgl)
 {
 	int i, n;
@@ -184,24 +119,31 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 	void **tfm_ctx = acomp_tfm_ctx(tfm);
 	struct crypto_scomp *scomp = *tfm_ctx;
 	void **ctx = acomp_request_ctx(req);
-	const int cpu = get_cpu();
-	u8 *scratch_src = *per_cpu_ptr(scomp_src_scratches, cpu);
-	u8 *scratch_dst = *per_cpu_ptr(scomp_dst_scratches, cpu);
+	u8 *scratch_src;
+	u8 *scratch_dst;
+	int alloc_order;
+	gfp_t gfp;
 	int ret;
 
-	if (!req->src || !req->slen || req->slen > SCOMP_SCRATCH_SIZE) {
-		ret = -EINVAL;
-		goto out;
-	}
+	if (!req->src || !req->slen || req->slen > SCOMP_SCRATCH_SIZE)
+		return -EINVAL;
 
-	if (req->dst && !req->dlen) {
-		ret = -EINVAL;
-		goto out;
-	}
+	if (req->dst && !req->dlen)
+		return -EINVAL;
 
 	if (!req->dlen || req->dlen > SCOMP_SCRATCH_SIZE)
 		req->dlen = SCOMP_SCRATCH_SIZE;
 
+	gfp = req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP ? GFP_KERNEL
+							 : GFP_ATOMIC;
+
+	alloc_order = get_order(PAGE_ALIGN(req->slen) + req->dlen);
+	scratch_src = (u8 *)__get_free_pages(gfp, alloc_order);
+	if (!scratch_src)
+		return -ENOMEM;
+
+	scratch_dst = scratch_src + PAGE_ALIGN(req->slen);
+
 	scatterwalk_map_and_copy(scratch_src, req->src, 0, req->slen, 0);
 	if (dir)
 		ret = crypto_scomp_compress(scomp, scratch_src, req->slen,
@@ -211,9 +153,7 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 						    scratch_dst, &req->dlen, *ctx);
 	if (!ret) {
 		if (!req->dst) {
-			req->dst = crypto_scomp_sg_alloc(req->dlen,
-				   req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP ?
-				   GFP_KERNEL : GFP_ATOMIC);
+			req->dst = crypto_scomp_sg_alloc(req->dlen, gfp);
 			if (!req->dst)
 				goto out;
 		}
@@ -221,7 +161,7 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 					 1);
 	}
 out:
-	put_cpu();
+	free_pages((unsigned long)scratch_src, alloc_order);
 	return ret;
 }
 
@@ -316,40 +256,18 @@ static const struct crypto_type crypto_scomp_type = {
 int crypto_register_scomp(struct scomp_alg *alg)
 {
 	struct crypto_alg *base = &alg->base;
-	int ret = -ENOMEM;
-
-	mutex_lock(&scomp_lock);
-	if (crypto_scomp_alloc_all_scratches())
-		goto error;
 
 	base->cra_type = &crypto_scomp_type;
 	base->cra_flags &= ~CRYPTO_ALG_TYPE_MASK;
 	base->cra_flags |= CRYPTO_ALG_TYPE_SCOMPRESS;
 
-	ret = crypto_register_alg(base);
-	if (ret)
-		goto error;
-
-	mutex_unlock(&scomp_lock);
-	return ret;
-
-error:
-	crypto_scomp_free_all_scratches();
-	mutex_unlock(&scomp_lock);
-	return ret;
+	return crypto_register_alg(base);
 }
 EXPORT_SYMBOL_GPL(crypto_register_scomp);
 
 int crypto_unregister_scomp(struct scomp_alg *alg)
 {
-	int ret;
-
-	mutex_lock(&scomp_lock);
-	ret = crypto_unregister_alg(&alg->base);
-	crypto_scomp_free_all_scratches();
-	mutex_unlock(&scomp_lock);
-
-	return ret;
+	return crypto_unregister_alg(&alg->base);
 }
 EXPORT_SYMBOL_GPL(crypto_unregister_scomp);
 
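
To make the sizing in scomp_acomp_comp_decomp() easier to check by hand,
here is a rough userspace sketch of how the single allocation is laid
out: the source area is rounded up to a page boundary, the destination
area starts right after it, and the page order covers both. The sketch
assumes 4 KB pages. All sketch_* and scratch_* names are hypothetical
stand-ins that only approximate the kernel's PAGE_ALIGN() and
get_order(); they are not the kernel implementations.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: 4 KB pages (PAGE_SHIFT == 12). */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Round len up to the next page boundary, in the spirit of PAGE_ALIGN(). */
static size_t sketch_page_align(size_t len)
{
	return (len + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Smallest order with (page size << order) >= size, like get_order(). */
static unsigned int sketch_get_order(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Order of the one allocation that holds both scratch areas. */
static unsigned int scratch_alloc_order(size_t slen, size_t dlen)
{
	return sketch_get_order(sketch_page_align(slen) + dlen);
}

/* Byte offset of the destination area within that allocation. */
static size_t scratch_dst_offset(size_t slen)
{
	return sketch_page_align(slen);
}
```

Under these assumptions, a 100-byte source with a 4096-byte destination
needs an order-1 allocation, while the worst case (slen and dlen both
equal to 128 KB) needs order 6, i.e. 256 KB of physically contiguous
memory per request.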