From patchwork Tue Jul 24 17:12:21 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 142818
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: herbert@gondor.apana.org.au, will.deacon@arm.com, dave.martin@arm.com, vakul.garg@nxp.com, bigeasy@linutronix.de, Ard Biesheuvel
Subject: [PATCH 1/4] crypto/arm64: ghash - reduce performance impact of NEON yield checks
Date: Tue, 24 Jul 2018 19:12:21 +0200
Message-Id: <20180724171224.17363-2-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20180724171224.17363-1-ard.biesheuvel@linaro.org>
References: <20180724171224.17363-1-ard.biesheuvel@linaro.org>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

As reported by Vakul, checking the TIF_NEED_RESCHED flag after every
iteration of the GHASH and AES-GCM core routines is having a considerable
performance impact on cores such as the Cortex-A53 with Crypto Extensions
implemented.

GHASH performance is down by 22% for large block sizes, and AES-GCM is
down by 16% for large block sizes and 128 bit keys. This appears to be
a result of the high performance of the crypto instructions on the one
hand (2.0 cycles per byte for GHASH, 3.0 cpb for AES-GCM), combined with
the relatively poor load/store performance of this simple core.

So let's reduce this performance impact by only doing the yield check
once every 32 blocks for GHASH (or 4 when using the version based on
8-bit polynomial multiplication), and once every 16 blocks for AES-GCM.
This way, we recover most of the performance while still limiting the
duration of scheduling blackouts due to disabling preemption to ~1000
cycles.

Cc: Vakul Garg
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/ghash-ce-core.S | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/crypto/ghash-ce-core.S b/arch/arm64/crypto/ghash-ce-core.S
index dcffb9e77589..9c14beaabeee 100644
--- a/arch/arm64/crypto/ghash-ce-core.S
+++ b/arch/arm64/crypto/ghash-ce-core.S
@@ -212,7 +212,7 @@
 	ushr		XL.2d, XL.2d, #1
 	.endm
 
-	.macro		__pmull_ghash, pn
+	.macro		__pmull_ghash, pn, yield_count
 	frame_push	5
 
 	mov		x19, x0
@@ -259,6 +259,9 @@ CPU_LE(	rev64		T1.16b, T1.16b	)
 	eor		T2.16b, T2.16b, XH.16b
 	eor		XL.16b, XL.16b, T2.16b
 
+	tst		w19, #(\yield_count - 1)
+	b.ne		1b
+
 	cbz		w19, 3f
 
 	if_will_cond_yield_neon
@@ -279,11 +282,11 @@ CPU_LE(	rev64		T1.16b, T1.16b	)
 	 *			   struct ghash_key const *k, const char *head)
 	 */
 ENTRY(pmull_ghash_update_p64)
-	__pmull_ghash	p64
+	__pmull_ghash	p64, 32
 ENDPROC(pmull_ghash_update_p64)
 
 ENTRY(pmull_ghash_update_p8)
-	__pmull_ghash	p8
+	__pmull_ghash	p8, 4
 ENDPROC(pmull_ghash_update_p8)
 
 	KS		.req	v8
@@ -428,6 +431,9 @@ CPU_LE(	rev		x28, x28	)
 	st1		{INP.16b}, [x21], #16
 	.endif
 
+	tst		w19, #0xf			// do yield check only
+	b.ne		1b				// once every 16 blocks
+
 	cbz		w19, 3f
 
 	if_will_cond_yield_neon
-- 
2.11.0
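
For readers less familiar with the assembly, the throttling pattern added
in the diff (tst w19, #(\yield_count - 1); b.ne 1b; cbz w19, 3f) can be
modelled in C roughly as below. This is only an illustrative sketch, not
kernel code: count_yield_checks() and cycles_between_checks() are
hypothetical names, the model assumes w19 holds the remaining block count
after the per-iteration decrement, and the cycle estimate assumes 16-byte
blocks and ignores load/store and loop overhead.

```c
#include <assert.h>

/*
 * Model of the throttled loop: returns how many times the yield check
 * (if_will_cond_yield_neon in the patch) would be reached for a given
 * number of blocks. Hypothetical helper, for illustration only.
 */
unsigned int count_yield_checks(unsigned int blocks, unsigned int yield_count)
{
	unsigned int checks = 0;

	/* The mask test only works for power-of-two intervals. */
	assert(yield_count != 0 && (yield_count & (yield_count - 1)) == 0);

	while (blocks != 0) {
		/* ... one 16-byte block would be processed here ... */
		blocks--;

		if (blocks & (yield_count - 1))	/* tst + b.ne 1b */
			continue;
		if (blocks == 0)		/* cbz w19, 3f */
			break;
		checks++;			/* yield check would run here */
	}
	return checks;
}

/*
 * Rough estimate of the cycles spent between two yield checks, assuming
 * 16-byte blocks and a whole-number cycles-per-byte figure.
 */
unsigned int cycles_between_checks(unsigned int yield_count,
				   unsigned int cycles_per_byte)
{
	return yield_count * 16 * cycles_per_byte;
}
```

With the values from this patch, count_yield_checks(64, 32) gives 1 where
an unthrottled loop (a yield_count of 1) gives 63, and
cycles_between_checks(32, 2) and cycles_between_checks(16, 3) come to 1024
and 768 respectively, in line with the ~1000 cycle figure in the commit
message.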