[v5,13/15] dm-verity: hash blocks with shash import+finup when possible

From: Eric Biggers <ebiggers@google.com>

Currently dm-verity computes the hash of each block by using multiple
calls to the "ahash" crypto API.  While the exact sequence depends on
the chosen dm-verity settings, in the vast majority of cases it is:

    1. crypto_ahash_init()
    2. crypto_ahash_update() [salt]
    3. crypto_ahash_update() [data]
    4. crypto_ahash_final()

This is inefficient for two main reasons:

- It makes multiple indirect calls, which is expensive on modern CPUs,
  especially when mitigations for CPU vulnerabilities are enabled.

  Since the salt is the same across all blocks on a given dm-verity
  device, a much more efficient sequence would be to do an import of the
  pre-salted state, then a finup.

- It uses the ahash (asynchronous hash) API, despite the fact that
  CPU-based hashing is almost always used in practice, and therefore it
  experiences the overhead of the ahash-based wrapper for shash.

  Because dm-verity was intentionally converted to ahash to support
  off-CPU crypto accelerators, a full reversion to shash might not be
  acceptable.  Yet, we should still provide a fast path for shash with
  the most common dm-verity settings.

  Another reason for shash over ahash is that the upcoming multibuffer
  hashing support, which is specific to CPU-based hashing, is much
  better suited for shash than for ahash.  Supporting it via ahash would
  add significant complexity and overhead.  And it's not possible for
  the "same" code to properly support both multibuffer hashing and HW
  accelerators at the same time anyway, given the different computation
  models.  Unfortunately, code specific to each model will always be
  needed (for users who want to support both).

Therefore, this patch adds a new shash import+finup based fast path to
dm-verity.  It is used automatically when appropriate.  This makes
dm-verity optimized for what the vast majority of users want: CPU-based
hashing with the most common settings, while still retaining support for
rarer settings and off-CPU crypto accelerators.
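
The equivalence that the fast path relies on can be sketched in userspace.
The following is only an illustration, not the kernel code: it uses FNV-1a as
a toy stand-in for the real hash, and every name in it (toy_state, presalt,
hash_block_slow, hash_block_fast) is hypothetical.  It shows that copying a
saved post-salt state ("import") and then doing a single finup gives the same
digest as init + update(salt) + update(data) + final:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy hash state.  A real shash state is algorithm-specific and larger. */
struct toy_state {
	uint64_t h;
};

static void toy_init(struct toy_state *s)
{
	s->h = 0xcbf29ce484222325ULL;		/* FNV-1a 64-bit offset basis */
}

static void toy_update(struct toy_state *s, const uint8_t *p, size_t len)
{
	while (len--) {
		s->h ^= *p++;
		s->h *= 0x100000001b3ULL;	/* FNV-1a 64-bit prime */
	}
}

static uint64_t toy_final(const struct toy_state *s)
{
	return s->h;
}

/* finup: update and final combined into one call. */
static uint64_t toy_finup(struct toy_state *s, const uint8_t *p, size_t len)
{
	toy_update(s, p, len);
	return toy_final(s);
}

/* Slow path: four calls per block, and the salt is re-hashed every time. */
static uint64_t hash_block_slow(const uint8_t *salt, size_t salt_len,
				const uint8_t *data, size_t data_len)
{
	struct toy_state s;

	toy_init(&s);
	toy_update(&s, salt, salt_len);
	toy_update(&s, data, data_len);
	return toy_final(&s);
}

/* Done once per device: save the state reached after absorbing the salt. */
static void presalt(struct toy_state *saved,
		    const uint8_t *salt, size_t salt_len)
{
	toy_init(saved);
	toy_update(saved, salt, salt_len);
}

/* Fast path: import (copy) the saved state, then a single finup. */
static uint64_t hash_block_fast(const struct toy_state *saved,
				const uint8_t *data, size_t data_len)
{
	struct toy_state s;

	assert(saved != NULL);
	memcpy(&s, saved, sizeof(s));		/* "import" */
	return toy_finup(&s, data, data_len);
}
```

In this sketch presalt() would run once at device setup, and
hash_block_fast() once per block.  The saved state is never modified, so it
can be shared by all blocks.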

In benchmarks with veritysetup's default parameters (SHA-256, 4K data
and hash block sizes, 32-byte salt), which also match the parameters
that Android currently uses, this patch improves block hashing
performance by about 15% on x86_64 using the SHA-NI instructions, or by
about 5% on arm64 using the ARMv8 SHA2 instructions.  On x86_64 roughly
two-thirds of the improvement comes from the use of import and finup,
while the remaining third comes from the switch from ahash to shash.

Note that another benefit of using "import" to handle the salt is that
if the salt size is equal to the input size of the hash algorithm's
compression function, e.g. 64 bytes for SHA-256, then the performance is
exactly the same as with no salt.  This doesn't seem to be much better than
veritysetup's current default of 32-byte salts, due to the way SHA-256's
finalization padding works, but it should be marginally better.
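
The padding arithmetic behind this can be checked with a small calculation.
The sketch below is an illustration only, with hypothetical helper names.  It
assumes standard SHA-256 padding (one 0x80 byte plus an 8-byte length, rounded
up to a 64-byte block) and assumes that a saved pre-salted state covers only
the whole 64-byte blocks of the salt, with any remainder merely buffered:

```c
#include <assert.h>
#include <stddef.h>

#define SHA256_BLOCK_SIZE 64

/*
 * Number of compression function calls needed to hash msg_len bytes from
 * scratch, including the final padding (0x80 byte + 8-byte bit length).
 */
static size_t sha256_compressions(size_t msg_len)
{
	return (msg_len + 1 + 8 + SHA256_BLOCK_SIZE - 1) / SHA256_BLOCK_SIZE;
}

/*
 * Compression calls left to do per data block when the state after the salt
 * has been precomputed.  Only whole salt blocks are saved; a partial salt
 * block still has to be compressed together with the data.
 */
static size_t compressions_after_import(size_t salt_len, size_t data_len)
{
	size_t precomputed = salt_len / SHA256_BLOCK_SIZE;
	size_t total = sha256_compressions(salt_len + data_len);

	assert(total >= precomputed);
	return total - precomputed;
}
```

Under these assumptions, a 4096-byte block costs 65 compressions with no salt,
65 with a 32-byte salt, and 65 with an imported 64-byte salt (66 in total, one
of them precomputed).  The count is the same in all three cases, so any gain
from a 64-byte salt over a 32-byte one would have to come from somewhere other
than the number of compressions.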

Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 drivers/md/dm-verity-target.c | 169 ++++++++++++++++++++++++----------
 drivers/md/dm-verity.h        |  18 ++--
 2 files changed, 130 insertions(+), 57 deletions(-)

Message ID	20240611034822.36603-14-ebiggers@kernel.org
State	Superseded
Headers	From: Eric Biggers <ebiggers@kernel.org>
To: linux-crypto@vger.kernel.org, fsverity@lists.linux.dev, dm-devel@lists.linux.dev
Cc: x86@kernel.org, linux-arm-kernel@lists.infradead.org, Ard Biesheuvel <ardb@kernel.org>, Sami Tolvanen <samitolvanen@google.com>, Bart Van Assche <bvanassche@acm.org>, Herbert Xu <herbert@gondor.apana.org.au>
Subject: [PATCH v5 13/15] dm-verity: hash blocks with shash import+finup when possible
Date: Mon, 10 Jun 2024 20:48:20 -0700
In-Reply-To: <20240611034822.36603-1-ebiggers@kernel.org>
Series	Optimize dm-verity and fsverity using multibuffer hashing

    [v5,00/15] Optimize dm-verity and fsverity using multibuffer hashing
    [v5,01/15] crypto: shash - add support for finup_mb
    [v5,02/15] crypto: testmgr - generate power-of-2 lengths more often
    [v5,03/15] crypto: testmgr - add tests for finup_mb
    [v5,04/15] crypto: x86/sha256-ni - add support for finup_mb
    [v5,05/15] crypto: arm64/sha256-ce - add support for finup_mb
    [v5,06/15] fsverity: improve performance by using multibuffer hashing
    [v5,07/15] dm-verity: move hash algorithm setup into its own function
    [v5,08/15] dm-verity: move data hash mismatch handling into its own function
    [v5,09/15] dm-verity: make real_digest and want_digest fixed-length
    [v5,10/15] dm-verity: provide dma_alignment limit in io_hints
    [v5,11/15] dm-verity: always "map" the data blocks
    [v5,12/15] dm-verity: make verity_hash() take dm_verity_io instead of ahash_request
    [v5,13/15] dm-verity: hash blocks with shash import+finup when possible
    [v5,14/15] dm-verity: reduce scope of real and wanted digests
    [v5,15/15] dm-verity: improve performance by using multibuffer hashing
