Message ID | 20221103192259.2229-1-ardb@kernel.org |
---|---|
Series | crypto: Add AES-GCM implementation to lib/crypto |
> -----Original Message-----
> From: Ard Biesheuvel <ardb@kernel.org>
> Sent: Thursday, November 3, 2022 2:23 PM
> Subject: [PATCH v5 3/3] crypto: aesgcm - Provide minimal library implementation

Given include/crypto/aes.h:

struct crypto_aes_ctx {
	u32 key_enc[AES_MAX_KEYLENGTH_U32];
	u32 key_dec[AES_MAX_KEYLENGTH_U32];
	u32 key_length;
};

plus:
...
+struct aesgcm_ctx {
+	be128			ghash_key;
+	struct crypto_aes_ctx	aes_ctx;
+	unsigned int		authsize;
+};
...
> +static void aesgcm_encrypt_block(const struct crypto_aes_ctx *ctx, void *dst,
...
> +	local_irq_save(flags);
> +	aes_encrypt(ctx, dst, src);
> +	local_irq_restore(flags);
> +}
...
> +int aesgcm_expandkey(struct aesgcm_ctx *ctx, const u8 *key,
> +		     unsigned int keysize, unsigned int authsize)
> +{
> +	u8 kin[AES_BLOCK_SIZE] = {};
> +	int ret;
> +
> +	ret = crypto_gcm_check_authsize(authsize) ?:
> +	      aes_expandkey(&ctx->aes_ctx, key, keysize);

Since GCM uses the underlying cipher's encrypt algorithm for both
encryption and decryption, is there any need for the 240-byte
aes_ctx->key_dec decryption key schedule that aes_expandkey()
also prepares?

For modes like this, it might be worth creating a specialized
struct that only holds the encryption key schedule (key_enc),
with a derivative of aes_expandkey() that only updates it.
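(To make that suggestion concrete, here is a minimal sketch of what such an encryption-only context could look like. struct crypto_aes_enc_ctx and aes_expandkey_enc() are hypothetical names used for illustration, not existing kernel API, and this trivial variant only shrinks the long-lived context; it still performs the full key expansion on a scratch context. Assumes <crypto/aes.h> and <linux/string.h>.)

/* Encryption-only AES context: just the encryption schedule and length. */
struct crypto_aes_enc_ctx {
	u32 key_enc[AES_MAX_KEYLENGTH_U32];
	u32 key_length;
};

static int aes_expandkey_enc(struct crypto_aes_enc_ctx *ctx, const u8 *key,
			     unsigned int key_len)
{
	struct crypto_aes_ctx full;
	int err;

	/* Reuse the existing helper on a scratch context... */
	err = aes_expandkey(&full, key, key_len);
	if (err)
		return err;

	/* ...and keep only the encryption half of the schedule. */
	memcpy(ctx->key_enc, full.key_enc, sizeof(ctx->key_enc));
	ctx->key_length = full.key_length;
	memzero_explicit(&full, sizeof(full));
	return 0;
}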
On Thu, 3 Nov 2022 at 22:16, Elliott, Robert (Servers) <elliott@hpe.com> wrote:
>
> > -----Original Message-----
> > From: Ard Biesheuvel <ardb@kernel.org>
> > Sent: Thursday, November 3, 2022 2:23 PM
> > Subject: [PATCH v5 3/3] crypto: aesgcm - Provide minimal library implementation
> >
>
> Given include/crypto/aes.h:
>
> struct crypto_aes_ctx {
>	u32 key_enc[AES_MAX_KEYLENGTH_U32];
>	u32 key_dec[AES_MAX_KEYLENGTH_U32];
>	u32 key_length;
> };
>
> plus:
> ...
> +struct aesgcm_ctx {
> +	be128			ghash_key;
> +	struct crypto_aes_ctx	aes_ctx;
> +	unsigned int		authsize;
> +};
> ...
> > +static void aesgcm_encrypt_block(const struct crypto_aes_ctx *ctx, void *dst,
> ...
> > +	local_irq_save(flags);
> > +	aes_encrypt(ctx, dst, src);
> > +	local_irq_restore(flags);
> > +}
> ...
> > +int aesgcm_expandkey(struct aesgcm_ctx *ctx, const u8 *key,
> > +		     unsigned int keysize, unsigned int authsize)
> > +{
> > +	u8 kin[AES_BLOCK_SIZE] = {};
> > +	int ret;
> > +
> > +	ret = crypto_gcm_check_authsize(authsize) ?:
> > +	      aes_expandkey(&ctx->aes_ctx, key, keysize);
>
> Since GCM uses the underlying cipher's encrypt algorithm for both
> encryption and decryption, is there any need for the 240-byte
> aes_ctx->key_dec decryption key schedule that aes_expandkey()
> also prepares?
>

No. But this applies to all uses of AES in CTR, XCTR, CMAC and CCM modes,
not just to the AES library.

> For modes like this, it might be worth creating a specialized
> struct that only holds the encryption key schedule (key_enc),
> with a derivative of aes_expandkey() that only updates it.
>

I'm not sure what problem we would be solving here, tbh. AES key
expansion is unlikely to occur on a hot path, and the 240-byte overhead
doesn't seem like a big deal either.

Note that only full table-based C implementations of AES need the
decryption key schedule; the AES library version could be tweaked to use
the encryption key schedule for decryption as well (see below). The
instruction-based versions, however, are constructed in a way that also
requires the modified schedule for decryption.

So I agree that there appears to be /some/ room for improvement here,
but I'm not sure it's worth anyone's time, tbh. We could explore
splitting off the expandkey routine that is exposed to other AES
implementations, and using a reduced schedule inside the library itself.
Beyond that, I don't see the need to clutter up the API and force all
AES code in the tree to choose between an encryption-only and a full key
schedule.
-------------8<-----------------

--- a/lib/crypto/aes.c
+++ b/lib/crypto/aes.c
@@ -310,3 +310,3 @@
 {
-	const u32 *rkp = ctx->key_dec + 4;
+	const u32 *rkp = ctx->key_enc + ctx->key_length + 16;
 	int rounds = 6 + ctx->key_length / 4;
@@ -315,6 +315,6 @@
-	st0[0] = ctx->key_dec[0] ^ get_unaligned_le32(in);
-	st0[1] = ctx->key_dec[1] ^ get_unaligned_le32(in + 4);
-	st0[2] = ctx->key_dec[2] ^ get_unaligned_le32(in + 8);
-	st0[3] = ctx->key_dec[3] ^ get_unaligned_le32(in + 12);
+	st0[0] = rkp[ 8] ^ get_unaligned_le32(in);
+	st0[1] = rkp[ 9] ^ get_unaligned_le32(in + 4);
+	st0[2] = rkp[10] ^ get_unaligned_le32(in + 8);
+	st0[3] = rkp[11] ^ get_unaligned_le32(in + 12);
@@ -331,7 +331,7 @@
-	for (round = 0;; round += 2, rkp += 8) {
-		st1[0] = inv_mix_columns(inv_subshift(st0, 0)) ^ rkp[0];
-		st1[1] = inv_mix_columns(inv_subshift(st0, 1)) ^ rkp[1];
-		st1[2] = inv_mix_columns(inv_subshift(st0, 2)) ^ rkp[2];
-		st1[3] = inv_mix_columns(inv_subshift(st0, 3)) ^ rkp[3];
+	for (round = 0;; round += 2, rkp -= 8) {
+		st1[0] = inv_mix_columns(inv_subshift(st0, 0) ^ rkp[4]);
+		st1[1] = inv_mix_columns(inv_subshift(st0, 1) ^ rkp[5]);
+		st1[2] = inv_mix_columns(inv_subshift(st0, 2) ^ rkp[6]);
+		st1[3] = inv_mix_columns(inv_subshift(st0, 3) ^ rkp[7]);
@@ -340,12 +340,12 @@
-		st0[0] = inv_mix_columns(inv_subshift(st1, 0)) ^ rkp[4];
-		st0[1] = inv_mix_columns(inv_subshift(st1, 1)) ^ rkp[5];
-		st0[2] = inv_mix_columns(inv_subshift(st1, 2)) ^ rkp[6];
-		st0[3] = inv_mix_columns(inv_subshift(st1, 3)) ^ rkp[7];
+		st0[0] = inv_mix_columns(inv_subshift(st1, 0) ^ rkp[0]);
+		st0[1] = inv_mix_columns(inv_subshift(st1, 1) ^ rkp[1]);
+		st0[2] = inv_mix_columns(inv_subshift(st1, 2) ^ rkp[2]);
+		st0[3] = inv_mix_columns(inv_subshift(st1, 3) ^ rkp[3]);
 	}

-	put_unaligned_le32(inv_subshift(st1, 0) ^ rkp[4], out);
-	put_unaligned_le32(inv_subshift(st1, 1) ^ rkp[5], out + 4);
-	put_unaligned_le32(inv_subshift(st1, 2) ^ rkp[6], out + 8);
-	put_unaligned_le32(inv_subshift(st1, 3) ^ rkp[7], out + 12);
+	put_unaligned_le32(inv_subshift(st1, 0) ^ rkp[0], out);
+	put_unaligned_le32(inv_subshift(st1, 1) ^ rkp[1], out + 4);
+	put_unaligned_le32(inv_subshift(st1, 2) ^ rkp[2], out + 8);
+	put_unaligned_le32(inv_subshift(st1, 3) ^ rkp[3], out + 12);
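(A brief aside on why the sketch above works, assuming the usual FIPS 197 "equivalent inverse cipher" construction that aes_expandkey() follows, in which the inner-round words of key_dec are the key_enc words passed through InvMixColumns and stored in reverse order. inv_mix_columns() is a fixed matrix multiplication over GF(2^8), hence linear, so it distributes over XOR. For an encryption round-key word ek and its pre-transformed counterpart dk = inv_mix_columns(ek):

	inv_mix_columns(state) ^ dk == inv_mix_columns(state ^ ek)

That is why the XOR can be moved inside the inv_mix_columns() call and key_enc walked in reverse order, making key_dec unnecessary for the C library implementation.)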
On Thu, Nov 03, 2022 at 08:22:56PM +0100, Ard Biesheuvel wrote:
> Provide a generic library implementation of AES-GCM which can be used
> really early during boot, e.g., to communicate with the security
> coprocessor on SEV-SNP virtual machines to bring up secondary cores.
> This is needed because the crypto API is not available yet this early.
>
> We cannot rely on special instructions for AES or polynomial
> multiplication, which are arch specific and rely on in-kernel SIMD
> infrastructure. Instead, add a generic C implementation that combines
> the existing C implementations of AES and multiplication in GF(2^128).
>
> To reduce the risk of forgery attacks, replace data dependent table
> lookups and conditional branches in the used gf128mul routine with
> constant-time equivalents. The AES library has already been robustified
> to some extent to prevent known-plaintext timing attacks on the key, but
> we call it with interrupts disabled to make it a bit more robust. (Note
> that in SEV-SNP context, the VMM is untrusted, and is able to inject
> interrupts arbitrarily, and potentially maliciously.)
>
> Changes since v4:
> - Rename CONFIG_CRYPTO_GF128MUL to CONFIG_CRYPTO_LIB_GF128MUL
> - Use bool return value for decrypt routine to align with other AEAD
>   library code
> - Return -ENODEV on selftest failure to align with other algos
> - Use pr_err() not WARN() on selftest failure for the same reason
> - Mention in a code comment that the counter cannot roll over or result
>   in a carry due to the width of the type representing the size of the
>   input
>
> Changes since v3:
> - rename GCM-AES to AES-GCM
>
> Changes since v2:
> - move gf128mul to lib/crypto
> - add patch #2 to make gf128mul_lle constant time
> - fix kerneldoc headers and drop them from the .h file
>
> Changes since v1:
> - rename gcm to gcmaes to reflect that GCM is also used in
>   combination with other symmetric ciphers (Jason)
> - add Nikunj's Tested-by
>
> Cc: Eric Biggers <ebiggers@kernel.org>
> Cc: Robert Elliott <elliott@hpe.com>
> Cc: Jason A. Donenfeld <Jason@zx2c4.com>
> Cc: Nikunj A Dadhania <nikunj@amd.com>
>
> Ard Biesheuvel (3):
>   crypto: move gf128mul library into lib/crypto
>   crypto: gf128mul - make gf128mul_lle time invariant
>   crypto: aesgcm - Provide minimal library implementation
>
>  arch/arm/crypto/Kconfig           |   2 +-
>  arch/arm64/crypto/Kconfig         |   2 +-
>  crypto/Kconfig                    |   9 +-
>  crypto/Makefile                   |   1 -
>  drivers/crypto/chelsio/Kconfig    |   2 +-
>  include/crypto/gcm.h              |  22 +
>  lib/crypto/Kconfig                |   9 +
>  lib/crypto/Makefile               |   5 +
>  lib/crypto/aesgcm.c               | 727 ++++++++++++++++++++
>  {crypto => lib/crypto}/gf128mul.c |  58 +-
>  10 files changed, 808 insertions(+), 29 deletions(-)
>  create mode 100644 lib/crypto/aesgcm.c
>  rename {crypto => lib/crypto}/gf128mul.c (87%)
>
> --
> 2.35.1

All applied.  Thanks.
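(For readers who want a feel for how the merged library is meant to be called, the following is a rough usage sketch only. struct aesgcm_ctx and aesgcm_expandkey() appear earlier in this thread; the aesgcm_encrypt() prototype shown here is an assumption about the API added by patch 3/3, so consult include/crypto/gcm.h from the series for the exact signatures.)

#include <crypto/gcm.h>		/* struct aesgcm_ctx, GCM_AES_IV_SIZE (per this series) */

/* Illustrative only: encrypt 'len' bytes in place and emit a 16-byte tag. */
static int example_seal(const u8 *key, unsigned int keysize,
			const u8 iv[GCM_AES_IV_SIZE],
			const u8 *assoc, int assoc_len,
			u8 *buf, int len, u8 tag[16])
{
	struct aesgcm_ctx ctx;
	int err;

	/* Expand the key once; 16-byte auth tags are typical for AES-GCM. */
	err = aesgcm_expandkey(&ctx, key, keysize, 16);
	if (err)
		return err;

	/* Assumed prototype: (ctx, dst, src, len, assoc, assoc_len, iv, tag) */
	aesgcm_encrypt(&ctx, buf, buf, len, assoc, assoc_len, iv, tag);
	return 0;
}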