diff mbox series

[2/3] crypto: inside-secure - Reduce stack usage

Message ID 20190930121520.1388317-2-arnd@arndb.de
State Superseded
Headers show
Series None | expand

Commit Message

Arnd Bergmann Sept. 30, 2019, 12:14 p.m. UTC
safexcel_aead_setkey() contains three large stack variables, totalling
slightly more than the 1024 byte warning limit:

drivers/crypto/inside-secure/safexcel_cipher.c:303:12: error: stack frame size of 1032 bytes in function 'safexcel_aead_setkey' [-Werror,-Wframe-larger-than=]

The function already contains a couple of dynamic allocations, so it is
likely not performance critical and it can only be called in a context
that allows sleeping, so the easiest workaround is to add change it
to use dynamic allocations. Combining istate and ostate into a single
variable simplifies the allocation at the cost of making it slightly
less readable.

Alternatively, it should be possible to shrink these allocations
as the extra buffers appear to be largely unnecessary, but doing
this would be a much more invasive change.

Fixes: 0e17e3621a28 ("crypto: inside-secure - add support for authenc(hmac(sha*),rfc3686(ctr(aes))) suites")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

---
 .../crypto/inside-secure/safexcel_cipher.c    | 53 ++++++++++++-------
 1 file changed, 35 insertions(+), 18 deletions(-)

-- 
2.20.0

Comments

Arnd Bergmann Sept. 30, 2019, 8:11 p.m. UTC | #1
On Mon, Sep 30, 2019 at 9:04 PM Pascal Van Leeuwen
<pvanleeuwen@verimatrix.com> wrote:

> > Alternatively, it should be possible to shrink these allocations

> > as the extra buffers appear to be largely unnecessary, but doing

> > this would be a much more invasive change.

> >

> Actually, for HMAC-SHA512 you DO need all that buffer space.

> You could shrink it to 2 * ctx->state_sz but then your simple indexing

> is no longer going to fly. Not sure if that would be worth the effort.


Stack allocations can no longer be dynamically sized in the kernel,
so that would not work.

What I noticed though is that the largest part of safexcel_ahash_export_state
is used in the 'cache' member, and this is apparently only referenced inside of
safexcel_hmac_init_iv, which you call twice. If that cache can be allocated
only once, you save SHA512_BLOCK_SIZE bytes in one of the two paths.

> I don't like the part where you dynamically allocate the cryto_aes_ctx

> though, I think that was not necessary considering its a lot smaller.


I count 484 bytes for it, which is really large.

> And it conflicts with another change I have waiting that gets rid of

> aes_expandkey and that struct alltogether (since it was really just

> abused to do a key size check, which was very wasteful since the

> function actually generates all roundkeys we don't need at all ...)


Right, this is what I noticed there. With 480 of the 484 bytes gone,
you are well below the warning limit even without the other change.

       Arnd
Pascal Van Leeuwen Sept. 30, 2019, 9:09 p.m. UTC | #2
> -----Original Message-----

> From: Arnd Bergmann <arnd@arndb.de>

> Sent: Monday, September 30, 2019 10:12 PM

> To: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>

> Cc: Antoine Tenart <antoine.tenart@bootlin.com>; Herbert Xu <herbert@gondor.apana.org.au>;

> David S. Miller <davem@davemloft.net>; Pascal van Leeuwen <pascalvanl@gmail.com>; Ard

> Biesheuvel <ard.biesheuvel@linaro.org>; Eric Biggers <ebiggers@google.com>; linux-

> crypto@vger.kernel.org; linux-kernel@vger.kernel.org

> Subject: Re: [PATCH 2/3] crypto: inside-secure - Reduce stack usage

> 

> On Mon, Sep 30, 2019 at 9:04 PM Pascal Van Leeuwen

> <pvanleeuwen@verimatrix.com> wrote:

> 

> > > Alternatively, it should be possible to shrink these allocations

> > > as the extra buffers appear to be largely unnecessary, but doing

> > > this would be a much more invasive change.

> > >

> > Actually, for HMAC-SHA512 you DO need all that buffer space.

> > You could shrink it to 2 * ctx->state_sz but then your simple indexing

> > is no longer going to fly. Not sure if that would be worth the effort.

> 

> Stack allocations can no longer be dynamically sized in the kernel,

> so that would not work.

> 

I was actually referring to your kzalloc, not to the original stack
based implementation ...

> What I noticed though is that the largest part of safexcel_ahash_export_state

> is used in the 'cache' member, and this is apparently only referenced inside of

> safexcel_hmac_init_iv, which you call twice. If that cache can be allocated

> only once, you save SHA512_BLOCK_SIZE bytes in one of the two paths.

> 

Well ... hmmm ... "my"... This is not originally "my" code. And until you
brought this up, I did not fully realise it was using this export_state
struct to store those digests. Seems to have something to do with directly
taking the crypto_ahash_export state output, which is much more than that,
in case you need to continue the hash (which you don't here).

I guess you need to "catch" that output somewhere, so probably you could
save a bit of memory but I doubt it would be a significant improvement.

> > I don't like the part where you dynamically allocate the cryto_aes_ctx

> > though, I think that was not necessary considering its a lot smaller.

> 

> I count 484 bytes for it, which is really large.

> 

Maybe I should've counted myself ... totally unexpectedly huge. Why??
Anyway, glad I got rid of it already then.

> > And it conflicts with another change I have waiting that gets rid of

> > aes_expandkey and that struct alltogether (since it was really just

> > abused to do a key size check, which was very wasteful since the

> > function actually generates all roundkeys we don't need at all ...)

> 

> Right, this is what I noticed there. With 480 of the 484 bytes gone,

> you are well below the warning limit even without the other change.

> 

And by "other change" you mean the safexcel_ahash_export_state?
Ok, good to known, although I do like to improve that one as well,
but preferably by not exporting the cache so I don't need the full
struct.

Regards,
Pascal van Leeuwen
Silicon IP Architect, Multi-Protocol Engines @ Verimatrix
www.insidesecure.com
Arnd Bergmann Oct. 1, 2019, 6:49 p.m. UTC | #3
On Mon, Sep 30, 2019 at 11:09 PM Pascal Van Leeuwen
<pvanleeuwen@verimatrix.com> wrote:
> > Subject: Re: [PATCH 2/3] crypto: inside-secure - Reduce stack usage

> >

> > On Mon, Sep 30, 2019 at 9:04 PM Pascal Van Leeuwen

> > <pvanleeuwen@verimatrix.com> wrote:

> >

> > > > Alternatively, it should be possible to shrink these allocations

> > > > as the extra buffers appear to be largely unnecessary, but doing

> > > > this would be a much more invasive change.

> > > >

> > > Actually, for HMAC-SHA512 you DO need all that buffer space.

> > > You could shrink it to 2 * ctx->state_sz but then your simple indexing

> > > is no longer going to fly. Not sure if that would be worth the effort.

> >

> > Stack allocations can no longer be dynamically sized in the kernel,

> > so that would not work.

> >

> I was actually referring to your kzalloc, not to the original stack

> based implementation ...


Ok, got it. For the kzalloc version, the size matters much less, as
this is not coming from a scarce resource and only takes a few more
cycles to do the initial clearing of the larger struct.

> > > And it conflicts with another change I have waiting that gets rid of

> > > aes_expandkey and that struct alltogether (since it was really just

> > > abused to do a key size check, which was very wasteful since the

> > > function actually generates all roundkeys we don't need at all ...)

> >

> > Right, this is what I noticed there. With 480 of the 484 bytes gone,

> > you are well below the warning limit even without the other change.

> >

> And by "other change" you mean the safexcel_ahash_export_state?


Yes.

> Ok, good to known, although I do like to improve that one as well,

> but preferably by not exporting the cache so I don't need the full

> struct.


Sounds good to me.

      Arnd
diff mbox series

Patch

diff --git a/drivers/crypto/inside-secure/safexcel_cipher.c b/drivers/crypto/inside-secure/safexcel_cipher.c
index ef51f8c2b473..51a4112aa9bc 100644
--- a/drivers/crypto/inside-secure/safexcel_cipher.c
+++ b/drivers/crypto/inside-secure/safexcel_cipher.c
@@ -305,10 +305,10 @@  static int safexcel_aead_setkey(struct crypto_aead *ctfm, const u8 *key,
 {
 	struct crypto_tfm *tfm = crypto_aead_tfm(ctfm);
 	struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(tfm);
-	struct safexcel_ahash_export_state istate, ostate;
+	struct safexcel_ahash_export_state *state;
 	struct safexcel_crypto_priv *priv = ctx->priv;
+	struct crypto_aes_ctx *aes;
 	struct crypto_authenc_keys keys;
-	struct crypto_aes_ctx aes;
 	int err = -EINVAL;
 
 	if (crypto_authenc_extractkeys(&keys, key, len) != 0)
@@ -334,7 +334,14 @@  static int safexcel_aead_setkey(struct crypto_aead *ctfm, const u8 *key,
 			goto badkey_expflags;
 		break;
 	case SAFEXCEL_AES:
-		err = aes_expandkey(&aes, keys.enckey, keys.enckeylen);
+		aes = kzalloc(sizeof(*aes), GFP_KERNEL);
+		if (!aes) {
+			err = -ENOMEM;
+			goto badkey;
+		}
+
+		err = aes_expandkey(aes, keys.enckey, keys.enckeylen);
+		kfree(aes);
 		if (unlikely(err))
 			goto badkey;
 		break;
@@ -347,56 +354,66 @@  static int safexcel_aead_setkey(struct crypto_aead *ctfm, const u8 *key,
 	    memcmp(ctx->key, keys.enckey, keys.enckeylen))
 		ctx->base.needs_inv = true;
 
+	state = kzalloc(sizeof(struct safexcel_ahash_export_state) * 2, GFP_KERNEL);
+	if (!state) {
+		err = -ENOMEM;
+		goto badkey;
+	}
+
 	/* Auth key */
 	switch (ctx->hash_alg) {
 	case CONTEXT_CONTROL_CRYPTO_ALG_SHA1:
 		if (safexcel_hmac_setkey("safexcel-sha1", keys.authkey,
-					 keys.authkeylen, &istate, &ostate))
-			goto badkey;
+					 keys.authkeylen, &state[0], &state[1]))
+			goto badkey_free;
 		break;
 	case CONTEXT_CONTROL_CRYPTO_ALG_SHA224:
 		if (safexcel_hmac_setkey("safexcel-sha224", keys.authkey,
-					 keys.authkeylen, &istate, &ostate))
-			goto badkey;
+					 keys.authkeylen, &state[0], &state[1]))
+			goto badkey_free;
 		break;
 	case CONTEXT_CONTROL_CRYPTO_ALG_SHA256:
 		if (safexcel_hmac_setkey("safexcel-sha256", keys.authkey,
-					 keys.authkeylen, &istate, &ostate))
-			goto badkey;
+					 keys.authkeylen, &state[0], &state[1]))
+			goto badkey_free;
 		break;
 	case CONTEXT_CONTROL_CRYPTO_ALG_SHA384:
 		if (safexcel_hmac_setkey("safexcel-sha384", keys.authkey,
-					 keys.authkeylen, &istate, &ostate))
-			goto badkey;
+					 keys.authkeylen, &state[0], &state[1]))
+			goto badkey_free;
 		break;
 	case CONTEXT_CONTROL_CRYPTO_ALG_SHA512:
 		if (safexcel_hmac_setkey("safexcel-sha512", keys.authkey,
-					 keys.authkeylen, &istate, &ostate))
-			goto badkey;
+					 keys.authkeylen, &state[0], &state[1]))
+			goto badkey_free;
 		break;
 	default:
 		dev_err(priv->dev, "aead: unsupported hash algorithm\n");
-		goto badkey;
+		goto badkey_free;
 	}
 
 	crypto_aead_set_flags(ctfm, crypto_aead_get_flags(ctfm) &
 				    CRYPTO_TFM_RES_MASK);
 
 	if (priv->flags & EIP197_TRC_CACHE && ctx->base.ctxr_dma &&
-	    (memcmp(ctx->ipad, istate.state, ctx->state_sz) ||
-	     memcmp(ctx->opad, ostate.state, ctx->state_sz)))
+	    (memcmp(ctx->ipad, &state[0].state, ctx->state_sz) ||
+	     memcmp(ctx->opad, &state[1].state, ctx->state_sz)))
 		ctx->base.needs_inv = true;
 
 	/* Now copy the keys into the context */
 	memcpy(ctx->key, keys.enckey, keys.enckeylen);
 	ctx->key_len = keys.enckeylen;
 
-	memcpy(ctx->ipad, &istate.state, ctx->state_sz);
-	memcpy(ctx->opad, &ostate.state, ctx->state_sz);
+	memcpy(ctx->ipad, &state[0].state, ctx->state_sz);
+	memcpy(ctx->opad, &state[1].state, ctx->state_sz);
 
 	memzero_explicit(&keys, sizeof(keys));
+	kfree(state);
+
 	return 0;
 
+badkey_free:
+	kfree(state);
 badkey:
 	crypto_aead_set_flags(ctfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
 badkey_expflags: