Message ID: 20200806163551.14395-2-andrei.botila@oss.nxp.com
State: New
Series: crypto: caam - xts(aes) updates
On 9/9/2020 1:10 AM, Herbert Xu wrote:
> On Tue, Sep 08, 2020 at 01:35:04PM +0300, Horia Geantă wrote:
>>
>>> Just go with the get_unaligned unconditionally.
>>
>> Won't this lead to sub-optimal code for ARMv7
>> in case the IV is aligned?
>
> If this should be optimised in ARMv7 then that should be done
> in get_unaligned itself and not open-coded.
>
I am not sure what's wrong with avoiding using the unaligned accessors
in case data is aligned.

Documentation/core-api/unaligned-memory-access.rst clearly states:
These macros work for memory accesses of any length (not just 32 bits as
in the examples above). Be aware that when compared to standard access of
aligned memory, using these macros to access unaligned memory can be costly in
terms of performance.

So IMO it makes sense to use get_unaligned() only when needed.
There are several cases of users doing this, e.g. siphash.

Thanks,
Horia
On Mon, 14 Sep 2020 at 19:24, Horia Geantă <horia.geanta@nxp.com> wrote:
>
> On 9/9/2020 1:10 AM, Herbert Xu wrote:
> > On Tue, Sep 08, 2020 at 01:35:04PM +0300, Horia Geantă wrote:
> >>
> >>> Just go with the get_unaligned unconditionally.
> >>
> >> Won't this lead to sub-optimal code for ARMv7
> >> in case the IV is aligned?
> >
> > If this should be optimised in ARMv7 then that should be done
> > in get_unaligned itself and not open-coded.
> >
> I am not sure what's wrong with avoiding using the unaligned accessors
> in case data is aligned.
>
> Documentation/core-api/unaligned-memory-access.rst clearly states:
> These macros work for memory accesses of any length (not just 32 bits as
> in the examples above). Be aware that when compared to standard access of
> aligned memory, using these macros to access unaligned memory can be costly in
> terms of performance.
>
> So IMO it makes sense to use get_unaligned() only when needed.
> There are several cases of users doing this, e.g. siphash.
>

For ARMv7 code, using the unaligned accessors unconditionally is fine,
and it will not affect performance.

In general, when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is defined,
you can use the unaligned accessors. If it is not, it helps to have
different code paths.

This is a bit murky, and through the years, the interpretation of
unaligned-memory-access.rst has shifted a bit, but in this case, it
makes no sense to make the distinction.
On 9/14/2020 7:28 PM, Ard Biesheuvel wrote:
> On Mon, 14 Sep 2020 at 19:24, Horia Geantă <horia.geanta@nxp.com> wrote:
>>
>> On 9/9/2020 1:10 AM, Herbert Xu wrote:
>>> On Tue, Sep 08, 2020 at 01:35:04PM +0300, Horia Geantă wrote:
>>>>
>>>>> Just go with the get_unaligned unconditionally.
>>>>
>>>> Won't this lead to sub-optimal code for ARMv7
>>>> in case the IV is aligned?
>>>
>>> If this should be optimised in ARMv7 then that should be done
>>> in get_unaligned itself and not open-coded.
>>>
>> I am not sure what's wrong with avoiding using the unaligned accessors
>> in case data is aligned.
>>
>> Documentation/core-api/unaligned-memory-access.rst clearly states:
>> These macros work for memory accesses of any length (not just 32 bits as
>> in the examples above). Be aware that when compared to standard access of
>> aligned memory, using these macros to access unaligned memory can be costly in
>> terms of performance.
>>
>> So IMO it makes sense to use get_unaligned() only when needed.
>> There are several cases of users doing this, e.g. siphash.
>>
>
> For ARMv7 code, using the unaligned accessors unconditionally is fine,
> and it will not affect performance.
>
> In general, when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is defined,
> you can use the unaligned accessors. If it is not, it helps to have
> different code paths.
>
arch/arm/include/asm/unaligned.h doesn't make use of
linux/unaligned/access_ok.h, even if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
is set.

I understand the comment in the file, however using get_unaligned()
unconditionally takes away the opportunity to generate optimized code
(using ldrd/ldm) when data is aligned.

> This is a bit murky, and through the years, the interpretation of
> unaligned-memory-access.rst has shifted a bit, but in this case, it
> makes no sense to make the distinction.
>
Thanks,
Horia
On Mon, 14 Sep 2020 at 20:12, Horia Geantă <horia.geanta@nxp.com> wrote:
>
> On 9/14/2020 7:28 PM, Ard Biesheuvel wrote:
> > On Mon, 14 Sep 2020 at 19:24, Horia Geantă <horia.geanta@nxp.com> wrote:
> >>
> >> On 9/9/2020 1:10 AM, Herbert Xu wrote:
> >>> On Tue, Sep 08, 2020 at 01:35:04PM +0300, Horia Geantă wrote:
> >>>>
> >>>>> Just go with the get_unaligned unconditionally.
> >>>>
> >>>> Won't this lead to sub-optimal code for ARMv7
> >>>> in case the IV is aligned?
> >>>
> >>> If this should be optimised in ARMv7 then that should be done
> >>> in get_unaligned itself and not open-coded.
> >>>
> >> I am not sure what's wrong with avoiding using the unaligned accessors
> >> in case data is aligned.
> >>
> >> Documentation/core-api/unaligned-memory-access.rst clearly states:
> >> These macros work for memory accesses of any length (not just 32 bits as
> >> in the examples above). Be aware that when compared to standard access of
> >> aligned memory, using these macros to access unaligned memory can be costly in
> >> terms of performance.
> >>
> >> So IMO it makes sense to use get_unaligned() only when needed.
> >> There are several cases of users doing this, e.g. siphash.
> >>
> >
> > For ARMv7 code, using the unaligned accessors unconditionally is fine,
> > and it will not affect performance.
> >
> > In general, when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is defined,
> > you can use the unaligned accessors. If it is not, it helps to have
> > different code paths.
> >
> arch/arm/include/asm/unaligned.h doesn't make use of
> linux/unaligned/access_ok.h, even if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> is set.
>
> I understand the comment in the file, however using get_unaligned()
> unconditionally takes away the opportunity to generate optimized code
> (using ldrd/ldm) when data is aligned.
>

But the minimal optimization that is possible here (one ldrd/ldm
instruction vs two ldr instructions) is defeated by the fact that you
are using a conditional branch to select between the two. And this is
not even a hot path to begin with,

> > This is a bit murky, and through the years, the interpretation of
> > unaligned-memory-access.rst has shifted a bit, but in this case, it
> > makes no sense to make the distinction.
> >
>
> Thanks,
> Horia
On 9/14/2020 9:20 PM, Ard Biesheuvel wrote:
> On Mon, 14 Sep 2020 at 20:12, Horia Geantă <horia.geanta@nxp.com> wrote:
>>
>> On 9/14/2020 7:28 PM, Ard Biesheuvel wrote:
>>> On Mon, 14 Sep 2020 at 19:24, Horia Geantă <horia.geanta@nxp.com> wrote:
>>>>
>>>> On 9/9/2020 1:10 AM, Herbert Xu wrote:
>>>>> On Tue, Sep 08, 2020 at 01:35:04PM +0300, Horia Geantă wrote:
>>>>>>
>>>>>>> Just go with the get_unaligned unconditionally.
>>>>>>
>>>>>> Won't this lead to sub-optimal code for ARMv7
>>>>>> in case the IV is aligned?
>>>>>
>>>>> If this should be optimised in ARMv7 then that should be done
>>>>> in get_unaligned itself and not open-coded.
>>>>>
>>>> I am not sure what's wrong with avoiding using the unaligned accessors
>>>> in case data is aligned.
>>>>
>>>> Documentation/core-api/unaligned-memory-access.rst clearly states:
>>>> These macros work for memory accesses of any length (not just 32 bits as
>>>> in the examples above). Be aware that when compared to standard access of
>>>> aligned memory, using these macros to access unaligned memory can be costly in
>>>> terms of performance.
>>>>
>>>> So IMO it makes sense to use get_unaligned() only when needed.
>>>> There are several cases of users doing this, e.g. siphash.
>>>>
>>>
>>> For ARMv7 code, using the unaligned accessors unconditionally is fine,
>>> and it will not affect performance.
>>>
>>> In general, when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is defined,
>>> you can use the unaligned accessors. If it is not, it helps to have
>>> different code paths.
>>>
>> arch/arm/include/asm/unaligned.h doesn't make use of
>> linux/unaligned/access_ok.h, even if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
>> is set.
>>
>> I understand the comment in the file, however using get_unaligned()
>> unconditionally takes away the opportunity to generate optimized code
>> (using ldrd/ldm) when data is aligned.
>>
>
> But the minimal optimization that is possible here (one ldrd/ldm
> instruction vs two ldr instructions) is defeated by the fact that you
> are using a conditional branch to select between the two. And this is
> not even a hot path to begin with,
>
This is actually on the hot path (encrypt/decrypt callbacks),
but you're probably right that the conditional branching is going to offset
the optimized code.

To avoid branching, code could be rewritten as:

#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
	size = *(u64 *)(req->iv + (ivsize / 2));
#else
	size = get_unaligned((u64 *)(req->iv + (ivsize / 2)));
#endif

however in this case ARMv7 would suffer since
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y and
ldrd/ldm for accesses not word-aligned are inefficient - lead to traps.

Would it be ok to use:
#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && !defined(CONFIG_ARM)
to workaround the ARMv7 inconsistency?

Thanks,
Horia
On Tue, 15 Sep 2020 at 13:02, Horia Geantă <horia.geanta@nxp.com> wrote:
>
> On 9/14/2020 9:20 PM, Ard Biesheuvel wrote:
> > On Mon, 14 Sep 2020 at 20:12, Horia Geantă <horia.geanta@nxp.com> wrote:
> >>
> >> On 9/14/2020 7:28 PM, Ard Biesheuvel wrote:
> >>> On Mon, 14 Sep 2020 at 19:24, Horia Geantă <horia.geanta@nxp.com> wrote:
> >>>>
> >>>> On 9/9/2020 1:10 AM, Herbert Xu wrote:
> >>>>> On Tue, Sep 08, 2020 at 01:35:04PM +0300, Horia Geantă wrote:
> >>>>>>
> >>>>>>> Just go with the get_unaligned unconditionally.
> >>>>>>
> >>>>>> Won't this lead to sub-optimal code for ARMv7
> >>>>>> in case the IV is aligned?
> >>>>>
> >>>>> If this should be optimised in ARMv7 then that should be done
> >>>>> in get_unaligned itself and not open-coded.
> >>>>>
> >>>> I am not sure what's wrong with avoiding using the unaligned accessors
> >>>> in case data is aligned.
> >>>>
> >>>> Documentation/core-api/unaligned-memory-access.rst clearly states:
> >>>> These macros work for memory accesses of any length (not just 32 bits as
> >>>> in the examples above). Be aware that when compared to standard access of
> >>>> aligned memory, using these macros to access unaligned memory can be costly in
> >>>> terms of performance.
> >>>>
> >>>> So IMO it makes sense to use get_unaligned() only when needed.
> >>>> There are several cases of users doing this, e.g. siphash.
> >>>>
> >>>
> >>> For ARMv7 code, using the unaligned accessors unconditionally is fine,
> >>> and it will not affect performance.
> >>>
> >>> In general, when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is defined,
> >>> you can use the unaligned accessors. If it is not, it helps to have
> >>> different code paths.
> >>>
> >> arch/arm/include/asm/unaligned.h doesn't make use of
> >> linux/unaligned/access_ok.h, even if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> >> is set.
> >>
> >> I understand the comment in the file, however using get_unaligned()
> >> unconditionally takes away the opportunity to generate optimized code
> >> (using ldrd/ldm) when data is aligned.
> >>
> >
> > But the minimal optimization that is possible here (one ldrd/ldm
> > instruction vs two ldr instructions) is defeated by the fact that you
> > are using a conditional branch to select between the two. And this is
> > not even a hot path to begin with,
> >
> This is actually on the hot path (encrypt/decrypt callbacks),
> but you're probably right that the conditional branching is going to offset
> the optimized code.
>

This is called once per XTS request, right? And you are saying the
extra cycle makes a difference?

> To avoid branching, code could be rewritten as:
>
> #ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> size = *(u64 *)(req->iv + (ivsize / 2));
> #else
> size = get_unaligned((u64 *)(req->iv + (ivsize / 2)));
> #endif
>
> however in this case ARMv7 would suffer since
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y and
> ldrd/ldm for accesses not word-aligned are inefficient - lead to traps.
>

CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS means 'just use the unaligned
accessors as they are basically free'. Casting a potentially
misaligned u8* to a u64* is not permitted by the C standard.

> Would it be ok to use:
> #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && !defined(CONFIG_ARM)
> to workaround the ARMv7 inconsistency?
>

No, please just use the get_unaligned() accessor.
On 9/15/2020 1:26 PM, Ard Biesheuvel wrote:
> On Tue, 15 Sep 2020 at 13:02, Horia Geantă <horia.geanta@nxp.com> wrote:
>>
>> On 9/14/2020 9:20 PM, Ard Biesheuvel wrote:
>>> On Mon, 14 Sep 2020 at 20:12, Horia Geantă <horia.geanta@nxp.com> wrote:
>>>>
>>>> On 9/14/2020 7:28 PM, Ard Biesheuvel wrote:
>>>>> On Mon, 14 Sep 2020 at 19:24, Horia Geantă <horia.geanta@nxp.com> wrote:
>>>>>>
>>>>>> On 9/9/2020 1:10 AM, Herbert Xu wrote:
>>>>>>> On Tue, Sep 08, 2020 at 01:35:04PM +0300, Horia Geantă wrote:
>>>>>>>>
>>>>>>>>> Just go with the get_unaligned unconditionally.
>>>>>>>>
>>>>>>>> Won't this lead to sub-optimal code for ARMv7
>>>>>>>> in case the IV is aligned?
>>>>>>>
>>>>>>> If this should be optimised in ARMv7 then that should be done
>>>>>>> in get_unaligned itself and not open-coded.
>>>>>>>
>>>>>> I am not sure what's wrong with avoiding using the unaligned accessors
>>>>>> in case data is aligned.
>>>>>>
>>>>>> Documentation/core-api/unaligned-memory-access.rst clearly states:
>>>>>> These macros work for memory accesses of any length (not just 32 bits as
>>>>>> in the examples above). Be aware that when compared to standard access of
>>>>>> aligned memory, using these macros to access unaligned memory can be costly in
>>>>>> terms of performance.
>>>>>>
>>>>>> So IMO it makes sense to use get_unaligned() only when needed.
>>>>>> There are several cases of users doing this, e.g. siphash.
>>>>>>
>>>>>
>>>>> For ARMv7 code, using the unaligned accessors unconditionally is fine,
>>>>> and it will not affect performance.
>>>>>
>>>>> In general, when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is defined,
>>>>> you can use the unaligned accessors. If it is not, it helps to have
>>>>> different code paths.
>>>>>
>>>> arch/arm/include/asm/unaligned.h doesn't make use of
>>>> linux/unaligned/access_ok.h, even if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
>>>> is set.
>>>>
>>>> I understand the comment in the file, however using get_unaligned()
>>>> unconditionally takes away the opportunity to generate optimized code
>>>> (using ldrd/ldm) when data is aligned.
>>>>
>>>
>>> But the minimal optimization that is possible here (one ldrd/ldm
>>> instruction vs two ldr instructions) is defeated by the fact that you
>>> are using a conditional branch to select between the two. And this is
>>> not even a hot path to begin with,
>>>
>> This is actually on the hot path (encrypt/decrypt callbacks),
>> but you're probably right that the conditional branching is going to offset
>> the optimized code.
>>
>
> This is called once per XTS request, right? And you are saying the
> extra cycle makes a difference?
>
Yes, once per request and no, not super-important.

>> To avoid branching, code could be rewritten as:
>>
>> #ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
>> size = *(u64 *)(req->iv + (ivsize / 2));
>> #else
>> size = get_unaligned((u64 *)(req->iv + (ivsize / 2)));
>> #endif
>>
>> however in this case ARMv7 would suffer since
>> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y and
>> ldrd/ldm for accesses not word-aligned are inefficient - lead to traps.
>>
>
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS means 'just use the unaligned
> accessors as they are basically free'. Casting a potentially
> misaligned u8* to a u64* is not permitted by the C standard.
>
Seems that I misunderstood CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.

Looking at its usage, e.g. ether_addr_equal() or __crypto_memneq_*(),
I see similar casts of pointers possibly misaligned.

>> Would it be ok to use:
>> #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && !defined(CONFIG_ARM)
>> to workaround the ARMv7 inconsistency?
>>
>
> No, please just use the get_unaligned() accessor.
>
Ok.

Thanks,
Horia
On Tue, 15 Sep 2020 at 15:45, Horia Geantă <horia.geanta@nxp.com> wrote:
>
> On 9/15/2020 1:26 PM, Ard Biesheuvel wrote:
> > On Tue, 15 Sep 2020 at 13:02, Horia Geantă <horia.geanta@nxp.com> wrote:
> >>
> >> On 9/14/2020 9:20 PM, Ard Biesheuvel wrote:
> >>> On Mon, 14 Sep 2020 at 20:12, Horia Geantă <horia.geanta@nxp.com> wrote:
> >>>>
> >>>> On 9/14/2020 7:28 PM, Ard Biesheuvel wrote:
> >>>>> On Mon, 14 Sep 2020 at 19:24, Horia Geantă <horia.geanta@nxp.com> wrote:
> >>>>>>
> >>>>>> On 9/9/2020 1:10 AM, Herbert Xu wrote:
> >>>>>>> On Tue, Sep 08, 2020 at 01:35:04PM +0300, Horia Geantă wrote:
> >>>>>>>>
> >>>>>>>>> Just go with the get_unaligned unconditionally.
> >>>>>>>>
> >>>>>>>> Won't this lead to sub-optimal code for ARMv7
> >>>>>>>> in case the IV is aligned?
> >>>>>>>
> >>>>>>> If this should be optimised in ARMv7 then that should be done
> >>>>>>> in get_unaligned itself and not open-coded.
> >>>>>>>
> >>>>>> I am not sure what's wrong with avoiding using the unaligned accessors
> >>>>>> in case data is aligned.
> >>>>>>
> >>>>>> Documentation/core-api/unaligned-memory-access.rst clearly states:
> >>>>>> These macros work for memory accesses of any length (not just 32 bits as
> >>>>>> in the examples above). Be aware that when compared to standard access of
> >>>>>> aligned memory, using these macros to access unaligned memory can be costly in
> >>>>>> terms of performance.
> >>>>>>
> >>>>>> So IMO it makes sense to use get_unaligned() only when needed.
> >>>>>> There are several cases of users doing this, e.g. siphash.
> >>>>>>
> >>>>>
> >>>>> For ARMv7 code, using the unaligned accessors unconditionally is fine,
> >>>>> and it will not affect performance.
> >>>>>
> >>>>> In general, when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is defined,
> >>>>> you can use the unaligned accessors. If it is not, it helps to have
> >>>>> different code paths.
> >>>>>
> >>>> arch/arm/include/asm/unaligned.h doesn't make use of
> >>>> linux/unaligned/access_ok.h, even if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> >>>> is set.
> >>>>
> >>>> I understand the comment in the file, however using get_unaligned()
> >>>> unconditionally takes away the opportunity to generate optimized code
> >>>> (using ldrd/ldm) when data is aligned.
> >>>>
> >>>
> >>> But the minimal optimization that is possible here (one ldrd/ldm
> >>> instruction vs two ldr instructions) is defeated by the fact that you
> >>> are using a conditional branch to select between the two. And this is
> >>> not even a hot path to begin with,
> >>>
> >> This is actually on the hot path (encrypt/decrypt callbacks),
> >> but you're probably right that the conditional branching is going to offset
> >> the optimized code.
> >>
> >
> > This is called once per XTS request, right? And you are saying the
> > extra cycle makes a difference?
> >
> Yes, once per request and no, not super-important.
>
> >> To avoid branching, code could be rewritten as:
> >>
> >> #ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> >> size = *(u64 *)(req->iv + (ivsize / 2));
> >> #else
> >> size = get_unaligned((u64 *)(req->iv + (ivsize / 2)));
> >> #endif
> >>
> >> however in this case ARMv7 would suffer since
> >> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y and
> >> ldrd/ldm for accesses not word-aligned are inefficient - lead to traps.
> >>
> >
> > CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS means 'just use the unaligned
> > accessors as they are basically free'. Casting a potentially
> > misaligned u8* to a u64* is not permitted by the C standard.
> >
> Seems that I misunderstood CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.
>

You're not the only one :-) I have been intending to get the
discussion going with the networking folks, who rely heavily on this
as well.

> Looking at its usage, e.g. ether_addr_equal() or __crypto_memneq_*(),
> I see similar casts of pointers possibly misaligned.
>

Yes, that is the confusion. CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
should indicate whether using the unaligned accessors is fine in all
cases, or whether you should find other ways to load the data more
efficiently (compare NET_IP_ALIGN, which shifts the entire IP header
so the 32-bit address field appear aligned in memory)

CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS does *not* mean that you can
simply cast any pointer to any type and dereference it, but the
meaning appears to have shifted this way over the years (and the
Documentation/ was even updated to this effect)

Pre-v6 ARM (and MIPS as well, IIRC) require byte sized accesses and
shift/or sequences to do unaligned accesses, whereas v6 and up simply
allows ldr from a misaligned address. So in the former case, you could
use cra_alignmask to align the data in memory, while the latter case
can ignore it. (arch/arm/crypto/aes-cipher-glue.c uses this as well)

> >> Would it be ok to use:
> >> #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && !defined(CONFIG_ARM)
> >> to workaround the ARMv7 inconsistency?
> >>
> >
> > No, please just use the get_unaligned() accessor.
> >
> Ok.
>
> Thanks,
> Horia
diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c
index 91feda5b63f6..ebf4dc87ca2e 100644
--- a/drivers/crypto/caam/caamalg.c
+++ b/drivers/crypto/caam/caamalg.c
@@ -57,6 +57,7 @@
 #include "key_gen.h"
 #include "caamalg_desc.h"
 #include <crypto/engine.h>
+#include <asm/unaligned.h>
 
 /*
  * crypto alg
@@ -114,10 +115,12 @@ struct caam_ctx {
 	struct alginfo adata;
 	struct alginfo cdata;
 	unsigned int authsize;
+	struct crypto_skcipher *fallback;
 };
 
 struct caam_skcipher_req_ctx {
 	struct skcipher_edesc *edesc;
+	struct skcipher_request fallback_req;
 };
 
 struct caam_aead_req_ctx {
@@ -830,12 +833,17 @@ static int xts_skcipher_setkey(struct crypto_skcipher *skcipher, const u8 *key,
 	struct caam_ctx *ctx = crypto_skcipher_ctx(skcipher);
 	struct device *jrdev = ctx->jrdev;
 	u32 *desc;
+	int err;
 
 	if (keylen != 2 * AES_MIN_KEY_SIZE && keylen != 2 * AES_MAX_KEY_SIZE) {
 		dev_dbg(jrdev, "key size mismatch\n");
 		return -EINVAL;
 	}
 
+	err = crypto_skcipher_setkey(ctx->fallback, key, keylen);
+	if (err)
+		return err;
+
 	ctx->cdata.keylen = keylen;
 	ctx->cdata.key_virt = key;
 	ctx->cdata.key_inline = true;
@@ -1755,6 +1763,20 @@ static int skcipher_do_one_req(struct crypto_engine *engine, void *areq)
 	return ret;
 }
 
+static bool xts_skcipher_ivsize(struct skcipher_request *req)
+{
+	struct crypto_skcipher *skcipher = crypto_skcipher_reqtfm(req);
+	unsigned int ivsize = crypto_skcipher_ivsize(skcipher);
+	u64 size = 0;
+
+	if (IS_ALIGNED((unsigned long)req->iv, __alignof__(u64)))
+		size = *(u64 *)(req->iv + (ivsize / 2));
+	else
+		size = get_unaligned((u64 *)(req->iv + (ivsize / 2)));
+
+	return !!size;
+}
+
 static inline int skcipher_crypt(struct skcipher_request *req, bool encrypt)
 {
 	struct skcipher_edesc *edesc;
@@ -1768,6 +1790,21 @@ static inline int skcipher_crypt(struct skcipher_request *req, bool encrypt)
 	if (!req->cryptlen)
 		return 0;
 
+	if (ctx->fallback && xts_skcipher_ivsize(req)) {
+		struct caam_skcipher_req_ctx *rctx = skcipher_request_ctx(req);
+
+		skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
+		skcipher_request_set_callback(&rctx->fallback_req,
+					      req->base.flags,
+					      req->base.complete,
+					      req->base.data);
+		skcipher_request_set_crypt(&rctx->fallback_req, req->src,
+					   req->dst, req->cryptlen, req->iv);
+
+		return encrypt ? crypto_skcipher_encrypt(&rctx->fallback_req) :
+				 crypto_skcipher_decrypt(&rctx->fallback_req);
+	}
+
 	/* allocate extended descriptor */
 	edesc = skcipher_edesc_alloc(req, DESC_JOB_IO_LEN * CAAM_CMD_SZ);
 	if (IS_ERR(edesc))
@@ -1905,6 +1942,7 @@ static struct caam_skcipher_alg driver_algs[] = {
 			.base = {
 				.cra_name = "xts(aes)",
 				.cra_driver_name = "xts-aes-caam",
+				.cra_flags = CRYPTO_ALG_NEED_FALLBACK,
 				.cra_blocksize = AES_BLOCK_SIZE,
 			},
 			.setkey = xts_skcipher_setkey,
@@ -3344,12 +3382,30 @@ static int caam_cra_init(struct crypto_skcipher *tfm)
 	struct caam_skcipher_alg *caam_alg =
 		container_of(alg, typeof(*caam_alg), skcipher);
 	struct caam_ctx *ctx = crypto_skcipher_ctx(tfm);
+	u32 alg_aai = caam_alg->caam.class1_alg_type & OP_ALG_AAI_MASK;
 
 	crypto_skcipher_set_reqsize(tfm, sizeof(struct caam_skcipher_req_ctx));
 
 	ctx->enginectx.op.do_one_request = skcipher_do_one_req;
 
-	return caam_init_common(crypto_skcipher_ctx(tfm), &caam_alg->caam,
+	if (alg_aai == OP_ALG_AAI_XTS) {
+		const char *tfm_name = crypto_tfm_alg_name(&tfm->base);
+		struct crypto_skcipher *fallback;
+
+		fallback = crypto_alloc_skcipher(tfm_name, 0,
+						 CRYPTO_ALG_NEED_FALLBACK);
+		if (IS_ERR(fallback)) {
+			pr_err("Failed to allocate %s fallback: %ld\n",
+			       tfm_name, PTR_ERR(fallback));
+			return PTR_ERR(fallback);
+		}
+
+		ctx->fallback = fallback;
+		crypto_skcipher_set_reqsize(tfm, sizeof(struct caam_skcipher_req_ctx) +
+					    crypto_skcipher_reqsize(fallback));
+	}
+
+	return caam_init_common(ctx, &caam_alg->caam,
 				false);
 }
 
@@ -3378,7 +3434,11 @@ static void caam_exit_common(struct caam_ctx *ctx)
 
 static void caam_cra_exit(struct crypto_skcipher *tfm)
 {
-	caam_exit_common(crypto_skcipher_ctx(tfm));
+	struct caam_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+	if (ctx->fallback)
+		crypto_free_skcipher(ctx->fallback);
+	caam_exit_common(ctx);
 }
 
 static void caam_aead_exit(struct crypto_aead *tfm)
@@ -3412,8 +3472,8 @@ static void caam_skcipher_alg_init(struct caam_skcipher_alg *t_alg)
 	alg->base.cra_module = THIS_MODULE;
 	alg->base.cra_priority = CAAM_CRA_PRIORITY;
 	alg->base.cra_ctxsize = sizeof(struct caam_ctx);
-	alg->base.cra_flags = CRYPTO_ALG_ASYNC | CRYPTO_ALG_ALLOCATES_MEMORY |
-			      CRYPTO_ALG_KERN_DRIVER_ONLY;
+	alg->base.cra_flags |= (CRYPTO_ALG_ASYNC | CRYPTO_ALG_ALLOCATES_MEMORY |
+				CRYPTO_ALG_KERN_DRIVER_ONLY);
 
 	alg->init = caam_cra_init;
 	alg->exit = caam_cra_exit;