[v2,0/3] crypto/arm64: aes-ce-gcm - switch to 2-way aggregation

Message ID 20180730210642.25180-1-ard.biesheuvel@linaro.org

Message

Ard Biesheuvel July 30, 2018, 9:06 p.m. UTC
Update the combined AES-GCM AEAD implementation to process two blocks
at a time, allowing us to switch to a faster version of the GHASH
implementation.

Note that this does not update the core GHASH transform, only the
combined AES-GCM AEAD mode. GHASH is mostly used with AES anyway, and
the ARMv8 architecture mandates support for AES instructions if
64-bit polynomial multiplication instructions are implemented. This
means that most users of the pmull.p64-based GHASH routines are better
off using the combined AES-GCM code anyway. Users of the pmull.p8-based
GHASH implementation are unlikely to benefit substantially from aggregation,
given that the multiplication phase is much more dominant in this case
(and it is only the reduction phase that is amortized over multiple
blocks).
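
To illustrate the amortization argument, here is a minimal userspace sketch
of 2-way aggregation. It is not the kernel code: it assumes plain polynomial
bit order (not GCM's reflected order), replaces pmull with a bit-by-bit
carry-less multiply, and all names (gf128, clmul128, ghash_serial2,
ghash_2way, reductions) are hypothetical. The serial form reduces after every
block, while the aggregated form XORs two unreduced products and reduces once,
using a precomputed H^2.

```c
#include <assert.h>
#include <stdint.h>

/* Bit i of lo (or hi) is the coefficient of x^i (or x^(64+i)). */
typedef struct { uint64_t lo, hi; } gf128;
/* Unreduced 256-bit product, w[0] least significant. */
typedef struct { uint64_t w[4]; } gf256;

unsigned reductions; /* counts calls to reduce(), for illustration only */

/* Carry-less 128x128 -> 256 bit multiply, no reduction. */
static gf256 clmul128(gf128 a, gf128 b)
{
	gf256 r = { { 0, 0, 0, 0 } };
	uint64_t bw[2] = { b.lo, b.hi };

	for (int i = 0; i < 128; i++) {
		int word = i / 64, sh = i % 64;

		if (!((bw[word] >> sh) & 1))
			continue;
		/* XOR (a << i) into the accumulator. */
		r.w[word]     ^= a.lo << sh;
		r.w[word + 1] ^= a.hi << sh;
		if (sh) {
			r.w[word + 1] ^= a.lo >> (64 - sh);
			r.w[word + 2] ^= a.hi >> (64 - sh);
		}
	}
	return r;
}

/* Reduce modulo x^128 + x^7 + x^2 + x + 1, one bit at a time. */
static gf128 reduce(gf256 p)
{
	static const int taps[4] = { 0, 1, 2, 7 };

	reductions++;
	for (int i = 255; i >= 128; i--) {
		if (!((p.w[i / 64] >> (i % 64)) & 1))
			continue;
		p.w[i / 64] ^= 1ULL << (i % 64);
		for (int t = 0; t < 4; t++) {
			int j = i - 128 + taps[t];

			p.w[j / 64] ^= 1ULL << (j % 64);
		}
	}
	return (gf128){ p.w[0], p.w[1] };
}

static gf128 gf128_xor(gf128 a, gf128 b)
{
	return (gf128){ a.lo ^ b.lo, a.hi ^ b.hi };
}

gf128 gf128_mul(gf128 a, gf128 b)
{
	return reduce(clmul128(a, b));
}

/* Serial: X = ((X ^ C1) * H ^ C2) * H, two reductions. */
gf128 ghash_serial2(gf128 x, gf128 c1, gf128 c2, gf128 h)
{
	x = gf128_mul(gf128_xor(x, c1), h);
	return gf128_mul(gf128_xor(x, c2), h);
}

/* 2-way: X = (X ^ C1) * H^2 ^ C2 * H, one shared reduction. */
gf128 ghash_2way(gf128 x, gf128 c1, gf128 c2, gf128 h, gf128 h2)
{
	gf256 p = clmul128(gf128_xor(x, c1), h2);
	gf256 q = clmul128(c2, h);

	for (int i = 0; i < 4; i++)
		p.w[i] ^= q.w[i];
	return reduce(p);
}
```

Both forms perform two multiplications per pair of blocks; only the number of
reductions differs, which is why the gain shrinks when multiplication
dominates the cost.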

Performance numbers for Cortex-A53 can be found after patches #2 and #3.

Changes since v1:
- rebase to take the changes in patch 'crypto: arm64 - revert NEON yield for
  fast AEAD implementations' which I sent out on July 29th
- add a patch to reduce the number of invocations of kernel_neon_begin()
  and kernel_neon_end() on the common path

Ard Biesheuvel (3):
  crypto/arm64: aes-ce-gcm - operate on two input blocks at a time
  crypto/arm64: aes-ce-gcm - implement 2-way aggregation
  crypto: arm64/aes-ce-gcm - don't reload key schedule if avoidable

 arch/arm64/crypto/ghash-ce-core.S | 136 +++++++++------
 arch/arm64/crypto/ghash-ce-glue.c | 176 ++++++++++++--------
 2 files changed, 198 insertions(+), 114 deletions(-)

-- 
2.18.0

Comments

Herbert Xu Aug. 3, 2018, 3:47 p.m. UTC | #1
On Mon, Jul 30, 2018 at 11:06:39PM +0200, Ard Biesheuvel wrote:
> Update the combined AES-GCM AEAD implementation to process two blocks
> at a time, allowing us to switch to a faster version of the GHASH
> implementation.
> 
> Note that this does not update the core GHASH transform, only the
> combined AES-GCM AEAD mode. GHASH is mostly used with AES anyway, and
> the ARMv8 architecture mandates support for AES instructions if
> 64-bit polynomial multiplication instructions are implemented. This
> means that most users of the pmull.p64-based GHASH routines are better
> off using the combined AES-GCM code anyway. Users of the pmull.p8-based
> GHASH implementation are unlikely to benefit substantially from aggregation,
> given that the multiplication phase is much more dominant in this case
> (and it is only the reduction phase that is amortized over multiple
> blocks).
> 
> Performance numbers for Cortex-A53 can be found after patches #2 and #3.
> 
> Changes since v1:
> - rebase to take the changes in patch 'crypto: arm64 - revert NEON yield for
>   fast AEAD implementations' which I sent out on July 29th
> - add a patch to reduce the number of invocations of kernel_neon_begin()
>   and kernel_neon_end() on the common path
> 
> Ard Biesheuvel (3):
>   crypto/arm64: aes-ce-gcm - operate on two input blocks at a time
>   crypto/arm64: aes-ce-gcm - implement 2-way aggregation
>   crypto: arm64/aes-ce-gcm - don't reload key schedule if avoidable
> 
>  arch/arm64/crypto/ghash-ce-core.S | 136 +++++++++------
>  arch/arm64/crypto/ghash-ce-glue.c | 176 ++++++++++++--------
>  2 files changed, 198 insertions(+), 114 deletions(-)


All applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Ard Biesheuvel Aug. 3, 2018, 4:46 p.m. UTC | #2
On 3 August 2018 at 17:47, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Mon, Jul 30, 2018 at 11:06:39PM +0200, Ard Biesheuvel wrote:
>> Update the combined AES-GCM AEAD implementation to process two blocks
>> at a time, allowing us to switch to a faster version of the GHASH
>> implementation.
>>
>> Note that this does not update the core GHASH transform, only the
>> combined AES-GCM AEAD mode. GHASH is mostly used with AES anyway, and
>> the ARMv8 architecture mandates support for AES instructions if
>> 64-bit polynomial multiplication instructions are implemented. This
>> means that most users of the pmull.p64-based GHASH routines are better
>> off using the combined AES-GCM code anyway. Users of the pmull.p8-based
>> GHASH implementation are unlikely to benefit substantially from aggregation,
>> given that the multiplication phase is much more dominant in this case
>> (and it is only the reduction phase that is amortized over multiple
>> blocks).
>>
>> Performance numbers for Cortex-A53 can be found after patches #2 and #3.
>>
>> Changes since v1:
>> - rebase to take the changes in patch 'crypto: arm64 - revert NEON yield for
>>   fast AEAD implementations' which I sent out on July 29th
>> - add a patch to reduce the number of invocations of kernel_neon_begin()
>>   and kernel_neon_end() on the common path
>>
>> Ard Biesheuvel (3):
>>   crypto/arm64: aes-ce-gcm - operate on two input blocks at a time
>>   crypto/arm64: aes-ce-gcm - implement 2-way aggregation
>>   crypto: arm64/aes-ce-gcm - don't reload key schedule if avoidable
>>
>>  arch/arm64/crypto/ghash-ce-core.S | 136 +++++++++------
>>  arch/arm64/crypto/ghash-ce-glue.c | 176 ++++++++++++--------
>>  2 files changed, 198 insertions(+), 114 deletions(-)
>
> All applied.  Thanks.


Thanks Herbert.