diff mbox series

[net] net: macsec: fix ingress frame ordering

Message ID 20200424223433.955775-1-scott@scottdial.com
State New
Headers show
Series [net] net: macsec: fix ingress frame ordering | expand

Commit Message

Scott Dial April 24, 2020, 10:34 p.m. UTC
MACsec decryption always occurs in a softirq context. Since
the FPU may not be usable in the softirq context, this call to
decrypt may be scheduled on the cryptd work queue. The cryptd
work queue does not provide ordering guarantees. Therefore,
preserving order requires masking out ASYNC implementations
of gcm(aes).

On recent x86 architectures, the request for gcm(aes) will
select the generic-gcm-aesni driver from the aesni_intel
module. However, this implementation requires the FPU, so it
cannot preserve frame ordering. With this change, such a
system would select a hybrid software/hardware solution, such
as gcm_base(ctr(aes-aesni),ghash-generic). It's notable
that the aes-aesni implementation will fallback to the
aes-asm implementation if the FPU is unavailable.

By using a synchronous version of gcm(aes), the decryption
will complete before returning from crypto_aead_decrypt().
Therefore, the macsec_decrypt_done() callback will be called
before returning from macsec_decrypt(). Thus, order will be
preserved with calls to macsec_post_decrypt().

While it's presumable that the pure AES-NI version of gcm(aes)
is more performant, the hybrid solution is capable of gigabit
speeds on modest hardware. Regardless, preserving the order
of frames is paramount for many network protocols (e.g.,
triggering TCP retries). Within the MACsec driver itself, the
replay protection is tripped by the out-of-order frames,
causing frames to be dropped.

This bug has been present in this code since it was added in
v4.6, however it may have went unnoticed since not all CPUs
have gcm(aes) offload available. Additionally, the bug
manifests as occasional out-of-order packets that are easily
misattributed to other network phenomena.

When this code was added in v4.6, the crypto/gcm.c code did
not restrict selection of the ghash function based on the
ASYNC flag. For instance, x86 CPUs with the PCLMULQDQ CPU
feature would select ghash-clmulni driver is selected instead
of ghash-generic, which will submit to the cryptd work queue
if the FPU is busy. However, this issue was was corrected in
v4.8 by commit b30bdfa86431afbafe15284a3ad5ac19b49b88e3, and
was backported all the way back to the v3.14 stable branch,
so this patch should be applicable all the way back to the
v4.6 stable branch.

Signed-off-by: Scott Dial <scott@scottdial.com>
---
 drivers/net/macsec.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
diff mbox series

Patch

diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index a183250ff66a..bce8b8fde400 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -1305,7 +1305,8 @@  static struct crypto_aead *macsec_alloc_tfm(char *key, int key_len, int icv_len)
 	struct crypto_aead *tfm;
 	int ret;
 
-	tfm = crypto_alloc_aead("gcm(aes)", 0, 0);
+	/* Pick a sync gcm(aes) cipher to ensure order is preserved. */
+	tfm = crypto_alloc_aead("gcm(aes)", 0, CRYPTO_ALG_ASYNC);
 
 	if (IS_ERR(tfm))
 		return tfm;