[net-next,v2,08/10] crypto: af_alg: Support MSG_SPLICE_PAGES

Message ID 20230530141635.136968-9-dhowells@redhat.com
State New
Series crypto, splice, net: Make AF_ALG handle sendmsg(MSG_SPLICE_PAGES)

Commit Message

David Howells May 30, 2023, 2:16 p.m. UTC
Make AF_ALG sendmsg() support MSG_SPLICE_PAGES.  This causes pages to be
spliced from the source iterator.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-crypto@vger.kernel.org
cc: netdev@vger.kernel.org
---
 crypto/af_alg.c         | 28 ++++++++++++++++++++++++++--
 crypto/algif_aead.c     | 22 +++++++++++-----------
 crypto/algif_skcipher.c |  8 ++++----
 3 files changed, 41 insertions(+), 17 deletions(-)

Comments

Paolo Abeni June 1, 2023, 9:49 a.m. UTC | #1
On Tue, 2023-05-30 at 15:16 +0100, David Howells wrote:
> Make AF_ALG sendmsg() support MSG_SPLICE_PAGES.  This causes pages to be
> spliced from the source iterator.
> 
> This allows ->sendpage() to be replaced by something that can handle
> multiple multipage folios in a single transaction.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Herbert Xu <herbert@gondor.apana.org.au>
> cc: "David S. Miller" <davem@davemloft.net>
> cc: Eric Dumazet <edumazet@google.com>
> cc: Jakub Kicinski <kuba@kernel.org>
> cc: Paolo Abeni <pabeni@redhat.com>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Matthew Wilcox <willy@infradead.org>
> cc: linux-crypto@vger.kernel.org
> cc: netdev@vger.kernel.org
> ---
>  crypto/af_alg.c         | 28 ++++++++++++++++++++++++++--
>  crypto/algif_aead.c     | 22 +++++++++++-----------
>  crypto/algif_skcipher.c |  8 ++++----
>  3 files changed, 41 insertions(+), 17 deletions(-)
> 
> diff --git a/crypto/af_alg.c b/crypto/af_alg.c
> index fd56ccff6fed..62f4205d42e3 100644
> --- a/crypto/af_alg.c
> +++ b/crypto/af_alg.c
> @@ -940,6 +940,10 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
>  	bool init = false;
>  	int err = 0;
>  
> +	if ((msg->msg_flags & MSG_SPLICE_PAGES) &&
> +	    !iov_iter_is_bvec(&msg->msg_iter))
> +		return -EINVAL;
> +
>  	if (msg->msg_controllen) {
>  		err = af_alg_cmsg_send(msg, &con);
>  		if (err)
> @@ -985,7 +989,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
>  	while (size) {
>  		struct scatterlist *sg;
>  		size_t len = size;
> -		size_t plen;
> +		ssize_t plen;
>  
>  		/* use the existing memory in an allocated page */
>  		if (ctx->merge) {
> @@ -1030,7 +1034,27 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
>  		if (sgl->cur)
>  			sg_unmark_end(sg + sgl->cur - 1);
>  
> -		if (1 /* TODO check MSG_SPLICE_PAGES */) {
> +		if (msg->msg_flags & MSG_SPLICE_PAGES) {
> +			struct sg_table sgtable = {
> +				.sgl		= sg,
> +				.nents		= sgl->cur,
> +				.orig_nents	= sgl->cur,
> +			};
> +
> +			plen = extract_iter_to_sg(&msg->msg_iter, len, &sgtable,
> +						  MAX_SGL_ENTS, 0);

It looks like the above expects/supports only ITER_BVEC iterators; what
about adding a WARN_ON_ONCE(<other iov type>)?

Also, I'm keeping this series a bit more in pw to allow Herbert or
others to have a look.

Cheers,

Paolo
David Howells June 1, 2023, 11:35 a.m. UTC | #2
Paolo Abeni <pabeni@redhat.com> wrote:

> > +	if ((msg->msg_flags & MSG_SPLICE_PAGES) &&
> > +	    !iov_iter_is_bvec(&msg->msg_iter))
> > +		return -EINVAL;
> > +
> ...
> It looks like the above expects/supports only ITER_BVEC iterators; what
> about adding a WARN_ON_ONCE(<other iov type>)?

Meh.  I relaxed that requirement as I'm now using tools to extract stuff from
any iterator (extract_iter_to_sg() in this case) rather than walking the
bvec[] directly.  I forgot to remove the check from af_alg.  I can add an
extra patch to remove it.  Also, it probably doesn't matter for AF_ALG since
that's only likely to be called from userspace, either directly (which will
not set MSG_SPLICE_PAGES) or via splice (which will pass a BVEC).  Internal
kernel code will use crypto API directly.

> Also, I'm keeping this series a bit more in pw to allow Herbert or
> others to have a look.

Thanks.

David
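
For reference, the two userspace entry points mentioned above (a direct
sendmsg(), which does not set MSG_SPLICE_PAGES, and splice() from a pipe)
can be sketched roughly as follows. This is only an illustration and not
part of the patch: the helper name alg_cbc_encrypt() and its parameters are
hypothetical, cbc(aes) is an arbitrary choice of algorithm, and the sketch
assumes a kernel with AF_ALG skcipher support and a payload small enough to
fit in a pipe buffer.

```c
#define _GNU_SOURCE		/* for splice() */
#include <errno.h>
#include <fcntl.h>
#include <linux/if_alg.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef SOL_ALG
#define SOL_ALG 279
#endif

#define IV_LEN 16

/*
 * Hypothetical helper: encrypt len bytes (a multiple of 16, small enough
 * to fit in a pipe buffer) with cbc(aes) through AF_ALG.
 * use_splice == 0: data is passed to sendmsg() directly.
 * use_splice != 0: data is written to a pipe and spliced into the socket.
 * Returns 0 on success or a negative errno value.
 */
static int alg_cbc_encrypt(const unsigned char key[16],
			   const unsigned char iv[IV_LEN],
			   const unsigned char *in, unsigned char *out,
			   size_t len, int use_splice)
{
	struct sockaddr_alg sa = {
		.salg_family = AF_ALG,
		.salg_type   = "skcipher",
		.salg_name   = "cbc(aes)",
	};
	union {
		struct cmsghdr align;
		char buf[CMSG_SPACE(sizeof(uint32_t)) +
			 CMSG_SPACE(sizeof(struct af_alg_iv) + IV_LEN)];
	} ctl;
	struct iovec iov = { .iov_base = (void *)in, .iov_len = len };
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;
	struct af_alg_iv *aiv;
	uint32_t dir = ALG_OP_ENCRYPT;
	int tfm, op = -1, p[2] = { -1, -1 }, err = 0;

	errno = 0;
	tfm = socket(AF_ALG, SOCK_SEQPACKET, 0);
	if (tfm < 0)
		return -errno;
	if (bind(tfm, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
	    setsockopt(tfm, SOL_ALG, ALG_SET_KEY, key, 16) < 0 ||
	    (op = accept(tfm, NULL, NULL)) < 0)
		goto fail;

	/* Control messages: operation type and IV. */
	memset(&ctl, 0, sizeof(ctl));
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type = ALG_SET_OP;
	cmsg->cmsg_len = CMSG_LEN(sizeof(dir));
	memcpy(CMSG_DATA(cmsg), &dir, sizeof(dir));
	cmsg = CMSG_NXTHDR(&msg, cmsg);
	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type = ALG_SET_IV;
	cmsg->cmsg_len = CMSG_LEN(sizeof(*aiv) + IV_LEN);
	aiv = (struct af_alg_iv *)CMSG_DATA(cmsg);
	aiv->ivlen = IV_LEN;
	memcpy(aiv->iv, iv, IV_LEN);

	if (!use_splice) {
		/* Direct path: userspace does not set MSG_SPLICE_PAGES. */
		msg.msg_iov = &iov;
		msg.msg_iovlen = 1;
		if (sendmsg(op, &msg, 0) != (ssize_t)len)
			goto fail;
	} else {
		/* Parameters first (MSG_MORE), then data via pipe + splice(). */
		if (sendmsg(op, &msg, MSG_MORE) < 0 || pipe(p) < 0 ||
		    write(p[1], in, len) != (ssize_t)len ||
		    splice(p[0], NULL, op, NULL, len, 0) != (ssize_t)len)
			goto fail;
	}
	if (read(op, out, len) != (ssize_t)len)
		goto fail;
	goto out;
fail:
	err = errno ? -errno : -EIO;
out:
	if (p[0] >= 0)
		close(p[0]);
	if (p[1] >= 0)
		close(p[1]);
	if (op >= 0)
		close(op);
	close(tfm);
	return err;
}
```

Calling the helper once with use_splice set to 0 and once with it set to 1,
using the same key, IV and input, should yield the same ciphertext on a
kernel where both paths are available. On a system without AF_ALG the
helper just returns a negative errno value.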
Paolo Abeni June 6, 2023, 8:32 a.m. UTC | #3
On Thu, 2023-06-01 at 12:35 +0100, David Howells wrote:
> Paolo Abeni <pabeni@redhat.com> wrote:
> 
> > > +	if ((msg->msg_flags & MSG_SPLICE_PAGES) &&
> > > +	    !iov_iter_is_bvec(&msg->msg_iter))
> > > +		return -EINVAL;
> > > +
> > ...
> > It looks like the above expects/supports only ITER_BVEC iterators; what
> > about adding a WARN_ON_ONCE(<other iov type>)?
> 
> Meh.  I relaxed that requirement as I'm now using tools to extract stuff from
> any iterator (extract_iter_to_sg() in this case) rather than walking the
> bvec[] directly.  I forgot to remove the check from af_alg.  I can add an
> extra patch to remove it.  Also, it probably doesn't matter for AF_ALG since
> that's only likely to be called from userspace, either directly (which will
> not set MSG_SPLICE_PAGES) or via splice (which will pass a BVEC).  Internal
> kernel code will use crypto API directly.

Thank you for the clarification, I got lost a bit. The patch LGTM as
is.

> 
> > Also, I'm keeping this series a bit more in pw to allow Herbert or
> > others to have a look.

@Herbert, the series LGTM, I think we should apply it. If you have any
concerns, please voice them soon!

Thanks,

Paolo

Patch

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index fd56ccff6fed..62f4205d42e3 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -940,6 +940,10 @@  int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	bool init = false;
 	int err = 0;
 
+	if ((msg->msg_flags & MSG_SPLICE_PAGES) &&
+	    !iov_iter_is_bvec(&msg->msg_iter))
+		return -EINVAL;
+
 	if (msg->msg_controllen) {
 		err = af_alg_cmsg_send(msg, &con);
 		if (err)
@@ -985,7 +989,7 @@  int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	while (size) {
 		struct scatterlist *sg;
 		size_t len = size;
-		size_t plen;
+		ssize_t plen;
 
 		/* use the existing memory in an allocated page */
 		if (ctx->merge) {
@@ -1030,7 +1034,27 @@  int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		if (sgl->cur)
 			sg_unmark_end(sg + sgl->cur - 1);
 
-		if (1 /* TODO check MSG_SPLICE_PAGES */) {
+		if (msg->msg_flags & MSG_SPLICE_PAGES) {
+			struct sg_table sgtable = {
+				.sgl		= sg,
+				.nents		= sgl->cur,
+				.orig_nents	= sgl->cur,
+			};
+
+			plen = extract_iter_to_sg(&msg->msg_iter, len, &sgtable,
+						  MAX_SGL_ENTS, 0);
+			if (plen < 0) {
+				err = plen;
+				goto unlock;
+			}
+
+			for (; sgl->cur < sgtable.nents; sgl->cur++)
+				get_page(sg_page(&sg[sgl->cur]));
+			len -= plen;
+			ctx->used += plen;
+			copied += plen;
+			size -= plen;
+		} else {
 			do {
 				struct page *pg;
 				unsigned int i = sgl->cur;
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 829878025dba..35bfa283748d 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -9,8 +9,8 @@ 
  * The following concept of the memory management is used:
  *
  * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
- * filled by user space with the data submitted via sendpage/sendmsg. Filling
- * up the TX SGL does not cause a crypto operation -- the data will only be
+ * filled by user space with the data submitted via sendpage. Filling up
+ * the TX SGL does not cause a crypto operation -- the data will only be
  * tracked by the kernel. Upon receipt of one recvmsg call, the caller must
  * provide a buffer which is tracked with the RX SGL.
  *
@@ -113,19 +113,19 @@  static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
 	}
 
 	/*
-	 * Data length provided by caller via sendmsg/sendpage that has not
-	 * yet been processed.
+	 * Data length provided by caller via sendmsg that has not yet been
+	 * processed.
 	 */
 	used = ctx->used;
 
 	/*
-	 * Make sure sufficient data is present -- note, the same check is
-	 * also present in sendmsg/sendpage. The checks in sendpage/sendmsg
-	 * shall provide an information to the data sender that something is
-	 * wrong, but they are irrelevant to maintain the kernel integrity.
-	 * We need this check here too in case user space decides to not honor
-	 * the error message in sendmsg/sendpage and still call recvmsg. This
-	 * check here protects the kernel integrity.
+	 * Make sure sufficient data is present -- note, the same check is also
+	 * present in sendmsg. The checks in sendmsg shall provide an
+	 * information to the data sender that something is wrong, but they are
+	 * irrelevant to maintain the kernel integrity.  We need this check
+	 * here too in case user space decides to not honor the error message
+	 * in sendmsg and still call recvmsg. This check here protects the
+	 * kernel integrity.
 	 */
 	if (!aead_sufficient_data(sk))
 		return -EINVAL;
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index a251cd6bd5b9..b1f321b9f846 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -9,10 +9,10 @@ 
  * The following concept of the memory management is used:
  *
  * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
- * filled by user space with the data submitted via sendpage/sendmsg. Filling
- * up the TX SGL does not cause a crypto operation -- the data will only be
- * tracked by the kernel. Upon receipt of one recvmsg call, the caller must
- * provide a buffer which is tracked with the RX SGL.
+ * filled by user space with the data submitted via sendmsg. Filling up the TX
+ * SGL does not cause a crypto operation -- the data will only be tracked by
+ * the kernel. Upon receipt of one recvmsg call, the caller must provide a
+ * buffer which is tracked with the RX SGL.
  *
  * During the processing of the recvmsg operation, the cipher request is
  * allocated and prepared. As part of the recvmsg operation, the processed
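
A note on the size_t to ssize_t change for plen in af_alg_sendmsg(): the
same hunk adds an "if (plen < 0)" test on the value returned by
extract_iter_to_sg(), which reports failure as a negative number. With an
unsigned plen that error would wrap to a very large value and be subtracted
from len as if it were a byte count. The userspace sketch below shows the
effect in isolation. extract_stub() and the two remaining_*() helpers are
hypothetical stand-ins written for this illustration; they are not kernel
code.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for an extraction helper: bytes taken, or -errno. */
static ssize_t extract_stub(size_t want, int fail)
{
	return fail ? -EFAULT : (ssize_t)want;
}

/* Signed plen: the error is visible and the remaining length is untouched. */
static size_t remaining_signed(size_t len, int fail)
{
	ssize_t plen = extract_stub(len, fail);

	if (plen < 0)
		return len;	/* a real caller would bail out with err = plen */
	return len - (size_t)plen;
}

/* Unsigned plen: -EFAULT wraps around, so len grows instead of shrinking. */
static size_t remaining_unsigned(size_t len, int fail)
{
	size_t plen = (size_t)extract_stub(len, fail);

	return len - plen;	/* on failure this equals len + EFAULT */
}
```

On the success path both helpers return 0 for any len. On the failure path
remaining_signed() leaves len alone, while remaining_unsigned() returns
len + EFAULT because of the unsigned wraparound.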