diff mbox series

[v4,01/10] crypto: acomp - Define two new interfaces for compress/decompress batching.

Message ID 20241123070127.332773-2-kanchana.p.sridhar@intel.com
State New
Headers show
Series zswap IAA compress batching | expand

Commit Message

Kanchana P Sridhar Nov. 23, 2024, 7:01 a.m. UTC
This commit adds batch_compress() and batch_decompress() interfaces to:

  struct acomp_alg
  struct crypto_acomp

This allows the iaa_crypto Intel IAA driver to register implementations for
the batch_compress() and batch_decompress() API, that can subsequently be
invoked from the kernel zswap/zram swap modules to compress/decompress
up to CRYPTO_BATCH_SIZE (i.e. 8) pages in parallel in the IAA hardware
accelerator to improve swapout/swapin performance.

A new helper function acomp_has_async_batching() can be invoked to query
if a crypto_acomp has registered these batch_compress and batch_decompress
interfaces.

Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
---
 crypto/acompress.c                  |  2 +
 include/crypto/acompress.h          | 91 +++++++++++++++++++++++++++++
 include/crypto/internal/acompress.h | 16 +++++
 3 files changed, 109 insertions(+)

Comments

Herbert Xu Nov. 25, 2024, 9:35 a.m. UTC | #1
On Fri, Nov 22, 2024 at 11:01:18PM -0800, Kanchana P Sridhar wrote:
> This commit adds batch_compress() and batch_decompress() interfaces to:
> 
>   struct acomp_alg
>   struct crypto_acomp
> 
> This allows the iaa_crypto Intel IAA driver to register implementations for
> the batch_compress() and batch_decompress() API, that can subsequently be
> invoked from the kernel zswap/zram swap modules to compress/decompress
> up to CRYPTO_BATCH_SIZE (i.e. 8) pages in parallel in the IAA hardware
> accelerator to improve swapout/swapin performance.
> 
> A new helper function acomp_has_async_batching() can be invoked to query
> if a crypto_acomp has registered these batch_compress and batch_decompress
> interfaces.
> 
> Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
> ---
>  crypto/acompress.c                  |  2 +
>  include/crypto/acompress.h          | 91 +++++++++++++++++++++++++++++
>  include/crypto/internal/acompress.h | 16 +++++
>  3 files changed, 109 insertions(+)

This should be rebased on top of my request chaining patch:

https://lore.kernel.org/linux-crypto/677614fbdc70b31df2e26483c8d2cd1510c8af91.1730021644.git.herbert@gondor.apana.org.au/

Request chaining provides a perfect fit for batching.

Cheers,
Kanchana P Sridhar Nov. 25, 2024, 8:03 p.m. UTC | #2
> -----Original Message-----
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Sent: Monday, November 25, 2024 1:35 AM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> hannes@cmpxchg.org; yosryahmed@google.com; nphamcs@gmail.com;
> chengming.zhou@linux.dev; usamaarif642@gmail.com;
> ryan.roberts@arm.com; ying.huang@intel.com; 21cnbao@gmail.com;
> akpm@linux-foundation.org; linux-crypto@vger.kernel.org;
> davem@davemloft.net; clabbe@baylibre.com; ardb@kernel.org;
> ebiggers@google.com; surenb@google.com; Accardi, Kristen C
> <kristen.c.accardi@intel.com>; Feghali, Wajdi K <wajdi.k.feghali@intel.com>;
> Gopal, Vinodh <vinodh.gopal@intel.com>
> Subject: Re: [PATCH v4 01/10] crypto: acomp - Define two new interfaces for
> compress/decompress batching.
> 
> On Fri, Nov 22, 2024 at 11:01:18PM -0800, Kanchana P Sridhar wrote:
> > This commit adds batch_compress() and batch_decompress() interfaces to:
> >
> >   struct acomp_alg
> >   struct crypto_acomp
> >
> > This allows the iaa_crypto Intel IAA driver to register implementations for
> > the batch_compress() and batch_decompress() API, that can subsequently
> be
> > invoked from the kernel zswap/zram swap modules to
> compress/decompress
> > up to CRYPTO_BATCH_SIZE (i.e. 8) pages in parallel in the IAA hardware
> > accelerator to improve swapout/swapin performance.
> >
> > A new helper function acomp_has_async_batching() can be invoked to
> query
> > if a crypto_acomp has registered these batch_compress and
> batch_decompress
> > interfaces.
> >
> > Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
> > ---
> >  crypto/acompress.c                  |  2 +
> >  include/crypto/acompress.h          | 91 +++++++++++++++++++++++++++++
> >  include/crypto/internal/acompress.h | 16 +++++
> >  3 files changed, 109 insertions(+)
> 
> This should be rebased on top of my request chaining patch:
> 
> https://lore.kernel.org/linux-
> crypto/677614fbdc70b31df2e26483c8d2cd1510c8af91.1730021644.git.herb
> ert@gondor.apana.org.au/
> 
> Request chaining provides a perfect fit for batching.

Thanks Herbert. I am working on integrating the request chaining with
the iaa_crypto driver, expecting to have this ready for v5.

Thanks,
Kanchana

> 
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Kanchana P Sridhar Nov. 26, 2024, 2:13 a.m. UTC | #3
Hi Herbert,

> -----Original Message-----
> From: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
> Sent: Monday, November 25, 2024 12:03 PM
> To: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> hannes@cmpxchg.org; yosryahmed@google.com; nphamcs@gmail.com;
> chengming.zhou@linux.dev; usamaarif642@gmail.com;
> ryan.roberts@arm.com; ying.huang@intel.com; 21cnbao@gmail.com;
> akpm@linux-foundation.org; linux-crypto@vger.kernel.org;
> davem@davemloft.net; clabbe@baylibre.com; ardb@kernel.org;
> ebiggers@google.com; surenb@google.com; Accardi, Kristen C
> <kristen.c.accardi@intel.com>; Feghali, Wajdi K <wajdi.k.feghali@intel.com>;
> Gopal, Vinodh <vinodh.gopal@intel.com>; Sridhar, Kanchana P
> <kanchana.p.sridhar@intel.com>
> Subject: RE: [PATCH v4 01/10] crypto: acomp - Define two new interfaces for
> compress/decompress batching.
> 
> 
> > -----Original Message-----
> > From: Herbert Xu <herbert@gondor.apana.org.au>
> > Sent: Monday, November 25, 2024 1:35 AM
> > To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
> > Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> > hannes@cmpxchg.org; yosryahmed@google.com; nphamcs@gmail.com;
> > chengming.zhou@linux.dev; usamaarif642@gmail.com;
> > ryan.roberts@arm.com; ying.huang@intel.com; 21cnbao@gmail.com;
> > akpm@linux-foundation.org; linux-crypto@vger.kernel.org;
> > davem@davemloft.net; clabbe@baylibre.com; ardb@kernel.org;
> > ebiggers@google.com; surenb@google.com; Accardi, Kristen C
> > <kristen.c.accardi@intel.com>; Feghali, Wajdi K
> <wajdi.k.feghali@intel.com>;
> > Gopal, Vinodh <vinodh.gopal@intel.com>
> > Subject: Re: [PATCH v4 01/10] crypto: acomp - Define two new interfaces
> for
> > compress/decompress batching.
> >
> > On Fri, Nov 22, 2024 at 11:01:18PM -0800, Kanchana P Sridhar wrote:
> > > This commit adds batch_compress() and batch_decompress() interfaces
> to:
> > >
> > >   struct acomp_alg
> > >   struct crypto_acomp
> > >
> > > This allows the iaa_crypto Intel IAA driver to register implementations for
> > > the batch_compress() and batch_decompress() API, that can subsequently
> > be
> > > invoked from the kernel zswap/zram swap modules to
> > compress/decompress
> > > up to CRYPTO_BATCH_SIZE (i.e. 8) pages in parallel in the IAA hardware
> > > accelerator to improve swapout/swapin performance.
> > >
> > > A new helper function acomp_has_async_batching() can be invoked to
> > query
> > > if a crypto_acomp has registered these batch_compress and
> > batch_decompress
> > > interfaces.
> > >
> > > Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
> > > ---
> > >  crypto/acompress.c                  |  2 +
> > >  include/crypto/acompress.h          | 91
> +++++++++++++++++++++++++++++
> > >  include/crypto/internal/acompress.h | 16 +++++
> > >  3 files changed, 109 insertions(+)
> >
> > This should be rebased on top of my request chaining patch:
> >
> > https://lore.kernel.org/linux-
> >
> crypto/677614fbdc70b31df2e26483c8d2cd1510c8af91.1730021644.git.herb
> > ert@gondor.apana.org.au/
> >
> > Request chaining provides a perfect fit for batching.

I wanted to make sure I understand your suggestion: Are you suggesting we
implement request chaining for "struct acomp_req" similar to how this is being
done for "struct ahash_request" in your patch?

I guess I was a bit confused by your comment about rebasing, which would
imply a direct use of the request chaining API you've provided for "crypto hash".
I would appreciate it if you could clarify.

Thanks,
Kanchana

> 
> Thanks Herbert. I am working on integrating the request chaining with
> the iaa_crypto driver, expecting to have this ready for v5.
> 
> Thanks,
> Kanchana
> 
> >
> > Cheers,
> > --
> > Email: Herbert Xu <herbert@gondor.apana.org.au>
> > Home Page: http://gondor.apana.org.au/~herbert/
> > PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Kanchana P Sridhar Nov. 26, 2024, 2:37 a.m. UTC | #4
> -----Original Message-----
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Sent: Monday, November 25, 2024 6:14 PM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com>
> Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> hannes@cmpxchg.org; yosryahmed@google.com; nphamcs@gmail.com;
> chengming.zhou@linux.dev; usamaarif642@gmail.com;
> ryan.roberts@arm.com; 21cnbao@gmail.com; akpm@linux-foundation.org;
> linux-crypto@vger.kernel.org; davem@davemloft.net; clabbe@baylibre.com;
> ardb@kernel.org; ebiggers@google.com; surenb@google.com; Accardi,
> Kristen C <kristen.c.accardi@intel.com>; Feghali, Wajdi K
> <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com>
> Subject: Re: [PATCH v4 01/10] crypto: acomp - Define two new interfaces for
> compress/decompress batching.
> 
> On Tue, Nov 26, 2024 at 02:13:00AM +0000, Sridhar, Kanchana P wrote:
> >
> > I wanted to make sure I understand your suggestion: Are you suggesting we
> > implement request chaining for "struct acomp_req" similar to how this is
> being
> > done for "struct ahash_request" in your patch?
> >
> > I guess I was a bit confused by your comment about rebasing, which would
> > imply a direct use of the request chaining API you've provided for "crypto
> hash".
> > I would appreciate it if you could clarify.
> 
> Yes I was referring to the generic part of request chaining,
> and not rebasing acomp on top of ahash.

Ok, thanks for the clarification! Would it be simpler if you could submit a
crypto_acomp request chaining patch that I can then use in iaa_crypto?
I would greatly appreciate this.

Thanks,
Kanchana


> 
> Cheers,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Herbert Xu Nov. 27, 2024, 5:04 a.m. UTC | #5
On Wed, Nov 27, 2024 at 01:22:40AM +0000, Sridhar, Kanchana P wrote:
>
> 1) It appears a calling module like zswap would only be able to get 1 error
>      status for all the requests that are chained, as against individual error
>      statuses for each of the [de]compress jobs. Is this understanding correct?

No, each request gets its own error in req->base.err.

> 2) The request chaining makes use of the req->base.complete and req->base.data,
>      which are also used for internal data by the iaa_crypto driver. I can add another
>      "void *data1" member to struct crypto_async_request to work around this,
>      such that iaa_crypto uses "data1" instead of "data".

These fields are meant for the user.  It's best not to use them
to store driver data, but if you really wanted to, then the API
ahash code provides an example of doing it (please only do this
as a last resort as it's rather fragile).

Cheers,
diff mbox series

Patch

diff --git a/crypto/acompress.c b/crypto/acompress.c
index 6fdf0ff9f3c0..a506db499a37 100644
--- a/crypto/acompress.c
+++ b/crypto/acompress.c
@@ -71,6 +71,8 @@  static int crypto_acomp_init_tfm(struct crypto_tfm *tfm)
 
 	acomp->compress = alg->compress;
 	acomp->decompress = alg->decompress;
+	acomp->batch_compress = alg->batch_compress;
+	acomp->batch_decompress = alg->batch_decompress;
 	acomp->dst_free = alg->dst_free;
 	acomp->reqsize = alg->reqsize;
 
diff --git a/include/crypto/acompress.h b/include/crypto/acompress.h
index 54937b615239..4252bab3d0e1 100644
--- a/include/crypto/acompress.h
+++ b/include/crypto/acompress.h
@@ -37,12 +37,20 @@  struct acomp_req {
 	void *__ctx[] CRYPTO_MINALIGN_ATTR;
 };
 
+/*
+ * The max compress/decompress batch size, for crypto algorithms
+ * that support batch_compress and batch_decompress API.
+ */
+#define CRYPTO_BATCH_SIZE 8UL
+
 /**
  * struct crypto_acomp - user-instantiated objects which encapsulate
  * algorithms and core processing logic
  *
  * @compress:		Function performs a compress operation
  * @decompress:		Function performs a de-compress operation
+ * @batch_compress:	Function performs a batch compress operation
+ * @batch_decompress:	Function performs a batch decompress operation
  * @dst_free:		Frees destination buffer if allocated inside the
  *			algorithm
  * @reqsize:		Context size for (de)compression requests
@@ -51,6 +59,20 @@  struct acomp_req {
 struct crypto_acomp {
 	int (*compress)(struct acomp_req *req);
 	int (*decompress)(struct acomp_req *req);
+	void (*batch_compress)(struct acomp_req *reqs[],
+			       struct crypto_wait *wait,
+			       struct page *pages[],
+			       u8 *dsts[],
+			       unsigned int dlens[],
+			       int errors[],
+			       int nr_pages);
+	void (*batch_decompress)(struct acomp_req *reqs[],
+				 struct crypto_wait *wait,
+				 u8 *srcs[],
+				 struct page *pages[],
+				 unsigned int slens[],
+				 int errors[],
+				 int nr_pages);
 	void (*dst_free)(struct scatterlist *dst);
 	unsigned int reqsize;
 	struct crypto_tfm base;
@@ -142,6 +164,13 @@  static inline bool acomp_is_async(struct crypto_acomp *tfm)
 	       CRYPTO_ALG_ASYNC;
 }
 
+static inline bool acomp_has_async_batching(struct crypto_acomp *tfm)
+{
+	return (acomp_is_async(tfm) &&
+		(crypto_comp_alg_common(tfm)->base.cra_flags & CRYPTO_ALG_TYPE_ACOMPRESS) &&
+		tfm->batch_compress && tfm->batch_decompress);
+}
+
 static inline struct crypto_acomp *crypto_acomp_reqtfm(struct acomp_req *req)
 {
 	return __crypto_acomp_tfm(req->base.tfm);
@@ -265,4 +294,66 @@  static inline int crypto_acomp_decompress(struct acomp_req *req)
 	return crypto_acomp_reqtfm(req)->decompress(req);
 }
 
+/**
+ * crypto_acomp_batch_compress() -- Invoke asynchronous compress of
+ *                                  a batch of requests
+ *
+ * Function invokes the asynchronous batch compress operation
+ *
+ * @reqs: @nr_pages asynchronous compress requests.
+ * @wait: crypto_wait for synchronous acomp batch compress. If NULL, the
+ *        driver must provide a way to process completions asynchronously.
+ * @pages: Pages to be compressed.
+ * @dsts: Pre-allocated destination buffers to store results of compression.
+ * @dlens: Will contain the compressed lengths.
+ * @errors: zero on successful compression of the corresponding
+ *          req, or error code in case of error.
+ * @nr_pages: The number of pages, up to CRYPTO_BATCH_SIZE,
+ *            to be compressed.
+ */
+static inline void crypto_acomp_batch_compress(struct acomp_req *reqs[],
+					       struct crypto_wait *wait,
+					       struct page *pages[],
+					       u8 *dsts[],
+					       unsigned int dlens[],
+					       int errors[],
+					       int nr_pages)
+{
+	struct crypto_acomp *tfm = crypto_acomp_reqtfm(reqs[0]);
+
+	return tfm->batch_compress(reqs, wait, pages, dsts,
+				   dlens, errors, nr_pages);
+}
+
+/**
+ * crypto_acomp_batch_decompress() -- Invoke asynchronous decompress of
+ *                                    a batch of requests
+ *
+ * Function invokes the asynchronous batch decompress operation
+ *
+ * @reqs: @nr_pages asynchronous decompress requests.
+ * @wait: crypto_wait for synchronous acomp batch decompress. If NULL, the
+ *        driver must provide a way to process completions asynchronously.
+ * @srcs: The src buffers to be decompressed.
+ * @pages: The pages to store the decompressed buffers.
+ * @slens: Compressed lengths of @srcs.
+ * @errors: zero on successful compression of the corresponding
+ *          req, or error code in case of error.
+ * @nr_pages: The number of pages, up to CRYPTO_BATCH_SIZE,
+ *            to be decompressed.
+ */
+static inline void crypto_acomp_batch_decompress(struct acomp_req *reqs[],
+						 struct crypto_wait *wait,
+						 u8 *srcs[],
+						 struct page *pages[],
+						 unsigned int slens[],
+						 int errors[],
+						 int nr_pages)
+{
+	struct crypto_acomp *tfm = crypto_acomp_reqtfm(reqs[0]);
+
+	return tfm->batch_decompress(reqs, wait, srcs, pages,
+				     slens, errors, nr_pages);
+}
+
 #endif
diff --git a/include/crypto/internal/acompress.h b/include/crypto/internal/acompress.h
index 8831edaafc05..acfe2d9d5a83 100644
--- a/include/crypto/internal/acompress.h
+++ b/include/crypto/internal/acompress.h
@@ -17,6 +17,8 @@ 
  *
  * @compress:	Function performs a compress operation
  * @decompress:	Function performs a de-compress operation
+ * @batch_compress:	Function performs a batch compress operation
+ * @batch_decompress:	Function performs a batch decompress operation
  * @dst_free:	Frees destination buffer if allocated inside the algorithm
  * @init:	Initialize the cryptographic transformation object.
  *		This function is used to initialize the cryptographic
@@ -37,6 +39,20 @@ 
 struct acomp_alg {
 	int (*compress)(struct acomp_req *req);
 	int (*decompress)(struct acomp_req *req);
+	void (*batch_compress)(struct acomp_req *reqs[],
+			       struct crypto_wait *wait,
+			       struct page *pages[],
+			       u8 *dsts[],
+			       unsigned int dlens[],
+			       int errors[],
+			       int nr_pages);
+	void (*batch_decompress)(struct acomp_req *reqs[],
+				 struct crypto_wait *wait,
+				 u8 *srcs[],
+				 struct page *pages[],
+				 unsigned int slens[],
+				 int errors[],
+				 int nr_pages);
 	void (*dst_free)(struct scatterlist *dst);
 	int (*init)(struct crypto_acomp *tfm);
 	void (*exit)(struct crypto_acomp *tfm);