
[RFC,00/21] crypto: consolidate and clean up compression APIs

Message ID 20230718125847.3869700-1-ardb@kernel.org

Message

Ard Biesheuvel July 18, 2023, 12:58 p.m. UTC
This series is presented as an RFC, because I haven't quite convinced
myself that the acomp API really needs both scatterlists and request
objects to encapsulate the in- and output buffers, and perhaps there are
more drastic simplifications that we might consider.

However, the current situation with comp, scomp and acomp APIs is
definitely something that needs cleaning up, and so I implemented this
series under the working assumption that we will keep the current acomp
semantics wrt scatterlists and request objects.

Patch #1 drops zlib-deflate support in software, along with the test
cases we have for it. This has no users and should never have been
added.

Patch #2 removes the support for on-the-fly allocation of destination
buffers and scatterlists from the Intel QAT driver. This is never used,
and not even implemented by all drivers (the HiSilicon ZIP driver does
not support it). The diffstat of this patch makes a good case why the
caller should be in charge of allocating the memory, not the driver.

Patch #3 removes this on-the-fly allocation from the core acomp API.

Patch #4 does a minimal conversion of IPcomp to the acomp API.

Patches #5 and #6 are independent UBIFS fixes for things I ran into while
working on patch #7.

Patch #7 converts UBIFS to the acomp API.

Patch #8 converts the zram block driver to the acomp API.

Patches #9 to #19 remove the existing 'comp' API implementations as well
as the core plumbing, now that all clients of the API have been
converted. (Note that pstore stopped using the 'comp' API as well, but
these changes are already queued elsewhere.)

Patch #20 converts the generic deflate compression driver to the acomp
API, so that it can natively operate on discontiguous buffers, rather
than requiring scratch buffers. This is the only IPcomp compression
algorithm we actually implement in software in the kernel, and this
conversion could help IPcomp if we decide to convert it further, and
remove the code that 'linearizes' SKBs in order to present them to the
compression API as a contiguous range.
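To illustrate what operating natively on discontiguous buffers means, here
is a minimal userspace sketch. It is not code from the series: struct seg is
a hypothetical stand-in for a scatterlist entry, and Adler-32 (the checksum
used by the zlib format) stands in for a streaming codec. The point is only
that state carried across segments removes the need to copy everything into
one contiguous scratch buffer first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatterlist entry: a contiguous segment. */
struct seg {
	const unsigned char *buf;
	size_t len;
};

/* Streaming state carried from one segment to the next. */
struct adler_state {
	uint32_t a, b;
};

static void adler_init(struct adler_state *s)
{
	s->a = 1;
	s->b = 0;
}

static void adler_update(struct adler_state *s, const unsigned char *p,
			 size_t len)
{
	while (len--) {
		s->a = (s->a + *p++) % 65521u;
		s->b = (s->b + s->a) % 65521u;
	}
}

static uint32_t adler_final(const struct adler_state *s)
{
	return (s->b << 16) | s->a;
}

/* Walk the segments in place; nothing is copied into a linear buffer. */
static uint32_t adler_over_segs(const struct seg *segs, size_t nsegs)
{
	struct adler_state s;
	size_t i;

	assert(segs != NULL || nsegs == 0);
	adler_init(&s);
	for (i = 0; i < nsegs; i++)
		adler_update(&s, segs[i].buf, segs[i].len);
	return adler_final(&s);
}
```

Feeding the same bytes as one segment or as several yields the same result,
which is the property a scatterlist-aware deflate implementation relies on.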

Patch #21 converts the acomp-to-scomp adaptation layer so it no longer
requires per-CPU scratch buffers. This takes advantage of the fact that
all existing users of the acomp API pass contiguous memory regions, and
so scratch buffers are only needed in exceptional cases, and can be
allocated and deallocated on the fly. This removes the need for
preallocated per-CPU scratch buffers that can easily add up to tens of
megabytes on modern systems with high core counts and SMT.
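As a rough back-of-the-envelope check of the "tens of megabytes" figure, the
arithmetic can be sketched as below. The sizes are assumptions, not taken
from this message: one source and one destination scratch buffer per
possible CPU, 128 KiB each (which is what SCOMP_SCRATCH_SIZE, 131072, in the
pre-series scompress code suggests). The helper names are made up for the
example.

```c
#include <assert.h>
#include <limits.h>

/*
 * Assumed for this sketch: every possible CPU owns one source and one
 * destination scratch buffer of 128 KiB each.
 */
#define SCRATCH_SIZE_BYTES   131072UL
#define SCRATCH_BUFS_PER_CPU 2UL

/* Total bytes reserved for scratch buffers across all possible CPUs. */
static unsigned long scratch_total_bytes(unsigned long nr_cpus)
{
	/* Guard the multiplication against overflow. */
	assert(nr_cpus <=
	       ULONG_MAX / (SCRATCH_BUFS_PER_CPU * SCRATCH_SIZE_BYTES));
	return nr_cpus * SCRATCH_BUFS_PER_CPU * SCRATCH_SIZE_BYTES;
}

/* Same figure in MiB, rounded down. */
static unsigned long scratch_total_mib(unsigned long nr_cpus)
{
	return scratch_total_bytes(nr_cpus) >> 20;
}
```

Under these assumptions, scratch_total_mib(128) evaluates to 32, so a
machine with 128 possible CPUs would pin 32 MiB whether or not any scomp
user ever needs a scratch buffer.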

These changes have been build tested but only lightly runtime tested. In
particular, I haven't performed any thorough testing on the acomp
conversions of IPcomp, UBIFS and ZRAM. Any hints on which methods and
test cases to use for each of these would be highly appreciated.

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Haren Myneni <haren@us.ibm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: David Ahern <dsahern@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-block@vger.kernel.org
Cc: qat-linux@intel.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-mtd@lists.infradead.org
Cc: netdev@vger.kernel.org

Ard Biesheuvel (21):
  crypto: scomp - Revert "add support for deflate rfc1950 (zlib)"
  crypto: qat - Drop support for allocating destination buffers
  crypto: acompress - Drop destination scatterlist allocation feature
  net: ipcomp: Migrate to acomp API from deprecated comp API
  ubifs: Pass worst-case buffer size to compression routines
  ubifs: Avoid allocating buffer space unnecessarily
  ubifs: Migrate to acomp compression API
  zram: Migrate to acomp compression API
  crypto: nx - Migrate to scomp API
  crypto: 842 - drop obsolete 'comp' implementation
  crypto: deflate - drop obsolete 'comp' implementation
  crypto: lz4 - drop obsolete 'comp' implementation
  crypto: lz4hc - drop obsolete 'comp' implementation
  crypto: lzo-rle - drop obsolete 'comp' implementation
  crypto: lzo - drop obsolete 'comp' implementation
  crypto: zstd - drop obsolete 'comp' implementation
  crypto: cavium/zip - drop obsolete 'comp' implementation
  crypto: compress_null - drop obsolete 'comp' implementation
  crypto: remove obsolete 'comp' compression API
  crypto: deflate - implement acomp API directly
  crypto: scompress - Drop the use of per-cpu scratch buffers

 Documentation/crypto/architecture.rst               |   2 -
 crypto/842.c                                        |  63 +---
 crypto/Makefile                                     |   2 +-
 crypto/acompress.c                                  |   6 -
 crypto/api.c                                        |   4 -
 crypto/compress.c                                   |  32 --
 crypto/crypto_null.c                                |  31 +-
 crypto/crypto_user_base.c                           |  16 -
 crypto/crypto_user_stat.c                           |   4 -
 crypto/deflate.c                                    | 386 ++++++--------------
 crypto/lz4.c                                        |  61 +---
 crypto/lz4hc.c                                      |  63 +---
 crypto/lzo-rle.c                                    |  60 +--
 crypto/lzo.c                                        |  60 +--
 crypto/proc.c                                       |   3 -
 crypto/scompress.c                                  | 169 ++++-----
 crypto/testmgr.c                                    | 184 +---------
 crypto/testmgr.h                                    |  75 ----
 crypto/zstd.c                                       |  56 +--
 drivers/block/zram/zcomp.c                          |  67 +++-
 drivers/block/zram/zcomp.h                          |   7 +-
 drivers/block/zram/zram_drv.c                       |  12 +-
 drivers/crypto/cavium/zip/zip_crypto.c              |  40 --
 drivers/crypto/cavium/zip/zip_crypto.h              |  10 -
 drivers/crypto/cavium/zip/zip_main.c                |  50 +--
 drivers/crypto/intel/qat/qat_common/qat_bl.c        | 159 --------
 drivers/crypto/intel/qat/qat_common/qat_bl.h        |   6 -
 drivers/crypto/intel/qat/qat_common/qat_comp_algs.c |  86 +----
 drivers/crypto/intel/qat/qat_common/qat_comp_req.h  |  10 -
 drivers/crypto/nx/nx-842.c                          |  34 +-
 drivers/crypto/nx/nx-842.h                          |  14 +-
 drivers/crypto/nx/nx-common-powernv.c               |  30 +-
 drivers/crypto/nx/nx-common-pseries.c               |  32 +-
 fs/ubifs/compress.c                                 |  61 +++-
 fs/ubifs/file.c                                     |  46 +--
 fs/ubifs/journal.c                                  |  33 +-
 fs/ubifs/ubifs.h                                    |  15 +-
 include/crypto/acompress.h                          |  21 +-
 include/crypto/internal/scompress.h                 |   2 -
 include/crypto/scatterwalk.h                        |   2 +-
 include/linux/crypto.h                              |  49 +--
 include/net/ipcomp.h                                |   4 +-
 net/xfrm/xfrm_algo.c                                |   7 +-
 net/xfrm/xfrm_ipcomp.c                              | 107 ++++--
 44 files changed, 502 insertions(+), 1679 deletions(-)
 delete mode 100644 crypto/compress.c

Comments

Herbert Xu July 28, 2023, 9:55 a.m. UTC | #1
On Tue, Jul 18, 2023 at 02:58:26PM +0200, Ard Biesheuvel wrote:
>
> Patch #2 removes the support for on-the-fly allocation of destination
> buffers and scatterlists from the Intel QAT driver. This is never used,
> and not even implemented by all drivers (the HiSilicon ZIP driver does
> not support it). The diffstat of this patch makes a good case why the
> caller should be in charge of allocating the memory, not the driver.

The implementation in qat may not be optimal, but being able to
allocate memory in the algorithm is a big plus for IPComp at least.

Being able to allocate memory page by page as you decompress
means that:

1. We're not affected by memory fragmentation.
2. We don't waste memory by always allocating for the worst case.

Cheers,

Ard Biesheuvel July 28, 2023, 9:57 a.m. UTC | #2
On Fri, 28 Jul 2023 at 11:56, Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> On Tue, Jul 18, 2023 at 02:58:26PM +0200, Ard Biesheuvel wrote:
> >
> > Patch #2 removes the support for on-the-fly allocation of destination
> > buffers and scatterlists from the Intel QAT driver. This is never used,
> > and not even implemented by all drivers (the HiSilicon ZIP driver does
> > not support it). The diffstat of this patch makes a good case why the
> > caller should be in charge of allocating the memory, not the driver.
>
> The implementation in qat may not be optimal, but being able to
> allocate memory in the algorithm is a big plus for IPComp at least.
>
> Being able to allocate memory page by page as you decompress
> means that:
>
> 1. We're not affected by memory fragmentation.
> 2. We don't waste memory by always allocating for the worst case.
>

So will IPcomp be able to simply assign those pages to the SKB afterwards?

Herbert Xu July 28, 2023, 9:59 a.m. UTC | #3
On Fri, Jul 28, 2023 at 11:57:42AM +0200, Ard Biesheuvel wrote:
>
> So will IPcomp be able to simply assign those pages to the SKB afterwards?

Yes that is the idea.  The network stack is very much in love with
SG lists :)

Thanks,

Ard Biesheuvel July 28, 2023, 10:03 a.m. UTC | #4
On Fri, 28 Jul 2023 at 11:59, Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> On Fri, Jul 28, 2023 at 11:57:42AM +0200, Ard Biesheuvel wrote:
> >
> > So will IPcomp be able to simply assign those pages to the SKB afterwards?
>
> Yes that is the idea.  The network stack is very much in love with
> SG lists :)
>

Fair enough. But my point remains: this requires a lot of boilerplate
on the part of the driver, and it would be better if we could do this
in the acomp generic layer.

Does the IPcomp case always know the decompressed size upfront?

Herbert Xu July 28, 2023, 10:05 a.m. UTC | #5
On Fri, Jul 28, 2023 at 12:03:23PM +0200, Ard Biesheuvel wrote:
>
> Fair enough. But my point remains: this requires a lot of boilerplate
> on the part of the driver, and it would be better if we could do this
> in the acomp generic layer.

Absolutely.  If the hardware can't support allocate-as-you-go then
this should very much go into the generic layer.

> Does the IPcomp case always know the decompressed size upfront?

No, it doesn't know.  Of course, we could optimise it because we know
that in 99% of cases, the packet is going to be less than 4K.  But we
need a safety net for those weird jumbo packets.

Thanks,
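
The allocate-as-you-go idea discussed in this thread can be sketched in
userspace as follows. This is only an illustration under stated assumptions,
not the acomp or QAT implementation: the page size is fixed at 4096 bytes,
struct out_page and struct out_chain are hypothetical, and a toy run-length
decoder stands in for a real decompressor. Output pages are allocated one at
a time as the output grows, so the common small case costs a single page
while larger outputs still work without a worst-case contiguous buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ 4096u	/* assumed page size for this sketch */

/* One output page; pages are chained instead of being contiguous. */
struct out_page {
	struct out_page *next;
	size_t used;
	unsigned char data[PAGE_SZ];
};

struct out_chain {
	struct out_page *head, *tail;
	size_t npages;	/* pages allocated so far */
	size_t total;	/* bytes written so far */
};

/* Append one byte, allocating a fresh page only when the tail is full. */
static int chain_put(struct out_chain *c, unsigned char b)
{
	if (!c->tail || c->tail->used == PAGE_SZ) {
		struct out_page *p = calloc(1, sizeof(*p));

		if (!p)
			return -1;
		if (c->tail)
			c->tail->next = p;
		else
			c->head = p;
		c->tail = p;
		c->npages++;
	}
	assert(c->tail->used < PAGE_SZ);
	c->tail->data[c->tail->used++] = b;
	c->total++;
	return 0;
}

/* Toy decoder: input is a sequence of (count, byte) pairs. */
static int rle_decode(const unsigned char *in, size_t in_len,
		      struct out_chain *c)
{
	size_t i;

	if (in_len % 2)
		return -1;
	for (i = 0; i < in_len; i += 2) {
		unsigned int n = in[i];

		while (n--)
			if (chain_put(c, in[i + 1]))
				return -1;
	}
	return 0;
}

/* Release every page and reset the chain to its empty state. */
static void chain_free(struct out_chain *c)
{
	struct out_page *p = c->head;

	while (p) {
		struct out_page *next = p->next;

		free(p);
		p = next;
	}
	memset(c, 0, sizeof(*c));
}
```

A caller starts from a zero-initialised struct out_chain, runs rle_decode(),
and can then hand the individual pages on (in the IPcomp case, the analogous
step would be attaching them to the SKB) before calling chain_free().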