[v5,0/3] mm/zswap & crypto/compress: remove a couple of memcpy

Message ID 20240220064414.262582-1-21cnbao@gmail.com

Message

Barry Song Feb. 20, 2024, 6:44 a.m. UTC
From: Barry Song <v-songbaohua@oppo.com>

The patchset removes a couple of memcpy in zswap and crypto
to improve zswap's performance.

Thanks to Chengming Zhou for testing and for the perf data.
Quoting Chengming:
 I just tested these three patches on my server and found an improvement
 in the kernel build testcase on a tmpfs with zswap (lz4 + zsmalloc) enabled.
 
         mm-stable 501a06fe8e4c  patched
 real    1m38.028s               1m32.317s
 user    19m11.482s              18m39.439s
 sys     19m26.445s              17m5.646s
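
To put the numbers above in relative terms, here is a minimal sketch that converts the time(1) readings to seconds and computes the percentage saved. The helper names `to_seconds` and `percent_saved` are made up for illustration and are not part of the patchset.

```c
#include <assert.h>

/* Convert a time(1)-style "XmY.YYYs" reading into seconds. */
static double to_seconds(int minutes, double seconds)
{
	assert(minutes >= 0 && seconds >= 0.0);
	return minutes * 60.0 + seconds;
}

/* Relative reduction, in percent, going from `before` to `after`. */
static double percent_saved(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}
```

For example, `percent_saved(to_seconds(19, 26.445), to_seconds(17, 5.646))` gives roughly 12% for sys time, while the same calculation gives roughly 5.8% for real time and 2.8% for user time.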


This patchset applies to mm-unstable, as zswap has changed a lot
recently.

-v5:
  * remove the helper that exposed algorithm flags; instead, directly
    expose acomp_is_async(), which tests the ASYNC flag, as suggested
    by Herbert;
  * remove the cra_flags fixes for the intel and hisilicon async drivers;
    they are now separate patches [1], as requested by Herbert

[1] https://lore.kernel.org/linux-crypto/20240220044222.197614-1-v-songbaohua@oppo.com/

Barry Song (3):
  crypto: introduce: acomp_is_async to expose if comp drivers might
    sleep
  mm/zswap: remove the memcpy if acomp is not sleepable
  crypto: scompress: remove memcpy if sg_nents is 1

 crypto/scompress.c         | 36 +++++++++++++++++++++++++++++-------
 include/crypto/acompress.h |  6 ++++++
 mm/zswap.c                 |  6 ++++--
 3 files changed, 39 insertions(+), 9 deletions(-)

Comments

Herbert Xu Feb. 21, 2024, 5:31 a.m. UTC | #1
On Tue, Feb 20, 2024 at 07:44:12PM +1300, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> acomp's users might want to know whether acomp is really async so
> that they can optimize themselves. One typical user that can benefit
> from the exposed async state is zswap.
> 
> In zswap, zsmalloc is the most commonly used allocator (and
> perhaps the only one). For zsmalloc, we cannot sleep
> while we map the compressed memory, so we copy it to a
> temporary buffer. Knowing that the alg won't sleep lets
> zswap avoid the need for a buffer. This shows a noticeable
> improvement in the load/store latency of zswap.
> 
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> ---
>  include/crypto/acompress.h | 6 ++++++
>  1 file changed, 6 insertions(+)
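
The idea in the commit message above can be modelled in user space as follows. This is only a sketch: the `model_*` names are hypothetical, the struct is a stand-in for the kernel's algorithm descriptor, and the flag value mirrors what I believe CRYPTO_ALG_ASYNC is in the kernel headers but should be treated as illustrative. The second parameter of the copy decision (whether the allocator's mapping tolerates sleeping) is an assumption taken from the description that zsmalloc cannot sleep while the memory is mapped.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's CRYPTO_ALG_ASYNC bit (illustrative value). */
#define MODEL_ALG_ASYNC 0x00000080u

/* Hypothetical, minimal stand-in for an algorithm descriptor. */
struct model_alg {
	uint32_t cra_flags;
};

/* Model of the new helper: the driver may sleep iff the ASYNC bit is set. */
static bool model_acomp_is_async(const struct model_alg *alg)
{
	return (alg->cra_flags & MODEL_ALG_ASYNC) != 0;
}

/*
 * Model of the caller's decision: a temporary copy is only needed when
 * the driver may sleep AND the mapped memory must not be held across
 * a sleep (the zsmalloc case described above).
 */
static bool model_needs_bounce_buffer(const struct model_alg *alg,
				      bool mapping_can_sleep)
{
	return model_acomp_is_async(alg) && !mapping_can_sleep;
}
```

With a synchronous algorithm (ASYNC bit clear), `model_needs_bounce_buffer()` returns false regardless of the allocator, which is the case where the memcpy can be skipped.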

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
Barry Song Feb. 21, 2024, 5:55 a.m. UTC | #2
On Wed, Feb 21, 2024 at 6:35 PM Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> On Tue, Feb 20, 2024 at 07:44:14PM +1300, Barry Song wrote:
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > When sg_nents is 1, which is always true in the current kernel
> > because the only user, zswap, is this case, we should remove two
> > big memcpy calls.
> >
> > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > Tested-by: Chengming Zhou <zhouchengming@bytedance.com>
> > ---
> >  crypto/scompress.c | 36 +++++++++++++++++++++++++++++-------
> >  1 file changed, 29 insertions(+), 7 deletions(-)
>
> This patch is independent of the other two.  Please split it
> out so I can apply it directly.

OK. Patch 3/3 has no dependency on the other patches, so it
should apply cleanly to the crypto tree :-)

Hi Andrew,
Would you please handle patches 1/3 and 2/3 in the mm tree, given Herbert's
ack on 1/3?

>
> > @@ -134,13 +135,25 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
> >       scratch = raw_cpu_ptr(&scomp_scratch);
> >       spin_lock(&scratch->lock);
> >
> > -     scatterwalk_map_and_copy(scratch->src, req->src, 0, req->slen, 0);
> > +     if (sg_nents(req->src) == 1) {
> > +             src = kmap_local_page(sg_page(req->src)) + req->src->offset;
>
> What if the SG entry is longer than PAGE_SIZE (or indeed crosses a
> page boundary)? I think the test needs to be strengthened.

I don't understand what the problem is with a single nent crossing two
pages, since the pages are contiguous in both physical and virtual
addresses anyway. If they were not contiguous, they would be two nents.

>
> Thanks,
> --
> Email: Herbert Xu <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Thanks
Barry
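
For reference, one possible reading of the "strengthened test" Herbert asks for above is to take the direct-mapping path only when the single SG entry lies entirely within one page, since kmap_local_page() maps one page at a time. The user-space sketch below models that check; `struct model_sg`, `MODEL_PAGE_SIZE`, and `model_sg_fits_one_page()` are hypothetical names, the 4096-byte page size is an assumption, and this is not necessarily the form any later revision of the patch took.

```c
#include <stdbool.h>

/* Assumed page size for the model; real kernels use PAGE_SIZE. */
#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for the fields of one scatterlist entry. */
struct model_sg {
	unsigned int offset;	/* byte offset of the data within the first page */
	unsigned int length;	/* number of data bytes in this entry */
};

/*
 * True when the list has exactly one entry and that entry's data stays
 * inside a single page, so mapping one page covers all of it.
 * Written to avoid unsigned overflow in offset + length.
 */
static bool model_sg_fits_one_page(const struct model_sg *sg,
				   unsigned int nents)
{
	return nents == 1 &&
	       sg->length <= MODEL_PAGE_SIZE &&
	       sg->offset <= MODEL_PAGE_SIZE - sg->length;
}
```

Under this model, an entry with offset 0 and length 4096 passes, while one with offset 100 and length 4096 crosses a page boundary and would fall back to the copying path.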