diff mbox series

[v2,3/3] crypto: scompress - defer allocation of scratch buffer to first use

Message ID 20170721154238.21697-4-ard.biesheuvel@linaro.org
State Accepted
Commit 6a8487a1f29fa2801a530200c6cdf81592b1a4d7
Headers show
Series crypto: scompress - defer allocation of percpu scratch buffers | expand

Commit Message

Ard Biesheuvel July 21, 2017, 3:42 p.m. UTC
The scompress code allocates 2 x 128 KB of scratch buffers for each CPU,
so that clients of the async API can use synchronous implementations
even from atomic context. However, on systems such as Cavium Thunderx
(which has 96 cores), this adds up to a non-negligible 24 MB. Also,
32-bit systems may prefer to use their precious vmalloc space for other
things,especially since there don't appear to be any clients for the
async compression API yet.

So let's defer allocation of the scratch buffers until the first time
we allocate an acompress cipher based on an scompress implementation.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

---
 crypto/scompress.c | 46 ++++++++------------
 1 file changed, 17 insertions(+), 29 deletions(-)

-- 
2.11.0

Comments

Giovanni Cabiddu July 25, 2017, 11:36 p.m. UTC | #1
Hi Ard,

On Fri, Jul 21, 2017 at 04:42:38PM +0100, Ard Biesheuvel wrote:
> +static int crypto_scomp_init_tfm(struct crypto_tfm *tfm)

> +{

> +	int ret;

> +

> +	mutex_lock(&scomp_lock);

> +	ret = crypto_scomp_alloc_all_scratches();

> +	mutex_unlock(&scomp_lock);

> +

> +	return ret;

> +}

If you allocate the scratch buffers at init_tfm, don't you end
up with a situation where if you allocate two tfms of the same algo
then you get twice the number of scratches?
If that is the case, we should implement a reference count
mechanism.
Am I missing something?

Regards,

-- 
Giovanni
Ard Biesheuvel July 26, 2017, 12:07 a.m. UTC | #2
> On 26 Jul 2017, at 00:36, Giovanni Cabiddu <giovanni.cabiddu@intel.com> wrote:

> 

> Hi Ard,

> 

>> On Fri, Jul 21, 2017 at 04:42:38PM +0100, Ard Biesheuvel wrote:

>> +static int crypto_scomp_init_tfm(struct crypto_tfm *tfm)

>> +{

>> +    int ret;

>> +

>> +    mutex_lock(&scomp_lock);

>> +    ret = crypto_scomp_alloc_all_scratches();

>> +    mutex_unlock(&scomp_lock);

>> +

>> +    return ret;

>> +}

> If you allocate the scratch buffers at init_tfm, don't you end

> up with a situation where if you allocate two tfms of the same algo

> then you get twice the number of scratches?

> If that is the case, we should implement a reference count

> mechanism.

> Am I missing something?

> 


Isn't the mutex supposed to take care of that?
Giovanni Cabiddu July 28, 2017, 9:13 a.m. UTC | #3
On Wed, Jul 26, 2017 at 01:07:34AM +0100, Ard Biesheuvel wrote:
> 

> > On 26 Jul 2017, at 00:36, Giovanni Cabiddu <giovanni.cabiddu@intel.com> wrote:

> > 

> > Hi Ard,

> > 

> >> On Fri, Jul 21, 2017 at 04:42:38PM +0100, Ard Biesheuvel wrote:

> >> +static int crypto_scomp_init_tfm(struct crypto_tfm *tfm)

> >> +{

> >> +    int ret;

> >> +

> >> +    mutex_lock(&scomp_lock);

> >> +    ret = crypto_scomp_alloc_all_scratches();

> >> +    mutex_unlock(&scomp_lock);

> >> +

> >> +    return ret;

> >> +}

> > If you allocate the scratch buffers at init_tfm, don't you end

> > up with a situation where if you allocate two tfms of the same algo

> > then you get twice the number of scratches?

> > If that is the case, we should implement a reference count

> > mechanism.

> > Am I missing something?

> > 

> 

> Isn't the mutex supposed to take care of that?

Ignore my previous question. There is already a reference count
mechanism that prevents that to happen.
Your change sounds correct to me.

Regards,

-- 
Giovanni
diff mbox series

Patch

diff --git a/crypto/scompress.c b/crypto/scompress.c
index 2c07648305ad..2075e2c4e7df 100644
--- a/crypto/scompress.c
+++ b/crypto/scompress.c
@@ -65,11 +65,6 @@  static void crypto_scomp_show(struct seq_file *m, struct crypto_alg *alg)
 	seq_puts(m, "type         : scomp\n");
 }
 
-static int crypto_scomp_init_tfm(struct crypto_tfm *tfm)
-{
-	return 0;
-}
-
 static void crypto_scomp_free_scratches(void * __percpu *scratches)
 {
 	int i;
@@ -134,6 +129,17 @@  static int crypto_scomp_alloc_all_scratches(void)
 	return 0;
 }
 
+static int crypto_scomp_init_tfm(struct crypto_tfm *tfm)
+{
+	int ret;
+
+	mutex_lock(&scomp_lock);
+	ret = crypto_scomp_alloc_all_scratches();
+	mutex_unlock(&scomp_lock);
+
+	return ret;
+}
+
 static void crypto_scomp_sg_free(struct scatterlist *sgl)
 {
 	int i, n;
@@ -241,6 +247,10 @@  static void crypto_exit_scomp_ops_async(struct crypto_tfm *tfm)
 	struct crypto_scomp **ctx = crypto_tfm_ctx(tfm);
 
 	crypto_free_scomp(*ctx);
+
+	mutex_lock(&scomp_lock);
+	crypto_scomp_free_all_scratches();
+	mutex_unlock(&scomp_lock);
 }
 
 int crypto_init_scomp_ops_async(struct crypto_tfm *tfm)
@@ -317,40 +327,18 @@  static const struct crypto_type crypto_scomp_type = {
 int crypto_register_scomp(struct scomp_alg *alg)
 {
 	struct crypto_alg *base = &alg->base;
-	int ret = -ENOMEM;
-
-	mutex_lock(&scomp_lock);
-	if (crypto_scomp_alloc_all_scratches())
-		goto error;
 
 	base->cra_type = &crypto_scomp_type;
 	base->cra_flags &= ~CRYPTO_ALG_TYPE_MASK;
 	base->cra_flags |= CRYPTO_ALG_TYPE_SCOMPRESS;
 
-	ret = crypto_register_alg(base);
-	if (ret)
-		goto error;
-
-	mutex_unlock(&scomp_lock);
-	return ret;
-
-error:
-	crypto_scomp_free_all_scratches();
-	mutex_unlock(&scomp_lock);
-	return ret;
+	return crypto_register_alg(base);
 }
 EXPORT_SYMBOL_GPL(crypto_register_scomp);
 
 int crypto_unregister_scomp(struct scomp_alg *alg)
 {
-	int ret;
-
-	mutex_lock(&scomp_lock);
-	ret = crypto_unregister_alg(&alg->base);
-	crypto_scomp_free_all_scratches();
-	mutex_unlock(&scomp_lock);
-
-	return ret;
+	return crypto_unregister_alg(&alg->base);
 }
 EXPORT_SYMBOL_GPL(crypto_unregister_scomp);