[RFC,05/21] ubifs: Pass worst-case buffer size to compression routines

Message ID 20230718125847.3869700-6-ardb@kernel.org
State New
Series crypto: consolidate and clean up compression APIs

Commit Message

Ard Biesheuvel July 18, 2023, 12:58 p.m. UTC
Currently, the ubifs code allocates a worst case buffer size to
recompress a data node, but does not pass the size of that buffer to the
compression code. This means that the compression code will never use
the additional space, and might fail spuriously due to lack of space.

So let's multiply out_len by WORST_COMPR_FACTOR after allocating the
buffer. Doing so is guaranteed not to overflow, given that the preceding
kmalloc_array() call would have failed otherwise.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 fs/ubifs/journal.c | 2 ++
 1 file changed, 2 insertions(+)
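
The overflow argument in the commit message leans on the array allocation refusing a request whose element count times element size cannot be represented. Below is a minimal user-space sketch of that kind of check. It assumes GCC or Clang for __builtin_mul_overflow(), and alloc_array_checked() is a hypothetical stand-in, not the kernel's kmalloc_array().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical user-space stand-in for an overflow-checked array
 * allocation. Assumes GCC or Clang for __builtin_mul_overflow().
 */
static void *alloc_array_checked(size_t n, size_t size)
{
	size_t bytes;

	/* Refuse the request if n * size does not fit in size_t. */
	if (__builtin_mul_overflow(n, size, &bytes))
		return NULL;

	return malloc(bytes);
}
```

This models only the size_t multiplication check. It says nothing about any upper limit the kernel allocator places on a single allocation, nor about the narrower int type of out_len.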

Comments

Eric Biggers July 18, 2023, 10:38 p.m. UTC | #1
On Tue, Jul 18, 2023 at 02:58:31PM +0200, Ard Biesheuvel wrote:
> Currently, the ubifs code allocates a worst case buffer size to
> recompress a data node, but does not pass the size of that buffer to the
> compression code. This means that the compression code will never use
> the additional space, and might fail spuriously due to lack of space.
> 
> So let's multiply out_len by WORST_COMPR_FACTOR after allocating the
> buffer. Doing so is guaranteed not to overflow, given that the preceding
> kmalloc_array() call would have failed otherwise.
> 
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  fs/ubifs/journal.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
> index dc52ac0f4a345f30..4e5961878f336033 100644
> --- a/fs/ubifs/journal.c
> +++ b/fs/ubifs/journal.c
> @@ -1493,6 +1493,8 @@ static int truncate_data_node(const struct ubifs_info *c, const struct inode *in
>  	if (!buf)
>  		return -ENOMEM;
>  
> +	out_len *= WORST_COMPR_FACTOR;
> +
>  	dlen = le32_to_cpu(dn->ch.len) - UBIFS_DATA_NODE_SZ;
>  	data_size = dn_size - UBIFS_DATA_NODE_SZ;
>  	compr_type = le16_to_cpu(dn->compr_type);

This looks like another case where data that would be expanded by compression
should just be stored uncompressed instead.

In fact, it seems that UBIFS does that already.  ubifs_compress() has this:

        /*
         * If the data compressed only slightly, it is better to leave it
         * uncompressed to improve read speed.
         */
        if (in_len - *out_len < UBIFS_MIN_COMPRESS_DIFF)
                goto no_compr;

So it's unclear why the WORST_COMPR_FACTOR thing is needed at all.

- Eric
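
The fallback described above can be sketched in user space as follows. This is an illustration under stated assumptions, not the UBIFS code: rle_compress() is a toy run-length encoder standing in for a real compressor, store_block() and the COMPR_* values are hypothetical names, and MIN_COMPRESS_DIFF is an illustrative threshold rather than the actual value of UBIFS_MIN_COMPRESS_DIFF.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MIN_COMPRESS_DIFF 64	/* illustrative threshold, an assumption */

enum { COMPR_NONE = 0, COMPR_RLE = 1 };	/* hypothetical type tags */

/*
 * Toy run-length encoder: emits (count, byte) pairs. On entry *out_len
 * is the capacity of 'out'; on success it is set to the bytes written.
 * Returns -1 if the output does not fit.
 */
static int rle_compress(const unsigned char *in, size_t in_len,
			unsigned char *out, size_t *out_len)
{
	size_t cap = *out_len, o = 0, i = 0;

	while (i < in_len) {
		size_t run = 1;

		while (i + run < in_len && in[i + run] == in[i] && run < 255)
			run++;
		if (cap - o < 2)
			return -1;	/* out of output space */
		out[o++] = (unsigned char)run;
		out[o++] = in[i];
		i += run;
	}
	*out_len = o;
	return 0;
}

/*
 * Store 'in' compressed only if that saves at least MIN_COMPRESS_DIFF
 * bytes; otherwise store it verbatim. 'in' and 'out' must not overlap.
 */
static int store_block(const unsigned char *in, size_t in_len,
		       unsigned char *out, size_t *out_len)
{
	size_t len = *out_len;

	assert(len >= in_len);	/* caller must leave room for a raw copy */

	if (rle_compress(in, in_len, out, &len) == 0 &&
	    len < in_len && in_len - len >= MIN_COMPRESS_DIFF) {
		*out_len = len;
		return COMPR_RLE;
	}

	/* Compression failed or did not pay off: keep the data as-is. */
	memcpy(out, in, in_len);
	*out_len = in_len;
	return COMPR_NONE;
}
```

With this shape, an output buffer the size of the input is always enough: data that would expand is simply copied, so the stored length never exceeds in_len.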
Ard Biesheuvel July 19, 2023, 8:33 a.m. UTC | #2
On Wed, 19 Jul 2023 at 00:38, Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Tue, Jul 18, 2023 at 02:58:31PM +0200, Ard Biesheuvel wrote:
> > Currently, the ubifs code allocates a worst case buffer size to
> > recompress a data node, but does not pass the size of that buffer to the
> > compression code. This means that the compression code will never use
> > the additional space, and might fail spuriously due to lack of space.
> >
> > So let's multiply out_len by WORST_COMPR_FACTOR after allocating the
> > buffer. Doing so is guaranteed not to overflow, given that the preceding
> > kmalloc_array() call would have failed otherwise.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> >  fs/ubifs/journal.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
> > index dc52ac0f4a345f30..4e5961878f336033 100644
> > --- a/fs/ubifs/journal.c
> > +++ b/fs/ubifs/journal.c
> > @@ -1493,6 +1493,8 @@ static int truncate_data_node(const struct ubifs_info *c, const struct inode *in
> >       if (!buf)
> >               return -ENOMEM;
> >
> > +     out_len *= WORST_COMPR_FACTOR;
> > +
> >       dlen = le32_to_cpu(dn->ch.len) - UBIFS_DATA_NODE_SZ;
> >       data_size = dn_size - UBIFS_DATA_NODE_SZ;
> >       compr_type = le16_to_cpu(dn->compr_type);
>
> This looks like another case where data that would be expanded by compression
> should just be stored uncompressed instead.
>
> In fact, it seems that UBIFS does that already.  ubifs_compress() has this:
>
>         /*
>          * If the data compressed only slightly, it is better to leave it
>          * uncompressed to improve read speed.
>          */
>         if (in_len - *out_len < UBIFS_MIN_COMPRESS_DIFF)
>                 goto no_compr;
>
> So it's unclear why the WORST_COMPR_FACTOR thing is needed at all.
>

It is not. The buffer is used for decompression in the truncation
path, so none of this logic even matters. Even if the subsequent
recompression of the truncated data node could result in expansion
beyond the uncompressed size of the original data (which seems
impossible to me), increasing the size of this buffer would not help
as it is the input buffer for the compression not the output buffer.
Zhihao Cheng July 19, 2023, 2:23 p.m. UTC | #3
On 2023/7/19 16:33, Ard Biesheuvel wrote:
> On Wed, 19 Jul 2023 at 00:38, Eric Biggers <ebiggers@kernel.org> wrote:
>>
>> On Tue, Jul 18, 2023 at 02:58:31PM +0200, Ard Biesheuvel wrote:
>>> Currently, the ubifs code allocates a worst case buffer size to
>>> recompress a data node, but does not pass the size of that buffer to the
>>> compression code. This means that the compression code will never use

I think you mean that 'out_len', which describes the length of 'buf', is
passed into ubifs_decompress(), which affects the result of the
decompressor (e.g. lz4 uses the length to calculate the buffer end position).
So, we should pass the real length of 'buf'.

Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
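
To illustrate how the length passed in bounds what a decompressor may write, here is a minimal sketch using a toy run-length decoder. It is not lz4 and not the UBIFS code; rle_decompress() and its (count, byte) pair format are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Toy run-length decoder for (count, byte) pairs. On entry *out_len is
 * the length the caller claims for 'out'; on success it is set to the
 * number of bytes produced. Returns -1 on malformed input or if the
 * output would run past the claimed end of the buffer.
 */
static int rle_decompress(const unsigned char *in, size_t in_len,
			  unsigned char *out, size_t *out_len)
{
	unsigned char *op = out;
	/* The end position is derived from the caller's length only. */
	unsigned char *const oend = out + *out_len;
	size_t i;

	if (in_len % 2)
		return -1;

	for (i = 0; i < in_len; i += 2) {
		size_t run = in[i];

		if ((size_t)(oend - op) < run)
			return -1;	/* would overrun the claimed length */
		memset(op, in[i + 1], run);
		op += run;
	}
	*out_len = (size_t)(op - out);
	return 0;
}
```

If the caller allocates 8 bytes but passes a length of 4, decoding 5 bytes of output fails even though the memory is there, because the decoder only knows the length it was given.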

>>> the additional space, and might fail spuriously due to lack of space.
>>>
>>> So let's multiply out_len by WORST_COMPR_FACTOR after allocating the
>>> buffer. Doing so is guaranteed not to overflow, given that the preceding
>>> kmalloc_array() call would have failed otherwise.
>>>
>>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
>>> ---
>>>   fs/ubifs/journal.c | 2 ++
>>>   1 file changed, 2 insertions(+)
>>>
>>> diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
>>> index dc52ac0f4a345f30..4e5961878f336033 100644
>>> --- a/fs/ubifs/journal.c
>>> +++ b/fs/ubifs/journal.c
>>> @@ -1493,6 +1493,8 @@ static int truncate_data_node(const struct ubifs_info *c, const struct inode *in
>>>        if (!buf)
>>>                return -ENOMEM;
>>>
>>> +     out_len *= WORST_COMPR_FACTOR;
>>> +
>>>        dlen = le32_to_cpu(dn->ch.len) - UBIFS_DATA_NODE_SZ;
>>>        data_size = dn_size - UBIFS_DATA_NODE_SZ;
>>>        compr_type = le16_to_cpu(dn->compr_type);
>>
>> This looks like another case where data that would be expanded by compression
>> should just be stored uncompressed instead.
>>
>> In fact, it seems that UBIFS does that already.  ubifs_compress() has this:
>>
>>          /*
>>           * If the data compressed only slightly, it is better to leave it
>>           * uncompressed to improve read speed.
>>           */
>>          if (in_len - *out_len < UBIFS_MIN_COMPRESS_DIFF)
>>                  goto no_compr;
>>
>> So it's unclear why the WORST_COMPR_FACTOR thing is needed at all.
>>
> 
> It is not. The buffer is used for decompression in the truncation
> path, so none of this logic even matters. Even if the subsequent
> recompression of the truncated data node could result in expansion
> beyond the uncompressed size of the original data (which seems
> impossible to me), increasing the size of this buffer would not help
> as it is the input buffer for the compression not the output buffer.
> .
>
Ard Biesheuvel July 19, 2023, 2:38 p.m. UTC | #4
On Wed, 19 Jul 2023 at 16:23, Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>
> On 2023/7/19 16:33, Ard Biesheuvel wrote:
> > On Wed, 19 Jul 2023 at 00:38, Eric Biggers <ebiggers@kernel.org> wrote:
> >>
> >> On Tue, Jul 18, 2023 at 02:58:31PM +0200, Ard Biesheuvel wrote:
> >>> Currently, the ubifs code allocates a worst case buffer size to
> >>> recompress a data node, but does not pass the size of that buffer to the
> >>> compression code. This means that the compression code will never use
>
> I think you mean that 'out_len', which describes the length of 'buf', is
> passed into ubifs_decompress(), which affects the result of the
> decompressor (e.g. lz4 uses the length to calculate the buffer end position).
> So, we should pass the real length of 'buf'.
>

Yes, that is what I meant.

But Eric makes a good point, and looking a bit more closely, there is
really no need for the multiplication here: we know the size of the
decompressed data, so we don't need the additional space.

I intend to drop this patch, and replace it with the following:

----------------8<--------------

Currently, when truncating a data node, a decompression buffer is
allocated that is twice the size of the data node's uncompressed size.
However, the fact that this space is available is not communicated to
the compression routines, as out_len itself is not updated.

The additional space is not needed even in the theoretical worst case
where compression might lead to inadvertent expansion: first of all,
increasing the size of the input buffer does not help mitigate that
issue. And given the truncation of the data node and the fact that the
original data compressed well enough to pass the UBIFS_MIN_COMPRESS_DIFF
test, there is no way on this particular code path that compression
could result in expansion beyond the original decompressed size, and so
no mitigation is necessary to begin with.

So let's just drop WORST_COMPR_FACTOR here.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index dc52ac0f4a345f30..0b55cbfe0c30505e 100644
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -1489,7 +1489,7 @@ static int truncate_data_node(const struct ubifs_info *c, const struct inode *in
        int err, dlen, compr_type, out_len, data_size;

        out_len = le32_to_cpu(dn->size);
-       buf = kmalloc_array(out_len, WORST_COMPR_FACTOR, GFP_NOFS);
+       buf = kmalloc(out_len, GFP_NOFS);
        if (!buf)
                return -ENOMEM;
Zhihao Cheng July 20, 2023, 1:23 a.m. UTC | #5
On 2023/7/19 22:38, Ard Biesheuvel wrote:
> On Wed, 19 Jul 2023 at 16:23, Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>>
>> On 2023/7/19 16:33, Ard Biesheuvel wrote:
>>> On Wed, 19 Jul 2023 at 00:38, Eric Biggers <ebiggers@kernel.org> wrote:
>>>>
>>>> On Tue, Jul 18, 2023 at 02:58:31PM +0200, Ard Biesheuvel wrote:
>>>>> Currently, the ubifs code allocates a worst case buffer size to
>>>>> recompress a data node, but does not pass the size of that buffer to the
>>>>> compression code. This means that the compression code will never use
>>
>> I think you mean that 'out_len', which describes the length of 'buf', is
>> passed into ubifs_decompress(), which affects the result of the
>> decompressor (e.g. lz4 uses the length to calculate the buffer end position).
>> So, we should pass the real length of 'buf'.
>>
> 
> Yes, that is what I meant.
> 
> But Eric makes a good point, and looking a bit more closely, there is
> really no need for the multiplication here: we know the size of the
> decompressed data, so we don't need the additional space.
> 

Right, we get 'out_len' from 'dn->size', which is the length of the
uncompressed data. ubifs_compress() makes sure the compressed length is
smaller than the original length.

> I intend to drop this patch, and replace it with the following:
> 
> ----------------8<--------------
> 
> Currently, when truncating a data node, a decompression buffer is
> allocated that is twice the size of the data node's uncompressed size.
> However, the fact that this space is available is not communicated to
> the compression routines, as out_len itself is not updated.
> 
> The additional space is not needed even in the theoretical worst case
> where compression might lead to inadvertent expansion: first of all,
> increasing the size of the input buffer does not help mitigate that
> issue. And given the truncation of the data node and the fact that the
> original data compressed well enough to pass the UBIFS_MIN_COMPRESS_DIFF
> test, there is no way on this particular code path that compression
> could result in expansion beyond the original decompressed size, and so
> no mitigation is necessary to begin with.
> 
> So let's just drop WORST_COMPR_FACTOR here.
> 
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> 
> diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
> index dc52ac0f4a345f30..0b55cbfe0c30505e 100644
> --- a/fs/ubifs/journal.c
> +++ b/fs/ubifs/journal.c
> @@ -1489,7 +1489,7 @@ static int truncate_data_node(const struct ubifs_info *c, const struct inode *in
>          int err, dlen, compr_type, out_len, data_size;
> 
>          out_len = le32_to_cpu(dn->size);
> -       buf = kmalloc_array(out_len, WORST_COMPR_FACTOR, GFP_NOFS);
> +       buf = kmalloc(out_len, GFP_NOFS);
>          if (!buf)
>                  return -ENOMEM;
> .
> 

This version looks better.

Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Patch

diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index dc52ac0f4a345f30..4e5961878f336033 100644
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -1493,6 +1493,8 @@  static int truncate_data_node(const struct ubifs_info *c, const struct inode *in
 	if (!buf)
 		return -ENOMEM;
 
+	out_len *= WORST_COMPR_FACTOR;
+
 	dlen = le32_to_cpu(dn->ch.len) - UBIFS_DATA_NODE_SZ;
 	data_size = dn_size - UBIFS_DATA_NODE_SZ;
 	compr_type = le16_to_cpu(dn->compr_type);