[v4,02/11] lib/charset: add u16_strlcat() function

Message ID	20220324135443.1571-3-masahisa.kojima@linaro.org
State	New
Headers	Delivered-To: patch@linaro.org
Received-SPF: pass (google.com: domain of u-boot-bounces@lists.denx.de designates 85.214.62.61 as permitted sender) client-ip=85.214.62.61;
From: Masahisa Kojima <masahisa.kojima@linaro.org>
To: u-boot@lists.denx.de
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>, Ilias Apalodimas <ilias.apalodimas@linaro.org>, Simon Glass <sjg@chromium.org>, Takahiro Akashi <takahiro.akashi@linaro.org>, Francois Ozog <francois.ozog@linaro.org>, Mark Kettenis <mark.kettenis@xs4all.nl>, Masahisa Kojima <masahisa.kojima@linaro.org>
Subject: [PATCH v4 02/11] lib/charset: add u16_strlcat() function
Date: Thu, 24 Mar 2022 22:54:34 +0900
Message-Id: <20220324135443.1571-3-masahisa.kojima@linaro.org>
In-Reply-To: <20220324135443.1571-1-masahisa.kojima@linaro.org>
References: <20220324135443.1571-1-masahisa.kojima@linaro.org>
Precedence: list
Errors-To: u-boot-bounces@lists.denx.de
Sender: "U-Boot" <u-boot-bounces@lists.denx.de>
Series	enable menu-driven boot device selection
[v4,00/11] enable menu-driven boot device selection
[v4,01/11] bootmenu: fix menu API error handling
[v4,02/11] lib/charset: add u16_strlcat() function
[v4,03/11] test: unit test for u16_strlcat()
[v4,04/11] menu: always show the menu regardless of the number of entry
[v4,05/11] efi_loader: export efi_locate_device_handle()
[v4,06/11] efi_loader: bootmgr: add booting from removable media
[v4,07/11] bootmenu: add UEFI and disto_boot entries
[v4,08/11] bootmenu: factor out the user input handling
[v4,09/11] efi_loader: add menu-driven UEFI Boot Variable maintenance
[v4,10/11] bootmenu: add removable media entries
[v4,11/11] doc:bootmenu: add UEFI boot variable and distro boot support

Masahisa Kojima March 24, 2022, 1:54 p.m. UTC

Provide u16 string version of strlcat().

Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org>
Reviewed-by: Simon Glass <sjg@chromium.org>
---
Changes in v4:
- add blank line above the return statement

Changes in v2:
- implement u16_strlcat(with the destination buffer size in argument)
  instead of u16_strcat

 include/charset.h | 15 +++++++++++++++
 lib/charset.c     | 21 +++++++++++++++++++++
 2 files changed, 36 insertions(+)

Heinrich Schuchardt April 2, 2022, 7:14 a.m. UTC | #1

On 3/24/22 14:54, Masahisa Kojima wrote:
> Provide u16 string version of strlcat().
>
> Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org>
> Reviewed-by: Simon Glass <sjg@chromium.org>
> ---
> Changes in v4:
> - add blank line above the return statement
>
> Changes in v2:
> - implement u16_strlcat(with the destination buffer size in argument)
>    instead of u16_strcat
>
>   include/charset.h | 15 +++++++++++++++
>   lib/charset.c     | 21 +++++++++++++++++++++
>   2 files changed, 36 insertions(+)
>
> diff --git a/include/charset.h b/include/charset.h
> index b93d023092..dc5fc275ec 100644
> --- a/include/charset.h
> +++ b/include/charset.h
> @@ -259,6 +259,21 @@ u16 *u16_strcpy(u16 *dest, const u16 *src);
>    */
>   u16 *u16_strdup(const void *src);
>
> +/**
> + * u16_strlcat() - Append a length-limited, %NUL-terminated string to another

The function should be called u16_strncat() in reference to the
strncat() function.

> + *
> + * Append the src string to the dest string, overwriting the terminating
> + * null word at the end of dest, and then adds a terminating null word.
> + * It will append at most size - u16_strlen(dst) - 1 bytes, NUL-terminating the result.
> + *
> + * @dest:		destination buffer (null terminated)
> + * @src:		source buffer (null terminated)
> + * @size:		destination buffer size in bytes
> + * Return:		total size of the created string in bytes.
> + *			If return value >= size, truncation occurred.
> + */
> +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size);
> +
>   /**
>    * utf16_to_utf8() - Convert an utf16 string to utf8
>    *
> diff --git a/lib/charset.c b/lib/charset.c
> index f44c58d9d8..47997eca7d 100644
> --- a/lib/charset.c
> +++ b/lib/charset.c
> @@ -428,6 +428,27 @@ u16 *u16_strdup(const void *src)
>   	return new;
>   }
>
> +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size)
> +{
> +	size_t dstrlen = u16_strnlen(dest, size >> 1);
> +	size_t dlen = dstrlen * sizeof(u16);
> +	size_t len = u16_strlen(src) * sizeof(u16);
> +	size_t ret = dlen + len;
> +
> +	if (dlen >= size)
> +		return ret;
> +
> +	dest += dstrlen;
> +	size -= dlen;
> +	if (len >= size)
> +		len = size - sizeof(u16);

For size = dlen + 1 this results in

len = SIZE_MAX = 0xffffffffffffffff

Something must be missing in your unit test.
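
The wrap-around can be reproduced in isolation. The helper below, v4_copy_len(), is a hypothetical name that mirrors only the length arithmetic of the patch, taking the byte lengths as plain numbers. It is a sketch of the failing case, not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: mirrors only the length arithmetic of the v4
 * u16_strlcat(). dlen and len are the byte lengths of dest and src,
 * size is the destination buffer size in bytes.
 * Returns the byte count that would be handed to memcpy().
 */
static size_t v4_copy_len(size_t dlen, size_t len, size_t size)
{
	if (dlen >= size)
		return 0;		/* the patch returns before copying */

	size -= dlen;			/* remaining room in bytes */
	if (len >= size)
		len = size - sizeof(uint16_t);	/* wraps when size == 1 */

	return len;
}
```

With dlen = 4, len = 6 and size = 5 (that is, size = dlen + 1), the remaining room is 1 byte, so size - sizeof(uint16_t) wraps around and the helper returns SIZE_MAX.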

Best regards

Heinrich

> +
> +	memcpy(dest, src, len);
> +	dest[len >> 1] = u'\0';
> +
> +	return ret;
> +}
> +
>   /* Convert UTF-16 to UTF-8.  */
>   uint8_t *utf16_to_utf8(uint8_t *dest, const uint16_t *src, size_t size)
>   {

Masahisa Kojima April 4, 2022, 2:50 p.m. UTC | #2

Hi Heinrich,

On Sat, 2 Apr 2022 at 16:19, Heinrich Schuchardt <xypron.glpk@gmx.de> wrote:
>
> On 3/24/22 14:54, Masahisa Kojima wrote:
> > Provide u16 string version of strlcat().
> >
> > Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org>
> > Reviewed-by: Simon Glass <sjg@chromium.org>
> > ---
> > Changes in v4:
> > - add blank line above the return statement
> >
> > Changes in v2:
> > - implement u16_strlcat(with the destination buffer size in argument)
> >    instead of u16_strcat
> >
> >   include/charset.h | 15 +++++++++++++++
> >   lib/charset.c     | 21 +++++++++++++++++++++
> >   2 files changed, 36 insertions(+)
> >
> > diff --git a/include/charset.h b/include/charset.h
> > index b93d023092..dc5fc275ec 100644
> > --- a/include/charset.h
> > +++ b/include/charset.h
> > @@ -259,6 +259,21 @@ u16 *u16_strcpy(u16 *dest, const u16 *src);
> >    */
> >   u16 *u16_strdup(const void *src);
> >
> > +/**
> > + * u16_strlcat() - Append a length-limited, %NUL-terminated string to another
>
> The function should be called u16_strncat() in reference to the
> strncat() function.

I intended to implement a string concatenation function that checks the
destination buffer size, which is what u16_strlcat() does.
strncat() is not safe: it has a size parameter, but that parameter limits
the number of characters copied from the source, not the size of the
destination buffer.
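
To illustrate the difference with plain char strings: the third argument of the standard strncat() limits how many characters are taken from src, so the caller has to work out the remaining room itself. The wrapper below, bounded_append(), is a hypothetical name used only for this sketch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical wrapper showing what a caller of strncat() must do to stay
 * inside a destination buffer of dest_size bytes. dest must already hold
 * a NUL-terminated string that fits in the buffer.
 */
static void bounded_append(char *dest, size_t dest_size, const char *src)
{
	size_t used = strlen(dest);

	if (used + 1 >= dest_size)
		return;			/* no room for even one more char */

	/* n is the maximum number of chars taken from src, not the buffer size */
	strncat(dest, src, dest_size - used - 1);
}
```

Passing the buffer size directly as the third argument can let strncat() write past the end of a partly filled buffer, which is the mistake a strlcat()-style interface avoids.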

>
> > + *
> > + * Append the src string to the dest string, overwriting the terminating
> > + * null word at the end of dest, and then adds a terminating null word.
> > + * It will append at most size - u16_strlen(dst) - 1 bytes, NUL-terminating the result.
> > + *
> > + * @dest:            destination buffer (null terminated)
> > + * @src:             source buffer (null terminated)
> > + * @size:            destination buffer size in bytes
> > + * Return:           total size of the created string in bytes.
> > + *                   If return value >= size, truncation occurred.
> > + */
> > +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size);
> > +
> >   /**
> >    * utf16_to_utf8() - Convert an utf16 string to utf8
> >    *
> > diff --git a/lib/charset.c b/lib/charset.c
> > index f44c58d9d8..47997eca7d 100644
> > --- a/lib/charset.c
> > +++ b/lib/charset.c
> > @@ -428,6 +428,27 @@ u16 *u16_strdup(const void *src)
> >       return new;
> >   }
> >
> > +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size)
> > +{
> > +     size_t dstrlen = u16_strnlen(dest, size >> 1);
> > +     size_t dlen = dstrlen * sizeof(u16);
> > +     size_t len = u16_strlen(src) * sizeof(u16);
> > +     size_t ret = dlen + len;
> > +
> > +     if (dlen >= size)
> > +             return ret;
> > +
> > +     dest += dstrlen;
> > +     size -= dlen;
> > +     if (len >= size)
> > +             len = size - sizeof(u16);
>
> For size = dlen + 1 this results in
>
> len = SIZE_MAX = 0xffffffffffffffff
>
> Something must be missing in your unit test.

Yes, you are correct.
I need to handle the case where size is an odd number.

Thanks,
Masahisa Kojima

>
> Best regards
>
> Heinrich
>
> > +
> > +     memcpy(dest, src, len);
> > +     dest[len >> 1] = u'\0';
> > +
> > +     return ret;
> > +}
> > +
> >   /* Convert UTF-16 to UTF-8.  */
> >   uint8_t *utf16_to_utf8(uint8_t *dest, const uint16_t *src, size_t size)
> >   {
>

Heinrich Schuchardt April 16, 2022, 7:32 a.m. UTC | #3

On 3/24/22 14:54, Masahisa Kojima wrote:
> Provide u16 string version of strlcat().
>
> Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org>
> Reviewed-by: Simon Glass <sjg@chromium.org>
> ---
> Changes in v4:
> - add blank line above the return statement
>
> Changes in v2:
> - implement u16_strlcat(with the destination buffer size in argument)
>    instead of u16_strcat
>
>   include/charset.h | 15 +++++++++++++++
>   lib/charset.c     | 21 +++++++++++++++++++++
>   2 files changed, 36 insertions(+)
>
> diff --git a/include/charset.h b/include/charset.h
> index b93d023092..dc5fc275ec 100644
> --- a/include/charset.h
> +++ b/include/charset.h
> @@ -259,6 +259,21 @@ u16 *u16_strcpy(u16 *dest, const u16 *src);
>    */
>   u16 *u16_strdup(const void *src);
>
> +/**
> + * u16_strlcat() - Append a length-limited, %NUL-terminated string to another
> + *
> + * Append the src string to the dest string, overwriting the terminating
> + * null word at the end of dest, and then adds a terminating null word.
> + * It will append at most size - u16_strlen(dst) - 1 bytes, NUL-terminating the result.

Why "- 1"?

If size is even, we append up to size - u16_strlen(dst) - 2 bytes. The
two extra bytes used for 0x0000.
If size is odd, we append up to size - u16_strlen(dst) - 3 bytes leaving
one byte of the buffer unused.
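
As a worked example of the two cases, reading u16_strlen(dst) above as the byte length of dst: the hypothetical helper max_append_bytes() below first rounds the buffer size down to a whole number of u16 units and then reserves one unit for the terminator. It is a sketch of the arithmetic only.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: upper bound, in bytes, on what can be appended to a
 * destination holding dlen bytes of string data in a buffer of size bytes.
 */
static size_t max_append_bytes(size_t dlen, size_t size)
{
	/* an odd size leaves one byte that cannot hold a u16 */
	size_t usable = size - (size % sizeof(uint16_t));

	/* need room for the existing string plus the 0x0000 terminator */
	if (usable < dlen + sizeof(uint16_t))
		return 0;

	return usable - dlen - sizeof(uint16_t);
}
```

For dlen = 4 this gives 4 bytes for both size = 10 (10 - 4 - 2) and size = 11 (11 - 4 - 3).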

> + *
> + * @dest:		destination buffer (null terminated)
> + * @src:		source buffer (null terminated)
> + * @size:		destination buffer size in bytes

s/$/ including the trailing 0x0000/

> + * Return:		total size of the created string in bytes.
> + *			If return value >= size, truncation occurred.
> + */
> +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size);
> +
>   /**
>    * utf16_to_utf8() - Convert an utf16 string to utf8
>    *
> diff --git a/lib/charset.c b/lib/charset.c
> index f44c58d9d8..47997eca7d 100644
> --- a/lib/charset.c
> +++ b/lib/charset.c
> @@ -428,6 +428,27 @@ u16 *u16_strdup(const void *src)
>   	return new;
>   }
>
> +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size)
> +{

If you start the function with

     size >>= 1;

or

     size /= sizeof(u16);

this might simplify the code.
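
A minimal sketch of that idea, not the code that was eventually merged: converting size to a count of u16 units on entry means an odd byte count is rounded down once, and the rest of the function never mixes bytes and units. The name u16_strlcat_sketch() and the local u16_len() helper are assumptions made so the example is self-contained; U-Boot has its own u16_strlen() and u16_strnlen(). In this sketch the return value is counted in u16 units rather than bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for u16_strnlen(): length in u16 units, capped at max */
static size_t u16_len(const uint16_t *s, size_t max)
{
	size_t n = 0;

	while (n < max && s[n])
		n++;

	return n;
}

/* Sketch: size is given in bytes but handled in u16 units from here on */
static size_t u16_strlcat_sketch(uint16_t *dest, const uint16_t *src,
				 size_t size)
{
	size_t dlen, slen, ret;

	size /= sizeof(uint16_t);	/* odd byte counts round down */
	dlen = u16_len(dest, size);
	slen = u16_len(src, SIZE_MAX);
	ret = dlen + slen;

	if (dlen >= size)		/* no terminator inside the buffer */
		return ret;

	if (slen > size - dlen - 1)
		slen = size - dlen - 1;	/* keep one unit for 0x0000 */

	memcpy(dest + dlen, src, slen * sizeof(uint16_t));
	dest[dlen + slen] = 0;

	return ret;
}
```

Because dlen < size is established before the subtraction, size - dlen - 1 cannot wrap, whatever the parity of the original byte count.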

> +	size_t dstrlen = u16_strnlen(dest, size >> 1);
> +	size_t dlen = dstrlen * sizeof(u16);
> +	size_t len = u16_strlen(src) * sizeof(u16);
> +	size_t ret = dlen + len;

This misses the  trailing 0x0000.

Best regards

Heinrich

> +
> +	if (dlen >= size)
> +		return ret;
> +
> +	dest += dstrlen;
> +	size -= dlen;
> +	if (len >= size)
> +		len = size - sizeof(u16);
> +
> +	memcpy(dest, src, len);
> +	dest[len >> 1] = u'\0';
> +
> +	return ret;
> +}
> +
>   /* Convert UTF-16 to UTF-8.  */
>   uint8_t *utf16_to_utf8(uint8_t *dest, const uint16_t *src, size_t size)
>   {

Masahisa Kojima April 18, 2022, 7:47 a.m. UTC | #4

On Sat, 16 Apr 2022 at 16:32, Heinrich Schuchardt <xypron.glpk@gmx.de> wrote:
>
> On 3/24/22 14:54, Masahisa Kojima wrote:
> > Provide u16 string version of strlcat().
> >
> > Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org>
> > Reviewed-by: Simon Glass <sjg@chromium.org>
> > ---
> > Changes in v4:
> > - add blank line above the return statement
> >
> > Changes in v2:
> > - implement u16_strlcat(with the destination buffer size in argument)
> >    instead of u16_strcat
> >
> >   include/charset.h | 15 +++++++++++++++
> >   lib/charset.c     | 21 +++++++++++++++++++++
> >   2 files changed, 36 insertions(+)
> >
> > diff --git a/include/charset.h b/include/charset.h
> > index b93d023092..dc5fc275ec 100644
> > --- a/include/charset.h
> > +++ b/include/charset.h
> > @@ -259,6 +259,21 @@ u16 *u16_strcpy(u16 *dest, const u16 *src);
> >    */
> >   u16 *u16_strdup(const void *src);
> >
> > +/**
> > + * u16_strlcat() - Append a length-limited, %NUL-terminated string to another
> > + *
> > + * Append the src string to the dest string, overwriting the terminating
> > + * null word at the end of dest, and then adds a terminating null word.
> > + * It will append at most size - u16_strlen(dst) - 1 bytes, NUL-terminating the result.
>
> Why "- 1"?

That is my mistake; it should be 2.

>
> If size is even, we append up to size - u16_strlen(dst) - 2 bytes. The
> two extra bytes used for 0x0000.
> If size is odd, we append up to size - u16_strlen(dst) - 3 bytes leaving
> one byte of the buffer unused.

Thanks, this clearly explains the behavior.

>
> > + *
> > + * @dest:            destination buffer (null terminated)
> > + * @src:             source buffer (null terminated)
> > + * @size:            destination buffer size in bytes
>
> s/$/ including the trailing 0x0000/

OK, I will update the description as suggested.

>
> > + * Return:           total size of the created string in bytes.
> > + *                   If return value >= size, truncation occurred.
> > + */
> > +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size);
> > +
> >   /**
> >    * utf16_to_utf8() - Convert an utf16 string to utf8
> >    *
> > diff --git a/lib/charset.c b/lib/charset.c
> > index f44c58d9d8..47997eca7d 100644
> > --- a/lib/charset.c
> > +++ b/lib/charset.c
> > @@ -428,6 +428,27 @@ u16 *u16_strdup(const void *src)
> >       return new;
> >   }
> >
> > +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size)
> > +{
>
> If you start the function with
>
>      size >>= 1;
>
> or
>
>      size /= sizeof(u16);
>
> this might simplify the code.

u16_strlcat() deals with two kinds of size: the length of the u16 string
and the size of the buffer.
I will rename some of the variables to make the distinction clear.

>
> > +     size_t dstrlen = u16_strnlen(dest, size >> 1);
> > +     size_t dlen = dstrlen * sizeof(u16);
> > +     size_t len = u16_strlen(src) * sizeof(u16);
> > +     size_t ret = dlen + len;
>
> This misses the  trailing 0x0000.

strlcat() is not a standard C function, but the Linux implementation
of strlcat() does not count the trailing 0x00 in its return value [1],
and the same is true for OpenBSD.
[1] https://github.com/torvalds/linux/blob/master/lib/string.c#L319

The current U-Boot strlcat() counts the trailing 0x00; I think it needs
to be updated.
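
A small sketch of why the convention matters for the truncation check, using plain char strings. Both helper names are hypothetical, and the first one only covers the case where dest is NUL-terminated inside its buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: the value a strlcat() following the Linux/OpenBSD
 * convention reports when dest is terminated within the buffer, namely
 * the length of the string it tried to create, without the trailing 0x00.
 */
static size_t strlcat_result(const char *dest, const char *src)
{
	return strlen(dest) + strlen(src);
}

/* With that convention, ret >= size means the result did not fit */
static bool strlcat_truncated(size_t ret, size_t size)
{
	return ret >= size;
}
```

For dest = "abc" and src = "de" the result is 5, which fits a 6-byte buffer exactly and is truncated in a 5-byte one. If the return value counted the terminator instead, the same "ret >= size" test would report truncation for the string that fits exactly.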

Thanks,
Masahisa Kojima

>
> Best regards
>
> Heinrich
>
> > +
> > +     if (dlen >= size)
> > +             return ret;
> > +
> > +     dest += dstrlen;
> > +     size -= dlen;
> > +     if (len >= size)
> > +             len = size - sizeof(u16);
> > +
> > +     memcpy(dest, src, len);
> > +     dest[len >> 1] = u'\0';
> > +
> > +     return ret;
> > +}
> > +
> >   /* Convert UTF-16 to UTF-8.  */
> >   uint8_t *utf16_to_utf8(uint8_t *dest, const uint16_t *src, size_t size)
> >   {
>

Masahisa Kojima April 28, 2022, 7:45 a.m. UTC | #5

Hi Heinrich,

On Mon, 18 Apr 2022 at 16:47, Masahisa Kojima
<masahisa.kojima@linaro.org> wrote:
>
> On Sat, 16 Apr 2022 at 16:32, Heinrich Schuchardt <xypron.glpk@gmx.de> wrote:
> >
> > On 3/24/22 14:54, Masahisa Kojima wrote:
> > > Provide u16 string version of strlcat().
> > >
> > > Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org>
> > > Reviewed-by: Simon Glass <sjg@chromium.org>
> > > ---
> > > Changes in v4:
> > > - add blank line above the return statement
> > >
> > > Changes in v2:
> > > - implement u16_strlcat(with the destination buffer size in argument)
> > >    instead of u16_strcat
> > >
> > >   include/charset.h | 15 +++++++++++++++
> > >   lib/charset.c     | 21 +++++++++++++++++++++
> > >   2 files changed, 36 insertions(+)
> > >
> > > diff --git a/include/charset.h b/include/charset.h
> > > index b93d023092..dc5fc275ec 100644
> > > --- a/include/charset.h
> > > +++ b/include/charset.h
> > > @@ -259,6 +259,21 @@ u16 *u16_strcpy(u16 *dest, const u16 *src);
> > >    */
> > >   u16 *u16_strdup(const void *src);
> > >
> > > +/**
> > > + * u16_strlcat() - Append a length-limited, %NUL-terminated string to another
> > > + *
> > > + * Append the src string to the dest string, overwriting the terminating
> > > + * null word at the end of dest, and then adds a terminating null word.
> > > + * It will append at most size - u16_strlen(dst) - 1 bytes, NUL-terminating the result.
> >
> > Why "- 1"?
>
> That is my mistake; it should be 2.
>
> >
> > If size is even, we append up to size - u16_strlen(dst) - 2 bytes. The
> > two extra bytes used for 0x0000.
> > If size is odd, we append up to size - u16_strlen(dst) - 3 bytes leaving
> > one byte of the buffer unused.

To keep the behavior simple, I will change the meaning of the third
parameter from the buffer size in bytes to a count of u16 characters.
This matches the other u16_strxxx functions in U-Boot.

Thanks,
Masahisa Kojima
>
> Thanks, this clearly explains the behavior.
>
> >
> > > + *
> > > + * @dest:            destination buffer (null terminated)
> > > + * @src:             source buffer (null terminated)
> > > + * @size:            destination buffer size in bytes
> >
> > s/$/ including the trailing 0x0000/
>
> OK, I will update the description as suggested.
>
> >
> > > + * Return:           total size of the created string in bytes.
> > > + *                   If return value >= size, truncation occurred.
> > > + */
> > > +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size);
> > > +
> > >   /**
> > >    * utf16_to_utf8() - Convert an utf16 string to utf8
> > >    *
> > > diff --git a/lib/charset.c b/lib/charset.c
> > > index f44c58d9d8..47997eca7d 100644
> > > --- a/lib/charset.c
> > > +++ b/lib/charset.c
> > > @@ -428,6 +428,27 @@ u16 *u16_strdup(const void *src)
> > >       return new;
> > >   }
> > >
> > > +size_t u16_strlcat(u16 *dest, const u16 *src, size_t size)
> > > +{
> >
> > If you start the function with
> >
> >      size >>= 1;
> >
> > or
> >
> >      size /= sizeof(u16);
> >
> > this might simplify the code.
>
> u16_strlcat() deals with two kinds of size: the length of the u16 string
> and the size of the buffer.
> I will rename some of the variables to make the distinction clear.
>
> >
> > > +     size_t dstrlen = u16_strnlen(dest, size >> 1);
> > > +     size_t dlen = dstrlen * sizeof(u16);
> > > +     size_t len = u16_strlen(src) * sizeof(u16);
> > > +     size_t ret = dlen + len;
> >
> > This misses the  trailing 0x0000.
>
> strlcat() is not a standard C function, but the Linux implementation
> of strlcat() does not count the trailing 0x00 in its return value [1],
> and the same is true for OpenBSD.
> [1] https://github.com/torvalds/linux/blob/master/lib/string.c#L319
>
> The current U-Boot strlcat() counts the trailing 0x00; I think it needs
> to be updated.
>
> Thanks,
> Masahisa Kojima
>
> >
> > Best regards
> >
> > Heinrich
> >
> > > +
> > > +     if (dlen >= size)
> > > +             return ret;
> > > +
> > > +     dest += dstrlen;
> > > +     size -= dlen;
> > > +     if (len >= size)
> > > +             len = size - sizeof(u16);
> > > +
> > > +     memcpy(dest, src, len);
> > > +     dest[len >> 1] = u'\0';
> > > +
> > > +     return ret;
> > > +}
> > > +
> > >   /* Convert UTF-16 to UTF-8.  */
> > >   uint8_t *utf16_to_utf8(uint8_t *dest, const uint16_t *src, size_t size)
> > >   {
> >
