diff mbox

Fix undefined behaviour inconsistent for strtok

Message ID 1477393113-3845-1-git-send-email-adhemerval.zanella@linaro.org
State New
Headers show

Commit Message

Adhemerval Zanella Netto Oct. 25, 2016, 10:58 a.m. UTC
Although not stated in any standard how strtok should return if you
pass a null argument if the previous argument is also null, this patch
changes the default implementation to follow this idea.

The original bug report comment #1 states glibc code convention [6]
should not allow it, however for this specific function its contract
does not expect failure even if the returned is ignored (since it
would be a no-op).  Also, patch idea is more focuses on implementation
portability , since it aligns glibc with other implementation that
follows the same idea for strtok:

  - FreeBSD [1], OpenBSD [2], NetBSD [3];
  - uclibc and uclibc-ng [4]
  - musl [5]

I see little value to either assert on null input (as stated in comment
2 from original bug report), change both x86_64 and powerpc64le
implementation to fault on such input, or to keep a different behavior
compared to other libc implementations.

Checked on x86_64, aarch64, and powerpc64le.

	* string/strtok.c (strtok): Return null is previous input is also
	null.
	* string/tst-strtok.c (do_test): Add more strtok coverage.

[1] https://github.com/freebsd/freebsd/blob/386ddae58459341ec567604707805814a2128a57/lib/libc/string/strtok.c
[2] http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/string/strtok_r.c?rev=1.10&content-type=text/x-cvsweb-markup&only_with_tag=MAIN
[3] https://github.com/openbsd/src/blob/5271000b44abe23907b73bbb3aa38ddf4a0bce08/lib/libc/string/strtok.c
[4] http://www.uclibc-ng.org/browser/uclibc-ng/libc/string/strtok_r.c
[5] https://git.musl-libc.org/cgit/musl/tree/src/string/strtok.c
[6] https://sourceware.org/glibc/wiki/Style_and_Conventions#Invalid_pointers
---
 ChangeLog           |  7 +++++++
 string/strtok.c     |  4 ++--
 string/tst-strtok.c | 23 ++++++++++++-----------
 3 files changed, 21 insertions(+), 13 deletions(-)

-- 
2.7.4

Comments

Andreas Schwab Oct. 25, 2016, 11:31 a.m. UTC | #1
On Okt 25 2016, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

> 	* string/strtok.c (strtok): Return null is previous input is also


s/is/if/

> diff --git a/string/strtok.c b/string/strtok.c

> index 7a4574d..5c4b309 100644

> --- a/string/strtok.c

> +++ b/string/strtok.c

> @@ -40,8 +40,8 @@ STRTOK (char *s, const char *delim)

>  {

>    char *token;

>  

> -  if (s == NULL)

> -    s = olds;

> +  if ((s == NULL) && ((s = olds) == NULL))


Please avoid assignment in an expression.  And the parens are redundant.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Adhemerval Zanella Netto Oct. 25, 2016, 12:33 p.m. UTC | #2
On 25/10/2016 09:31, Andreas Schwab wrote:
> On Okt 25 2016, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

> 

>> 	* string/strtok.c (strtok): Return null is previous input is also

> 

> s/is/if/

> 

>> diff --git a/string/strtok.c b/string/strtok.c

>> index 7a4574d..5c4b309 100644

>> --- a/string/strtok.c

>> +++ b/string/strtok.c

>> @@ -40,8 +40,8 @@ STRTOK (char *s, const char *delim)

>>  {

>>    char *token;

>>  

>> -  if (s == NULL)

>> -    s = olds;

>> +  if ((s == NULL) && ((s = olds) == NULL))

> 

> Please avoid assignment in an expression.  And the parens are redundant.

> 

> Andreas.

> 


Right, with these fixes would it be acceptable?
Andreas Schwab Oct. 25, 2016, 12:57 p.m. UTC | #3
On Okt 25 2016, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

> On 25/10/2016 09:31, Andreas Schwab wrote:

>> On Okt 25 2016, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

>> 

>>> 	* string/strtok.c (strtok): Return null is previous input is also

>> 

>> s/is/if/

>> 

>>> diff --git a/string/strtok.c b/string/strtok.c

>>> index 7a4574d..5c4b309 100644

>>> --- a/string/strtok.c

>>> +++ b/string/strtok.c

>>> @@ -40,8 +40,8 @@ STRTOK (char *s, const char *delim)

>>>  {

>>>    char *token;

>>>  

>>> -  if (s == NULL)

>>> -    s = olds;

>>> +  if ((s == NULL) && ((s = olds) == NULL))

>> 

>> Please avoid assignment in an expression.  And the parens are redundant.

>> 

>> Andreas.

>> 

>

> Right, with these fixes would it be acceptable?


I don't see much point in supporting invalid use of strtok.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Adhemerval Zanella Netto Oct. 25, 2016, 1:13 p.m. UTC | #4
On 25/10/2016 10:57, Andreas Schwab wrote:
> On Okt 25 2016, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

> 

>> On 25/10/2016 09:31, Andreas Schwab wrote:

>>> On Okt 25 2016, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

>>>

>>>> 	* string/strtok.c (strtok): Return null is previous input is also

>>>

>>> s/is/if/

>>>

>>>> diff --git a/string/strtok.c b/string/strtok.c

>>>> index 7a4574d..5c4b309 100644

>>>> --- a/string/strtok.c

>>>> +++ b/string/strtok.c

>>>> @@ -40,8 +40,8 @@ STRTOK (char *s, const char *delim)

>>>>  {

>>>>    char *token;

>>>>  

>>>> -  if (s == NULL)

>>>> -    s = olds;

>>>> +  if ((s == NULL) && ((s = olds) == NULL))

>>>

>>> Please avoid assignment in an expression.  And the parens are redundant.

>>>

>>> Andreas.

>>>

>>

>> Right, with these fixes would it be acceptable?

> 

> I don't see much point in supporting invalid use of strtok.

> 

> Andreas.

> 


My point is just to add portability and align with other current
implementations.
Andreas Schwab Oct. 25, 2016, 1:19 p.m. UTC | #5
On Okt 25 2016, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

> On 25/10/2016 10:57, Andreas Schwab wrote:

>> On Okt 25 2016, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

>> 

>>> On 25/10/2016 09:31, Andreas Schwab wrote:

>>>> On Okt 25 2016, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

>>>>

>>>>> 	* string/strtok.c (strtok): Return null is previous input is also

>>>>

>>>> s/is/if/

>>>>

>>>>> diff --git a/string/strtok.c b/string/strtok.c

>>>>> index 7a4574d..5c4b309 100644

>>>>> --- a/string/strtok.c

>>>>> +++ b/string/strtok.c

>>>>> @@ -40,8 +40,8 @@ STRTOK (char *s, const char *delim)

>>>>>  {

>>>>>    char *token;

>>>>>  

>>>>> -  if (s == NULL)

>>>>> -    s = olds;

>>>>> +  if ((s == NULL) && ((s = olds) == NULL))

>>>>

>>>> Please avoid assignment in an expression.  And the parens are redundant.

>>>>

>>>> Andreas.

>>>>

>>>

>>> Right, with these fixes would it be acceptable?

>> 

>> I don't see much point in supporting invalid use of strtok.

>> 

>> Andreas.

>> 

>

> My point is just to add portability and align with other current

> implementations. 


Has it ever be a problem in the past?

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Adhemerval Zanella Netto Oct. 25, 2016, 1:22 p.m. UTC | #6
On 25/10/2016 11:19, Andreas Schwab wrote:
> On Okt 25 2016, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

> 

>> On 25/10/2016 10:57, Andreas Schwab wrote:

>>> On Okt 25 2016, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

>>>

>>>> On 25/10/2016 09:31, Andreas Schwab wrote:

>>>>> On Okt 25 2016, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

>>>>>

>>>>>> 	* string/strtok.c (strtok): Return null is previous input is also

>>>>>

>>>>> s/is/if/

>>>>>

>>>>>> diff --git a/string/strtok.c b/string/strtok.c

>>>>>> index 7a4574d..5c4b309 100644

>>>>>> --- a/string/strtok.c

>>>>>> +++ b/string/strtok.c

>>>>>> @@ -40,8 +40,8 @@ STRTOK (char *s, const char *delim)

>>>>>>  {

>>>>>>    char *token;

>>>>>>  

>>>>>> -  if (s == NULL)

>>>>>> -    s = olds;

>>>>>> +  if ((s == NULL) && ((s = olds) == NULL))

>>>>>

>>>>> Please avoid assignment in an expression.  And the parens are redundant.

>>>>>

>>>>> Andreas.

>>>>>

>>>>

>>>> Right, with these fixes would it be acceptable?

>>>

>>> I don't see much point in supporting invalid use of strtok.

>>>

>>> Andreas.

>>>

>>

>> My point is just to add portability and align with other current

>> implementations. 

> 

> Has it ever be a problem in the past?

> 

> Andreas.

> 


None I am aware of, but regardless it is a effort to close down old
glibc bugs and keep the backlog under control.
Florian Weimer Oct. 25, 2016, 1:32 p.m. UTC | #7
On 10/25/2016 12:58 PM, Adhemerval Zanella wrote:
> The original bug report comment #1


I don't see a reference to that bug anywhere, so it's not clear (to me 
at least) what the report was about.  (But I can guess …)

Florian
Andreas Schwab Oct. 25, 2016, 1:45 p.m. UTC | #8
On Okt 25 2016, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

> None I am aware of, but regardless it is a effort to close down old

> glibc bugs and keep the backlog under control.


Which bug?

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Adhemerval Zanella Netto Oct. 25, 2016, 1:49 p.m. UTC | #9
On 25/10/2016 11:45, Andreas Schwab wrote:
> On Okt 25 2016, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

> 

>> None I am aware of, but regardless it is a effort to close down old

>> glibc bugs and keep the backlog under control.

> 

> Which bug?

> 

> Andreas.

> 


Oops, my bad. BZ#16640.
Joseph Myers Oct. 25, 2016, 1:51 p.m. UTC | #10
On Tue, 25 Oct 2016, Adhemerval Zanella wrote:

> None I am aware of, but regardless it is a effort to close down old

> glibc bugs and keep the backlog under control.


Well, if a bug report is invalid then closing it as INVALID is 
appropriate.  Or if there is an idea in the bug report that might or might 
not be a good idea but isn't appropriate for Bugzilla, closing as INVALID 
and putting a note on 
<https://sourceware.org/glibc/wiki/Development_Todo/Master> of the idea to 
consider (with a link to the previous discussion in the bug) is 
appropriate.

-- 
Joseph S. Myers
joseph@codesourcery.com
Adhemerval Zanella Netto Oct. 25, 2016, 2:08 p.m. UTC | #11
On 25/10/2016 11:51, Joseph Myers wrote:
> On Tue, 25 Oct 2016, Adhemerval Zanella wrote:

> 

>> None I am aware of, but regardless it is a effort to close down old

>> glibc bugs and keep the backlog under control.

> 

> Well, if a bug report is invalid then closing it as INVALID is 

> appropriate.  Or if there is an idea in the bug report that might or might 

> not be a good idea but isn't appropriate for Bugzilla, closing as INVALID 

> and putting a note on 

> <https://sourceware.org/glibc/wiki/Development_Todo/Master> of the idea to 

> consider (with a link to the previous discussion in the bug) is 

> appropriate.

> 


Right, but the bug report is about the inconsistent behaviour about
for x86_64 (and powerpc as well) and default one.  Bug report comments
from Carlos pointed that it should be fixed in x86_64/powerpc
implementation, while I argued that it would better to follow
what other libc are aiming for since this specific case that does
trigger any particular issue. That's why I think it is not invalid.

If the consensus is indeed to fix the x86_64/powerpc I will work
towards, although I still prefer aim for portability.
diff mbox

Patch

diff --git a/string/strtok.c b/string/strtok.c
index 7a4574d..5c4b309 100644
--- a/string/strtok.c
+++ b/string/strtok.c
@@ -40,8 +40,8 @@  STRTOK (char *s, const char *delim)
 {
   char *token;
 
-  if (s == NULL)
-    s = olds;
+  if ((s == NULL) && ((s = olds) == NULL))
+    return NULL;
 
   /* Scan leading delimiters.  */
   s += strspn (s, delim);
diff --git a/string/tst-strtok.c b/string/tst-strtok.c
index 6fbef9f..d9180a4 100644
--- a/string/tst-strtok.c
+++ b/string/tst-strtok.c
@@ -2,25 +2,26 @@ 
 #include <stdio.h>
 #include <string.h>
 
+static int do_test (void);
+
+#define TEST_FUNCTION do_test ()
+#include "../test-skeleton.c"
+
 static int
 do_test (void)
 {
   char buf[1] = { 0 };
   int result = 0;
 
+  if (strtok (NULL, " ") != NULL)
+    FAIL_RET ("first strtok call did not return NULL");
+  if (strtok (NULL, " ") != NULL)
+    FAIL_RET ("second strtok call did not return NULL");
+
   if (strtok (buf, " ") != NULL)
-    {
-      puts ("first strtok call did not return NULL");
-      result = 1;
-    }
+    FAIL_RET ("third strtok call did not return NULL");
   else if (strtok (NULL, " ") != NULL)
-    {
-      puts ("second strtok call did not return NULL");
-      result = 1;
-    }
+    FAIL_RET ("forth strtok call did not return NULL");
 
   return result;
 }
-
-#define TEST_FUNCTION do_test ()
-#include "../test-skeleton.c"