[COMMITTED] Do not require memset elimination in explicit_bzero test

Message ID d188b9b4-f134-697e-909c-526aa6a45750@linux.vnet.ibm.com
State New
Headers show

Commit Message

Stefan Liebler Jan. 10, 2017, 8:22 a.m.
On 12/30/2016 01:16 PM, Florian Weimer wrote:
> On 12/21/2016 07:04 PM, Florian Weimer wrote:

>> On 12/20/2016 11:09 AM, Florian Weimer wrote:

>>> Some targets fail to apply dead store elimination to the

>>> memset call in setup_ordinary_clear.  Before this commit,

>>> this causes the test case to fail.  Instead, the test case

>>> now logs lack of memset elimination as an informational

>>> message.

>>>

>>> 2016-12-20  Florian Weimer  <fweimer@redhat.com>

>>>

>>>     Do not require memset elimination in explicit_bzero test.

>>>     * string/tst-xbzero-opt.c (prepare_test_buffer): Force inlining.

>>>     (enum test_expectation): Add NO_EXPECTATIONS.

>>>     (subtests): NO_EXPECTATIONS for ordinary clear.

>>>     (check_test_buffer): Handle NO_EXPECTATIONS.

>>>     * string/Makefile (CFLAGS-tst-xbzero-opt.c): Compile with -O3.

>>

>> Stefan, this test still fails for me on s390x:

>>

>> PASS: no clear/prepare: expected 32 got 32

>> PASS: no clear/test: expected some got 32

>> PASS: ordinary clear/prepare: expected 32 got 32

>> INFO: ordinary clear/test: found 0 patterns (memset not eliminated)

>> PASS: explicit clear/prepare: expected 32 got 32

>> FAIL: explicit clear/test: expected 0 got 1

>>

>> Do you have an idea what's going on there?

>

> I filed bug 21006 and will add it as a release blocker.

>

> Thanks,

> Florian

>

Hi Florian,

the test is also failing on my system and I've had a look into it.

In setup_explicit_clear, the buffer is filled with the test_pattern.
On s390x the memcpy in prepare_test_buffer is done by loading
r4 / r5 with the test_pattern and using store multiple instruction
to store r4 / r5 to buf.
If explicit_bzero is resolved in setup_explicit_clear, r4 / r5 is
stored to stack by _dl_runtime_resolve and the call to memmem in
count_test_patterns finds a hit of the test_pattern on the stack.

The attached patch resolves all symbols at program startup by linking
with -z now.  This omits the call of _dl_runtime_resolve within
setup_explicit_clear and the test passes.

If this is okay, I'll commit this patch and clear this bug in the 
release blockers list in the release-wiki.

Bye
Stefan

ChangeLog:

	[BZ #21006]
	* string/Makefile (LDFLAGS-tst-xbzero-opt): New variable.

Comments

Stefan Liebler Jan. 16, 2017, 8:24 a.m. | #1
On 01/10/2017 09:22 AM, Stefan Liebler wrote:
> On 12/30/2016 01:16 PM, Florian Weimer wrote:

>> On 12/21/2016 07:04 PM, Florian Weimer wrote:

>>> On 12/20/2016 11:09 AM, Florian Weimer wrote:

>>>> Some targets fail to apply dead store elimination to the

>>>> memset call in setup_ordinary_clear.  Before this commit,

>>>> this causes the test case to fail.  Instead, the test case

>>>> now logs lack of memset elimination as an informational

>>>> message.

>>>>

>>>> 2016-12-20  Florian Weimer  <fweimer@redhat.com>

>>>>

>>>>     Do not require memset elimination in explicit_bzero test.

>>>>     * string/tst-xbzero-opt.c (prepare_test_buffer): Force inlining.

>>>>     (enum test_expectation): Add NO_EXPECTATIONS.

>>>>     (subtests): NO_EXPECTATIONS for ordinary clear.

>>>>     (check_test_buffer): Handle NO_EXPECTATIONS.

>>>>     * string/Makefile (CFLAGS-tst-xbzero-opt.c): Compile with -O3.

>>>

>>> Stefan, this test still fails for me on s390x:

>>>

>>> PASS: no clear/prepare: expected 32 got 32

>>> PASS: no clear/test: expected some got 32

>>> PASS: ordinary clear/prepare: expected 32 got 32

>>> INFO: ordinary clear/test: found 0 patterns (memset not eliminated)

>>> PASS: explicit clear/prepare: expected 32 got 32

>>> FAIL: explicit clear/test: expected 0 got 1

>>>

>>> Do you have an idea what's going on there?

>>

>> I filed bug 21006 and will add it as a release blocker.

>>

>> Thanks,

>> Florian

>>

> Hi Florian,

>

> the test is also failing on my system and I've had a look into it.

>

> In setup_explicit_clear, the buffer is filled with the test_pattern.

> On s390x the memcpy in prepare_test_buffer is done by loading

> r4 / r5 with the test_pattern and using store multiple instruction

> to store r4 / r5 to buf.

> If explicit_bzero is resolved in setup_explicit_clear, r4 / r5 is

> stored to stack by _dl_runtime_resolve and the call to memmem in

> count_test_patterns finds a hit of the test_pattern on the stack.

>

> The attached patch resolves all symbols at program startup by linking

> with -z now.  This omits the call of _dl_runtime_resolve within

> setup_explicit_clear and the test passes.

>

> If this is okay, I'll commit this patch and clear this bug in the

> release blockers list in the release-wiki.

>

> Bye

> Stefan

>

> ChangeLog:

>

>     [BZ #21006]

>     * string/Makefile (LDFLAGS-tst-xbzero-opt): New variable.

>

PING
Zack Weinberg Jan. 16, 2017, 3:28 p.m. | #2
On Mon, Jan 16, 2017 at 3:24 AM, Stefan Liebler <stli@linux.vnet.ibm.com> wrote:
> On 01/10/2017 09:22 AM, Stefan Liebler wrote:

>>

>> In setup_explicit_clear, the buffer is filled with the test_pattern.

>> On s390x the memcpy in prepare_test_buffer is done by loading

>> r4 / r5 with the test_pattern and using store multiple instruction

>> to store r4 / r5 to buf.

>> If explicit_bzero is resolved in setup_explicit_clear, r4 / r5 is

>> stored to stack by _dl_runtime_resolve and the call to memmem in

>> count_test_patterns finds a hit of the test_pattern on the stack.

>>

>> The attached patch resolves all symbols at program startup by linking

>> with -z now.  This omits the call of _dl_runtime_resolve within

>> setup_explicit_clear and the test passes.

>>

>> If this is okay, I'll commit this patch and clear this bug in the

>> release blockers list in the release-wiki.


This seems like a reasonable workaround to me.  Please commit.

(Guess we better add "spill slots for callee-save registers, including
registers saved only by dynamic linker stubs" to the list of things to
worry about when adding explicit_bzero to the compiler...)

zw
Stefan Liebler Jan. 17, 2017, 8:04 a.m. | #3
On 01/16/2017 04:28 PM, Zack Weinberg wrote:
> On Mon, Jan 16, 2017 at 3:24 AM, Stefan Liebler <stli@linux.vnet.ibm.com> wrote:

>> On 01/10/2017 09:22 AM, Stefan Liebler wrote:

>>>

>>> In setup_explicit_clear, the buffer is filled with the test_pattern.

>>> On s390x the memcpy in prepare_test_buffer is done by loading

>>> r4 / r5 with the test_pattern and using store multiple instruction

>>> to store r4 / r5 to buf.

>>> If explicit_bzero is resolved in setup_explicit_clear, r4 / r5 is

>>> stored to stack by _dl_runtime_resolve and the call to memmem in

>>> count_test_patterns finds a hit of the test_pattern on the stack.

>>>

>>> The attached patch resolves all symbols at program startup by linking

>>> with -z now.  This omits the call of _dl_runtime_resolve within

>>> setup_explicit_clear and the test passes.

>>>

>>> If this is okay, I'll commit this patch and clear this bug in the

>>> release blockers list in the release-wiki.

>

> This seems like a reasonable workaround to me.  Please commit.

>

> (Guess we better add "spill slots for callee-save registers, including

> registers saved only by dynamic linker stubs" to the list of things to

> worry about when adding explicit_bzero to the compiler...)

>

> zw

>

Thanks.
Committed, closed bug and cleared entry in the release blockers list in 
the release-wiki.

Patch hide | download patch | download mbox

commit a07f447fd235d1898a56af7317a4ff6a11a603b9
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Tue Jan 10 08:47:13 2017 +0100

    S390: Fix FAIL in test string/tst-xbzero-opt [BZ #21006]
    
    On s390x this test failed with:
    FAIL: explicit clear/test: expected 0 got 1
    
    In setup_explicit_clear, the buffer is filled with the test_pattern.
    On s390x the memcpy in prepare_test_buffer is done by loading
    r4 / r5 with the test_pattern and using store multiple instruction
    to store r4 / r5 to buf.
    If explicit_bzero is resolved in setup_explicit_clear, r4 / r5 is
    stored to stack by _dl_runtime_resolve and the call to memmem in
    count_test_patterns finds a hit of the test_pattern on the stack.
    
    This patch resolves all symbols at program startup by linking with
    -z now.  This omits the call of _dl_runtime_resolve within
    setup_explicit_clear and the test passes.
    
    ChangeLog:
    
    	[BZ #21006]
    	* string/Makefile (LDFLAGS-tst-xbzero-opt): New variable.

diff --git a/string/Makefile b/string/Makefile
index 04e9da9..87e0d1d 100644
--- a/string/Makefile
+++ b/string/Makefile
@@ -73,6 +73,14 @@  CFLAGS-stratcliff.c = -fno-builtin
 CFLAGS-test-ffs.c = -fno-builtin
 CFLAGS-tst-inlcall.c = -fno-builtin
 CFLAGS-tst-xbzero-opt.c = -O3
+# BZ 21006: Resolve all functions but at least explicit_bzero at startup.
+# Otherwise the test fails on s390x as the memcpy in prepare_test_buffer is
+# done by loading r4 / r5 with the test_pattern and using store multiple
+# instruction to store r4 / r5 to buf.  If explicit_bzero would be resolved in
+# setup_explicit_clear, r4 / r5 would be stored to stack by _dl_runtime_resolve
+# and the call to memmem in count_test_patterns will find a hit of the
+# test_pattern on the stack.
+LDFLAGS-tst-xbzero-opt = -z now
 
 # Called during TLS initialization.
 CFLAGS-memcpy.c = $(no-stack-protector)