From patchwork Fri Nov 6 10:46:05 2015
X-Patchwork-Submitter: Kyrylo Tkachov
X-Patchwork-Id: 56104
Message-ID: <563C84ED.4010603@arm.com>
Date: Fri, 06 Nov 2015 10:46:05 +0000
From: Kyrill Tkachov
To: GCC Patches
CC: Ramana Radhakrishnan, Richard Earnshaw
Subject: [PATCH][ARM] PR 68143 Properly update memory offsets when expanding setmem

Hi all,

In this wrong-code PR the vector setmem expansion, and arm_block_set_aligned_vect in particular, uses the wrong offset when calling adjust_automodify_address. In the attached testcase, during the initial zeroing-out we get two V16QI stores, but both are recorded by adjust_automodify_address as modifying x+0 rather than x+0 and x+12 (the total size to be written is 28). This led to the scheduling pass moving the store from "x.g = 2;" to before the zeroing stores.
This patch fixes the problem by keeping track of the offset at which stores are emitted and passing it to adjust_automodify_address as appropriate. From inspection I see that arm_block_set_unaligned_vect also has this issue, so I performed the same fix in that function as well.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?

This bug appears in GCC 5 too and I'm currently testing this patch there. Ok to backport to GCC 5 as well?

Thanks,
Kyrill

2015-11-06  Kyrylo Tkachov

    PR target/68143
    * config/arm/arm.c (arm_block_set_unaligned_vect): Keep track
    of the offset from dstbase and use it appropriately in
    adjust_automodify_address.
    (arm_block_set_aligned_vect): Likewise.

2015-11-06  Kyrylo Tkachov

    PR target/68143
    * gcc.target/arm/pr68143_1.c: New test.

commit 78c6989a7af1df672ea227057180d79d717ed5f3
Author: Kyrylo Tkachov
Date:   Wed Oct 28 17:29:18 2015 +0000

    [ARM] Properly update memory offsets when expanding setmem

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 66e8afc..adf3143 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -29268,7 +29268,7 @@ arm_block_set_unaligned_vect (rtx dstbase,
   rtx (*gen_func) (rtx, rtx);
   machine_mode mode;
   unsigned HOST_WIDE_INT v = value;
-
+  unsigned int offset = 0;
   gcc_assert ((align & 0x3) != 0);
   nelt_v8 = GET_MODE_NUNITS (V8QImode);
   nelt_v16 = GET_MODE_NUNITS (V16QImode);
@@ -29289,7 +29289,7 @@ arm_block_set_unaligned_vect (rtx dstbase,
     return false;
 
   dst = copy_addr_to_reg (XEXP (dstbase, 0));
-  mem = adjust_automodify_address (dstbase, mode, dst, 0);
+  mem = adjust_automodify_address (dstbase, mode, dst, offset);
 
   v = sext_hwi (v, BITS_PER_WORD);
   val_elt = GEN_INT (v);
@@ -29306,7 +29306,11 @@ arm_block_set_unaligned_vect (rtx dstbase,
     {
       emit_insn ((*gen_func) (mem, reg));
       if (i + 2 * nelt_mode <= length)
-	emit_insn (gen_add2_insn (dst, GEN_INT (nelt_mode)));
+	{
+	  emit_insn (gen_add2_insn (dst, GEN_INT (nelt_mode)));
+	  offset += nelt_mode;
+	  mem = adjust_automodify_address (dstbase, mode, dst, offset);
+	}
     }
 
   /* If there are not less than nelt_v8 bytes leftover, we must be in
@@ -29317,6 +29321,9 @@ arm_block_set_unaligned_vect (rtx dstbase,
   if (i + nelt_v8 < length)
     {
       emit_insn (gen_add2_insn (dst, GEN_INT (length - i)));
+      offset += length - i;
+      mem = adjust_automodify_address (dstbase, mode, dst, offset);
+
       /* We are shifting bytes back, set the alignment accordingly.  */
       if ((length & 1) != 0 && align >= 2)
	set_mem_align (mem, BITS_PER_UNIT);
@@ -29327,12 +29334,13 @@ arm_block_set_unaligned_vect (rtx dstbase,
   else if (i < length && i + nelt_v8 >= length)
     {
       if (mode == V16QImode)
-	{
-	  reg = gen_lowpart (V8QImode, reg);
-	  mem = adjust_automodify_address (dstbase, V8QImode, dst, 0);
-	}
+	reg = gen_lowpart (V8QImode, reg);
+
       emit_insn (gen_add2_insn (dst, GEN_INT ((length - i)
					       + (nelt_mode - nelt_v8))));
+      offset += (length - i) + (nelt_mode - nelt_v8);
+      mem = adjust_automodify_address (dstbase, V8QImode, dst, offset);
+
       /* We are shifting bytes back, set the alignment accordingly.  */
       if ((length & 1) != 0 && align >= 2)
	set_mem_align (mem, BITS_PER_UNIT);
@@ -29359,6 +29367,7 @@ arm_block_set_aligned_vect (rtx dstbase,
   rtx rval[MAX_VECT_LEN];
   machine_mode mode;
   unsigned HOST_WIDE_INT v = value;
+  unsigned int offset = 0;
 
   gcc_assert ((align & 0x3) == 0);
   nelt_v8 = GET_MODE_NUNITS (V8QImode);
@@ -29390,14 +29399,15 @@ arm_block_set_aligned_vect (rtx dstbase,
   /* Handle first 16 bytes specially using vst1:v16qi instruction.  */
   if (mode == V16QImode)
     {
-      mem = adjust_automodify_address (dstbase, mode, dst, 0);
+      mem = adjust_automodify_address (dstbase, mode, dst, offset);
       emit_insn (gen_movmisalignv16qi (mem, reg));
       i += nelt_mode;
       /* Handle (8, 16) bytes leftover using vst1:v16qi again.  */
       if (i + nelt_v8 < length && i + nelt_v16 > length)
	{
	  emit_insn (gen_add2_insn (dst, GEN_INT (length - nelt_mode)));
-	  mem = adjust_automodify_address (dstbase, mode, dst, 0);
+	  offset += length - nelt_mode;
+	  mem = adjust_automodify_address (dstbase, mode, dst, offset);
	  /* We are shifting bytes back, set the alignment accordingly.  */
	  if ((length & 0x3) == 0)
	    set_mem_align (mem, BITS_PER_UNIT * 4);
@@ -29419,7 +29429,7 @@ arm_block_set_aligned_vect (rtx dstbase,
   for (; (i + nelt_mode <= length); i += nelt_mode)
     {
       addr = plus_constant (Pmode, dst, i);
-      mem = adjust_automodify_address (dstbase, mode, addr, i);
+      mem = adjust_automodify_address (dstbase, mode, addr, offset + i);
       emit_move_insn (mem, reg);
     }
 
@@ -29428,8 +29438,8 @@ arm_block_set_aligned_vect (rtx dstbase,
   if (i + UNITS_PER_WORD == length)
     {
       addr = plus_constant (Pmode, dst, i - UNITS_PER_WORD);
-      mem = adjust_automodify_address (dstbase, mode,
-				       addr, i - UNITS_PER_WORD);
+      offset += i - UNITS_PER_WORD;
+      mem = adjust_automodify_address (dstbase, mode, addr, offset);
       /* We are shifting 4 bytes back, set the alignment accordingly.  */
       if (align > UNITS_PER_WORD)
	set_mem_align (mem, BITS_PER_UNIT * UNITS_PER_WORD);
@@ -29441,7 +29451,8 @@ arm_block_set_aligned_vect (rtx dstbase,
   else if (i < length)
     {
       emit_insn (gen_add2_insn (dst, GEN_INT (length - nelt_mode)));
-      mem = adjust_automodify_address (dstbase, mode, dst, 0);
+      offset += length - nelt_mode;
+      mem = adjust_automodify_address (dstbase, mode, dst, offset);
       /* We are shifting bytes back, set the alignment accordingly.  */
       if ((length & 1) == 0)
	set_mem_align (mem, BITS_PER_UNIT * 2);
diff --git a/gcc/testsuite/gcc.target/arm/pr68143_1.c b/gcc/testsuite/gcc.target/arm/pr68143_1.c
new file mode 100644
index 0000000..323473f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr68143_1.c
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O3 -mcpu=cortex-a57" } */
+/* { dg-add-options arm_neon } */
+
+#define NULL 0
+
+struct stuff
+{
+  int a;
+  int b;
+  int c;
+  int d;
+  int e;
+  char *f;
+  int g;
+};
+
+void __attribute__ ((noinline))
+bar (struct stuff *x)
+{
+  if (x->g != 2)
+    __builtin_abort ();
+}
+
+int
+main (int argc, char** argv)
+{
+  struct stuff x = {0, 0, 0, 0, 0, NULL, 0};
+  x.a = 100;
+  x.d = 100;
+  x.g = 2;
+  /* Struct should now look like {100, 0, 0, 100, 0, 0, 0, 2}.  */
+  bar (&x);
+  return 0;
+}