[AArch64] , patch] PR71727 fix -mstrict-align

Message ID CAKdteOZ0M9rxfPkhi49YHQr=0fcLWwFp-wA7spkJBOt7Z_pDSQ@mail.gmail.com
State New
Headers show

Commit Message

Christophe Lyon July 18, 2017, 12:50 p.m.
Hello,

I've received a complaint that GCC for AArch64 would generate
vectorized code relying on unaligned memory accesses even when using
-mstrict-align. This is a problem for code where such accesses lead to
memory faults.

A previous patch (r243333) introduced
aarch64_builtin_support_vector_misalignment, which rejects such
accesses when the element size is 64 bits, and accept them otherwise,
which I think it shouldn't. The testcase added at that time only used
64 bits elements, and therefore didn't fully test the patch.

The report I received is about vectorized accesses to an array of
unsigned chars, whose start address is not aligned on a 128 bits
boundary.

The attached patch fixes the problem by making
aarch64_builtin_support_vector_misalignment always return false when
the misalignment is not known at compile time.

I've also added a testcase, which tries to check if the array start
address alignment is checked (using %16, and-ing with #15), so that
loop peeling is performed *before* using vectorized accesses. Without
the patch, vectorized accesses are used at the beginning of the array,
and byte accesses are used for the remainder at the end, and there is
not such 'and wX,wX,15'.

BTW, I'm not sure about the same hook for arm... it seems to me it has
a similar problem.

OK?

Thanks,

Christophe
2017-07-18  Christophe Lyon  <christophe.lyon@linaro.org>

	PR target/71727
	gcc/
	* config/aarch64/aarch64.c
	(aarch64_builtin_support_vector_misalignment): Always return false
	when misalignment is unknown.

	gcc/testsuite/
	* gcc.target/aarch64/pr71727-2.c: New test.

Comments

Christophe Lyon Aug. 21, 2017, 1:04 p.m. | #1
ping ?
https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01063.html

Christophe


On 18 July 2017 at 14:50, Christophe Lyon <christophe.lyon@linaro.org> wrote:
> Hello,

>

> I've received a complaint that GCC for AArch64 would generate

> vectorized code relying on unaligned memory accesses even when using

> -mstrict-align. This is a problem for code where such accesses lead to

> memory faults.

>

> A previous patch (r243333) introduced

> aarch64_builtin_support_vector_misalignment, which rejects such

> accesses when the element size is 64 bits, and accept them otherwise,

> which I think it shouldn't. The testcase added at that time only used

> 64 bits elements, and therefore didn't fully test the patch.

>

> The report I received is about vectorized accesses to an array of

> unsigned chars, whose start address is not aligned on a 128 bits

> boundary.

>

> The attached patch fixes the problem by making

> aarch64_builtin_support_vector_misalignment always return false when

> the misalignment is not known at compile time.

>

> I've also added a testcase, which tries to check if the array start

> address alignment is checked (using %16, and-ing with #15), so that

> loop peeling is performed *before* using vectorized accesses. Without

> the patch, vectorized accesses are used at the beginning of the array,

> and byte accesses are used for the remainder at the end, and there is

> not such 'and wX,wX,15'.

>

> BTW, I'm not sure about the same hook for arm... it seems to me it has

> a similar problem.

>

> OK?

>

> Thanks,

>

> Christophe
Christophe Lyon Aug. 31, 2017, 9:22 a.m. | #2
ping^2 ?


On 21 August 2017 at 15:04, Christophe Lyon <christophe.lyon@linaro.org> wrote:
> ping ?

> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01063.html

>

> Christophe

>

>

> On 18 July 2017 at 14:50, Christophe Lyon <christophe.lyon@linaro.org> wrote:

>> Hello,

>>

>> I've received a complaint that GCC for AArch64 would generate

>> vectorized code relying on unaligned memory accesses even when using

>> -mstrict-align. This is a problem for code where such accesses lead to

>> memory faults.

>>

>> A previous patch (r243333) introduced

>> aarch64_builtin_support_vector_misalignment, which rejects such

>> accesses when the element size is 64 bits, and accept them otherwise,

>> which I think it shouldn't. The testcase added at that time only used

>> 64 bits elements, and therefore didn't fully test the patch.

>>

>> The report I received is about vectorized accesses to an array of

>> unsigned chars, whose start address is not aligned on a 128 bits

>> boundary.

>>

>> The attached patch fixes the problem by making

>> aarch64_builtin_support_vector_misalignment always return false when

>> the misalignment is not known at compile time.

>>

>> I've also added a testcase, which tries to check if the array start

>> address alignment is checked (using %16, and-ing with #15), so that

>> loop peeling is performed *before* using vectorized accesses. Without

>> the patch, vectorized accesses are used at the beginning of the array,

>> and byte accesses are used for the remainder at the end, and there is

>> not such 'and wX,wX,15'.

>>

>> BTW, I'm not sure about the same hook for arm... it seems to me it has

>> a similar problem.

>>

>> OK?

>>

>> Thanks,

>>

>> Christophe
Christophe Lyon Sept. 11, 2017, 8:29 a.m. | #3
ping^3 ?

https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01063.html

On 31 August 2017 at 11:22, Christophe Lyon <christophe.lyon@linaro.org> wrote:
> ping^2 ?

>

>

> On 21 August 2017 at 15:04, Christophe Lyon <christophe.lyon@linaro.org> wrote:

>> ping ?

>> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01063.html

>>

>> Christophe

>>

>>

>> On 18 July 2017 at 14:50, Christophe Lyon <christophe.lyon@linaro.org> wrote:

>>> Hello,

>>>

>>> I've received a complaint that GCC for AArch64 would generate

>>> vectorized code relying on unaligned memory accesses even when using

>>> -mstrict-align. This is a problem for code where such accesses lead to

>>> memory faults.

>>>

>>> A previous patch (r243333) introduced

>>> aarch64_builtin_support_vector_misalignment, which rejects such

>>> accesses when the element size is 64 bits, and accept them otherwise,

>>> which I think it shouldn't. The testcase added at that time only used

>>> 64 bits elements, and therefore didn't fully test the patch.

>>>

>>> The report I received is about vectorized accesses to an array of

>>> unsigned chars, whose start address is not aligned on a 128 bits

>>> boundary.

>>>

>>> The attached patch fixes the problem by making

>>> aarch64_builtin_support_vector_misalignment always return false when

>>> the misalignment is not known at compile time.

>>>

>>> I've also added a testcase, which tries to check if the array start

>>> address alignment is checked (using %16, and-ing with #15), so that

>>> loop peeling is performed *before* using vectorized accesses. Without

>>> the patch, vectorized accesses are used at the beginning of the array,

>>> and byte accesses are used for the remainder at the end, and there is

>>> not such 'and wX,wX,15'.

>>>

>>> BTW, I'm not sure about the same hook for arm... it seems to me it has

>>> a similar problem.

>>>

>>> OK?

>>>

>>> Thanks,

>>>

>>> Christophe
Andrew Pinski Sept. 11, 2017, 8:45 a.m. | #4
On Tue, Jul 18, 2017 at 5:50 AM, Christophe Lyon
<christophe.lyon@linaro.org> wrote:
> Hello,

>

> I've received a complaint that GCC for AArch64 would generate

> vectorized code relying on unaligned memory accesses even when using

> -mstrict-align. This is a problem for code where such accesses lead to

> memory faults.

>

> A previous patch (r243333) introduced

> aarch64_builtin_support_vector_misalignment, which rejects such

> accesses when the element size is 64 bits, and accept them otherwise,

> which I think it shouldn't. The testcase added at that time only used

> 64 bits elements, and therefore didn't fully test the patch.

>

> The report I received is about vectorized accesses to an array of

> unsigned chars, whose start address is not aligned on a 128 bits

> boundary.

>

> The attached patch fixes the problem by making

> aarch64_builtin_support_vector_misalignment always return false when

> the misalignment is not known at compile time.

>

> I've also added a testcase, which tries to check if the array start

> address alignment is checked (using %16, and-ing with #15), so that

> loop peeling is performed *before* using vectorized accesses. Without

> the patch, vectorized accesses are used at the beginning of the array,

> and byte accesses are used for the remainder at the end, and there is

> not such 'and wX,wX,15'.

>

> BTW, I'm not sure about the same hook for arm... it seems to me it has

> a similar problem.

>

> OK?


I would keep part of the comment:
-  /* Misalignment factor is unknown at compile time but we know
-     it's word aligned.  */

Something like:
/* Misalignment factor is unknown at compile time. */

Otherwise it is not obvious what -1 means.

Other than I think this patch is good (I cannot approve though).

Thanks,
Andrew

>

> Thanks,

>

> Christophe
Christophe Lyon Sept. 20, 2017, 1:17 p.m. | #5
Hi,

On 11 September 2017 at 10:45, Andrew Pinski <pinskia@gmail.com> wrote:
> On Tue, Jul 18, 2017 at 5:50 AM, Christophe Lyon

> <christophe.lyon@linaro.org> wrote:

>> Hello,

>>

>> I've received a complaint that GCC for AArch64 would generate

>> vectorized code relying on unaligned memory accesses even when using

>> -mstrict-align. This is a problem for code where such accesses lead to

>> memory faults.

>>

>> A previous patch (r243333) introduced

>> aarch64_builtin_support_vector_misalignment, which rejects such

>> accesses when the element size is 64 bits, and accept them otherwise,

>> which I think it shouldn't. The testcase added at that time only used

>> 64 bits elements, and therefore didn't fully test the patch.

>>

>> The report I received is about vectorized accesses to an array of

>> unsigned chars, whose start address is not aligned on a 128 bits

>> boundary.

>>

>> The attached patch fixes the problem by making

>> aarch64_builtin_support_vector_misalignment always return false when

>> the misalignment is not known at compile time.

>>

>> I've also added a testcase, which tries to check if the array start

>> address alignment is checked (using %16, and-ing with #15), so that

>> loop peeling is performed *before* using vectorized accesses. Without

>> the patch, vectorized accesses are used at the beginning of the array,

>> and byte accesses are used for the remainder at the end, and there is

>> not such 'and wX,wX,15'.

>>

>> BTW, I'm not sure about the same hook for arm... it seems to me it has

>> a similar problem.

>>

>> OK?

>

> I would keep part of the comment:

> -  /* Misalignment factor is unknown at compile time but we know

> -     it's word aligned.  */

>

> Something like:

> /* Misalignment factor is unknown at compile time. */

>

> Otherwise it is not obvious what -1 means.

>

> Other than I think this patch is good (I cannot approve though).

>


Here is an updated patch, with the comment added as Andrew suggested.

OK?

Thanks,

Christophe

> Thanks,

> Andrew

>

>>

>> Thanks,

>>

>> Christophe
2017-09-20  Christophe Lyon  <christophe.lyon@linaro.org>

	PR target/71727
	gcc/
	* config/aarch64/aarch64.c
	(aarch64_builtin_support_vector_misalignment): Always return false
	when misalignment is unknown.

	gcc/testsuite/
	* gcc.target/aarch64/pr71727-2.c: New test.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 799989a..7cc67ec 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11757,19 +11757,9 @@ aarch64_builtin_support_vector_misalignment (machine_mode mode,
       if (optab_handler (movmisalign_optab, mode) == CODE_FOR_nothing)
         return false;
 
+      /* Misalignment factor is unknown at compile time.  */
       if (misalignment == -1)
-	{
-	  /* Misalignment factor is unknown at compile time but we know
-	     it's word aligned.  */
-	  if (aarch64_simd_vector_alignment_reachable (type, is_packed))
-            {
-              int element_size = TREE_INT_CST_LOW (TYPE_SIZE (type));
-
-              if (element_size != 64)
-                return true;
-            }
-	  return false;
-	}
+	return false;
     }
   return default_builtin_support_vector_misalignment (mode, type, misalignment,
 						      is_packed);
diff --git a/gcc/testsuite/gcc.target/aarch64/pr71727-2.c b/gcc/testsuite/gcc.target/aarch64/pr71727-2.c
new file mode 100644
index 0000000..2bc803a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr71727-2.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mstrict-align -O3" } */
+
+unsigned char foo(const unsigned char *buffer, unsigned int length)
+{
+  unsigned char sum;
+  unsigned int  count;
+
+  for (sum = 0, count = 0; count < length; count++) {
+    sum = (unsigned char) (sum + *(buffer + count));
+  }
+
+  return sum;
+}
+
+/* { dg-final { scan-assembler-times "and\tw\[0-9\]+, w\[0-9\]+, 15" 1 } } */

Patch

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 799989a..12a9fbe 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11758,18 +11758,7 @@  aarch64_builtin_support_vector_misalignment (machine_mode mode,
         return false;
 
       if (misalignment == -1)
-	{
-	  /* Misalignment factor is unknown at compile time but we know
-	     it's word aligned.  */
-	  if (aarch64_simd_vector_alignment_reachable (type, is_packed))
-            {
-              int element_size = TREE_INT_CST_LOW (TYPE_SIZE (type));
-
-              if (element_size != 64)
-                return true;
-            }
-	  return false;
-	}
+	return false;
     }
   return default_builtin_support_vector_misalignment (mode, type, misalignment,
 						      is_packed);
diff --git a/gcc/testsuite/gcc.target/aarch64/pr71727-2.c b/gcc/testsuite/gcc.target/aarch64/pr71727-2.c
new file mode 100644
index 0000000..8935a72
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr71727-2.c
@@ -0,0 +1,16 @@ 
+/* { dg-do compile } */
+/* { dg-options "-mstrict-align -O3" } */
+
+unsigned char foo(const unsigned char *buffer, unsigned int length)
+{
+  unsigned char sum;
+  unsigned int  count;
+
+  for (sum = 0, count = 0; count < length; count++) {
+    sum = (unsigned char) (sum + *(buffer + count));
+  }
+
+  return sum;
+}
+
+/* { dg-final { scan-assembler-times "and\tw\[0-9\]+, w\[0-9\]+, 15" 1 } } */