Message ID | 87k1wzm176.fsf@linaro.org |
---|---|
State | New |
Headers | show |
Series | Revert DECL_USER_ALIGN part of r241959 | expand |
On Wed, Jan 3, 2018 at 4:06 PM, Richard Sandiford <richard.sandiford@linaro.org> wrote: > r241959 included code to stop us increasing the alignment of a > "user-aligned" variable. This wasn't the main purpose of the patch, > and I think it was just there to make the testcase work. > > The documentation for the aligned attribute says: > > This attribute specifies a minimum alignment for the variable or > structure field, measured in bytes. > > The DECL_USER_ALIGN code seemed to be treating as a sort of maximum > instead, but there's not really such a thing as a maximum here: the > variable might still end up at the start of a section that has a higher > alignment, or might end up by chance on a "very aligned" boundary at > link or load time. > > I think people who add alignment attributes want to ensure that > accesses to that variable are fast, so it seems counter-intuitive > for it to make the access slower. The vect-align-4.c test is an > example of this: for targets with 128-bit vectors, we get better > code without the aligned attribute than we do with it. > > Tested on aarch64-linux-gnu so far, will test more widely if OK. Works for me - I think I did this to copy behavior of code elsewhere (pass_increase_alignment::increase_alignment). Richard. > Thanks, > Richard > > > 2018-01-03 Richard Sandiford <richard.sandiford@linaro.org> > > gcc/ > * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Don't > punt for user-aligned variables. > > gcc/testsuite/ > * gcc.dg/vect/vect-align-4.c: New test. > * gcc.dg/vect/vect-nb-iter-ub-2.c (cc): Remove alignment attribute > and redefine as a structure with an unaligned member "b". > (foo): Update accordingly. > > Index: gcc/tree-vect-data-refs.c > =================================================================== > --- gcc/tree-vect-data-refs.c 2018-01-03 15:03:14.301330558 +0000 > +++ gcc/tree-vect-data-refs.c 2018-01-03 15:03:14.454324422 +0000 > @@ -920,19 +920,6 @@ vect_compute_data_ref_alignment (struct > return true; > } > > - if (DECL_USER_ALIGN (base)) > - { > - if (dump_enabled_p ()) > - { > - dump_printf_loc (MSG_NOTE, vect_location, > - "not forcing alignment of user-aligned " > - "variable: "); > - dump_generic_expr (MSG_NOTE, TDF_SLIM, base); > - dump_printf (MSG_NOTE, "\n"); > - } > - return true; > - } > - > /* Force the alignment of the decl. > NOTE: This is the only change to the code we make during > the analysis phase, before deciding to vectorize the loop. */ > Index: gcc/testsuite/gcc.dg/vect/vect-align-4.c > =================================================================== > --- /dev/null 2018-01-03 08:32:43.873058927 +0000 > +++ gcc/testsuite/gcc.dg/vect/vect-align-4.c 2018-01-03 15:03:14.453324462 +0000 > @@ -0,0 +1,15 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target vect_int } */ > +/* { dg-add-options bind_pic_locally } */ > + > +__attribute__((aligned (8))) int a[2048] = {}; > + > +void > +f1 (void) > +{ > + for (int i = 0; i < 2048; i++) > + a[i]++; > +} > + > +/* { dg-final { scan-tree-dump-not "Vectorizing an unaligned access" "vect" } } */ > +/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" "vect" } } */ > Index: gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c > =================================================================== > --- gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c 2018-01-03 15:03:14.301330558 +0000 > +++ gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c 2018-01-03 15:03:14.454324422 +0000 > @@ -3,18 +3,19 @@ > #include "tree-vect.h" > > int ii[32]; > -char cc[66] __attribute__((aligned(1))) = > +struct { char a; char b[66]; } cc = { 0, > { 0, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0, > 10, 0, 11, 0, 12, 0, 13, 0, 14, 0, 15, 0, 16, 0, 17, 0, 18, 0, 19, 0, > 20, 0, 21, 0, 22, 0, 23, 0, 24, 0, 25, 0, 26, 0, 27, 0, 28, 0, 29, 0, > - 30, 0, 31, 0 }; > + 30, 0, 31, 0 } > +}; > > void __attribute__((noinline,noclone)) > foo (int s) > { > int i; > for (i = 0; i < s; i++) > - ii[i] = (int) cc[i*2]; > + ii[i] = (int) cc.b[i*2]; > } > > int main (int argc, const char **argv)
On Fri, Jan 5, 2018 at 9:47 AM, Richard Biener <richard.guenther@gmail.com> wrote: > On Wed, Jan 3, 2018 at 4:06 PM, Richard Sandiford > <richard.sandiford@linaro.org> wrote: >> r241959 included code to stop us increasing the alignment of a >> "user-aligned" variable. This wasn't the main purpose of the patch, >> and I think it was just there to make the testcase work. >> >> The documentation for the aligned attribute says: >> >> This attribute specifies a minimum alignment for the variable or >> structure field, measured in bytes. >> >> The DECL_USER_ALIGN code seemed to be treating as a sort of maximum >> instead, but there's not really such a thing as a maximum here: the >> variable might still end up at the start of a section that has a higher >> alignment, or might end up by chance on a "very aligned" boundary at >> link or load time. >> >> I think people who add alignment attributes want to ensure that >> accesses to that variable are fast, so it seems counter-intuitive >> for it to make the access slower. The vect-align-4.c test is an >> example of this: for targets with 128-bit vectors, we get better >> code without the aligned attribute than we do with it. >> >> Tested on aarch64-linux-gnu so far, will test more widely if OK. > > Works for me - I think I did this to copy behavior of code elsewhere > (pass_increase_alignment::increase_alignment). Note I had the impression that using the aligned attribute with a "low" alignment is a hint to the compiler to not increase .data by overaligning variables. I think there were PRs asking for that. Richard. > Richard. > >> Thanks, >> Richard >> >> >> 2018-01-03 Richard Sandiford <richard.sandiford@linaro.org> >> >> gcc/ >> * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Don't >> punt for user-aligned variables. >> >> gcc/testsuite/ >> * gcc.dg/vect/vect-align-4.c: New test. >> * gcc.dg/vect/vect-nb-iter-ub-2.c (cc): Remove alignment attribute >> and redefine as a structure with an unaligned member "b". >> (foo): Update accordingly. >> >> Index: gcc/tree-vect-data-refs.c >> =================================================================== >> --- gcc/tree-vect-data-refs.c 2018-01-03 15:03:14.301330558 +0000 >> +++ gcc/tree-vect-data-refs.c 2018-01-03 15:03:14.454324422 +0000 >> @@ -920,19 +920,6 @@ vect_compute_data_ref_alignment (struct >> return true; >> } >> >> - if (DECL_USER_ALIGN (base)) >> - { >> - if (dump_enabled_p ()) >> - { >> - dump_printf_loc (MSG_NOTE, vect_location, >> - "not forcing alignment of user-aligned " >> - "variable: "); >> - dump_generic_expr (MSG_NOTE, TDF_SLIM, base); >> - dump_printf (MSG_NOTE, "\n"); >> - } >> - return true; >> - } >> - >> /* Force the alignment of the decl. >> NOTE: This is the only change to the code we make during >> the analysis phase, before deciding to vectorize the loop. */ >> Index: gcc/testsuite/gcc.dg/vect/vect-align-4.c >> =================================================================== >> --- /dev/null 2018-01-03 08:32:43.873058927 +0000 >> +++ gcc/testsuite/gcc.dg/vect/vect-align-4.c 2018-01-03 15:03:14.453324462 +0000 >> @@ -0,0 +1,15 @@ >> +/* { dg-do compile } */ >> +/* { dg-require-effective-target vect_int } */ >> +/* { dg-add-options bind_pic_locally } */ >> + >> +__attribute__((aligned (8))) int a[2048] = {}; >> + >> +void >> +f1 (void) >> +{ >> + for (int i = 0; i < 2048; i++) >> + a[i]++; >> +} >> + >> +/* { dg-final { scan-tree-dump-not "Vectorizing an unaligned access" "vect" } } */ >> +/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" "vect" } } */ >> Index: gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c >> =================================================================== >> --- gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c 2018-01-03 15:03:14.301330558 +0000 >> +++ gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c 2018-01-03 15:03:14.454324422 +0000 >> @@ -3,18 +3,19 @@ >> #include "tree-vect.h" >> >> int ii[32]; >> -char cc[66] __attribute__((aligned(1))) = >> +struct { char a; char b[66]; } cc = { 0, >> { 0, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0, >> 10, 0, 11, 0, 12, 0, 13, 0, 14, 0, 15, 0, 16, 0, 17, 0, 18, 0, 19, 0, >> 20, 0, 21, 0, 22, 0, 23, 0, 24, 0, 25, 0, 26, 0, 27, 0, 28, 0, 29, 0, >> - 30, 0, 31, 0 }; >> + 30, 0, 31, 0 } >> +}; >> >> void __attribute__((noinline,noclone)) >> foo (int s) >> { >> int i; >> for (i = 0; i < s; i++) >> - ii[i] = (int) cc[i*2]; >> + ii[i] = (int) cc.b[i*2]; >> } >> >> int main (int argc, const char **argv)
Richard Biener <richard.guenther@gmail.com> writes: > On Fri, Jan 5, 2018 at 9:47 AM, Richard Biener > <richard.guenther@gmail.com> wrote: >> On Wed, Jan 3, 2018 at 4:06 PM, Richard Sandiford >> <richard.sandiford@linaro.org> wrote: >>> r241959 included code to stop us increasing the alignment of a >>> "user-aligned" variable. This wasn't the main purpose of the patch, >>> and I think it was just there to make the testcase work. >>> >>> The documentation for the aligned attribute says: >>> >>> This attribute specifies a minimum alignment for the variable or >>> structure field, measured in bytes. >>> >>> The DECL_USER_ALIGN code seemed to be treating as a sort of maximum >>> instead, but there's not really such a thing as a maximum here: the >>> variable might still end up at the start of a section that has a higher >>> alignment, or might end up by chance on a "very aligned" boundary at >>> link or load time. >>> >>> I think people who add alignment attributes want to ensure that >>> accesses to that variable are fast, so it seems counter-intuitive >>> for it to make the access slower. The vect-align-4.c test is an >>> example of this: for targets with 128-bit vectors, we get better >>> code without the aligned attribute than we do with it. >>> >>> Tested on aarch64-linux-gnu so far, will test more widely if OK. >> >> Works for me - I think I did this to copy behavior of code elsewhere >> (pass_increase_alignment::increase_alignment). OK. I think the difference is that there we're increasing the alignment of variables speculatively, without knowing whether it's going to help vectorisation or not, whereas here we're more sure. That's actually the reason I care (forgot to say). increase_alignment shouldn't start giving things 256-byte alignment just because we're running the vectoriser on a target with 256-byte vectors. We need some kind of proof that it's going to help vectorise a particular access. Another reason is to support vectorising search loops, where the bound isn't known in advance. (That's queued for GCC 9.) If we have a loop that operates on both a wide and a narrow type, the wide type needs N vectors for each vector of the narrow type. We then need to align to N times the vector size to ensure that the accesses don't cross a page boundary. Part of the problem is that we don't distinguish true user alignments from ones that have been set by GCC itself. E.g. if increase_alignment does set an alignment, it records that as a "user" alignment, which stops the vectoriser from increasing it further. But since the docs say that the value's only a minimum, the current behaviour still feels wrong for user attributes. > Note I had the impression that using the aligned attribute with a "low" > alignment is a hint to the compiler to not increase .data by overaligning > variables. I think there were PRs asking for that. That's overloading the attribute so that sometimes it's a hint to optimise accesses and sometimes it's a hint not to. Maybe we should support aligned+packed for variables, if people really want a fixed alignment? (Although that's only going to work reliably for variables that go in their own section.) E.g. the current behaviour triggers even for __attribute__((aligned)), which doesn't specify a particular value. Is the patch OK as a compromise for GCC 8? We don't speculatively increase the user alignment in increase_alignment, but do still increase it if it helps to vectorise a particular loop access? Thanks, Richard
On Fri, Jan 05, 2018 at 09:49:56AM +0000, Richard Sandiford wrote: > Is the patch OK as a compromise for GCC 8? We don't speculatively > increase the user alignment in increase_alignment, but do still increase > it if it helps to vectorise a particular loop access? I'd be a little bit worried about code that puts some variables into user sections with specific alignment, i.e. __attribute__((section ("whatever"), aligned(N))) where data is collected from different TUs into the user section and any padding added there breaks this. E.g. Linux kernel and other programs use this technique heavily. Jakub
Jakub Jelinek <jakub@redhat.com> writes: > On Fri, Jan 05, 2018 at 09:49:56AM +0000, Richard Sandiford wrote: >> Is the patch OK as a compromise for GCC 8? We don't speculatively >> increase the user alignment in increase_alignment, but do still increase >> it if it helps to vectorise a particular loop access? > > I'd be a little bit worried about code that puts some variables into user > sections with specific alignment, i.e. > __attribute__((section ("whatever"), aligned(N))) > where data is collected from different TUs into the user section and > any padding added there breaks this. E.g. Linux kernel and other programs > use this technique heavily. Looking again, it seems we already prevent increasing alignment for the "used" attribute (with or without "aligned"). Is that good enough? That kind of construct is used without "aligned" too, and I think it should have "used" to stop it being removed as dead. Thanks, Richard
On Fri, Jan 05, 2018 at 10:25:35AM +0000, Richard Sandiford wrote: > Jakub Jelinek <jakub@redhat.com> writes: > > On Fri, Jan 05, 2018 at 09:49:56AM +0000, Richard Sandiford wrote: > >> Is the patch OK as a compromise for GCC 8? We don't speculatively > >> increase the user alignment in increase_alignment, but do still increase > >> it if it helps to vectorise a particular loop access? > > > > I'd be a little bit worried about code that puts some variables into user > > sections with specific alignment, i.e. > > __attribute__((section ("whatever"), aligned(N))) > > where data is collected from different TUs into the user section and > > any padding added there breaks this. E.g. Linux kernel and other programs > > use this technique heavily. > > Looking again, it seems we already prevent increasing alignment for > the "used" attribute (with or without "aligned"). Is that good enough? > > That kind of construct is used without "aligned" too, and I think it > should have "used" to stop it being removed as dead. Yeah, that should be likely good enough. Jakub
Index: gcc/tree-vect-data-refs.c =================================================================== --- gcc/tree-vect-data-refs.c 2018-01-03 15:03:14.301330558 +0000 +++ gcc/tree-vect-data-refs.c 2018-01-03 15:03:14.454324422 +0000 @@ -920,19 +920,6 @@ vect_compute_data_ref_alignment (struct return true; } - if (DECL_USER_ALIGN (base)) - { - if (dump_enabled_p ()) - { - dump_printf_loc (MSG_NOTE, vect_location, - "not forcing alignment of user-aligned " - "variable: "); - dump_generic_expr (MSG_NOTE, TDF_SLIM, base); - dump_printf (MSG_NOTE, "\n"); - } - return true; - } - /* Force the alignment of the decl. NOTE: This is the only change to the code we make during the analysis phase, before deciding to vectorize the loop. */ Index: gcc/testsuite/gcc.dg/vect/vect-align-4.c =================================================================== --- /dev/null 2018-01-03 08:32:43.873058927 +0000 +++ gcc/testsuite/gcc.dg/vect/vect-align-4.c 2018-01-03 15:03:14.453324462 +0000 @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target vect_int } */ +/* { dg-add-options bind_pic_locally } */ + +__attribute__((aligned (8))) int a[2048] = {}; + +void +f1 (void) +{ + for (int i = 0; i < 2048; i++) + a[i]++; +} + +/* { dg-final { scan-tree-dump-not "Vectorizing an unaligned access" "vect" } } */ +/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" "vect" } } */ Index: gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c =================================================================== --- gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c 2018-01-03 15:03:14.301330558 +0000 +++ gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c 2018-01-03 15:03:14.454324422 +0000 @@ -3,18 +3,19 @@ #include "tree-vect.h" int ii[32]; -char cc[66] __attribute__((aligned(1))) = +struct { char a; char b[66]; } cc = { 0, { 0, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0, 10, 0, 11, 0, 12, 0, 13, 0, 14, 0, 15, 0, 16, 0, 17, 0, 18, 0, 19, 0, 20, 0, 21, 0, 22, 0, 23, 0, 24, 0, 25, 0, 26, 0, 27, 0, 28, 0, 29, 0, - 30, 0, 31, 0 }; + 30, 0, 31, 0 } +}; void __attribute__((noinline,noclone)) foo (int s) { int i; for (i = 0; i < s; i++) - ii[i] = (int) cc[i*2]; + ii[i] = (int) cc.b[i*2]; } int main (int argc, const char **argv)