| Submitter | Ira Rosen |
|---|---|
| Subject | [RFC] If-conversion of COMPONENT_REFs |
| Date | March 30, 2011, 9:13 a.m. |
| List thread | <AANLkTin7SZSErUguxdcu-upF+GRq+sUz0ooeqt7g53JX@mail.gmail.com> |
| Project | gcc-patches |
| State | Accepted |
| Last updated | June 29, 2011, 9:37 a.m. |
| Headers | show |
Comments
On 30 March 2011 12:59, Richard Guenther <richard.guenther@gmail.com> wrote: > On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen <ira.rosen@linaro.org> wrote: >> Hi, >> >> With this patch a data-ref is marked as unconditionally read or >> written also if its adjacent field is read or written unconditionally >> in the loop. >> My concern is that this is not safe enough, even though the fields >> have to be non-pointers and non-aggregates, and this optimization is >> applied only with -ftree-loop-if-convert-stores flag. >> >> Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux. >> >> OK for trunk? > > The restrictions do not make too much sense to me. For the C++ > memory model we can't do speculative stores at all, but for the > rest I'd say just checking if the data-refs access the same object > is enough, thus, instead of same_data_refs (a, b) simply check > operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0) > or operand_equal_p (get_base_address (DR_REF (a)), get_base_address > (DR_REF (b)), 0), whatever makes more sense (I always confuse what > the contents of the various DR fields are). But what about int a[10], b[100], c[100]; for (i = 0; i < 100; i++) { if (i < 10) t = a[i]; else t = b[i]; c[i] = t + a[1]; } We can't transform this to int a[10], b[100], c[100]; for (i = 0; i < 100; i++) { t1 = a[i]; t2 = b[i]; t = (i < 10) ? t1 : t2; c[i] = t + a[1]; } since we create accesses to a[10:100], right? Thanks, Ira > > Thanks, > Richard. > >> Thanks, >> Ira >> >> >> ChangeLog: >> >> * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return true >> if an adjacent field of the data-ref is accessed unconditionally. >> >> testsuite/ChangeLog: >> >> * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test. >> * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with >> -ftree-loop-if-convert-stores. >> >
On Wed, Mar 30, 2011 at 2:22 PM, Ira Rosen <ira.rosen@linaro.org> wrote: > On 30 March 2011 12:59, Richard Guenther <richard.guenther@gmail.com> wrote: >> On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen <ira.rosen@linaro.org> wrote: >>> Hi, >>> >>> With this patch a data-ref is marked as unconditionally read or >>> written also if its adjacent field is read or written unconditionally >>> in the loop. >>> My concern is that this is not safe enough, even though the fields >>> have to be non-pointers and non-aggregates, and this optimization is >>> applied only with -ftree-loop-if-convert-stores flag. >>> >>> Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux. >>> >>> OK for trunk? >> >> The restrictions do not make too much sense to me. For the C++ >> memory model we can't do speculative stores at all, but for the >> rest I'd say just checking if the data-refs access the same object >> is enough, thus, instead of same_data_refs (a, b) simply check >> operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0) >> or operand_equal_p (get_base_address (DR_REF (a)), get_base_address >> (DR_REF (b)), 0), whatever makes more sense (I always confuse what >> the contents of the various DR fields are). > > But what about > > int a[10], b[100], c[100]; > for (i = 0; i < 100; i++) > { > if (i < 10) > t = a[i]; > else > t = b[i]; > > c[i] = t + a[1]; > } > > We can't transform this to > > int a[10], b[100], c[100]; > for (i = 0; i < 100; i++) > { > t1 = a[i]; > t2 = b[i]; > t = (i < 10) ? t1 : t2; > c[i] = t + a[1]; > } > > since we create accesses to a[10:100], right? That's correct - we may only ever create "valid" accesses, but any valid access to a is ok after we see an unconditional access to a. So we probably have to restrict ourselves to DECL_P (get_base_address ()) objects and make sure we don't access past it. Thus, I see what you were trying to do - may I suggest sth like ref_base_a = DR_REF (a); while (TREE_CODE (ref_base_a) == COMPONENT_REF || TREE_CODE (ref_base_a) == IMAGPART_EXPR || TREE_CODE (ref_base_a) == REALPART_EXPR) ref_base_a = TREE_OPERAND (ref_base_a, 0); ... same for DR_REF (b) ... if (operand_equal_p (ref_base_a, ref_base_b, 0)) ok that is, strip all non-variable offset handled_component_refs and compare the remaining base of the two accesses. If they are equal we are ok. Any hole in that? Thanks, Richard. > Thanks, > Ira > >> >> Thanks, >> Richard. >> >>> Thanks, >>> Ira >>> >>> >>> ChangeLog: >>> >>> * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return true >>> if an adjacent field of the data-ref is accessed unconditionally. >>> >>> testsuite/ChangeLog: >>> >>> * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test. >>> * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with >>> -ftree-loop-if-convert-stores. >>> >> >
On 30 March 2011 14:41, Richard Guenther <richard.guenther@gmail.com> wrote: > On Wed, Mar 30, 2011 at 2:22 PM, Ira Rosen <ira.rosen@linaro.org> wrote: >> On 30 March 2011 12:59, Richard Guenther <richard.guenther@gmail.com> wrote: >>> On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen <ira.rosen@linaro.org> wrote: >>>> Hi, >>>> >>>> With this patch a data-ref is marked as unconditionally read or >>>> written also if its adjacent field is read or written unconditionally >>>> in the loop. >>>> My concern is that this is not safe enough, even though the fields >>>> have to be non-pointers and non-aggregates, and this optimization is >>>> applied only with -ftree-loop-if-convert-stores flag. >>>> >>>> Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux. >>>> >>>> OK for trunk? >>> >>> The restrictions do not make too much sense to me. For the C++ >>> memory model we can't do speculative stores at all, but for the >>> rest I'd say just checking if the data-refs access the same object >>> is enough, thus, instead of same_data_refs (a, b) simply check >>> operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0) >>> or operand_equal_p (get_base_address (DR_REF (a)), get_base_address >>> (DR_REF (b)), 0), whatever makes more sense (I always confuse what >>> the contents of the various DR fields are). >> >> But what about >> >> int a[10], b[100], c[100]; >> for (i = 0; i < 100; i++) >> { >> if (i < 10) >> t = a[i]; >> else >> t = b[i]; >> >> c[i] = t + a[1]; >> } >> >> We can't transform this to >> >> int a[10], b[100], c[100]; >> for (i = 0; i < 100; i++) >> { >> t1 = a[i]; >> t2 = b[i]; >> t = (i < 10) ? t1 : t2; >> c[i] = t + a[1]; >> } >> >> since we create accesses to a[10:100], right? > > That's correct - we may only ever create "valid" accesses, but any > valid access to a is ok after we see an unconditional access to a. > > So we probably have to restrict ourselves to DECL_P (get_base_address ()) > objects and make sure we don't access past it. > > Thus, I see what you were trying to do - may I suggest sth like > > ref_base_a = DR_REF (a); > while (TREE_CODE (ref_base_a) == COMPONENT_REF > || TREE_CODE (ref_base_a) == IMAGPART_EXPR > || TREE_CODE (ref_base_a) == REALPART_EXPR) > ref_base_a = TREE_OPERAND (ref_base_a, 0); > > ... same for DR_REF (b) ... > > if (operand_equal_p (ref_base_a, ref_base_b, 0)) > ok > > that is, strip all non-variable offset handled_component_refs and compare > the remaining base of the two accesses. If they are equal we are ok. > > Any hole in that? I don't see any :) I'll test your version. Thanks, Ira > > Thanks, > Richard. > >> Thanks, >> Ira >> >>> >>> Thanks, >>> Richard. >>> >>>> Thanks, >>>> Ira >>>> >>>> >>>> ChangeLog: >>>> >>>> * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return true >>>> if an adjacent field of the data-ref is accessed unconditionally. >>>> >>>> testsuite/ChangeLog: >>>> >>>> * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test. >>>> * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with >>>> -ftree-loop-if-convert-stores. >>>> >>> >> >
Patch
Index: testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c =================================================================== --- testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c (revision 0) +++ testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c (revision 0) @@ -0,0 +1,69 @@ +/* { dg-require-effective-target vect_int } */ + +#include <stdarg.h> +#include "tree-vect.h" + +#define N 50 + +typedef struct { + short a; + short b; +} data; + +data in1[N], in2[N], out[N]; +short result[N*2] = {10,-7,11,-6,12,-5,13,-4,14,-3,15,-2,16,-1,17,0,18,1,19,2,20,3,21,4,22,5,23,6,24,7,25,8,26,9,27,10,28,11,29,12,30,13,31,14,32,15,33,16,34,17,35,18,36,19,37,20,38,21,39,22,40,23,41,24,42,25,43,26,44,27,45,28,46,29,47,30,48,31,49,32,50,33,51,34,52,35,53,36,54,37,55,38,56,39,57,40,58,41,59,42}; +short out1[N], out2[N]; + +__attribute__ ((noinline)) void +foo () +{ + int i; + short c, d; + + for (i = 0; i < N; i++) + { + c = in1[i].b; + d = in2[i].b; + + if (c >= d) + { + out[i].b = in1[i].a; + out[i].a = d + 5; + } + else + { + out[i].b = d - 12; + out[i].a = in2[i].a + d; + } + } +} + +int +main (void) +{ + int i; + + check_vect (); + + for (i = 0; i < N; i++) + { + in1[i].a = i; + in1[i].b = i + 2; + in2[i].a = 5; + in2[i].b = i + 5; + __asm__ volatile (""); + } + + foo (); + + for (i = 0; i < N; i++) + { + if (out[i].a != result[2*i] || out[i].b != result[2*i+1]) + abort (); + } + + return 0; +} + +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || {! vect_strided } } } } } */ +/* { dg-final { cleanup-tree-dump "vect" } } */ Index: testsuite/gcc.dg/vect/vect.exp =================================================================== --- testsuite/gcc.dg/vect/vect.exp (revision 171716) +++ testsuite/gcc.dg/vect/vect.exp (working copy) @@ -210,6 +210,12 @@ lappend DEFAULT_VECTCFLAGS "--param" "gg dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/ggc-*.\[cS\]]] \ "" $DEFAULT_VECTCFLAGS +# -ftree-loop-if-convert-stores +set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS +lappend DEFAULT_VECTCFLAGS "-ftree-loop-if-convert-stores" +dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/if-cvt-stores-vect-*.\[cS\]]] \ + "" $DEFAULT_VECTCFLAGS + # With -O3. # Don't allow IPA cloning, because it throws our counts out of whack. set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS Index: tree-if-conv.c =================================================================== --- tree-if-conv.c (revision 171716) +++ tree-if-conv.c (working copy) @@ -461,11 +461,11 @@ struct ifc_dr { #define DR_WRITTEN_AT_LEAST_ONCE(DR) (IFC_DR (DR)->written_at_least_once) #define DR_RW_UNCONDITIONALLY(DR) (IFC_DR (DR)->rw_unconditionally) -/* Returns true when the memory references of STMT are read or written - unconditionally. In other words, this function returns true when - for every data reference A in STMT there exist other accesses to - the same data reference with predicates that add up (OR-up) to the - true predicate: this ensures that the data reference A is touched +/* Returns true when the memory references of STMT (or their adjacent fields) + are read or writtenunconditionally. In other words, this function + returns true when for every data reference A in STMT there exist other + accesses to the same data reference with predicates that add up (OR-up) to + the true predicate: this ensures that the data reference A is touched (read or written) on every iteration of the if-converted loop. */ static bool @@ -479,7 +479,7 @@ memrefs_read_or_written_unconditionally for (i = 0; VEC_iterate (data_reference_p, drs, i, a); i++) if (DR_STMT (a) == stmt) { - bool found = false; + bool found = false, same_comp_ref = false; int x = DR_RW_UNCONDITIONALLY (a); if (x == 0) @@ -489,22 +489,45 @@ memrefs_read_or_written_unconditionally continue; for (j = 0; VEC_iterate (data_reference_p, drs, j, b); j++) - if (DR_STMT (b) != stmt - && same_data_refs (a, b)) - { - tree cb = bb_predicate (gimple_bb (DR_STMT (b))); + { + if (DR_STMT (b) == stmt) + continue; - if (DR_RW_UNCONDITIONALLY (b) == 1 - || is_true_predicate (cb) - || is_true_predicate (ca = fold_or_predicates (EXPR_LOCATION (cb), - ca, cb))) - { - DR_RW_UNCONDITIONALLY (a) = 1; - DR_RW_UNCONDITIONALLY (b) = 1; - found = true; - break; - } - } + if (TREE_CODE (DR_REF (a)) == COMPONENT_REF + && !AGGREGATE_TYPE_P (TREE_TYPE (DR_REF (a))) + && !POINTER_TYPE_P (TREE_TYPE (DR_REF (a))) + && !AGGREGATE_TYPE_P (TREE_TYPE (DR_REF (b))) + && !POINTER_TYPE_P (TREE_TYPE (DR_REF (b))) + && operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0) + && dr_equal_offsets_p (a, b)) + { + HOST_WIDE_INT type_size_a, type_size_b, init_a, init_b; + + type_size_a = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (a)))); + type_size_b = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (b)))); + init_a = TREE_INT_CST_LOW (DR_INIT (a)); + init_b = TREE_INT_CST_LOW (DR_INIT (b)); + + if (init_a + type_size_a == init_b || init_b + type_size_b == init_a) + same_comp_ref = true; + } + + if (same_comp_ref || same_data_refs (a, b)) + { + tree cb = bb_predicate (gimple_bb (DR_STMT (b))); + + if (DR_RW_UNCONDITIONALLY (b) == 1 + || is_true_predicate (cb) + || is_true_predicate (ca = fold_or_predicates (EXPR_LOCATION (cb), + ca, cb))) + { + DR_RW_UNCONDITIONALLY (a) = 1; + DR_RW_UNCONDITIONALLY (b) = 1; + found = true; + break; + } + } + } if (!found) {

Hi, With this patch a data-ref is marked as unconditionally read or written also if its adjacent field is read or written unconditionally in the loop. My concern is that this is not safe enough, even though the fields have to be non-pointers and non-aggregates, and this optimization is applied only with -ftree-loop-if-convert-stores flag. Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux. OK for trunk? Thanks, Ira ChangeLog: * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return true if an adjacent field of the data-ref is accessed unconditionally. testsuite/ChangeLog: * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test. * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with -ftree-loop-if-convert-stores.