Message ID | BANLkTi=a=dUCO_3LyEcP1HufCyOE-r0CsA@mail.gmail.com |
---|---|
State | New |
Headers | show |
Revital Eres <revital.eres@linaro.org> wrote on 13/06/2011 10:29:06 AM: > From: Revital Eres <revital.eres@linaro.org> > To: Ayal Zaks/Haifa/IBM@IBMIL > Cc: gcc-patches@gcc.gnu.org, Patch Tracking <patches@linaro.org> > Date: 13/06/2011 10:29 AM > Subject: [PATCH, SMS] Fix violation of memory dependence > > Hello, > > The attached patch fixes violation of memory dependencies. The > problematic scenario happens when -fmodulo-sched-allow-regmoves flag > is set and certain anti-dep edges are not created. > > For example, consider the following three instructions and the edges > between them. When -fmodulo-sched-allow-regmoves is set the edge (63 - > Anti, 0 -> 64) is not created. (probably due to transitivity) > > Insn 63) r168 = MEM[176] > Out edges: (63 - Anti, 0 -> 64) > In edges: (64 - True, 1 -> 63), (68 - True, 1 -> 63) > > insn 64) 176 = 176 + 4 > Out edges: (64 - True, 1 -> 63), (64 - True, 0-> 68) > In edges: (63 - Anti, 0 -> 64) > > insn 68) MEM[176 – 4] = 193 > Out edges: (68 - True, 1 -> 63) > In edges: (64 - True, 0-> 68) > > This anti-dep edge is on the path from one memory instruction to another > --- from 63 to 68; such that removing the edge caused a violation of > the memory dependencies as insn 63 was scheduled after insn 68. > > This patch adds intra edges between every two memory instructions in > this case. It fixes recent bootstrap failure on ARM. (with SMS flags) > > The patch was tested as follows: > On ppc64-redhat-linux regtest as well as bootstrap with SMS flags > enabling SMS also on loops with stage count 1. Regtested on SPU. > On arm-linux-gnueabi bootstrap c language with SMS > flags enabling SMS also on loops with stage count 1 > and currently regression testing on c,c++. > > OK for mainline once regtest on arm-linux-gnueabi completes? > Yes, this is a straightforward fix to a wrong-code bug, as discussed offline. Other alternatives that might introduce less edges: o connect predecessors of u with v, and u with successors of v, when removing edge (u,v). Maybe there are other cases which rely on transitivity (?). o have a version of sched_analyze that avoids creating register anti-deps to begin with, and thus will create memory-deps in the absence of transitivity. >> * ddg.c (add_intra_loop_mem_dep): New function. You could check first thing if (from->cuid == to->cuid), for code clarity. Nice catch, Ayal. > Thanks, > Revital > > Changelog: > > gcc/ > * ddg.c (add_intra_loop_mem_dep): New function. > (build_intra_loop_deps): Call it. > > testsuite/ > * gcc.dg/sms-9.c: New file. > [attachment "patch_fix_regmoves_12_6.txt" deleted by Ayal Zaks/Haifa/IBM]
Hello, > Yes, this is a straightforward fix to a wrong-code bug, as discussed > offline. Other alternatives that might introduce less edges: > o connect predecessors of u with v, and u with successors of v, when > removing edge (u,v). Maybe there are other cases which rely on transitivity > (?). Right. as discussed off-line I will further think if we are currently cover all the cases. > >>> * ddg.c (add_intra_loop_mem_dep): New function. > > You could check first thing if (from->cuid == to->cuid), for code clarity. I will address this point separately and commit the current version of the patch as is if that's OK. Thanks, Revital
Index: ddg.c =================================================================== --- ddg.c (revision 174906) +++ ddg.c (working copy) @@ -390,6 +390,36 @@ insns_may_alias_p (rtx insn1, rtx insn2) &PATTERN (insn2)); } +/* Given two nodes, analyze their RTL insns and add intra-loop mem deps + to ddg G. */ +static void +add_intra_loop_mem_dep (ddg_ptr g, ddg_node_ptr from, ddg_node_ptr to) +{ + + if (!insns_may_alias_p (from->insn, to->insn)) + /* Do not create edge if memory references have disjoint alias sets. */ + return; + + if (mem_write_insn_p (from->insn)) + { + if (mem_read_insn_p (to->insn)) + create_ddg_dep_no_link (g, from, to, + DEBUG_INSN_P (to->insn) + ? ANTI_DEP : TRUE_DEP, MEM_DEP, 0); + else if (from->cuid != to->cuid) + create_ddg_dep_no_link (g, from, to, + DEBUG_INSN_P (to->insn) + ? ANTI_DEP : OUTPUT_DEP, MEM_DEP, 0); + } + else + { + if (mem_read_insn_p (to->insn)) + return; + else if (from->cuid != to->cuid) + create_ddg_dep_no_link (g, from, to, ANTI_DEP, MEM_DEP, 0); + } +} + /* Given two nodes, analyze their RTL insns and add inter-loop mem deps to ddg G. */ static void @@ -477,10 +507,22 @@ build_intra_loop_deps (ddg_ptr g) if (DEBUG_INSN_P (j_node->insn)) continue; if (mem_access_insn_p (j_node->insn)) - /* Don't bother calculating inter-loop dep if an intra-loop dep - already exists. */ + { + /* Don't bother calculating inter-loop dep if an intra-loop dep + already exists. */ if (! TEST_BIT (dest_node->successors, j)) add_inter_loop_mem_dep (g, dest_node, j_node); + /* If -fmodulo-sched-allow-regmoves + is set certain anti-dep edges are not created. + It might be that these anti-dep edges are on the + path from one memory instruction to another such that + removing these edges could cause a violation of the + memory dependencies. Thus we add intra edges between + every two memory instructions in this case. */ + if (flag_modulo_sched_allow_regmoves + && !TEST_BIT (dest_node->predecessors, j)) + add_intra_loop_mem_dep (g, j_node, dest_node); + } } } } Index: testsuite/gcc.dg/sms-9.c =================================================================== --- testsuite/gcc.dg/sms-9.c (revision 0) +++ testsuite/gcc.dg/sms-9.c (revision 0) @@ -0,0 +1,60 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -fmodulo-sched -fno-auto-inc-dec -O2 -fmodulo-sched-allow-regmoves" } */ + +#include <stdlib.h> +#include <stdarg.h> + +struct df_ref_info +{ + unsigned int *begin; + unsigned int *count; +}; + +extern void *memset (void *s, int c, __SIZE_TYPE__ n); + + +__attribute__ ((noinline)) +int +df_reorganize_refs_by_reg_by_insn (struct df_ref_info *ref_info, + int num, unsigned int start) +{ + unsigned int m = num; + unsigned int offset = 77; + unsigned int r; + + for (r = start; r < m; r++) + { + ref_info->begin[r] = offset; + offset += ref_info->count[r]; + ref_info->count[r] = 0; + } + + return offset; +} + +int +main () +{ + struct df_ref_info temp; + int num = 100; + unsigned int start = 5; + int i, offset; + + temp.begin = malloc (100 * sizeof (unsigned int)); + temp.count = malloc (100 * sizeof (unsigned int)); + + memset (temp.begin, 0, sizeof (unsigned int) * num); + memset (temp.count, 0, sizeof (unsigned int) * num); + + for (i = 0; i < num; i++) + temp.count[i] = i + 1; + + offset = df_reorganize_refs_by_reg_by_insn (&temp, num, start); + + if (offset != 5112) + abort (); + + free (temp.begin); + free (temp.count); + return 0; +}