
RFC [1/3] divmod transform v2

Message ID CAAgBjMm9QVnm8V6vyc2m3K+aSbSPiciAbr8BDMbaAGLLUBvp6A@mail.gmail.com
State New

Commit Message

Prathamesh Kulkarni Oct. 26, 2016, 10:17 a.m. UTC
On 25 October 2016 at 18:47, Richard Biener <rguenther@suse.de> wrote:
> On Tue, 25 Oct 2016, Prathamesh Kulkarni wrote:

>

>> On 25 October 2016 at 16:17, Richard Biener <rguenther@suse.de> wrote:

>> > On Tue, 25 Oct 2016, Prathamesh Kulkarni wrote:

>> >

>> >> On 25 October 2016 at 13:43, Richard Biener <richard.guenther@gmail.com> wrote:

>> >> > On Sun, Oct 16, 2016 at 7:59 AM, Prathamesh Kulkarni

>> >> > <prathamesh.kulkarni@linaro.org> wrote:

>> >> >> Hi,

>> >> >> After approval from Bernd Schmidt, I committed the patch to remove

>> >> >> optab functions for

>> >> >> sdivmod_optab and udivmod_optab in optabs.def, which removes the block

>> >> >> for divmod patch.

>> >> >>

>> >> >> This patch is mostly the same as previous one, except it drops

>> >> >> targeting __udivmoddi4() because

>> >> >> it gave undefined reference link error for calling __udivmoddi4() on

>> >> >> aarch64-linux-gnu.

>> >> >> It appears aarch64 has hardware insn for DImode div, so __udivmoddi4()

>> >> >> isn't needed for the target

>> >> >> (it was a bug in my patch that called __udivmoddi4() even though

>> >> >> aarch64 supported hardware div).

>> >> >>

>> >> >> However this makes me wonder if it's guaranteed that __udivmoddi4()

>> >> >> will be available for a target if it doesn't have hardware div and

>> >> >> divmod insn and doesn't have target-specific libfunc for

>> >> >> DImode divmod ? To be conservative, the attached patch doesn't

>> >> >> generate call to __udivmoddi4.

>> >> >>

>> >> >> Passes bootstrap+test on x86_64-unknown-linux.

>> >> >> Cross-tested on arm*-*-*, aarch64*-*-*.

>> >> >> Verified that there are no regressions with SPEC2006 on

>> >> >> x86_64-unknown-linux-gnu.

>> >> >> OK to commit ?

>> >> >

>> >> > I think the searching is still somewhat wrong - it's been some time

>> >> > since my last look at the

>> >> > patch so maybe I've said this already.  Please bail out early for

>> >> > stmt_can_throw_internal (stmt),

>> >> > otherwise the top stmt search might end up not working.  So

>> >> >

>> >> > +

>> >> > +  if (top_stmt == stmt && stmt_can_throw_internal (top_stmt))

>> >> > +    return false;

>> >> >

>> >> > can go.

>> >> >

>> >> > top_stmt may end up as a TRUNC_DIV_EXPR so it's pointless to only look

>> >> > for another

>> >> > TRUNC_DIV_EXPR later ... you may end up without a single TRUNC_MOD_EXPR.

>> >> > Which means you want a div_seen and a mod_seen, or simply record the top_stmt

>> >> > code and look for the opposite in the 2nd loop.

>> >> Um sorry I don't quite understand how we could end up without a trunc_mod stmt ?

>> >> The 2nd loop adds both trunc_div and trunc_mod to stmts vector, and

>> >> checks if we have

>> >> come across at least a single trunc_div stmt (and we bail out if no

>> >> div is seen).

>> >>

>> >> At 2nd loop I suppose we don't need mod_seen, because stmt is

>> >> guaranteed to be trunc_mod_expr.

>> >> In the 2nd loop the following condition will never trigger for stmt:

>> >>   if (stmt_can_throw_internal (use_stmt))

>> >>             continue;

>> >> since we checked beforehand if stmt could throw and chose to bail out

>> >> in that case.

>> >>

>> >> and the following condition would also not trigger for stmt:

>> >> if (!dominated_by_p (CDI_DOMINATORS, gimple_bb (use_stmt), top_bb))

>> >>   {

>> >>     end_imm_use_stmt_traverse (&use_iter);

>> >>     return false;

>> >>   }

>> >> since gimple_bb (stmt) is always dominated by gimple_bb (top_stmt).

>> >>

>> >> The case where top_stmt == stmt, we wouldn't reach the above

>> >> condition, since we have above it:

>> >> if (top_stmt == stmt)

>> >>   continue;

>> >>

>> >> So IIUC, top_stmt and stmt would always get added to stmts vector.

>> >> Am I missing something ?

>> >

>> > Ah, indeed.  Maybe add a comment then, it wasn't really obvious ;)

>> >

>> > Please still move the stmt_can_throw_internal (stmt) check up.

>> Sure, I will move that up and do the other suggested changes.

>>

>> I was wondering if this condition in 2nd loop is too restrictive ?

>> if (!dominated_by_p (CDI_DOMINATORS, gimple_bb (use_stmt), top_bb))

>>   {

>>     end_imm_use_stmt_traverse (&use_iter);

>>     return false;

>>   }

>>

>> Should we rather "continue" in this case by not adding use_stmt to

>> stmts vector rather than dropping

>> the transform altogether if gimple_bb (use_stmt) is not dominated by

>> gimple_bb (top_stmt) ?

>

> Ah, yes - didn't spot that.

Hi,
Is this version OK ?

Thanks,
Prathamesh
>

> Richard.

>

>>

>> For instance if we have a test-case like:

>>

>> if (cond)

>> {

>>   t1 = x / y;

>>   t2 = x % y;

>> }

>> else

>>   t3 = x % y;

>>

>> and suppose stmt is "t2 = x % y", we would set top_stmt to "t1 = x / y";

>> In this case we would still want to do divmod transform in THEN block

>> even though "t3 = x % y" is not dominated by top_stmt ?

>>

>> if (cond)

>> {

>>   divmod_tmp = DIVMOD (x, y);

>>   t1 = REALPART_EXPR (divmod_tmp);

>>   t2 = IMAGPART_EXPR (divmod_tmp);

>> }

>> else

>>   t3 = x % y;

>>

>> We will always ensure that all the trunc_div, trunc_mod statements in

>> stmts vector will be dominated by top_stmt,

>> but I suppose they need not constitute all the trunc_div, trunc_mod

>> statements in the function.

>>

>> Thanks,

>> Prathamesh

>> >

>> > Thanks,

>> > Richard.

>> >

>> >> Thanks,

>> >> Prathamesh

>> >> >

>> >> > +      switch (gimple_assign_rhs_code (use_stmt))

>> >> > +       {

>> >> > +         case TRUNC_DIV_EXPR:

>> >> > +           new_rhs = fold_build1 (REALPART_EXPR, TREE_TYPE (op1), res);

>> >> > +           break;

>> >> > +

>> >> > +         case TRUNC_MOD_EXPR:

>> >> > +           new_rhs = fold_build1 (IMAGPART_EXPR, TREE_TYPE (op2), res);

>> >> > +           break;

>> >> > +

>> >> >

>> >> > why type of op1 and type of op2 in the other case?  Choose one for consistency.

>> >> >

>> >> > +      if (maybe_clean_or_replace_eh_stmt (use_stmt, use_stmt))

>> >> > +       cfg_changed = true;

>> >> >

>> >> > as you are rejecting all internally throwing stmts this shouldn't be necessary.

>> >> >

>> >> > The patch is ok with those changes.

>> >> >

>> >> > Thanks,

>> >> > Richard.

>> >> >

>> >> >

>> >> >> Thanks,

>> >> >> Prathamesh

>> >>

>> >>

>> >

>> > --

>> > Richard Biener <rguenther@suse.de>

>> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

>>

>>

>

> --

> Richard Biener <rguenther@suse.de>

> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
2016-10-26  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>
	    Kugan Vivekanandarajah  <kuganv@linaro.org>
	    Jim Wilson  <jim.wilson@linaro.org>

	    * target.def: New hook expand_divmod_libfunc.
	    * doc/tm.texi.in: Add hook for TARGET_EXPAND_DIVMOD_LIBFUNC.
	    * doc/tm.texi: Regenerate.
	    * internal-fn.def: Add new entry for DIVMOD ifn.
	    * internal-fn.c (expand_DIVMOD): New.
	    * tree-ssa-math-opts.c: Include optabs-libfuncs.h, tree-eh.h,
	    targhooks.h.
	    (widen_mul_stats): Add new field divmod_calls_inserted.
	    (target_supports_divmod_p): New.
	    (divmod_candidate_p): Likewise.
	    (convert_to_divmod): Likewise.
	    (pass_optimize_widening_mul::execute): Call
	    calculate_dominance_info(), renumber_gimple_stmt_uids() at
	    beginning of function. Call convert_to_divmod()
	    and record stats for divmod.
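
For readers skimming the ChangeLog, the source-level shape that convert_to_divmod() looks for can be illustrated in plain C. This is only a sketch and not part of the patch: split_separate and split_combined are hypothetical names, and the C library's div() merely stands in for the internal DIVMOD function, which is not callable from user code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: quotient and remainder are computed by two
   separate statements with the same operands.  This is the shape the
   transform targets.  */
static void
split_separate (int a, int b, int *quot, int *rem)
{
  assert (b != 0);
  *quot = a / b;	/* TRUNC_DIV_EXPR */
  *rem = a % b;		/* TRUNC_MOD_EXPR */
}

/* Hypothetical helper: a single combined operation whose two results
   are extracted afterwards, loosely analogous to a DIVMOD call followed
   by REALPART_EXPR (quotient) and IMAGPART_EXPR (remainder).  */
static void
split_combined (int a, int b, int *quot, int *rem)
{
  assert (b != 0);
  div_t r = div (a, b);
  *quot = r.quot;
  *rem = r.rem;
}
```

Both helpers return the same truncating quotient and remainder; whether the compiler actually merges the two statements of the first form depends on the conditions in divmod_candidate_p().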

Comments

Richard Biener Oct. 26, 2016, 10:47 a.m. UTC | #1
On Wed, 26 Oct 2016, Prathamesh Kulkarni wrote:

> Hi,

> Is this version OK ?


Yes.

Thanks,
Richard.
Prathamesh Kulkarni Oct. 28, 2016, 7:10 p.m. UTC | #2
On 26 October 2016 at 16:17, Richard Biener <rguenther@suse.de> wrote:
> On Wed, 26 Oct 2016, Prathamesh Kulkarni wrote:

>> Hi,

>> Is this version OK ?

>

> Yes.

Committed as r241660.
Thanks a lot!

Regards,
Prathamesh
>

> Thanks,

> Richard.

Patch

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index cffcfe9..d2bcdca 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -7096,6 +7096,11 @@  This is firstly introduced on ARM/AArch64 targets, please refer to
 the hook implementation for how different fusion types are supported.
 @end deftypefn
 
+@deftypefn {Target Hook} void TARGET_EXPAND_DIVMOD_LIBFUNC (rtx @var{libfunc}, machine_mode @var{mode}, rtx @var{op0}, rtx @var{op1}, rtx *@var{quot}, rtx *@var{rem})
+Define this hook for enabling divmod transform if the port does not have
+hardware divmod insn but defines target-specific divmod libfuncs.
+@end deftypefn
+
 @node Sections
 @section Dividing the Output into Sections (Texts, Data, @dots{})
 @c the above section title is WAY too long.  maybe cut the part between
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index d2dd45f..3399465 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4892,6 +4892,8 @@  them: try the first ones in this list first.
 
 @hook TARGET_SCHED_FUSION_PRIORITY
 
+@hook TARGET_EXPAND_DIVMOD_LIBFUNC
+
 @node Sections
 @section Dividing the Output into Sections (Texts, Data, @dots{})
 @c the above section title is WAY too long.  maybe cut the part between
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 4477697..022a97f 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2220,6 +2220,53 @@  expand_LAUNDER (internal_fn, gcall *call)
   expand_assignment (lhs, gimple_call_arg (call, 0), false);
 }
 
+/* Expand DIVMOD() using:
+ a) optab handler for udivmod/sdivmod if it is available.
+ b) If optab_handler doesn't exist, generate call to
+    target-specific divmod libfunc.  */
+
+static void
+expand_DIVMOD (internal_fn, gcall *call_stmt)
+{
+  tree lhs = gimple_call_lhs (call_stmt);
+  tree arg0 = gimple_call_arg (call_stmt, 0);
+  tree arg1 = gimple_call_arg (call_stmt, 1);
+
+  gcc_assert (TREE_CODE (TREE_TYPE (lhs)) == COMPLEX_TYPE);
+  tree type = TREE_TYPE (TREE_TYPE (lhs));
+  machine_mode mode = TYPE_MODE (type);
+  bool unsignedp = TYPE_UNSIGNED (type);
+  optab tab = (unsignedp) ? udivmod_optab : sdivmod_optab;
+
+  rtx op0 = expand_normal (arg0);
+  rtx op1 = expand_normal (arg1);
+  rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
+
+  rtx quotient, remainder, libfunc;
+
+  /* Check if optab_handler exists for divmod_optab for given mode.  */
+  if (optab_handler (tab, mode) != CODE_FOR_nothing)
+    {
+      quotient = gen_reg_rtx (mode);
+      remainder = gen_reg_rtx (mode);
+      expand_twoval_binop (tab, op0, op1, quotient, remainder, unsignedp);
+    }
+
+  /* Generate call to divmod libfunc if it exists.  */
+  else if ((libfunc = optab_libfunc (tab, mode)) != NULL_RTX)
+    targetm.expand_divmod_libfunc (libfunc, mode, op0, op1,
+				   &quotient, &remainder);
+
+  else
+    gcc_unreachable ();
+
+  /* Wrap the return value (quotient, remainder) within COMPLEX_EXPR.  */
+  expand_expr (build2 (COMPLEX_EXPR, TREE_TYPE (lhs),
+		       make_tree (TREE_TYPE (arg0), quotient),
+		       make_tree (TREE_TYPE (arg1), remainder)),
+	      target, VOIDmode, EXPAND_NORMAL);
+}
+
 /* Expand a call to FN using the operands in STMT.  FN has a single
    output operand and NARGS input operands.  */
 
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 28863df..cf2c402 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -201,6 +201,9 @@  DEF_INTERNAL_FN (FALLTHROUGH, ECF_LEAF | ECF_NOTHROW, NULL)
 /* To implement __builtin_launder.  */
 DEF_INTERNAL_FN (LAUNDER, ECF_LEAF | ECF_NOTHROW | ECF_NOVOPS, NULL)
 
+/* Divmod function.  */
+DEF_INTERNAL_FN (DIVMOD, ECF_CONST | ECF_LEAF, NULL)
+
 #undef DEF_INTERNAL_INT_FN
 #undef DEF_INTERNAL_FLT_FN
 #undef DEF_INTERNAL_OPTAB_FN
diff --git a/gcc/target.def b/gcc/target.def
index 20def24..ae0ea16 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5055,6 +5055,15 @@  Normally, this is not needed.",
  bool, (const_tree field, machine_mode mode),
  default_member_type_forces_blk)
 
+/* See tree-ssa-math-opts.c:divmod_candidate_p for conditions
+   that gate the divmod transform.  */
+DEFHOOK
+(expand_divmod_libfunc,
+ "Define this hook for enabling divmod transform if the port does not have\n\
+hardware divmod insn but defines target-specific divmod libfuncs.", 
+ void, (rtx libfunc, machine_mode mode, rtx op0, rtx op1, rtx *quot, rtx *rem),
+ NULL)
+
 /* Return the class for a secondary reload, and fill in extra information.  */
 DEFHOOK
 (secondary_reload,
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 0cea1a8..c315da8 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -112,6 +112,9 @@  along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "internal-fn.h"
 #include "case-cfn-macros.h"
+#include "optabs-libfuncs.h"
+#include "tree-eh.h"
+#include "targhooks.h"
 
 /* This structure represents one basic block that either computes a
    division, or is a common dominator for basic block that compute a
@@ -184,6 +187,9 @@  static struct
 
   /* Number of fp fused multiply-add ops inserted.  */
   int fmas_inserted;
+
+  /* Number of divmod calls inserted.  */
+  int divmod_calls_inserted;
 } widen_mul_stats;
 
 /* The instance of "struct occurrence" representing the highest
@@ -3793,6 +3799,213 @@  match_uaddsub_overflow (gimple_stmt_iterator *gsi, gimple *stmt,
   return true;
 }
 
+/* Return true if target has support for divmod.  */
+
+static bool
+target_supports_divmod_p (optab divmod_optab, optab div_optab, machine_mode mode) 
+{
+  /* If target supports hardware divmod insn, use it for divmod.  */
+  if (optab_handler (divmod_optab, mode) != CODE_FOR_nothing)
+    return true;
+
+  /* Check if libfunc for divmod is available.  */
+  rtx libfunc = optab_libfunc (divmod_optab, mode);
+  if (libfunc != NULL_RTX)
+    {
+      /* If optab_handler exists for div_optab, perhaps in a wider mode,
+	 we don't want to use the libfunc even if it exists for given mode.  */ 
+      for (machine_mode div_mode = mode;
+	   div_mode != VOIDmode;
+	   div_mode = GET_MODE_WIDER_MODE (div_mode))
+	if (optab_handler (div_optab, div_mode) != CODE_FOR_nothing)
+	  return false;
+
+      return targetm.expand_divmod_libfunc != NULL;
+    }
+  
+  return false; 
+}
+
+/* Check if stmt is candidate for divmod transform.  */
+
+static bool
+divmod_candidate_p (gassign *stmt)
+{
+  tree type = TREE_TYPE (gimple_assign_lhs (stmt));
+  enum machine_mode mode = TYPE_MODE (type);
+  optab divmod_optab, div_optab;
+
+  if (TYPE_UNSIGNED (type))
+    {
+      divmod_optab = udivmod_optab;
+      div_optab = udiv_optab;
+    }
+  else
+    {
+      divmod_optab = sdivmod_optab;
+      div_optab = sdiv_optab;
+    }
+
+  tree op1 = gimple_assign_rhs1 (stmt);
+  tree op2 = gimple_assign_rhs2 (stmt);
+
+  /* Disable the transform if either is a constant, since division-by-constant
+     may have specialized expansion.  */
+  if (CONSTANT_CLASS_P (op1) || CONSTANT_CLASS_P (op2))
+    return false;
+
+  /* Exclude the case where TYPE_OVERFLOW_TRAPS (type) as that should
+     expand using the [su]divv optabs.  */
+  if (TYPE_OVERFLOW_TRAPS (type))
+    return false;
+  
+  if (!target_supports_divmod_p (divmod_optab, div_optab, mode)) 
+    return false;
+
+  return true;
+}
+
+/* This function looks for:
+   t1 = a TRUNC_DIV_EXPR b;
+   t2 = a TRUNC_MOD_EXPR b;
+   and transforms it to the following sequence:
+   complex_tmp = DIVMOD (a, b);
+   t1 = REALPART_EXPR (complex_tmp);
+   t2 = IMAGPART_EXPR (complex_tmp);
+   For conditions enabling the transform see divmod_candidate_p().
+
+   The pass has three parts:
+   1) Find top_stmt which is trunc_div or trunc_mod stmt and dominates all
+      other trunc_div_expr and trunc_mod_expr stmts.
+   2) Add top_stmt and all trunc_div and trunc_mod stmts dominated by top_stmt
+      to stmts vector.
+   3) Insert DIVMOD call just before top_stmt and update entries in
+      stmts vector to use return value of DIVMOD (REALPART_EXPR for div,
+      IMAGPART_EXPR for mod).  */
+
+static bool
+convert_to_divmod (gassign *stmt)
+{
+  if (stmt_can_throw_internal (stmt)
+      || !divmod_candidate_p (stmt))
+    return false;
+
+  tree op1 = gimple_assign_rhs1 (stmt);
+  tree op2 = gimple_assign_rhs2 (stmt);
+  
+  imm_use_iterator use_iter;
+  gimple *use_stmt;
+  auto_vec<gimple *> stmts; 
+
+  gimple *top_stmt = stmt; 
+  basic_block top_bb = gimple_bb (stmt);
+
+  /* Part 1: Try to set top_stmt to "topmost" stmt that dominates
+     at least stmt and possibly other trunc_div/trunc_mod stmts
+     having same operands as stmt.  */
+
+  FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, op1)
+    {
+      if (is_gimple_assign (use_stmt)
+	  && (gimple_assign_rhs_code (use_stmt) == TRUNC_DIV_EXPR
+	      || gimple_assign_rhs_code (use_stmt) == TRUNC_MOD_EXPR)
+	  && operand_equal_p (op1, gimple_assign_rhs1 (use_stmt), 0)
+	  && operand_equal_p (op2, gimple_assign_rhs2 (use_stmt), 0))
+	{
+	  if (stmt_can_throw_internal (use_stmt))
+	    continue;
+
+	  basic_block bb = gimple_bb (use_stmt);
+
+	  if (bb == top_bb)
+	    {
+	      if (gimple_uid (use_stmt) < gimple_uid (top_stmt))
+		top_stmt = use_stmt;
+	    }
+	  else if (dominated_by_p (CDI_DOMINATORS, top_bb, bb))
+	    {
+	      top_bb = bb;
+	      top_stmt = use_stmt;
+	    }
+	}
+    }
+
+  tree top_op1 = gimple_assign_rhs1 (top_stmt);
+  tree top_op2 = gimple_assign_rhs2 (top_stmt);
+
+  stmts.safe_push (top_stmt);
+  bool div_seen = (gimple_assign_rhs_code (top_stmt) == TRUNC_DIV_EXPR);
+
+  /* Part 2: Add all trunc_div/trunc_mod statements dominated by top_bb
+     to stmts vector.  The 2nd loop will always add stmt to stmts vector, since
+     gimple_bb (top_stmt) dominates gimple_bb (stmt), so the
+     2nd loop ends up adding at least a single trunc_mod_expr stmt.  */
+
+  FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, top_op1)
+    {
+      if (is_gimple_assign (use_stmt)
+	  && (gimple_assign_rhs_code (use_stmt) == TRUNC_DIV_EXPR
+	      || gimple_assign_rhs_code (use_stmt) == TRUNC_MOD_EXPR)
+	  && operand_equal_p (top_op1, gimple_assign_rhs1 (use_stmt), 0)
+	  && operand_equal_p (top_op2, gimple_assign_rhs2 (use_stmt), 0))
+	{
+	  if (use_stmt == top_stmt
+	      || stmt_can_throw_internal (use_stmt)
+	      || !dominated_by_p (CDI_DOMINATORS, gimple_bb (use_stmt), top_bb))
+	    continue;
+
+	  stmts.safe_push (use_stmt);
+	  if (gimple_assign_rhs_code (use_stmt) == TRUNC_DIV_EXPR)
+	    div_seen = true;
+	}
+    }
+
+  if (!div_seen)
+    return false;
+
+  /* Part 3: Create call to internal fn DIVMOD:
+     divmod_tmp = DIVMOD (op1, op2).  */
+
+  gcall *call_stmt = gimple_build_call_internal (IFN_DIVMOD, 2, op1, op2);
+  tree res = make_temp_ssa_name (build_complex_type (TREE_TYPE (op1)),
+				 call_stmt, "divmod_tmp");
+  gimple_call_set_lhs (call_stmt, res);
+
+  /* Insert the call before top_stmt.  */
+  gimple_stmt_iterator top_stmt_gsi = gsi_for_stmt (top_stmt);
+  gsi_insert_before (&top_stmt_gsi, call_stmt, GSI_SAME_STMT);
+
+  widen_mul_stats.divmod_calls_inserted++;		
+
+  /* Update all statements in stmts vector:
+     lhs = op1 TRUNC_DIV_EXPR op2 -> lhs = REALPART_EXPR<divmod_tmp>
+     lhs = op1 TRUNC_MOD_EXPR op2 -> lhs = IMAGPART_EXPR<divmod_tmp>.  */
+
+  for (unsigned i = 0; stmts.iterate (i, &use_stmt); ++i)
+    {
+      tree new_rhs;
+
+      switch (gimple_assign_rhs_code (use_stmt))
+	{
+	  case TRUNC_DIV_EXPR:
+	    new_rhs = fold_build1 (REALPART_EXPR, TREE_TYPE (op1), res);
+	    break;
+
+	  case TRUNC_MOD_EXPR:
+	    new_rhs = fold_build1 (IMAGPART_EXPR, TREE_TYPE (op1), res);
+	    break;
+
+	  default:
+	    gcc_unreachable ();
+	}
+
+      gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt);
+      gimple_assign_set_rhs_from_tree (&gsi, new_rhs);
+      update_stmt (use_stmt);
+    }
+
+  return true; 
+}    
 
 /* Find integer multiplications where the operands are extended from
    smaller types, and replace the MULT_EXPR with a WIDEN_MULT_EXPR
@@ -3837,6 +4050,8 @@  pass_optimize_widening_mul::execute (function *fun)
   bool cfg_changed = false;
 
   memset (&widen_mul_stats, 0, sizeof (widen_mul_stats));
+  calculate_dominance_info (CDI_DOMINATORS);
+  renumber_gimple_stmt_uids ();
 
   FOR_EACH_BB_FN (bb, fun)
     {
@@ -3870,6 +4085,10 @@  pass_optimize_widening_mul::execute (function *fun)
 		    match_uaddsub_overflow (&gsi, stmt, code);
 		  break;
 
+		case TRUNC_MOD_EXPR:
+		  convert_to_divmod (as_a<gassign *> (stmt));
+		  break;
+
 		default:;
 		}
 	    }
@@ -3916,6 +4135,8 @@  pass_optimize_widening_mul::execute (function *fun)
 			    widen_mul_stats.maccs_inserted);
   statistics_counter_event (fun, "fused multiply-adds inserted",
 			    widen_mul_stats.fmas_inserted);
+  statistics_counter_event (fun, "divmod calls inserted",
+			    widen_mul_stats.divmod_calls_inserted);
 
   return cfg_changed ? TODO_cleanup_cfg : 0;
 }