RFC [1/2] divmod transform

On 24 May 2016 at 17:42, Richard Biener <rguenther@suse.de> wrote:
> On Tue, 24 May 2016, Prathamesh Kulkarni wrote:

>

>> On 23 May 2016 at 17:35, Richard Biener <richard.guenther@gmail.com> wrote:

>> > On Mon, May 23, 2016 at 10:58 AM, Prathamesh Kulkarni

>> > <prathamesh.kulkarni@linaro.org> wrote:

>> >> Hi,

>> >> I have updated my patch for divmod (attached), which was originally

>> >> based on Kugan's patch.

>> >> The patch transforms stmts with code TRUNC_DIV_EXPR and TRUNC_MOD_EXPR

>> >> having same operands to divmod representation, so we can cse computation of mod.

>> >>

>> >> t1 = a TRUNC_DIV_EXPR b;

>> >> t2 = a TRUNC_MOD_EXPR b

>> >> is transformed to:

>> >> complex_tmp = DIVMOD (a, b);

>> >> t1 = REALPART_EXPR (complex_tmp);

>> >> t2 = IMAGPART_EXPR (complex_tmp);

>> >>

>> >> * New hook divmod_expand_libfunc

>> >> The rationale for introducing the hook is that different targets have

>> >> incompatible calling conventions for divmod libfunc.

>> >> Currently three ports define divmod libfunc: c6x, spu and arm.

>> >> c6x and spu follow the convention of libgcc2.c:__udivmoddi4:

>> >> return quotient and store remainder in argument passed as pointer,

>> >> while the arm version takes two arguments and returns both

>> >> quotient and remainder having mode double the size of the operand mode.

>> >> The port should hence override the hook expand_divmod_libfunc

>> >> to generate call to target-specific divmod.

>> >> Ports should define this hook if:

>> >> a) The port does not have divmod or div insn for the given mode.

>> >> b) The port defines divmod libfunc for the given mode.

>> >> The default hook default_expand_divmod_libfunc() generates call

>> >> to libgcc2.c:__udivmoddi4 provided the operands are unsigned and

>> >> are of DImode.

>> >>

>> >> Patch passes bootstrap+test on x86_64-unknown-linux-gnu and

>> >> cross-tested on arm*-*-*.

>> >> Bootstrap+test in progress on arm-linux-gnueabihf.

>> >> Does this patch look OK ?

>> >

>> > diff --git a/gcc/targhooks.c b/gcc/targhooks.c

>> > index 6b4601b..e4a021a 100644

>> > --- a/gcc/targhooks.c

>> > +++ b/gcc/targhooks.c

>> > @@ -1965,4 +1965,31 @@ default_optab_supported_p (int, machine_mode,

>> > machine_mode, optimization_type)

>> >    return true;

>> >  }

>> >

>> > +void

>> > +default_expand_divmod_libfunc (bool unsignedp, machine_mode mode,

>> > +                              rtx op0, rtx op1,

>> > +                              rtx *quot_p, rtx *rem_p)

>> >

>> > functions need a comment.

>> >

>> > ISTR it was suggested that ARM change to libgcc2.c__udivmoddi4 style?  In that

>> > case we could avoid the target hook.

>> Well I would prefer adding the hook because that's more easier -;)

>> Would it be ok for now to go with the hook ?

>> >

>> > +      /* If target overrides expand_divmod_libfunc hook

>> > +        then perform divmod by generating call to the target-specifc divmod

>> > libfunc.  */

>> > +      if (targetm.expand_divmod_libfunc != default_expand_divmod_libfunc)

>> > +       return true;

>> > +

>> > +      /* Fall back to using libgcc2.c:__udivmoddi4.  */

>> > +      return (mode == DImode && unsignedp);

>> >

>> > I don't understand this - we know optab_libfunc returns non-NULL for 'mode'

>> > but still restrict this to DImode && unsigned?  Also if

>> > targetm.expand_divmod_libfunc

>> > is not the default we expect the target to handle all modes?

>> Ah indeed, the check for DImode is unnecessary.

>> However I suppose the check for unsignedp should be there,

>> since we want to generate call to __udivmoddi4 only if operand is unsigned ?

>

> The optab libfunc for sdivmod should be NULL in that case.

Ah indeed, thanks.
>

>> >

>> > That said - I expected the above piece to be simply a 'return true;' ;)

>> >

>> > Usually we use some can_expand_XXX helper in optabs.c to query if the target

>> > supports a specific operation (for example SImode divmod would use DImode

>> > divmod by means of widening operands - for the unsigned case of course).

>> Thanks for pointing out. So if a target does not support divmod

>> libfunc for a mode

>> but for a wider mode, then we could zero-extend operands to the wider-mode,

>> perform divmod on the wider-mode, and then cast result back to the

>> original mode.

>> I haven't done that in this patch, would it be OK to do that as a follow up ?

>

> I think that you should conservatively handle the div_optab query, thus if

> the target has a HW division in a wider mode don't use the divmod IFN.

> You'd simply iterate over GET_MODE_WIDER_MODE and repeat the

> if (optab_handler (div_optab, mode) != CODE_FOR_nothing) check, bailing

> out if that is available.

Done.
>

>> > +  /* Disable the transform if either is a constant, since

>> > division-by-constant

>> > +     may have specialized expansion.  */

>> > +  if (TREE_CONSTANT (op1) || TREE_CONSTANT (op2))

>> > +    return false;

>> >

>> > please use CONSTANT_CLASS_P (op1) || CONSTANT_CLASS_P (op2)

>> >

>> > +  if (TYPE_OVERFLOW_TRAPS (type))

>> > +    return false;

>> >

>> > why's that?  Generally please first test cheap things (trapping, constant-ness)

>> > before checking expensive stuff (target_supports_divmod_p).

>> I added TYPE_OVERFLOW_TRAPS (type) based on your suggestion in:

>> https://www.mail-archive.com/gcc@gcc.gnu.org/msg78534.html

>> "When looking at TRUNC_DIV_EXPR you should also exclude

>> the case where TYPE_OVERFLOW_TRAPS (type) as that should

>> expand using the [su]divv optabs (no trapping overflow

>> divmod optab exists)."

>

> Ok, didn't remember that.

>

>> >

>> > +static bool

>> > +convert_to_divmod (gassign *stmt)

>> > +{

>> > +  if (!divmod_candidate_p (stmt))

>> > +    return false;

>> > +

>> > +  tree op1 = gimple_assign_rhs1 (stmt);

>> > +  tree op2 = gimple_assign_rhs2 (stmt);

>> > +

>> > +  vec<gimple *> stmts = vNULL;

>> >

>> > use an auto_vec <gimple *> - you currently leak it in at least one place.

>> >

>> > +      if (maybe_clean_or_replace_eh_stmt (use_stmt, use_stmt))

>> > +       cfg_changed = true;

>> >

>> > note that this suggests you should check whether any of the stmts may throw

>> > internally as you don't update / transfer EH info correctly.  So for 'stmt' and

>> > all 'use_stmt' check stmt_can_throw_internal and if so do not add it to

>> > the list of stmts to modify.

>> >

>> > Btw, I think you should not add 'stmt' immediately but when iterating over

>> > all uses also gather uses in TRUNC_MOD_EXPR.

>> >

>> > Otherwise looks ok.

>> Done changes in this version. I am gathering mod uses same time as div uses,

>> so this imposes a constraint that mod dominates mod. I am not sure if

>> that's desirable.

>

> I think you also need a mod_seen variable now that you don't necessarily

> end up with 'stmt' in the vector of stmts.  I don't see how there is a

> constraint that mod dominates mod - it's just that the top_stmt needs

> to dominate all other uses that can be replaced with replacing top_stmt

> with a divmod.  It's just that the actual stmt set we choose may now

> depend on immediate uses order which on a second thought is bad

> as immediate uses order could be affected by debug stmts ... hmm.

>

> To avoid this please re-add the code adding 'stmt' to stmts immediately

> and add a use_stmt != stmt check to the immediate use processing loop

> so that we don't end up adding it twice.

Well I wonder what will happen for the following case:
t1 = x / y;
if (cond)
  t2 = x % y;
else
  t3 = x % y;

Assuming stmt == top_stmt is "t2 = x % y" and use_stmt is "t3 = x % y",
use_stmt will not get added to stmts vector, since top_stmt and
use_stmt are not in same bb,
and bb's containing top_stmt and use_stmt don't dominate each other.
Not sure if this is practical case (I assume fre will hoist mod
outside if-else?)

Now that we immediately add stmt to stmts vector, I suppose mod_seen
shall not be required ?

Thanks,
Prathamesh
>

> Richard.

RFC [1/2] divmod transform

Commit Message

Comments

Patch