diff mbox

Optimize certain end of loop conditions into min/max operation

Message ID 55B5A884.4060105@linaro.org
State New
Headers show

Commit Message

Michael Collison July 27, 2015, 3:41 a.m. UTC
This patch is designed to optimize end of loop conditions involving of 
the form
  i < x && i < y into i < min (x, y). Loop condition involving '>' are 
handled similarly using max(x,y).
As an example:

#define N 1024

int  a[N], b[N], c[N];

void add (unsignedint  m, unsignedint  n)
{
   unsignedint  i, bound = (m < n) ? m : n;
   for  (i = 0; i < m && i < n; ++i)
     a[i] = b[i] + c[i];
}


Performed bootstrap and make check on: x86_64_unknown-linux-gnu, 
arm-linux-gnueabihf, and aarch64-linux-gnu.
Okay for trunk?

2015-07-24  Michael Collison  <michael.collison@linaro.org>
                     Andrew Pinski <andrew.pinski@caviumnetworks.com>

                 * match.pd ((x < y) && (x < z) -> x < min (y,z),
                 (x > y) and (x > z) -> x > max (y,z))

Comments

Michael Collison July 29, 2015, 10:10 p.m. UTC | #1
Richard and Jeff,

Any conclusion to this discussion? Is this okay in match.pd or would you 
like to see it implemented elsewhere?

On 7/28/2015 12:41 AM, Richard Biener wrote:
> On Mon, Jul 27, 2015 at 6:20 PM, Jeff Law <law@redhat.com> wrote:
>> On 07/27/2015 03:25 AM, Richard Biener wrote:
>>> On Mon, Jul 27, 2015 at 5:41 AM, Michael Collison
>>> <michael.collison@linaro.org> wrote:
>>>> This patch is designed to optimize end of loop conditions involving of
>>>> the
>>>> form
>>>>    i < x && i < y into i < min (x, y). Loop condition involving '>' are
>>>> handled similarly using max(x,y).
>>>> As an example:
>>>>
>>>> #define N 1024
>>>>
>>>> int  a[N], b[N], c[N];
>>>>
>>>> void add (unsignedint  m, unsignedint  n)
>>>> {
>>>>     unsignedint  i, bound = (m < n) ? m : n;
>>>>     for  (i = 0; i < m && i < n; ++i)
>>>>       a[i] = b[i] + c[i];
>>>> }
>>>>
>>>>
>>>> Performed bootstrap and make check on: x86_64_unknown-linux-gnu,
>>>> arm-linux-gnueabihf, and aarch64-linux-gnu.
>>>> Okay for trunk?
>>>
>>> So this works only for && that has been lowered to non-CFG form
>>> (I suppose phiopt would catch that?  If not, ifcombine would be the
>>> place to implement it I guess).
>> phiopt is supposed to be generating MIN/MAX expressions for us.  If it isn't
>> it'd be good to see any testcases where it isn't.
>>
>> I think that raises a general question though.  Does it make more sense to
>> capture MIN/MAX (and others) in phiopt or in the match.pd framework?
> match.pd is good for pattern recognition - patterns of fixed size.  There are
> cases that are done in fold-const.c for example that doesn't fit very well
> and should be done as separate pass, like for example figuring out whether
> an expression can be easily negated or whether there are sign-changes that
> can be stripped.  Basically all cases where fold currently recurses (unbound).
>
> The above case is a corner case I think - the number of && you can change
> into (multiple) MIN/MAX is unbound but we might only care about the case
> where there will be one MIN/MAX operation.
>
> Generally phiopt and other patterns that match the CFG are not yet well
> supported by match.pd (though I outlined how matching PHI nodes when
> facing (simplify (cond ...) ...) would be possible).
>
> So while putting something into match.pd is easy I'd like people to
> think if doing the same thing elsewhere is better - that is, if this is really
> a pattern transform operation or if you are just implementing a special-case
> of a general transform as a pattern.
>
> Richard.
>
>> Jeff
>>
Michael Collison July 31, 2015, 6:18 p.m. UTC | #2
Hi Jeff,

Yes I will create a test case. I'm not quite sure what to check for even 
in the machine dependent test case. It's quite possible for the 
instructions that are generated to change over time.

On 7/31/2015 9:20 AM, Jeff Law wrote:
> On 07/28/2015 01:41 AM, Richard Biener wrote:
>>
>> The above case is a corner case I think - the number of && you can 
>> change
>> into (multiple) MIN/MAX is unbound but we might only care about the case
>> where there will be one MIN/MAX operation.
> I suspect that's going to be the most important/common case.
>
>>
>> Generally phiopt and other patterns that match the CFG are not yet well
>> supported by match.pd (though I outlined how matching PHI nodes when
>> facing (simplify (cond ...) ...) would be possible).
> Right.  Though I thought the conclusion after outlining we determined 
> it wasn't really feasible, yet.
>
>
>>
>> So while putting something into match.pd is easy I'd like people to
>> think if doing the same thing elsewhere is better - that is, if this 
>> is really
>> a pattern transform operation or if you are just implementing a 
>> special-case
>> of a general transform as a pattern.
> So in this case we're taking something like:
>
>  _6 = i_1 < m_4(D);
>   _7 = i_1 < n_3(D);
>   _8 = _6 & _7;
>   if (_8 != 0)
>
>
> And turning it into
>
> _X = MIN (m_4, n_3)
> if (i_1 < _X)
>
> That seems to me like a good match for match.pd given its generality 
> and the need to walk up the use-def chain.  It's certainly not a good 
> fit for phi-opt since we're not looking at PHIs :-)
>
>
>
> Michael -- can you take your sample code and turn it into a test for 
> the testsuite.  I'd hazard a guess it'll need to be target specific 
> because of its interactions with branch-costing.  Might as well make 4 
> variants (lt -> MIN, le -> MIN, ge->MAX, gt->MAX).
>
> We're going to want that regardless of whether tackling this issue in 
> match.pd (my current preference) or elsewhere.
>
> jeff
diff mbox

Patch

diff --git a/gcc/match.pd b/gcc/match.pd
index 5e8fd32..8691710 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1793,3 +1793,17 @@  along with GCC; see the file COPYING3.  If not see
      (convert (bit_and (op (convert:utype @0) (convert:utype @1))
                (convert:utype @4)))))))

+
+/* Transform (@0 < @1 and @0 < @2) to use min */
+(for op (lt le)
+(simplify
+(bit_and:c (op @0 @1) (op @0 @2))
+(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+(op @0 (min @1 @2)))))
+
+/* Transform (@0 > @1 and @0 > @2) to use max */
+(for op (gt ge)
+(simplify
+(bit_and:c (op @0 @1) (op @0 @2))
+(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+(op @0 (max @1 @2)))))