[match.pd] Fix for PR35691

On 8 November 2016 at 16:46, Richard Biener <rguenther@suse.de> wrote:
> On Tue, 8 Nov 2016, Prathamesh Kulkarni wrote:

>

>> On 8 November 2016 at 13:23, Richard Biener <rguenther@suse.de> wrote:

>> > On Mon, 7 Nov 2016, Prathamesh Kulkarni wrote:

>> >

>> >> On 7 November 2016 at 23:06, Prathamesh Kulkarni

>> >> <prathamesh.kulkarni@linaro.org> wrote:

>> >> > On 7 November 2016 at 15:43, Richard Biener <rguenther@suse.de> wrote:

>> >> >> On Fri, 4 Nov 2016, Prathamesh Kulkarni wrote:

>> >> >>

>> >> >>> On 4 November 2016 at 13:41, Richard Biener <rguenther@suse.de> wrote:

>> >> >>> > On Thu, 3 Nov 2016, Marc Glisse wrote:

>> >> >>> >

>> >> >>> >> On Thu, 3 Nov 2016, Richard Biener wrote:

>> >> >>> >>

>> >> >>> >> > > > > The transform would also work for vectors (element_precision for

>> >> >>> >> > > > > the test but also a value-matching zero which should ensure the

>> >> >>> >> > > > > same number of elements).

>> >> >>> >> > > > Um sorry, I didn't get how to check vectors to be of equal length by a

>> >> >>> >> > > > matching zero.

>> >> >>> >> > > > Could you please elaborate on that ?

>> >> >>> >> > >

>> >> >>> >> > > He may have meant something like:

>> >> >>> >> > >

>> >> >>> >> > >   (op (cmp @0 integer_zerop@2) (cmp @1 @2))

>> >> >>> >> >

>> >> >>> >> > I meant with one being @@2 to allow signed vs. Unsigned @0/@1 which was the

>> >> >>> >> > point of the pattern.

>> >> >>> >>

>> >> >>> >> Oups, that's what I had written first, and then I somehow managed to confuse

>> >> >>> >> myself enough to remove it so as to remove the call to types_match :-(

>> >> >>> >>

>> >> >>> >> > > So the last operand is checked with operand_equal_p instead of

>> >> >>> >> > > integer_zerop. But the fact that we could compute bit_ior on the

>> >> >>> >> > > comparison results should already imply that the number of elements is the

>> >> >>> >> > > same.

>> >> >>> >> >

>> >> >>> >> > Though for equality compares we also allow scalar results IIRC.

>> >> >>> >>

>> >> >>> >> Oh, right, I keep forgetting that :-( And I have no idea how to generate one

>> >> >>> >> for a testcase, at least until the GIMPLE FE lands...

>> >> >>> >>

>> >> >>> >> > > On platforms that have IOR on floats (at least x86 with SSE, maybe some

>> >> >>> >> > > vector mode on s390?), it would be cool to do the same for floats (most

>> >> >>> >> > > likely at the RTL level).

>> >> >>> >> >

>> >> >>> >> > On GIMPLE view-converts could come to the rescue here as well.  Or we cab

>> >> >>> >> > just allow bit-and/or on floats as much as we allow them on pointers.

>> >> >>> >>

>> >> >>> >> Would that generate sensible code on targets that do not have logic insns for

>> >> >>> >> floats? Actually, even on x86_64 that generates inefficient code, so there

>> >> >>> >> would be some work (for instance grep finds no gen_iordf3, only gen_iorv2df3).

>> >> >>> >>

>> >> >>> >> I am also a bit wary of doing those obfuscating optimizations too early...

>> >> >>> >> a==0 is something that other optimizations might use. long

>> >> >>> >> c=(long&)a|(long&)b; (double&)c==0; less so...

>> >> >>> >>

>> >> >>> >> (and I am assuming that signaling NaNs don't make the whole transformation

>> >> >>> >> impossible, which might be wrong)

>> >> >>> >

>> >> >>> > Yeah.  I also think it's not so much important - I just wanted to mention

>> >> >>> > vectors...

>> >> >>> >

>> >> >>> > Btw, I still think we need a more sensible infrastructure for passes

>> >> >>> > to gather, analyze and modify complex conditions.  (I'm always pointing

>> >> >>> > to tree-affine.c as an, albeit not very good, example for handling

>> >> >>> > a similar problem)

>> >> >>> Thanks for mentioning the value-matching capture @@, I wasn't aware of

>> >> >>> this match.pd feature.

>> >> >>> The current patch keeps it restricted to only bitwise operators on integers.

>> >> >>> Bootstrap+test running on x86_64-unknown-linux-gnu.

>> >> >>> OK to commit if passes ?

>> >> >>

>> >> >> +/* PR35691: Transform

>> >> >> +   (x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.

>> >> >> +   (x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.  */

>> >> >> +

>> >> >>

>> >> >> Please omit the vertical space

>> >> >>

>> >> >> +(for bitop (bit_and bit_ior)

>> >> >> +     cmp (eq ne)

>> >> >> + (simplify

>> >> >> +  (bitop (cmp @0 integer_zerop) (cmp @1 integer_zerop))

>> >> >>

>> >> >> if you capture the first integer_zerop as @2 then you can re-use it...

>> >> >>

>> >> >> +   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))

>> >> >> +       && INTEGRAL_TYPE_P (TREE_TYPE (@1))

>> >> >> +       && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE

>> >> >> (@1)))

>> >> >> +    (cmp (bit_ior @0 (convert @1)) { build_zero_cst (TREE_TYPE (@0));

>> >> >>

>> >> >> ... here inplace of the { build_zero_cst ... }.

>> >> >>

>> >> >> Ok with that changes.

>> >> > Thanks, committed the attached version as r241915.

>> >> ugh, the svn commit message has:

>> >>

>> >> testsuite/

>> >> * gcc.dg/pr35691-1.c: New test-case.

>> >> * gcc.dg/pr35691-4.c: Likewise.

>> >>

>> >> pr35691-4.c was a typo, should be pr35691-2.c :/

>> >> However testsuite/ChangeLog correctly has entry for pr35691-2.c

>> >> Is it possible to edit the commit message for r241915 ?

>> >> Sorry about this.

>> >

>> > No, just leave it as-is.

>> Hi,

>> Chritstophe reported to me that the commit caused test-cases

>> pr35691-1.c and pr35691-2.c (which were added by the commit)

>> to FAIL for cortex-a5:

>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/241915/arm-none-linux-gnueabihf/diff-gcc-rh60-arm-none-linux-gnueabihf-arm-cortex-a5-vfpv3-d16-fp16.txt

>>

>> It seems truth_andif_expr is not simplified to bit_and_expr on

>> cortex-a5 as for x86_64 (and other arm variants).

>> The differences in dumps start from 004t.gimple for pr35691-1.c:

>>

>> x86_64 gimple dump:

>> foo (int z0, unsigned int z1)

>> {

>>   int D.1800;

>>   int t0;

>>   int t1;

>>   int t2;

>>

>>   _1 = z0 == 0;

>>   t0 = (int) _1;

>>   _2 = z1 == 0;

>>   t1 = (int) _2;

>>   _3 = t0 != 0;

>>   _4 = t1 != 0;

>>   _5 = _3 & _4;

>>   t2 = (int) _5;

>>   D.1800 = t2;

>>   return D.1800;

>> }

>>

>> cortex-a5 gimple dump:

>> foo (int z0, unsigned int z1)

>> {

>>   int iftmp.0;

>>   int D.4176;

>>   int t0;

>>   int t1;

>>   int t2;

>>

>>   _1 = z0 == 0;

>>   t0 = (int) _1;

>>   _2 = z1 == 0;

>>   t1 = (int) _2;

>>   if (t0 != 0) goto <D.4174>; else goto <D.4172>;

>>   <D.4174>:

>>   if (t1 != 0) goto <D.4175>; else goto <D.4172>;

>>   <D.4175>:

>>   iftmp.0 = 1;

>>   goto <D.4173>;

>>   <D.4172>:

>>   iftmp.0 = 0;

>>   <D.4173>:

>>   t2 = iftmp.0;

>>   D.4176 = t2;

>>   return D.4176;

>> }

>>

>> Since the pattern expects truth_andif_expr to be converted to bit_and_expr,

>> it fails to match for cortex-a5.

>> This seems to happen only for cortex-a5 (the other variants a9, a15,

>> a57 are OK).

>>

>> Is my assumption that truth_andif_expr would be always converted to bit_and_expr

>> for above case incorrect ?

>

> Yes, it depends on LOGICAL_OP_SHORT_CIRCUIT.

Thanks, I wasn't aware of that. I adjusted the test-case in the attached patch,
to always contain bit_and_expr/bit_ior_expr, which passes on cortex-a5
and powerpc.
I will update PR35691 with the comment that it's partially fixed, for
targets when LOGICAL_OP_NON_SHORT_CIRCUIT is true.
Cross-testing on arm*-*-* in progress.
Ok to commit if passes ?

Thanks,
Prathamesh
>

> Richard.

>

>> Thanks,

>> Prathamesh

>> >

>> > Richard.

>> >

>> >> Regards,

>> >> Prathamesh

>> >> >

>> >> >>

>> >> >> Richard.

>> >>

>> >>

>> >

>> > --

>> > Richard Biener <rguenther@suse.de>

>> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

>>

>>

>

> --

> Richard Biener <rguenther@suse.de>

> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

[match.pd] Fix for PR35691

Commit Message

Comments

Patch