Improve detection of widening multiplication in the vectorizer

On 1 June 2011 15:14, Richard Guenther <richard.guenther@gmail.com> wrote:
> On Wed, Jun 1, 2011 at 1:37 PM, Ira Rosen <ira.rosen@linaro.org> wrote:
>> On 1 June 2011 12:42, Richard Guenther <richard.guenther@gmail.com> wrote:
>>
>>> Did you think about moving pass_optimize_widening_mul before
>>> loop optimizations?  Does that pass catch the cases you are
>>> teaching the pattern recognizer?  I think we should try to expose
>>> these more complicated instructions to loop optimizers.
>>>
>>
>> pass_optimize_widening_mul doesn't catch these cases, but I can try to
>> teach it instead of the vectorizer.
>> I am now testing
>>
>> Index: passes.c
>> ===================================================================
>> --- passes.c    (revision 174391)
>> +++ passes.c    (working copy)
>> @@ -870,6 +870,7 @@
>>       NEXT_PASS (pass_split_crit_edges);
>>       NEXT_PASS (pass_pre);
>>       NEXT_PASS (pass_sink_code);
>> +      NEXT_PASS (pass_optimize_widening_mul);
>>       NEXT_PASS (pass_tree_loop);
>>        {
>>          struct opt_pass **p = &pass_tree_loop.pass.sub;
>> @@ -934,7 +935,6 @@
>>       NEXT_PASS (pass_forwprop);
>>       NEXT_PASS (pass_phiopt);
>>       NEXT_PASS (pass_fold_builtins);
>> -      NEXT_PASS (pass_optimize_widening_mul);
>>       NEXT_PASS (pass_tail_calls);
>>       NEXT_PASS (pass_rename_ssa_copies);
>>       NEXT_PASS (pass_uncprop);
>>
>> to see how it affects other loop optimizations (vectorizer pattern
>> tests obviously fail).

Looks like it needs copy_prop and dce as well:

otherwise I get (on x86_64-suse-linux)

FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddss
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddsd
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubss
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubsd
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddss
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddsd

Ira

>
> Thanks.  I would hope that we eventually can get rid of the
> pattern recognizer ... at least for SSE there is also always
> a scalar variant instruction for each vectorized one.
>
> Richard.
>
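
For readers without the testcases at hand, here is a minimal sketch of the two kinds of source code this thread is about. It is not taken from the patch or from the GCC testsuite; the function names and the operand types are assumptions chosen for illustration. The first function is a loop whose multiplication has a result twice as wide as its operands, the shape a widening-multiply pattern is meant to catch. The second is a scalar multiply-add, the kind of expression that tests such as fma4-fma-2.c expect to be emitted as a fused instruction (as far as I understand, pass_optimize_widening_mul is also where such multiply-adds are formed, which would explain why moving the pass affects those tests).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: 16-bit operands, 32-bit product.  Each
   iteration multiplies two narrow values and stores the wide result,
   which is a candidate for a widening multiplication.  */
void
widen_mult_loop (int32_t *out, const int16_t *a, const int16_t *b, size_t n)
{
  assert (out != NULL && a != NULL && b != NULL);
  for (size_t i = 0; i < n; i++)
    out[i] = (int32_t) a[i] * (int32_t) b[i];
}

/* Hypothetical example: a multiply feeding an add.  On a target with
   fused multiply-add support the compiler may combine the two
   operations into one instruction, depending on flags.  */
float
mul_add (float a, float b, float c)
{
  return a * b + c;
}
```

Whether either function is actually transformed depends on the target, the optimization flags and the pass order, which is exactly what the patch above changes.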
