Support vectorization of widening shifts

Message ID CAKSNEw4m5FirSqekmAyRi5SWqPidqz2kb2=X1r1WQrjwVEjStw@mail.gmail.com
State New

Commit Message

Ira Rosen Oct. 2, 2011, 8:30 a.m. UTC
On 29 September 2011 17:30, Ramana Radhakrishnan
<ramana.radhakrishnan@linaro.org> wrote:
> On 19 September 2011 08:54, Ira Rosen <ira.rosen@linaro.org> wrote:
>
>>
>> Bootstrapped on powerpc64-suse-linux, tested on powerpc64-suse-linux
>> and arm-linux-gnueabi
>> OK for mainline?
>
> Sorry I missed this patch. Is there any reason why we need unspecs in
> this case? Can't this be represented by subregs and zero/sign
> extensions in RTL without the UNSPECs?

Like this:

 ; because the ordering of vector elements in Q registers is different from what
 ; the semantics of the instructions require.

?

Thanks,
Ira


>
> cheers
> Ramana
>
>>
>> Thanks,
>> Ira
>>
>> ChangeLog:
>>
>>        * doc/md.texi (vec_widen_ushiftl_hi, vec_widen_ushiftl_lo,
>>        vec_widen_sshiftl_hi, vec_widen_sshiftl_lo): Document.
>>        * tree-pretty-print.c (dump_generic_node): Handle WIDEN_SHIFT_LEFT_EXPR,
>>        VEC_WIDEN_SHIFT_LEFT_HI_EXPR and VEC_WIDEN_SHIFT_LEFT_LO_EXPR.
>>        (op_code_prio): Likewise.
>>        (op_symbol_code): Handle WIDEN_SHIFT_LEFT_EXPR.
>>        * optabs.c (optab_for_tree_code): Handle
>>        VEC_WIDEN_SHIFT_LEFT_HI_EXPR and VEC_WIDEN_SHIFT_LEFT_LO_EXPR.
>>        (init_optabs): Initialize optab codes for vec_widen_u/sshiftl_hi/lo.
>>        * optabs.h (enum optab_index): Add OTI_vec_widen_u/sshiftl_hi/lo.
>>        * genopinit.c (optabs): Initialize the new optabs.
>>        * expr.c (expand_expr_real_2): Handle
>>        VEC_WIDEN_SHIFT_LEFT_HI_EXPR and VEC_WIDEN_SHIFT_LEFT_LO_EXPR.
>>        * gimple-pretty-print.c (dump_binary_rhs): Likewise.
>>        * tree-vectorizer.h (NUM_PATTERNS): Increase to 6.
>>        * tree.def (WIDEN_SHIFT_LEFT_EXPR, VEC_WIDEN_SHIFT_LEFT_HI_EXPR,
>>        VEC_WIDEN_SHIFT_LEFT_LO_EXPR): New.
>>        * cfgexpand.c (expand_debug_expr): Handle new tree codes.
>>        * tree-vect-patterns.c (vect_vect_recog_func_ptrs): Add
>>        vect_recog_widen_shift_pattern.
>>        (vect_handle_widen_mult_by_const): Rename...
>>        (vect_handle_widen_op_by_const): ...to this.  Handle shifts.
>>        Add a new argument, update documentation.
>>        (vect_recog_widen_mult_pattern): Assume that only second
>>        operand can be constant.  Update call to
>>        vect_handle_widen_op_by_const.
>>        (vect_operation_fits_smaller_type): Add the already existing
>>        def stmt to the list of pattern statements.
>>        (vect_recog_widen_shift_pattern): New.
>>        * tree-vect-stmts.c (vectorizable_type_promotion): Handle
>>        widening shifts.
>>        (supportable_widening_operation): Likewise.
>>        * tree-inline.c (estimate_operator_cost): Handle new tree codes.
>>        * tree-vect-generic.c (expand_vector_operations_1): Likewise.
>>        * tree-cfg.c (verify_gimple_assign_binary): Likewise.
>>        * config/arm/neon.md (neon_vec_<US>shiftl_lo_<mode>): New.
>>        (vec_widen_<US>shiftl_lo_<mode>, neon_vec_<US>shiftl_hi_<mode>,
>>        vec_widen_<US>shiftl_hi_<mode>, neon_vec_<US>shift_left_<mode>):
>>        Likewise.
>>        * tree-vect-slp.c (vect_build_slp_tree): Require same shift operand
>>        for widening shift.
>>
>> testsuite/ChangeLog:
>>
>>       * gcc.dg/vect/vect-widen-shift-s16.c: New.
>>       * gcc.dg/vect/vect-widen-shift-s8.c: New.
>>       * gcc.dg/vect/vect-widen-shift-u16.c: New.
>>       * gcc.dg/vect/vect-widen-shift-u8.c: New.
>>
>
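
For readers unfamiliar with the operation, here is a minimal C sketch (not part of the patch) of the kind of scalar loop the widening-shift pattern is meant to recognize, followed by a reference model of the lo/hi split. The function names, the shift amount, and the lane numbering (lo = elements 0..7, matching the !BYTES_BIG_ENDIAN guard on the Q-register expanders) are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

#define SHIFT 3 /* hypothetical constant shift amount */

/* Scalar loop: each 8-bit element is widened to 16 bits and then
   shifted left by a constant.  */
void widen_shift_u8(const uint8_t *src, uint16_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint16_t)((uint16_t)src[i] << SHIFT);
}

/* Reference model of the "lo" operation: widen and shift the low half
   (elements 0..7, assuming little-endian lane numbering) of a
   16-element input vector.  */
void widen_shift_lo_u8(const uint8_t in[16], uint16_t out[8])
{
    for (size_t i = 0; i < 8; i++)
        out[i] = (uint16_t)((uint16_t)in[i] << SHIFT);
}

/* Reference model of the "hi" operation: the same, applied to the high
   half (elements 8..15).  */
void widen_shift_hi_u8(const uint8_t in[16], uint16_t out[8])
{
    for (size_t i = 0; i < 8; i++)
        out[i] = (uint16_t)((uint16_t)in[i + 8] << SHIFT);
}
```

Under these assumptions, one 16-element iteration of the scalar loop corresponds to one lo and one hi operation, each producing eight 16-bit results.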
Patch

Index: config/arm/neon.md
===================================================================
--- config/arm/neon.md  (revision 178942)
+++ config/arm/neon.md  (working copy)
@@ -5550,6 +5550,46 @@ 
  }
 )

+(define_insn "neon_vec_<US>shiftl_<mode>"
+ [(set (match_operand:<V_widen> 0 "register_operand" "=w")
+       (SE:<V_widen> (match_operand:VW 1 "register_operand" "w")))
+       (match_operand:SI 2 "immediate_operand" "i")]
+  "TARGET_NEON"
+{
+  /* The boundaries are: 0 < imm <= size.  */
+  neon_const_bounds (operands[2], 0, neon_element_bits (<MODE>mode) + 1);
+  return "vshll.<US><V_sz_elem> %q0, %P1, %2";
+}
+  [(set_attr "neon_type" "neon_shift_1")]
+)
+
+(define_expand "vec_widen_<US>shiftl_lo_<mode>"
+  [(match_operand:<V_unpack> 0 "register_operand" "")
+   (SE:<V_unpack> (match_operand:VU 1 "register_operand" ""))
+   (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON && !BYTES_BIG_ENDIAN"
+ {
+  emit_insn (gen_neon_vec_<US>shiftl_<V_half> (operands[0],
+               simplify_gen_subreg (<V_HALF>mode, operands[1], <MODE>mode, 0),
+               operands[2]));
+   DONE;
+ }
+)
+
+(define_expand "vec_widen_<US>shiftl_hi_<mode>"
+  [(match_operand:<V_unpack> 0 "register_operand" "")
+   (SE:<V_unpack> (match_operand:VU 1 "register_operand" ""))
+   (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON && !BYTES_BIG_ENDIAN"
+ {
+  emit_insn (gen_neon_vec_<US>shiftl_<V_half> (operands[0],
+                simplify_gen_subreg (<V_HALF>mode, operands[1], <MODE>mode,
+                                    GET_MODE_SIZE (<V_HALF>mode)),
+                operands[2]));
+   DONE;
+ }
+)
+
 ;; Vectorize for non-neon-quad case
 (define_insn "neon_unpack<US>_<mode>"
  [(set (match_operand:<V_widen> 0 "register_operand" "=w")
@@ -5626,6 +5666,34 @@ 
  }
 )

+(define_expand "vec_widen_<US>shiftl_hi_<mode>"
+ [(match_operand:<V_double_width> 0 "register_operand" "")
+   (SE:<V_double_width> (match_operand:VDI 1 "register_operand" ""))
+   (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON"
+ {
+   rtx tmpreg = gen_reg_rtx (<V_widen>mode);
+   emit_insn (gen_neon_vec_<US>shiftl_<mode> (tmpreg, operands[1], operands[2]));
+   emit_insn (gen_neon_vget_high<V_widen_l> (operands[0], tmpreg));
+
+   DONE;
+ }
+)
+
+(define_expand "vec_widen_<US>shiftl_lo_<mode>"
+  [(match_operand:<V_double_width> 0 "register_operand" "")
+   (SE:<V_double_width> (match_operand:VDI 1 "register_operand" ""))
+   (match_operand:SI 2 "immediate_operand" "i")]
+ "TARGET_NEON"
+ {
+   rtx tmpreg = gen_reg_rtx (<V_widen>mode);
+   emit_insn (gen_neon_vec_<US>shiftl_<mode> (tmpreg, operands[1], operands[2]));
+   emit_insn (gen_neon_vget_low<V_widen_l> (operands[0], tmpreg));
+
+   DONE;
+ }
+)
+
 ; FIXME: These instruction patterns can't be used safely in big-endian mode