From patchwork Tue Sep 27 12:29:34 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ira Rosen
X-Patchwork-Id: 4378
Date: Tue, 27 Sep 2011 15:29:34 +0300
Subject: [patch] Support multiple types in SLP
From: Ira Rosen
To: gcc-patches@gcc.gnu.org
Cc: Patch Tracking

Hi,

This patch adds support for multiple types (in the same SLP instance) in
basic block vectorization.

Bootstrapped and tested on powerpc64-suse-linux.
Applied to trunk.

Ira

ChangeLog:

	* tree-vect-stmts.c (vectorizable_type_demotion): Handle basic block
	vectorization.
	(vectorizable_type_promotion): Likewise.
	(vect_analyze_stmt): Call vectorizable_type_demotion and
	vectorizable_type_promotion for basic blocks.
	(supportable_widening_operation): Don't assume loop vectorization.
	* tree-vect-slp.c (vect_build_slp_tree): Allow multiple types for
	basic blocks. Update vectorization factor for basic block
	vectorization.
	(vect_analyze_slp_instance): Allow multiple types for basic block
	vectorization. Recheck unrolling factor after construction of SLP
	instance.

testsuite/ChangeLog:

	* gcc.dg/vect/bb-slp-11.c: Expect to get vectorized with 64-bit
	vectors.
	* gcc.dg/vect/bb-slp-27.c: New.
	* gcc.dg/vect/bb-slp-28.c: New.

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 179266)
+++ ChangeLog	(working copy)
@@ -1,3 +1,18 @@
+2011-09-27  Ira Rosen
+
+	* tree-vect-stmts.c (vectorizable_type_demotion): Handle basic block
+	vectorization.
+	(vectorizable_type_promotion): Likewise.
+	(vect_analyze_stmt): Call vectorizable_type_demotion and
+	vectorizable_type_promotion for basic blocks.
+	(supportable_widening_operation): Don't assume loop vectorization.
+	* tree-vect-slp.c (vect_build_slp_tree): Allow multiple types for
+	basic blocks. Update vectorization factor for basic block
+	vectorization.
+	(vect_analyze_slp_instance): Allow multiple types for basic block
+	vectorization. Recheck unrolling factor after construction of SLP
+	instance.
+
 2011-09-27  Richard Guenther
 
 	* tree-object-size.c (compute_object_sizes): Fix dumping of
Index: testsuite/gcc.dg/vect/bb-slp-27.c
===================================================================
--- testsuite/gcc.dg/vect/bb-slp-27.c	(revision 0)
+++ testsuite/gcc.dg/vect/bb-slp-27.c	(revision 0)
@@ -0,0 +1,49 @@
+/* { dg-require-effective-target vect_int } */
+
+#include
+#include "tree-vect.h"
+
+#define A 3
+#define N 16
+
+short src[N], dst[N];
+
+void foo (int a)
+{
+  dst[0] += a*src[0];
+  dst[1] += a*src[1];
+  dst[2] += a*src[2];
+  dst[3] += a*src[3];
+  dst[4] += a*src[4];
+  dst[5] += a*src[5];
+  dst[6] += a*src[6];
+  dst[7] += a*src[7];
+}
+
+
+int main (void)
+{
+  int i;
+
+  check_vect ();
+
+  for (i = 0; i < N; i++)
+    {
+      dst[i] = 0;
+      src[i] = i;
+    }
+
+  foo (A);
+
+  for (i = 0; i < 8; i++)
+    {
+      if (dst[i] != A * i)
+        abort ();
+    }
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target { vect_int_mult && vect_unpack && vect_pack_trunc } } } } */
+/* { dg-final { cleanup-tree-dump "slp" } } */
+
Index: testsuite/gcc.dg/vect/bb-slp-28.c
===================================================================
--- testsuite/gcc.dg/vect/bb-slp-28.c	(revision 0)
+++ testsuite/gcc.dg/vect/bb-slp-28.c	(revision 0)
@@ -0,0 +1,71 @@
+/* { dg-require-effective-target vect_int } */
+
+#include
+#include "tree-vect.h"
+
+#define A 300
+#define N 16
+
+char src[N];
+short dst[N];
+short src1[N], dst1[N];
+
+void foo (int a)
+{
+  dst[0] = (short) (a * (int) src[0]);
+  dst[1] = (short) (a * (int) src[1]);
+  dst[2] = (short) (a * (int) src[2]);
+  dst[3] = (short) (a * (int) src[3]);
+  dst[4] = (short) (a * (int) src[4]);
+  dst[5] = (short) (a * (int) src[5]);
+  dst[6] = (short) (a * (int) src[6]);
+  dst[7] = (short) (a * (int) src[7]);
+  dst[8] = (short) (a * (int) src[8]);
+  dst[9] = (short) (a * (int) src[9]);
+  dst[10] = (short) (a * (int) src[10]);
+  dst[11] = (short) (a * (int) src[11]);
+  dst[12] = (short) (a * (int) src[12]);
+  dst[13] = (short) (a * (int) src[13]);
+  dst[14] = (short) (a * (int) src[14]);
+  dst[15] = (short) (a * (int) src[15]);
+
+  dst1[0] += src1[0];
+  dst1[1] += src1[1];
+  dst1[2] += src1[2];
+  dst1[3] += src1[3];
+  dst1[4] += src1[4];
+  dst1[5] += src1[5];
+  dst1[6] += src1[6];
+  dst1[7] += src1[7];
+}
+
+
+int main (void)
+{
+  int i;
+
+  check_vect ();
+
+  for (i = 0; i < N; i++)
+    {
+      dst[i] = 2;
+      dst1[i] = 0;
+      src[i] = i;
+      src1[i] = i+2;
+    }
+
+  foo (A);
+
+  for (i = 0; i < N; i++)
+    {
+      if (dst[i] != A * i
+          || (i < N/2 && dst1[i] != i + 2))
+        abort ();
+    }
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target { vect_int_mult && vect_pack_trunc && vect_unpack } } } } */
+/* { dg-final { cleanup-tree-dump "slp" } } */
+
Index: testsuite/gcc.dg/vect/bb-slp-11.c
===================================================================
--- testsuite/gcc.dg/vect/bb-slp-11.c	(revision 179266)
+++ testsuite/gcc.dg/vect/bb-slp-11.c	(working copy)
@@ -48,8 +48,6 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 0 "slp" } } */
-/* { dg-final { scan-tree-dump-times "SLP with multiple types" 1 "slp" { xfail vect_multiple_sizes } } } */
-/* { dg-final { scan-tree-dump-times "SLP with multiple types" 2 "slp" { target vect_multiple_sizes } } } */
+/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect64 } } } */
 /* { dg-final { cleanup-tree-dump "slp" } } */
 
Index: testsuite/ChangeLog
===================================================================
--- testsuite/ChangeLog	(revision 179266)
+++ testsuite/ChangeLog	(working copy)
@@ -1,3 +1,10 @@
+2011-09-27  Ira Rosen
+
+	* gcc.dg/vect/bb-slp-11.c: Expect to get vectorized with 64-bit
+	vectors.
+	* gcc.dg/vect/bb-slp-27.c: New.
+	* gcc.dg/vect/bb-slp-28.c: New.
+
 2011-09-27  Bernd Schmidt
 
 	* testsuite/lib/target-supports.exp (check_profiling_available):
Index: tree-vect-stmts.c
===================================================================
--- tree-vect-stmts.c	(revision 179266)
+++ tree-vect-stmts.c	(working copy)
@@ -3039,11 +3039,9 @@ vectorizable_type_demotion (gimple stmt, gimple_st
   VEC (tree, heap) *vec_oprnds0 = NULL;
   VEC (tree, heap) *vec_dsts = NULL, *interm_types = NULL, *tmp_vec_dsts = NULL;
   tree last_oprnd, intermediate_type;
+  bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
 
-  /* FORNOW: not supported by basic block SLP vectorization. */
-  gcc_assert (loop_vinfo);
-
-  if (!STMT_VINFO_RELEVANT_P (stmt_info))
+  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
     return false;
 
   if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def)
@@ -3071,7 +3069,7 @@ vectorizable_type_demotion (gimple stmt, gimple_st
            && SCALAR_FLOAT_TYPE_P (TREE_TYPE (op0))
            && CONVERT_EXPR_CODE_P (code))))
     return false;
-  if (!vect_is_simple_use_1 (op0, loop_vinfo, NULL,
+  if (!vect_is_simple_use_1 (op0, loop_vinfo, bb_vinfo,
                              &def_stmt, &def, &dt[0], &vectype_in))
     {
       if (vect_print_dump_info (REPORT_DETAILS))
@@ -3318,11 +3316,9 @@ vectorizable_type_promotion (gimple stmt, gimple_s
   int multi_step_cvt = 0;
   VEC (tree, heap) *vec_oprnds0 = NULL, *vec_oprnds1 = NULL;
   VEC (tree, heap) *vec_dsts = NULL, *interm_types = NULL, *tmp_vec_dsts = NULL;
+  bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
 
-  /* FORNOW: not supported by basic block SLP vectorization. */
-  gcc_assert (loop_vinfo);
-
-  if (!STMT_VINFO_RELEVANT_P (stmt_info))
+  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
     return false;
 
   if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def)
@@ -3351,7 +3347,7 @@ vectorizable_type_promotion (gimple stmt, gimple_s
            && SCALAR_FLOAT_TYPE_P (TREE_TYPE (op0))
            && CONVERT_EXPR_CODE_P (code))))
     return false;
-  if (!vect_is_simple_use_1 (op0, loop_vinfo, NULL,
+  if (!vect_is_simple_use_1 (op0, loop_vinfo, bb_vinfo,
                              &def_stmt, &def, &dt[0], &vectype_in))
     {
       if (vect_print_dump_info (REPORT_DETAILS))
@@ -5083,7 +5079,9 @@ vect_analyze_stmt (gimple stmt, bool *need_to_vect
   else
     {
       if (bb_vinfo)
-        ok = (vectorizable_shift (stmt, NULL, NULL, node)
+        ok = (vectorizable_type_promotion (stmt, NULL, NULL, node)
+              || vectorizable_type_demotion (stmt, NULL, NULL, node)
+              || vectorizable_shift (stmt, NULL, NULL, node)
               || vectorizable_operation (stmt, NULL, NULL, node)
               || vectorizable_assignment (stmt, NULL, NULL, node)
               || vectorizable_load (stmt, NULL, NULL, node, NULL)
@@ -5719,7 +5717,7 @@ supportable_widening_operation (enum tree_code cod
 {
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   loop_vec_info loop_info = STMT_VINFO_LOOP_VINFO (stmt_info);
-  struct loop *vect_loop = LOOP_VINFO_LOOP (loop_info);
+  struct loop *vect_loop = NULL;
   bool ordered_p;
   enum machine_mode vec_mode;
   enum insn_code icode1, icode2;
@@ -5728,6 +5726,9 @@ supportable_widening_operation (enum tree_code cod
   tree wide_vectype = vectype_out;
   enum tree_code c1, c2;
 
+  if (loop_info)
+    vect_loop = LOOP_VINFO_LOOP (loop_info);
+
   /* The result of a vectorized widening operation usually requires two vectors
      (because the widened results do not fit int one vector). The generated
      vector results would normally be expected to be generated in the same
@@ -5748,7 +5749,8 @@ supportable_widening_operation (enum tree_code cod
      iterations in parallel). We therefore don't allow to change the order
      of the computation in the inner-loop during outer-loop vectorization. */
 
-  if (STMT_VINFO_RELEVANT (stmt_info) == vect_used_by_reduction
+  if (vect_loop
+      && STMT_VINFO_RELEVANT (stmt_info) == vect_used_by_reduction
       && !nested_in_vect_loop_p (vect_loop, stmt))
     ordered_p = false;
   else
Index: tree-vect-slp.c
===================================================================
--- tree-vect-slp.c	(revision 179266)
+++ tree-vect-slp.c	(working copy)
@@ -393,20 +393,15 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, bb_
           return false;
         }
 
-      ncopies = vectorization_factor / TYPE_VECTOR_SUBPARTS (vectype);
-      if (ncopies != 1)
+      /* In case of multiple types we need to detect the smallest type. */
+      if (*max_nunits < TYPE_VECTOR_SUBPARTS (vectype))
         {
-          if (vect_print_dump_info (REPORT_SLP))
-            fprintf (vect_dump, "SLP with multiple types ");
-
-          /* FORNOW: multiple types are unsupported in BB SLP. */
-          if (bb_vinfo)
-            return false;
+          *max_nunits = TYPE_VECTOR_SUBPARTS (vectype);
+          if (bb_vinfo)
+            vectorization_factor = *max_nunits;
         }
 
-      /* In case of multiple types we need to detect the smallest type. */
-      if (*max_nunits < TYPE_VECTOR_SUBPARTS (vectype))
-        *max_nunits = TYPE_VECTOR_SUBPARTS (vectype);
+      ncopies = vectorization_factor / TYPE_VECTOR_SUBPARTS (vectype);
 
       if (is_gimple_call (stmt))
         rhs_code = CALL_EXPR;
@@ -1201,7 +1196,6 @@ vect_analyze_slp_instance (loop_vec_info loop_vinf
   if (loop_vinfo)
     vectorization_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
   else
-    /* No multitypes in BB SLP. */
     vectorization_factor = nunits;
 
   /* Calculate the unrolling factor. */
@@ -1257,16 +1251,23 @@ vect_analyze_slp_instance (loop_vec_info loop_vinf
                            &max_nunits, &load_permutation, &loads,
                            vectorization_factor))
     {
+      /* Calculate the unrolling factor based on the smallest type. */
+      if (max_nunits > nunits)
+        unrolling_factor = least_common_multiple (max_nunits, group_size)
+                           / group_size;
+
+      if (unrolling_factor != 1 && !loop_vinfo)
+        {
+          if (vect_print_dump_info (REPORT_SLP))
+            fprintf (vect_dump, "Build SLP failed: unrolling required in basic"
+                                " block SLP");
+          return false;
+        }
+
+
       /* Create a new SLP instance. */
       new_instance = XNEW (struct _slp_instance);
       SLP_INSTANCE_TREE (new_instance) = node;
       SLP_INSTANCE_GROUP_SIZE (new_instance) = group_size;
-      /* Calculate the unrolling factor based on the smallest type in the
-         loop. */
-      if (max_nunits > nunits)
-        unrolling_factor = least_common_multiple (max_nunits, group_size)
-                           / group_size;
-
       SLP_INSTANCE_UNROLLING_FACTOR (new_instance) = unrolling_factor;
       SLP_INSTANCE_OUTSIDE_OF_LOOP_COST (new_instance) = outside_cost;
       SLP_INSTANCE_INSIDE_OF_LOOP_COST (new_instance) = inside_cost;