From patchwork Mon Jul 4 14:23:13 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrew Stubbs X-Patchwork-Id: 2437 Return-Path: X-Original-To: patchwork@peony.canonical.com Delivered-To: patchwork@peony.canonical.com Received: from fiordland.canonical.com (fiordland.canonical.com [91.189.94.145]) by peony.canonical.com (Postfix) with ESMTP id 23D2B23F53 for ; Mon, 4 Jul 2011 14:23:25 +0000 (UTC) Received: from mail-qw0-f52.google.com (mail-qw0-f52.google.com [209.85.216.52]) by fiordland.canonical.com (Postfix) with ESMTP id CC0EFA18231 for ; Mon, 4 Jul 2011 14:23:24 +0000 (UTC) Received: by qwb8 with SMTP id 8so3563984qwb.11 for ; Mon, 04 Jul 2011 07:23:24 -0700 (PDT) Received: by 10.229.1.140 with SMTP id 12mr4713809qcf.118.1309789404220; Mon, 04 Jul 2011 07:23:24 -0700 (PDT) X-Forwarded-To: linaro-patchwork@canonical.com X-Forwarded-For: patch@linaro.org linaro-patchwork@canonical.com Delivered-To: patches@linaro.org Received: by 10.229.48.135 with SMTP id r7cs29354qcf; Mon, 4 Jul 2011 07:23:23 -0700 (PDT) Received: by 10.42.74.135 with SMTP id w7mr3494389icj.506.1309789402053; Mon, 04 Jul 2011 07:23:22 -0700 (PDT) Received: from mail.codesourcery.com (mail.codesourcery.com [38.113.113.100]) by mx.google.com with ESMTPS id f10si18998679icx.78.2011.07.04.07.23.20 (version=TLSv1/SSLv3 cipher=OTHER); Mon, 04 Jul 2011 07:23:20 -0700 (PDT) Received-SPF: pass (google.com: domain of ams@codesourcery.com designates 38.113.113.100 as permitted sender) client-ip=38.113.113.100; Authentication-Results: mx.google.com; spf=pass (google.com: domain of ams@codesourcery.com designates 38.113.113.100 as permitted sender) smtp.mail=ams@codesourcery.com Received: (qmail 26787 invoked from network); 4 Jul 2011 14:23:17 -0000 Received: from unknown (HELO ?192.168.0.100?) (ams@127.0.0.2) by mail.codesourcery.com with ESMTPA; 4 Jul 2011 14:23:17 -0000 Message-ID: <4E11CCD1.4010505@codesourcery.com> Date: Mon, 04 Jul 2011 15:23:13 +0100 From: Andrew Stubbs User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110516 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: Richard Guenther CC: Michael Matz , gcc-patches@gcc.gnu.org, patches@linaro.org Subject: Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching References: <4E034EF2.3070503@codesourcery.com> <4E03504B.9060305@codesourcery.com> <4E044559.5000105@linaro.org> <1A77B5B39081C241A68E6CF16983025F020906F6@EU1-MAIL.mgc.mentorg.com> <4E09B142.4020402@codesourcery.com> <4E09FDEA.3000004@gmail.com> <1A77B5B39081C241A68E6CF16983025F0209071D@EU1-MAIL.mgc.mentorg.com> In-Reply-To: On 01/07/11 13:25, Richard Guenther wrote: > Well - some operations work the same on both signedness if you > just care about the twos-complement result. This includes > multiplication (but not for example division). For this special > case I suggest to not bother trying to invent a generic predicate > but do something local in tree-ssa-math-opts.c. OK, here's my updated patch. I've taken the view that we *know* what size and signedness the result of the multiplication is, and we know what size the input to the addition must be, so all the check has to do is make sure it does that same conversion, even if by a roundabout means. What I hadn't grasped before is that when extending a value it's the source type that is significant, not the destination, so the checks are not as complex as I had thought. So, this patch adds a test to ensure that: 1. the type is not truncated so far that we lose any information; and 2. the type is only ever extended in the proper signedness. Also, just to be absolutely sure, I've also added a little bit of logic to permit extends that are then undone by a truncate. I'm really not sure what guarantees there are about what sort of cast sequences can exist? Is this necessary? I haven't managed to coax it to generated any examples of extends followed by truncates myself, but in any case, it's hardly any code and it'll make sure it's future proofed. OK? Andrew 2011-06-28 Andrew Stubbs gcc/ * tree-ssa-math-opts.c (valid_types_for_madd_p): New function. (convert_plusminus_to_widen): Use valid_types_for_madd_p to identify optimization candidates. gcc/testsuite/ * gcc.target/arm/wmul-5.c: New file. * gcc.target/arm/no-wmla-1.c: New file. --- .../gcc/testsuite/gcc.target/arm/no-wmla-1.c | 11 ++ .../gcc/testsuite/gcc.target/arm/wmul-5.c | 10 ++ src/gcc-mainline/gcc/tree-ssa-math-opts.c | 112 ++++++++++++++++++-- 3 files changed, 123 insertions(+), 10 deletions(-) create mode 100644 src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c create mode 100644 src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c diff --git a/src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c new file mode 100644 index 0000000..17f7427 --- /dev/null +++ b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=armv7-a" } */ + +int +foo (int a, short b, short c) +{ + int bc = b * c; + return a + (short)bc; +} + +/* { dg-final { scan-assembler "mul" } } */ diff --git a/src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c new file mode 100644 index 0000000..65c43e3 --- /dev/null +++ b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=armv7-a" } */ + +long long +foo (long long a, char *b, char *c) +{ + return a + *b * *c; +} + +/* { dg-final { scan-assembler "umlal" } } */ diff --git a/src/gcc-mainline/gcc/tree-ssa-math-opts.c b/src/gcc-mainline/gcc/tree-ssa-math-opts.c index d55ba57..5ef7bb4 100644 --- a/src/gcc-mainline/gcc/tree-ssa-math-opts.c +++ b/src/gcc-mainline/gcc/tree-ssa-math-opts.c @@ -2085,6 +2085,78 @@ convert_mult_to_widen (gimple stmt) return true; } +/* Check the input types, TYPE1 and TYPE2 to a widening multiply, + and then the convertions between the output of the multiply, and + the input to an addition EXPR, to ensure that they are compatible with + a widening multiply-and-accumulate. + + This function assumes that expr is a valid string of conversion expressions + terminated by a multiplication. + + This function tries NOT to make any (fragile) assumptions about what + sequence of conversions can exist in the input. */ + +static bool +valid_types_for_madd_p (tree type1, tree type2, tree expr) +{ + gimple stmt, prev_stmt; + enum tree_code code, prev_code; + tree prev_expr, type, prev_type; + int bitsize, prev_bitsize, initial_bitsize, min_bitsize; + bool initial_unsigned; + + initial_bitsize = TYPE_PRECISION (type1) + TYPE_PRECISION (type2); + initial_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2); + + stmt = SSA_NAME_DEF_STMT (expr); + code = gimple_assign_rhs_code (stmt); + type = TREE_TYPE (expr); + bitsize = TYPE_PRECISION (type); + min_bitsize = bitsize; + + if (code == MULT_EXPR || code == WIDEN_MULT_EXPR) + return true; + + if (!INTEGRAL_TYPE_P (type) + || TYPE_PRECISION (type) < initial_bitsize) + return false; + + /* Step through the conversions backwards. */ + while (true) + { + prev_expr = gimple_assign_rhs1 (stmt); + prev_stmt = SSA_NAME_DEF_STMT (prev_expr); + prev_code = gimple_assign_rhs_code (prev_stmt); + prev_type = TREE_TYPE (prev_expr); + prev_bitsize = TYPE_PRECISION (prev_type); + + if (prev_code == MULT_EXPR || prev_code == WIDEN_MULT_EXPR) + break; + + /* If it's an unsuitable type or a truncate that damages the + original value, then were done. */ + if (!INTEGRAL_TYPE_P (prev_type) + || TYPE_PRECISION (prev_type) < initial_bitsize) + return false; + + /* If we have the wrong sort of extend for the value, then it + could still be ok if we already saw a truncate that reverses + the effect. */ + if (bitsize > prev_bitsize + && TYPE_UNSIGNED (prev_type) != initial_unsigned + && min_bitsize > prev_bitsize) + return false; + + stmt = prev_stmt; + code = prev_code; + type = prev_type; + bitsize = prev_bitsize; + min_bitsize = bitsize < min_bitsize ? bitsize : min_bitsize; + } + + return true; +} + /* Process a single gimple statement STMT, which is found at the iterator GSI and has a either a PLUS_EXPR or a MINUS_EXPR as its rhs (given by CODE), and try to convert it into a @@ -2098,6 +2170,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt, gimple rhs1_stmt = NULL, rhs2_stmt = NULL; tree type, type1, type2; tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs; + tree tmp, mult_rhs; enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK; optab this_optab; enum tree_code wmult_code; @@ -2117,22 +2190,32 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt, rhs1 = gimple_assign_rhs1 (stmt); rhs2 = gimple_assign_rhs2 (stmt); - if (TREE_CODE (rhs1) == SSA_NAME) + for (tmp = rhs1, rhs1_code = ERROR_MARK; + TREE_CODE (tmp) == SSA_NAME + && (CONVERT_EXPR_CODE_P (rhs1_code) || rhs1_code == ERROR_MARK); + tmp = gimple_assign_rhs1 (rhs1_stmt)) { - rhs1_stmt = SSA_NAME_DEF_STMT (rhs1); - if (is_gimple_assign (rhs1_stmt)) - rhs1_code = gimple_assign_rhs_code (rhs1_stmt); + rhs1_stmt = SSA_NAME_DEF_STMT (tmp); + if (!is_gimple_assign (rhs1_stmt)) + break; + rhs1_code = gimple_assign_rhs_code (rhs1_stmt); } - else + + if (TREE_CODE (tmp) != SSA_NAME) return false; - if (TREE_CODE (rhs2) == SSA_NAME) + for (tmp = rhs2, rhs2_code = ERROR_MARK; + TREE_CODE (tmp) == SSA_NAME + && (CONVERT_EXPR_CODE_P (rhs2_code) || rhs2_code == ERROR_MARK); + tmp = gimple_assign_rhs1 (rhs2_stmt)) { - rhs2_stmt = SSA_NAME_DEF_STMT (rhs2); - if (is_gimple_assign (rhs2_stmt)) - rhs2_code = gimple_assign_rhs_code (rhs2_stmt); + rhs2_stmt = SSA_NAME_DEF_STMT (tmp); + if (!is_gimple_assign (rhs2_stmt)) + break; + rhs2_code = gimple_assign_rhs_code (rhs2_stmt); } - else + + if (TREE_CODE (tmp) != SSA_NAME) return false; if (code == PLUS_EXPR && rhs1_code == MULT_EXPR) @@ -2140,6 +2223,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt, if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1, &type2, &mult_rhs2)) return false; + mult_rhs = rhs1; add_rhs = rhs2; } else if (rhs2_code == MULT_EXPR) @@ -2147,6 +2231,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt, if (!is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1, &type2, &mult_rhs2)) return false; + mult_rhs = rhs2; add_rhs = rhs1; } else if (code == PLUS_EXPR && rhs1_code == WIDEN_MULT_EXPR) @@ -2155,6 +2240,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt, mult_rhs2 = gimple_assign_rhs2 (rhs1_stmt); type1 = TREE_TYPE (mult_rhs1); type2 = TREE_TYPE (mult_rhs2); + mult_rhs = rhs1; add_rhs = rhs2; } else if (rhs2_code == WIDEN_MULT_EXPR) @@ -2163,6 +2249,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt, mult_rhs2 = gimple_assign_rhs2 (rhs2_stmt); type1 = TREE_TYPE (mult_rhs1); type2 = TREE_TYPE (mult_rhs2); + mult_rhs = rhs2; add_rhs = rhs1; } else @@ -2171,6 +2258,11 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt, if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2)) return false; + /* Verify that the convertions between the mult and the add doesn't do + anything unexpected. */ + if (!valid_types_for_madd_p (type1, type2, mult_rhs)) + return false; + /* Verify that the machine can perform a widening multiply accumulate in this mode/signedness combination, otherwise this transformation is likely to pessimize code. */