From patchwork Wed Jan 31 15:06:53 2018
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 126364
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@linaro.org
Subject: Use range info in split_constant_offset (PR 81635)
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux)
Date: Wed, 31 Jan 2018 15:06:53 +0000
Message-ID: <874ln25ape.fsf@linaro.org>
MIME-Version: 1.0

This patch implements the original suggestion for fixing PR 81635:
use range info in split_constant_offset to see whether a conversion
of a wrapping type can be split.  The range info problem described in:

  https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01002.html

seems to have been fixed.

The patch is part 1.  There needs to be a follow-on patch to handle:

  for (unsigned int i = 0; i < n; i += 4)
    {
      ...[i + 2]...
      ...[i + 3]...

which the old SCEV test handles, but which the range check doesn't.
At the moment we record that the low two bits of "i" are clear,
but we still end up with a maximum range of 0xffffffff rather than
0xfffffffc.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
Also tested by comparing the before and after testsuite assembly output
for at least one target per CPU directory.  Excluding a small number
of register renamings on some targets, there were two differences:

(1) In gcc.c-torture/compile/pr55350.c:

      void
      foo (__INTPTR_TYPE__ x, __INTPTR_TYPE__ y)
      {
        int i;
        void **a = (void *) (8UL * (x / 8UL));
        for (i = 0; i < x; i++)
          a[i] = (void *) y;
      }

    we previously kept base "a" and offset 0, but now use
    "(void **) _n" (where _n holds the multiplication result).
    This is because the old test had the side-effect of prohibiting
    casts from unsigned ints to pointers of the same size.  What we do
    for code like this isn't going to help much though.

(2) In gcc.c-torture/execute/2003016-1.c, we unrolled:

      void f (unsigned int *x)
      {
        unsigned char i;
        int j;

        i = 0x10;
        for (j = 0; j < 0x10; j++)
          {
            i += 0xe8;
            x[i] = 0;
            i -= 0xe7;
          }
      }

    and ended up with an unpropagated degenerate phi:

      # i_38 = PHI <16(3)>
      ...
      i_40 = i_38 + 1;
      ...
      i_48 = i_40 + 1;
      ... etc ...
      i_178 = i_168 + 1;
      i_16 = i_178 + 232;
      _17 = (long unsigned int) i_16;
      _18 = _17 * 4;
      _19 = &x + _18;
      *_19 = 0;

    Calling split_constant_offset on each (long unsigned int) operand
    gives i_38 + 0xe8, i_38 + 0xe9, ..., i_38 + 0xf7, with i_38
    still having the range [0x10, 0x20].  We can therefore tell
    that i_38 + 0xf0 has the range [0x00, 0x10], and similarly
    for +0xf1...+0xf7.

    We should really be folding to constants here though.

OK to install?

2018-01-31  Richard Sandiford

gcc/
	PR tree-optimization/81635
	* tree-data-ref.c (split_constant_offset_1): For types that
	wrap on overflow, try to use range info to prove that wrapping
	cannot occur.

gcc/testsuite/
	PR tree-optimization/81635
	* gcc.dg/vect/bb-slp-pr81635-1.c: New test.
	* gcc.dg/vect/bb-slp-pr81635-2.c: Likewise.

Index: gcc/tree-data-ref.c
===================================================================
--- gcc/tree-data-ref.c	2018-01-13 18:02:00.946360352 +0000
+++ gcc/tree-data-ref.c	2018-01-31 13:26:13.488630604 +0000
@@ -704,11 +704,46 @@ split_constant_offset_1 (tree type, tree
 	   and the outer precision is at least as large as the inner.  */
 	tree itype = TREE_TYPE (op0);
 	if ((POINTER_TYPE_P (itype)
-	     || (INTEGRAL_TYPE_P (itype) && TYPE_OVERFLOW_UNDEFINED (itype)))
+	     || (INTEGRAL_TYPE_P (itype) && !TYPE_OVERFLOW_TRAPS (itype)))
 	    && TYPE_PRECISION (type) >= TYPE_PRECISION (itype)
 	    && (POINTER_TYPE_P (type) || INTEGRAL_TYPE_P (type)))
 	  {
-	    split_constant_offset (op0, &var0, off);
+	    if (INTEGRAL_TYPE_P (itype) && TYPE_OVERFLOW_WRAPS (itype))
+	      {
+		/* Split the unconverted operand and try to prove that
+		   wrapping isn't a problem.  */
+		tree tmp_var, tmp_off;
+		split_constant_offset (op0, &tmp_var, &tmp_off);
+
+		/* See whether we have an SSA_NAME whose range is known
+		   to be [A, B].  */
+		if (TREE_CODE (tmp_var) != SSA_NAME)
+		  return false;
+		wide_int var_min, var_max;
+		if (get_range_info (tmp_var, &var_min, &var_max) != VR_RANGE)
+		  return false;
+
+		/* See whether the range of OP0 (i.e. TMP_VAR + TMP_OFF)
+		   is known to be [A + TMP_OFF, B + TMP_OFF], with all
+		   operations done in ITYPE.  The addition must overflow
+		   at both ends of the range or at neither.  */
+		bool overflow[2];
+		signop sgn = TYPE_SIGN (itype);
+		unsigned int prec = TYPE_PRECISION (itype);
+		wide_int woff = wi::to_wide (tmp_off, prec);
+		wide_int op0_min = wi::add (var_min, woff, sgn, &overflow[0]);
+		wi::add (var_max, woff, sgn, &overflow[1]);
+		if (overflow[0] != overflow[1])
+		  return false;
+
+		/* Calculate (ssizetype) OP0 - (ssizetype) TMP_VAR.  */
+		widest_int diff = (widest_int::from (op0_min, sgn)
+				   - widest_int::from (var_min, sgn));
+		var0 = tmp_var;
+		*off = wide_int_to_tree (ssizetype, diff);
+	      }
+	    else
+	      split_constant_offset (op0, &var0, off);
 	    *var = fold_convert (type, var0);
 	    return true;
 	  }
Index: gcc/testsuite/gcc.dg/vect/bb-slp-pr81635-1.c
===================================================================
--- /dev/null	2018-01-30 17:30:22.185477046 +0000
+++ gcc/testsuite/gcc.dg/vect/bb-slp-pr81635-1.c	2018-01-31 13:26:13.487630644 +0000
@@ -0,0 +1,92 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fno-tree-loop-vectorize" } */
+/* { dg-require-effective-target vect_double } */
+/* { dg-require-effective-target lp64 } */
+
+void
+f1 (double *p, double *q)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = 0; i < 1000; i += 4)
+    {
+      double a = q[i] + p[i];
+      double b = q[i + 1] + p[i + 1];
+      q[i] = a;
+      q[i + 1] = b;
+    }
+}
+
+void
+f2 (double *p, double *q)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = 2; i < ~0U - 4; i += 4)
+    {
+      double a = q[i] + p[i];
+      double b = q[i + 1] + p[i + 1];
+      q[i] = a;
+      q[i + 1] = b;
+    }
+}
+
+void
+f3 (double *p, double *q)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = 0; i < ~0U - 3; i += 4)
+    {
+      double a = q[i + 2] + p[i + 2];
+      double b = q[i + 3] + p[i + 3];
+      q[i + 2] = a;
+      q[i + 3] = b;
+    }
+}
+
+void
+f4 (double *p, double *q)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = 0; i < 500; i += 6)
+    for (unsigned int j = 0; j < 500; j += 4)
+      {
+        double a = q[j] + p[i];
+        double b = q[j + 1] + p[i + 1];
+        q[i] = a;
+        q[i + 1] = b;
+      }
+}
+
+void
+f5 (double *p, double *q)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = 2; i < 1000; i += 4)
+    {
+      double a = q[i - 2] + p[i - 2];
+      double b = q[i - 1] + p[i - 1];
+      q[i - 2] = a;
+      q[i - 1] = b;
+    }
+}
+
+double p[1000];
+double q[1000];
+
+void
+f6 (int n)
+{
+  for (unsigned int i = 0; i < n; i += 4)
+    {
+      double a = q[i] + p[i];
+      double b = q[i + 1] + p[i + 1];
+      q[i] = a;
+      q[i + 1] = b;
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 6 "slp1" } } */
Index: gcc/testsuite/gcc.dg/vect/bb-slp-pr81635-2.c
===================================================================
--- /dev/null	2018-01-30 17:30:22.185477046 +0000
+++ gcc/testsuite/gcc.dg/vect/bb-slp-pr81635-2.c	2018-01-31 13:26:13.487630644 +0000
@@ -0,0 +1,64 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fno-tree-loop-vectorize" } */
+/* { dg-require-effective-target lp64 } */
+
+double p[1000];
+double q[1000];
+
+void
+f1 (double *p, double *q)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = 2; i < ~0U - 4; i += 4)
+    {
+      double a = q[i + 2] + p[i + 2];
+      double b = q[i + 3] + p[i + 3];
+      q[i + 2] = a;
+      q[i + 3] = b;
+    }
+}
+
+void
+f2 (double *p, double *q)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = 0; i < ~0U - 3; i += 4)
+    {
+      double a = q[i + 4] + p[i + 4];
+      double b = q[i + 5] + p[i + 5];
+      q[i + 4] = a;
+      q[i + 5] = b;
+    }
+}
+
+void
+f3 (double *p, double *q)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = 0; i < 1000; i += 4)
+    {
+      double a = q[i - 2] + p[i - 2];
+      double b = q[i - 1] + p[i - 1];
+      q[i - 2] = a;
+      q[i - 1] = b;
+    }
+}
+
+void
+f4 (double *p, double *q)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = 2; i < 1000; i += 4)
+    {
+      double a = q[i - 4] + p[i - 4];
+      double b = q[i - 3] + p[i - 3];
+      q[i - 4] = a;
+      q[i - 3] = b;
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "basic block vectorized" "slp1" } } */
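
For illustration only (not part of the patch): a standalone C sketch of
the rule stated in the tree-data-ref.c comment, "the addition must
overflow at both ends of the range or at neither".  It assumes a 32-bit
unsigned inner type and a 64-bit signed outer type.  The helper name
split_wrapping_offset and its parameters are hypothetical and do not
exist in GCC, which performs the same check with wide_int and wi::add.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a GCC function.  VAR is known to lie in
   [var_min, var_max] and VAR + OFF is computed in wrapping 32-bit
   unsigned arithmetic.  Return true if the widened value of VAR + OFF
   always equals the widened VAR plus one constant, and store that
   constant in *DIFF.  */
static bool
split_wrapping_offset (uint32_t var_min, uint32_t var_max, uint32_t off,
                       int64_t *diff)
{
  assert (var_min <= var_max);

  /* Wrapped sums at both ends of the range.  */
  uint32_t op_min = var_min + off;
  uint32_t op_max = var_max + off;

  /* An unsigned addition wrapped iff the result is below an operand.  */
  bool overflow_min = op_min < var_min;
  bool overflow_max = op_max < var_max;

  /* If only one end wraps, no single constant describes the range.  */
  if (overflow_min != overflow_max)
    return false;

  /* Difference between the widened sum and the widened variable:
     OFF if neither end wrapped, OFF - 2^32 if both did.  */
  *diff = (int64_t) op_min - (int64_t) var_min;
  return true;
}
```

With VAR in [0, 996] and OFF 1 the sketch reports a difference of 1.
With VAR in [0x10, 0xffffffff] and OFF 1 only the upper end wraps, so it
gives up.  With VAR in [0xfffffff0, 0xffffffff] and OFF 0x10 both ends
wrap and the difference is 0x10 - 2^32.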