From patchwork Fri Feb 2 14:12:09 2018
From: Richard Sandiford <richard.sandiford@linaro.org>
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@linaro.org
Subject: Use nonzero bits to refine range in split_constant_offset (PR 81635)
Date: Fri, 02 Feb 2018 14:12:09 +0000
Message-ID: <87a7wra3ba.fsf@linaro.org>

This patch is part 2 of the fix for PR 81635.  It means that
split_constant_offset can handle loops like:

  for (unsigned int i = 0; i < n; i += 4)
    {
      a[i] = ...;
      a[i + 1] = ...;
    }

CCP records that "i" must have its low 2 bits clear, but we don't
include this information in the range of "i", which remains [0, +INF].

I tried making set_nonzero_bits update the range info in the same way
that set_range_info updates the nonzero bits, but it regressed cases
like vrp117.c and made some other tests worse.  vrp117.c has a
multiplication by 10, so CCP can infer that the low bit of the result
is clear.  If we included that in the range, the range would go from
[-INF, +INF] to [-INF, not-quite-+INF].  However, the multiplication
is also known to overflow in all cases, so VRP saturates the result
to [INT_MAX, INT_MAX].  This obviously creates a contradiction with
the nonzero bits, and intersecting the new saturated range with an
existing not-quite-+INF range would make us drop to VR_UNDEFINED.
We're prepared to fold a comparison with an [INT_MAX, INT_MAX] value
but not with a VR_UNDEFINED value.

The other problems were created when intersecting [-INF, not-quite-+INF]
with a useful VR_ANTI_RANGE like ~[-1, 1].  The intersection would keep
the former range rather than the latter.

The patch therefore keeps the adjustment local to split_constant_offset
for now, but adds a helper routine so that it's easy to move this later.
Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
OK to install?

Richard


2018-02-02  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	PR tree-optimization/81635
	* wide-int.h (wi::round_down_for_mask, wi::round_up_for_mask):
	Declare.
	* wide-int.cc (wi::round_down_for_mask, wi::round_up_for_mask)
	(test_round_for_mask): New functions.
	(wide_int_cc_tests): Call test_round_for_mask.
	* tree-vrp.h (intersect_range_with_nonzero_bits): Declare.
	* tree-vrp.c (intersect_range_with_nonzero_bits): New function.
	* tree-data-ref.c (split_constant_offset_1): Use it to refine
	the range returned by get_range_info.

gcc/testsuite/
	PR tree-optimization/81635
	* gcc.dg/vect/bb-slp-pr81635-3.c: New test.
	* gcc.dg/vect/bb-slp-pr81635-4.c: Likewise.

Index: gcc/wide-int.h
===================================================================
--- gcc/wide-int.h	2018-02-02 14:03:53.964530009 +0000
+++ gcc/wide-int.h	2018-02-02 14:03:54.185521788 +0000
@@ -3308,6 +3308,8 @@ gt_pch_nx (trailing_wide_ints *, voi
   wide_int set_bit_in_zero (unsigned int, unsigned int);
   wide_int insert (const wide_int &x, const wide_int &y, unsigned int,
		   unsigned int);
+  wide_int round_down_for_mask (const wide_int &, const wide_int &);
+  wide_int round_up_for_mask (const wide_int &, const wide_int &);
 
   template <typename T>
   T mask (unsigned int, bool);
Index: gcc/wide-int.cc
===================================================================
--- gcc/wide-int.cc	2018-02-02 14:03:53.964530009 +0000
+++ gcc/wide-int.cc	2018-02-02 14:03:54.185521788 +0000
@@ -2132,6 +2132,70 @@ wi::only_sign_bit_p (const wide_int_ref
   return only_sign_bit_p (x, x.precision);
 }
 
+/* Return VAL if VAL has no bits set outside MASK.  Otherwise round VAL
+   down to the previous value that has no bits set outside MASK.
+   This rounding wraps for signed values if VAL is negative and
+   the top bit of MASK is clear.
+
+   For example, round_down_for_mask (6, 0xf1) would give 1 and
+   round_down_for_mask (24, 0xf1) would give 17.  */
+
+wide_int
+wi::round_down_for_mask (const wide_int &val, const wide_int &mask)
+{
+  /* Get the bits in VAL that are outside the mask.  */
+  wide_int extra_bits = wi::bit_and_not (val, mask);
+  if (extra_bits == 0)
+    return val;
+
+  /* Get a mask that includes the top bit in EXTRA_BITS and is all 1s
+     below that bit.  */
+  unsigned int precision = val.get_precision ();
+  wide_int lower_mask = wi::mask (precision - wi::clz (extra_bits),
+				  false, precision);
+
+  /* Clear the bits that aren't in MASK, but ensure that all bits
+     in MASK below the top cleared bit are set.  */
+  return (val & mask) | (mask & lower_mask);
+}
+
+/* Return VAL if VAL has no bits set outside MASK.  Otherwise round VAL
+   up to the next value that has no bits set outside MASK.  The rounding
+   wraps if there are no suitable values greater than VAL.
+
+   For example, round_up_for_mask (6, 0xf1) would give 16 and
+   round_up_for_mask (24, 0xf1) would give 32.  */
+
+wide_int
+wi::round_up_for_mask (const wide_int &val, const wide_int &mask)
+{
+  /* Get the bits in VAL that are outside the mask.  */
+  wide_int extra_bits = wi::bit_and_not (val, mask);
+  if (extra_bits == 0)
+    return val;
+
+  /* Get a mask that is all 1s above the top bit in EXTRA_BITS.  */
+  unsigned int precision = val.get_precision ();
+  wide_int upper_mask = wi::mask (precision - wi::clz (extra_bits),
+				  true, precision);
+
+  /* Get the bits of the mask that are above the top bit in EXTRA_BITS.  */
+  upper_mask &= mask;
+
+  /* Conceptually we need to:
+
+     - clear bits of VAL outside UPPER_MASK
+     - add the lowest bit in UPPER_MASK to VAL (or add 0 if UPPER_MASK is 0)
+     - propagate the carry through the bits of VAL in UPPER_MASK
+
+     If (~VAL & UPPER_MASK) is nonzero, the carry eventually
+     reaches that bit and the process leaves all lower bits clear.
+     If (~VAL & UPPER_MASK) is zero then the result is also zero.  */
+  wide_int tmp = wi::bit_and_not (upper_mask, val);
+
+  return (val | tmp) & -tmp;
+}
+
 /*
  * Private utilities.
  */
@@ -2384,6 +2448,53 @@ test_overflow ()
     }
 }
 
+/* Test the round_{down,up}_for_mask functions.  */
+
+static void
+test_round_for_mask ()
+{
+  unsigned int prec = 18;
+  ASSERT_EQ (17, wi::round_down_for_mask (wi::shwi (17, prec),
+					  wi::shwi (0xf1, prec)));
+  ASSERT_EQ (17, wi::round_up_for_mask (wi::shwi (17, prec),
+					wi::shwi (0xf1, prec)));
+
+  ASSERT_EQ (1, wi::round_down_for_mask (wi::shwi (6, prec),
+					 wi::shwi (0xf1, prec)));
+  ASSERT_EQ (16, wi::round_up_for_mask (wi::shwi (6, prec),
+					wi::shwi (0xf1, prec)));
+
+  ASSERT_EQ (17, wi::round_down_for_mask (wi::shwi (24, prec),
+					  wi::shwi (0xf1, prec)));
+  ASSERT_EQ (32, wi::round_up_for_mask (wi::shwi (24, prec),
+					wi::shwi (0xf1, prec)));
+
+  ASSERT_EQ (0x011, wi::round_down_for_mask (wi::shwi (0x22, prec),
+					     wi::shwi (0x111, prec)));
+  ASSERT_EQ (0x100, wi::round_up_for_mask (wi::shwi (0x22, prec),
+					   wi::shwi (0x111, prec)));
+
+  ASSERT_EQ (100, wi::round_down_for_mask (wi::shwi (101, prec),
+					   wi::shwi (0xfc, prec)));
+  ASSERT_EQ (104, wi::round_up_for_mask (wi::shwi (101, prec),
+					 wi::shwi (0xfc, prec)));
+
+  ASSERT_EQ (0x2bc, wi::round_down_for_mask (wi::shwi (0x2c2, prec),
+					     wi::shwi (0xabc, prec)));
+  ASSERT_EQ (0x800, wi::round_up_for_mask (wi::shwi (0x2c2, prec),
+					   wi::shwi (0xabc, prec)));
+
+  ASSERT_EQ (0xabc, wi::round_down_for_mask (wi::shwi (0xabd, prec),
+					     wi::shwi (0xabc, prec)));
+  ASSERT_EQ (0, wi::round_up_for_mask (wi::shwi (0xabd, prec),
+				       wi::shwi (0xabc, prec)));
+
+  ASSERT_EQ (0xabc, wi::round_down_for_mask (wi::shwi (0x1000, prec),
+					     wi::shwi (0xabc, prec)));
+  ASSERT_EQ (0, wi::round_up_for_mask (wi::shwi (0x1000, prec),
+				       wi::shwi (0xabc, prec)));
+}
+
 /* Run all of the selftests within this file, for all value types.  */
 
 void
@@ -2393,6 +2504,7 @@ wide_int_cc_tests ()
   run_all_wide_int_tests <offset_int> ();
   run_all_wide_int_tests <widest_int> ();
   test_overflow ();
+  test_round_for_mask ();
 }
 
 } // namespace selftest
Index: gcc/tree-vrp.h
===================================================================
--- gcc/tree-vrp.h	2018-02-02 14:03:53.964530009 +0000
+++ gcc/tree-vrp.h	2018-02-02 14:03:54.184521826 +0000
@@ -61,6 +61,8 @@ extern void extract_range_from_unary_expr
					      tree op0_type);
 
 extern bool vrp_operand_equal_p (const_tree, const_tree);
+extern enum value_range_type intersect_range_with_nonzero_bits
+  (enum value_range_type, wide_int *, wide_int *, const wide_int &, signop);
 
 struct assert_info
 {
Index: gcc/tree-vrp.c
===================================================================
--- gcc/tree-vrp.c	2018-02-02 14:03:53.964530009 +0000
+++ gcc/tree-vrp.c	2018-02-02 14:03:54.184521826 +0000
@@ -171,6 +171,53 @@ vrp_val_is_min (const_tree val)
	  && operand_equal_p (val, type_min, 0)));
 }
 
+/* VR_TYPE describes a range with minimum value *MIN and maximum
+   value *MAX.  Restrict the range to the set of values that have
+   no bits set outside NONZERO_BITS.  Update *MIN and *MAX and
+   return the new range type.
+
+   SGN gives the sign of the values described by the range.  */
+
+enum value_range_type
+intersect_range_with_nonzero_bits (enum value_range_type vr_type,
+				   wide_int *min, wide_int *max,
+				   const wide_int &nonzero_bits,
+				   signop sgn)
+{
+  if (vr_type == VR_RANGE)
+    {
+      *max = wi::round_down_for_mask (*max, nonzero_bits);
+
+      /* Check that the range contains at least one valid value.  */
+      if (wi::gt_p (*min, *max, sgn))
+	return VR_UNDEFINED;
+
+      *min = wi::round_up_for_mask (*min, nonzero_bits);
+      gcc_checking_assert (wi::le_p (*min, *max, sgn));
+    }
+  if (vr_type == VR_ANTI_RANGE)
+    {
+      *max = wi::round_up_for_mask (*max, nonzero_bits);
+
+      /* If the calculation wrapped, we now have a VR_RANGE whose
+	 lower bound is *MAX and whose upper bound is *MIN.  */
+      if (wi::gt_p (*min, *max, sgn))
+	{
+	  std::swap (*min, *max);
+	  *max = wi::round_down_for_mask (*max, nonzero_bits);
+	  gcc_checking_assert (wi::le_p (*min, *max, sgn));
+	  return VR_RANGE;
+	}
+
+      *min = wi::round_down_for_mask (*min, nonzero_bits);
+      gcc_checking_assert (wi::le_p (*min, *max, sgn));
+
+      /* Check whether we now have an empty set of values.  */
+      if (*min - 1 == *max)
+	return VR_UNDEFINED;
+    }
+  return vr_type;
+}
 
 /* Set value range VR to VR_UNDEFINED.  */
Index: gcc/tree-data-ref.c
===================================================================
--- gcc/tree-data-ref.c	2018-02-02 14:03:53.964530009 +0000
+++ gcc/tree-data-ref.c	2018-02-02 14:03:54.184521826 +0000
@@ -721,7 +721,13 @@ split_constant_offset_1 (tree type, tree
	if (TREE_CODE (tmp_var) != SSA_NAME)
	  return false;
	wide_int var_min, var_max;
-	if (get_range_info (tmp_var, &var_min, &var_max) != VR_RANGE)
+	value_range_type vr_type = get_range_info (tmp_var, &var_min,
+						   &var_max);
+	wide_int var_nonzero = get_nonzero_bits (tmp_var);
+	signop sgn = TYPE_SIGN (itype);
+	if (intersect_range_with_nonzero_bits (vr_type, &var_min,
+					       &var_max, var_nonzero,
+					       sgn) != VR_RANGE)
	  return false;
 
	/* See whether the range of OP0 (i.e. TMP_VAR + TMP_OFF)
@@ -729,7 +735,6 @@ split_constant_offset_1 (tree type, tree
	   operations done in ITYPE.  The addition must overflow
	   at both ends of the range or at neither.  */
	bool overflow[2];
-	signop sgn = TYPE_SIGN (itype);
	unsigned int prec = TYPE_PRECISION (itype);
	wide_int woff = wi::to_wide (tmp_off, prec);
	wide_int op0_min = wi::add (var_min, woff, sgn, &overflow[0]);
Index: gcc/testsuite/gcc.dg/vect/bb-slp-pr81635-3.c
===================================================================
--- /dev/null	2018-02-02 09:03:36.168354735 +0000
+++ gcc/testsuite/gcc.dg/vect/bb-slp-pr81635-3.c	2018-02-02 14:03:54.183521863 +0000
@@ -0,0 +1,62 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fno-tree-loop-vectorize" } */
+/* { dg-require-effective-target vect_double } */
+/* { dg-require-effective-target lp64 } */
+
+void
+f1 (double *p, double *q, unsigned int n)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = 0; i < n; i += 4)
+    {
+      double a = q[i] + p[i];
+      double b = q[i + 1] + p[i + 1];
+      q[i] = a;
+      q[i + 1] = b;
+    }
+}
+
+void
+f2 (double *p, double *q, unsigned int n)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = 0; i < n; i += 2)
+    {
+      double a = q[i] + p[i];
+      double b = q[i + 1] + p[i + 1];
+      q[i] = a;
+      q[i + 1] = b;
+    }
+}
+
+void
+f3 (double *p, double *q, unsigned int n)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = 0; i < n; i += 6)
+    {
+      double a = q[i] + p[i];
+      double b = q[i + 1] + p[i + 1];
+      q[i] = a;
+      q[i + 1] = b;
+    }
+}
+
+void
+f4 (double *p, double *q, unsigned int start, unsigned int n)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = start & -2; i < n; i += 2)
+    {
+      double a = q[i] + p[i];
+      double b = q[i + 1] + p[i + 1];
+      q[i] = a;
+      q[i + 1] = b;
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 4 "slp1" } } */
Index: gcc/testsuite/gcc.dg/vect/bb-slp-pr81635-4.c
===================================================================
--- /dev/null	2018-02-02 09:03:36.168354735 +0000
+++ gcc/testsuite/gcc.dg/vect/bb-slp-pr81635-4.c	2018-02-02 14:03:54.183521863 +0000
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fno-tree-loop-vectorize" } */
+/* { dg-require-effective-target lp64 } */
+
+void
+f1 (double *p, double *q, unsigned int n)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = 0; i < n; i += 1)
+    {
+      double a = q[i] + p[i];
+      double b = q[i + 1] + p[i + 1];
+      q[i] = a;
+      q[i + 1] = b;
+    }
+}
+
+void
+f2 (double *p, double *q, unsigned int n)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = 0; i < n; i += 3)
+    {
+      double a = q[i] + p[i];
+      double b = q[i + 1] + p[i + 1];
+      q[i] = a;
+      q[i + 1] = b;
+    }
+}
+
+void
+f3 (double *p, double *q, unsigned int start, unsigned int n)
+{
+  p = (double *) __builtin_assume_aligned (p, sizeof (double) * 2);
+  q = (double *) __builtin_assume_aligned (q, sizeof (double) * 2);
+  for (unsigned int i = start; i < n; i += 2)
+    {
+      double a = q[i] + p[i];
+      double b = q[i + 1] + p[i + 1];
+      q[i] = a;
+      q[i + 1] = b;
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "basic block vectorized" "slp1" } } */