From patchwork Fri Oct 26 05:04:02 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Prathamesh Kulkarni
X-Patchwork-Id: 149554
From: Prathamesh Kulkarni
Date: Fri, 26 Oct 2018 10:34:02 +0530
Subject: [ARM] Implement division using vrecpe, vrecps
To: gcc Patches, Kyrill Tkachov, Ramana Radhakrishnan

Hi,
This is a rebased version of the patch that adds a pattern to neon.md
for implementing division as multiplication by the reciprocal, computed
with vrecpe/vrecps, under -funsafe-math-optimizations and excluding -Os.

The newly added test cases are not vectorized on the armeb target with -O2.
I posted the analysis for that here:
https://gcc.gnu.org/ml/gcc-patches/2016-05/msg01765.html

Briefly, the difference between the little-endian and big-endian
vectorizer behavior is in arm_builtin_support_vector_misalignment(),
which calls default_builtin_support_vector_misalignment() in the
big-endian case, and that returns false because movmisalign_optab does
not exist for V2SF mode. This isn't observed with -O3 because loop
peeling for alignment gets enabled.

It seems that the test cases in the patch appear unsupported on armeb
after r221677, so this patch requires no changes to target-supports.exp
to adjust for armeb (unlike last time, which stalled the patch).

Bootstrapped and tested on arm-linux-gnueabihf.
Cross-tested on arm*-*-* variants.
OK for trunk?

Thanks,
Prathamesh

2018-10-26  Prathamesh Kulkarni

	* config/arm/neon.md (div<mode>3): New pattern.

testsuite/
	* gcc.target/arm/neon-vect-div-1.c: New test.
	* gcc.target/arm/neon-vect-div-2.c: Likewise.

diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 5aeee4b08c1..25ed45d381a 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -620,6 +620,38 @@
 		    (const_string "neon_mul_<V_elem_ch><q>")))]
 )
 
+/* Perform division using multiply-by-reciprocal.
+   Reciprocal is calculated using Newton-Raphson method.
+   Enabled with -funsafe-math-optimizations -freciprocal-math
+   and disabled for -Os since it increases code size.  */
+
+(define_expand "div<mode>3"
+  [(set (match_operand:VCVTF 0 "s_register_operand" "=w")
+        (div:VCVTF (match_operand:VCVTF 1 "s_register_operand" "w")
+		  (match_operand:VCVTF 2 "s_register_operand" "w")))]
+  "TARGET_NEON && !optimize_size
+   && flag_unsafe_math_optimizations && flag_reciprocal_math"
+  {
+    rtx rec = gen_reg_rtx (<MODE>mode);
+    rtx vrecps_temp = gen_reg_rtx (<MODE>mode);
+
+    /* Reciprocal estimate.  */
+    emit_insn (gen_neon_vrecpe<mode> (rec, operands[2]));
+
+    /* Perform 2 iterations of Newton-Raphson method.  */
+    for (int i = 0; i < 2; i++)
+      {
+	emit_insn (gen_neon_vrecps<mode> (vrecps_temp, rec, operands[2]));
+	emit_insn (gen_mul<mode>3 (rec, rec, vrecps_temp));
+      }
+
+    /* We now have reciprocal in rec, perform operands[0] = operands[1] * rec.  */
+    emit_insn (gen_mul<mode>3 (operands[0], operands[1], rec));
+    DONE;
+  }
+)
+
+
 (define_insn "mul<mode>3add<mode>_neon"
   [(set (match_operand:VDQW 0 "s_register_operand" "=w")
         (plus:VDQW (mult:VDQW (match_operand:VDQW 2 "s_register_operand" "w")
diff --git a/gcc/testsuite/gcc.target/arm/neon-vect-div-1.c b/gcc/testsuite/gcc.target/arm/neon-vect-div-1.c
new file mode 100644
index 00000000000..50d04b4175b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-vect-div-1.c
@@ -0,0 +1,16 @@
+/* Test pattern div<mode>3.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-require-effective-target vect_hw_misalign } */
+/* { dg-options "-O2 -ftree-vectorize -funsafe-math-optimizations -fdump-tree-vect-details" } */
+/* { dg-add-options arm_neon } */
+
+void
+foo (int len, float * __restrict p, float *__restrict x)
+{
+  len = len & ~31;
+  for (int i = 0; i < len; i++)
+    p[i] = p[i] / x[i];
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
diff --git a/gcc/testsuite/gcc.target/arm/neon-vect-div-2.c b/gcc/testsuite/gcc.target/arm/neon-vect-div-2.c
new file mode 100644
index 00000000000..606f54b4e0e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-vect-div-2.c
@@ -0,0 +1,16 @@
+/* Test pattern div<mode>3.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-require-effective-target vect_hw_misalign } */
+/* { dg-options "-O3 -ftree-vectorize -funsafe-math-optimizations -fdump-tree-vect-details -fno-reciprocal-math" } */
+/* { dg-add-options arm_neon } */
+
+void
+foo (int len, float * __restrict p, float *__restrict x)
+{
+  len = len & ~31;
+  for (int i = 0; i < len; i++)
+    p[i] = p[i] / x[i];
+}
+
+/* { dg-final { scan-tree-dump-not "vectorized 1 loops" "vect" } } */
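
For reference, the arithmetic performed by the expander in the patch can
be modelled per lane in scalar C. This is only a rough sketch under
stated assumptions: recip_estimate, recip_step, div_by_recip and
rel_error are hypothetical helper names, not NEON intrinsics or GCC
functions, and the 1% error in the estimate is an arbitrary stand-in
for the coarse result of VRECPE. The step function models VRECPS as
2 - a * b, so each iteration computes rec = rec * (2 - d * rec), which
roughly squares the relative error.

```c
#include <assert.h>  /* used by the usage example below */
#include <math.h>

/* Hypothetical scalar stand-in for VRECPE: a deliberately coarse
   reciprocal estimate (the exact 1/d perturbed by 1%).  The real
   instruction does not work this way; this is only a model.  */
static float
recip_estimate (float d)
{
  return (1.0f / d) * 1.01f;
}

/* Scalar model of VRECPS: the Newton-Raphson step factor 2 - a * b.  */
static float
recip_step (float a, float b)
{
  return 2.0f - a * b;
}

/* Mirrors the emitted sequence: estimate, two refinement steps,
   then one multiply by the numerator.  */
static float
div_by_recip (float n, float d)
{
  float rec = recip_estimate (d);

  for (int i = 0; i < 2; i++)
    rec = rec * recip_step (rec, d);

  return n * rec;
}

/* Relative error of the approximation against true division.  */
static float
rel_error (float n, float d)
{
  return fabsf (div_by_recip (n, d) - n / d) / fabsf (n / d);
}
```

With the assumed 1% starting error, a check such as
assert (rel_error (10.0f, 4.0f) < 1e-5f) should hold after the two
iterations. The result is still an approximation of n / d rather than a
correctly rounded quotient, which is consistent with the pattern being
gated on -funsafe-math-optimizations.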