From patchwork Tue Jun 30 01:15:21 2015
X-Patchwork-Submitter: Jim Wilson
X-Patchwork-Id: 50441
Date: Mon, 29 Jun 2015 18:15:21 -0700
Subject: [PATCH, ARM] stop changing signedness in PROMOTE_MODE
From: Jim Wilson
To: "gcc-patches@gcc.gnu.org"
X-Original-Sender: jim.wilson@linaro.org
This is my suggested fix for PR 65932, which is a Linux kernel miscompile with gcc-5.1.

The problem is caused by a chain of events. First, the relatively new eipa_sra pass creates fake parameters that behave slightly differently than normal parameters. Second, the optimizer creates phi nodes that copy local variables to fake parameters and/or vice versa. Third, the out-of-ssa pass assumes that it can emit simple move instructions for these phi nodes. Fourth, the ARM port has a PROMOTE_MODE macro that forces QImode and HImode to unsigned, but a TARGET_PROMOTE_FUNCTION_MODE hook that does not. As a result, signed char and short parameters have different in-register representations than local variables, and copying between them requires a conversion, one that the out-of-ssa pass can't easily emit.

Ultimately, I think this is a problem in the ARM backend: it should not have a PROMOTE_MODE macro that changes the sign of char and short local variables. I also think we should merge the PROMOTE_MODE macro with the TARGET_PROMOTE_FUNCTION_MODE hook to prevent this from happening again.

I see four general problems with the current ARM PROMOTE_MODE definition.

1) Unsigned char is only faster for armv5 and earlier, before the sxtb instruction was added. It is a loss for armv6 and later.

2) Unsigned short was only faster for targets that don't support unaligned accesses. Support for those targets was removed a while ago, and this PROMOTE_MODE hunk should have been removed at the same time; it was accidentally left behind.

3) TARGET_PROMOTE_FUNCTION_MODE used to be a boolean hook. When it was converted to a function, the PROMOTE_MODE code was copied without the UNSIGNEDP changes, so it is only by accident that TARGET_PROMOTE_FUNCTION_MODE and PROMOTE_MODE disagree. Changing TARGET_PROMOTE_FUNCTION_MODE is an ABI change, so only PROMOTE_MODE changes are safe for resolving the difference.
4) There is a general principle that you should only change signedness in PROMOTE_MODE if the hardware forces it, as otherwise this results in extra conversion instructions that make code slower. The mips64 hardware, for instance, requires that 32-bit values be sign-extended regardless of type, and instructions may trap if this is not true. However, it has a set of 32-bit instructions that operate on these values, so no conversions are required. There is no similar case on ARM, so the conversions are unnecessary and unwise. This can be seen in the testcases, where gcc emits both a zero-extend and a sign-extend inside a loop: the sign-extend is required for a compare, and the zero-extend is required by PROMOTE_MODE.

My change was tested with an arm bootstrap, make check, and a SPEC CPU2000 run. The original poster verified that this gives a Linux kernel that boots correctly.

The PROMOTE_MODE change causes three testsuite testcases to fail. These are tests that verify that smulbb and/or smlabb are generated. Eliminating the unnecessary sign conversions produces better code that doesn't include the smulbb and smlabb instructions, so I had to modify the testcases to get them to emit the desired instructions. With the testcase changes there are no additional testsuite failures, though I'm concerned that the modified testcases may be fragile, and future changes may break them again. If there are ARM parts where smulbb/smlabb are faster than mul/mla, then maybe we should add new patterns to get the instructions emitted again for the unmodified testcases.

Jim

gcc/
2015-06-29  Jim Wilson  <jim.wilson@linaro.org>

	PR target/65932
	* config/arm/arm.h (PROMOTE_MODE): Don't set UNSIGNEDP for QImode
	and HImode.

gcc/testsuite/
2015-06-29  Jim Wilson  <jim.wilson@linaro.org>

	PR target/65932
	* gcc.target/arm/wmul-1.c (mac): Change a and b to int pointers.
	Cast multiply operands to short.
	* gcc.target/arm/wmul-2.c (vec_mpy): Cast multiply result to short.
	* gcc.target/arm/wmul-3.c (mac): Cast multiply results to short.

Index: config/arm/arm.h
===================================================================
--- config/arm/arm.h	(revision 224672)
+++ config/arm/arm.h	(working copy)
@@ -523,16 +523,10 @@ extern int arm_arch_crc;
    type, but kept valid in the wider mode.  The signedness of the
    extension may differ from that of the type.  */

-/* It is far faster to zero extend chars than to sign extend them */
-
 #define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE)	\
   if (GET_MODE_CLASS (MODE) == MODE_INT	\
       && GET_MODE_SIZE (MODE) < 4)		\
     {						\
-      if (MODE == QImode)			\
-	UNSIGNEDP = 1;				\
-      else if (MODE == HImode)		\
-	UNSIGNEDP = 1;				\
      (MODE) = SImode;				\
    }

Index: testsuite/gcc.target/arm/wmul-1.c
===================================================================
--- testsuite/gcc.target/arm/wmul-1.c	(revision 224672)
+++ testsuite/gcc.target/arm/wmul-1.c	(working copy)
@@ -2,14 +2,14 @@
 /* { dg-require-effective-target arm_dsp } */
 /* { dg-options "-O1 -fexpensive-optimizations" } */

-int mac(const short *a, const short *b, int sqr, int *sum)
+int mac(const int *a, const int *b, int sqr, int *sum)
 {
   int i;
   int dotp = *sum;

   for (i = 0; i < 150; i++)
     {
-      dotp += b[i] * a[i];
-      sqr += b[i] * b[i];
+      dotp += (short)b[i] * (short)a[i];
+      sqr += (short)b[i] * (short)b[i];
     }

   *sum = dotp;

Index: testsuite/gcc.target/arm/wmul-2.c
===================================================================
--- testsuite/gcc.target/arm/wmul-2.c	(revision 224672)
+++ testsuite/gcc.target/arm/wmul-2.c	(working copy)
@@ -7,7 +7,7 @@ void vec_mpy(int y[], const short x[], s
   int i;

   for (i = 0; i < 150; i++)
-    y[i] += ((scaler * x[i]) >> 31);
+    y[i] += ((short)(scaler * x[i]) >> 31);
 }

 /* { dg-final { scan-assembler-times "smulbb" 1 } } */

Index: testsuite/gcc.target/arm/wmul-3.c
===================================================================
--- testsuite/gcc.target/arm/wmul-3.c	(revision 224672)
+++ testsuite/gcc.target/arm/wmul-3.c	(working copy)
@@ -8,8 +8,8 @@ int mac(const short *a, const short *b,
   int dotp = *sum;

   for (i = 0; i < 150; i++)
     {
-      dotp -= b[i] * a[i];
-      sqr -= b[i] * b[i];
+      dotp -= (short)(b[i] * a[i]);
+      sqr -= (short)(b[i] * b[i]);
     }

   *sum = dotp;