From patchwork Wed Nov 23 19:02:57 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bernd Schmidt
X-Patchwork-Id: 83745
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Subject: [3/3] Fix PR78120, in ifcvt/rtlanal/i386.
To: GCC Patches
From: Bernd Schmidt
Message-ID: <36753dee-a20d-7c91-da3d-2394b29727e1@redhat.com>
Date: Wed, 23 Nov 2016 20:02:57 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0

On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
> 3. ifcvt computes the sum of costs for the involved blocks, but only
>    makes a before/after comparison when optimizing for size. When
>    optimizing for speed, it uses max_seq_cost, which is an estimate
>    computed from BRANCH_COST, which in turn can be zero for predictable
>    branches on x86.

This is the final patch and has the testcase. It also happens to be the
least risky of the series, so it could be applied on its own (without the
test).


Bernd

	PR rtl-optimization/78120
	* ifcvt.c (noce_conversion_profitable_p): Check original cost in all
	cases, and additionally test against max_seq_cost for speed
	optimization.
	(noce_process_if_block): Compute an estimate for the original cost when
	optimizing for speed, using the minimum of then and else block costs.

	PR rtl-optimization/78120
	* gcc.target/i386/pr78120.c: New test.

Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c	(revision 242038)
+++ gcc/ifcvt.c	(working copy)
@@ -812,8 +812,10 @@ struct noce_if_info
      we're optimizing for size.  */
   bool speed_p;
 
-  /* The combined cost of COND, JUMP and the costs for THEN_BB and
-     ELSE_BB.  */
+  /* An estimate of the original costs.  When optimizing for size, this is the
+     combined cost of COND, JUMP and the costs for THEN_BB and ELSE_BB.
+     When optimizing for speed, we use the costs of COND plus the minimum of
+     the costs for THEN_BB and ELSE_BB, as computed in the next field.  */
   unsigned int original_cost;
 
   /* Maximum permissible cost for the unconditional sequence we should
@@ -852,12 +857,12 @@ noce_conversion_profitable_p (rtx_insn *
   /* Cost up the new sequence.  */
   unsigned int cost = seq_cost (seq, speed_p);
 
+  if (cost <= if_info->original_cost)
+    return true;
+
   /* When compiling for size, we can make a reasonably accurately guess
-     at the size growth.  */
-  if (!speed_p)
-    return cost <= if_info->original_cost;
-  else
-    return cost <= if_info->max_seq_cost;
+     at the size growth.  When compiling for speed, use the maximum.  */
+  return speed_p && cost <= if_info->max_seq_cost;
 }
 
 /* Helper function for noce_try_store_flag*.  */
@@ -3441,15 +3446,24 @@ noce_process_if_block (struct noce_if_in
         }
     }
 
-  if (! bb_valid_for_noce_process_p (then_bb, cond, &if_info->original_cost,
+  bool speed_p = optimize_bb_for_speed_p (test_bb);
+  unsigned int then_cost = 0, else_cost = 0;
+  if (!bb_valid_for_noce_process_p (then_bb, cond, &then_cost,
                                     &if_info->then_simple))
     return false;
 
   if (else_bb
-      && ! bb_valid_for_noce_process_p (else_bb, cond, &if_info->original_cost,
-                                      &if_info->else_simple))
+      && !bb_valid_for_noce_process_p (else_bb, cond, &else_cost,
+                                       &if_info->else_simple))
     return false;
 
+  if (else_bb == NULL)
+    if_info->original_cost += then_cost;
+  else if (speed_p)
+    if_info->original_cost += MIN (then_cost, else_cost);
+  else
+    if_info->original_cost += then_cost + else_cost;
+
   insn_a = last_active_insn (then_bb, FALSE);
   set_a = single_set (insn_a);
   gcc_assert (set_a);
Index: gcc/testsuite/gcc.target/i386/pr78120.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr78120.c	(nonexistent)
+++ gcc/testsuite/gcc.target/i386/pr78120.c	(working copy)
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune=generic" } */
+/* { dg-final { scan-assembler "adc" } } */
+/* { dg-final { scan-assembler-not "jmp" } } */
+
+typedef unsigned long u64;
+
+typedef struct {
+  u64 hi, lo;
+} u128;
+
+static inline u128 add_u128 (u128 a, u128 b)
+{
+  a.lo += b.lo;
+  if (a.lo < b.lo)
+    a.hi++;
+
+  return a;
+}
+
+extern u128 t1, t2, t3;
+
+void foo (void)
+{
+  t1 = add_u128 (t1, t2);
+  t1 = add_u128 (t1, t3);
+}
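
For illustration only, the cost logic of the ifcvt.c hunks can be modelled
outside the compiler roughly as follows. This is a sketch under stated
assumptions, not GCC code: struct cost_info, estimate_original_cost,
conversion_profitable_p, old_conversion_profitable_p and
run_example_checks are hypothetical names, base_cost stands for whatever
was already accumulated in original_cost before the block costs are added,
and all cost values are arbitrary numbers chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the noce_if_info fields the patch uses.  */
struct cost_info
{
  bool speed_p;                 /* true when optimizing for speed */
  unsigned int original_cost;   /* estimate for the unconverted code */
  unsigned int max_seq_cost;    /* cap used when optimizing for speed */
};

/* Models the estimate added to noce_process_if_block: without an else
   block add the then cost; for speed add the cheaper of the two blocks;
   for size add both.  BASE_COST is an assumed, already computed part.  */
unsigned int
estimate_original_cost (unsigned int base_cost, bool has_else, bool speed_p,
                        unsigned int then_cost, unsigned int else_cost)
{
  if (!has_else)
    return base_cost + then_cost;
  if (speed_p)
    return base_cost + (then_cost < else_cost ? then_cost : else_cost);
  return base_cost + then_cost + else_cost;
}

/* Models the check before the patch: for speed, only max_seq_cost counts.  */
bool
old_conversion_profitable_p (const struct cost_info *info, unsigned int cost)
{
  if (!info->speed_p)
    return cost <= info->original_cost;
  return cost <= info->max_seq_cost;
}

/* Models the check after the patch: a sequence no more expensive than the
   original estimate is always accepted; max_seq_cost is an additional
   allowance that applies only when optimizing for speed.  */
bool
conversion_profitable_p (const struct cost_info *info, unsigned int cost)
{
  if (cost <= info->original_cost)
    return true;
  return info->speed_p && cost <= info->max_seq_cost;
}

/* Self-check with made-up numbers: with max_seq_cost == 0, a new sequence
   of cost 8 against an original estimate of 12 is rejected by the old
   check and accepted by the new one.  */
void
run_example_checks (void)
{
  struct cost_info info = { true, 0, 0 };
  info.original_cost = estimate_original_cost (4, true, true, 8, 12);
  assert (info.original_cost == 12);
  assert (!old_conversion_profitable_p (&info, 8));
  assert (conversion_profitable_p (&info, 8));
}
```

The case exercised in run_example_checks corresponds to the situation
described in the quoted text, where the speed limit derived from
BRANCH_COST is zero and the conversion would otherwise be refused even
though the new sequence is cheaper than the estimate for the original
code.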