From patchwork Tue May 16 08:02:08 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 99852 Delivered-To: patch@linaro.org Received: by 10.140.96.100 with SMTP id j91csp1899110qge; Tue, 16 May 2017 01:02:46 -0700 (PDT) X-Received: by 10.98.82.77 with SMTP id g74mr10652518pfb.115.1494921766446; Tue, 16 May 2017 01:02:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1494921766; cv=none; d=google.com; s=arc-20160816; b=TxCzJ05oTGfnpb929Z/bNrDmMNOXKHHXdjO1/dWM9m0AqftbnvBuVtWM1RgWF/yQGh hEtyF6VmNAyxgz14RS6EltEoAz5wcGLkch2jhGgdsb3G0Zj90O1NlXj2yh3NgKitjDOX jfYFx/APng9ANtpQZG4AooKjR6ib/EyopYIYaR1kJQvjt9mkwNlEj8atbkANx9TxH8nb k6ihK9Cj1BzlG2Bor6b/KSSSWMTe9KDdjmZvjqtW4by5ZGN5mqHfmgxvG+ASkyFe5Tnz bVM+oaBMM/6QcG0rYyJZW+iyH4oJJnlIoABdAxNBFWWaCb2bnuifDHgcoDA6a+2t6zlr P2fA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=mime-version:user-agent:message-id:date:subject:mail-followup-to:to :from:delivered-to:sender:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:mailing-list:dkim-signature :domainkey-signature:arc-authentication-results; bh=xCBzu+x4z4OmTennaqI7moVOwPnsp9LhfwsQe26E6Uc=; b=s5IkqCpr5NItYoFafrdERSHTYtdFsstOz37izQRPGNuF32kX7SzltUvtJwRvvhlkWM Ot36EaTsM7akubuiSodzH9d8/0AL9SsMnmqngSQNAGr0w5UkgW6HQ4E3PgL0X7erPJ+h BE3wubp0QrA7J+bTtLiWPwOCV8lYWC2pzBKGgUdT7oO+jKlWjffAMCTWv5+Fxn3iJBqQ Rw+c9wGHhnU3ENjUxPJo+lMEVOn05rwdivBUYQ1/W6Yo69oaihQ8n+e3EGbyUGDWgs5z wtAF5jMCzeR1DrhzTwrH746345j/5crCrMPAKp2rRPKDwYuVssJkDIrb+ftpmc/WTreb cxnw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org; spf=pass (google.com: domain of gcc-patches-return-453743-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-453743-patch=linaro.org@gcc.gnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id w4si12824180pls.38.2017.05.16.01.02.46 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 16 May 2017 01:02:46 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-return-453743-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org; spf=pass (google.com: domain of gcc-patches-return-453743-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-453743-patch=linaro.org@gcc.gnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:mime-version:content-type; q=dns; s= default; b=dqjyWMm1IDAZKF+G+H0Qb/RZ8wteqUKMrIpyM7ph1R+issY+kmsDv Fr3JxZEehvSwTaHYyBLzIYs5DhJKAKVxR9Es5Mrp6YEzE4S+vxA/b2+dGfK8sEZ+ 7TMBmrze2dCpTRPGkcTMgYLSlNLwJoCcvdofRidvpS57Fb9AxPDMXg= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:mime-version:content-type; s= default; bh=yCxCoVdMHhUURuT4mZL0DMn7SLE=; b=Ad7EW1a48jLrnLnHXQXl Ikf6nJvaD3SwGOg2AxREGDlBXEUHROhvcaK8EXl1WzZhql6BSv31BDM5f1vQeLel lSrwIszQdiplJww/Q1N37JIOSBK6LoI3bppELQoGVTeRjrDxdo9L5nBi67a9Srbr 58GBOPJ62CC2BQGThHezW1Q= Received: (qmail 121126 invoked by alias); 16 May 2017 08:02:16 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 120838 invoked by uid 89); 16 May 2017 08:02:13 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-10.7 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS, RCVD_IN_DNSWL_NONE, RCVD_IN_SORBS_SPAM, SPF_PASS autolearn=ham version=3.3.2 spammy=Statement X-HELO: mail-wm0-f45.google.com Received: from mail-wm0-f45.google.com (HELO mail-wm0-f45.google.com) (74.125.82.45) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 16 May 2017 08:02:10 +0000 Received: by mail-wm0-f45.google.com with SMTP id u65so33609083wmu.1 for ; Tue, 16 May 2017 01:02:12 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:mail-followup-to:subject:date:message-id :user-agent:mime-version; bh=xCBzu+x4z4OmTennaqI7moVOwPnsp9LhfwsQe26E6Uc=; b=IndUJ1lqBUnUBT7CHUpu6ZO2gVOojbIJuDdNLRbFYH2C7T7awUri/GRtTyBOmIv0KI iftmz+oIIeu4qwPcGoCGhiRIHJm+dXFMsRRzXmPvfka/BgaYEIecTx1YQH5QnOT9wtLN vlvbW6wVzaotAJmWbuDo3jPbzJ2FfpvYHHdetnid3wjQmhsX2rVf/3F6XRg4qCaOMCgx YKctn+Ys7VBVdtNIX81PDuQL3vYZmvn7txpBI/nTI5dFTT6vsgnr4lIDa/lt4QrZdWvY z1lh4MRKO/kIhwLvlfQXHYDG9N8iLgUnMEBYj9mU/LaWdfYCzrvHcxYWKUL7Z30UVdwK YcTw== X-Gm-Message-State: AODbwcBMlqoK58yycay7bQSDklyc8UykhVBLoBjhkvFD5bAGeYK0NGvl HjTbvzzgnrNMuAezPGj8zA== X-Received: by 10.28.11.200 with SMTP id 191mr6658334wml.89.1494921730858; Tue, 16 May 2017 01:02:10 -0700 (PDT) Received: from localhost (188.29.164.181.threembb.co.uk. [188.29.164.181]) by smtp.gmail.com with ESMTPSA id k65sm1014904wmd.28.2017.05.16.01.02.09 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Tue, 16 May 2017 01:02:10 -0700 (PDT) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@linaro.org Subject: [2/2] PR 80769: Incorrect strlen optimisation Date: Tue, 16 May 2017 09:02:08 +0100 Message-ID: <87h90lut4v.fsf@linaro.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux) MIME-Version: 1.0 In this testcase, we (correctly) record after: strcpy (p1, "abcde"); char *p2 = strchr (p1, '\0'); strcpy (p2, q); that the length of p1 and p2 can be calculated by converting the second strcpy to: tmp = stpcpy (p2, q) and then doing tmp - p1 for p1 and tmp - p2 for p2. This is delayed until we know whether we actually need it. Then: char *p3 = strchr (p2, '\0'); forces us to calculate the length of p2 in this way. At this point we had three related strinfos: p1: delayed length, calculated from tmp = stpcpy (p2, q) p2: known length, tmp - p2 p3: known length, 0 After: memcpy (p3, "x", 2); we use adjust_related_strinfos to add 1 to each length. However, that didn't do anything for delayed lengths because: else if (si->stmt != NULL) /* Delayed length computation is unaffected. */ ; So after the memcpy we had: p1: delayed length, calculated from tmp = stpcpy (p2, q) p2: known length, tmp - p2 + 1 p3: known length, 1 where the length of p1 was no longer correct. I thought about three fixes: (1) Make adjust_related_strinfos drop strinfos with a delayed length. (2) Make adjust_related_strinfos compute the length itself (via get_string_length). (3) Make get_string_length update all related strinfos. We can then maintain the invariant that all lengths in a chain are delayed or none are. (3) seemed like the best. get_string_length has already made the invasive step of changing the code and computing the length for one strinfo. Updating the related strinfos is just a couple of fold_builds, of the kind that the pass already does in several places. The point is that the code above only triggers if one of the delayed lengths has been computed. That makes (1) unnecessarily pessimistic. It also wasn't obvious (to me) from first glance, so (2) might look more intrusive than it is. I think it becomes easier to reason about if we do (3) and can assume that all lengths are delayed or none are. It also makes the min-length/maybe-not-terminated patch easier. [ I can go into more detail about why this should be enough to maintain the invariant, and why the asserts should be valid, but the message is already pretty long. ] Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Thanks, Richard 2017-05-16 Richard Sandiford gcc/ PR tree-optimization/80769 * tree-ssa-strlen.c (strinfo): Document that "stmt" is also used for malloc and calloc. Document the new invariant that all related strinfos have delayed lengths or none do. (verify_related_strinfos): Move earlier in file. (set_endptr_and_length): New function, split out from... (get_string_length): ...here. Also set the lengths of related strinfos. (zero_length_string): Assert that chainsi has known (rather than delayed) lengths. (adjust_related_strinfos): Likewise. gcc/testsuite/ PR tree-optimization/80769 * gcc.dg/strlenopt-31.c: New test. * gcc.dg/strlenopt-31g.c: Likewise. Index: gcc/tree-ssa-strlen.c =================================================================== --- gcc/tree-ssa-strlen.c 2017-05-16 08:00:03.808133862 +0100 +++ gcc/tree-ssa-strlen.c 2017-05-16 08:20:51.408572143 +0100 @@ -61,7 +61,13 @@ struct strinfo tree length; /* Any of the corresponding pointers for querying alias oracle. */ tree ptr; - /* Statement for delayed length computation. */ + /* This is used for two things: + + - To record the statement that should be used for delayed length + computations. We maintain the invariant that all related strinfos + have delayed lengths or none do. + + - To record the malloc or calloc call that produced this result. */ gimple *stmt; /* Pointer to '\0' if known, if NULL, it can be computed as ptr + length. */ @@ -451,6 +457,45 @@ set_strinfo (int idx, strinfo *si) (*stridx_to_strinfo)[idx] = si; } +/* Return the first strinfo in the related strinfo chain + if all strinfos in between belong to the chain, otherwise NULL. */ + +static strinfo * +verify_related_strinfos (strinfo *origsi) +{ + strinfo *si = origsi, *psi; + + if (origsi->first == 0) + return NULL; + for (; si->prev; si = psi) + { + if (si->first != origsi->first) + return NULL; + psi = get_strinfo (si->prev); + if (psi == NULL) + return NULL; + if (psi->next != si->idx) + return NULL; + } + if (si->idx != si->first) + return NULL; + return si; +} + +/* Set SI's endptr to ENDPTR and compute its length based on SI->ptr. + Use LOC for folding. */ + +static void +set_endptr_and_length (location_t loc, strinfo *si, tree endptr) +{ + si->endptr = endptr; + si->stmt = NULL; + tree start_as_size = fold_convert_loc (loc, size_type_node, si->ptr); + tree end_as_size = fold_convert_loc (loc, size_type_node, endptr); + si->length = fold_build2_loc (loc, MINUS_EXPR, size_type_node, + end_as_size, start_as_size); +} + /* Return string length, or NULL if it can't be computed. */ static tree @@ -546,12 +591,12 @@ get_string_length (strinfo *si) case BUILT_IN_STPCPY_CHK_CHKP: gcc_assert (lhs != NULL_TREE); loc = gimple_location (stmt); - si->endptr = lhs; - si->stmt = NULL; - lhs = fold_convert_loc (loc, size_type_node, lhs); - si->length = fold_convert_loc (loc, size_type_node, si->ptr); - si->length = fold_build2_loc (loc, MINUS_EXPR, size_type_node, - lhs, si->length); + set_endptr_and_length (loc, si, lhs); + for (strinfo *chainsi = verify_related_strinfos (si); + chainsi != NULL; + chainsi = get_next_strinfo (chainsi)) + if (chainsi->length == NULL) + set_endptr_and_length (loc, chainsi, lhs); break; case BUILT_IN_MALLOC: break; @@ -620,32 +665,6 @@ unshare_strinfo (strinfo *si) return nsi; } -/* Return first strinfo in the related strinfo chain - if all strinfos in between belong to the chain, otherwise - NULL. */ - -static strinfo * -verify_related_strinfos (strinfo *origsi) -{ - strinfo *si = origsi, *psi; - - if (origsi->first == 0) - return NULL; - for (; si->prev; si = psi) - { - if (si->first != origsi->first) - return NULL; - psi = get_strinfo (si->prev); - if (psi == NULL) - return NULL; - if (psi->next != si->idx) - return NULL; - } - if (si->idx != si->first) - return NULL; - return si; -} - /* Attempt to create a new strinfo for BASESI + OFF, or find existing strinfo if there is any. Return it's idx, or 0 if no strinfo has been created. */ @@ -749,7 +768,8 @@ zero_length_string (tree ptr, strinfo *c { do { - gcc_assert (si->length || si->stmt); + /* We shouldn't mix delayed and non-delayed lengths. */ + gcc_assert (si->length); if (si->endptr == NULL_TREE) { si = unshare_strinfo (si); @@ -770,12 +790,17 @@ zero_length_string (tree ptr, strinfo *c return chainsi; } } - else if (chainsi->first || chainsi->prev || chainsi->next) + else { - chainsi = unshare_strinfo (chainsi); - chainsi->first = 0; - chainsi->prev = 0; - chainsi->next = 0; + /* We shouldn't mix delayed and non-delayed lengths. */ + gcc_assert (chainsi->length); + if (chainsi->first || chainsi->prev || chainsi->next) + { + chainsi = unshare_strinfo (chainsi); + chainsi->first = 0; + chainsi->prev = 0; + chainsi->next = 0; + } } } idx = new_stridx (ptr); @@ -820,18 +845,13 @@ adjust_related_strinfos (location_t loc, tree tem; si = unshare_strinfo (si); - if (si->length) - { - tem = fold_convert_loc (loc, TREE_TYPE (si->length), adj); - si->length = fold_build2_loc (loc, PLUS_EXPR, - TREE_TYPE (si->length), si->length, - tem); - } - else if (si->stmt != NULL) - /* Delayed length computation is unaffected. */ - ; - else - gcc_unreachable (); + /* We shouldn't see delayed lengths here; the caller must have + calculated the old length in order to calculate the + adjustment. */ + gcc_assert (si->length); + tem = fold_convert_loc (loc, TREE_TYPE (si->length), adj); + si->length = fold_build2_loc (loc, PLUS_EXPR, TREE_TYPE (si->length), + si->length, tem); si->endptr = NULL_TREE; si->dont_invalidate = true; Index: gcc/testsuite/gcc.dg/strlenopt-31.c =================================================================== --- /dev/null 2017-05-11 19:11:40.989165404 +0100 +++ gcc/testsuite/gcc.dg/strlenopt-31.c 2017-05-16 08:20:26.780371709 +0100 @@ -0,0 +1,25 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include "strlenopt.h" + +__attribute__((noinline, noclone)) int +bar (char *p1, const char *q) +{ + strcpy (p1, "abcde"); + char *p2 = strchr (p1, '\0'); + strcpy (p2, q); + char *p3 = strchr (p2, '\0'); + memcpy (p3, "x", 2); + return strlen (p1); +} + +int +main (void) +{ + char buffer[10]; + int res = bar (buffer, "foo"); + if (strcmp (buffer, "abcdefoox") != 0 || res != 9) + abort (); + return 0; +} Index: gcc/testsuite/gcc.dg/strlenopt-31g.c =================================================================== --- /dev/null 2017-05-11 19:11:40.989165404 +0100 +++ gcc/testsuite/gcc.dg/strlenopt-31g.c 2017-05-16 08:20:26.780371709 +0100 @@ -0,0 +1,9 @@ +/* { dg-do run { target *-*-linux* *-*-gnu* } } */ +/* { dg-options "-O2 -fdump-tree-strlen" } */ + +#define USE_GNU +#include "strlenopt-31.c" + +/* { dg-final { scan-tree-dump-times "stpcpy \\(" 1 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "memcpy \\(" 2 "strlen" } } */ +/* { dg-final { scan-tree-dump-not "strlen \\(" "strlen" } } */