From patchwork Wed Jan 24 21:56:37 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kugan Vivekanandarajah X-Patchwork-Id: 125716 Delivered-To: patch@linaro.org Received: by 10.46.66.141 with SMTP id h13csp689463ljf; Wed, 24 Jan 2018 13:57:34 -0800 (PST) X-Google-Smtp-Source: AH8x226ciuEqxsFzO4X+0Hs0Z4WsGtjV8zpQhMnVxWrzQtwyqmmU/r5cT/qV/na32Vk3DD2iJVdG X-Received: by 2002:a17:902:2c83:: with SMTP id n3-v6mr9576249plb.227.1516831054359; Wed, 24 Jan 2018 13:57:34 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1516831054; cv=none; d=google.com; s=arc-20160816; b=nFFjyAY6XkXVXwN7wIQJn9cH69fgr7RhRJlykrULqN2pDGmQLSS2nz83cwN5L9oKFk MSeTvWfrM4MKonb7rf/KfBJj9fArqrPl6bh7XbVdEmdoYmn+RRwgbpX3wo6hWCylEm2O tL1TogIzRFUz9wJmd1QzbdCqO6z886+HIMn/UdXw90Qii5V/9ZkXDrA6w2RxCGjjix8M K0res0T2gy7sMyhI+90PiwjHYNOVSZSBeBOa5r5wk05wuENcAbBNlJQvSICCsSwNW95p G/vKJREYk38aaWwuebGxzsFGIbVYTkGDUPF6NKm+YNXkrHN6ynJm3RF+nsR9vyfSblk3 HaXQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=to:subject:message-id:date:from:mime-version:delivered-to:sender :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:mailing-list:dkim-signature:domainkey-signature :arc-authentication-results; bh=qI9i9gj0yo3ziKX31wY7XxpKnaaYsjKBku+eIWOdsfo=; b=hAbPOAULDXbQG11n3N6HPs8gwfM++cmcC4IG4CFpCHfx7lT4NjlUdgQGw4N/VQUvr6 YN44bM3rG3sC7Q9SdMY1xyuO5GFWo7IYW43q7TnfhoHFAyGiWqw8TiTAbDGox/jlbzzS gMUsJ5YJifR06AXsbiEyIYTZTVKPnyK0nRW3MQNNdnnWio2jbcuUwPNhhAaODEOk9jbZ 5Z+2b5r7NJNxwosDksj6N5iafNjWXwmdbSVXZrmDjX13LZUE5eWf1tRFla5/d+b9TdNe TIsGyLV8BS6U6tiPvZQP9WUwcs5eXZ4e44JFfvwAa8CCwYM9hBfsocD7gx8Tcp8VDeTD Pwyw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=X6+z7w0J; spf=pass (google.com: domain of gcc-patches-return-471991-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-471991-patch=linaro.org@gcc.gnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id f18si623665pga.543.2018.01.24.13.57.33 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 24 Jan 2018 13:57:34 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-return-471991-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=X6+z7w0J; spf=pass (google.com: domain of gcc-patches-return-471991-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-471991-patch=linaro.org@gcc.gnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:from:date:message-id:subject:to:content-type; q= dns; s=default; b=x77iktAUDa8YIf3nXsKrQ+KkQ6mbslMnCsZt5Rnz4HqkK3 VOoTBlqANUNIZxgkKJjJzPZoAwd7UP04xNW/jPbGvFUurLyWMR0KqtRY2TPdZJRJ I5LinelTGqQAC+AmFqGP6SlZNITeC5WjgBPYzLsW3uzoKXoscDRQZAc4rcwz0= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:from:date:message-id:subject:to:content-type; s= default; bh=oOzKiLc1rgStUv/4DUAcdF7UEJE=; b=X6+z7w0JgudBXWUCIEgf RAzKUSFi3YtyYAwPgo3D+J85fZMYfqIhHFH224XErFIDDOeqxBzLzm3aIbCdl6re 3nqYoY0hYTn/OhSCLUGgUDDydyCPg6/8NT6relvaRbndj/6S05Uo+USiS7uJofzT v1c+/0X3/lLgicUxtcPfIMg= Received: (qmail 59716 invoked by alias); 24 Jan 2018 21:57:22 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 59707 invoked by uid 89); 24 Jan 2018 21:57:22 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.7 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mail-qt0-f181.google.com Received: from mail-qt0-f181.google.com (HELO mail-qt0-f181.google.com) (209.85.216.181) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 24 Jan 2018 21:57:20 +0000 Received: by mail-qt0-f181.google.com with SMTP id l20so14323881qtj.11 for ; Wed, 24 Jan 2018 13:57:19 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=oUQxphvXwF3dMYQzYOkrUAq/Fad64QjfPX1HcbGoNtQ=; b=hGPzfwI28ObChKSN0aul/zSajo8JirmNHubIrfUtUy9uZw7CpNhlyTScFsRqVCEsVu iysTuqeSIWQKYPjZrrzjZpsuWKyR6ukxqVZ8r8sN3SzJkBSAAm9t7Rvtj+Qow5+MHPVF DaYCr85863jCNoTwbiyKFPfK4oV4ukmIIYdMuSyi17EZc1KSJmuFjVVja6LtX05lFs++ Cb2L6xKC5JsGSSKOvfcBv8AhKhkalUW1XZiMK51E4nlzSH649gtIa2LSyaQsWSMvW6LK RH5PbQgcGETd2zzrQ9Zz+Al7N31AYrdz3OFdcKQkeCJYTRaBEWE+Wp4q7LfLJq+woMhT jyrQ== X-Gm-Message-State: AKwxyteIMffxuD6egIbCUjOCkuIBLCzXJjwCLgUO3JIyms8dr/3uN9F6 Nwtm5BIhCz3ibu9bTMGh+ByCm7SxX4D0kXpfRqh8FSFUkf4= X-Received: by 10.55.96.4 with SMTP id u4mr11900361qkb.323.1516831038106; Wed, 24 Jan 2018 13:57:18 -0800 (PST) MIME-Version: 1.0 Received: by 10.200.36.185 with HTTP; Wed, 24 Jan 2018 13:56:37 -0800 (PST) From: Kugan Vivekanandarajah Date: Thu, 25 Jan 2018 08:56:37 +1100 Message-ID: Subject: [RFC][PR82479] missing popcount builtin detection To: gcc-patches@gcc.gnu.org X-IsSubscribed: yes Hi All, Here is a patch for popcount builtin detection similar to LLVM. I would like to queue this for review for next stage 1. 1. This is done part of loop-distribution and effective for -O3 and above. 2. This does not distribute loop to detect popcount (like memcpy/memmove). I dont think that happens in practice. Please correct me if I am wrong. Bootstrapped and regression tested on aarch64-linux-gnu with no new regressions. Thanks, Kugan gcc/ChangeLog: 2018-01-25 Kugan Vivekanandarajah PR middle-end/82479 * tree-loop-distribution.c (handle_popcount): New. (pass_loop_distribution::execute): Use handle_popcount. gcc/testsuite/ChangeLog: 2018-01-25 Kugan Vivekanandarajah PR middle-end/82479 * gcc.dg/tree-ssa/popcount.c: New test. >From 9fa09af4b7013c6207e59a4920c82f089bfe45c2 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah Date: Wed, 24 Jan 2018 08:50:08 +1100 Subject: [PATCH] pocount builtin detection Change-Id: Ic6e175f9cc9a69bd417936a4845c2c046fd446b4 Change-Id: I680eb107445660c60a5d38f5d7300ab1a3243bf5 Change-Id: Ia9f0df89e05520091dc7797195098118768c7ac2 --- gcc/testsuite/gcc.dg/tree-ssa/popcount.c | 41 +++++++++ gcc/tree-loop-distribution.c | 145 +++++++++++++++++++++++++++++++ 2 files changed, 186 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/popcount.c diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount.c b/gcc/testsuite/gcc.dg/tree-ssa/popcount.c new file mode 100644 index 0000000..86a66cb --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount.c @@ -0,0 +1,41 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-tree-optimized" } */ + +extern int foo (int); + +int PopCount (long b) { + int c = 0; + b++; + + while (b) { + b &= b - 1; + c++; + } + return c; +} +int PopCount2 (long b) { + int c = 0; + + while (b) { + b &= b - 1; + c++; + } + foo (c); + return foo (c); +} + +void PopCount3 (long b1) { + + for (long i = 0; i < b1; ++i) + { + long b = i; + int c = 0; + while (b) { + b &= b - 1; + c++; + } + foo (c); + } +} + +/* { dg-final { scan-tree-dump-times "__builtin_popcount" 3 "optimized" } } */ diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c index a3d76e4..1060700 100644 --- a/gcc/tree-loop-distribution.c +++ b/gcc/tree-loop-distribution.c @@ -1585,6 +1585,148 @@ classify_builtin_ldst (loop_p loop, struct graph *rdg, partition *partition, return; } +/* See if loop is a popcout implementation of the form + + int c = 0; + while (b) { + b = b & (b - 1); + c++; + } + + If so, convert this into c = __builtin_popcount (b) + return true if we did, false otherwise. */ + + +static bool +handle_popcount (loop_p loop) +{ + tree lhs, rhs; + tree dest, src; + gimple *and_minus_one; + int count = 0; + gphi *count_phi; + gimple *fn_call; + gimple *use_stmt; + use_operand_p use_p; + imm_use_iterator iter; + gimple_stmt_iterator gsi; + + /* Check loop terminating branch is like + if (b != 0). */ + gimple *stmt = last_stmt (loop->header); + if (!stmt + || gimple_code (stmt) != GIMPLE_COND + || !zerop (gimple_cond_rhs (stmt))) + return false; + + /* Cheeck "b = b & (b - 1)" is calculated. */ + lhs = gimple_cond_lhs (stmt); + gimple *and_stmt = SSA_NAME_DEF_STMT (lhs); + if (gimple_assign_rhs_code (and_stmt) != BIT_AND_EXPR) + return false; + lhs = gimple_assign_rhs1 (and_stmt); + rhs = gimple_assign_rhs2 (and_stmt); + if (TREE_CODE (lhs) == SSA_NAME + && (and_minus_one = SSA_NAME_DEF_STMT (lhs)) + && is_gimple_assign (and_minus_one) + && (gimple_assign_rhs_code (and_minus_one) == PLUS_EXPR) + && integer_minus_onep (gimple_assign_rhs2 (and_minus_one))) + lhs = rhs; + else if (TREE_CODE (rhs) == SSA_NAME + && (and_minus_one = SSA_NAME_DEF_STMT (rhs)) + && is_gimple_assign (and_minus_one) + && (gimple_assign_rhs_code (and_minus_one) == PLUS_EXPR) + && integer_minus_onep (gimple_assign_rhs2 (and_minus_one))) + ; + else + return false; + if ((gimple_assign_rhs1 (and_stmt) != gimple_assign_rhs1 (and_minus_one)) + && (gimple_assign_rhs2 (and_stmt) != gimple_assign_rhs1 (and_minus_one))) + return false; + + /* Check the recurrence. */ + gimple *phi = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (and_minus_one)); + gimple *src_phi = SSA_NAME_DEF_STMT (lhs); + if (gimple_code (phi) != GIMPLE_PHI + || gimple_code (src_phi) != GIMPLE_PHI) + return false; + + /* Check the loop closed SSA definition for just the variable c defined in + loop. */ + src = gimple_phi_arg_def (src_phi, loop_preheader_edge (loop)->dest_idx); + basic_block bb = single_exit (loop)->dest; + for (gphi_iterator gpi = gsi_start_phis (bb); + !gsi_end_p (gpi); gsi_next (&gpi)) + { + count_phi = gpi.phi (); + count++; + } + if (count != 1) + return false; + + /* Make sure we have a count by one and it starts from zero. */ + rhs = gimple_phi_arg_def (count_phi, 0); + stmt = SSA_NAME_DEF_STMT (rhs); + if (!is_gimple_assign (stmt) + || (gimple_assign_rhs_code (stmt) != PLUS_EXPR) + || !integer_onep (gimple_assign_rhs2 (stmt))) + return false; + rhs = gimple_assign_rhs1 (stmt); + stmt = SSA_NAME_DEF_STMT (rhs); + if (gimple_code (stmt) != GIMPLE_PHI + || !integer_zerop (gimple_phi_arg_def (stmt, loop_preheader_edge (loop)->dest_idx))) + return false; + + /* Create a var for builtin_popcount result and update the uses. */ + dest = gimple_phi_result (count_phi); + tree var = make_ssa_name (TREE_TYPE (dest), NULL); + FOR_EACH_IMM_USE_STMT (use_stmt, iter, dest) + { + FOR_EACH_IMM_USE_ON_STMT (use_p, iter) + SET_USE (use_p, var); + } + dest = var; + + /* Remove the loop closed PHI stmt. */ + gsi = gsi_after_labels (gimple_bb (count_phi)); + gphi_iterator gsi_phi = gsi_for_phi (count_phi); + remove_phi_node (&gsi_phi, true); + + /* Insert the builtin. */ + gsi = gsi_after_labels (loop_preheader_edge (loop)->src); + src = force_gimple_operand_gsi (&gsi, src, true, NULL_TREE, + false, GSI_CONTINUE_LINKING); + tree fn = build_fold_addr_expr (builtin_decl_implicit (BUILT_IN_POPCOUNT)); + fn_call = gimple_build_call (fn, 1, src); + gimple_call_set_lhs (fn_call, dest); + gsi_insert_before (&gsi, fn_call, GSI_CONTINUE_LINKING); + + /* In most cases, we will have if (b != 0) preceeding the loop in + the form + if (b != 0) + { + loop; + } + + Since __builtin_popcount (0) is defined, we can remove this condition + by making it always true. */ + + stmt = last_stmt (EDGE_PRED (loop_preheader_edge (loop)->src, 0)->src); + if (stmt + && gimple_code (stmt) == GIMPLE_COND + && (gimple_cond_lhs (stmt) == src) + && zerop (gimple_cond_rhs (stmt))) + { + gimple_cond_set_lhs (as_a (stmt), + build_int_cst (TREE_TYPE (src), 1)); + update_stmt (stmt); + } + + /* Finaly remove the loop. */ + destroy_loop (loop); + return true; +} + /* Classifies the builtin kind we can generate for PARTITION of RDG and LOOP. For the moment we detect memset, memcpy and memmove patterns. Bitmap STMT_IN_ALL_PARTITIONS contains statements belonging to all partitions. */ @@ -3032,6 +3174,9 @@ pass_loop_distribution::execute (function *fun) || !optimize_loop_for_speed_p (loop)) continue; + if (handle_popcount (loop)) + continue; + /* Don't distribute loop if niters is unknown. */ tree niters = number_of_latch_executions (loop); if (niters == NULL_TREE || niters == chrec_dont_know) -- 2.7.4