From patchwork Thu Jan 4 14:31:53 2018
X-Patchwork-Submitter: "Richard Earnshaw \(lists\)"
X-Patchwork-Id: 123428
From: Richard Earnshaw
To: gcc-patches@gcc.gnu.org
Cc: Richard Earnshaw
Subject: [PATCH 1/3] [gcc-7 backport] [builtins] Generic support for
 __builtin_load_no_speculate()
Date: Thu, 4 Jan 2018 14:31:53 +0000
Message-Id: <260f96d6dd9a10d17d53f8e4a7ca49aa94a4ac64.1515075827.git.Richard.Earnshaw@arm.com>

This patch adds generic support for the new builtin
__builtin_load_no_speculate.  It provides the overloading of the
different access sizes and a default fall-back expansion for targets
that do not support a mechanism for inhibiting speculation.

So that users can detect that this version of GCC supports the new
intrinsic, the preprocessor predefines the macro
__HAVE_LOAD_NO_SPECULATE.

    * builtin-types.def (BT_FN_I1_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR):
    New builtin type signature.
    (BT_FN_I2_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise.
    (BT_FN_I4_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise.
    (BT_FN_I8_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise.
    (BT_FN_I16_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR): Likewise.
    * builtins.def (BUILT_IN_LOAD_NO_SPECULATE_N): New builtin.
    (BUILT_IN_LOAD_NO_SPECULATE_1): Likewise.
    (BUILT_IN_LOAD_NO_SPECULATE_2): Likewise.
    (BUILT_IN_LOAD_NO_SPECULATE_4): Likewise.
    (BUILT_IN_LOAD_NO_SPECULATE_8): Likewise.
    (BUILT_IN_LOAD_NO_SPECULATE_16): Likewise.
    * target.def (inhibit_load_speculation): New hook.
    * doc/tm.texi.in (TARGET_INHIBIT_LOAD_SPECULATION): Add to
    documentation.
    * doc/tm.texi: Regenerated.
    * doc/cpp.texi: Document predefine __HAVE_LOAD_NO_SPECULATE.
    * doc/extend.texi: Document __builtin_load_no_speculate.
    * c-family/c-common.c (load_no_speculate_resolve_size): New function.
    (load_no_speculate_resolve_params): New function.
    (load_no_speculate_resolve_return): New function.
    (resolve_overloaded_builtin): Handle overloading
    __builtin_load_no_speculate.
    * c-family/c-cppbuiltin.c (c_cpp_builtins): Add predefine for
    __HAVE_LOAD_NO_SPECULATE.
    * builtins.c (expand_load_no_speculate): New function.
    (expand_builtin): Handle new no-speculation builtins.
    * targhooks.h (default_inhibit_load_speculation): Declare.
    * targhooks.c (default_inhibit_load_speculation): New function.
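
As a rough illustration of the logical behaviour this patch documents
(not of the speculation barrier itself), the following C sketch models
the builtin for the int case.  The names load_no_speculate_model and
lookup are hypothetical and exist only for this example; the model
treats a NULL bound as "do not check this side", which mirrors the
documented meaning of a literal NULL argument.  On a compiler without
this patch series the macro __HAVE_LOAD_NO_SPECULATE is not defined and
only the model is used.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the documented logical behaviour of
   __builtin_load_no_speculate, specialised to int.  It performs the
   bounds check only; it does NOT inhibit speculation.  */
static int
load_no_speculate_model (const volatile int *ptr,
                         const volatile void *lower_bound,
                         const volatile void *upper_bound,
                         int failval,
                         const volatile void *cmpptr)
{
  uintptr_t p = (uintptr_t) cmpptr;
  /* A NULL bound means that side of the range is not checked.  */
  int above_lower = lower_bound == NULL || p >= (uintptr_t) lower_bound;
  int below_upper = upper_bound == NULL || p < (uintptr_t) upper_bound;

  return (above_lower && below_upper) ? *ptr : failval;
}

/* Hypothetical caller: a table lookup that yields 0 when the element
   address falls outside [table, table + len).  */
static int
lookup (const int *table, size_t len, size_t idx)
{
#ifdef __HAVE_LOAD_NO_SPECULATE
  /* Only compilers carrying this patch series define the macro.  The
     omitted arguments default to failval == 0 and cmpptr == ptr.  */
  return __builtin_load_no_speculate (&table[idx], table, table + len);
#else
  return load_no_speculate_model (&table[idx], table, table + len,
                                  0, &table[idx]);
#endif
}
```

The fallback branch spells out the two defaults that the documentation
gives for the omitted optional arguments, so both branches are intended
to return the same values.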
---
 gcc/builtin-types.def       |  16 +++++
 gcc/builtins.c              |  99 ++++++++++++++++++++++++++
 gcc/builtins.def            |  22 ++++++
 gcc/c-family/c-common.c     | 164 ++++++++++++++++++++++++++++++++++++++++++++
 gcc/c-family/c-cppbuiltin.c |   5 +-
 gcc/doc/cpp.texi            |   4 ++
 gcc/doc/extend.texi         |  53 ++++++++++++++
 gcc/doc/tm.texi             |   6 ++
 gcc/doc/tm.texi.in          |   2 +
 gcc/target.def              |  20 ++++++
 gcc/targhooks.c             |  67 +++++++++++++++++-
 gcc/targhooks.h             |   3 +
 12 files changed, 459 insertions(+), 2 deletions(-)

diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index ac98944..109f11c 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -749,6 +749,22 @@ DEF_FUNCTION_TYPE_VAR_3 (BT_FN_SSIZE_STRING_SIZE_CONST_STRING_VAR,
 DEF_FUNCTION_TYPE_VAR_3 (BT_FN_INT_FILEPTR_INT_CONST_STRING_VAR,
                          BT_INT, BT_FILEPTR, BT_INT, BT_CONST_STRING)
 
+DEF_FUNCTION_TYPE_VAR_3 (BT_FN_I1_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR,
+                         BT_I1, BT_CONST_VOLATILE_PTR, BT_CONST_VOLATILE_PTR,
+                         BT_CONST_VOLATILE_PTR)
+DEF_FUNCTION_TYPE_VAR_3 (BT_FN_I2_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR,
+                         BT_I2, BT_CONST_VOLATILE_PTR, BT_CONST_VOLATILE_PTR,
+                         BT_CONST_VOLATILE_PTR)
+DEF_FUNCTION_TYPE_VAR_3 (BT_FN_I4_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR,
+                         BT_I4, BT_CONST_VOLATILE_PTR, BT_CONST_VOLATILE_PTR,
+                         BT_CONST_VOLATILE_PTR)
+DEF_FUNCTION_TYPE_VAR_3 (BT_FN_I8_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR,
+                         BT_I8, BT_CONST_VOLATILE_PTR, BT_CONST_VOLATILE_PTR,
+                         BT_CONST_VOLATILE_PTR)
+DEF_FUNCTION_TYPE_VAR_3 (BT_FN_I16_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR,
+                         BT_I16, BT_CONST_VOLATILE_PTR, BT_CONST_VOLATILE_PTR,
+                         BT_CONST_VOLATILE_PTR)
+
 DEF_FUNCTION_TYPE_VAR_4 (BT_FN_INT_STRING_INT_SIZE_CONST_STRING_VAR,
                          BT_INT, BT_STRING, BT_INT, BT_SIZE, BT_CONST_STRING)
 
diff --git a/gcc/builtins.c b/gcc/builtins.c
index d7d4f0f..58a9dd8 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -6349,6 +6349,97 @@ expand_stack_save (void)
   return ret;
 }
 
+/* Expand a call to __builtin_load_no_speculate_<N>.
MODE represents the
+   size of the first argument to that call.  We emit a warning if the
+   result isn't used (IGNORE != 0), since the implementation might
+   rely on the value being used to correctly inhibit speculation.  */
+static rtx
+expand_load_no_speculate (machine_mode mode, tree exp, rtx target, int ignore)
+{
+  rtx ptr, op0, op1, op2, op3, op4;
+  unsigned nargs = call_expr_nargs (exp);
+
+  if (ignore)
+    {
+      warning_at (input_location, 0,
+                  "result of __builtin_load_no_speculate must be used to "
+                  "ensure correct operation");
+      target = NULL;
+    }
+
+  tree arg0 = CALL_EXPR_ARG (exp, 0);
+  tree arg1 = CALL_EXPR_ARG (exp, 1);
+  tree arg2 = CALL_EXPR_ARG (exp, 2);
+
+  ptr = expand_expr (arg0, NULL_RTX, ptr_mode, EXPAND_SUM);
+  op0 = validize_mem (gen_rtx_MEM (mode, convert_memory_address (Pmode, ptr)));
+
+  set_mem_align (op0, MAX (GET_MODE_ALIGNMENT (mode),
+                           get_pointer_alignment (arg0)));
+  set_mem_alias_set (op0, get_alias_set (TREE_TYPE (TREE_TYPE (arg0))));
+
+  /* Mark the memory access as volatile.  We don't want the optimizers to
+     move it or otherwise substitute an alternative value.
*/
+  MEM_VOLATILE_P (op0) = 1;
+
+  if (integer_zerop (tree_strip_nop_conversions (arg1)))
+    op1 = NULL;
+  else
+    {
+      op1 = expand_normal (arg1);
+      if (GET_MODE (op1) != ptr_mode && GET_MODE (op1) != VOIDmode)
+        op1 = convert_modes (ptr_mode, VOIDmode, op1,
+                             TYPE_UNSIGNED (TREE_TYPE (arg1)));
+    }
+
+  if (integer_zerop (tree_strip_nop_conversions (arg2)))
+    op2 = NULL;
+  else
+    {
+      op2 = expand_normal (arg2);
+      if (GET_MODE (op2) != ptr_mode && GET_MODE (op2) != VOIDmode)
+        op2 = convert_modes (ptr_mode, VOIDmode, op2,
+                             TYPE_UNSIGNED (TREE_TYPE (arg2)));
+    }
+
+  if (nargs > 3)
+    {
+      tree arg3 = CALL_EXPR_ARG (exp, 3);
+      op3 = expand_normal (arg3);
+      if (CONST_INT_P (op3))
+        op3 = gen_int_mode (INTVAL (op3), mode);
+      else if (GET_MODE (op3) != mode && GET_MODE (op3) != VOIDmode)
+        op3 = convert_modes (mode, VOIDmode, op3,
+                             TYPE_UNSIGNED (TREE_TYPE (arg3)));
+    }
+  else
+    op3 = const0_rtx;
+
+  if (nargs > 4)
+    {
+      tree arg4 = CALL_EXPR_ARG (exp, 4);
+      op4 = expand_normal (arg4);
+      if (GET_MODE (op4) != ptr_mode && GET_MODE (op4) != VOIDmode)
+        op4 = convert_modes (ptr_mode, VOIDmode, op4,
+                             TYPE_UNSIGNED (TREE_TYPE (arg4)));
+    }
+  else
+    op4 = ptr;
+
+  if (op1 == NULL && op2 == NULL)
+    {
+      error_at (input_location,
+                "at least one speculation bound must be non-NULL");
+      /* Ensure we don't crash later.  */
+      op1 = op4;
+    }
+
+  if (target == NULL)
+    target = gen_reg_rtx (mode);
+
+  return targetm.inhibit_load_speculation (mode, target, op0, op1, op2, op3,
+                                           op4);
+}
 
 /* Expand an expression EXP that calls a built-in function,
    with result going to TARGET if that's convenient
@@ -7469,6 +7560,14 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
          folding.
*/
       break;
 
+    case BUILT_IN_LOAD_NO_SPECULATE_1:
+    case BUILT_IN_LOAD_NO_SPECULATE_2:
+    case BUILT_IN_LOAD_NO_SPECULATE_4:
+    case BUILT_IN_LOAD_NO_SPECULATE_8:
+    case BUILT_IN_LOAD_NO_SPECULATE_16:
+      mode = get_builtin_sync_mode (fcode - BUILT_IN_LOAD_NO_SPECULATE_1);
+      return expand_load_no_speculate (mode, exp, target, ignore);
+
     default:    /* just do library call, if unknown builtin */
       break;
     }
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 58d78db..16894da 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -964,6 +964,28 @@ DEF_BUILTIN (BUILT_IN_EMUTLS_REGISTER_COMMON,
              true, true, true, ATTR_NOTHROW_LEAF_LIST, false,
              !targetm.have_tls)
 
+/* Suppressing speculation.  Users are expected to use the first (N)
+   variant, which will be translated internally into one of the other
+   types.  */
+DEF_GCC_BUILTIN (BUILT_IN_LOAD_NO_SPECULATE_N, "load_no_speculate",
+                 BT_FN_VOID_VAR, ATTR_NULL)
+
+DEF_GCC_BUILTIN (BUILT_IN_LOAD_NO_SPECULATE_1, "load_no_speculate_1",
+                 BT_FN_I1_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR,
+                 ATTR_NULL)
+DEF_GCC_BUILTIN (BUILT_IN_LOAD_NO_SPECULATE_2, "load_no_speculate_2",
+                 BT_FN_I2_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR,
+                 ATTR_NULL)
+DEF_GCC_BUILTIN (BUILT_IN_LOAD_NO_SPECULATE_4, "load_no_speculate_4",
+                 BT_FN_I4_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR,
+                 ATTR_NULL)
+DEF_GCC_BUILTIN (BUILT_IN_LOAD_NO_SPECULATE_8, "load_no_speculate_8",
+                 BT_FN_I8_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR,
+                 ATTR_NULL)
+DEF_GCC_BUILTIN (BUILT_IN_LOAD_NO_SPECULATE_16, "load_no_speculate_16",
+                 BT_FN_I16_CONST_VPTR_CONST_VPTR_CONST_VPTR_VAR,
+                 ATTR_NULL)
+
 /* Exception support.  */
 DEF_BUILTIN_STUB (BUILT_IN_UNWIND_RESUME, "__builtin_unwind_resume")
 DEF_BUILTIN_STUB (BUILT_IN_CXA_END_CLEANUP, "__builtin_cxa_end_cleanup")
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index e272488..e44c40d 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -6553,6 +6553,146 @@ builtin_type_for_size (int size, bool unsignedp)
   return type ?
type : error_mark_node;
 }
 
+/* Work out the size of the object pointed to by the first argument
+   of a call to __builtin_load_no_speculate.  Only pointers to
+   integral types and pointers are permitted.  Return 0 if the
+   argument type is not supported or if the size is too large.  */
+static int
+load_no_speculate_resolve_size (tree function, vec<tree, va_gc> *params)
+{
+  /* Type of the argument.  */
+  tree type;
+  int size;
+
+  if (vec_safe_is_empty (params))
+    {
+      error ("too few arguments to function %qE", function);
+      return 0;
+    }
+
+  type = TREE_TYPE ((*params)[0]);
+
+  if (!POINTER_TYPE_P (type))
+    goto incompatible;
+
+  type = TREE_TYPE (type);
+
+  if (TREE_CODE (type) == ARRAY_TYPE)
+    {
+      /* Force array-to-pointer decay for c++.  */
+      gcc_assert (c_dialect_cxx());
+      (*params)[0] = default_conversion ((*params)[0]);
+      type = TREE_TYPE ((*params)[0]);
+    }
+
+  if (!INTEGRAL_TYPE_P (type) && !POINTER_TYPE_P (type))
+    goto incompatible;
+
+  if (!COMPLETE_TYPE_P (type))
+    goto incompatible;
+
+  size = tree_to_uhwi (TYPE_SIZE_UNIT (type));
+  if (size == 1 || size == 2 || size == 4 || size == 8 || size == 16)
+    return size;
+
+ incompatible:
+  /* Issue the diagnostic only if the argument is valid, otherwise
+     it would be redundant at best and could be misleading.  */
+  if (type != error_mark_node)
+    error ("operand type %qT is incompatible with argument %d of %qE",
+           type, 1, function);
+
+  return 0;
+}
+
+/* Validate and coerce PARAMS, the arguments to ORIG_FUNCTION to fit
+   the prototype for FUNCTION.  The first three arguments are
+   mandatory, but shouldn't need casting as they are all pointers and
+   we've already established that the first argument is a pointer to a
+   permitted type.  The two optional arguments may need to be
+   fabricated if they have been omitted.
*/
+static bool
+load_no_speculate_resolve_params (location_t loc, tree orig_function,
+                                  tree function,
+                                  vec<tree, va_gc> *params)
+{
+  function_args_iterator iter;
+
+  function_args_iter_init (&iter, TREE_TYPE (function));
+  tree arg_type = function_args_iter_cond (&iter);
+  unsigned parmnum;
+  tree val;
+
+  if (params->length () < 3)
+    {
+      error_at (loc, "too few arguments to function %qE", orig_function);
+      return false;
+    }
+  else if (params->length () > 5)
+    {
+      error_at (loc, "too many arguments to function %qE", orig_function);
+      return false;
+    }
+
+  /* Required arguments.  These must all be pointers.  */
+  for (parmnum = 0; parmnum < 3; parmnum++)
+    {
+      arg_type = function_args_iter_cond (&iter);
+      val = (*params)[parmnum];
+      if (TREE_CODE (TREE_TYPE (val)) == ARRAY_TYPE)
+        val = default_conversion (val);
+      if (TREE_CODE (TREE_TYPE (val)) != POINTER_TYPE)
+        goto bad_arg;
+      (*params)[parmnum] = val;
+    }
+
+  /* Optional integer value.  */
+  arg_type = function_args_iter_cond (&iter);
+  if (params->length () >= 4)
+    {
+      val = (*params)[parmnum];
+      val = convert (arg_type, val);
+      (*params)[parmnum] = val;
+    }
+  else
+    return true;
+
+  /* Optional pointer to compare against.  */
+  parmnum = 4;
+  arg_type = function_args_iter_cond (&iter);
+  if (params->length () == 5)
+    {
+      val = (*params)[parmnum];
+      if (TREE_CODE (TREE_TYPE (val)) == ARRAY_TYPE)
+        val = default_conversion (val);
+      if (TREE_CODE (TREE_TYPE (val)) != POINTER_TYPE)
+        goto bad_arg;
+      (*params)[parmnum] = val;
+    }
+
+  return true;
+
+ bad_arg:
+  error_at (loc, "expecting argument of type %qT for argument %u", arg_type,
+            parmnum);
+  return false;
+}
+
+/* Cast the result of the builtin back to the type pointed to by the
+   first argument, preserving any qualifiers that it might have.
*/
+static tree
+load_no_speculate_resolve_return (tree first_param, tree result)
+{
+  tree ptype = TREE_TYPE (TREE_TYPE (first_param));
+  tree rtype = TREE_TYPE (result);
+  ptype = TYPE_MAIN_VARIANT (ptype);
+
+  if (tree_int_cst_equal (TYPE_SIZE (ptype), TYPE_SIZE (rtype)))
+    return convert (ptype, result);
+
+  return result;
+}
+
 /* A helper function for resolve_overloaded_builtin in resolving the
    overloaded __sync_ builtins.  Returns a positive power of 2 if the
    first operand of PARAMS is a pointer to a supported data type.
@@ -7204,6 +7344,30 @@ resolve_overloaded_builtin (location_t loc, tree function,
   /* Handle BUILT_IN_NORMAL here.  */
   switch (orig_code)
     {
+    case BUILT_IN_LOAD_NO_SPECULATE_N:
+      {
+        int n = load_no_speculate_resolve_size (function, params);
+        tree new_function, first_param, result;
+        enum built_in_function fncode;
+
+        if (n == 0)
+          return error_mark_node;
+
+        fncode = (enum built_in_function)((int)orig_code + exact_log2 (n) + 1);
+        new_function = builtin_decl_explicit (fncode);
+        first_param = (*params)[0];
+        if (!load_no_speculate_resolve_params (loc, function, new_function,
+                                               params))
+          return error_mark_node;
+
+        result = build_function_call_vec (loc, vNULL, new_function, params,
+                                          NULL);
+        if (result == error_mark_node)
+          return result;
+
+        return load_no_speculate_resolve_return (first_param, result);
+      }
+
     case BUILT_IN_ATOMIC_EXCHANGE:
     case BUILT_IN_ATOMIC_COMPARE_EXCHANGE:
     case BUILT_IN_ATOMIC_LOAD:
diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index c5fadaa..c4cb763 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -1356,7 +1356,10 @@ c_cpp_builtins (cpp_reader *pfile)
     cpp_define (pfile, "__WCHAR_UNSIGNED__");
 
   cpp_atomic_builtins (pfile);
-
+
+  /* Show support for __builtin_load_no_speculate ().
*/
+  cpp_define (pfile, "__HAVE_LOAD_NO_SPECULATE");
+
 #ifdef DWARF2_UNWIND_INFO
   if (dwarf2out_do_cfi_asm ())
     cpp_define (pfile, "__GCC_HAVE_DWARF2_CFI_ASM");
diff --git a/gcc/doc/cpp.texi b/gcc/doc/cpp.texi
index 6e16ffb..1718acb 100644
--- a/gcc/doc/cpp.texi
+++ b/gcc/doc/cpp.texi
@@ -2351,6 +2351,10 @@ If GCC cannot determine the current date, it will emit a warning message
 These macros are defined when the target processor supports atomic compare
 and swap operations on operands 1, 2, 4, 8 or 16 bytes in length, respectively.
 
+@item __HAVE_LOAD_NO_SPECULATE
+This macro is defined with the value 1 to show that this version of GCC
+supports @code{__builtin_load_no_speculate}.
+
 @item __GCC_HAVE_DWARF2_CFI_ASM
 This macro is defined when the compiler is emitting DWARF CFI directives
 to the assembler.  When this is defined, it is possible to emit those same
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index ba309d0..d00be5b 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -10498,6 +10498,7 @@ in the Cilk Plus language manual which can be found at
 @findex __builtin_islessequal
 @findex __builtin_islessgreater
 @findex __builtin_isunordered
+@findex __builtin_load_no_speculate
 @findex __builtin_powi
 @findex __builtin_powif
 @findex __builtin_powil
@@ -11134,6 +11135,58 @@ an extension.  @xref{Variable Length}, for details.
 
 @end deftypefn
 
+@deftypefn {Built-in Function} @var{type} __builtin_load_no_speculate (const volatile @var{type} *ptr, const volatile void *lower_bound, const volatile void *upper_bound, @var{type} failval, const volatile void *cmpptr)
+The @code{__builtin_load_no_speculate} function provides a means to
+limit the extent to which a processor can continue speculative
+execution with the result of loading a value stored at @var{ptr}.
+Logically, the builtin implements the following behavior:
+
+@smallexample
+inline @var{type} __builtin_load_no_speculate
+    (const volatile @var{type} *ptr,
+     const volatile void *lower_bound,
+     const volatile void *upper_bound,
+     @var{type} failval,
+     const volatile void *cmpptr)
+@{
+  @var{type} result;
+  if (cmpptr >= lower_bound && cmpptr < upper_bound)
+    result = *ptr;
+  else
+    result = failval;
+  return result;
+@}
+@end smallexample
+
+but in addition target-specific code will be inserted to ensure that
+speculation using @code{*ptr} cannot occur when @var{cmpptr} lies outside of
+the specified bounds.
+
+@var{type} may be any integral type (signed, or unsigned, @code{char},
+@code{short}, @code{int}, etc) or a pointer to any type.
+
+The final argument, @var{cmpptr}, may be omitted.  If you do this,
+then the compiler will use @var{ptr} for comparison against the upper
+and lower bounds.  Furthermore, if you omit @var{cmpptr}, you may also
+omit @var{failval} and the compiler will use @code{(@var{type})0} for
+the out-of-bounds result.
+
+Additionally, when it is known that one of the bounds can never fail,
+you can use a literal @code{NULL} argument and the compiler will
+generate code that only checks the other boundary condition.  It is generally
+only safe to do this when your code contains a loop construct where the only
+boundary of interest is the one beyond the termination condition.  You cannot
+omit both boundary conditions in this way.
+
+The logical behaviour of the builtin is supported for all architectures, but
+on machines where target-specific support for inhibiting speculation is not
+implemented, or not necessary, the compiler will emit a warning.
+
+The pre-processor macro @code{__HAVE_LOAD_NO_SPECULATE} is defined with the
+value 1 on all implementations of GCC that support this builtin.
+
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_types_compatible_p (@var{type1}, @var{type2})
 
 You can use the built-in function @code{__builtin_types_compatible_p} to
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index c4f2c89..0b43d70 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -11866,6 +11866,12 @@ maintainer is familiar with.
 
 @end defmac
 
+@deftypefn {Target Hook} rtx TARGET_INHIBIT_LOAD_SPECULATION (machine_mode @var{mode}, rtx @var{result}, rtx @var{mem}, rtx @var{lower_bound}, rtx @var{upper_bound}, rtx @var{fail_result}, rtx @var{cmpptr})
+Generate a target-specific code sequence that implements @code{__builtin_load_no_speculate}, returning the result in @var{result}. If @var{cmpptr} is greater than, or equal to, @var{lower_bound} and less than @var{upper_bound} then @var{mem}, a @code{MEM} of type @var{mode}, should be returned, otherwise @var{failval} should be returned. The expansion must ensure that subsequent speculation by the processor using the @var{mem} cannot occur if @var{cmpptr} lies outside of the specified bounds. At most one of @var{lower_bound} and @var{upper_bound} can be @code{NULL_RTX}, indicating that code for that bounds check should not be generated.
+
+ The default implementation implements the logic of the builtin but cannot provide the target-specific code necessary to inhibit speculation. A warning will be emitted to that effect.
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGET_RUN_TARGET_SELFTESTS (void)
 If selftests are enabled, run any selftests for this target.
 @end deftypefn
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 1c471d8..d595002 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -8318,4 +8318,6 @@ maintainer is familiar with.
 @end defmac
 
+@hook TARGET_INHIBIT_LOAD_SPECULATION
+
 @hook TARGET_RUN_TARGET_SELFTESTS
diff --git a/gcc/target.def b/gcc/target.def
index 6bebfd5..605b793 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -4062,6 +4062,26 @@ DEFHOOK
  hook_bool_void_true)
 
 DEFHOOK
+(inhibit_load_speculation,
+ "Generate a target-specific code sequence that implements\
+ @code{__builtin_load_no_speculate}, returning the result in @var{result}.\
+ If @var{cmpptr} is greater than, or equal to, @var{lower_bound} and less\
+ than @var{upper_bound} then @var{mem}, a @code{MEM} of type @var{mode},\
+ should be returned, otherwise @var{failval} should be returned.  The\
+ expansion must ensure that subsequent speculation by the processor using\
+ the @var{mem} cannot occur if @var{cmpptr} lies outside of the specified\
+ bounds.  At most one of @var{lower_bound} and @var{upper_bound} can be\
+ @code{NULL_RTX}, indicating that code for that bounds check should not be\
+ generated.\n\
+ \n\
+ The default implementation implements the logic of the builtin\
+ but cannot provide the target-specific code necessary to inhibit\
+ speculation.  A warning will be emitted to that effect.",
+ rtx, (machine_mode mode, rtx result, rtx mem, rtx lower_bound,
+       rtx upper_bound, rtx fail_result, rtx cmpptr),
+ default_inhibit_load_speculation)
+
+DEFHOOK
 (can_use_doloop_p,
  "Return true if it is possible to use low-overhead loops (@code{doloop_end}\n\
 and @code{doloop_begin}) for a particular loop.  @var{iterations} gives the\n\
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 1cdec06..178d5ea 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -79,7 +79,8 @@ along with GCC; see the file COPYING3.
If not see
 #include "predict.h"
 #include "params.h"
 #include "real.h"
-
+#include "dojump.h"
+#include "basic-block.h"
 
 bool
 default_legitimate_address_p (machine_mode mode ATTRIBUTE_UNUSED,
@@ -2107,4 +2108,68 @@ default_excess_precision (enum excess_precision_type ATTRIBUTE_UNUSED)
   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
 }
 
+/* Default implementation of the load-and-inhibit-speculation builtin.
+   This version does not have, or know of, the target-specific
+   mechanisms necessary to inhibit speculation, so it simply emits a
+   code sequence that implements the architectural aspects of the
+   builtin.  */
+rtx
+default_inhibit_load_speculation (machine_mode mode ATTRIBUTE_UNUSED,
+                                  rtx result,
+                                  rtx mem,
+                                  rtx lower_bound,
+                                  rtx upper_bound,
+                                  rtx fail_result,
+                                  rtx cmpptr)
+{
+  rtx_code_label *done_label = gen_label_rtx ();
+  rtx_code_label *inrange_label = gen_label_rtx ();
+  warning_at
+    (input_location, 0,
+     "this target does not support anti-speculation operations.  "
+     "Your program will still execute correctly, but speculation "
+     "will not be inhibited");
+
+  /* We don't have any despeculation barriers, but if we mark the branch
+     probabilities to be always predicting the out-of-bounds path, then
+     there's a higher chance that the compiler will order code so that
+     static prediction will fall through a safe path.
*/ + if (lower_bound == NULL) + { + do_compare_rtx_and_jump (cmpptr, upper_bound, LTU, true, ptr_mode, + NULL, NULL, inrange_label, PROB_VERY_UNLIKELY); + emit_move_insn (result, fail_result); + emit_jump (done_label); + emit_label (inrange_label); + emit_move_insn (result, mem); + emit_label (done_label); + } + else if (upper_bound == NULL) + { + do_compare_rtx_and_jump (cmpptr, lower_bound, GEU, true, ptr_mode, + NULL, NULL, inrange_label, PROB_VERY_UNLIKELY); + emit_move_insn (result, fail_result); + emit_jump (done_label); + emit_label (inrange_label); + emit_move_insn (result, mem); + emit_label (done_label); + } + else + { + rtx_code_label *oob_label = gen_label_rtx (); + do_compare_rtx_and_jump (cmpptr, lower_bound, LTU, true, ptr_mode, + NULL, NULL, oob_label, PROB_ALWAYS); + do_compare_rtx_and_jump (cmpptr, upper_bound, LTU, true, ptr_mode, + NULL, NULL, inrange_label, PROB_VERY_UNLIKELY); + emit_label (oob_label); + emit_move_insn (result, fail_result); + emit_jump (done_label); + emit_label (inrange_label); + emit_move_insn (result, mem); + emit_label (done_label); + } + + return result; +} + #include "gt-targhooks.h" diff --git a/gcc/targhooks.h b/gcc/targhooks.h index 18070df..e674e9d 100644 --- a/gcc/targhooks.h +++ b/gcc/targhooks.h @@ -264,4 +264,7 @@ extern unsigned int default_min_arithmetic_precision (void); extern enum flt_eval_method default_excess_precision (enum excess_precision_type ATTRIBUTE_UNUSED); +extern rtx +default_inhibit_load_speculation (machine_mode, rtx, rtx, rtx, rtx, rtx, rtx); + #endif /* GCC_TARGHOOKS_H */ From patchwork Thu Jan 4 14:31:54 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Richard Earnshaw \(lists\)" X-Patchwork-Id: 123427 Delivered-To: patch@linaro.org Received: by 10.140.22.227 with SMTP id 90csp11499918qgn; Thu, 4 Jan 2018 06:32:36 -0800 (PST) X-Google-Smtp-Source: 
ACJfBoueWOd8nGinhHc9io6fa/JmFqZJ81MFTEoIIfiys2Zc5k5voiMigYk72KMB1JioKZiN5fXw X-Received: by 10.98.76.90 with SMTP id z87mr5011769pfa.194.1515076356185; Thu, 04 Jan 2018 06:32:36 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1515076356; cv=none; d=google.com; s=arc-20160816; b=HHVAct9aSS0UCPB5JkB0MzX1F6mZRuv+6VBGwkquBhlmUzmNOwOT7m7znZAWRmDflt jaWCSXjXl//Bhy/dDtXI0nYXd56bmnJxuGMkvoMRXduDJ6caFmNoPNSoUV4ZFTCAF4x8 qTwmcP3U+Jat6pEH4nJUrOFiZiNwLh39YOwYFPUj/3dKF8gpyiy3RPdb9HPXzW5eje31 uU0QUlFkhtQQHJ/M3yiyZlKCoMLBB4pUibdufQN92qk6GT0EcP1T6pn3jBacMp4vrs2s N1I4IhOpHbb+g0JM06jS+RnO2gncxorzckZCSPm5gSdevzMhDBHt7HLl6oyXQ7Xv1sr7 fBhw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=mime-version:references:in-reply-to:references:in-reply-to :message-id:date:subject:cc:to:from:delivered-to:sender:list-help :list-post:list-archive:list-unsubscribe:list-id:precedence :mailing-list:dkim-signature:domainkey-signature :arc-authentication-results; bh=YCEZolzuI+9r4dQ9sdEHWvXAWpiLLoi3qeB7QITliyc=; b=ReCtgXA8jgVjo1KcAkURAlPO8lJzkIv0dcm94x8b8oIH0LplDYWuTC65HXgBt45VuE tpMqEgZuwhkRIWZlO0Zk9sLiWfzzKyCQhD/X9d5ICBv4kb1Z6obO3rqqYmJ5oGIuRUfG EX7NMsgqb2iii8mqtranP2ALJAF1BfluAurb3cmqgbZ1a4yGXpF12iAFY1QWLZ8h7zI6 /SnluVBFpIDObao8CkrrxfQ3+oSYFj8qdUzNYYWX3Yyfv4gG2ZDQ53bgdvp6OIWu9DrV Y0Xfe1VUBPEF4z0YEsBabtGPE142TFyr3JBDV2Bn0MqAFjimqNReg7+qnjqfXuj8+WhJ aTWQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=pU65Vfji; spf=pass (google.com: domain of gcc-patches-return-470141-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-470141-patch=linaro.org@gcc.gnu.org Return-Path: Received: from sourceware.org (server1.sourceware.org. 
[209.132.180.131]) by mx.google.com with ESMTPS id h20si2123376pgv.201.2018.01.04.06.32.35 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 04 Jan 2018 06:32:36 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-return-470141-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=pU65Vfji; spf=pass (google.com: domain of gcc-patches-return-470141-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-470141-patch=linaro.org@gcc.gnu.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :in-reply-to:references:mime-version:content-type; q=dns; s= default; b=XEG6XDaQ9MrKeBtENOEiF3b/Wtj2Ef6ce2Z91n4Xwn29T0mQbCfpc BgLlDEwsTBdUAPvvZSWmngHw2P/DskWdehs0COC7VOcf7aEx2foEFOJDaLWZF3W+ 4iUQoXVQqs+sgEtcJjOQvM4OqPHnBXj18rTQsxd3pZNVuA9reWr1Bk= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :in-reply-to:references:mime-version:content-type; s=default; bh=iFBDpW4qdzJxNNCFgo7NUARZKuM=; b=pU65VfjieD6Vqs8vsp9q3ZFPFFrV Vg0VD3QCfLvDB1JOCQJyi/l00k21E4ymAbOnYH4KjF9jh0yr2b/3NDRSgeS5qwGu qQowI9/d/BfuzklLfc/3QrWdeHRLXS2hSPBGQdwPDlFT6jN8DIFPd1UrO4ed2tBE LX7Juq+jDVa6sM4= Received: (qmail 100427 invoked by alias); 4 Jan 2018 14:32:12 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 100266 invoked by uid 89); 4 Jan 2018 14:32:10 -0000 Authentication-Results: sourceware.org; auth=none 
X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, SPF_PASS, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy= X-HELO: foss.arm.com Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 04 Jan 2018 14:32:08 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 95B4E15AD; Thu, 4 Jan 2018 06:32:07 -0800 (PST) Received: from e105689-lin.cambridge.arm.com (e105689-lin.cambridge.arm.com [10.2.207.32]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id CD98E3F41F; Thu, 4 Jan 2018 06:32:06 -0800 (PST) From: Richard Earnshaw To: gcc-patches@gcc.gnu.org Cc: Richard Earnshaw Subject: [PATCH 2/3] [gcc-7 backport] [aarch64] Implement support for __builtin_load_no_speculate. Date: Thu, 4 Jan 2018 14:31:54 +0000 Message-Id: In-Reply-To: References: In-Reply-To: References: MIME-Version: 1.0 This patch implements support for __builtin_load_no_speculate on AArch64. On this architecture we inhibit speculation by emitting a combination of CSEL and a hint instruction that ensures the CSEL is fully resolved when the operands to the CSEL may involve a speculative load. * config/aarch64/aarch64.c (aarch64_print_operand): Handle zero passed to 'H' operand qualifier. (aarch64_inhibit_load_speculation): New function. (TARGET_INHIBIT_LOAD_SPECULATION): Redefine. * config/aarch64/aarch64.md (UNSPECV_NOSPECULATE): New unspec_volatile code. (nospeculate, nospeculateti): New patterns. 
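For reviewers, the value that the expansion in aarch64_inhibit_load_speculation selects can be modelled in portable C. This is only a sketch of the architectural result, read off the three bound cases handled in the diff that follows; it emits no CSEL or CSDB and so gives no protection against speculation. The function name model_load_no_speculate and the int element type are hypothetical and are not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference model (not part of the patch): returns the
   value that the bounds check plus conditional select would produce.
   A NULL bound means "no check on that side", mirroring the
   upper_bound == NULL and lower_bound == NULL cases in the expander.  */
static int
model_load_no_speculate (const int *ptr, const void *lower_bound,
                         const void *upper_bound, int fail_result,
                         const void *cmpptr)
{
  uintptr_t p = (uintptr_t) cmpptr;
  int out_of_range;

  if (upper_bound == NULL)
    /* Only a lower bound: fail when cmpptr < lower (LTU).  */
    out_of_range = p < (uintptr_t) lower_bound;
  else if (lower_bound == NULL)
    /* Only an upper bound: fail when cmpptr >= upper (GEU).  */
    out_of_range = p >= (uintptr_t) upper_bound;
  else
    /* Both bounds: fail when outside [lower, upper).  */
    out_of_range = (p < (uintptr_t) lower_bound
                    || p >= (uintptr_t) upper_bound);

  /* The load is skipped on failure, as the conditional branch around
     the load does in the generated sequence.  */
  return out_of_range ? fail_result : *ptr;
}
```

In the real sequence the final selection is performed by the nospeculate pattern (CSEL followed by the CSDB hint), which is what stops a mispredicted bounds check from forwarding the loaded value.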
--- gcc/config/aarch64/aarch64.c | 91 +++++++++++++++++++++++++++++++++++++++++++ gcc/config/aarch64/aarch64.md | 28 +++++++++++++ 2 files changed, 119 insertions(+) diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 436091a..4a000b0 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -4987,6 +4987,13 @@ aarch64_print_operand (FILE *f, rtx x, int code) case 'H': /* Print the higher numbered register of a pair (TImode) of regs. */ + if (x == const0_rtx + || (CONST_DOUBLE_P (x) && aarch64_float_const_zero_rtx_p (x))) + { + asm_fprintf (f, "xzr"); + break; + } + if (!REG_P (x) || !GP_REGNUM_P (REGNO (x) + 1)) { output_operand_lossage ("invalid operand for '%%%c'", code); @@ -14708,6 +14715,87 @@ aarch64_sched_can_speculate_insn (rtx_insn *insn) } } +static rtx +aarch64_inhibit_load_speculation (machine_mode mode, rtx result, rtx mem, + rtx lower_bound, rtx upper_bound, + rtx fail_result, rtx cmpptr) +{ + rtx cond, comparison; + rtx target = gen_reg_rtx (mode); + rtx tgt2 = result; + + if (!register_operand (cmpptr, ptr_mode)) + cmpptr = force_reg (ptr_mode, cmpptr); + + if (!register_operand (tgt2, mode)) + tgt2 = gen_reg_rtx (mode); + + if (upper_bound == NULL) + { + if (!register_operand (lower_bound, ptr_mode)) + lower_bound = force_reg (ptr_mode, lower_bound); + + cond = aarch64_gen_compare_reg (LTU, cmpptr, lower_bound); + comparison = gen_rtx_LTU (VOIDmode, cond, const0_rtx); + } + else if (lower_bound == NULL) + { + if (!register_operand (upper_bound, ptr_mode)) + upper_bound = force_reg (ptr_mode, upper_bound); + + cond = aarch64_gen_compare_reg (GEU, cmpptr, upper_bound); + comparison = gen_rtx_GEU (VOIDmode, cond, const0_rtx); + } + else + { + if (!register_operand (lower_bound, ptr_mode)) + lower_bound = force_reg (ptr_mode, lower_bound); + + if (!register_operand (upper_bound, ptr_mode)) + upper_bound = force_reg (ptr_mode, upper_bound); + + rtx cond1 = aarch64_gen_compare_reg (GEU, cmpptr, 
lower_bound); + rtx comparison1 = gen_rtx_GEU (ptr_mode, cond1, const0_rtx); + rtx failcond = GEN_INT (aarch64_get_condition_code (comparison1)^1); + cond = gen_rtx_REG (CCmode, CC_REGNUM); + if (ptr_mode == SImode) + emit_insn (gen_ccmpsi (cond1, cond, cmpptr, upper_bound, comparison1, + failcond)); + else + emit_insn (gen_ccmpdi (cond1, cond, cmpptr, upper_bound, comparison1, + failcond)); + comparison = gen_rtx_GEU (VOIDmode, cond, const0_rtx); + } + + rtx_code_label *label = gen_label_rtx (); + emit_jump_insn (gen_condjump (comparison, cond, label)); + emit_move_insn (target, mem); + emit_label (label); + + insn_code icode; + + switch (mode) + { + case QImode: icode = CODE_FOR_nospeculateqi; break; + case HImode: icode = CODE_FOR_nospeculatehi; break; + case SImode: icode = CODE_FOR_nospeculatesi; break; + case DImode: icode = CODE_FOR_nospeculatedi; break; + case TImode: icode = CODE_FOR_nospeculateti; break; + default: + gcc_unreachable (); + } + + if (! insn_operand_matches (icode, 4, fail_result)) + fail_result = force_reg (mode, fail_result); + + emit_insn (GEN_FCN (icode) (tgt2, comparison, cond, target, fail_result)); + + if (tgt2 != result) + emit_move_insn (result, tgt2); + + return result; +} + /* Target-specific selftests. */ #if CHECKING_P @@ -15136,6 +15224,9 @@ aarch64_libgcc_floating_mode_supported_p #undef TARGET_CUSTOM_FUNCTION_DESCRIPTORS #define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 4 +#undef TARGET_INHIBIT_LOAD_SPECULATION +#define TARGET_INHIBIT_LOAD_SPECULATION aarch64_inhibit_load_speculation + #if CHECKING_P #undef TARGET_RUN_TARGET_SELFTESTS #define TARGET_RUN_TARGET_SELFTESTS selftest::aarch64_run_selftests diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 51368e2..ca91147 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -150,6 +150,7 @@ UNSPECV_SET_FPSR ; Represent assign of FPSR content. 
UNSPECV_BLOCKAGE ; Represent a blockage UNSPECV_PROBE_STACK_RANGE ; Represent stack range probing. + UNSPECV_NOSPECULATE ; Inhibit speculation ] ) @@ -5564,6 +5565,33 @@ DONE; }) +(define_insn "nospeculate" + [(set (match_operand:ALLI 0 "register_operand" "=r") + (unspec_volatile:ALLI + [(match_operator 1 "aarch64_comparison_operator" + [(match_operand 2 "cc_register" "") (const_int 0)]) + (match_operand:ALLI 3 "register_operand" "r") + (match_operand:ALLI 4 "aarch64_reg_or_zero" "rZ")] + UNSPECV_NOSPECULATE))] + "" + "csel\\t%0, %3, %4, %M1\;hint\t#0x14\t// CSDB" + [(set_attr "type" "csel") + (set_attr "length" "8")] +) + +(define_insn "nospeculateti" + [(set (match_operand:TI 0 "register_operand" "=r") + (unspec_volatile:TI + [(match_operator 1 "aarch64_comparison_operator" + [(match_operand 2 "cc_register" "") (const_int 0)]) + (match_operand:TI 3 "register_operand" "r") + (match_operand:TI 4 "aarch64_reg_or_zero" "rZ")] + UNSPECV_NOSPECULATE))] + "" + "csel\\t%x0, %x3, %x4, %M1\;csel\\t%H0, %H3, %H4, %M1\;hint\t#0x14\t// CSDB" + [(set_attr "type" "csel") + (set_attr "length" "12")] +) ;; AdvSIMD Stuff (include "aarch64-simd.md") From patchwork Thu Jan 4 14:31:55 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Richard Earnshaw \(lists\)" X-Patchwork-Id: 123429 Delivered-To: patch@linaro.org Received: by 10.140.22.227 with SMTP id 90csp11500525qgn; Thu, 4 Jan 2018 06:33:06 -0800 (PST) X-Google-Smtp-Source: ACJfBotBdeDy0bXCv2dbiZmG6ceqM458HStkOc7KqHQsyvaAh/7+cTl82ZLhdulzNpc4uiJQoYXr X-Received: by 10.159.207.136 with SMTP id z8mr4898010plo.223.1515076386043; Thu, 04 Jan 2018 06:33:06 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1515076386; cv=none; d=google.com; s=arc-20160816; b=E+imYQmtWvvDV6CP3hSZEuZXyX5aDV/B5N2PruiSJ3KxqhjkQcT9rNOuduPHPsi7Q2 56AnMqz5Ehjyo39NuwRhbaw+JPdf3IUY66J8y4U/oN/If0P+4lsijqTLp0qEH2X0q2aP BBR/NunCWCMEEjwdGoko4XFv5G4DLKvRr0I5o7p62OujLH3y4aNkLx948lONbEz7STWU 
2azvjsriMqD7ciSBVpICzfRKNS6ucJUHMmk9o1VrqXZvVqshATWg/39qA5vA4rFT3Vh1 b07EVMbs0C7cisc48CgJ7e34179K82QitJ+UsI8V+6VRkmA601kNYF/XCvibVIXht/hY BGig== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=mime-version:references:in-reply-to:references:in-reply-to :message-id:date:subject:cc:to:from:delivered-to:sender:list-help :list-post:list-archive:list-unsubscribe:list-id:precedence :mailing-list:dkim-signature:domainkey-signature :arc-authentication-results; bh=6LVK8g1lCn7XyjXOmHFppgdqpxOhUqo/b8664UqLCOw=; b=YQUhzruntlMmV2EfuOzO1zXEVBVlHNwFVQGyR1/Qu70DtaLDCXvi0wvzCxj/KGSnpB cA0BAcER1sdkj2li7uKW/KXB2onzNLpxKp6PXYCD0r6C6+5Z0Y8zer1GyZKbdppGHbzF i3IRAqB0Uv8ggx4yi1q8TS2xU3BiJIQZYoA4/a1/N2CkJs/gNT98BQ9kf42EyUd+sdRf nZxOQ2Da0ZCLxJL0cb/cAZHdFn8VKm/ZmWGPsC9e+/u9STnKKChM1d3H2CbCTzV+xYoq IZ0Ttl4Mh1+ZpsajHTyi7IGg4UVDsOZvlx8AAJ+Hpxh6ww6KGGJbtN8onEcT2eEKwhn9 60aA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=s0yrQtDj; spf=pass (google.com: domain of gcc-patches-return-470143-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-470143-patch=linaro.org@gcc.gnu.org Return-Path: Received: from sourceware.org (server1.sourceware.org. 
[209.132.180.131]) by mx.google.com with ESMTPS id 61si2359891plz.285.2018.01.04.06.33.05 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 04 Jan 2018 06:33:06 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-return-470143-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=s0yrQtDj; spf=pass (google.com: domain of gcc-patches-return-470143-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-470143-patch=linaro.org@gcc.gnu.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :in-reply-to:references:mime-version:content-type; q=dns; s= default; b=U5zeel/4+B4TooZ8y2DWe+QyR5uE32Jrvzhbx5Y4oNj/TLNeX0DQW GKyuud1rFPTPRWz6NSPh28pMAhL9OrQ//aJH9uq4bxqbLZ3qNNlD8Vjd7tQEUsWc PQxXnlz0QMZd9yITD+WZD5qC6J5KaWAs5FbdxRsQwDIz3v8DkBetJg= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :in-reply-to:references:mime-version:content-type; s=default; bh=yrI4F8Wu4Jxl0XhL6CP8YSJE7jE=; b=s0yrQtDjhPDAVb1DYogLvZOGG+OO WR9pZ2MO68p4JiWMAaPGI4f/AjBUfs5rfjtuK7563P0TUCyRVpLLWUOOLLbd0oVX G++u21dSUP0SJic1EGY47Zj+GbZ9wHOWu7+HWZTM/Vlxn4mZTSGPgObjWGcw0Nwv TFpOxTS8uDEu5aM= Received: (qmail 100576 invoked by alias); 4 Jan 2018 14:32:12 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 100444 invoked by uid 89); 4 Jan 2018 14:32:12 -0000 Authentication-Results: sourceware.org; auth=none 
X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, SPF_PASS, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy= X-HELO: foss.arm.com Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 04 Jan 2018 14:32:09 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 9C3FD15BE; Thu, 4 Jan 2018 06:32:08 -0800 (PST) Received: from e105689-lin.cambridge.arm.com (e105689-lin.cambridge.arm.com [10.2.207.32]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id D3C123F41F; Thu, 4 Jan 2018 06:32:07 -0800 (PST) From: Richard Earnshaw To: gcc-patches@gcc.gnu.org Cc: Richard Earnshaw Subject: [PATCH 3/3] [gcc-7 backport] [arm] Implement support for the de-speculation intrinsic Date: Thu, 4 Jan 2018 14:31:55 +0000 Message-Id: In-Reply-To: References: In-Reply-To: References: MIME-Version: 1.0 This patch implements despeculation on ARM. We only support it when generating ARM or Thumb2 code (we need conditional execution), and we only support it for sizes up to DImode. For unsupported cases we fall back to the generic code generation sequence so that a suitable failure warning is emitted. * config/arm/arm.c (arm_inhibit_load_speculation): New function. (TARGET_INHIBIT_LOAD_SPECULATION): Redefine. * config/arm/unspecs.md (VUNSPEC_NOSPECULATE): New unspec_volatile code. * config/arm/arm.md (cmp_ior): Make this pattern callable. (nospeculate, nospeculatedi): New patterns. 
--- gcc/config/arm/arm.c | 108 ++++++++++++++++++++++++++++++++++++++++++++++ gcc/config/arm/arm.md | 40 ++++++++++++++++- gcc/config/arm/unspecs.md | 1 + 3 files changed, 148 insertions(+), 1 deletion(-) diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 7b3f4c1..393bfd6 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -308,6 +308,8 @@ static unsigned int arm_elf_section_type_flags (tree decl, const char *name, int reloc); static void arm_expand_divmod_libfunc (rtx, machine_mode, rtx, rtx, rtx *, rtx *); static machine_mode arm_floatn_mode (int, bool); +static rtx arm_inhibit_load_speculation (machine_mode, rtx, rtx, rtx, rtx, + rtx, rtx); /* Table of machine attributes. */ static const struct attribute_spec arm_attribute_table[] = @@ -766,6 +768,9 @@ static const struct attribute_spec arm_attribute_table[] = #undef TARGET_CUSTOM_FUNCTION_DESCRIPTORS #define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 2 +#undef TARGET_INHIBIT_LOAD_SPECULATION +#define TARGET_INHIBIT_LOAD_SPECULATION arm_inhibit_load_speculation + struct gcc_target targetm = TARGET_INITIALIZER; /* Obstack for minipool constant handling. */ @@ -31112,4 +31117,107 @@ arm_coproc_ldc_stc_legitimate_address (rtx op) } return false; } + +static rtx +arm_inhibit_load_speculation (machine_mode mode, rtx result, rtx mem, + rtx lower_bound, rtx upper_bound, + rtx fail_result, rtx cmpptr) +{ + rtx cond, comparison; + + /* We can't support this for Thumb1 as we have no suitable conditional + move operations. Nor do we support it for TImode. For both + these cases fall back to the generic code sequence which will emit + a suitable warning for us. 
*/ + if (mode == TImode || TARGET_THUMB1) + return default_inhibit_load_speculation (mode, result, mem, lower_bound, + upper_bound, fail_result, cmpptr); + + + rtx target = gen_reg_rtx (mode); + rtx tgt2 = result; + + if (!register_operand (tgt2, mode)) + tgt2 = gen_reg_rtx (mode); + + if (!register_operand (cmpptr, ptr_mode)) + cmpptr = force_reg (ptr_mode, cmpptr); + + if (upper_bound == NULL) + { + if (!register_operand (lower_bound, ptr_mode)) + lower_bound = force_reg (ptr_mode, lower_bound); + + cond = arm_gen_compare_reg (LTU, cmpptr, lower_bound, NULL); + comparison = gen_rtx_LTU (VOIDmode, cond, const0_rtx); + } + else if (lower_bound == NULL) + { + if (!register_operand (upper_bound, ptr_mode)) + upper_bound = force_reg (ptr_mode, upper_bound); + + cond = arm_gen_compare_reg (GEU, cmpptr, upper_bound, NULL); + comparison = gen_rtx_GEU (VOIDmode, cond, const0_rtx); + } + else + { + /* We want to generate code for + result = (cmpptr < lower || cmpptr >= upper) ? 0 : *ptr; + Which can be recast to + result = (cmpptr < lower || upper <= cmpptr) ? 0 : *ptr; + which can be implemented as + cmp cmpptr, lower + cmpcs upper, cmpptr + bls 1f + ldr result, [ptr] + 1: + movls result, #0 + with suitable IT instructions as needed for thumb2. Later + optimization passes may make the load conditional. 
*/ + + if (!register_operand (lower_bound, ptr_mode)) + lower_bound = force_reg (ptr_mode, lower_bound); + + if (!register_operand (upper_bound, ptr_mode)) + upper_bound = force_reg (ptr_mode, upper_bound); + + rtx comparison1 = gen_rtx_LTU (SImode, cmpptr, lower_bound); + rtx comparison2 = gen_rtx_LEU (SImode, upper_bound, cmpptr); + cond = gen_rtx_REG (arm_select_dominance_cc_mode (comparison1, + comparison2, + DOM_CC_X_OR_Y), + CC_REGNUM); + emit_insn (gen_cmp_ior (cmpptr, lower_bound, upper_bound, cmpptr, + comparison1, comparison2, cond)); + comparison = gen_rtx_NE (SImode, cond, const0_rtx); + } + + rtx_code_label *label = gen_label_rtx (); + emit_jump_insn (gen_arm_cond_branch (label, comparison, cond)); + emit_move_insn (target, mem); + emit_label (label); + + insn_code icode; + + switch (mode) + { + case QImode: icode = CODE_FOR_nospeculateqi; break; + case HImode: icode = CODE_FOR_nospeculatehi; break; + case SImode: icode = CODE_FOR_nospeculatesi; break; + case DImode: icode = CODE_FOR_nospeculatedi; break; + default: + gcc_unreachable (); + } + + if (! 
insn_operand_matches (icode, 4, fail_result)) + fail_result = force_reg (mode, fail_result); + + emit_insn (GEN_FCN (icode) (tgt2, comparison, cond, target, fail_result)); + + if (tgt2 != result) + emit_move_insn (result, tgt2); + + return result; +} + #include "gt-arm.h" diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index f9365cd..7a6c134 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -9511,7 +9511,7 @@ (set_attr "type" "multiple")] ) -(define_insn "*cmp_ior" +(define_insn "cmp_ior" [(set (match_operand 6 "dominant_cc_register" "") (compare (ior:SI @@ -12054,6 +12054,44 @@ [(set_attr "length" "4") (set_attr "type" "coproc")]) +(define_insn "nospeculate" + [(set (match_operand:QHSI 0 "s_register_operand" "=l,l,r") + (unspec_volatile:QHSI + [(match_operator 1 "arm_comparison_operator" + [(match_operand 2 "cc_register" "") (const_int 0)]) + (match_operand:QHSI 3 "s_register_operand" "0,0,0") + (match_operand:QHSI 4 "arm_not_operand" "I,K,r")] + VUNSPEC_NOSPECULATE))] + "TARGET_32BIT" + { + if (TARGET_THUMB) + return \"it\\t%d1\;mov%d1\\t%0, %4\;.inst 0xf3af8014\t%@ CSDB\"; + return \"mov%d1\\t%0, %4\;.inst 0xe320f014\t%@ CSDB\"; + } + [(set_attr "type" "mov_imm,mvn_imm,mov_reg") + (set_attr "conds" "use") + (set_attr "length" "8")] +) + +(define_insn "nospeculatedi" + [(set (match_operand:DI 0 "s_register_operand" "=r") + (unspec_volatile:DI + [(match_operator 1 "arm_comparison_operator" + [(match_operand 2 "cc_register" "") (const_int 0)]) + (match_operand:DI 3 "s_register_operand" "0") + (match_operand:DI 4 "arm_rhs_operand" "rI")] + VUNSPEC_NOSPECULATE))] + "TARGET_32BIT" + { + if (TARGET_THUMB) + return \"it\\t%d1\;mov%d1\\t%Q0, %Q4\;it\\t%d1\;mov%d1\\t%R0, %R4\;.inst 0xf3af8014\t%@ CSDB\"; + return \"mov%d1\\t%Q0, %Q4\;mov%d1\\t%R0, %R4\;.inst 0xe320f014\t%@ CSDB\"; + } + [(set_attr "type" "mov_reg") + (set_attr "conds" "use") + (set_attr "length" "12")] +) + ;; Vector bits common to IWMMXT and Neon (include "vec-common.md") ;; 
Load the Intel Wireless Multimedia Extension patterns diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md index 99cfa41..7f296ae 100644 --- a/gcc/config/arm/unspecs.md +++ b/gcc/config/arm/unspecs.md @@ -168,6 +168,7 @@ VUNSPEC_MCRR2 ; Represent the coprocessor mcrr2 instruction. VUNSPEC_MRRC ; Represent the coprocessor mrrc instruction. VUNSPEC_MRRC2 ; Represent the coprocessor mrrc2 instruction. + VUNSPEC_NOSPECULATE ; Represent a despeculation sequence. ]) ;; Enumerators for NEON unspecs.