From patchwork Tue Apr 29 02:37:21 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kugan Vivekanandarajah
X-Patchwork-Id: 29279
Message-ID: <535F1061.90908@linaro.org>
Date: Tue, 29 Apr 2014 12:37:21 +1000
From: Kugan
To: Ramana Radhakrishnan
CC: "gcc-patches@gcc.gnu.org", Marcus Shawcroft, Richard Earnshaw
Subject: Re: [RFC][AARCH64] TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook
References: <535B9132.40703@linaro.org> <535E3503.3020203@arm.com>
In-Reply-To: <535E3503.3020203@arm.com>

On 28/04/14 21:01, Ramana Radhakrishnan wrote:
> On 04/26/14 11:57, Kugan wrote:
>> Attached patch implements TARGET_ATOMIC_ASSIGN_EXPAND_FENV for AARCH64.
>> With this, atomic test-case gcc.dg/atomic/c11-atomic-exec-5.c now PASS.
>>
>> This implementation is based on SPARC and i386 implementations.
>>
>> Regression tested on qemu-aarch64 for aarch64-none-linux-gnu with no new
>> regression. Is this OK for trunk?
>
> Again like A32 please test on hardware to make sure this behaves
> correctly with c11-atomic-exec-5.c .
>
> If you don't have access to hardware, let us know : we'll take it for a
> spin once you update the patch according to Marcus's comments.
>

Thanks for the review. I have updated the patch. I have also updated
hold, clear and update to match feholdexcpt.c, fclrexcpt.c and
feupdateenv.c in glibc/ports/sysdeps/aarch64/fpu exactly.

I have limited access to real hardware, so I only did a bootstrap and
ran c11-atomic-exec-5.c on its own to make sure that it passes. I have
also regression tested again on qemu-aarch64 for aarch64-none-linux-gnu
with no new regressions. I would appreciate it if you could do the
regression testing on real hardware.

As for the ARM version of the patch, I tested the previous version with
c11-atomic-exec-5.c and verified it on a Chromebook before I posted the
patch. I have now updated the patch based on your review, and the full
bootstrap and regression testing is under way. I will post the patch
once the results are available.

Thanks,
Kugan

+2014-04-29  Kugan Vivekanandarajah
+
+	* config/aarch64/aarch64.c (TARGET_ATOMIC_ASSIGN_EXPAND_FENV): New
+	define.
+	* config/aarch64/aarch64-protos.h (aarch64_atomic_assign_expand_fenv):
+	New function declaration.
+	* config/aarch64/aarch64-builtins.c (aarch64_builtins): Add
+	AARCH64_BUILTIN_GET_FPCR, AARCH64_BUILTIN_SET_FPCR,
+	AARCH64_BUILTIN_GET_FPSR and AARCH64_BUILTIN_SET_FPSR.
+	(aarch64_init_builtins): Initialize builtins
+	__builtin_aarch64_set_fpcr, __builtin_aarch64_get_fpcr,
+	__builtin_aarch64_set_fpsr and __builtin_aarch64_get_fpsr.
+	(aarch64_expand_builtin): Expand builtins __builtin_aarch64_set_fpcr,
+	__builtin_aarch64_get_fpcr, __builtin_aarch64_get_fpsr,
+	and __builtin_aarch64_set_fpsr.
+	(aarch64_atomic_assign_expand_fenv): New function.
+	* config/aarch64/aarch64.md (set_fpcr): New pattern.
+	(get_fpcr): Likewise.
+	(set_fpsr): Likewise.
+	(get_fpsr): Likewise.
+	(unspecv): Add UNSPECV_GET_FPCR and UNSPECV_SET_FPCR, UNSPECV_GET_FPSR
+	and UNSPECV_SET_FPSR.
+	* doc/extend.texi (AARCH64 Built-in Functions): Document
+	__builtin_aarch64_set_fpcr, __builtin_aarch64_get_fpcr,
+	__builtin_aarch64_set_fpsr and __builtin_aarch64_get_fpsr.

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 55cfe0a..5cdc978 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -371,6 +371,12 @@ static aarch64_simd_builtin_datum aarch64_simd_builtin_data[] = {
 enum aarch64_builtins
 {
   AARCH64_BUILTIN_MIN,
+
+  AARCH64_BUILTIN_GET_FPCR,
+  AARCH64_BUILTIN_SET_FPCR,
+  AARCH64_BUILTIN_GET_FPSR,
+  AARCH64_BUILTIN_SET_FPSR,
+
   AARCH64_SIMD_BUILTIN_BASE,
 #include "aarch64-simd-builtins.def"
   AARCH64_SIMD_BUILTIN_MAX = AARCH64_SIMD_BUILTIN_BASE
@@ -752,6 +758,24 @@ aarch64_init_simd_builtins (void)
 void
 aarch64_init_builtins (void)
 {
+  tree ftype_set_fpr
+    = build_function_type_list (void_type_node, unsigned_type_node, NULL);
+  tree ftype_get_fpr
+    = build_function_type_list (unsigned_type_node, NULL);
+
+  aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPCR]
+    = add_builtin_function ("__builtin_aarch64_get_fpcr", ftype_get_fpr,
+                            AARCH64_BUILTIN_GET_FPCR, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPCR]
+    = add_builtin_function ("__builtin_aarch64_set_fpcr", ftype_set_fpr,
+                            AARCH64_BUILTIN_SET_FPCR, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPSR]
+    = add_builtin_function ("__builtin_aarch64_get_fpsr", ftype_get_fpr,
+                            AARCH64_BUILTIN_GET_FPSR, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPSR]
+    = add_builtin_function ("__builtin_aarch64_set_fpsr", ftype_set_fpr,
+                            AARCH64_BUILTIN_SET_FPSR, BUILT_IN_MD, NULL, NULL_TREE);
+
   if (TARGET_SIMD)
     aarch64_init_simd_builtins ();
 }
@@ -964,6 +988,36 @@ aarch64_expand_builtin (tree exp,
 {
   tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
   int fcode = DECL_FUNCTION_CODE (fndecl);
+  int icode;
+  rtx pat, op0;
+  tree arg0;
+
+  switch (fcode)
+    {
+    case AARCH64_BUILTIN_GET_FPCR:
+    case AARCH64_BUILTIN_SET_FPCR:
+    case AARCH64_BUILTIN_GET_FPSR:
+    case AARCH64_BUILTIN_SET_FPSR:
+      if ((fcode == AARCH64_BUILTIN_GET_FPCR)
+          || (fcode == AARCH64_BUILTIN_GET_FPSR))
+        {
+          icode = (fcode == AARCH64_BUILTIN_GET_FPSR) ?
+            CODE_FOR_get_fpsr : CODE_FOR_get_fpcr;
+          target = gen_reg_rtx (SImode);
+          pat = GEN_FCN (icode) (target);
+        }
+      else
+        {
+          target = NULL_RTX;
+          icode = (fcode == AARCH64_BUILTIN_SET_FPSR) ?
+            CODE_FOR_set_fpsr : CODE_FOR_set_fpcr;
+          arg0 = CALL_EXPR_ARG (exp, 0);
+          op0 = expand_normal (arg0);
+          pat = GEN_FCN (icode) (op0);
+        }
+      emit_insn (pat);
+      return target;
+    }
 
   if (fcode >= AARCH64_SIMD_BUILTIN_BASE)
     return aarch64_simd_expand_builtin (fcode, exp, target);
@@ -1196,6 +1250,103 @@ aarch64_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   return changed;
 }
 
+void
+aarch64_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
+{
+  const unsigned FE_INVALID = 1;
+  const unsigned FE_DIVBYZERO = 2;
+  const unsigned FE_OVERFLOW = 4;
+  const unsigned FE_UNDERFLOW = 8;
+  const unsigned FE_INEXACT = 16;
+  const unsigned HOST_WIDE_INT FE_ALL_EXCEPT = (FE_INVALID | FE_DIVBYZERO
+                                                | FE_OVERFLOW | FE_UNDERFLOW
+                                                | FE_INEXACT);
+  const unsigned HOST_WIDE_INT FE_EXCEPT_SHIFT = 8;
+  tree fenv_cr, fenv_sr, get_fpcr, set_fpcr, mask_cr, mask_sr;
+  tree ld_fenv_cr, ld_fenv_sr, masked_fenv_cr, masked_fenv_sr, hold_fnclex_cr;
+  tree hold_fnclex_sr, tmp_var, reload_fenv, restore_fnenv, get_fpsr, set_fpsr;
+  tree update_call, atomic_feraiseexcept, hold_fnclex, masked_fenv, ld_fenv;
+
+  /* Generate the equivalent of:
+       unsigned int fenv_cr;
+       fenv_cr = __builtin_aarch64_get_fpcr ();
+
+       unsigned int fenv_sr;
+       fenv_sr = __builtin_aarch64_get_fpsr ();
+
+     Now set all exceptions to non-stop
+       unsigned int mask_cr = ~(FE_ALL_EXCEPT << FE_EXCEPT_SHIFT);
+       unsigned int masked_cr;
+       masked_cr = fenv_cr & mask_cr;
+
+     And clear all exception flags
+       unsigned int mask_sr = ~FE_ALL_EXCEPT;
+       unsigned int masked_sr;
+       masked_sr = fenv_sr & mask_sr;
+
+       __builtin_aarch64_set_fpcr (masked_cr);
+       __builtin_aarch64_set_fpsr (masked_sr);  */
+
+  fenv_cr = create_tmp_var (unsigned_type_node, NULL);
+  fenv_sr = create_tmp_var (unsigned_type_node, NULL);
+
+  get_fpcr = aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPCR];
+  set_fpcr = aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPCR];
+  get_fpsr = aarch64_builtin_decls[AARCH64_BUILTIN_GET_FPSR];
+  set_fpsr = aarch64_builtin_decls[AARCH64_BUILTIN_SET_FPSR];
+
+  mask_cr = build_int_cst (unsigned_type_node,
+                           ~(FE_ALL_EXCEPT << FE_EXCEPT_SHIFT));
+  mask_sr = build_int_cst (unsigned_type_node,
+                           ~(FE_ALL_EXCEPT));
+
+  ld_fenv_cr = build2 (MODIFY_EXPR, unsigned_type_node,
+                       fenv_cr, build_call_expr (get_fpcr, 0));
+  ld_fenv_sr = build2 (MODIFY_EXPR, unsigned_type_node,
+                       fenv_sr, build_call_expr (get_fpsr, 0));
+
+  masked_fenv_cr = build2 (BIT_AND_EXPR, unsigned_type_node, fenv_cr, mask_cr);
+  masked_fenv_sr = build2 (BIT_AND_EXPR, unsigned_type_node, fenv_sr, mask_sr);
+
+  hold_fnclex_cr = build_call_expr (set_fpcr, 1, masked_fenv_cr);
+  hold_fnclex_sr = build_call_expr (set_fpsr, 1, masked_fenv_sr);
+
+  hold_fnclex = build2 (COMPOUND_EXPR, void_type_node, hold_fnclex_cr,
+                        hold_fnclex_sr);
+  masked_fenv = build2 (COMPOUND_EXPR, void_type_node, masked_fenv_cr,
+                        masked_fenv_sr);
+  ld_fenv = build2 (COMPOUND_EXPR, void_type_node, ld_fenv_cr, ld_fenv_sr);
+
+  *hold = build2 (COMPOUND_EXPR, void_type_node,
+                  build2 (COMPOUND_EXPR, void_type_node, masked_fenv, ld_fenv),
+                  hold_fnclex);
+
+  /* Store the value of masked_fenv to clear the exceptions:
+       __builtin_aarch64_set_fpsr (masked_sr);  */
+
+  *clear = build_call_expr (set_fpsr, 1, masked_fenv_sr);
+
+  /* Generate the equivalent of:
+       unsigned int tmp_var;
+       tmp_var = __builtin_aarch64_get_fpsr ();
+
+       __builtin_aarch64_set_fpsr (fenv_sr);
+
+       __atomic_feraiseexcept (tmp_var);  */
+
+  tmp_var = create_tmp_var (unsigned_type_node, NULL);
+  reload_fenv = build2 (MODIFY_EXPR, unsigned_type_node,
+                        tmp_var, build_call_expr (get_fpsr, 0));
+  restore_fnenv = build_call_expr (set_fpsr, 1, fenv_sr);
+  atomic_feraiseexcept = builtin_decl_implicit (BUILT_IN_ATOMIC_FERAISEEXCEPT);
+  update_call = build_call_expr (atomic_feraiseexcept, 1,
+                                 fold_convert (integer_type_node, tmp_var));
+  *update = build2 (COMPOUND_EXPR, void_type_node,
+                    build2 (COMPOUND_EXPR, void_type_node,
+                            reload_fenv, restore_fnenv), update_call);
+}
+
+
 #undef AARCH64_CHECK_BUILTIN_MODE
 #undef AARCH64_FIND_FRINT_VARIANT
 #undef BUILTIN_DX
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 5542f02..f4f3f61 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -289,4 +289,5 @@ extern void aarch64_split_combinev16qi (rtx operands[3]);
 extern void aarch64_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel);
 extern bool
 aarch64_expand_vec_perm_const (rtx target, rtx op0, rtx op1, rtx sel);
+void aarch64_atomic_assign_expand_fenv (tree *, tree *, tree *);
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a3147ee..fbbdc23 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8488,6 +8488,10 @@ aarch64_cannot_change_mode_class (enum machine_mode from,
 #define TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES \
   aarch64_autovectorize_vector_sizes
 
+#undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV
+#define TARGET_ATOMIC_ASSIGN_EXPAND_FENV \
+  aarch64_atomic_assign_expand_fenv
+
 /* Section anchor support.  */
 
 #undef TARGET_MIN_ANCHOR_OFFSET
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index c86a29d..24f235f 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -107,6 +107,10 @@
 
 (define_c_enum "unspecv" [
     UNSPECV_EH_RETURN           ; Represent EH_RETURN
+    UNSPECV_GET_FPCR            ; Represent fetch of FPCR content.
+    UNSPECV_SET_FPCR            ; Represent assign of FPCR content.
+    UNSPECV_GET_FPSR            ; Represent fetch of FPSR content.
+    UNSPECV_SET_FPSR            ; Represent assign of FPSR content.
   ]
 )
 
@@ -3635,6 +3639,37 @@
   DONE;
 })
 
+;; Write Floating-point Control Register.
+(define_insn "set_fpcr"
+  [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] UNSPECV_SET_FPCR)]
+  ""
+  "msr\\tfpcr, %0"
+  [(set_attr "type" "mrs")])
+
+;; Read Floating-point Control Register.
+(define_insn "get_fpcr"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (unspec_volatile:SI [(const_int 0)] UNSPECV_GET_FPCR))]
+  ""
+  "mrs\\t%0, fpcr"
+  [(set_attr "type" "mrs")])
+
+;; Write Floating-point Status Register.
+(define_insn "set_fpsr"
+  [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] UNSPECV_SET_FPSR)]
+  ""
+  "msr\\tfpsr, %0"
+  [(set_attr "type" "mrs")])
+
+;; Read Floating-point Status Register.
+(define_insn "get_fpsr"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (unspec_volatile:SI [(const_int 0)] UNSPECV_GET_FPSR))]
+  ""
+  "mrs\\t%0, fpsr"
+  [(set_attr "type" "mrs")])
+
+
 ;; AdvSIMD Stuff
 (include "aarch64-simd.md")
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 347a94a..8bd13f3 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -9107,6 +9107,7 @@ to those machines. Generally these generate calls to specific machine
 instructions, but allow the compiler to schedule those calls.
 
 @menu
+* AARCH64 Built-in Functions::
 * Alpha Built-in Functions::
 * Altera Nios II Built-in Functions::
 * ARC Built-in Functions::
@@ -9139,6 +9140,18 @@ instructions, but allow the compiler to schedule those calls.
 * TILEPro Built-in Functions::
 @end menu
 
+@node AARCH64 Built-in Functions
+@subsection AARCH64 Built-in Functions
+
+These built-in functions are available for the AARCH64 family of
+processors.
+@smallexample
+unsigned int __builtin_aarch64_get_fpcr ()
+void __builtin_aarch64_set_fpcr (unsigned int)
+unsigned int __builtin_aarch64_get_fpsr ()
+void __builtin_aarch64_set_fpsr (unsigned int)
+@end smallexample
+
 @node Alpha Built-in Functions
 @subsection Alpha Built-in Functions
 
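
For readers following the hold/clear/update sequences that
aarch64_atomic_assign_expand_fenv builds, here is a rough, portable C
sketch of the same logic. It is not part of the patch. The sim_* names,
the two static variables standing in for FPCR and FPSR, and the
flag-setting stand-in for __atomic_feraiseexcept are all hypothetical,
chosen so the sequence can be run on any host rather than only on
AArch64.

```c
#include <assert.h>

/* Exception flag bits as used in the patch (bits 0..4 of FPSR); the
   matching trap-enable bits sit SIM_FE_EXCEPT_SHIFT higher in FPCR.  */
#define SIM_FE_ALL_EXCEPT   0x1fu
#define SIM_FE_EXCEPT_SHIFT 8

/* Hypothetical stand-ins for the registers and the four builtins.  */
static unsigned int sim_fpcr, sim_fpsr;
static unsigned int sim_get_fpcr (void) { return sim_fpcr; }
static void sim_set_fpcr (unsigned int v) { sim_fpcr = v; }
static unsigned int sim_get_fpsr (void) { return sim_fpsr; }
static void sim_set_fpsr (unsigned int v) { sim_fpsr = v; }

/* Stand-in for __atomic_feraiseexcept: here it only sets flag bits.  */
static void sim_feraiseexcept (unsigned int excepts)
{
  sim_fpsr |= excepts & SIM_FE_ALL_EXCEPT;
}

struct sim_fenv { unsigned int fenv_cr, fenv_sr, masked_cr, masked_sr; };

/* "hold": save both registers, disable traps, clear the flags.  */
static void sim_hold (struct sim_fenv *e)
{
  e->fenv_cr = sim_get_fpcr ();
  e->fenv_sr = sim_get_fpsr ();
  e->masked_cr = e->fenv_cr & ~(SIM_FE_ALL_EXCEPT << SIM_FE_EXCEPT_SHIFT);
  e->masked_sr = e->fenv_sr & ~SIM_FE_ALL_EXCEPT;
  sim_set_fpcr (e->masked_cr);
  sim_set_fpsr (e->masked_sr);
}

/* "clear": drop flags raised by a failed compare-and-exchange attempt.  */
static void sim_clear (const struct sim_fenv *e)
{
  sim_set_fpsr (e->masked_sr);
}

/* "update": restore the saved FPSR, then re-raise what the operation set.  */
static void sim_update (const struct sim_fenv *e)
{
  unsigned int tmp_var = sim_get_fpsr ();
  sim_set_fpsr (e->fenv_sr);
  sim_feraiseexcept (tmp_var);
}

/* Walk through one sequence and return the final simulated FPSR.  */
static unsigned int sim_demo (void)
{
  struct sim_fenv e;
  sim_fpcr = 0x1f00u;            /* all traps enabled */
  sim_fpsr = 0x01u;              /* an earlier invalid-operation flag */
  sim_hold (&e);
  assert (sim_fpcr == 0 && sim_fpsr == 0);
  sim_fpsr |= 0x10u;             /* the arithmetic raises inexact */
  sim_update (&e);
  return sim_fpsr;
}
```

Calling sim_demo () from a test program exercises the whole sequence.
As in the patch, update restores only the status register; the control
register is left as hold set it.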