From patchwork Tue Nov 7 04:10:03 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kugan Vivekanandarajah X-Patchwork-Id: 118101 Delivered-To: patch@linaro.org Received: by 10.140.22.164 with SMTP id 33csp3516103qgn; Mon, 6 Nov 2017 20:10:30 -0800 (PST) X-Google-Smtp-Source: ABhQp+TzE+aipHI1V4o5BJkaoVw0llPXibJ3dmc9KYr5kZLZvDs+taQ6v3IOH9rB0dknxJW9LT+Q X-Received: by 10.98.29.211 with SMTP id d202mr18827317pfd.49.1510027830469; Mon, 06 Nov 2017 20:10:30 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1510027830; cv=none; d=google.com; s=arc-20160816; b=xGOExUEWKkO+NqIe5mnVAdw/qVp+UsSHt1G2noJgJK9RlCaTY9wwE7eBN7mBEGw8rg RiQnHXbQNi62hbyu3qXqfDAYXkB7PZ+NcYJu6eIpYJt8VHTO8tkQVWbKOx/PdF5wG6L/ 7S1DoaFr2VCRVk0yIEcXYaNieTB08ajfFgsIYefHVM/Xb6772zuDkwlIJqdVlXdFnmyT djqN15excPl9EDypqJhUYo7T+8hBlktQq8feyvGWBoYQPIrgW8Poks6UEk4SHF8DSbPO ebKVzmURVjAUK5IeX0Gw+WE+vV5nVrPn8EdHJQ/nGFbtyz79OLJwtHXxrAy9eqhNk0Zr 2THA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=to:subject:message-id:date:from:mime-version:delivered-to:sender :list-help:list-post:list-archive:list-unsubscribe:list-id :precedence:mailing-list:dkim-signature:domainkey-signature :arc-authentication-results; bh=jeLHZMgn4k2dORmQhMNo0fr5jgqVH8CFrD6mQy0lkMg=; b=cN9S5dMvc6EPjFfIm2w1K7u+IbGUaRQNp16Sc+c1NDsyuOfuJRWvWYi+RQlOtXGCvi SSNPnJHGCFT2VJsoHBt+i9X6bbd6vGF0UJ22+1e8XwCPUoeoWOl1Hw3cKfrW1Bvf+Hbz h8vz2pkmpUVr2z7J9TtGn0Oc72c5NmudeMqxz8xb3IY3KtiRSib3u6DQYhEnvtK/jQY6 jw7iisyGJbG2bdmW9INIANdFBjHn/P76EzBsLQp3XljMkkG1hb08hyEtJ+Uw/+BCJpGp l2HS+p/NwR/O9jLlEYtXyLjJdrIBx8yirvFqo5YS9E2v90lUivTR/ay2xSXpxWZROq4U Umpw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=SpaM+YuL; spf=pass (google.com: domain of gcc-patches-return-466093-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-466093-patch=linaro.org@gcc.gnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id 1si280551pli.120.2017.11.06.20.10.30 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 06 Nov 2017 20:10:30 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-return-466093-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=SpaM+YuL; spf=pass (google.com: domain of gcc-patches-return-466093-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gcc-patches-return-466093-patch=linaro.org@gcc.gnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:from:date:message-id:subject:to:content-type; q= dns; s=default; b=haeaTBcYO5maqwyLDJjbCKtqkZAgQcgmByrft/S3OyuajA xpwYie7djssR2+RypCPmbDIuUOwOH7Z39nOj74yo+ZG7dZZ729oz8EvMCTNJQLf8 RZ048LsWUw8bvhJuwndqIFk3iRbD1YtzvUJhCTVkQsLZKH4PIDdl9ISh+ZsMY= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:from:date:message-id:subject:to:content-type; s= default; bh=1wJEjdbs3ZgzdzlykIU3LVzByOg=; b=SpaM+YuLdYilX7e50r8M R+JyVAjfT5ThKIpUI3yo4UKHBADkKbg1Z3jPlHqdxf7Sg7MlffYryTDF5tr8/lmS gxxfxdTcO0J8cdOHwRoPdWaCDx6+dI+BptyENVGt3VlLvRQNcPJEAHRolmmQSbbb aHAoW8OdCGC8Agxa6+R/JZc= Received: (qmail 53244 invoked by alias); 7 Nov 2017 04:10:10 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 53203 invoked by uid 89); 7 Nov 2017 04:10:09 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.0 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy=vld1, vst1, 5674, 5634 X-HELO: mail-qt0-f173.google.com Received: from mail-qt0-f173.google.com (HELO mail-qt0-f173.google.com) (209.85.216.173) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 07 Nov 2017 04:10:06 +0000 Received: by mail-qt0-f173.google.com with SMTP id f8so13732295qta.5 for ; Mon, 06 Nov 2017 20:10:06 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=BU/Li75YYrHngXskNnMUbUJB7u9sbr6G0QL17Pg3fAI=; b=Eq1DzbyZo29X/kLdLGkLYOlkbHQ6MWekUXmUJrAk1ajHko/y7dLKjFm0OUj0tMclqo elgYqqxFwwCIvUyo0EkUuAZ8l+MrnMoqKUaVx+hj/yNGLNnWT8WoQPizH5ljIUfy/Lfl 5hcwV8Ggnxb1uPrOeqStQp0dqgOJioNhBlZSGRKuwCWWdJLQBgIamPvFrdvrpvKos+OW /8bWZ4eIy+kXjb5kcmcZZ50Aw6NqRy0ssExgLCBCpz+pwDDJ8r/EIgF/J/8IKxCRmHJ9 NvZoKNhzhHMFeTQisXNm+ktEJmK39k/ZMC/AfhoONqMv7FjXdTU5wS8PRQaI5jssnk0d vawg== X-Gm-Message-State: AJaThX5hNMTFyVPvzYdRwoXTUZO2WR8vjxbRzgcI2TlieyLk4OKqQo+O wAFROZjSQFYEdINYkVFyeYnbA6uU7vjcqWKbGxGEM61dD20= X-Received: by 10.200.48.241 with SMTP id w46mr26010195qta.172.1510027804362; Mon, 06 Nov 2017 20:10:04 -0800 (PST) MIME-Version: 1.0 Received: by 10.200.36.59 with HTTP; Mon, 6 Nov 2017 20:10:03 -0800 (PST) From: Kugan Vivekanandarajah Date: Tue, 7 Nov 2017 15:10:03 +1100 Message-ID: Subject: [AARCH64] implements neon vld1_*_x2 intrinsics To: "gcc-patches@gcc.gnu.org" , James Greenhalgh , Richard Earnshaw X-IsSubscribed: yes Hi, Attached patch implements the vld1_*_x2 intrinsics as defined by the neon document. Bootstrap for the latest patch is ongoing on aarch64-linux-gnu. Is this OK for trunk if no regressions? Thanks, Kugan gcc/ChangeLog: 2017-11-06 Kugan Vivekanandarajah * config/aarch64/aarch64-simd.md (aarch64_ld1x2): New. (aarch64_ld1x2): Likewise. (aarch64_simd_ld1_x2): Likewise. (aarch64_simd_ld1_x2): Likewise. * config/aarch64/arm_neon.h (vld1_u8_x2): New. (vld1_s8_x2): Likewise. (vld1_u16_x2): Likewise. (vld1_s16_x2): Likewise. (vld1_u32_x2): Likewise. (vld1_s32_x2): Likewise. (vld1_u64_x2): Likewise. (vld1_s64_x2): Likewise. (vld1_f16_x2): Likewise. (vld1_f32_x2): Likewise. (vld1_f64_x2): Likewise. (vld1_p8_x2): Likewise. (vld1_p16_x2): Likewise. (vld1_p64_x2): Likewise. (vld1q_u8_x2): Likewise. (vld1q_s8_x2): Likewise. (vld1q_u16_x2): Likewise. (vld1q_s16_x2): Likewise. (vld1q_u32_x2): Likewise. (vld1q_s32_x2): Likewise. (vld1q_u64_x2): Likewise. (vld1q_s64_x2): Likewise. (vld1q_f16_x2): Likewise. (vld1q_f32_x2): Likewise. (vld1q_f64_x2): Likewise. (vld1q_p8_x2): Likewise. (vld1q_p16_x2): Likewise. (vld1q_p64_x2): Likewise. gcc/testsuite/ChangeLog: 2017-11-06 Kugan Vivekanandarajah * gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: New test. >From dfdd8eba9fb49a776cdf8d82c0e34db0fb30d1b5 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah Date: Sat, 30 Sep 2017 04:51:08 +1000 Subject: [PATCH] add missing ld1 x2 builtins --- gcc/config/aarch64/aarch64-simd-builtins.def | 6 +- gcc/config/aarch64/aarch64-simd.md | 48 +++ gcc/config/aarch64/arm_neon.h | 336 +++++++++++++++++++++ .../gcc.target/aarch64/advsimd-intrinsics/vld1x2.c | 71 +++++ 4 files changed, 460 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index d713d5d..90736ba 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -86,6 +86,10 @@ VAR1 (SETREGP, set_qregoi, 0, v2di) VAR1 (SETREGP, set_qregci, 0, v2di) VAR1 (SETREGP, set_qregxi, 0, v2di) + /* Implemented by aarch64_ld1x2. */ + BUILTIN_VQ (LOADSTRUCT, ld1x2, 0) + /* Implemented by aarch64_ld1x2. */ + BUILTIN_VDC (LOADSTRUCT, ld1x2, 0) /* Implemented by aarch64_ld. */ BUILTIN_VDC (LOADSTRUCT, ld2, 0) BUILTIN_VDC (LOADSTRUCT, ld3, 0) @@ -563,4 +567,4 @@ BUILTIN_GPI (UNOP, fix_truncdf, 2) BUILTIN_GPI_I16 (UNOPUS, fixuns_trunchf, 2) BUILTIN_GPI (UNOPUS, fixuns_truncsf, 2) - BUILTIN_GPI (UNOPUS, fixuns_truncdf, 2) \ No newline at end of file + BUILTIN_GPI (UNOPUS, fixuns_truncdf, 2) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 70e9339..a7ed594 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -5071,6 +5071,33 @@ DONE; }) +(define_expand "aarch64_ld1x2" + [(match_operand:OI 0 "register_operand" "=w") + (match_operand:DI 1 "register_operand" "r") + (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY)] + "TARGET_SIMD" +{ + machine_mode mode = OImode; + rtx mem = gen_rtx_MEM (mode, operands[1]); + + emit_insn (gen_aarch64_simd_ld1_x2 (operands[0], mem)); + DONE; +}) + +(define_expand "aarch64_ld1x2" + [(match_operand:OI 0 "register_operand" "=w") + (match_operand:DI 1 "register_operand" "r") + (unspec:VDC [(const_int 0)] UNSPEC_VSTRUCTDUMMY)] + "TARGET_SIMD" +{ + machine_mode mode = OImode; + rtx mem = gen_rtx_MEM (mode, operands[1]); + + emit_insn (gen_aarch64_simd_ld1_x2 (operands[0], mem)); + DONE; +}) + + (define_expand "aarch64_ld_lane" [(match_operand:VSTRUCT 0 "register_operand" "=w") (match_operand:DI 1 "register_operand" "w") @@ -5458,6 +5485,27 @@ [(set_attr "type" "neon_load1_all_lanes")] ) +(define_insn "aarch64_simd_ld1_x2" + [(set (match_operand:OI 0 "register_operand" "=w") + (unspec:OI [(match_operand:OI 1 "aarch64_simd_struct_operand" "Utv") + (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY)] + UNSPEC_LD1))] + "TARGET_SIMD" + "ld1\\t{%S0. - %T0.}, %1" + [(set_attr "type" "neon_load1_2reg")] +) + +(define_insn "aarch64_simd_ld1_x2" + [(set (match_operand:OI 0 "register_operand" "=w") + (unspec:OI [(match_operand:OI 1 "aarch64_simd_struct_operand" "Utv") + (unspec:VDC [(const_int 0)] UNSPEC_VSTRUCTDUMMY)] + UNSPEC_LD1))] + "TARGET_SIMD" + "ld1\\t{%S0. - %T0.}, %1" + [(set_attr "type" "neon_load1_2reg")] +) + + (define_insn "aarch64_frecpe" [(set (match_operand:VHSDF 0 "register_operand" "=w") (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand" "w")] diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index d7b30b0..0f49cfd 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -17228,6 +17228,342 @@ vld1q_u8 (const uint8_t *a) __builtin_aarch64_ld1v16qi ((const __builtin_aarch64_simd_qi *) a); } +__extension__ extern __inline uint8x8x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1_u8_x2 (const uint8_t *__a) +{ + uint8x8x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v8qi ((const __builtin_aarch64_simd_qi *) __a); + ret.val[0] = (uint8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 0); + ret.val[1] = (uint8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 1); + return ret; +} + +__extension__ extern __inline int8x8x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1_s8_x2 (const uint8_t *__a) +{ + int8x8x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v8qi ((const __builtin_aarch64_simd_qi *) __a); + ret.val[0] = (int8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 0); + ret.val[1] = (int8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 1); + return ret; +} + +__extension__ extern __inline uint16x4x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1_u16_x2 (const uint16_t *__a) +{ + uint16x4x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v4hi ((const __builtin_aarch64_simd_hi *) __a); + ret.val[0] = (uint16x4_t) __builtin_aarch64_get_dregoiv4hi (__o, 0); + ret.val[1] = (uint16x4_t) __builtin_aarch64_get_dregoiv4hi (__o, 1); + return ret; +} + +__extension__ extern __inline int16x4x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1_s16_x2 (const int16_t *__a) +{ + int16x4x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v4hi ((const __builtin_aarch64_simd_hi *) __a); + ret.val[0] = (int16x4_t) __builtin_aarch64_get_dregoiv4hi (__o, 0); + ret.val[1] = (int16x4_t) __builtin_aarch64_get_dregoiv4hi (__o, 1); + return ret; +} + +__extension__ extern __inline uint32x2x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1_u32_x2 (const uint32_t *__a) +{ + uint32x2x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v2si ((const __builtin_aarch64_simd_si *) __a); + ret.val[0] = (uint32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 0); + ret.val[1] = (uint32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 1); + return ret; +} + +__extension__ extern __inline int32x2x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1_s32_x2 (const uint32_t *__a) +{ + int32x2x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v2si ((const __builtin_aarch64_simd_si *) __a); + ret.val[0] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 0); + ret.val[1] = (int32x2_t) __builtin_aarch64_get_dregoiv2si (__o, 1); + return ret; +} + +__extension__ extern __inline uint64x1x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1_u64_x2 (const uint64_t *__a) +{ + uint64x1x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2di ((const __builtin_aarch64_simd_di *) __a); + ret.val[0] = (uint64x1_t) __builtin_aarch64_get_dregoidi (__o, 0); + ret.val[1] = (uint64x1_t) __builtin_aarch64_get_dregoidi (__o, 1); + return ret; +} + +__extension__ extern __inline int64x1x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1_s64_x2 (const int64_t *__a) +{ + int64x1x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2di ((const __builtin_aarch64_simd_di *) __a); + ret.val[0] = (int64x1_t) __builtin_aarch64_get_dregoidi (__o, 0); + ret.val[1] = (int64x1_t) __builtin_aarch64_get_dregoidi (__o, 1); + return ret; +} + +__extension__ extern __inline float16x4x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1_f16_x2 (const float16_t *__a) +{ + float16x4x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v4hf ((const __builtin_aarch64_simd_hf *) __a); + ret.val[0] = (float16x4_t) __builtin_aarch64_get_dregoiv4hf (__o, 0); + ret.val[1] = (float16x4_t) __builtin_aarch64_get_dregoiv4hf (__o, 1); + return ret; +} + +__extension__ extern __inline float32x2x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1_f32_x2 (const float32_t *__a) +{ + float32x2x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v2sf ((const __builtin_aarch64_simd_sf *) __a); + ret.val[0] = (float32x2_t) __builtin_aarch64_get_dregoiv2sf (__o, 0); + ret.val[1] = (float32x2_t) __builtin_aarch64_get_dregoiv2sf (__o, 1); + return ret; +} + +__extension__ extern __inline float64x1x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1_f64_x2 (const float64_t *__a) +{ + float64x1x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2df ((const __builtin_aarch64_simd_df *) __a); + ret.val[0] = (float64x1_t) {__builtin_aarch64_get_dregoidf (__o, 0)}; + ret.val[1] = (float64x1_t) {__builtin_aarch64_get_dregoidf (__o, 1)}; + return ret; +} + +__extension__ extern __inline poly8x8x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1_p8_x2 (const poly8_t *__a) +{ + poly8x8x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v8qi ((const __builtin_aarch64_simd_qi *) __a); + ret.val[0] = (poly8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 0); + ret.val[1] = (poly8x8_t) __builtin_aarch64_get_dregoiv8qi (__o, 1); + return ret; +} + +__extension__ extern __inline poly16x4x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1_p16_x2 (const poly16_t *__a) +{ + poly16x4x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v4hi ((const __builtin_aarch64_simd_hi *) __a); + ret.val[0] = (poly16x4_t) __builtin_aarch64_get_dregoiv4hi (__o, 0); + ret.val[1] = (poly16x4_t) __builtin_aarch64_get_dregoiv4hi (__o, 1); + return ret; +} + +__extension__ extern __inline poly64x1x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1_p64_x2 (const poly64_t *__a) +{ + poly64x1x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2di ((const __builtin_aarch64_simd_di *) __a); + ret.val[0] = (poly64x1_t) __builtin_aarch64_get_dregoidi (__o, 0); + ret.val[1] = (poly64x1_t) __builtin_aarch64_get_dregoidi (__o, 1); + return ret; +} + +__extension__ extern __inline uint8x16x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1q_u8_x2 (const uint8_t *__a) +{ + uint8x16x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v16qi ((const __builtin_aarch64_simd_qi *) __a); + ret.val[0] = (uint8x16_t) __builtin_aarch64_get_qregoiv16qi (__o, 0); + ret.val[1] = (uint8x16_t) __builtin_aarch64_get_qregoiv16qi (__o, 1); + return ret; +} + +__extension__ extern __inline int8x16x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1q_s8_x2 (const int8_t *__a) +{ + int8x16x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v16qi ((const __builtin_aarch64_simd_qi *) __a); + ret.val[0] = (int8x16_t) __builtin_aarch64_get_qregoiv16qi (__o, 0); + ret.val[1] = (int8x16_t) __builtin_aarch64_get_qregoiv16qi (__o, 1); + return ret; +} + +__extension__ extern __inline uint16x8x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1q_u16_x2 (const uint16_t *__a) +{ + uint16x8x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v8hi ((const __builtin_aarch64_simd_hi *) __a); + ret.val[0] = (uint16x8_t) __builtin_aarch64_get_qregoiv8hi (__o, 0); + ret.val[1] = (uint16x8_t) __builtin_aarch64_get_qregoiv8hi (__o, 1); + return ret; +} + +__extension__ extern __inline int16x8x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1q_s16_x2 (const int16_t *__a) +{ + int16x8x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v8hi ((const __builtin_aarch64_simd_hi *) __a); + ret.val[0] = (int16x8_t) __builtin_aarch64_get_qregoiv8hi (__o, 0); + ret.val[1] = (int16x8_t) __builtin_aarch64_get_qregoiv8hi (__o, 1); + return ret; +} + +__extension__ extern __inline uint32x4x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1q_u32_x2 (const uint32_t *__a) +{ + uint32x4x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v4si ((const __builtin_aarch64_simd_si *) __a); + ret.val[0] = (uint32x4_t) __builtin_aarch64_get_qregoiv4si (__o, 0); + ret.val[1] = (uint32x4_t) __builtin_aarch64_get_qregoiv4si (__o, 1); + return ret; +} + +__extension__ extern __inline int32x4x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1q_s32_x2 (const int32_t *__a) +{ + int32x4x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v4si ((const __builtin_aarch64_simd_si *) __a); + ret.val[0] = (int32x4_t) __builtin_aarch64_get_qregoiv4si (__o, 0); + ret.val[1] = (int32x4_t) __builtin_aarch64_get_qregoiv4si (__o, 1); + return ret; +} + +__extension__ extern __inline uint64x2x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1q_u64_x2 (const uint64_t *__a) +{ + uint64x2x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v2di ((const __builtin_aarch64_simd_di *) __a); + ret.val[0] = (uint64x2_t) __builtin_aarch64_get_qregoiv2di (__o, 0); + ret.val[1] = (uint64x2_t) __builtin_aarch64_get_qregoiv2di (__o, 1); + return ret; +} + +__extension__ extern __inline int64x2x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1q_s64_x2 (const int64_t *__a) +{ + int64x2x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v2di ((const __builtin_aarch64_simd_di *) __a); + ret.val[0] = (int64x2_t) __builtin_aarch64_get_qregoiv2di (__o, 0); + ret.val[1] = (int64x2_t) __builtin_aarch64_get_qregoiv2di (__o, 1); + return ret; +} + +__extension__ extern __inline float16x8x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1q_f16_x2 (const float16_t *__a) +{ + float16x8x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v8hf ((const __builtin_aarch64_simd_hf *) __a); + ret.val[0] = (float16x8_t) __builtin_aarch64_get_qregoiv8hf (__o, 0); + ret.val[1] = (float16x8_t) __builtin_aarch64_get_qregoiv8hf (__o, 1); + return ret; +} + +__extension__ extern __inline float32x4x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1q_f32_x2 (const float32_t *__a) +{ + float32x4x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v4sf ((const __builtin_aarch64_simd_sf *) __a); + ret.val[0] = (float32x4_t) __builtin_aarch64_get_qregoiv4sf (__o, 0); + ret.val[1] = (float32x4_t) __builtin_aarch64_get_qregoiv4sf (__o, 1); + return ret; +} + +__extension__ extern __inline float64x2x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1q_f64_x2 (const float64_t *__a) +{ + float64x2x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v2df ((const __builtin_aarch64_simd_df *) __a); + ret.val[0] = (float64x2_t) __builtin_aarch64_get_qregoiv2df (__o, 0); + ret.val[1] = (float64x2_t) __builtin_aarch64_get_qregoiv2df (__o, 1); + return ret; +} + +__extension__ extern __inline poly8x16x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1q_p8_x2 (const poly8_t *__a) +{ + poly8x16x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v16qi ((const __builtin_aarch64_simd_hi *) __a); + ret.val[0] = (poly8x16_t) __builtin_aarch64_get_qregoiv16qi (__o, 0); + ret.val[1] = (poly8x16_t) __builtin_aarch64_get_qregoiv16qi (__o, 1); + return ret; +} + +__extension__ extern __inline poly16x8x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1q_p16_x2 (const poly16_t *__a) +{ + poly16x8x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v8hi ((const __builtin_aarch64_simd_hi *) __a); + ret.val[0] = (poly16x8_t) __builtin_aarch64_get_qregoiv8hi (__o, 0); + ret.val[1] = (poly16x8_t) __builtin_aarch64_get_qregoiv8hi (__o, 1); + return ret; +} + +__extension__ extern __inline poly64x2x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vld1q_p64_x2 (const poly64_t *__a) +{ + poly64x2x2_t ret; + __builtin_aarch64_simd_oi __o; + __o = __builtin_aarch64_ld1x2v2di ((const __builtin_aarch64_simd_di *) __a); + ret.val[0] = (poly64x2_t) __builtin_aarch64_get_qregoiv2di (__o, 0); + ret.val[1] = (poly64x2_t) __builtin_aarch64_get_qregoiv2di (__o, 1); + return ret; +} + __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vld1q_u16 (const uint16_t *a) diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c new file mode 100644 index 0000000..0a43d0d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c @@ -0,0 +1,71 @@ +/* { dg-do run } */ +/* { dg-options "-O3" } */ + +#include + +extern void abort (void); + +#define TESTMETH(BASE, ELTS, SUFFIX) \ +int __attribute__ ((noinline)) \ +test_vld##SUFFIX##_x2 () \ +{ \ + BASE##_t data[ELTS * 2]; \ + BASE##_t temp[ELTS * 2]; \ + BASE##x##ELTS##x##2##_t vectors; \ + int i,j; \ + for (i = 0; i < ELTS * 2; i++) \ + data [i] = (BASE##_t) 2*i + 1; \ + asm volatile ("" : : : "memory"); \ + vectors = vld1##SUFFIX##_x2 (data); \ + vst1##SUFFIX (temp, vectors.val[0]); \ + vst1##SUFFIX (&temp[ELTS], vectors.val[1]); \ + asm volatile ("" : : : "memory"); \ + for (j = 0; j < ELTS * 2; j++) \ + if (temp[j] != data[j]) \ + return 1; \ + return 0; \ +} + +#define VARIANTS(VARIANT) \ +VARIANT (uint8, 8, _u8) \ +VARIANT (uint16, 4, _u16) \ +VARIANT (uint32, 2, _u32) \ +VARIANT (uint64, 1, _u64) \ +VARIANT (int8, 8, _s8) \ +VARIANT (int16, 4, _s16) \ +VARIANT (int32, 2, _s32) \ +VARIANT (int64, 1, _s64) \ +VARIANT (poly8, 8, _p8) \ +VARIANT (poly16, 4, _p16) \ +VARIANT (float16, 4, _f16) \ +VARIANT (float32, 2, _f32) \ +VARIANT (float64, 1, _f64) \ +VARIANT (uint8, 16, q_u8) \ +VARIANT (uint16, 8, q_u16) \ +VARIANT (uint32, 4, q_u32) \ +VARIANT (uint64, 2, q_u64) \ +VARIANT (int8, 16, q_s8) \ +VARIANT (int16, 8, q_s16) \ +VARIANT (int32, 4, q_s32) \ +VARIANT (int64, 2, q_s64) \ +VARIANT (poly8, 16, q_p8) \ +VARIANT (poly16, 8, q_p16) \ +VARIANT (float16, 8, q_f16) \ +VARIANT (float32, 4, q_f32) \ +VARIANT (float64, 2, q_f64) + +/* Tests of vld1_x2 and vld1q_x2. */ +VARIANTS (TESTMETH) + +#define CHECK(BASE, ELTS, SUFFIX) \ + if (test_vld##SUFFIX##_x2 () != 0) \ + abort (); + +int +main (int argc, char **argv) +{ + VARIANTS (CHECK) + + return 0; +} + -- 2.7.4