diff mbox

[AArch64,ARM,PATCHv2,3/3] Add tests for missing Poly64_t intrinsics to GCC

Message ID VI1PR0801MB20314851CA0B552AE1AB5015FF8D0@VI1PR0801MB2031.eurprd08.prod.outlook.com
State New
Headers show

Commit Message

Tamar Christina Nov. 29, 2016, 9:50 a.m. UTC
Hi All,

The new patch contains the proper types for the intrinsics that should be returning uint64x1
and has the rest of the comments by Christophe in them.

Kind Regards,
Tamar

________________________________________
From: Tamar Christina

Sent: Friday, November 25, 2016 4:01:30 PM
To: Christophe Lyon
Cc: GCC Patches; christophe.lyon@st.com; Marcus Shawcroft; Richard Earnshaw; James Greenhalgh; Kyrylo Tkachov; nd
Subject: RE: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing Poly64_t intrinsics to GCC

 >

> > A few comments about this new version:

> > * arm-neon-ref.h: why do you create

> CHECK_RESULTS_NAMED_NO_FP16_NO_POLY64?

> > Can't you just add calls to CHECK_CRYPTO in the existing

> > CHECK_RESULTS_NAMED_NO_FP16?


Yes, that should be fine, I didn't used to have CHECK_CRYPTO before and when I added it
I didn't remove the split. I'll do it now.

> >

> > * p64_p128:

> > From what I can see ARM and AArch64 differ on the vceq variants

> > available with poly64.

> > For ARM, arm_neon.h contains: uint64x1_t vceq_p64 (poly64x1_t __a,

> > poly64x1_t __b) For AArch64, I can't see vceq_p64 in arm_neon.h? ...

> > Actually I've just noticed the other you submitted while I was writing

> > this, where you add vceq_p64 for aarch64, but it still returns

> > uint64_t.

> > Why do you change the vceq_64 test to return poly64_t instead of

> uint64_t?


This patch is slightly outdated. The correct type is `uint64_t` but when it was noticed
This patch was already sent. New one coming soon.

> >

> > Why do you add #ifdef __aarch64 before vldX_p64 tests and until vsli_p64?

> >


This is wrong, remove them. It was supposed to be around the vldX_lane_p64 tests.

> > The comment /* vget_lane_p64 tests.  */ is wrong before VLDX_LANE

> > tests

> >

> > You need to protect the new vmov, vget_high and vget_lane tests with

> > #ifdef __aarch64__.

> >


vget_lane is already in an #ifdef, vmov you're right, but I also notice that the
test calls VDUP instead of VMOV, which explains why I didn't get a test failure.

Thanks for the feedback,
I'll get these updated.

>

> Actually, vget_high_p64 exists on arm, so no need for the #fidef for it.

>

>

> > Christophe

> >

> >> Kind regards,

> >> Tamar

> >> ________________________________________

> >> From: Tamar Christina

> >> Sent: Tuesday, November 8, 2016 11:58:46 AM

> >> To: Christophe Lyon

> >> Cc: GCC Patches; christophe.lyon@st.com; Marcus Shawcroft; Richard

> >> Earnshaw; James Greenhalgh; Kyrylo Tkachov; nd

> >> Subject: RE: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing

> >> Poly64_t intrinsics to GCC

> >>

> >> Hi Christophe,

> >>

> >> Thanks for the review!

> >>

> >>>

> >>> A while ago I added p64_p128.c, to contain all the poly64/128 tests

> >>> except for vreinterpret.

> >>> Why do you need to create p64.c ?

> >>

> >> I originally created it because I had a much smaller set of

> >> intrinsics that I wanted to add initially, this grew and It hadn't occurred to

> me that I can use the existing file now.

> >>

> >> Another reason was the effective-target arm_crypto_ok as you

> mentioned below.

> >>

> >>>

> >>> Similarly, adding tests for vcreate_p64 etc... in p64.c or

> >>> p64_p128.c might be easier to maintain than adding them to vcreate.c

> >>> etc with several #ifdef conditions.

> >>

> >> Fair enough, I'll move them to p64_p128.c.

> >>

> >>> For vdup-vmod.c, why do you add the "&& defined(__aarch64__)"

> >>> condition? These intrinsics are defined in arm/arm_neon.h, right?

> >>> They are tested in p64_p128.c

> >>

> >> I should have looked for them, they weren't being tested before so I

> >> had Mistakenly assumed that they weren't available. Now I realize I

> >> just need To add the proper test option to the file to enable crypto. I'll

> update this as well.

> >>

> >>> Looking at your patch, it seems some tests are currently missing for arm:

> >>> vget_high_p64. I'm not sure why I missed it when I removed neont-

> >>> testgen...

> >>

> >> I'll adjust the test conditions so they run for ARM as well.

> >>

> >>>

> >>> Regarding vreinterpret_p128.c, doesn't the existing effective-target

> >>> arm_crypto_ok prevent the tests from running on aarch64?

> >>

> >> Yes they do, I was comparing the output against a clean version and

> >> hasn't noticed That they weren't running. Thanks!

> >>

> >>>

> >>> Thanks,

> >>>

> >>> Christophe

Comments

Christophe Lyon Nov. 29, 2016, 10:12 a.m. UTC | #1
Hi Tamar,


On 29 November 2016 at 10:50, Tamar Christina <Tamar.Christina@arm.com> wrote:
> Hi All,

>

> The new patch contains the proper types for the intrinsics that should be returning uint64x1

> and has the rest of the comments by Christophe in them.

>


LGTM.

One more question: maybe we want to add explicit tests for vdup*_v_p64
even though they are aliases for vmov?

Christophe

> Kind Regards,

> Tamar

>

> ________________________________________

> From: Tamar Christina

> Sent: Friday, November 25, 2016 4:01:30 PM

> To: Christophe Lyon

> Cc: GCC Patches; christophe.lyon@st.com; Marcus Shawcroft; Richard Earnshaw; James Greenhalgh; Kyrylo Tkachov; nd

> Subject: RE: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing Poly64_t intrinsics to GCC

>

>  >

>> > A few comments about this new version:

>> > * arm-neon-ref.h: why do you create

>> CHECK_RESULTS_NAMED_NO_FP16_NO_POLY64?

>> > Can't you just add calls to CHECK_CRYPTO in the existing

>> > CHECK_RESULTS_NAMED_NO_FP16?

>

> Yes, that should be fine, I didn't used to have CHECK_CRYPTO before and when I added it

> I didn't remove the split. I'll do it now.

>

>> >

>> > * p64_p128:

>> > From what I can see ARM and AArch64 differ on the vceq variants

>> > available with poly64.

>> > For ARM, arm_neon.h contains: uint64x1_t vceq_p64 (poly64x1_t __a,

>> > poly64x1_t __b) For AArch64, I can't see vceq_p64 in arm_neon.h? ...

>> > Actually I've just noticed the other you submitted while I was writing

>> > this, where you add vceq_p64 for aarch64, but it still returns

>> > uint64_t.

>> > Why do you change the vceq_64 test to return poly64_t instead of

>> uint64_t?

>

> This patch is slightly outdated. The correct type is `uint64_t` but when it was noticed

> This patch was already sent. New one coming soon.

>

>> >

>> > Why do you add #ifdef __aarch64 before vldX_p64 tests and until vsli_p64?

>> >

>

> This is wrong, remove them. It was supposed to be around the vldX_lane_p64 tests.

>

>> > The comment /* vget_lane_p64 tests.  */ is wrong before VLDX_LANE

>> > tests

>> >

>> > You need to protect the new vmov, vget_high and vget_lane tests with

>> > #ifdef __aarch64__.

>> >

>

> vget_lane is already in an #ifdef, vmov you're right, but I also notice that the

> test calls VDUP instead of VMOV, which explains why I didn't get a test failure.

>

> Thanks for the feedback,

> I'll get these updated.

>

>>

>> Actually, vget_high_p64 exists on arm, so no need for the #fidef for it.

>>

>>

>> > Christophe

>> >

>> >> Kind regards,

>> >> Tamar

>> >> ________________________________________

>> >> From: Tamar Christina

>> >> Sent: Tuesday, November 8, 2016 11:58:46 AM

>> >> To: Christophe Lyon

>> >> Cc: GCC Patches; christophe.lyon@st.com; Marcus Shawcroft; Richard

>> >> Earnshaw; James Greenhalgh; Kyrylo Tkachov; nd

>> >> Subject: RE: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing

>> >> Poly64_t intrinsics to GCC

>> >>

>> >> Hi Christophe,

>> >>

>> >> Thanks for the review!

>> >>

>> >>>

>> >>> A while ago I added p64_p128.c, to contain all the poly64/128 tests

>> >>> except for vreinterpret.

>> >>> Why do you need to create p64.c ?

>> >>

>> >> I originally created it because I had a much smaller set of

>> >> intrinsics that I wanted to add initially, this grew and It hadn't occurred to

>> me that I can use the existing file now.

>> >>

>> >> Another reason was the effective-target arm_crypto_ok as you

>> mentioned below.

>> >>

>> >>>

>> >>> Similarly, adding tests for vcreate_p64 etc... in p64.c or

>> >>> p64_p128.c might be easier to maintain than adding them to vcreate.c

>> >>> etc with several #ifdef conditions.

>> >>

>> >> Fair enough, I'll move them to p64_p128.c.

>> >>

>> >>> For vdup-vmod.c, why do you add the "&& defined(__aarch64__)"

>> >>> condition? These intrinsics are defined in arm/arm_neon.h, right?

>> >>> They are tested in p64_p128.c

>> >>

>> >> I should have looked for them, they weren't being tested before so I

>> >> had Mistakenly assumed that they weren't available. Now I realize I

>> >> just need To add the proper test option to the file to enable crypto. I'll

>> update this as well.

>> >>

>> >>> Looking at your patch, it seems some tests are currently missing for arm:

>> >>> vget_high_p64. I'm not sure why I missed it when I removed neont-

>> >>> testgen...

>> >>

>> >> I'll adjust the test conditions so they run for ARM as well.

>> >>

>> >>>

>> >>> Regarding vreinterpret_p128.c, doesn't the existing effective-target

>> >>> arm_crypto_ok prevent the tests from running on aarch64?

>> >>

>> >> Yes they do, I was comparing the output against a clean version and

>> >> hasn't noticed That they weren't running. Thanks!

>> >>

>> >>>

>> >>> Thanks,

>> >>>

>> >>> Christophe
Christophe Lyon Nov. 29, 2016, 12:57 p.m. UTC | #2
On 29 November 2016 at 11:12, Christophe Lyon
<christophe.lyon@linaro.org> wrote:
> Hi Tamar,

>

>

> On 29 November 2016 at 10:50, Tamar Christina <Tamar.Christina@arm.com> wrote:

>> Hi All,

>>

>> The new patch contains the proper types for the intrinsics that should be returning uint64x1

>> and has the rest of the comments by Christophe in them.

>>

>

> LGTM.

>

> One more question: maybe we want to add explicit tests for vdup*_v_p64

> even though they are aliases for vmov?

>

Sorry, I meant vdup_n_p64, but the tests are already in place.

So, OK for me, but I can't approve.

Thanks,

Christophe

> Christophe

>

>> Kind Regards,

>> Tamar

>>

>> ________________________________________

>> From: Tamar Christina

>> Sent: Friday, November 25, 2016 4:01:30 PM

>> To: Christophe Lyon

>> Cc: GCC Patches; christophe.lyon@st.com; Marcus Shawcroft; Richard Earnshaw; James Greenhalgh; Kyrylo Tkachov; nd

>> Subject: RE: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing Poly64_t intrinsics to GCC

>>

>>  >

>>> > A few comments about this new version:

>>> > * arm-neon-ref.h: why do you create

>>> CHECK_RESULTS_NAMED_NO_FP16_NO_POLY64?

>>> > Can't you just add calls to CHECK_CRYPTO in the existing

>>> > CHECK_RESULTS_NAMED_NO_FP16?

>>

>> Yes, that should be fine, I didn't used to have CHECK_CRYPTO before and when I added it

>> I didn't remove the split. I'll do it now.

>>

>>> >

>>> > * p64_p128:

>>> > From what I can see ARM and AArch64 differ on the vceq variants

>>> > available with poly64.

>>> > For ARM, arm_neon.h contains: uint64x1_t vceq_p64 (poly64x1_t __a,

>>> > poly64x1_t __b) For AArch64, I can't see vceq_p64 in arm_neon.h? ...

>>> > Actually I've just noticed the other you submitted while I was writing

>>> > this, where you add vceq_p64 for aarch64, but it still returns

>>> > uint64_t.

>>> > Why do you change the vceq_64 test to return poly64_t instead of

>>> uint64_t?

>>

>> This patch is slightly outdated. The correct type is `uint64_t` but when it was noticed

>> This patch was already sent. New one coming soon.

>>

>>> >

>>> > Why do you add #ifdef __aarch64 before vldX_p64 tests and until vsli_p64?

>>> >

>>

>> This is wrong, remove them. It was supposed to be around the vldX_lane_p64 tests.

>>

>>> > The comment /* vget_lane_p64 tests.  */ is wrong before VLDX_LANE

>>> > tests

>>> >

>>> > You need to protect the new vmov, vget_high and vget_lane tests with

>>> > #ifdef __aarch64__.

>>> >

>>

>> vget_lane is already in an #ifdef, vmov you're right, but I also notice that the

>> test calls VDUP instead of VMOV, which explains why I didn't get a test failure.

>>

>> Thanks for the feedback,

>> I'll get these updated.

>>

>>>

>>> Actually, vget_high_p64 exists on arm, so no need for the #fidef for it.

>>>

>>>

>>> > Christophe

>>> >

>>> >> Kind regards,

>>> >> Tamar

>>> >> ________________________________________

>>> >> From: Tamar Christina

>>> >> Sent: Tuesday, November 8, 2016 11:58:46 AM

>>> >> To: Christophe Lyon

>>> >> Cc: GCC Patches; christophe.lyon@st.com; Marcus Shawcroft; Richard

>>> >> Earnshaw; James Greenhalgh; Kyrylo Tkachov; nd

>>> >> Subject: RE: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing

>>> >> Poly64_t intrinsics to GCC

>>> >>

>>> >> Hi Christophe,

>>> >>

>>> >> Thanks for the review!

>>> >>

>>> >>>

>>> >>> A while ago I added p64_p128.c, to contain all the poly64/128 tests

>>> >>> except for vreinterpret.

>>> >>> Why do you need to create p64.c ?

>>> >>

>>> >> I originally created it because I had a much smaller set of

>>> >> intrinsics that I wanted to add initially, this grew and It hadn't occurred to

>>> me that I can use the existing file now.

>>> >>

>>> >> Another reason was the effective-target arm_crypto_ok as you

>>> mentioned below.

>>> >>

>>> >>>

>>> >>> Similarly, adding tests for vcreate_p64 etc... in p64.c or

>>> >>> p64_p128.c might be easier to maintain than adding them to vcreate.c

>>> >>> etc with several #ifdef conditions.

>>> >>

>>> >> Fair enough, I'll move them to p64_p128.c.

>>> >>

>>> >>> For vdup-vmod.c, why do you add the "&& defined(__aarch64__)"

>>> >>> condition? These intrinsics are defined in arm/arm_neon.h, right?

>>> >>> They are tested in p64_p128.c

>>> >>

>>> >> I should have looked for them, they weren't being tested before so I

>>> >> had Mistakenly assumed that they weren't available. Now I realize I

>>> >> just need To add the proper test option to the file to enable crypto. I'll

>>> update this as well.

>>> >>

>>> >>> Looking at your patch, it seems some tests are currently missing for arm:

>>> >>> vget_high_p64. I'm not sure why I missed it when I removed neont-

>>> >>> testgen...

>>> >>

>>> >> I'll adjust the test conditions so they run for ARM as well.

>>> >>

>>> >>>

>>> >>> Regarding vreinterpret_p128.c, doesn't the existing effective-target

>>> >>> arm_crypto_ok prevent the tests from running on aarch64?

>>> >>

>>> >> Yes they do, I was comparing the output against a clean version and

>>> >> hasn't noticed That they weren't running. Thanks!

>>> >>

>>> >>>

>>> >>> Thanks,

>>> >>>

>>> >>> Christophe
Kyrill Tkachov Nov. 29, 2016, 1:48 p.m. UTC | #3
On 29/11/16 09:50, Tamar Christina wrote:
> Hi All,

>

> The new patch contains the proper types for the intrinsics that should be returning uint64x1

> and has the rest of the comments by Christophe in them.


Ok with an appropriate ChangeLog entry.
Thanks,
Kyrill

> Kind Regards,

> Tamar

>

> ________________________________________

> From: Tamar Christina

> Sent: Friday, November 25, 2016 4:01:30 PM

> To: Christophe Lyon

> Cc: GCC Patches; christophe.lyon@st.com; Marcus Shawcroft; Richard Earnshaw; James Greenhalgh; Kyrylo Tkachov; nd

> Subject: RE: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing Poly64_t intrinsics to GCC

>

>   >

>>> A few comments about this new version:

>>> * arm-neon-ref.h: why do you create

>> CHECK_RESULTS_NAMED_NO_FP16_NO_POLY64?

>>> Can't you just add calls to CHECK_CRYPTO in the existing

>>> CHECK_RESULTS_NAMED_NO_FP16?

> Yes, that should be fine, I didn't used to have CHECK_CRYPTO before and when I added it

> I didn't remove the split. I'll do it now.

>

>>> * p64_p128:

>>>  From what I can see ARM and AArch64 differ on the vceq variants

>>> available with poly64.

>>> For ARM, arm_neon.h contains: uint64x1_t vceq_p64 (poly64x1_t __a,

>>> poly64x1_t __b) For AArch64, I can't see vceq_p64 in arm_neon.h? ...

>>> Actually I've just noticed the other you submitted while I was writing

>>> this, where you add vceq_p64 for aarch64, but it still returns

>>> uint64_t.

>>> Why do you change the vceq_64 test to return poly64_t instead of

>> uint64_t?

> This patch is slightly outdated. The correct type is `uint64_t` but when it was noticed

> This patch was already sent. New one coming soon.

>

>>> Why do you add #ifdef __aarch64 before vldX_p64 tests and until vsli_p64?

>>>

> This is wrong, remove them. It was supposed to be around the vldX_lane_p64 tests.

>

>>> The comment /* vget_lane_p64 tests.  */ is wrong before VLDX_LANE

>>> tests

>>>

>>> You need to protect the new vmov, vget_high and vget_lane tests with

>>> #ifdef __aarch64__.

>>>

> vget_lane is already in an #ifdef, vmov you're right, but I also notice that the

> test calls VDUP instead of VMOV, which explains why I didn't get a test failure.

>

> Thanks for the feedback,

> I'll get these updated.

>

>> Actually, vget_high_p64 exists on arm, so no need for the #fidef for it.

>>

>>

>>> Christophe

>>>

>>>> Kind regards,

>>>> Tamar

>>>> ________________________________________

>>>> From: Tamar Christina

>>>> Sent: Tuesday, November 8, 2016 11:58:46 AM

>>>> To: Christophe Lyon

>>>> Cc: GCC Patches; christophe.lyon@st.com; Marcus Shawcroft; Richard

>>>> Earnshaw; James Greenhalgh; Kyrylo Tkachov; nd

>>>> Subject: RE: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing

>>>> Poly64_t intrinsics to GCC

>>>>

>>>> Hi Christophe,

>>>>

>>>> Thanks for the review!

>>>>

>>>>> A while ago I added p64_p128.c, to contain all the poly64/128 tests

>>>>> except for vreinterpret.

>>>>> Why do you need to create p64.c ?

>>>> I originally created it because I had a much smaller set of

>>>> intrinsics that I wanted to add initially, this grew and It hadn't occurred to

>> me that I can use the existing file now.

>>>> Another reason was the effective-target arm_crypto_ok as you

>> mentioned below.

>>>>> Similarly, adding tests for vcreate_p64 etc... in p64.c or

>>>>> p64_p128.c might be easier to maintain than adding them to vcreate.c

>>>>> etc with several #ifdef conditions.

>>>> Fair enough, I'll move them to p64_p128.c.

>>>>

>>>>> For vdup-vmod.c, why do you add the "&& defined(__aarch64__)"

>>>>> condition? These intrinsics are defined in arm/arm_neon.h, right?

>>>>> They are tested in p64_p128.c

>>>> I should have looked for them, they weren't being tested before so I

>>>> had Mistakenly assumed that they weren't available. Now I realize I

>>>> just need To add the proper test option to the file to enable crypto. I'll

>> update this as well.

>>>>> Looking at your patch, it seems some tests are currently missing for arm:

>>>>> vget_high_p64. I'm not sure why I missed it when I removed neont-

>>>>> testgen...

>>>> I'll adjust the test conditions so they run for ARM as well.

>>>>

>>>>> Regarding vreinterpret_p128.c, doesn't the existing effective-target

>>>>> arm_crypto_ok prevent the tests from running on aarch64?

>>>> Yes they do, I was comparing the output against a clean version and

>>>> hasn't noticed That they weren't running. Thanks!

>>>>

>>>>> Thanks,

>>>>>

>>>>> Christophe
James Greenhalgh Nov. 29, 2016, 1:54 p.m. UTC | #4
On Tue, Nov 29, 2016 at 01:48:22PM +0000, Kyrill Tkachov wrote:
> 

> On 29/11/16 09:50, Tamar Christina wrote:

> >Hi All,

> >

> >The new patch contains the proper types for the intrinsics that should be returning uint64x1

> >and has the rest of the comments by Christophe in them.

> 

> Ok with an appropriate ChangeLog entry.


Also OK from an AArch64 persepctive based on the detailed review from
Christophe.

Thanks,
James
Christophe Lyon Nov. 30, 2016, 9:04 a.m. UTC | #5
Hi Tamar,


On 29 November 2016 at 14:54, James Greenhalgh <james.greenhalgh@arm.com> wrote:
> On Tue, Nov 29, 2016 at 01:48:22PM +0000, Kyrill Tkachov wrote:

>>

>> On 29/11/16 09:50, Tamar Christina wrote:

>> >Hi All,

>> >

>> >The new patch contains the proper types for the intrinsics that should be returning uint64x1

>> >and has the rest of the comments by Christophe in them.

>>

>> Ok with an appropriate ChangeLog entry.

>

> Also OK from an AArch64 persepctive based on the detailed review from

> Christophe.

>

> Thanks,

> James

>


After you committed this patch (r242962), I've noticed some
regressions as follows:
* on aarch64, vreinterpret_p128 and vreinterpret_p64 fail to compile
with errors like
warning: implicit declaration of function 'vreinterpretq_p64_p128
warning: implicit declaration of function 'vreinterpretq_p128_s8
error: incompatible types when assigning to type 'poly64x2_t' from type 'int'
etc...

* on arm configured for armv8-a, several tests fail to link or compile:
vbsl.c:(.text+0x24f0): undefined reference to `expected_poly64x1'
vdup-vmov.c:227:38: error: 'expected0_poly64x1' undeclared
vdup_lane.c:(.text+0x1584): undefined reference to `expected_poly64x1'

You can have more details at
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/242962/report-build-info.html


Christophe
Tamar Christina Nov. 30, 2016, 9:40 a.m. UTC | #6
Hi Christophe,

> After you committed this patch (r242962), I've noticed some regressions as

> follows:

> * on aarch64, vreinterpret_p128 and vreinterpret_p64 fail to compile with

> errors like

> warning: implicit declaration of function 'vreinterpretq_p64_p128

> warning: implicit declaration of function 'vreinterpretq_p128_s8

> error: incompatible types when assigning to type 'poly64x2_t' from type 'int'

> etc...


Sorry for the screw up. On the last patch I only tested the file p64_p128.c.
I'll fix these asap.

> 

> * on arm configured for armv8-a, several tests fail to link or compile:

> vbsl.c:(.text+0x24f0): undefined reference to `expected_poly64x1'

> vdup-vmov.c:227:38: error: 'expected0_poly64x1' undeclared

> vdup_lane.c:(.text+0x1584): undefined reference to `expected_poly64x1'

> 

> You can have more details at

> http://people.linaro.org/~christophe.lyon/cross-

> validation/gcc/trunk/242962/report-build-info.html

> 

> 

> Christophe
Andrew Pinski Dec. 7, 2016, 4:33 a.m. UTC | #7
On Wed, Nov 30, 2016 at 1:04 AM, Christophe Lyon
<christophe.lyon@linaro.org> wrote:
> Hi Tamar,

>

>

> On 29 November 2016 at 14:54, James Greenhalgh <james.greenhalgh@arm.com> wrote:

>> On Tue, Nov 29, 2016 at 01:48:22PM +0000, Kyrill Tkachov wrote:

>>>

>>> On 29/11/16 09:50, Tamar Christina wrote:

>>> >Hi All,

>>> >

>>> >The new patch contains the proper types for the intrinsics that should be returning uint64x1

>>> >and has the rest of the comments by Christophe in them.

>>>

>>> Ok with an appropriate ChangeLog entry.

>>

>> Also OK from an AArch64 persepctive based on the detailed review from

>> Christophe.

>>

>> Thanks,

>> James

>>

>

> After you committed this patch (r242962), I've noticed some

> regressions as follows:

> * on aarch64, vreinterpret_p128 and vreinterpret_p64 fail to compile

> with errors like

> warning: implicit declaration of function 'vreinterpretq_p64_p128

> warning: implicit declaration of function 'vreinterpretq_p128_s8

> error: incompatible types when assigning to type 'poly64x2_t' from type 'int'

> etc...

>

> * on arm configured for armv8-a, several tests fail to link or compile:

> vbsl.c:(.text+0x24f0): undefined reference to `expected_poly64x1'

> vdup-vmov.c:227:38: error: 'expected0_poly64x1' undeclared

> vdup_lane.c:(.text+0x1584): undefined reference to `expected_poly64x1'

>

> You can have more details at

> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/242962/report-build-info.html


I see the expected_poly64x1 failures also for aarch64:
https://gcc.gnu.org/ml/gcc-testresults/2016-12/msg00738.html

FAIL: gcc.target/aarch64/advsimd-intrinsics/vbsl.c   -O0  (test for
excess errors)
Excess errors:
vbsl.c:(.text+0x1dec): undefined reference to `expected_poly64x1'
vbsl.c:(.text+0x1df0): undefined reference to `expected_poly64x1'
vbsl.c:(.text+0x1e20): undefined reference to `expected_poly64x1'
vbsl.c:(.text+0x1e24): undefined reference to `expected_poly64x1'
vbsl.c:(.text+0x2a74): undefined reference to `expected_poly64x2'
vbsl.c:(.text+0x2a78): undefined reference to `expected_poly64x2'
vbsl.c:(.text+0x2aa8): undefined reference to `expected_poly64x2'
vbsl.c:(.text+0x2aac): undefined reference to `expected_poly64x2'

....
FAIL: gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c   -O0  (test
for excess errors)
Excess errors:
/home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:175:38:
error: 'expected0_poly64x1' undeclared (first use in this function);
did you mean 'expected_poly64x1'?
/home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:175:38:
error: 'expected0_poly64x2' undeclared (first use in this function);
did you mean 'expected0_poly64x1'?
/home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:178:38:
error: 'expected1_poly64x1' undeclared (first use in this function);
did you mean 'expected0_poly64x1'?
/home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:178:38:
error: 'expected1_poly64x2' undeclared (first use in this function);
did you mean 'expected1_poly64x1'?
/home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:181:38:
error: 'expected2_poly64x1' undeclared (first use in this function);
did you mean 'expected1_poly64x1'?
/home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:181:38:
error: 'expected2_poly64x2' undeclared (first use in this function);
did you mean 'expected2_poly64x1'?


etc.

>

>

> Christophe
Tamar Christina Dec. 12, 2016, 11:29 a.m. UTC | #8
Hi Andrew,

These should be fixed now.

Thanks,
Tamar

________________________________________
From: Andrew Pinski <pinskia@gmail.com>

Sent: Wednesday, December 7, 2016 4:33:51 AM
To: Christophe Lyon
Cc: Tamar Christina; Kyrill Tkachov; James Greenhalgh; GCC Patches; christophe.lyon@st.com; Marcus Shawcroft; Richard Earnshaw; nd
Subject: Re: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing Poly64_t intrinsics to GCC

On Wed, Nov 30, 2016 at 1:04 AM, Christophe Lyon
<christophe.lyon@linaro.org> wrote:
> Hi Tamar,

>

>

> On 29 November 2016 at 14:54, James Greenhalgh <james.greenhalgh@arm.com> wrote:

>> On Tue, Nov 29, 2016 at 01:48:22PM +0000, Kyrill Tkachov wrote:

>>>

>>> On 29/11/16 09:50, Tamar Christina wrote:

>>> >Hi All,

>>> >

>>> >The new patch contains the proper types for the intrinsics that should be returning uint64x1

>>> >and has the rest of the comments by Christophe in them.

>>>

>>> Ok with an appropriate ChangeLog entry.

>>

>> Also OK from an AArch64 persepctive based on the detailed review from

>> Christophe.

>>

>> Thanks,

>> James

>>

>

> After you committed this patch (r242962), I've noticed some

> regressions as follows:

> * on aarch64, vreinterpret_p128 and vreinterpret_p64 fail to compile

> with errors like

> warning: implicit declaration of function 'vreinterpretq_p64_p128

> warning: implicit declaration of function 'vreinterpretq_p128_s8

> error: incompatible types when assigning to type 'poly64x2_t' from type 'int'

> etc...

>

> * on arm configured for armv8-a, several tests fail to link or compile:

> vbsl.c:(.text+0x24f0): undefined reference to `expected_poly64x1'

> vdup-vmov.c:227:38: error: 'expected0_poly64x1' undeclared

> vdup_lane.c:(.text+0x1584): undefined reference to `expected_poly64x1'

>

> You can have more details at

> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/242962/report-build-info.html


I see the expected_poly64x1 failures also for aarch64:
https://gcc.gnu.org/ml/gcc-testresults/2016-12/msg00738.html

FAIL: gcc.target/aarch64/advsimd-intrinsics/vbsl.c   -O0  (test for
excess errors)
Excess errors:
vbsl.c:(.text+0x1dec): undefined reference to `expected_poly64x1'
vbsl.c:(.text+0x1df0): undefined reference to `expected_poly64x1'
vbsl.c:(.text+0x1e20): undefined reference to `expected_poly64x1'
vbsl.c:(.text+0x1e24): undefined reference to `expected_poly64x1'
vbsl.c:(.text+0x2a74): undefined reference to `expected_poly64x2'
vbsl.c:(.text+0x2a78): undefined reference to `expected_poly64x2'
vbsl.c:(.text+0x2aa8): undefined reference to `expected_poly64x2'
vbsl.c:(.text+0x2aac): undefined reference to `expected_poly64x2'

....
FAIL: gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c   -O0  (test
for excess errors)
Excess errors:
/home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:175:38:
error: 'expected0_poly64x1' undeclared (first use in this function);
did you mean 'expected_poly64x1'?
/home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:175:38:
error: 'expected0_poly64x2' undeclared (first use in this function);
did you mean 'expected0_poly64x1'?
/home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:178:38:
error: 'expected1_poly64x1' undeclared (first use in this function);
did you mean 'expected0_poly64x1'?
/home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:178:38:
error: 'expected1_poly64x2' undeclared (first use in this function);
did you mean 'expected1_poly64x1'?
/home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:181:38:
error: 'expected2_poly64x1' undeclared (first use in this function);
did you mean 'expected1_poly64x1'?
/home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:181:38:
error: 'expected2_poly64x2' undeclared (first use in this function);
did you mean 'expected2_poly64x1'?


etc.

>

>

> Christophe
Andrew Pinski Dec. 12, 2016, 11:53 p.m. UTC | #9
On Mon, Dec 12, 2016 at 3:29 AM, Tamar Christina
<Tamar.Christina@arm.com> wrote:
> Hi Andrew,

>

> These should be fixed now.


Yes they are fixed.

Thanks,
Andrew

>

> Thanks,

> Tamar

>

> ________________________________________

> From: Andrew Pinski <pinskia@gmail.com>

> Sent: Wednesday, December 7, 2016 4:33:51 AM

> To: Christophe Lyon

> Cc: Tamar Christina; Kyrill Tkachov; James Greenhalgh; GCC Patches; christophe.lyon@st.com; Marcus Shawcroft; Richard Earnshaw; nd

> Subject: Re: [AArch64][ARM][GCC][PATCHv2 3/3] Add tests for missing Poly64_t intrinsics to GCC

>

> On Wed, Nov 30, 2016 at 1:04 AM, Christophe Lyon

> <christophe.lyon@linaro.org> wrote:

>> Hi Tamar,

>>

>>

>> On 29 November 2016 at 14:54, James Greenhalgh <james.greenhalgh@arm.com> wrote:

>>> On Tue, Nov 29, 2016 at 01:48:22PM +0000, Kyrill Tkachov wrote:

>>>>

>>>> On 29/11/16 09:50, Tamar Christina wrote:

>>>> >Hi All,

>>>> >

>>>> >The new patch contains the proper types for the intrinsics that should be returning uint64x1

>>>> >and has the rest of the comments by Christophe in them.

>>>>

>>>> Ok with an appropriate ChangeLog entry.

>>>

>>> Also OK from an AArch64 persepctive based on the detailed review from

>>> Christophe.

>>>

>>> Thanks,

>>> James

>>>

>>

>> After you committed this patch (r242962), I've noticed some

>> regressions as follows:

>> * on aarch64, vreinterpret_p128 and vreinterpret_p64 fail to compile

>> with errors like

>> warning: implicit declaration of function 'vreinterpretq_p64_p128

>> warning: implicit declaration of function 'vreinterpretq_p128_s8

>> error: incompatible types when assigning to type 'poly64x2_t' from type 'int'

>> etc...

>>

>> * on arm configured for armv8-a, several tests fail to link or compile:

>> vbsl.c:(.text+0x24f0): undefined reference to `expected_poly64x1'

>> vdup-vmov.c:227:38: error: 'expected0_poly64x1' undeclared

>> vdup_lane.c:(.text+0x1584): undefined reference to `expected_poly64x1'

>>

>> You can have more details at

>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/242962/report-build-info.html

>

> I see the expected_poly64x1 failures also for aarch64:

> https://gcc.gnu.org/ml/gcc-testresults/2016-12/msg00738.html

>

> FAIL: gcc.target/aarch64/advsimd-intrinsics/vbsl.c   -O0  (test for

> excess errors)

> Excess errors:

> vbsl.c:(.text+0x1dec): undefined reference to `expected_poly64x1'

> vbsl.c:(.text+0x1df0): undefined reference to `expected_poly64x1'

> vbsl.c:(.text+0x1e20): undefined reference to `expected_poly64x1'

> vbsl.c:(.text+0x1e24): undefined reference to `expected_poly64x1'

> vbsl.c:(.text+0x2a74): undefined reference to `expected_poly64x2'

> vbsl.c:(.text+0x2a78): undefined reference to `expected_poly64x2'

> vbsl.c:(.text+0x2aa8): undefined reference to `expected_poly64x2'

> vbsl.c:(.text+0x2aac): undefined reference to `expected_poly64x2'

>

> ....

> FAIL: gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c   -O0  (test

> for excess errors)

> Excess errors:

> /home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:175:38:

> error: 'expected0_poly64x1' undeclared (first use in this function);

> did you mean 'expected_poly64x1'?

> /home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:175:38:

> error: 'expected0_poly64x2' undeclared (first use in this function);

> did you mean 'expected0_poly64x1'?

> /home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:178:38:

> error: 'expected1_poly64x1' undeclared (first use in this function);

> did you mean 'expected0_poly64x1'?

> /home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:178:38:

> error: 'expected1_poly64x2' undeclared (first use in this function);

> did you mean 'expected1_poly64x1'?

> /home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:181:38:

> error: 'expected2_poly64x1' undeclared (first use in this function);

> did you mean 'expected1_poly64x1'?

> /home/jenkins/workspace/BuildThunderX_native_gcc_upstream/gcc/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1_dup.c:181:38:

> error: 'expected2_poly64x2' undeclared (first use in this function);

> did you mean 'expected2_poly64x1'?

>

>

> etc.

>

>>

>>

>> Christophe
diff mbox

Patch

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
index 462141586b3db7c5256c74b08fa0449210634226..beaf6ac31d5c5affe3702a505ad0df8679229e32 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h
@@ -32,6 +32,13 @@  extern size_t strlen(const char *);
    VECT_VAR(expected, int, 16, 4) -> expected_int16x4
    VECT_VAR_DECL(expected, int, 16, 4) -> int16x4_t expected_int16x4
 */
+/* Some instructions don't exist on ARM.
+   Use this macro to guard against them.  */
+#ifdef __aarch64__
+#define AARCH64_ONLY(X) X
+#else
+#define AARCH64_ONLY(X)
+#endif
 
 #define xSTR(X) #X
 #define STR(X) xSTR(X)
@@ -92,6 +99,13 @@  extern size_t strlen(const char *);
     fprintf(stderr, "CHECKED %s %s\n", STR(VECT_TYPE(T, W, N)), MSG);	\
   }
 
+#if defined (__ARM_FEATURE_CRYPTO)
+#define CHECK_CRYPTO(MSG,T,W,N,FMT,EXPECTED,COMMENT) \
+	       CHECK(MSG,T,W,N,FMT,EXPECTED,COMMENT)
+#else
+#define CHECK_CRYPTO(MSG,T,W,N,FMT,EXPECTED,COMMENT)
+#endif
+
 /* Floating-point variant.  */
 #define CHECK_FP(MSG,T,W,N,FMT,EXPECTED,COMMENT)			\
   {									\
@@ -184,6 +198,9 @@  extern ARRAY(expected, uint, 32, 2);
 extern ARRAY(expected, uint, 64, 1);
 extern ARRAY(expected, poly, 8, 8);
 extern ARRAY(expected, poly, 16, 4);
+#if defined (__ARM_FEATURE_CRYPTO)
+extern ARRAY(expected, poly, 64, 1);
+#endif
 extern ARRAY(expected, hfloat, 16, 4);
 extern ARRAY(expected, hfloat, 32, 2);
 extern ARRAY(expected, hfloat, 64, 1);
@@ -197,6 +214,9 @@  extern ARRAY(expected, uint, 32, 4);
 extern ARRAY(expected, uint, 64, 2);
 extern ARRAY(expected, poly, 8, 16);
 extern ARRAY(expected, poly, 16, 8);
+#if defined (__ARM_FEATURE_CRYPTO)
+extern ARRAY(expected, poly, 64, 2);
+#endif
 extern ARRAY(expected, hfloat, 16, 8);
 extern ARRAY(expected, hfloat, 32, 4);
 extern ARRAY(expected, hfloat, 64, 2);
@@ -213,6 +233,7 @@  extern ARRAY(expected, hfloat, 64, 2);
     CHECK(test_name, uint, 64, 1, PRIx64, EXPECTED, comment);		\
     CHECK(test_name, poly, 8, 8, PRIx8, EXPECTED, comment);		\
     CHECK(test_name, poly, 16, 4, PRIx16, EXPECTED, comment);		\
+    CHECK_CRYPTO(test_name, poly, 64, 1, PRIx64, EXPECTED, comment);	\
     CHECK_FP(test_name, float, 32, 2, PRIx32, EXPECTED, comment);	\
 									\
     CHECK(test_name, int, 8, 16, PRIx8, EXPECTED, comment);		\
@@ -225,6 +246,7 @@  extern ARRAY(expected, hfloat, 64, 2);
     CHECK(test_name, uint, 64, 2, PRIx64, EXPECTED, comment);		\
     CHECK(test_name, poly, 8, 16, PRIx8, EXPECTED, comment);		\
     CHECK(test_name, poly, 16, 8, PRIx16, EXPECTED, comment);		\
+    CHECK_CRYPTO(test_name, poly, 64, 2, PRIx64, EXPECTED, comment);	\
     CHECK_FP(test_name, float, 32, 4, PRIx32, EXPECTED, comment);	\
   }									\
 
@@ -398,6 +420,9 @@  static void clean_results (void)
   CLEAN(result, uint, 64, 1);
   CLEAN(result, poly, 8, 8);
   CLEAN(result, poly, 16, 4);
+#if defined (__ARM_FEATURE_CRYPTO)
+  CLEAN(result, poly, 64, 1);
+#endif
 #if defined (__ARM_FP16_FORMAT_IEEE) || defined (__ARM_FP16_FORMAT_ALTERNATIVE)
   CLEAN(result, float, 16, 4);
 #endif
@@ -413,6 +438,9 @@  static void clean_results (void)
   CLEAN(result, uint, 64, 2);
   CLEAN(result, poly, 8, 16);
   CLEAN(result, poly, 16, 8);
+#if defined (__ARM_FEATURE_CRYPTO)
+  CLEAN(result, poly, 64, 2);
+#endif
 #if defined (__ARM_FP16_FORMAT_IEEE) || defined (__ARM_FP16_FORMAT_ALTERNATIVE)
   CLEAN(result, float, 16, 8);
 #endif
@@ -438,6 +466,13 @@  static void clean_results (void)
 #define DECL_VARIABLE(VAR, T1, W, N)		\
   VECT_TYPE(T1, W, N) VECT_VAR(VAR, T1, W, N)
 
+#if defined (__ARM_FEATURE_CRYPTO)
+#define DECL_VARIABLE_CRYPTO(VAR, T1, W, N) \
+  DECL_VARIABLE(VAR, T1, W, N)
+#else
+#define DECL_VARIABLE_CRYPTO(VAR, T1, W, N)
+#endif
+
 /* Declare only 64 bits signed variants.  */
 #define DECL_VARIABLE_64BITS_SIGNED_VARIANTS(VAR)	\
   DECL_VARIABLE(VAR, int, 8, 8);			\
@@ -473,6 +508,7 @@  static void clean_results (void)
   DECL_VARIABLE_64BITS_UNSIGNED_VARIANTS(VAR);	\
   DECL_VARIABLE(VAR, poly, 8, 8);		\
   DECL_VARIABLE(VAR, poly, 16, 4);		\
+  DECL_VARIABLE_CRYPTO(VAR, poly, 64, 1);	\
   DECL_VARIABLE(VAR, float, 16, 4);		\
   DECL_VARIABLE(VAR, float, 32, 2)
 #else
@@ -481,6 +517,7 @@  static void clean_results (void)
   DECL_VARIABLE_64BITS_UNSIGNED_VARIANTS(VAR);	\
   DECL_VARIABLE(VAR, poly, 8, 8);		\
   DECL_VARIABLE(VAR, poly, 16, 4);		\
+  DECL_VARIABLE_CRYPTO(VAR, poly, 64, 1);	\
   DECL_VARIABLE(VAR, float, 32, 2)
 #endif
 
@@ -491,6 +528,7 @@  static void clean_results (void)
   DECL_VARIABLE_128BITS_UNSIGNED_VARIANTS(VAR);	\
   DECL_VARIABLE(VAR, poly, 8, 16);		\
   DECL_VARIABLE(VAR, poly, 16, 8);		\
+  DECL_VARIABLE_CRYPTO(VAR, poly, 64, 2);	\
   DECL_VARIABLE(VAR, float, 16, 8);		\
   DECL_VARIABLE(VAR, float, 32, 4)
 #else
@@ -499,6 +537,7 @@  static void clean_results (void)
   DECL_VARIABLE_128BITS_UNSIGNED_VARIANTS(VAR);	\
   DECL_VARIABLE(VAR, poly, 8, 16);		\
   DECL_VARIABLE(VAR, poly, 16, 8);		\
+  DECL_VARIABLE_CRYPTO(VAR, poly, 64, 2);	\
   DECL_VARIABLE(VAR, float, 32, 4)
 #endif
 /* Declare all variants.  */
@@ -531,6 +570,13 @@  static void clean_results (void)
 
 /* Helpers to call macros with 1 constant and 5 variable
    arguments.  */
+#if defined (__ARM_FEATURE_CRYPTO)
+#define MACRO_CRYPTO(MACRO, VAR1, VAR2, T1, T2, T3, W, N) \
+  MACRO(VAR1, VAR2, T1, T2, T3, W, N)
+#else
+#define MACRO_CRYPTO(MACRO, VAR1, VAR2, T1, T2, T3, W, N)
+#endif
+
 #define TEST_MACRO_64BITS_SIGNED_VARIANTS_1_5(MACRO, VAR)	\
   MACRO(VAR, , int, s, 8, 8);					\
   MACRO(VAR, , int, s, 16, 4);					\
@@ -601,13 +647,15 @@  static void clean_results (void)
   TEST_MACRO_64BITS_SIGNED_VARIANTS_2_5(MACRO, VAR1, VAR2);	\
   TEST_MACRO_64BITS_UNSIGNED_VARIANTS_2_5(MACRO, VAR1, VAR2);	\
   MACRO(VAR1, VAR2, , poly, p, 8, 8);				\
-  MACRO(VAR1, VAR2, , poly, p, 16, 4)
+  MACRO(VAR1, VAR2, , poly, p, 16, 4);				\
+  MACRO_CRYPTO(MACRO, VAR1, VAR2, , poly, p, 64, 1)
 
 #define TEST_MACRO_128BITS_VARIANTS_2_5(MACRO, VAR1, VAR2)	\
   TEST_MACRO_128BITS_SIGNED_VARIANTS_2_5(MACRO, VAR1, VAR2);	\
   TEST_MACRO_128BITS_UNSIGNED_VARIANTS_2_5(MACRO, VAR1, VAR2);	\
   MACRO(VAR1, VAR2, q, poly, p, 8, 16);				\
-  MACRO(VAR1, VAR2, q, poly, p, 16, 8)
+  MACRO(VAR1, VAR2, q, poly, p, 16, 8);				\
+  MACRO_CRYPTO(MACRO, VAR1, VAR2, q, poly, p, 64, 2)
 
 #define TEST_MACRO_ALL_VARIANTS_2_5(MACRO, VAR1, VAR2)	\
   TEST_MACRO_64BITS_VARIANTS_2_5(MACRO, VAR1, VAR2);	\
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
index 519cffb0125079022e7ba876c1ca657d9e37cac2..8907b38cde90b44a8f1501f72b2c4e812cba5707 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
@@ -1,8 +1,9 @@ 
 /* This file contains tests for all the *p64 intrinsics, except for
    vreinterpret which have their own testcase.  */
 
-/* { dg-require-effective-target arm_crypto_ok } */
+/* { dg-require-effective-target arm_crypto_ok { target { arm*-*-* } } } */
 /* { dg-add-options arm_crypto } */
+/* { dg-additional-options "-march=armv8-a+crypto" { target { aarch64*-*-* } } }*/
 
 #include <arm_neon.h>
 #include "arm-neon-ref.h"
@@ -38,6 +39,17 @@  VECT_VAR_DECL(vdup_n_expected2,poly,64,1) [] = { 0xfffffffffffffff2 };
 VECT_VAR_DECL(vdup_n_expected2,poly,64,2) [] = { 0xfffffffffffffff2,
 						 0xfffffffffffffff2 };
 
+/* Expected results: vmov_n.  */
+VECT_VAR_DECL(vmov_n_expected0,poly,64,1) [] = { 0xfffffffffffffff0 };
+VECT_VAR_DECL(vmov_n_expected0,poly,64,2) [] = { 0xfffffffffffffff0,
+						 0xfffffffffffffff0 };
+VECT_VAR_DECL(vmov_n_expected1,poly,64,1) [] = { 0xfffffffffffffff1 };
+VECT_VAR_DECL(vmov_n_expected1,poly,64,2) [] = { 0xfffffffffffffff1,
+						 0xfffffffffffffff1 };
+VECT_VAR_DECL(vmov_n_expected2,poly,64,1) [] = { 0xfffffffffffffff2 };
+VECT_VAR_DECL(vmov_n_expected2,poly,64,2) [] = { 0xfffffffffffffff2,
+						 0xfffffffffffffff2 };
+
 /* Expected results: vext.  */
 VECT_VAR_DECL(vext_expected,poly,64,1) [] = { 0xfffffffffffffff0 };
 VECT_VAR_DECL(vext_expected,poly,64,2) [] = { 0xfffffffffffffff1, 0x88 };
@@ -45,6 +57,9 @@  VECT_VAR_DECL(vext_expected,poly,64,2) [] = { 0xfffffffffffffff1, 0x88 };
 /* Expected results: vget_low.  */
 VECT_VAR_DECL(vget_low_expected,poly,64,1) [] = { 0xfffffffffffffff0 };
 
+/* Expected results: vget_high.  */
+VECT_VAR_DECL(vget_high_expected,poly,64,1) [] = { 0xfffffffffffffff1 };
+
 /* Expected results: vld1.  */
 VECT_VAR_DECL(vld1_expected,poly,64,1) [] = { 0xfffffffffffffff0 };
 VECT_VAR_DECL(vld1_expected,poly,64,2) [] = { 0xfffffffffffffff0,
@@ -109,6 +124,39 @@  VECT_VAR_DECL(vst1_lane_expected,poly,64,1) [] = { 0xfffffffffffffff0 };
 VECT_VAR_DECL(vst1_lane_expected,poly,64,2) [] = { 0xfffffffffffffff0,
 						   0x3333333333333333 };
 
+/* Expected results: vldX_lane.  */
+VECT_VAR_DECL(expected_vld_st2_0,poly,64,1) [] = { 0xfffffffffffffff0 };
+VECT_VAR_DECL(expected_vld_st2_0,poly,64,2) [] = { 0xfffffffffffffff0,
+						   0xfffffffffffffff1 };
+VECT_VAR_DECL(expected_vld_st2_1,poly,64,1) [] = { 0xfffffffffffffff1 };
+VECT_VAR_DECL(expected_vld_st2_1,poly,64,2) [] = { 0xaaaaaaaaaaaaaaaa,
+						   0xaaaaaaaaaaaaaaaa };
+VECT_VAR_DECL(expected_vld_st3_0,poly,64,1) [] = { 0xfffffffffffffff0 };
+VECT_VAR_DECL(expected_vld_st3_0,poly,64,2) [] = { 0xfffffffffffffff0,
+						   0xfffffffffffffff1 };
+VECT_VAR_DECL(expected_vld_st3_1,poly,64,1) [] = { 0xfffffffffffffff1 };
+VECT_VAR_DECL(expected_vld_st3_1,poly,64,2) [] = { 0xfffffffffffffff2,
+						   0xaaaaaaaaaaaaaaaa };
+VECT_VAR_DECL(expected_vld_st3_2,poly,64,1) [] = { 0xfffffffffffffff2 };
+VECT_VAR_DECL(expected_vld_st3_2,poly,64,2) [] = { 0xaaaaaaaaaaaaaaaa,
+						   0xaaaaaaaaaaaaaaaa };
+VECT_VAR_DECL(expected_vld_st4_0,poly,64,1) [] = { 0xfffffffffffffff0 };
+VECT_VAR_DECL(expected_vld_st4_0,poly,64,2) [] = { 0xfffffffffffffff0,
+						   0xfffffffffffffff1 };
+VECT_VAR_DECL(expected_vld_st4_1,poly,64,1) [] = { 0xfffffffffffffff1 };
+VECT_VAR_DECL(expected_vld_st4_1,poly,64,2) [] = { 0xfffffffffffffff2,
+						   0xfffffffffffffff3 };
+VECT_VAR_DECL(expected_vld_st4_2,poly,64,1) [] = { 0xfffffffffffffff2 };
+VECT_VAR_DECL(expected_vld_st4_2,poly,64,2) [] = { 0xaaaaaaaaaaaaaaaa,
+						   0xaaaaaaaaaaaaaaaa };
+VECT_VAR_DECL(expected_vld_st4_3,poly,64,1) [] = { 0xfffffffffffffff3 };
+VECT_VAR_DECL(expected_vld_st4_3,poly,64,2) [] = { 0xaaaaaaaaaaaaaaaa,
+						   0xaaaaaaaaaaaaaaaa };
+
+/* Expected results: vget_lane.  */
+VECT_VAR_DECL(vget_lane_expected,poly,64,1) = 0xfffffffffffffff0;
+VECT_VAR_DECL(vget_lane_expected,poly,64,2) = 0xfffffffffffffff0;
+
 int main (void)
 {
   int i;
@@ -341,6 +389,26 @@  int main (void)
 
   CHECK(TEST_MSG, poly, 64, 1, PRIx64, vget_low_expected, "");
 
+  /* vget_high_p64 tests.  */
+#undef TEST_MSG
+#define TEST_MSG "VGET_HIGH"
+
+#define TEST_VGET_HIGH(T1, T2, W, N, N2)					\
+  VECT_VAR(vget_high_vector64, T1, W, N) =				\
+    vget_high_##T2##W(VECT_VAR(vget_high_vector128, T1, W, N2));		\
+  vst1_##T2##W(VECT_VAR(result, T1, W, N), VECT_VAR(vget_high_vector64, T1, W, N))
+
+  DECL_VARIABLE(vget_high_vector64, poly, 64, 1);
+  DECL_VARIABLE(vget_high_vector128, poly, 64, 2);
+
+  CLEAN(result, poly, 64, 1);
+
+  VLOAD(vget_high_vector128, buffer, q, poly, p, 64, 2);
+
+  TEST_VGET_HIGH(poly, p, 64, 1, 2);
+
+  CHECK(TEST_MSG, poly, 64, 1, PRIx64, vget_high_expected, "");
+
   /* vld1_p64 tests.  */
 #undef TEST_MSG
 #define TEST_MSG "VLD1/VLD1Q"
@@ -645,7 +713,7 @@  int main (void)
   VECT_VAR(vst1_lane_vector, T1, W, N) =				\
     vld1##Q##_##T2##W(VECT_VAR(buffer, T1, W, N));			\
   vst1##Q##_lane_##T2##W(VECT_VAR(result, T1, W, N),			\
-			 VECT_VAR(vst1_lane_vector, T1, W, N), L)
+			 VECT_VAR(vst1_lane_vector, T1, W, N), L);
 
   DECL_VARIABLE(vst1_lane_vector, poly, 64, 1);
   DECL_VARIABLE(vst1_lane_vector, poly, 64, 2);
@@ -659,5 +727,298 @@  int main (void)
   CHECK(TEST_MSG, poly, 64, 1, PRIx64, vst1_lane_expected, "");
   CHECK(TEST_MSG, poly, 64, 2, PRIx64, vst1_lane_expected, "");
 
+#ifdef __aarch64__
+
+  /* vmov_n_p64 tests.  */
+#undef TEST_MSG
+#define TEST_MSG "VMOV/VMOVQ"
+
+#define TEST_VMOV(Q, T1, T2, W, N)					\
+  VECT_VAR(vmov_n_vector, T1, W, N) =					\
+    vmov##Q##_n_##T2##W(VECT_VAR(buffer_dup, T1, W, N)[i]);		\
+  vst1##Q##_##T2##W(VECT_VAR(result, T1, W, N), VECT_VAR(vmov_n_vector, T1, W, N))
+
+  DECL_VARIABLE(vmov_n_vector, poly, 64, 1);
+  DECL_VARIABLE(vmov_n_vector, poly, 64, 2);
+
+  /* Try to read different places from the input buffer.  */
+  for (i=0; i< 3; i++) {
+    CLEAN(result, poly, 64, 1);
+    CLEAN(result, poly, 64, 2);
+
+    TEST_VMOV(, poly, p, 64, 1);
+    TEST_VMOV(q, poly, p, 64, 2);
+
+    switch (i) {
+    case 0:
+      CHECK(TEST_MSG, poly, 64, 1, PRIx64, vmov_n_expected0, "");
+      CHECK(TEST_MSG, poly, 64, 2, PRIx64, vmov_n_expected0, "");
+      break;
+    case 1:
+      CHECK(TEST_MSG, poly, 64, 1, PRIx64, vmov_n_expected1, "");
+      CHECK(TEST_MSG, poly, 64, 2, PRIx64, vmov_n_expected1, "");
+      break;
+    case 2:
+      CHECK(TEST_MSG, poly, 64, 1, PRIx64, vmov_n_expected2, "");
+      CHECK(TEST_MSG, poly, 64, 2, PRIx64, vmov_n_expected2, "");
+      break;
+    default:
+      abort();
+    }
+  }
+
+  /* vget_lane_p64 tests.  */
+#undef TEST_MSG
+#define TEST_MSG "VGET_LANE/VGETQ_LANE"
+
+#define TEST_VGET_LANE(Q, T1, T2, W, N, L)				   \
+  VECT_VAR(vget_lane_vector, T1, W, N) = vget##Q##_lane_##T2##W(VECT_VAR(vector, T1, W, N), L); \
+  if (VECT_VAR(vget_lane_vector, T1, W, N) != VECT_VAR(vget_lane_expected, T1, W, N)) {		\
+    fprintf(stderr,							   \
+	    "ERROR in %s (%s line %d in result '%s') at type %s "	   \
+	    "got 0x%" PRIx##W " != 0x%" PRIx##W "\n",			   \
+	    TEST_MSG, __FILE__, __LINE__,				   \
+	    STR(VECT_VAR(vget_lane_expected, T1, W, N)),		   \
+	    STR(VECT_NAME(T1, W, N)),					   \
+	    VECT_VAR(vget_lane_vector, T1, W, N),			   \
+	    VECT_VAR(vget_lane_expected, T1, W, N));			   \
+    abort ();								   \
+  }
+
+  /* Initialize input values.  */
+  DECL_VARIABLE(vector, poly, 64, 1);
+  DECL_VARIABLE(vector, poly, 64, 2);
+
+  VLOAD(vector, buffer,  , poly, p, 64, 1);
+  VLOAD(vector, buffer, q, poly, p, 64, 2);
+
+  VECT_VAR_DECL(vget_lane_vector, poly, 64, 1);
+  VECT_VAR_DECL(vget_lane_vector, poly, 64, 2);
+
+  TEST_VGET_LANE( , poly, p, 64, 1, 0);
+  TEST_VGET_LANE(q, poly, p, 64, 2, 0);
+
+  /* vldx_lane_p64 tests.  */
+#undef TEST_MSG
+#define TEST_MSG "VLDX_LANE/VLDXQ_LANE"
+
+VECT_VAR_DECL_INIT(buffer_vld2_lane, poly, 64, 2);
+VECT_VAR_DECL_INIT(buffer_vld3_lane, poly, 64, 3);
+VECT_VAR_DECL_INIT(buffer_vld4_lane, poly, 64, 4);
+
+  /* In this case, input variables are arrays of vectors.  */
+#define DECL_VLD_STX_LANE(T1, W, N, X)					\
+  VECT_ARRAY_TYPE(T1, W, N, X) VECT_ARRAY_VAR(vector, T1, W, N, X);	\
+  VECT_ARRAY_TYPE(T1, W, N, X) VECT_ARRAY_VAR(vector_src, T1, W, N, X);	\
+  VECT_VAR_DECL(result_bis_##X, T1, W, N)[X * N]
+
+  /* We need to use a temporary result buffer (result_bis), because
+     the one used for other tests is not large enough. A subset of the
+     result data is moved from result_bis to result, and it is this
+     subset which is used to check the actual behavior. The next
+     macro enables to move another chunk of data from result_bis to
+     result.  */
+  /* We also use another extra input buffer (buffer_src), which we
+     fill with 0xAA, and which it used to load a vector from which we
+     read a given lane.  */
+
+#define TEST_VLDX_LANE(Q, T1, T2, W, N, X, L)				\
+  memset (VECT_VAR(buffer_src, T1, W, N), 0xAA,				\
+	  sizeof(VECT_VAR(buffer_src, T1, W, N)));			\
+									\
+  VECT_ARRAY_VAR(vector_src, T1, W, N, X) =				\
+    vld##X##Q##_##T2##W(VECT_VAR(buffer_src, T1, W, N));		\
+									\
+  VECT_ARRAY_VAR(vector, T1, W, N, X) =					\
+    /* Use dedicated init buffer, of size.  X */			\
+    vld##X##Q##_lane_##T2##W(VECT_VAR(buffer_vld##X##_lane, T1, W, X),	\
+			     VECT_ARRAY_VAR(vector_src, T1, W, N, X),	\
+			     L);					\
+  vst##X##Q##_##T2##W(VECT_VAR(result_bis_##X, T1, W, N),		\
+		      VECT_ARRAY_VAR(vector, T1, W, N, X));		\
+  memcpy(VECT_VAR(result, T1, W, N), VECT_VAR(result_bis_##X, T1, W, N), \
+	 sizeof(VECT_VAR(result, T1, W, N)))
+
+  /* Overwrite "result" with the contents of "result_bis"[Y].  */
+#undef TEST_EXTRA_CHUNK
+#define TEST_EXTRA_CHUNK(T1, W, N, X, Y)		\
+  memcpy(VECT_VAR(result, T1, W, N),			\
+	 &(VECT_VAR(result_bis_##X, T1, W, N)[Y*N]),	\
+	 sizeof(VECT_VAR(result, T1, W, N)));
+
+  /* Add some padding to try to catch out of bound accesses.  */
+#define ARRAY1(V, T, W, N) VECT_VAR_DECL(V,T,W,N)[1]={42}
+#define DUMMY_ARRAY(V, T, W, N, L) \
+  VECT_VAR_DECL(V,T,W,N)[N*L]={0}; \
+  ARRAY1(V##_pad,T,W,N)
+
+#define DECL_ALL_VLD_STX_LANE(X)     \
+  DECL_VLD_STX_LANE(poly, 64, 1, X); \
+  DECL_VLD_STX_LANE(poly, 64, 2, X);
+
+#define TEST_ALL_VLDX_LANE(X)		  \
+  TEST_VLDX_LANE(, poly, p, 64, 1, X, 0); \
+  TEST_VLDX_LANE(q, poly, p, 64, 2, X, 0);
+
+#define TEST_ALL_EXTRA_CHUNKS(X,Y)	     \
+  TEST_EXTRA_CHUNK(poly, 64, 1, X, Y) \
+  TEST_EXTRA_CHUNK(poly, 64, 2, X, Y)
+
+#define CHECK_RESULTS_VLD_STX_LANE(test_name,EXPECTED,comment)	\
+  CHECK(test_name, poly, 64, 1, PRIx64, EXPECTED, comment);	\
+  CHECK(test_name, poly, 64, 2, PRIx64, EXPECTED, comment);
+
+  /* Declare the temporary buffers / variables.  */
+  DECL_ALL_VLD_STX_LANE(2);
+  DECL_ALL_VLD_STX_LANE(3);
+  DECL_ALL_VLD_STX_LANE(4);
+
+  DUMMY_ARRAY(buffer_src, poly, 64, 1, 4);
+  DUMMY_ARRAY(buffer_src, poly, 64, 2, 4);
+
+  /* Check vld2_lane/vld2q_lane.  */
+  clean_results ();
+#undef TEST_MSG
+#define TEST_MSG "VLD2_LANE/VLD2Q_LANE"
+  TEST_ALL_VLDX_LANE(2);
+  CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st2_0, " chunk 0");
+
+  TEST_ALL_EXTRA_CHUNKS(2, 1);
+  CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st2_1, " chunk 1");
+
+  /* Check vld3_lane/vld3q_lane.  */
+  clean_results ();
+#undef TEST_MSG
+#define TEST_MSG "VLD3_LANE/VLD3Q_LANE"
+  TEST_ALL_VLDX_LANE(3);
+  CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st3_0, " chunk 0");
+
+  TEST_ALL_EXTRA_CHUNKS(3, 1);
+  CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st3_1, " chunk 1");
+
+  TEST_ALL_EXTRA_CHUNKS(3, 2);
+  CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st3_2, " chunk 2");
+
+  /* Check vld4_lane/vld4q_lane.  */
+  clean_results ();
+#undef TEST_MSG
+#define TEST_MSG "VLD4_LANE/VLD4Q_LANE"
+  TEST_ALL_VLDX_LANE(4);
+  CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st4_0, " chunk 0");
+
+  TEST_ALL_EXTRA_CHUNKS(4, 1);
+  CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st4_1, " chunk 1");
+  TEST_ALL_EXTRA_CHUNKS(4, 2);
+
+  CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st4_2, " chunk 2");
+
+  TEST_ALL_EXTRA_CHUNKS(4, 3);
+  CHECK_RESULTS_VLD_STX_LANE (TEST_MSG, expected_vld_st4_3, " chunk 3");
+
+  /* In this case, input variables are arrays of vectors.  */
+#define DECL_VSTX_LANE(T1, W, N, X)					\
+  VECT_ARRAY_TYPE(T1, W, N, X) VECT_ARRAY_VAR(vector, T1, W, N, X);	\
+  VECT_ARRAY_TYPE(T1, W, N, X) VECT_ARRAY_VAR(vector_src, T1, W, N, X);	\
+  VECT_VAR_DECL(result_bis_##X, T1, W, N)[X * N]
+
+  /* We need to use a temporary result buffer (result_bis), because
+     the one used for other tests is not large enough. A subset of the
+     result data is moved from result_bis to result, and it is this
+     subset which is used to check the actual behavior. The next
+     macro enables to move another chunk of data from result_bis to
+     result.  */
+  /* We also use another extra input buffer (buffer_src), which we
+     fill with 0xAA, and which it used to load a vector from which we
+     read a given lane.  */
+#define TEST_VSTX_LANE(Q, T1, T2, W, N, X, L)				 \
+  memset (VECT_VAR(buffer_src, T1, W, N), 0xAA,				 \
+	  sizeof(VECT_VAR(buffer_src, T1, W, N)));			 \
+  memset (VECT_VAR(result_bis_##X, T1, W, N), 0,			 \
+	  sizeof(VECT_VAR(result_bis_##X, T1, W, N)));			 \
+									 \
+  VECT_ARRAY_VAR(vector_src, T1, W, N, X) =				 \
+    vld##X##Q##_##T2##W(VECT_VAR(buffer_src, T1, W, N));		 \
+									 \
+  VECT_ARRAY_VAR(vector, T1, W, N, X) =					 \
+    /* Use dedicated init buffer, of size X.  */			 \
+    vld##X##Q##_lane_##T2##W(VECT_VAR(buffer_vld##X##_lane, T1, W, X),	 \
+			     VECT_ARRAY_VAR(vector_src, T1, W, N, X),	 \
+			     L);					 \
+  vst##X##Q##_lane_##T2##W(VECT_VAR(result_bis_##X, T1, W, N),		 \
+			   VECT_ARRAY_VAR(vector, T1, W, N, X),		 \
+			   L);						 \
+  memcpy(VECT_VAR(result, T1, W, N), VECT_VAR(result_bis_##X, T1, W, N), \
+	 sizeof(VECT_VAR(result, T1, W, N)));
+
+#define TEST_ALL_VSTX_LANE(X)		  \
+  TEST_VSTX_LANE(, poly, p, 64, 1, X, 0); \
+  TEST_VSTX_LANE(q, poly, p, 64, 2, X, 0);
+
+  /* Check vst2_lane/vst2q_lane.  */
+  clean_results ();
+#undef TEST_MSG
+#define TEST_MSG "VST2_LANE/VST2Q_LANE"
+  TEST_ALL_VSTX_LANE(2);
+
+#define CMT " (chunk 0)"
+  CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st2_0, CMT);
+
+  TEST_ALL_EXTRA_CHUNKS(2, 1);
+#undef CMT
+#define CMT " chunk 1"
+  CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st2_1, CMT);
+
+  /* Check vst3_lane/vst3q_lane.  */
+  clean_results ();
+#undef TEST_MSG
+#define TEST_MSG "VST3_LANE/VST3Q_LANE"
+  TEST_ALL_VSTX_LANE(3);
+
+#undef CMT
+#define CMT " (chunk 0)"
+  CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st3_0, CMT);
+
+  TEST_ALL_EXTRA_CHUNKS(3, 1);
+
+#undef CMT
+#define CMT " (chunk 1)"
+  CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st3_1, CMT);
+
+  TEST_ALL_EXTRA_CHUNKS(3, 2);
+
+#undef CMT
+#define CMT " (chunk 2)"
+  CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st3_2, CMT);
+
+  /* Check vst4_lane/vst4q_lane.  */
+  clean_results ();
+#undef TEST_MSG
+#define TEST_MSG "VST4_LANE/VST4Q_LANE"
+  TEST_ALL_VSTX_LANE(4);
+
+#undef CMT
+#define CMT " (chunk 0)"
+  CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st4_0, CMT);
+
+  TEST_ALL_EXTRA_CHUNKS(4, 1);
+
+#undef CMT
+#define CMT " (chunk 1)"
+  CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st4_1, CMT);
+
+  TEST_ALL_EXTRA_CHUNKS(4, 2);
+
+#undef CMT
+#define CMT " (chunk 2)"
+  CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st4_2, CMT);
+
+  TEST_ALL_EXTRA_CHUNKS(4, 3);
+
+#undef CMT
+#define CMT " (chunk 3)"
+  CHECK(TEST_MSG, poly, 64, 1, PRIx64, expected_vld_st4_3, CMT);
+
+#endif /* __aarch64__.  */
+
   return 0;
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c
index 808641524c47b2c245ee2f10e74a784a7bccefc9..f192d4dda514287c8417e7fc922bc580b209b163 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p128.c
@@ -1,7 +1,8 @@ 
 /* This file contains tests for the vreinterpret *p128 intrinsics.  */
 
-/* { dg-require-effective-target arm_crypto_ok } */
+/* { dg-require-effective-target arm_crypto_ok { target { arm*-*-* } } } */
 /* { dg-add-options arm_crypto } */
+/* { dg-additional-options "-march=armv8-a+crypto" { target { aarch64*-*-* } } }*/
 
 #include <arm_neon.h>
 #include "arm-neon-ref.h"
@@ -78,9 +79,7 @@  VECT_VAR_DECL(vreint_expected_q_f16_p128,hfloat,16,8) [] = { 0xfff0, 0xffff,
 int main (void)
 {
   DECL_VARIABLE_128BITS_VARIANTS(vreint_vector);
-  DECL_VARIABLE(vreint_vector, poly, 64, 2);
   DECL_VARIABLE_128BITS_VARIANTS(vreint_vector_res);
-  DECL_VARIABLE(vreint_vector_res, poly, 64, 2);
 
   clean_results ();
 
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p64.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p64.c
index 1d8cf9aa69f0b5b0717e98de613e3c350d6395d4..c915fd2fea6b4d8770c9a4aab88caad391105d89 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p64.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vreinterpret_p64.c
@@ -1,7 +1,8 @@ 
 /* This file contains tests for the vreinterpret *p64 intrinsics.  */
 
-/* { dg-require-effective-target arm_crypto_ok } */
+/* { dg-require-effective-target arm_crypto_ok { target { arm*-*-* } } } */
 /* { dg-add-options arm_crypto } */
+/* { dg-additional-options "-march=armv8-a+crypto" { target { aarch64*-*-* } } }*/
 
 #include <arm_neon.h>
 #include "arm-neon-ref.h"
@@ -121,11 +122,7 @@  int main (void)
   CHECK_FP(TEST_MSG, T1, W, N, PRIx##W, EXPECTED, "");
 
   DECL_VARIABLE_ALL_VARIANTS(vreint_vector);
-  DECL_VARIABLE(vreint_vector, poly, 64, 1);
-  DECL_VARIABLE(vreint_vector, poly, 64, 2);
   DECL_VARIABLE_ALL_VARIANTS(vreint_vector_res);
-  DECL_VARIABLE(vreint_vector_res, poly, 64, 1);
-  DECL_VARIABLE(vreint_vector_res, poly, 64, 2);
 
   clean_results ();