From patchwork Sun Jan 8 05:14:01 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jim Wilson X-Patchwork-Id: 90300 Delivered-To: patch@linaro.org Received: by 10.140.20.101 with SMTP id 92csp342439qgi; Sat, 7 Jan 2017 21:14:33 -0800 (PST) X-Received: by 10.84.193.36 with SMTP id e33mr14675520pld.147.1483852473531; Sat, 07 Jan 2017 21:14:33 -0800 (PST) Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id w86si2241789pfa.192.2017.01.07.21.14.33 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 07 Jan 2017 21:14:33 -0800 (PST) Received-SPF: pass (google.com: domain of gdb-patches-return-136063-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org; spf=pass (google.com: domain of gdb-patches-return-136063-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=gdb-patches-return-136063-patch=linaro.org@sourceware.org; dmarc=fail (p=NONE dis=NONE) header.from=linaro.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:mime-version:from:date:message-id:subject:to :cc:content-type; q=dns; s=default; b=bzfghN+4Zvy1B6rwOCmZerh4rK 2m7LBCt5F8lxUGXC5xTAnUmRzxQL5dqk7CpgahYs/b8lKzxdUA78QFp7hU7i3qmQ Mu3AudxCE6AmeA/BF55i4VJeTP0XAjJWRaO251q7UwS0dKZYn83gnh+HtI0ZNGwz J2tYhIcB4hvcOuosA= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:mime-version:from:date:message-id:subject:to :cc:content-type; s=default; bh=GFhEUeRb7Nu5iX8/GhGA6i1uSr8=; b= gi327Pu84LXj8ZNAtz6Y2IlFIkEgtu3J/P3aFhCfq+4xDy+22c9T5gZ1YoTUZWLB zpzCqRLys6LfjRDiyJo/GsdoXqiEst3U/PQLaqqvnO9RQmuSBfDVtaGSqdBzETh1 RsxJ38UWeSXBfi+7DhxsobwmW1Cr08fJG8vcrJmLmZs= Received: (qmail 78357 invoked by alias); 8 Jan 2017 05:14:21 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 78300 invoked by uid 89); 8 Jan 2017 05:14:13 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.4 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_NONE, RCVD_IN_SORBS_SPAM, SPF_PASS autolearn=no version=3.3.2 spammy=uzp, unzip, 2269, H*c:sk:f403045 X-HELO: mail-yw0-f172.google.com Received: from mail-yw0-f172.google.com (HELO mail-yw0-f172.google.com) (209.85.161.172) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sun, 08 Jan 2017 05:14:03 +0000 Received: by mail-yw0-f172.google.com with SMTP id a10so401806214ywa.3 for ; Sat, 07 Jan 2017 21:14:03 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to:cc; bh=/h7Xx4ITaxuwVkmaxjrAjBDbkNjWUs5ILMG+wef+LKU=; b=gGsh17HSc0nOMbb0MybgID0YMxQWuRQW9dNv7yinMQv86obSvFJ7Nqf4+75Vp7Om1+ Z3hutzsKDphhiEefAuClI9iw2qzU9xylYSrat64qshEuvsoVI7YI2hy/y7RNcaOlOnUE ONHpqw1E+84wKnHtcXmb6jFzj9PnWFrXKmg7+xul/hVd3RcEOeSWL4dgXqC9tJ7qytkG piGSkwQKyx7iYXOTVg/UJQ+dULeQ5ZEulE8MH/uwYQT68wQr1QAV60Xgget1p9BUw/eV eIMXaBaCUX/wn4dKbJHmlzOOjMCZspi/Pnm6vU2R3GLHwoNHz0lH/DXj3E9Udmzn0g7P 13hw== X-Gm-Message-State: AIkVDXK5i/fywk5jcMum93GSwwLw2ovTrv4a3gErrcv5y/DNp7bdOFG6dXVHBqfAeIeVPsn2XXS1BE9mqYnSNLRK X-Received: by 10.129.173.10 with SMTP id l10mr10623358ywh.175.1483852442170; Sat, 07 Jan 2017 21:14:02 -0800 (PST) MIME-Version: 1.0 Received: by 10.129.92.4 with HTTP; Sat, 7 Jan 2017 21:14:01 -0800 (PST) From: Jim Wilson Date: Sat, 7 Jan 2017 21:14:01 -0800 Message-ID: Subject: [PATCH] aarch64 sim uzp1/uzp2 bug fix To: gdb-patches@sourceware.org Cc: Nick Clifton The uzp1/uzp2 implementation has a number of problems I had to rewrite it. It doesn't get the shifting/masking right. it gets input1 and input2 wrong. It checks one bit instead of two for the size field. It doesn't fail for the non-full size==3 (1d) case which should be an unallocated instruction. The new testcase passes with the patch, and fails without. The GCC C testsuite failures go from 2269 to 2227 (-42). Jim 2017-01-07 Jim Wilson sim/aarch64/ * simulator.c (do_vec_UZP): Rewrite. sim/testsuite/sim/aarch64/ * uzp.s: New. diff --git a/sim/aarch64/simulator.c b/sim/aarch64/simulator.c index 7b75c6e..36129e5 100644 --- a/sim/aarch64/simulator.c +++ b/sim/aarch64/simulator.c @@ -2958,12 +2958,10 @@ do_vec_UZP (sim_cpu *cpu) uint64_t val_n1 = aarch64_get_vec_u64 (cpu, vn, 0); uint64_t val_n2 = aarch64_get_vec_u64 (cpu, vn, 1); - uint64_t val1 = 0; - uint64_t val2 = 0; + uint64_t val1; + uint64_t val2; - uint64_t input1 = upper ? val_n1 : val_m1; - uint64_t input2 = upper ? val_n2 : val_m2; - unsigned i; + uint64_t input2 = full ? val_n2 : val_m1; NYI_assert (29, 24, 0x0E); NYI_assert (21, 21, 0); @@ -2971,32 +2969,68 @@ do_vec_UZP (sim_cpu *cpu) NYI_assert (13, 10, 6); TRACE_DECODE (cpu, "emulated at line %d", __LINE__); - switch (INSTR (23, 23)) + switch (INSTR (23, 22)) { case 0: - for (i = 0; i < 8; i++) + val1 = (val_n1 >> (upper * 8)) & 0xFFULL; + val1 |= (val_n1 >> ((upper * 8) + 8)) & 0xFF00ULL; + val1 |= (val_n1 >> ((upper * 8) + 16)) & 0xFF0000ULL; + val1 |= (val_n1 >> ((upper * 8) + 24)) & 0xFF000000ULL; + + val1 |= (input2 << (32 - (upper * 8))) & 0xFF00000000ULL; + val1 |= (input2 << (24 - (upper * 8))) & 0xFF0000000000ULL; + val1 |= (input2 << (16 - (upper * 8))) & 0xFF000000000000ULL; + val1 |= (input2 << (8 - (upper * 8))) & 0xFF00000000000000ULL; + + if (full) { - val1 |= (input1 >> (i * 8)) & (0xFFULL << (i * 8)); - val2 |= (input2 >> (i * 8)) & (0xFFULL << (i * 8)); + val2 = (val_m1 >> (upper * 8)) & 0xFFULL; + val2 |= (val_m1 >> ((upper * 8) + 8)) & 0xFF00ULL; + val2 |= (val_m1 >> ((upper * 8) + 16)) & 0xFF0000ULL; + val2 |= (val_m1 >> ((upper * 8) + 24)) & 0xFF000000ULL; + + val2 |= (val_m2 << (32 - (upper * 8))) & 0xFF00000000ULL; + val2 |= (val_m2 << (24 - (upper * 8))) & 0xFF0000000000ULL; + val2 |= (val_m2 << (16 - (upper * 8))) & 0xFF000000000000ULL; + val2 |= (val_m2 << (8 - (upper * 8))) & 0xFF00000000000000ULL; } break; case 1: - for (i = 0; i < 4; i++) + val1 = (val_n1 >> (upper * 16)) & 0xFFFFULL; + val1 |= (val_n1 >> ((upper * 16) + 16)) & 0xFFFF0000ULL; + + val1 |= (input2 << (32 - (upper * 16))) & 0xFFFF00000000ULL;; + val1 |= (input2 << (16 - (upper * 16))) & 0xFFFF000000000000ULL; + + if (full) { - val1 |= (input1 >> (i * 16)) & (0xFFFFULL << (i * 16)); - val2 |= (input2 >> (i * 16)) & (0xFFFFULL << (i * 16)); + val2 = (val_m1 >> (upper * 16)) & 0xFFFFULL; + val2 |= (val_m1 >> ((upper * 16) + 16)) & 0xFFFF0000ULL; + + val2 |= (val_m2 << (32 - (upper * 16))) & 0xFFFF00000000ULL; + val2 |= (val_m2 << (16 - (upper * 16))) & 0xFFFF000000000000ULL; } break; case 2: - val1 = ((input1 & 0xFFFFFFFF) | ((input1 >> 32) & 0xFFFFFFFF00000000ULL)); - val2 = ((input2 & 0xFFFFFFFF) | ((input2 >> 32) & 0xFFFFFFFF00000000ULL)); + val1 = (val_n1 >> (upper * 32)) & 0xFFFFFFFF; + val1 |= (input2 << (32 - (upper * 32))) & 0xFFFFFFFF00000000ULL; + + if (full) + { + val2 = (val_m1 >> (upper * 32)) & 0xFFFFFFFF; + val2 |= (val_m2 << (32 - (upper * 32))) & 0xFFFFFFFF00000000ULL; + } + break; case 3: - val1 = input1; - val2 = input2; - break; + if (! full) + HALT_UNALLOC; + + val1 = upper ? val_n2 : val_n1; + val2 = upper ? val_m2 : val_m1; + break; } aarch64_set_vec_u64 (cpu, vd, 0, val1); diff --git a/sim/testsuite/sim/aarch64/uzp.s b/sim/testsuite/sim/aarch64/uzp.s new file mode 100644 index 0000000..268d4fb --- /dev/null +++ b/sim/testsuite/sim/aarch64/uzp.s @@ -0,0 +1,214 @@ +# mach: aarch64 + +# Check the unzip instructions: uzp, uzp2. + +.include "testutils.inc" + +input1: + .word 0x04030201 + .word 0x08070605 + .word 0x0c0b0a09 + .word 0x100f0e0d +input2: + .word 0x14131211 + .word 0x18171615 + .word 0x1c1b1a19 + .word 0x201f1e1d +zl8b: + .word 0x07050301 + .word 0x17151311 +zu8b: + .word 0x08060402 + .word 0x18161412 +zl16b: + .word 0x07050301 + .word 0x0f0d0b09 + .word 0x17151311 + .word 0x1f1d1b19 +zu16b: + .word 0x08060402 + .word 0x100e0c0a + .word 0x18161412 + .word 0x201e1c1a +zl4h: + .word 0x06050201 + .word 0x16151211 +zu4h: + .word 0x08070403 + .word 0x18171413 +zl8h: + .word 0x06050201 + .word 0x0e0d0a09 + .word 0x16151211 + .word 0x1e1d1a19 +zu8h: + .word 0x08070403 + .word 0x100f0c0b + .word 0x18171413 + .word 0x201f1c1b +zl2s: + .word 0x04030201 + .word 0x14131211 +zu2s: + .word 0x08070605 + .word 0x18171615 +zl4s: + .word 0x04030201 + .word 0x0c0b0a09 + .word 0x14131211 + .word 0x1c1b1a19 +zu4s: + .word 0x08070605 + .word 0x100f0e0d + .word 0x18171615 + .word 0x201f1e1d +zl2d: + .word 0x04030201 + .word 0x08070605 + .word 0x14131211 + .word 0x18171615 +zu2d: + .word 0x0c0b0a09 + .word 0x100f0e0d + .word 0x1c1b1a19 + .word 0x201f1e1d + + start + adrp x0, input1 + ldr q0, [x0, #:lo12:input1] + adrp x0, input2 + ldr q1, [x0, #:lo12:input2] + + uzp1 v2.8b, v0.8b, v1.8b + mov x1, v2.d[0] + adrp x3, zl8b + ldr x4, [x3, #:lo12:zl8b] + cmp x1, x4 + bne .Lfailure + + uzp2 v2.8b, v0.8b, v1.8b + mov x1, v2.d[0] + adrp x3, zu8b + ldr x4, [x3, #:lo12:zu8b] + cmp x1, x4 + bne .Lfailure + + uzp1 v2.16b, v0.16b, v1.16b + mov x1, v2.d[0] + mov x2, v2.d[1] + adrp x3, zl16b + ldr x4, [x3, #:lo12:zl16b] + cmp x1, x4 + bne .Lfailure + ldr x5, [x3, #:lo12:zl16b+8] + cmp x2, x5 + bne .Lfailure + + uzp2 v2.16b, v0.16b, v1.16b + mov x1, v2.d[0] + mov x2, v2.d[1] + adrp x3, zu16b + ldr x4, [x3, #:lo12:zu16b] + cmp x1, x4 + bne .Lfailure + ldr x5, [x3, #:lo12:zu16b+8] + cmp x2, x5 + bne .Lfailure + + uzp1 v2.4h, v0.4h, v1.4h + mov x1, v2.d[0] + adrp x3, zl4h + ldr x4, [x3, #:lo12:zl4h] + cmp x1, x4 + bne .Lfailure + + uzp2 v2.4h, v0.4h, v1.4h + mov x1, v2.d[0] + adrp x3, zu4h + ldr x4, [x3, #:lo12:zu4h] + cmp x1, x4 + bne .Lfailure + + uzp1 v2.8h, v0.8h, v1.8h + mov x1, v2.d[0] + mov x2, v2.d[1] + adrp x3, zl8h + ldr x4, [x3, #:lo12:zl8h] + cmp x1, x4 + bne .Lfailure + ldr x5, [x3, #:lo12:zl8h+8] + cmp x2, x5 + bne .Lfailure + + uzp2 v2.8h, v0.8h, v1.8h + mov x1, v2.d[0] + mov x2, v2.d[1] + adrp x3, zu8h + ldr x4, [x3, #:lo12:zu8h] + cmp x1, x4 + bne .Lfailure + ldr x5, [x3, #:lo12:zu8h+8] + cmp x2, x5 + bne .Lfailure + + uzp1 v2.2s, v0.2s, v1.2s + mov x1, v2.d[0] + adrp x3, zl2s + ldr x4, [x3, #:lo12:zl2s] + cmp x1, x4 + bne .Lfailure + + uzp2 v2.2s, v0.2s, v1.2s + mov x1, v2.d[0] + adrp x3, zu2s + ldr x4, [x3, #:lo12:zu2s] + cmp x1, x4 + bne .Lfailure + + uzp1 v2.4s, v0.4s, v1.4s + mov x1, v2.d[0] + mov x2, v2.d[1] + adrp x3, zl4s + ldr x4, [x3, #:lo12:zl4s] + cmp x1, x4 + bne .Lfailure + ldr x5, [x3, #:lo12:zl4s+8] + cmp x2, x5 + bne .Lfailure + + uzp2 v2.4s, v0.4s, v1.4s + mov x1, v2.d[0] + mov x2, v2.d[1] + adrp x3, zu4s + ldr x4, [x3, #:lo12:zu4s] + cmp x1, x4 + bne .Lfailure + ldr x5, [x3, #:lo12:zu4s+8] + cmp x2, x5 + bne .Lfailure + + uzp1 v2.2d, v0.2d, v1.2d + mov x1, v2.d[0] + mov x2, v2.d[1] + adrp x3, zl2d + ldr x4, [x3, #:lo12:zl2d] + cmp x1, x4 + bne .Lfailure + ldr x5, [x3, #:lo12:zl2d+8] + cmp x2, x5 + bne .Lfailure + + uzp2 v2.2d, v0.2d, v1.2d + mov x1, v2.d[0] + mov x2, v2.d[1] + adrp x3, zu2d + ldr x4, [x3, #:lo12:zu2d] + cmp x1, x4 + bne .Lfailure + ldr x5, [x3, #:lo12:zu2d+8] + cmp x2, x5 + bne .Lfailure + + pass +.Lfailure: + fail