From patchwork Fri Jan 4 22:31:07 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 154801 Delivered-To: patch@linaro.org Received: by 2002:a2e:299d:0:0:0:0:0 with SMTP id p29-v6csp1090651ljp; Fri, 4 Jan 2019 14:32:17 -0800 (PST) X-Google-Smtp-Source: ALg8bN7qle5CQJ7LKyaKwKiD7VyMp02c7fQJRcoAD1QsFEW2KKZAZ6xAuPk3iODNVXqToJnbSbuL X-Received: by 2002:adf:cd0e:: with SMTP id w14mr47180879wrm.218.1546641137244; Fri, 04 Jan 2019 14:32:17 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1546641137; cv=none; d=google.com; s=arc-20160816; b=BQsjClZeoWIRp/ZTs7HhrGET6sI3C+/BdDZrho10S2hYYnzu2lcKhcZjuejSZcvEdb a7HUZtzsQRAJqHcZCFd06dYLGUJFO8xXnEQfFHSwrcrK/tyhpjryOBCkOLwlP1aZgzNc L+RtwYjJxmLUBntdw1eY/17sjiZp/2m7OAgZUkxXD+eHqCHIedE3sTjdYGTJio63Ch2r 048Hn3p8iUqrmEbkUAiOwIT2R05Iu1M4hYjokgKoYeZ+FvEUcpCVdsp0LjZgxzBq6FBl i8BIlN0DHbj2NX9kf+vnYLTLQ99Md6g16HkWjs8UcKMopBpaFUi/rFo+vrpejzsAlMxW cYZg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature; bh=po3fGKZcEsraXkqBMSX5FPnIKtuOJRDnAJ2/AeZX7VI=; b=fv2YYxyfgYFCP5miGQI8Hg/JSCyaQNTbBiodEsGoT/JUJtKs20l+HDmQQtUM/S8Bi0 ndfm97so5zg5pR/o/W8tL3DHfKAXA1L4SzFL8CByd9Fnc8qPc5OGKjJIfzwa43c+z+uY ZPeFgHWiKELRjSMuQ+92y6nfea6KJL75I8c3UjcPjQW84VPMGl8uaU2IfApFDA/Qzg9X Jvi5xV5URhj8bQbFDXqHHOEb4Jmfmd57U0/Cti7og8lXEYW3n892X+Vjis5dXjlAsK8q HRH1fZRjQYF1y0Ljn0S0tTzOzI3I9aGb8rPQercbjqA9LDMcnetQRKNBpOZDZEtBbmKR 4WMg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=faTdPAb1; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) 
header.from=linaro.org Return-Path: Received: from listsout.gnu.org (listsout.gnu.org. [208.118.235.17]) by mx.google.com with ESMTPS id z132si1307526wmb.166.2019.01.04.14.32.17 for (version=TLS1 cipher=AES128-SHA bits=128/128); Fri, 04 Jan 2019 14:32:17 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=faTdPAb1; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([127.0.0.1]:56602 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY0k-0002fd-Ie for patch@linaro.org; Fri, 04 Jan 2019 17:32:14 -0500 Received: from eggsout.gnu.org ([209.51.188.92]:54790 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY02-0002dW-ME for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:31 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gfY01-0001Dj-QI for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:30 -0500 Received: from mail-it1-x141.google.com ([2607:f8b0:4864:20::141]:33664) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gfY01-0001DX-MK for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:29 -0500 Received: by mail-it1-x141.google.com with SMTP id m8so2327292itk.0 for ; Fri, 04 Jan 2019 14:31:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=po3fGKZcEsraXkqBMSX5FPnIKtuOJRDnAJ2/AeZX7VI=; b=faTdPAb1TBGQ2kdTyUf/ne3KM0vGFU78c0h9NFNH2AvMRGln2paiWhJ8uPDMSUjsYW 
dBBmHAliCHujluhQsepIPzePD3m0tHL6tO12EusGKAiwqtjpjWyAUugtx30s/mLJsgEE GRQAhsZl+Bkrkg2+ssLHIJZ+28n4zUuAJFvgg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=po3fGKZcEsraXkqBMSX5FPnIKtuOJRDnAJ2/AeZX7VI=; b=myv3NahVC6v1O/0JSKbrHFCtrzNuZFMjqG2TKfHdxa5TkMy3eW8PnU8salI/FXBxj4 Vxcqynvlv22U95wPnyYX3v2FIkoAAGxdftLhWC06S/G6+fTnHXxyEJMIKtBcE507Ttbt s4OIY7dtXAUGOFqgJLQUT6IfnY32poe66R4iJTfTUa6zEJeX/n+lRbfNAk1nhJZICQw8 GRVn1+xOLQqFmPv7XMsqLFAkJim3u7vPwjWM4j+xGc+r2QWdCvADZXWozfKJ9vTK0rQw j8/GOkcuv7fiXP1cBx/+zl7BsjSTQa4TIyatYkVuBfpN0MAES4jP7pCYrI67LEwS420u ysqg== X-Gm-Message-State: AJcUukcPkLtcFilHVVenY0Hlkm4rk5GgH4I2LtCVprY9zyAISzUH/yRr yYJ2wYSrgb3bECvnJ7GtUbKJCAYo5q0= X-Received: by 2002:a24:57c5:: with SMTP id u188mr2367371ita.54.1546641088647; Fri, 04 Jan 2019 14:31:28 -0800 (PST) Received: from cloudburst.twiddle.net ([172.56.12.23]) by smtp.gmail.com with ESMTPSA id t6sm27793259ioc.87.2019.01.04.14.31.26 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 04 Jan 2019 14:31:28 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 5 Jan 2019 08:31:07 +1000 Message-Id: <20190104223116.14037-2-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org> References: <20190104223116.14037-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:4864:20::141
Subject: [Qemu-devel] [PATCH v2 01/10] tcg: Add logical simplifications during gvec expand
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org
Sender: "Qemu-devel"

We handle many of these during integer expansion, and the rest of them
during integer optimization.

Reviewed-by: David Gibson
Signed-off-by: Richard Henderson
---
 tcg/tcg-op-gvec.c | 35 ++++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

--
2.17.2

diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index 61c25f5784..ec231b78fb 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -1840,7 +1840,12 @@ void tcg_gen_gvec_and(unsigned vece, uint32_t dofs, uint32_t aofs,
         .opc = INDEX_op_and_vec,
         .prefer_i64 = TCG_TARGET_REG_BITS == 64,
     };
-    tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_mov(vece, dofs, aofs, oprsz, maxsz);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
 }
 
 void tcg_gen_gvec_or(unsigned vece, uint32_t dofs, uint32_t aofs,
@@ -1853,7 +1858,12 @@ void tcg_gen_gvec_or(unsigned vece, uint32_t dofs, uint32_t aofs,
         .opc = INDEX_op_or_vec,
         .prefer_i64 = TCG_TARGET_REG_BITS == 64,
     };
-    tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_mov(vece, dofs, aofs, oprsz, maxsz);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
 }
 
 void tcg_gen_gvec_xor(unsigned vece, uint32_t dofs, uint32_t aofs,
@@ -1866,7 +1876,12 @@ void tcg_gen_gvec_xor(unsigned vece, uint32_t dofs, uint32_t aofs,
         .opc = INDEX_op_xor_vec,
         .prefer_i64 = TCG_TARGET_REG_BITS == 64,
     };
-    tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_dup8i(dofs, oprsz, maxsz, 0);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
 }
 
 void tcg_gen_gvec_andc(unsigned vece, uint32_t dofs, uint32_t aofs,
@@ -1879,7 +1894,12 @@ void tcg_gen_gvec_andc(unsigned vece, uint32_t dofs, uint32_t aofs,
         .opc = INDEX_op_andc_vec,
         .prefer_i64 = TCG_TARGET_REG_BITS == 64,
     };
-    tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_dup8i(dofs, oprsz, maxsz, 0);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
 }
 
 void tcg_gen_gvec_orc(unsigned vece, uint32_t dofs, uint32_t aofs,
@@ -1892,7 +1912,12 @@ void tcg_gen_gvec_orc(unsigned vece, uint32_t dofs, uint32_t aofs,
         .opc = INDEX_op_orc_vec,
         .prefer_i64 = TCG_TARGET_REG_BITS == 64,
     };
-    tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_dup8i(dofs, oprsz, maxsz, -1);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
 }
 
 static const GVecGen2s gop_ands = {

From patchwork Fri Jan 4 22:31:08 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 154806
Delivered-To: patch@linaro.org
Received: by 2002:a2e:299d:0:0:0:0:0 with SMTP id p29-v6csp1095505ljp; Fri, 4 Jan 2019 14:39:45 -0800 (PST)
X-Google-Smtp-Source: ALg8bN4FizFALlOgfjeKXNHf1xe9T7h8qTUsV0r1FK9yNYyen/UtMs98SEv9+TRWuaBiXoGcMeG+
X-Received: by 2002:adf:f9cb:: with SMTP id w11mr44531739wrr.201.1546641585349; Fri, 04 Jan 2019 14:39:45 -0800 (PST)
ARC-Seal: i=1; a=rsa-sha256; t=1546641585; cv=none; d=google.com; s=arc-20160816; b=Z8Shxo/og2540/1JDfDUdsO/u3lWJBTBPvx59LUZIC7afZ5dMm+chp6thWGP8617M+ O6KcXUxleVt0wCZHuBQT8UYt7yQPcjA59ko91ee9/ifRKG2Jn7TE6DUmyf+e52KTz6tz /lpa4Hn9RJWtF+HpPSXvs8OgIYfBnf6qq2sslOZfVI6QyZZjBb+5lYzHwsgWq8oOOVkD 7HAWCwObk9MHXgDfrHiaOtT5Ayq6lGMNRtVLgBMZvjFxHuBY8BCSN2HQoBTUdKJQWCUP O0+pLSwjPvOYvbYdUU9kU1FeQsbxf5XvBMHvhpKB62MYohMBXIiQVZXWBwGsJo0b5wsr l9DA==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816;
h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature; bh=DCNcv2k0t88K5ivbES4v5MrwSiEW1I9azwgO1d1vYhU=; b=xdp0EMHKBT4u9Hu0HnoXO1XvNcXaNwxRY20ySvydUqM/Fz08vUSSj9pNAM1oR48hpT tQqbma/3b8mVR7wT2fRJDZEyk22sKWPrrca+b48XD9SJkYC6j157D/0590YLSvMXULtS LO2iz3qJRarw432y22ga3d9XoRs+A0Kz9JWL6/DCfD8ZtRsfRW8SEaAmenEFxWvdKGwn dAWWo5pH0OpkznkQGnY4Pm1q8FWWrqGHOLzRpNNhpNwwPsY2dZfoV4v/znUwibAV6BTv 3h7zWDvxURSXwjQhyBXfh6aK3k2SF/I1MvrLlQGR5YXjHjgR9G01936orL8D/fsethAp Mhmg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=JNQ5IkZJ; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from listsout.gnu.org (listsout.gnu.org. 
[208.118.235.17]) by mx.google.com with ESMTPS id d15si31787863wru.309.2019.01.04.14.39.45 for (version=TLS1 cipher=AES128-SHA bits=128/128); Fri, 04 Jan 2019 14:39:45 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=JNQ5IkZJ; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([127.0.0.1]:57913 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY80-0002yF-4P for patch@linaro.org; Fri, 04 Jan 2019 17:39:44 -0500 Received: from eggsout.gnu.org ([209.51.188.92]:54816 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY05-0002es-Mh for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:34 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gfY04-0001F7-F8 for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:33 -0500 Received: from mail-it1-x141.google.com ([2607:f8b0:4864:20::141]:50458) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gfY04-0001Et-9t for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:32 -0500 Received: by mail-it1-x141.google.com with SMTP id z7so3664843iti.0 for ; Fri, 04 Jan 2019 14:31:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=DCNcv2k0t88K5ivbES4v5MrwSiEW1I9azwgO1d1vYhU=; b=JNQ5IkZJSyIVUWbg6RZuVaeG4z5YD3wLR+h/CpnI4J87zN4jPikCfwoiADgPl7i00V QeFfG1EVJrhDj2iNHxEHeLUKAnlvYXQlcM7QKLP2p03EcHaTCna9oWmrKE9A7gg3AzZs 3cuYw8ObC+1DOG9laAMA8PacV3HBKepwlPCjU= 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=DCNcv2k0t88K5ivbES4v5MrwSiEW1I9azwgO1d1vYhU=; b=K7Rump+oTPdxyV85pAq0VppmzeRQXnbLI02G1ryEp6aH2/FZkQcMkbZ1vTRaKEXPaY k9ZUJqg580HFq8c6CbdtjqgZd0u3FjU8srCNqufqXYe72JoqU5FKjGZwR40N7uDuVsDc VnEFUd67vfpwh6xL+D2kxWyikrQWQ4WgrR1N+XOEv+/9kaPqS6eUMBVLpcF9VaE67e6q 7Ux2QlJbJJwkZ57qCQOx0gf2gCtdxNhWIf3kr61aeOCsxxA6Zli61+O2C9NNyXb6LTwI F2gtXCzfDUSn4c15eKPo26kENnti5TtO8JJGoLTk/ixDP79dNG0MMPDHp4ziC2RVLkiP A8bQ== X-Gm-Message-State: AJcUukdWeqJRynPf0OOlL5s7JoXNdKVlTrJmzEsyWjJ1V5AaxvDzqItl HfZ8bbSEqFunQLBVFSde4jWyZpjR+To= X-Received: by 2002:a24:fa4b:: with SMTP id v72mr2014813ith.20.1546641091308; Fri, 04 Jan 2019 14:31:31 -0800 (PST) Received: from cloudburst.twiddle.net ([172.56.12.23]) by smtp.gmail.com with ESMTPSA id t6sm27793259ioc.87.2019.01.04.14.31.29 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 04 Jan 2019 14:31:30 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 5 Jan 2019 08:31:08 +1000 Message-Id: <20190104223116.14037-3-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org> References: <20190104223116.14037-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:4864:20::141
Subject: [Qemu-devel] [PATCH v2 02/10] tcg: Add gvec expanders for nand, nor, eqv
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org
Sender: "Qemu-devel"

Reviewed-by: David Gibson
Signed-off-by: Richard Henderson
---
 accel/tcg/tcg-runtime.h      |  3 +++
 tcg/tcg-op-gvec.h            |  6 +++++
 tcg/tcg-op.h                 |  3 +++
 accel/tcg/tcg-runtime-gvec.c | 33 +++++++++++++++++++++++
 tcg/tcg-op-gvec.c            | 51 ++++++++++++++++++++++++++++++++++++
 tcg/tcg-op-vec.c             | 21 +++++++++++++++
 6 files changed, 117 insertions(+)

--
2.17.2

diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h
index 1bd39d136d..835ddfebb2 100644
--- a/accel/tcg/tcg-runtime.h
+++ b/accel/tcg/tcg-runtime.h
@@ -211,6 +211,9 @@ DEF_HELPER_FLAGS_4(gvec_or, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_xor, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_andc, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_orc, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_nand, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_nor, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_eqv, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_4(gvec_ands, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(gvec_xors, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
diff --git a/tcg/tcg-op-gvec.h b/tcg/tcg-op-gvec.h
index ff43a29a0b..d65b9d9d4c 100644
--- a/tcg/tcg-op-gvec.h
+++ b/tcg/tcg-op-gvec.h
@@ -242,6 +242,12 @@ void tcg_gen_gvec_andc(unsigned vece, uint32_t dofs, uint32_t aofs,
                        uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
 void tcg_gen_gvec_orc(unsigned vece, uint32_t dofs, uint32_t aofs,
                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
+void tcg_gen_gvec_nand(unsigned vece, uint32_t dofs, uint32_t aofs,
+                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
+void tcg_gen_gvec_nor(unsigned vece, uint32_t dofs, uint32_t aofs,
+                      uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
+void tcg_gen_gvec_eqv(unsigned vece, uint32_t dofs, uint32_t aofs,
+                      uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
 
 void tcg_gen_gvec_andi(unsigned vece, uint32_t dofs, uint32_t aofs,
                        int64_t c, uint32_t oprsz, uint32_t maxsz);
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 7007ec0d4d..f6ef1cd690 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -962,6 +962,9 @@ void tcg_gen_or_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 void tcg_gen_xor_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 void tcg_gen_andc_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 void tcg_gen_orc_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
+void tcg_gen_nand_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
+void tcg_gen_nor_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
+void tcg_gen_eqv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 void tcg_gen_not_vec(unsigned vece, TCGv_vec r, TCGv_vec a);
 void tcg_gen_neg_vec(unsigned vece, TCGv_vec r, TCGv_vec a);
 
diff --git a/accel/tcg/tcg-runtime-gvec.c b/accel/tcg/tcg-runtime-gvec.c
index 90340e56e0..d1802467d5 100644
--- a/accel/tcg/tcg-runtime-gvec.c
+++ b/accel/tcg/tcg-runtime-gvec.c
@@ -512,6 +512,39 @@ void HELPER(gvec_orc)(void *d, void *a, void *b, uint32_t desc)
     clear_high(d, oprsz, desc);
 }
 
+void HELPER(gvec_nand)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(vec64)) {
+        *(vec64 *)(d + i) = ~(*(vec64 *)(a + i) & *(vec64 *)(b + i));
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_nor)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(vec64)) {
+        *(vec64 *)(d + i) = ~(*(vec64 *)(a + i) | *(vec64 *)(b + i));
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_eqv)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(vec64)) {
+        *(vec64 *)(d + i) = ~(*(vec64 *)(a + i) ^ *(vec64 *)(b + i));
+    }
+    clear_high(d, oprsz, desc);
+}
+
 void HELPER(gvec_ands)(void *d, void *a, uint64_t b, uint32_t desc)
 {
     intptr_t oprsz = simd_oprsz(desc);
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index ec231b78fb..81689d02f7 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -1920,6 +1920,57 @@ void tcg_gen_gvec_orc(unsigned vece, uint32_t dofs, uint32_t aofs,
     }
 }
 
+void tcg_gen_gvec_nand(unsigned vece, uint32_t dofs, uint32_t aofs,
+                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
+{
+    static const GVecGen3 g = {
+        .fni8 = tcg_gen_nand_i64,
+        .fniv = tcg_gen_nand_vec,
+        .fno = gen_helper_gvec_nand,
+        .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+    };
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_not(vece, dofs, aofs, oprsz, maxsz);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
+}
+
+void tcg_gen_gvec_nor(unsigned vece, uint32_t dofs, uint32_t aofs,
+                      uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
+{
+    static const GVecGen3 g = {
+        .fni8 = tcg_gen_nor_i64,
+        .fniv = tcg_gen_nor_vec,
+        .fno = gen_helper_gvec_nor,
+        .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+    };
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_not(vece, dofs, aofs, oprsz, maxsz);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
+}
+
+void tcg_gen_gvec_eqv(unsigned vece, uint32_t dofs, uint32_t aofs,
+                      uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
+{
+    static const GVecGen3 g = {
+        .fni8 = tcg_gen_eqv_i64,
+        .fniv = tcg_gen_eqv_vec,
+        .fno = gen_helper_gvec_eqv,
+        .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+    };
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_dup8i(dofs, oprsz, maxsz, -1);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
+}
+
 static const GVecGen2s gop_ands = {
     .fni8 = tcg_gen_and_i64,
     .fniv = tcg_gen_and_vec,
diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c
index cefba3d185..d77fdf7c1d 100644
--- a/tcg/tcg-op-vec.c
+++ b/tcg/tcg-op-vec.c
@@ -275,6 +275,27 @@ void tcg_gen_orc_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
     }
 }
 
+void tcg_gen_nand_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    /* TODO: Add TCG_TARGET_HAS_nand_vec when adding a backend supports it. */
+    tcg_gen_and_vec(0, r, a, b);
+    tcg_gen_not_vec(0, r, r);
+}
+
+void tcg_gen_nor_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    /* TODO: Add TCG_TARGET_HAS_nor_vec when adding a backend supports it. */
+    tcg_gen_or_vec(0, r, a, b);
+    tcg_gen_not_vec(0, r, r);
+}
+
+void tcg_gen_eqv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    /* TODO: Add TCG_TARGET_HAS_eqv_vec when adding a backend supports it. */
+    tcg_gen_xor_vec(0, r, a, b);
+    tcg_gen_not_vec(0, r, r);
+}
+
 void tcg_gen_not_vec(unsigned vece, TCGv_vec r, TCGv_vec a)
 {
     if (TCG_TARGET_HAS_not_vec) {

From patchwork Fri Jan 4 22:31:09 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 154803
Delivered-To: patch@linaro.org
Received: by 2002:a2e:299d:0:0:0:0:0 with SMTP id p29-v6csp1093274ljp; Fri, 4 Jan 2019 14:36:10 -0800 (PST)
X-Google-Smtp-Source: ALg8bN6eOfrPCIySE6sM+WHrCwXKZ7zgFT/43vUNSA1ZMlcgGVd4uQZJ6VeU6vBeN1oyvtlMRYXl
X-Received: by 2002:a63:e915:: with SMTP id i21mr3051204pgh.409.1546641370079; Fri, 04 Jan 2019 14:36:10 -0800 (PST)
ARC-Seal: i=1; a=rsa-sha256; t=1546641370; cv=none; d=google.com; s=arc-20160816; b=WVEwOyMQSxt3poGa6J7zeDsWFBfPYGCajq2x8TPFd0Ql+DmCG3nCaQXTi3HSvwYajS Sf/cBwrET6LGT3CXk9V2TmKnKICLmb3OdZDE1sdVSoBhkCpp1dbBeNdg1WPLyR2VPSpF uRV1mpb43hw0WDsNdAtfKfzkR45nmBtgPnK078XZ3CJjXODO+8bfN19vVrmOj/i/+Prj xmgAlqVzS8QKlyPiaX2GOAedbh784pr64nA3WBKbYSx2QbzznIdQS96eDiIfI9Li4sIS DXEc9ms2hQQuuSElbzaDKe3Y6FklLLt9gFute3CjmmmD/PcNQR68USa6ivwW7QvsMGQW JCWg==
ARC-Message-Signature: i=1; a=rsa-sha256;
c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature; bh=13baR0fYl6duiBYqIpKDqOz4EjDaN2cM4w2ffhmehW8=; b=ozNwavMLuliKs+NovewyKjzawV0FotYOGdrTbfupWD6Iko5uWYxP0QhRMuGTVZG30z WlBb6vFa5oMYusSB2vAzzisSvpHKtKpZBT7w8FiLS/goQ+ftywFGOSq9BWgUnmhrvSeT GOyjB0M3Uw5hcIk5x3QXtGczGRQYjdOh2YLNYkOMLT4uXX9AJpHfZoyZkbCNqz+Ns9W6 EmeG5450Kc9lOT1P+p4QT5ZuyrOWukx3ZU2I09qGN1MLywLg2jHncNjbQPl91FQU9F5m o3CJO2opzOc5SHhk3Xkc0X+fMDwCs4JrjG7IptAai80SpbtMAx0OSgmiJzaeHfZP/3sh hvGA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=TqrIethi; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. 
[209.51.188.17]) by mx.google.com with ESMTPS id d9si52583947pgv.123.2019.01.04.14.36.09 for (version=TLS1 cipher=AES128-SHA bits=128/128); Fri, 04 Jan 2019 14:36:09 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=TqrIethi; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([127.0.0.1]:57101 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY4U-0005uL-CZ for patch@linaro.org; Fri, 04 Jan 2019 17:36:06 -0500 Received: from eggsout.gnu.org ([209.51.188.92]:54833 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY07-0002hZ-Tc for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:38 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gfY06-0001Qa-VI for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:35 -0500 Received: from mail-it1-x143.google.com ([2607:f8b0:4864:20::143]:33666) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gfY06-0001NY-Qp for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:34 -0500 Received: by mail-it1-x143.google.com with SMTP id m8so2327407itk.0 for ; Fri, 04 Jan 2019 14:31:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=13baR0fYl6duiBYqIpKDqOz4EjDaN2cM4w2ffhmehW8=; b=TqrIethis2KNSQswGFi4CJFpenX8Tljjgqs4j9LEAclSCN+/s0U+HOrOwGCXjzqxid 2IwreALKGvdw3wL4SPn9qqyiZq91epnPJA11PgiKngwYLnAsRUEic0JpHqWVsbx8CXlX 4hMDspAsOeEjhRfo4WBSKl8pnd7w2R+neZBHE= 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=13baR0fYl6duiBYqIpKDqOz4EjDaN2cM4w2ffhmehW8=; b=HeBB6DVzdI/JEEubAkQeUbBNFjbFHDNnwLl4ffFAUYc3uvkqbaF469+5c3pSQfNa2V 85t20zZ7UYkFs5yIwGh83XYfRbDhI4IZBAQHzZKQ0YKt+w+Di6+nRRejZ1rZTak/q3/1 yviTqPJvFFLRqUY4sTpLKY4x9iz8OIA63zCAhURx433vN8wf0/c1dkJyAAf9MOCAv+BU Z0MNFejedL8hqKb2BTiRZ8SizmGgLFx5S5GooNA4YjOvIBCifSTaeY3Wjhq53Z0OG9WB H7KEtr+NLELux64o3Pm8HcSxKvCQWuzifRlZJM2RFPP+Avh7SZBy7wZgRI/rwHG8ncfu 3VPA==
X-Gm-Message-State: AJcUukertxSVuNEZgAu8RQXLmvpUtBQ2DM+9xqBdoxLuv37WwsUsLweI vX6WjZHCLVWm94Lqwh8hVN5LL2CtoUg=
X-Received: by 2002:a24:185:: with SMTP id 127mr2184486itk.55.1546641093768; Fri, 04 Jan 2019 14:31:33 -0800 (PST)
Received: from cloudburst.twiddle.net ([172.56.12.23]) by smtp.gmail.com with ESMTPSA id t6sm27793259ioc.87.2019.01.04.14.31.31 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 04 Jan 2019 14:31:33 -0800 (PST)
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 5 Jan 2019 08:31:09 +1000
Message-Id: <20190104223116.14037-4-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.2
In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org>
References: <20190104223116.14037-1-richard.henderson@linaro.org>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:4864:20::143
Subject: [Qemu-devel] [PATCH v2 03/10] tcg: Add write_aofs to GVecGen4
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org
Sender: "Qemu-devel"

This allows writing 2 output, 3 input operations.

Signed-off-by: Richard Henderson
---
 tcg/tcg-op-gvec.h |  2 ++
 tcg/tcg-op-gvec.c | 27 +++++++++++++++++++--------
 2 files changed, 21 insertions(+), 8 deletions(-)

--
2.17.2

diff --git a/tcg/tcg-op-gvec.h b/tcg/tcg-op-gvec.h
index d65b9d9d4c..2cb447112e 100644
--- a/tcg/tcg-op-gvec.h
+++ b/tcg/tcg-op-gvec.h
@@ -181,6 +181,8 @@ typedef struct {
     uint8_t vece;
     /* Prefer i64 to v64. */
     bool prefer_i64;
+    /* Write aofs as a 2nd dest operand. */
+    bool write_aofs;
 } GVecGen4;
 
 void tcg_gen_gvec_2(uint32_t dofs, uint32_t aofs,
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index 81689d02f7..c10d3d7b26 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -665,7 +665,7 @@ static void expand_3_i32(uint32_t dofs, uint32_t aofs,
 
 /* Expand OPSZ bytes worth of three-operand operations using i32 elements. */
 static void expand_4_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs,
-                         uint32_t cofs, uint32_t oprsz,
+                         uint32_t cofs, uint32_t oprsz, bool write_aofs,
                          void (*fni)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32))
 {
     TCGv_i32 t0 = tcg_temp_new_i32();
@@ -680,6 +680,9 @@ static void expand_4_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs,
         tcg_gen_ld_i32(t3, cpu_env, cofs + i);
         fni(t0, t1, t2, t3);
         tcg_gen_st_i32(t0, cpu_env, dofs + i);
+        if (write_aofs) {
+            tcg_gen_st_i32(t1, cpu_env, aofs + i);
+        }
     }
     tcg_temp_free_i32(t3);
     tcg_temp_free_i32(t2);
@@ -769,7 +772,7 @@ static void expand_3_i64(uint32_t dofs, uint32_t aofs,
 
 /* Expand OPSZ bytes worth of three-operand operations using i64 elements. */
 static void expand_4_i64(uint32_t dofs, uint32_t aofs, uint32_t bofs,
-                         uint32_t cofs, uint32_t oprsz,
+                         uint32_t cofs, uint32_t oprsz, bool write_aofs,
                          void (*fni)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64))
 {
     TCGv_i64 t0 = tcg_temp_new_i64();
@@ -784,6 +787,9 @@ static void expand_4_i64(uint32_t dofs, uint32_t aofs, uint32_t bofs,
         tcg_gen_ld_i64(t3, cpu_env, cofs + i);
         fni(t0, t1, t2, t3);
         tcg_gen_st_i64(t0, cpu_env, dofs + i);
+        if (write_aofs) {
+            tcg_gen_st_i64(t1, cpu_env, aofs + i);
+        }
     }
     tcg_temp_free_i64(t3);
     tcg_temp_free_i64(t2);
@@ -880,7 +886,7 @@ static void expand_3_vec(unsigned vece, uint32_t dofs, uint32_t aofs,
 /* Expand OPSZ bytes worth of four-operand operations using host vectors. */
 static void expand_4_vec(unsigned vece, uint32_t dofs, uint32_t aofs,
                          uint32_t bofs, uint32_t cofs, uint32_t oprsz,
-                         uint32_t tysz, TCGType type,
+                         uint32_t tysz, TCGType type, bool write_aofs,
                          void (*fni)(unsigned, TCGv_vec, TCGv_vec,
                                      TCGv_vec, TCGv_vec))
 {
@@ -896,6 +902,9 @@ static void expand_4_vec(unsigned vece, uint32_t dofs, uint32_t aofs,
         tcg_gen_ld_vec(t3, cpu_env, cofs + i);
         fni(vece, t0, t1, t2, t3);
         tcg_gen_st_vec(t0, cpu_env, dofs + i);
+        if (write_aofs) {
+            tcg_gen_st_vec(t1, cpu_env, aofs + i);
+        }
     }
     tcg_temp_free_vec(t3);
     tcg_temp_free_vec(t2);
@@ -1187,7 +1196,7 @@ void tcg_gen_gvec_4(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs,
          */
         some = QEMU_ALIGN_DOWN(oprsz, 32);
         expand_4_vec(g->vece, dofs, aofs, bofs, cofs, some,
-                     32, TCG_TYPE_V256, g->fniv);
+                     32, TCG_TYPE_V256, g->write_aofs, g->fniv);
         if (some == oprsz) {
             break;
         }
@@ -1200,18 +1209,20 @@ void tcg_gen_gvec_4(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs,
         /* fallthru */
     case TCG_TYPE_V128:
         expand_4_vec(g->vece, dofs, aofs, bofs, cofs, oprsz,
-                     16, TCG_TYPE_V128, g->fniv);
+                     16, TCG_TYPE_V128, g->write_aofs, g->fniv);
         break;
     case TCG_TYPE_V64:
         expand_4_vec(g->vece, dofs, aofs, bofs, cofs, oprsz,
-                     8, TCG_TYPE_V64, g->fniv);
+                     8, TCG_TYPE_V64, g->write_aofs, g->fniv);
         break;
 
     case 0:
         if (g->fni8 && check_size_impl(oprsz, 8)) {
-            expand_4_i64(dofs, aofs, bofs, cofs, oprsz, g->fni8);
+            expand_4_i64(dofs, aofs, bofs, cofs, oprsz,
+                         g->write_aofs, g->fni8);
         } else if (g->fni4 && check_size_impl(oprsz, 4)) {
-            expand_4_i32(dofs, aofs, bofs, cofs, oprsz, g->fni4);
+            expand_4_i32(dofs, aofs, bofs, cofs, oprsz,
+                         g->write_aofs, g->fni4);
         } else {
             assert(g->fno != NULL);
             tcg_gen_gvec_4_ool(dofs, aofs, bofs, cofs,

From patchwork Fri Jan 4 22:31:10 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 154802
Delivered-To: patch@linaro.org
Received: by 2002:a2e:299d:0:0:0:0:0 with SMTP id p29-v6csp1090742ljp; Fri, 4 Jan 2019 14:32:23 -0800 (PST)
X-Google-Smtp-Source: ALg8bN5ZSlSMh1uCY4eYXU/IWEhkW05VOglnmeADSx7H4VOl0v1cH8lOXFtCTew9Az5tq1OykOC1
X-Received: by 2002:adf:9d85:: with SMTP id p5mr42888119wre.41.1546641143250; Fri, 04 Jan 2019 14:32:23 -0800 (PST)
ARC-Seal: i=1; a=rsa-sha256; t=1546641143; cv=none; d=google.com; s=arc-20160816; b=mDU7ko10BhO0LAuqqNpi+bIVuVkcq+GKohyigfxaZWblgXcLn02fcB1rydIxN/0v4M kR6QtJY9HEWJ1oXOoYjqPqo5PdLPJblBl650T+Nyrq9FmW4HzUoaw/QSq7WNKwbwhHiE Ytpih36mHahjlm9f5i68N/cN2K6tr4nETKjmMM2lbqpxkPd6pvioBvjs3z3senMsyjXA B8Nu26D9hmAjLftQGA4uoLdC4ypqTngzC0oQJR66aPliTjSO8c3J73e+HhPEpMHdYI6L 0KJfE9znDa2zsL9k6h9DxkqP1GQTy3GvugjhOMbPWFoWqZlED64umkLJBd9zPkcugHJR cYPQ==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature; bh=+6X0SjI2cU8UHG6qE+NK8sKOtTuvkXLBDx285LoFEAA=; b=AIA/wDyab3JhbFJzx7EFef85viB6Qna86IFZ3+kdAhoAS3x3V4YCyTQHj9+Epjj0VF IvpFvSZvC18xgi7FNiKFhPfJdvzbrgwM62fhwsvz/fzKM/fEEQf5apsTjQk5ba9T/fFT k4IzwQNWo6Civdz8FwSmnLYZWjsDdStk8K0eg1QBnNBlIHFu8T80TB3n1EN29HAvdqnd
HQnX+YFE56WZR0zCzVp6rNQDpXRr/KVZv83uEJGNseBL9vBRGdbIBj5ISKBc4sk2AFWE T80usdNhGGmB6pzs3g+NXLRQAYBNHvmRq2R5zlYvSJZgGpgGTuCB4Sk751nlpMe6COfQ V23w== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b="jk/lpdHh"; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id n3si1340427wme.9.2019.01.04.14.32.23 for (version=TLS1 cipher=AES128-SHA bits=128/128); Fri, 04 Jan 2019 14:32:23 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b="jk/lpdHh"; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([127.0.0.1]:56630 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY0s-0002pe-4X for patch@linaro.org; Fri, 04 Jan 2019 17:32:22 -0500 Received: from eggsout.gnu.org ([209.51.188.92]:54861 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY0D-0002le-9X for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:42 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gfY09-0001g5-SN for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:41 -0500 Received: from mail-it1-x144.google.com ([2607:f8b0:4864:20::144]:34867) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 
1gfY09-0001dw-Mm for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:37 -0500 Received: by mail-it1-x144.google.com with SMTP id p197so3464526itp.0 for ; Fri, 04 Jan 2019 14:31:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=+6X0SjI2cU8UHG6qE+NK8sKOtTuvkXLBDx285LoFEAA=; b=jk/lpdHhZyPx3Qnt3VxQaa0GMwm5Pn//IdSFX7zmzhfUC2yOA67q/MNUV3orL5GHyK uCpVSU2Hi4fvQIOL7Ei97JH6ygfQMXxSrB4h/BduBD9bcHNWhXZ/RIvkL1bgDZ5hIVq6 fa+4zO4Sc270rs49qeRIqhAinQl0qxSt1spoI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=+6X0SjI2cU8UHG6qE+NK8sKOtTuvkXLBDx285LoFEAA=; b=ThYTDU31sZVqEZ5aI/CahncpwoCbbSPBfs9MuPJXu+ss9ql1yOTZSIPlRdilMPPfZe OwVbTZCSX/BXcnJRBHWWViYvR9u3QsZihLxpfoTTyN4fcGNQDROdQ+qCMJiS4LPYNbFe Ss9GvmsbRJH9IjGE1jcNTflYQXfbY9Jq4H76jdg8kBldzYnpKDQ+KELBI+4u/fQXabnc pqu175FuC4Q6v8hCW7jC2t+nbhJRi6GjUEujisBQmL+hiRWB175Im1BdQTKsG3FbRHGs 3RpFuRZcAFk8OTM/X/qRP3NUQOJk2AcQjwiXIVobnlF1d/oDqDWlpYevVB232fOWbz2T VVgQ== X-Gm-Message-State: AJcUuke7ehThcIROLg1Favwheb/KVcYZv367Yo8BwSA8nvaMQEcGc4xT 8/XbzsPbvS8nXLJj7h+1gYZQfvHmLOo= X-Received: by 2002:a24:25ce:: with SMTP id g197mr2209083itg.61.1546641096623; Fri, 04 Jan 2019 14:31:36 -0800 (PST) Received: from cloudburst.twiddle.net ([172.56.12.23]) by smtp.gmail.com with ESMTPSA id t6sm27793259ioc.87.2019.01.04.14.31.34 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 04 Jan 2019 14:31:36 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 5 Jan 2019 08:31:10 +1000 Message-Id: <20190104223116.14037-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org> References: <20190104223116.14037-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
Subject: [Qemu-devel] [PATCH v2 04/10] tcg: Add opcodes for vector saturated arithmetic

Signed-off-by: Richard Henderson
---
 tcg/aarch64/tcg-target.h |  1 +
 tcg/i386/tcg-target.h    |  1 +
 tcg/tcg-op.h             |  4 ++
 tcg/tcg-opc.h            |  4 ++
 tcg/tcg.h                |  1 +
 tcg/tcg-op-gvec.c        | 84 ++++++++++++++++++++++++++++++----------
 tcg/tcg-op-vec.c         | 34 ++++++++++++++--
 tcg/tcg.c                |  5 +++
 8 files changed, 110 insertions(+), 24 deletions(-)

-- 
2.17.2

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h index f966a4fcb3..98556bcf22 100644 --- a/tcg/aarch64/tcg-target.h +++ b/tcg/aarch64/tcg-target.h @@ -135,6 +135,7 @@ typedef enum { #define TCG_TARGET_HAS_shv_vec 0 #define TCG_TARGET_HAS_cmp_vec 1 #define TCG_TARGET_HAS_mul_vec 1 +#define TCG_TARGET_HAS_sat_vec 0 #define TCG_TARGET_DEFAULT_MO (0) #define TCG_TARGET_HAS_MEMORY_BSWAP 1
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index f378d29568..44381062e6 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -185,6 +185,7 @@ extern bool have_avx2; #define TCG_TARGET_HAS_shv_vec 0 #define TCG_TARGET_HAS_cmp_vec 1 #define TCG_TARGET_HAS_mul_vec 1 +#define TCG_TARGET_HAS_sat_vec 0 #define TCG_TARGET_deposit_i32_valid(ofs, len) \ (((ofs) == 0 && (len) == 8) || ((ofs) == 8 && (len) == 8) || \
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h index f6ef1cd690..4a93d730e8 100644 --- a/tcg/tcg-op.h +++ b/tcg/tcg-op.h @@ -967,6 +967,10 @@ void tcg_gen_nor_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); void tcg_gen_eqv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); void tcg_gen_not_vec(unsigned vece, TCGv_vec r, TCGv_vec a); void tcg_gen_neg_vec(unsigned vece, TCGv_vec r, TCGv_vec a); +void tcg_gen_ssadd_vec(unsigned vece, TCGv_vec
r, TCGv_vec a, TCGv_vec b); +void tcg_gen_usadd_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); +void tcg_gen_sssub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); +void tcg_gen_ussub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); void tcg_gen_shli_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i); void tcg_gen_shri_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i); diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h index 7a8a3edb5b..94b2ed80af 100644 --- a/tcg/tcg-opc.h +++ b/tcg/tcg-opc.h @@ -222,6 +222,10 @@ DEF(add_vec, 1, 2, 0, IMPLVEC) DEF(sub_vec, 1, 2, 0, IMPLVEC) DEF(mul_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_mul_vec)) DEF(neg_vec, 1, 1, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_neg_vec)) +DEF(ssadd_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec)) +DEF(usadd_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec)) +DEF(sssub_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec)) +DEF(ussub_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec)) DEF(and_vec, 1, 2, 0, IMPLVEC) DEF(or_vec, 1, 2, 0, IMPLVEC) diff --git a/tcg/tcg.h b/tcg/tcg.h index 3a629991ca..df24afa425 100644 --- a/tcg/tcg.h +++ b/tcg/tcg.h @@ -183,6 +183,7 @@ typedef uint64_t TCGRegSet; #define TCG_TARGET_HAS_shs_vec 0 #define TCG_TARGET_HAS_shv_vec 0 #define TCG_TARGET_HAS_mul_vec 0 +#define TCG_TARGET_HAS_sat_vec 0 #else #define TCG_TARGET_MAYBE_vec 1 #endif diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index c10d3d7b26..0a33f51065 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -1678,10 +1678,22 @@ void tcg_gen_gvec_ssadd(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { static const GVecGen3 g[4] = { - { .fno = gen_helper_gvec_ssadd8, .vece = MO_8 }, - { .fno = gen_helper_gvec_ssadd16, .vece = MO_16 }, - { .fno = gen_helper_gvec_ssadd32, .vece = MO_32 }, - { .fno = gen_helper_gvec_ssadd64, .vece = MO_64 } + { .fniv = tcg_gen_ssadd_vec, + .fno = gen_helper_gvec_ssadd8, + .opc = INDEX_op_ssadd_vec, + .vece = 
MO_8 }, + { .fniv = tcg_gen_ssadd_vec, + .fno = gen_helper_gvec_ssadd16, + .opc = INDEX_op_ssadd_vec, + .vece = MO_16 }, + { .fniv = tcg_gen_ssadd_vec, + .fno = gen_helper_gvec_ssadd32, + .opc = INDEX_op_ssadd_vec, + .vece = MO_32 }, + { .fniv = tcg_gen_ssadd_vec, + .fno = gen_helper_gvec_ssadd64, + .opc = INDEX_op_ssadd_vec, + .vece = MO_64 }, }; tcg_debug_assert(vece <= MO_64); tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); @@ -1691,16 +1703,28 @@ void tcg_gen_gvec_sssub(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { static const GVecGen3 g[4] = { - { .fno = gen_helper_gvec_sssub8, .vece = MO_8 }, - { .fno = gen_helper_gvec_sssub16, .vece = MO_16 }, - { .fno = gen_helper_gvec_sssub32, .vece = MO_32 }, - { .fno = gen_helper_gvec_sssub64, .vece = MO_64 } + { .fniv = tcg_gen_sssub_vec, + .fno = gen_helper_gvec_sssub8, + .opc = INDEX_op_sssub_vec, + .vece = MO_8 }, + { .fniv = tcg_gen_sssub_vec, + .fno = gen_helper_gvec_sssub16, + .opc = INDEX_op_sssub_vec, + .vece = MO_16 }, + { .fniv = tcg_gen_sssub_vec, + .fno = gen_helper_gvec_sssub32, + .opc = INDEX_op_sssub_vec, + .vece = MO_32 }, + { .fniv = tcg_gen_sssub_vec, + .fno = gen_helper_gvec_sssub64, + .opc = INDEX_op_sssub_vec, + .vece = MO_64 }, }; tcg_debug_assert(vece <= MO_64); tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); } -static void tcg_gen_vec_usadd32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +static void tcg_gen_usadd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) { TCGv_i32 max = tcg_const_i32(-1); tcg_gen_add_i32(d, a, b); @@ -1708,7 +1732,7 @@ static void tcg_gen_vec_usadd32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) tcg_temp_free_i32(max); } -static void tcg_gen_vec_usadd32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +static void tcg_gen_usadd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) { TCGv_i64 max = tcg_const_i64(-1); tcg_gen_add_i64(d, a, b); @@ -1720,20 +1744,30 @@ void tcg_gen_gvec_usadd(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t 
bofs, uint32_t oprsz, uint32_t maxsz) { static const GVecGen3 g[4] = { - { .fno = gen_helper_gvec_usadd8, .vece = MO_8 }, - { .fno = gen_helper_gvec_usadd16, .vece = MO_16 }, - { .fni4 = tcg_gen_vec_usadd32_i32, + { .fniv = tcg_gen_usadd_vec, + .fno = gen_helper_gvec_usadd8, + .opc = INDEX_op_usadd_vec, + .vece = MO_8 }, + { .fniv = tcg_gen_usadd_vec, + .fno = gen_helper_gvec_usadd16, + .opc = INDEX_op_usadd_vec, + .vece = MO_16 }, + { .fni4 = tcg_gen_usadd_i32, + .fniv = tcg_gen_usadd_vec, .fno = gen_helper_gvec_usadd32, + .opc = INDEX_op_usadd_vec, .vece = MO_32 }, - { .fni8 = tcg_gen_vec_usadd32_i64, + { .fni8 = tcg_gen_usadd_i64, + .fniv = tcg_gen_usadd_vec, .fno = gen_helper_gvec_usadd64, + .opc = INDEX_op_usadd_vec, .vece = MO_64 } }; tcg_debug_assert(vece <= MO_64); tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); } -static void tcg_gen_vec_ussub32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +static void tcg_gen_ussub_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) { TCGv_i32 min = tcg_const_i32(0); tcg_gen_sub_i32(d, a, b); @@ -1741,7 +1775,7 @@ static void tcg_gen_vec_ussub32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) tcg_temp_free_i32(min); } -static void tcg_gen_vec_ussub32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +static void tcg_gen_ussub_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) { TCGv_i64 min = tcg_const_i64(0); tcg_gen_sub_i64(d, a, b); @@ -1753,13 +1787,23 @@ void tcg_gen_gvec_ussub(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { static const GVecGen3 g[4] = { - { .fno = gen_helper_gvec_ussub8, .vece = MO_8 }, - { .fno = gen_helper_gvec_ussub16, .vece = MO_16 }, - { .fni4 = tcg_gen_vec_ussub32_i32, + { .fniv = tcg_gen_ussub_vec, + .fno = gen_helper_gvec_ussub8, + .opc = INDEX_op_ussub_vec, + .vece = MO_8 }, + { .fniv = tcg_gen_ussub_vec, + .fno = gen_helper_gvec_ussub16, + .opc = INDEX_op_ussub_vec, + .vece = MO_16 }, + { .fni4 = tcg_gen_ussub_i32, + .fniv = tcg_gen_ussub_vec, .fno = gen_helper_gvec_ussub32, + 
.opc = INDEX_op_ussub_vec, .vece = MO_32 }, - { .fni8 = tcg_gen_vec_ussub32_i64, + { .fni8 = tcg_gen_ussub_i64, + .fniv = tcg_gen_ussub_vec, .fno = gen_helper_gvec_ussub64, + .opc = INDEX_op_ussub_vec, .vece = MO_64 } }; tcg_debug_assert(vece <= MO_64); diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c index d77fdf7c1d..675aa09258 100644 --- a/tcg/tcg-op-vec.c +++ b/tcg/tcg-op-vec.c @@ -386,7 +386,8 @@ void tcg_gen_cmp_vec(TCGCond cond, unsigned vece, } } -void tcg_gen_mul_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +static void do_op3(unsigned vece, TCGv_vec r, TCGv_vec a, + TCGv_vec b, TCGOpcode opc) { TCGTemp *rt = tcgv_vec_temp(r); TCGTemp *at = tcgv_vec_temp(a); @@ -399,11 +400,36 @@ void tcg_gen_mul_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) tcg_debug_assert(at->base_type >= type); tcg_debug_assert(bt->base_type >= type); - can = tcg_can_emit_vec_op(INDEX_op_mul_vec, type, vece); + can = tcg_can_emit_vec_op(opc, type, vece); if (can > 0) { - vec_gen_3(INDEX_op_mul_vec, type, vece, ri, ai, bi); + vec_gen_3(opc, type, vece, ri, ai, bi); } else { tcg_debug_assert(can < 0); - tcg_expand_vec_op(INDEX_op_mul_vec, type, vece, ri, ai, bi); + tcg_expand_vec_op(opc, type, vece, ri, ai, bi); } } + +void tcg_gen_mul_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +{ + do_op3(vece, r, a, b, INDEX_op_mul_vec); +} + +void tcg_gen_ssadd_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +{ + do_op3(vece, r, a, b, INDEX_op_ssadd_vec); +} + +void tcg_gen_usadd_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +{ + do_op3(vece, r, a, b, INDEX_op_usadd_vec); +} + +void tcg_gen_sssub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +{ + do_op3(vece, r, a, b, INDEX_op_sssub_vec); +} + +void tcg_gen_ussub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +{ + do_op3(vece, r, a, b, INDEX_op_ussub_vec); +} diff --git a/tcg/tcg.c b/tcg/tcg.c index c54b119020..15ed5af007 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -1607,6 +1607,11 @@ bool 
tcg_op_supported(TCGOpcode op) case INDEX_op_shrv_vec: case INDEX_op_sarv_vec: return have_vec && TCG_TARGET_HAS_shv_vec; + case INDEX_op_ssadd_vec: + case INDEX_op_usadd_vec: + case INDEX_op_sssub_vec: + case INDEX_op_ussub_vec: + return have_vec && TCG_TARGET_HAS_sat_vec; default: tcg_debug_assert(op > INDEX_op_last_generic && op < NB_OPS);
From patchwork Fri Jan 4 22:31:11 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 154804
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 5 Jan 2019 08:31:11 +1000
Message-Id: <20190104223116.14037-6-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.2
In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org>
References: <20190104223116.14037-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 05/10] tcg: Add opcodes for vector minmax arithmetic

Signed-off-by: Richard Henderson
---
 accel/tcg/tcg-runtime.h      |  20 ++++
 tcg/aarch64/tcg-target.h     |   1 +
 tcg/i386/tcg-target.h        |   1 +
 tcg/tcg-op-gvec.h            |  10 ++
 tcg/tcg-op.h                 |   4 +
 tcg/tcg-opc.h                |   4 +
 tcg/tcg.h                    |   1 +
 accel/tcg/tcg-runtime-gvec.c | 224 +++++++++++++++++++++++++++++++++++
 tcg/tcg-op-gvec.c            | 108 +++++++++++++++++
 tcg/tcg-op-vec.c             |  20 ++++
 tcg/tcg.c                    |   5 +
 11 files changed, 398 insertions(+)

-- 
2.17.2

diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h index 835ddfebb2..dfe325625c 100644 --- a/accel/tcg/tcg-runtime.h +++ b/accel/tcg/tcg-runtime.h @@ -200,6 +200,26 @@ DEF_HELPER_FLAGS_4(gvec_ussub16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_ussub32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_ussub64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smin8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smin16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smin32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smin64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_smax8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smax16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smax32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smax64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_umin8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umin16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umin32, TCG_CALL_NO_RWG,
void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umin64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_umax8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umax16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umax32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umax64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(gvec_neg8, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_neg16, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_neg32, TCG_CALL_NO_RWG, void, ptr, ptr, i32) diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h index 98556bcf22..545a6eec75 100644 --- a/tcg/aarch64/tcg-target.h +++ b/tcg/aarch64/tcg-target.h @@ -136,6 +136,7 @@ typedef enum { #define TCG_TARGET_HAS_cmp_vec 1 #define TCG_TARGET_HAS_mul_vec 1 #define TCG_TARGET_HAS_sat_vec 0 +#define TCG_TARGET_HAS_minmax_vec 0 #define TCG_TARGET_DEFAULT_MO (0) #define TCG_TARGET_HAS_MEMORY_BSWAP 1 diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index 44381062e6..7bd7eae672 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -186,6 +186,7 @@ extern bool have_avx2; #define TCG_TARGET_HAS_cmp_vec 1 #define TCG_TARGET_HAS_mul_vec 1 #define TCG_TARGET_HAS_sat_vec 0 +#define TCG_TARGET_HAS_minmax_vec 0 #define TCG_TARGET_deposit_i32_valid(ofs, len) \ (((ofs) == 0 && (len) == 8) || ((ofs) == 8 && (len) == 8) || \ diff --git a/tcg/tcg-op-gvec.h b/tcg/tcg-op-gvec.h index 2cb447112e..4734eef7de 100644 --- a/tcg/tcg-op-gvec.h +++ b/tcg/tcg-op-gvec.h @@ -234,6 +234,16 @@ void tcg_gen_gvec_usadd(unsigned vece, uint32_t dofs, uint32_t aofs, void tcg_gen_gvec_ussub(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz); +/* Min/max. 
*/ +void tcg_gen_gvec_smin(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz); +void tcg_gen_gvec_umin(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz); +void tcg_gen_gvec_smax(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz); +void tcg_gen_gvec_umax(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz); + void tcg_gen_gvec_and(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz); void tcg_gen_gvec_or(unsigned vece, uint32_t dofs, uint32_t aofs, diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h index 4a93d730e8..2d98868d8f 100644 --- a/tcg/tcg-op.h +++ b/tcg/tcg-op.h @@ -971,6 +971,10 @@ void tcg_gen_ssadd_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); void tcg_gen_usadd_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); void tcg_gen_sssub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); void tcg_gen_ussub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); +void tcg_gen_smin_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); +void tcg_gen_umin_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); +void tcg_gen_smax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); +void tcg_gen_umax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); void tcg_gen_shli_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i); void tcg_gen_shri_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i); diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h index 94b2ed80af..4e0238ad1a 100644 --- a/tcg/tcg-opc.h +++ b/tcg/tcg-opc.h @@ -226,6 +226,10 @@ DEF(ssadd_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec)) DEF(usadd_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec)) DEF(sssub_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec)) DEF(ussub_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec)) +DEF(smin_vec, 1, 2, 0, IMPLVEC | 
IMPL(TCG_TARGET_HAS_minmax_vec)) +DEF(umin_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_minmax_vec)) +DEF(smax_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_minmax_vec)) +DEF(umax_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_minmax_vec)) DEF(and_vec, 1, 2, 0, IMPLVEC) DEF(or_vec, 1, 2, 0, IMPLVEC) diff --git a/tcg/tcg.h b/tcg/tcg.h index df24afa425..1c3579077d 100644 --- a/tcg/tcg.h +++ b/tcg/tcg.h @@ -184,6 +184,7 @@ typedef uint64_t TCGRegSet; #define TCG_TARGET_HAS_shv_vec 0 #define TCG_TARGET_HAS_mul_vec 0 #define TCG_TARGET_HAS_sat_vec 0 +#define TCG_TARGET_HAS_minmax_vec 0 #else #define TCG_TARGET_MAYBE_vec 1 #endif diff --git a/accel/tcg/tcg-runtime-gvec.c b/accel/tcg/tcg-runtime-gvec.c index d1802467d5..9358749741 100644 --- a/accel/tcg/tcg-runtime-gvec.c +++ b/accel/tcg/tcg-runtime-gvec.c @@ -1028,3 +1028,227 @@ void HELPER(gvec_ussub64)(void *d, void *a, void *b, uint32_t desc) } clear_high(d, oprsz, desc); } + +void HELPER(gvec_smin8)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(int8_t)) { + int8_t aa = *(int8_t *)(a + i); + int8_t bb = *(int8_t *)(b + i); + int8_t dd = aa < bb ? aa : bb; + *(int8_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_smin16)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(int16_t)) { + int16_t aa = *(int16_t *)(a + i); + int16_t bb = *(int16_t *)(b + i); + int16_t dd = aa < bb ? aa : bb; + *(int16_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_smin32)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(int32_t)) { + int32_t aa = *(int32_t *)(a + i); + int32_t bb = *(int32_t *)(b + i); + int32_t dd = aa < bb ? 
aa : bb; + *(int32_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_smin64)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(int64_t)) { + int64_t aa = *(int64_t *)(a + i); + int64_t bb = *(int64_t *)(b + i); + int64_t dd = aa < bb ? aa : bb; + *(int64_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_smax8)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(int8_t)) { + int8_t aa = *(int8_t *)(a + i); + int8_t bb = *(int8_t *)(b + i); + int8_t dd = aa > bb ? aa : bb; + *(int8_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_smax16)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(int16_t)) { + int16_t aa = *(int16_t *)(a + i); + int16_t bb = *(int16_t *)(b + i); + int16_t dd = aa > bb ? aa : bb; + *(int16_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_smax32)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(int32_t)) { + int32_t aa = *(int32_t *)(a + i); + int32_t bb = *(int32_t *)(b + i); + int32_t dd = aa > bb ? aa : bb; + *(int32_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_smax64)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(int64_t)) { + int64_t aa = *(int64_t *)(a + i); + int64_t bb = *(int64_t *)(b + i); + int64_t dd = aa > bb ? 
aa : bb; + *(int64_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_umin8)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint8_t)) { + uint8_t aa = *(uint8_t *)(a + i); + uint8_t bb = *(uint8_t *)(b + i); + uint8_t dd = aa < bb ? aa : bb; + *(uint8_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_umin16)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint16_t)) { + uint16_t aa = *(uint16_t *)(a + i); + uint16_t bb = *(uint16_t *)(b + i); + uint16_t dd = aa < bb ? aa : bb; + *(uint16_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_umin32)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint32_t)) { + uint32_t aa = *(uint32_t *)(a + i); + uint32_t bb = *(uint32_t *)(b + i); + uint32_t dd = aa < bb ? aa : bb; + *(uint32_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_umin64)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint64_t)) { + uint64_t aa = *(uint64_t *)(a + i); + uint64_t bb = *(uint64_t *)(b + i); + uint64_t dd = aa < bb ? aa : bb; + *(uint64_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_umax8)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint8_t)) { + uint8_t aa = *(uint8_t *)(a + i); + uint8_t bb = *(uint8_t *)(b + i); + uint8_t dd = aa > bb ? 
aa : bb; + *(uint8_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_umax16)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint16_t)) { + uint16_t aa = *(uint16_t *)(a + i); + uint16_t bb = *(uint16_t *)(b + i); + uint16_t dd = aa > bb ? aa : bb; + *(uint16_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_umax32)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint32_t)) { + uint32_t aa = *(uint32_t *)(a + i); + uint32_t bb = *(uint32_t *)(b + i); + uint32_t dd = aa > bb ? aa : bb; + *(uint32_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_umax64)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint64_t)) { + uint64_t aa = *(uint64_t *)(a + i); + uint64_t bb = *(uint64_t *)(b + i); + uint64_t dd = aa > bb ? 
aa : bb; + *(uint64_t *)(d + i) = dd; + } + clear_high(d, oprsz, desc); +} diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index 0a33f51065..3ee44fcb75 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -1810,6 +1810,114 @@ void tcg_gen_gvec_ussub(unsigned vece, uint32_t dofs, uint32_t aofs, tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); } +void tcg_gen_gvec_smin(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen3 g[4] = { + { .fniv = tcg_gen_smin_vec, + .fno = gen_helper_gvec_smin8, + .opc = INDEX_op_smin_vec, + .vece = MO_8 }, + { .fniv = tcg_gen_smin_vec, + .fno = gen_helper_gvec_smin16, + .opc = INDEX_op_smin_vec, + .vece = MO_16 }, + { .fni4 = tcg_gen_smin_i32, + .fniv = tcg_gen_smin_vec, + .fno = gen_helper_gvec_smin32, + .opc = INDEX_op_smin_vec, + .vece = MO_32 }, + { .fni8 = tcg_gen_smin_i64, + .fniv = tcg_gen_smin_vec, + .fno = gen_helper_gvec_smin64, + .opc = INDEX_op_smin_vec, + .vece = MO_64 } + }; + tcg_debug_assert(vece <= MO_64); + tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); +} + +void tcg_gen_gvec_umin(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen3 g[4] = { + { .fniv = tcg_gen_umin_vec, + .fno = gen_helper_gvec_umin8, + .opc = INDEX_op_umin_vec, + .vece = MO_8 }, + { .fniv = tcg_gen_umin_vec, + .fno = gen_helper_gvec_umin16, + .opc = INDEX_op_umin_vec, + .vece = MO_16 }, + { .fni4 = tcg_gen_umin_i32, + .fniv = tcg_gen_umin_vec, + .fno = gen_helper_gvec_umin32, + .opc = INDEX_op_umin_vec, + .vece = MO_32 }, + { .fni8 = tcg_gen_umin_i64, + .fniv = tcg_gen_umin_vec, + .fno = gen_helper_gvec_umin64, + .opc = INDEX_op_umin_vec, + .vece = MO_64 } + }; + tcg_debug_assert(vece <= MO_64); + tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); +} + +void tcg_gen_gvec_smax(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz) +{ + static 
const GVecGen3 g[4] = { + { .fniv = tcg_gen_smax_vec, + .fno = gen_helper_gvec_smax8, + .opc = INDEX_op_smax_vec, + .vece = MO_8 }, + { .fniv = tcg_gen_smax_vec, + .fno = gen_helper_gvec_smax16, + .opc = INDEX_op_smax_vec, + .vece = MO_16 }, + { .fni4 = tcg_gen_smax_i32, + .fniv = tcg_gen_smax_vec, + .fno = gen_helper_gvec_smax32, + .opc = INDEX_op_smax_vec, + .vece = MO_32 }, + { .fni8 = tcg_gen_smax_i64, + .fniv = tcg_gen_smax_vec, + .fno = gen_helper_gvec_smax64, + .opc = INDEX_op_smax_vec, + .vece = MO_64 } + }; + tcg_debug_assert(vece <= MO_64); + tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); +} + +void tcg_gen_gvec_umax(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen3 g[4] = { + { .fniv = tcg_gen_umax_vec, + .fno = gen_helper_gvec_umax8, + .opc = INDEX_op_umax_vec, + .vece = MO_8 }, + { .fniv = tcg_gen_umax_vec, + .fno = gen_helper_gvec_umax16, + .opc = INDEX_op_umax_vec, + .vece = MO_16 }, + { .fni4 = tcg_gen_umax_i32, + .fniv = tcg_gen_umax_vec, + .fno = gen_helper_gvec_umax32, + .opc = INDEX_op_umax_vec, + .vece = MO_32 }, + { .fni8 = tcg_gen_umax_i64, + .fniv = tcg_gen_umax_vec, + .fno = gen_helper_gvec_umax64, + .opc = INDEX_op_umax_vec, + .vece = MO_64 } + }; + tcg_debug_assert(vece <= MO_64); + tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); +} + /* Perform a vector negation using normal negation and a mask. Compare gen_subv_mask above. 
*/ static void gen_negv_mask(TCGv_i64 d, TCGv_i64 b, TCGv_i64 m) diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c index 675aa09258..36f35022ac 100644 --- a/tcg/tcg-op-vec.c +++ b/tcg/tcg-op-vec.c @@ -433,3 +433,23 @@ void tcg_gen_ussub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { do_op3(vece, r, a, b, INDEX_op_ussub_vec); } + +void tcg_gen_smin_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +{ + do_op3(vece, r, a, b, INDEX_op_smin_vec); +} + +void tcg_gen_umin_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +{ + do_op3(vece, r, a, b, INDEX_op_umin_vec); +} + +void tcg_gen_smax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +{ + do_op3(vece, r, a, b, INDEX_op_smax_vec); +} + +void tcg_gen_umax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +{ + do_op3(vece, r, a, b, INDEX_op_umax_vec); +} diff --git a/tcg/tcg.c b/tcg/tcg.c index 15ed5af007..1ae1e788f6 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -1612,6 +1612,11 @@ bool tcg_op_supported(TCGOpcode op) case INDEX_op_sssub_vec: case INDEX_op_ussub_vec: return have_vec && TCG_TARGET_HAS_sat_vec; + case INDEX_op_smin_vec: + case INDEX_op_umin_vec: + case INDEX_op_smax_vec: + case INDEX_op_umax_vec: + return have_vec && TCG_TARGET_HAS_minmax_vec; default: tcg_debug_assert(op > INDEX_op_last_generic && op < NB_OPS); From patchwork Fri Jan 4 22:31:12 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 154807
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 5 Jan 2019 08:31:12 +1000 Message-Id: <20190104223116.14037-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org> References: <20190104223116.14037-1-richard.henderson@linaro.org> Subject: [Qemu-devel] [PATCH v2 06/10] tcg/i386: Split subroutines out of tcg_expand_vec_op This routine was becoming too large.
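For readers following the MO_8 shift expansion that this patch moves into expand_vec_shi, here is a minimal scalar sketch of what happens in a single 16-bit lane. It is not QEMU code: the names dup_byte, model_shr8, model_shl8 and check_byte_shift_model are hypothetical, and the model assumes an immediate in the range 0..7.

```c
#include <assert.h>
#include <stdint.h>

/* One 16-bit lane after "punpck*bw x,x": the byte duplicated in both halves. */
static uint16_t dup_byte(uint8_t x)
{
    return (uint16_t)((x << 8) | x);
}

/* Logical right shift of a byte, built only from 16-bit lane shifts. */
static uint8_t model_shr8(uint8_t x, unsigned imm)
{
    uint16_t w = dup_byte(x);

    /* Shifting by imm + 8 leaves x >> imm in the low half, zero above. */
    w = (uint16_t)(w >> (imm + 8));
    /* w <= 0xff, so an unsigned saturating pack would not change it. */
    return (uint8_t)w;
}

/* Left shift of a byte: shift up, then back down to clear the high half. */
static uint8_t model_shl8(uint8_t x, unsigned imm)
{
    uint16_t w = dup_byte(x);

    w = (uint16_t)(w << (imm + 8));  /* bits that leave the byte are dropped */
    w = (uint16_t)(w >> 8);          /* result back in the low half */
    return (uint8_t)w;
}

/* Exhaustive comparison against plain byte shifts. */
static void check_byte_shift_model(void)
{
    for (unsigned x = 0; x < 256; x++) {
        for (unsigned imm = 0; imm < 8; imm++) {
            assert(model_shr8((uint8_t)x, imm) == (uint8_t)(x >> imm));
            assert(model_shl8((uint8_t)x, imm) == (uint8_t)(x << imm));
        }
    }
}
```

Calling check_byte_shift_model() walks every byte value and shift count, which is one way to convince oneself that the "add 8" step and the shift-up-then-down step give the same result as a plain byte shift.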
Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.inc.c | 459 +++++++++++++++++++------------------- 1 file changed, 232 insertions(+), 227 deletions(-) -- 2.17.2 diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index c21c3272f2..ad97386d06 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -3079,253 +3079,258 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) } } +static void expand_vec_shi(TCGType type, unsigned vece, bool shr, + TCGv_vec v0, TCGv_vec v1, TCGArg imm) +{ + TCGv_vec t1, t2; + + tcg_debug_assert(vece == MO_8); + + t1 = tcg_temp_new_vec(type); + t2 = tcg_temp_new_vec(type); + + /* Unpack to W, shift, and repack. Tricky bits: + (1) Use punpck*bw x,x to produce DDCCBBAA, + i.e. duplicate in other half of the 16-bit lane. + (2) For right-shift, add 8 so that the high half of + the lane becomes zero. For left-shift, we must + shift up and down again. + (3) Step 2 leaves high half zero such that PACKUSWB + (pack with unsigned saturation) does not modify + the quantity. */ + vec_gen_3(INDEX_op_x86_punpckl_vec, type, MO_8, + tcgv_vec_arg(t1), tcgv_vec_arg(v1), tcgv_vec_arg(v1)); + vec_gen_3(INDEX_op_x86_punpckh_vec, type, MO_8, + tcgv_vec_arg(t2), tcgv_vec_arg(v1), tcgv_vec_arg(v1)); + + if (shr) { + tcg_gen_shri_vec(MO_16, t1, t1, imm + 8); + tcg_gen_shri_vec(MO_16, t2, t2, imm + 8); + } else { + tcg_gen_shli_vec(MO_16, t1, t1, imm + 8); + tcg_gen_shli_vec(MO_16, t2, t2, imm + 8); + tcg_gen_shri_vec(MO_16, t1, t1, 8); + tcg_gen_shri_vec(MO_16, t2, t2, 8); + } + + vec_gen_3(INDEX_op_x86_packus_vec, type, MO_8, + tcgv_vec_arg(v0), tcgv_vec_arg(t1), tcgv_vec_arg(t2)); + tcg_temp_free_vec(t1); + tcg_temp_free_vec(t2); +} + +static void expand_vec_sari(TCGType type, unsigned vece, + TCGv_vec v0, TCGv_vec v1, TCGArg imm) +{ + TCGv_vec t1, t2; + + switch (vece) { + case MO_8: + /* Unpack to W, shift, and repack, as in expand_vec_shi. 
*/ + t1 = tcg_temp_new_vec(type); + t2 = tcg_temp_new_vec(type); + vec_gen_3(INDEX_op_x86_punpckl_vec, type, MO_8, + tcgv_vec_arg(t1), tcgv_vec_arg(v1), tcgv_vec_arg(v1)); + vec_gen_3(INDEX_op_x86_punpckh_vec, type, MO_8, + tcgv_vec_arg(t2), tcgv_vec_arg(v1), tcgv_vec_arg(v1)); + tcg_gen_sari_vec(MO_16, t1, t1, imm + 8); + tcg_gen_sari_vec(MO_16, t2, t2, imm + 8); + vec_gen_3(INDEX_op_x86_packss_vec, type, MO_8, + tcgv_vec_arg(v0), tcgv_vec_arg(t1), tcgv_vec_arg(t2)); + tcg_temp_free_vec(t1); + tcg_temp_free_vec(t2); + break; + + case MO_64: + if (imm <= 32) { + /* We can emulate a small sign extend by performing an arithmetic + * 32-bit shift and overwriting the high half of a 64-bit logical + * shift (note that the ISA says shift of 32 is valid). + */ + t1 = tcg_temp_new_vec(type); + tcg_gen_sari_vec(MO_32, t1, v1, imm); + tcg_gen_shri_vec(MO_64, v0, v1, imm); + vec_gen_4(INDEX_op_x86_blend_vec, type, MO_32, + tcgv_vec_arg(v0), tcgv_vec_arg(v0), + tcgv_vec_arg(t1), 0xaa); + tcg_temp_free_vec(t1); + } else { + /* Otherwise we will need to use a compare vs 0 to produce + * the sign-extend, shift and merge. + */ + t1 = tcg_const_zeros_vec(type); + tcg_gen_cmp_vec(TCG_COND_GT, MO_64, t1, t1, v1); + tcg_gen_shri_vec(MO_64, v0, v1, imm); + tcg_gen_shli_vec(MO_64, t1, t1, 64 - imm); + tcg_gen_or_vec(MO_64, v0, v0, t1); + tcg_temp_free_vec(t1); + } + break; + + default: + g_assert_not_reached(); + } +} + +static void expand_vec_mul(TCGType type, unsigned vece, + TCGv_vec v0, TCGv_vec v1, TCGv_vec v2) +{ + TCGv_vec t1, t2, t3, t4; + + tcg_debug_assert(vece == MO_8); + + /* + * Unpack v1 bytes to words, 0 | x. + * Unpack v2 bytes to words, y | 0. + * This leaves the 8-bit result, x * y, with 8 bits of right padding. + * Shift logical right by 8 bits to clear the high 8 bits before + * using an unsigned saturated pack. + * + * The difference between the V64, V128 and V256 cases is merely how + * we distribute the expansion between temporaries.
+ */ + switch (type) { + case TCG_TYPE_V64: + t1 = tcg_temp_new_vec(TCG_TYPE_V128); + t2 = tcg_temp_new_vec(TCG_TYPE_V128); + tcg_gen_dup16i_vec(t2, 0); + vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V128, MO_8, + tcgv_vec_arg(t1), tcgv_vec_arg(v1), tcgv_vec_arg(t2)); + vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V128, MO_8, + tcgv_vec_arg(t2), tcgv_vec_arg(t2), tcgv_vec_arg(v2)); + tcg_gen_mul_vec(MO_16, t1, t1, t2); + tcg_gen_shri_vec(MO_16, t1, t1, 8); + vec_gen_3(INDEX_op_x86_packus_vec, TCG_TYPE_V128, MO_8, + tcgv_vec_arg(v0), tcgv_vec_arg(t1), tcgv_vec_arg(t1)); + tcg_temp_free_vec(t1); + tcg_temp_free_vec(t2); + break; + + case TCG_TYPE_V128: + case TCG_TYPE_V256: + t1 = tcg_temp_new_vec(type); + t2 = tcg_temp_new_vec(type); + t3 = tcg_temp_new_vec(type); + t4 = tcg_temp_new_vec(type); + tcg_gen_dup16i_vec(t4, 0); + vec_gen_3(INDEX_op_x86_punpckl_vec, type, MO_8, + tcgv_vec_arg(t1), tcgv_vec_arg(v1), tcgv_vec_arg(t4)); + vec_gen_3(INDEX_op_x86_punpckl_vec, type, MO_8, + tcgv_vec_arg(t2), tcgv_vec_arg(t4), tcgv_vec_arg(v2)); + vec_gen_3(INDEX_op_x86_punpckh_vec, type, MO_8, + tcgv_vec_arg(t3), tcgv_vec_arg(v1), tcgv_vec_arg(t4)); + vec_gen_3(INDEX_op_x86_punpckh_vec, type, MO_8, + tcgv_vec_arg(t4), tcgv_vec_arg(t4), tcgv_vec_arg(v2)); + tcg_gen_mul_vec(MO_16, t1, t1, t2); + tcg_gen_mul_vec(MO_16, t3, t3, t4); + tcg_gen_shri_vec(MO_16, t1, t1, 8); + tcg_gen_shri_vec(MO_16, t3, t3, 8); + vec_gen_3(INDEX_op_x86_packus_vec, type, MO_8, + tcgv_vec_arg(v0), tcgv_vec_arg(t1), tcgv_vec_arg(t3)); + tcg_temp_free_vec(t1); + tcg_temp_free_vec(t2); + tcg_temp_free_vec(t3); + tcg_temp_free_vec(t4); + break; + + default: + g_assert_not_reached(); + } +} + +static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, + TCGv_vec v1, TCGv_vec v2, TCGCond cond) +{ + enum { + NEED_SWAP = 1, + NEED_INV = 2, + NEED_BIAS = 4 + }; + static const uint8_t fixups[16] = { + [0 ... 
15] = -1, + [TCG_COND_EQ] = 0, + [TCG_COND_NE] = NEED_INV, + [TCG_COND_GT] = 0, + [TCG_COND_LT] = NEED_SWAP, + [TCG_COND_LE] = NEED_INV, + [TCG_COND_GE] = NEED_SWAP | NEED_INV, + [TCG_COND_GTU] = NEED_BIAS, + [TCG_COND_LTU] = NEED_BIAS | NEED_SWAP, + [TCG_COND_LEU] = NEED_BIAS | NEED_INV, + [TCG_COND_GEU] = NEED_BIAS | NEED_SWAP | NEED_INV, + }; + TCGv_vec t1, t2; + uint8_t fixup; + + fixup = fixups[cond & 15]; + tcg_debug_assert(fixup != 0xff); + + if (fixup & NEED_INV) { + cond = tcg_invert_cond(cond); + } + if (fixup & NEED_SWAP) { + t1 = v1, v1 = v2, v2 = t1; + cond = tcg_swap_cond(cond); + } + + t1 = t2 = NULL; + if (fixup & NEED_BIAS) { + t1 = tcg_temp_new_vec(type); + t2 = tcg_temp_new_vec(type); + tcg_gen_dupi_vec(vece, t2, 1ull << ((8 << vece) - 1)); + tcg_gen_sub_vec(vece, t1, v1, t2); + tcg_gen_sub_vec(vece, t2, v2, t2); + v1 = t1; + v2 = t2; + cond = tcg_signed_cond(cond); + } + + tcg_debug_assert(cond == TCG_COND_EQ || cond == TCG_COND_GT); + /* Expand directly; do not recurse. */ + vec_gen_4(INDEX_op_cmp_vec, type, vece, + tcgv_vec_arg(v0), tcgv_vec_arg(v1), tcgv_vec_arg(v2), cond); + + if (t1) { + tcg_temp_free_vec(t1); + if (t2) { + tcg_temp_free_vec(t2); + } + } + if (fixup & NEED_INV) { + tcg_gen_not_vec(vece, v0, v0); + } +} + void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, TCGArg a0, ...) { va_list va; - TCGArg a1, a2; - TCGv_vec v0, t1, t2, t3, t4; + TCGArg a2; + TCGv_vec v0, v1, v2; va_start(va, a0); v0 = temp_tcgv_vec(arg_temp(a0)); + v1 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); + a2 = va_arg(va, TCGArg); switch (opc) { case INDEX_op_shli_vec: case INDEX_op_shri_vec: - tcg_debug_assert(vece == MO_8); - a1 = va_arg(va, TCGArg); - a2 = va_arg(va, TCGArg); - /* Unpack to W, shift, and repack. Tricky bits: - (1) Use punpck*bw x,x to produce DDCCBBAA, - i.e. duplicate in other half of the 16-bit lane. - (2) For right-shift, add 8 so that the high half of - the lane becomes zero. 
For left-shift, we must - shift up and down again. - (3) Step 2 leaves high half zero such that PACKUSWB - (pack with unsigned saturation) does not modify - the quantity. */ - t1 = tcg_temp_new_vec(type); - t2 = tcg_temp_new_vec(type); - vec_gen_3(INDEX_op_x86_punpckl_vec, type, MO_8, - tcgv_vec_arg(t1), a1, a1); - vec_gen_3(INDEX_op_x86_punpckh_vec, type, MO_8, - tcgv_vec_arg(t2), a1, a1); - if (opc == INDEX_op_shri_vec) { - vec_gen_3(INDEX_op_shri_vec, type, MO_16, - tcgv_vec_arg(t1), tcgv_vec_arg(t1), a2 + 8); - vec_gen_3(INDEX_op_shri_vec, type, MO_16, - tcgv_vec_arg(t2), tcgv_vec_arg(t2), a2 + 8); - } else { - vec_gen_3(INDEX_op_shli_vec, type, MO_16, - tcgv_vec_arg(t1), tcgv_vec_arg(t1), a2 + 8); - vec_gen_3(INDEX_op_shli_vec, type, MO_16, - tcgv_vec_arg(t2), tcgv_vec_arg(t2), a2 + 8); - vec_gen_3(INDEX_op_shri_vec, type, MO_16, - tcgv_vec_arg(t1), tcgv_vec_arg(t1), 8); - vec_gen_3(INDEX_op_shri_vec, type, MO_16, - tcgv_vec_arg(t2), tcgv_vec_arg(t2), 8); - } - vec_gen_3(INDEX_op_x86_packus_vec, type, MO_8, - a0, tcgv_vec_arg(t1), tcgv_vec_arg(t2)); - tcg_temp_free_vec(t1); - tcg_temp_free_vec(t2); + expand_vec_shi(type, vece, opc == INDEX_op_shri_vec, v0, v1, a2); break; case INDEX_op_sari_vec: - a1 = va_arg(va, TCGArg); - a2 = va_arg(va, TCGArg); - if (vece == MO_8) { - /* Unpack to W, shift, and repack, as above. 
*/ - t1 = tcg_temp_new_vec(type); - t2 = tcg_temp_new_vec(type); - vec_gen_3(INDEX_op_x86_punpckl_vec, type, MO_8, - tcgv_vec_arg(t1), a1, a1); - vec_gen_3(INDEX_op_x86_punpckh_vec, type, MO_8, - tcgv_vec_arg(t2), a1, a1); - vec_gen_3(INDEX_op_sari_vec, type, MO_16, - tcgv_vec_arg(t1), tcgv_vec_arg(t1), a2 + 8); - vec_gen_3(INDEX_op_sari_vec, type, MO_16, - tcgv_vec_arg(t2), tcgv_vec_arg(t2), a2 + 8); - vec_gen_3(INDEX_op_x86_packss_vec, type, MO_8, - a0, tcgv_vec_arg(t1), tcgv_vec_arg(t2)); - tcg_temp_free_vec(t1); - tcg_temp_free_vec(t2); - break; - } - tcg_debug_assert(vece == MO_64); - /* MO_64: If the shift is <= 32, we can emulate the sign extend by - performing an arithmetic 32-bit shift and overwriting the high - half of the result (note that the ISA says shift of 32 is valid). */ - if (a2 <= 32) { - t1 = tcg_temp_new_vec(type); - vec_gen_3(INDEX_op_sari_vec, type, MO_32, tcgv_vec_arg(t1), a1, a2); - vec_gen_3(INDEX_op_shri_vec, type, MO_64, a0, a1, a2); - vec_gen_4(INDEX_op_x86_blend_vec, type, MO_32, - a0, a0, tcgv_vec_arg(t1), 0xaa); - tcg_temp_free_vec(t1); - break; - } - /* Otherwise we will need to use a compare vs 0 to produce the - sign-extend, shift and merge. 
*/ - t1 = tcg_temp_new_vec(type); - t2 = tcg_const_zeros_vec(type); - vec_gen_4(INDEX_op_cmp_vec, type, MO_64, - tcgv_vec_arg(t1), tcgv_vec_arg(t2), a1, TCG_COND_GT); - tcg_temp_free_vec(t2); - vec_gen_3(INDEX_op_shri_vec, type, MO_64, a0, a1, a2); - vec_gen_3(INDEX_op_shli_vec, type, MO_64, - tcgv_vec_arg(t1), tcgv_vec_arg(t1), 64 - a2); - vec_gen_3(INDEX_op_or_vec, type, MO_64, a0, a0, tcgv_vec_arg(t1)); - tcg_temp_free_vec(t1); + expand_vec_sari(type, vece, v0, v1, a2); break; case INDEX_op_mul_vec: - tcg_debug_assert(vece == MO_8); - a1 = va_arg(va, TCGArg); - a2 = va_arg(va, TCGArg); - switch (type) { - case TCG_TYPE_V64: - t1 = tcg_temp_new_vec(TCG_TYPE_V128); - t2 = tcg_temp_new_vec(TCG_TYPE_V128); - tcg_gen_dup16i_vec(t2, 0); - vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V128, MO_8, - tcgv_vec_arg(t1), a1, tcgv_vec_arg(t2)); - vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V128, MO_8, - tcgv_vec_arg(t2), tcgv_vec_arg(t2), a2); - tcg_gen_mul_vec(MO_16, t1, t1, t2); - tcg_gen_shri_vec(MO_16, t1, t1, 8); - vec_gen_3(INDEX_op_x86_packus_vec, TCG_TYPE_V128, MO_8, - a0, tcgv_vec_arg(t1), tcgv_vec_arg(t1)); - tcg_temp_free_vec(t1); - tcg_temp_free_vec(t2); - break; - - case TCG_TYPE_V128: - t1 = tcg_temp_new_vec(TCG_TYPE_V128); - t2 = tcg_temp_new_vec(TCG_TYPE_V128); - t3 = tcg_temp_new_vec(TCG_TYPE_V128); - t4 = tcg_temp_new_vec(TCG_TYPE_V128); - tcg_gen_dup16i_vec(t4, 0); - vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V128, MO_8, - tcgv_vec_arg(t1), a1, tcgv_vec_arg(t4)); - vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V128, MO_8, - tcgv_vec_arg(t2), tcgv_vec_arg(t4), a2); - vec_gen_3(INDEX_op_x86_punpckh_vec, TCG_TYPE_V128, MO_8, - tcgv_vec_arg(t3), a1, tcgv_vec_arg(t4)); - vec_gen_3(INDEX_op_x86_punpckh_vec, TCG_TYPE_V128, MO_8, - tcgv_vec_arg(t4), tcgv_vec_arg(t4), a2); - tcg_gen_mul_vec(MO_16, t1, t1, t2); - tcg_gen_mul_vec(MO_16, t3, t3, t4); - tcg_gen_shri_vec(MO_16, t1, t1, 8); - tcg_gen_shri_vec(MO_16, t3, t3, 8); - vec_gen_3(INDEX_op_x86_packus_vec, 
TCG_TYPE_V128, MO_8, - a0, tcgv_vec_arg(t1), tcgv_vec_arg(t3)); - tcg_temp_free_vec(t1); - tcg_temp_free_vec(t2); - tcg_temp_free_vec(t3); - tcg_temp_free_vec(t4); - break; - - case TCG_TYPE_V256: - t1 = tcg_temp_new_vec(TCG_TYPE_V256); - t2 = tcg_temp_new_vec(TCG_TYPE_V256); - t3 = tcg_temp_new_vec(TCG_TYPE_V256); - t4 = tcg_temp_new_vec(TCG_TYPE_V256); - tcg_gen_dup16i_vec(t4, 0); - /* a1: A[0-7] ... D[0-7]; a2: W[0-7] ... Z[0-7] - t1: extends of B[0-7], D[0-7] - t2: extends of X[0-7], Z[0-7] - t3: extends of A[0-7], C[0-7] - t4: extends of W[0-7], Y[0-7]. */ - vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V256, MO_8, - tcgv_vec_arg(t1), a1, tcgv_vec_arg(t4)); - vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V256, MO_8, - tcgv_vec_arg(t2), tcgv_vec_arg(t4), a2); - vec_gen_3(INDEX_op_x86_punpckh_vec, TCG_TYPE_V256, MO_8, - tcgv_vec_arg(t3), a1, tcgv_vec_arg(t4)); - vec_gen_3(INDEX_op_x86_punpckh_vec, TCG_TYPE_V256, MO_8, - tcgv_vec_arg(t4), tcgv_vec_arg(t4), a2); - /* t1: BX DZ; t2: AW CY. */ - tcg_gen_mul_vec(MO_16, t1, t1, t2); - tcg_gen_mul_vec(MO_16, t3, t3, t4); - tcg_gen_shri_vec(MO_16, t1, t1, 8); - tcg_gen_shri_vec(MO_16, t3, t3, 8); - /* a0: AW BX CY DZ. */ - vec_gen_3(INDEX_op_x86_packus_vec, TCG_TYPE_V256, MO_8, - a0, tcgv_vec_arg(t1), tcgv_vec_arg(t3)); - tcg_temp_free_vec(t1); - tcg_temp_free_vec(t2); - tcg_temp_free_vec(t3); - tcg_temp_free_vec(t4); - break; - - default: - g_assert_not_reached(); - } + v2 = temp_tcgv_vec(arg_temp(a2)); + expand_vec_mul(type, vece, v0, v1, v2); break; case INDEX_op_cmp_vec: - { - enum { - NEED_SWAP = 1, - NEED_INV = 2, - NEED_BIAS = 4 - }; - static const uint8_t fixups[16] = { - [0 ... 
15] = -1, - [TCG_COND_EQ] = 0, - [TCG_COND_NE] = NEED_INV, - [TCG_COND_GT] = 0, - [TCG_COND_LT] = NEED_SWAP, - [TCG_COND_LE] = NEED_INV, - [TCG_COND_GE] = NEED_SWAP | NEED_INV, - [TCG_COND_GTU] = NEED_BIAS, - [TCG_COND_LTU] = NEED_BIAS | NEED_SWAP, - [TCG_COND_LEU] = NEED_BIAS | NEED_INV, - [TCG_COND_GEU] = NEED_BIAS | NEED_SWAP | NEED_INV, - }; - - TCGCond cond; - uint8_t fixup; - - a1 = va_arg(va, TCGArg); - a2 = va_arg(va, TCGArg); - cond = va_arg(va, TCGArg); - fixup = fixups[cond & 15]; - tcg_debug_assert(fixup != 0xff); - - if (fixup & NEED_INV) { - cond = tcg_invert_cond(cond); - } - if (fixup & NEED_SWAP) { - TCGArg t; - t = a1, a1 = a2, a2 = t; - cond = tcg_swap_cond(cond); - } - - t1 = t2 = NULL; - if (fixup & NEED_BIAS) { - t1 = tcg_temp_new_vec(type); - t2 = tcg_temp_new_vec(type); - tcg_gen_dupi_vec(vece, t2, 1ull << ((8 << vece) - 1)); - tcg_gen_sub_vec(vece, t1, temp_tcgv_vec(arg_temp(a1)), t2); - tcg_gen_sub_vec(vece, t2, temp_tcgv_vec(arg_temp(a2)), t2); - a1 = tcgv_vec_arg(t1); - a2 = tcgv_vec_arg(t2); - cond = tcg_signed_cond(cond); - } - - tcg_debug_assert(cond == TCG_COND_EQ || cond == TCG_COND_GT); - vec_gen_4(INDEX_op_cmp_vec, type, vece, a0, a1, a2, cond); - - if (fixup & NEED_BIAS) { - tcg_temp_free_vec(t1); - tcg_temp_free_vec(t2); - } - if (fixup & NEED_INV) { - tcg_gen_not_vec(vece, v0, v0); - } - } + v2 = temp_tcgv_vec(arg_temp(a2)); + expand_vec_cmp(type, vece, v0, v1, v2, va_arg(va, TCGArg)); break; default: From patchwork Fri Jan 4 22:31:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 154805
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 5 Jan 2019 08:31:13 +1000 Message-Id: <20190104223116.14037-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org> References: <20190104223116.14037-1-richard.henderson@linaro.org> Subject: [Qemu-devel] [PATCH v2 07/10] tcg/i386: Implement vector saturating arithmetic Only MO_8 and MO_16 are implemented, since that's all the instruction set provides.
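As a reference point for what the four saturating ops are expected to compute per element, here is a minimal scalar sketch for the MO_8 case. The model_* names are hypothetical and this is not the QEMU helper code, only an illustration of signed versus unsigned saturation on bytes; the MO_16 case would look the same with 16-bit types and limits.

```c
#include <assert.h>
#include <stdint.h>

/* Signed saturating add: clamp the exact sum to [INT8_MIN, INT8_MAX]. */
static int8_t model_ssadd8(int8_t a, int8_t b)
{
    int r = a + b;
    return (int8_t)(r > INT8_MAX ? INT8_MAX : r < INT8_MIN ? INT8_MIN : r);
}

/* Unsigned saturating add: clamp the exact sum to UINT8_MAX. */
static uint8_t model_usadd8(uint8_t a, uint8_t b)
{
    int r = a + b;
    return (uint8_t)(r > UINT8_MAX ? UINT8_MAX : r);
}

/* Signed saturating subtract: clamp the exact difference. */
static int8_t model_sssub8(int8_t a, int8_t b)
{
    int r = a - b;
    return (int8_t)(r > INT8_MAX ? INT8_MAX : r < INT8_MIN ? INT8_MIN : r);
}

/* Unsigned saturating subtract: results below zero become zero. */
static uint8_t model_ussub8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a > b ? a - b : 0);
}
```

The same bit pattern can give different results under the two interpretations: 0xc8 + 0x64 saturates to 0xff when treated as unsigned (200 + 100), but is 0x2c when treated as signed (-56 + 100 = 44). That is why the signed and unsigned ops need distinct instructions.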
Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.h | 2 +- tcg/i386/tcg-target.inc.c | 42 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 43 insertions(+), 1 deletion(-) -- 2.17.2 diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index 7bd7eae672..efbd5a6fc9 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -185,7 +185,7 @@ extern bool have_avx2; #define TCG_TARGET_HAS_shv_vec 0 #define TCG_TARGET_HAS_cmp_vec 1 #define TCG_TARGET_HAS_mul_vec 1 -#define TCG_TARGET_HAS_sat_vec 0 +#define TCG_TARGET_HAS_sat_vec 1 #define TCG_TARGET_HAS_minmax_vec 0 #define TCG_TARGET_deposit_i32_valid(ofs, len) \ diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index ad97386d06..feec40a412 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -377,6 +377,10 @@ static inline int tcg_target_const_match(tcg_target_long val, TCGType type, #define OPC_PADDW (0xfd | P_EXT | P_DATA16) #define OPC_PADDD (0xfe | P_EXT | P_DATA16) #define OPC_PADDQ (0xd4 | P_EXT | P_DATA16) +#define OPC_PADDSB (0xec | P_EXT | P_DATA16) +#define OPC_PADDSW (0xed | P_EXT | P_DATA16) +#define OPC_PADDUB (0xdc | P_EXT | P_DATA16) +#define OPC_PADDUW (0xdd | P_EXT | P_DATA16) #define OPC_PAND (0xdb | P_EXT | P_DATA16) #define OPC_PANDN (0xdf | P_EXT | P_DATA16) #define OPC_PBLENDW (0x0e | P_EXT3A | P_DATA16) @@ -408,6 +412,10 @@ static inline int tcg_target_const_match(tcg_target_long val, TCGType type, #define OPC_PSUBW (0xf9 | P_EXT | P_DATA16) #define OPC_PSUBD (0xfa | P_EXT | P_DATA16) #define OPC_PSUBQ (0xfb | P_EXT | P_DATA16) +#define OPC_PSUBSB (0xe8 | P_EXT | P_DATA16) +#define OPC_PSUBSW (0xe9 | P_EXT | P_DATA16) +#define OPC_PSUBUB (0xd8 | P_EXT | P_DATA16) +#define OPC_PSUBUW (0xd9 | P_EXT | P_DATA16) #define OPC_PUNPCKLBW (0x60 | P_EXT | P_DATA16) #define OPC_PUNPCKLWD (0x61 | P_EXT | P_DATA16) #define OPC_PUNPCKLDQ (0x62 | P_EXT | P_DATA16) @@ -2591,9 +2599,21 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, 
static int const add_insn[4] = { OPC_PADDB, OPC_PADDW, OPC_PADDD, OPC_PADDQ }; + static int const ssadd_insn[4] = { + OPC_PADDSB, OPC_PADDSW, OPC_UD2, OPC_UD2 + }; + static int const usadd_insn[4] = { + OPC_PADDUB, OPC_PADDUW, OPC_UD2, OPC_UD2 + }; static int const sub_insn[4] = { OPC_PSUBB, OPC_PSUBW, OPC_PSUBD, OPC_PSUBQ }; + static int const sssub_insn[4] = { + OPC_PSUBSB, OPC_PSUBSW, OPC_UD2, OPC_UD2 + }; + static int const ussub_insn[4] = { + OPC_PSUBUB, OPC_PSUBUW, OPC_UD2, OPC_UD2 + }; static int const mul_insn[4] = { OPC_UD2, OPC_PMULLW, OPC_PMULLD, OPC_UD2 }; @@ -2631,9 +2651,21 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_add_vec: insn = add_insn[vece]; goto gen_simd; + case INDEX_op_ssadd_vec: + insn = ssadd_insn[vece]; + goto gen_simd; + case INDEX_op_usadd_vec: + insn = usadd_insn[vece]; + goto gen_simd; case INDEX_op_sub_vec: insn = sub_insn[vece]; goto gen_simd; + case INDEX_op_sssub_vec: + insn = sssub_insn[vece]; + goto gen_simd; + case INDEX_op_ussub_vec: + insn = ussub_insn[vece]; + goto gen_simd; case INDEX_op_mul_vec: insn = mul_insn[vece]; goto gen_simd; @@ -3007,6 +3039,10 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_or_vec: case INDEX_op_xor_vec: case INDEX_op_andc_vec: + case INDEX_op_ssadd_vec: + case INDEX_op_usadd_vec: + case INDEX_op_sssub_vec: + case INDEX_op_ussub_vec: case INDEX_op_cmp_vec: case INDEX_op_x86_shufps_vec: case INDEX_op_x86_blend_vec: @@ -3074,6 +3110,12 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) } return 1; + case INDEX_op_ssadd_vec: + case INDEX_op_usadd_vec: + case INDEX_op_sssub_vec: + case INDEX_op_ussub_vec: + return vece <= MO_16; + default: return 0; } From patchwork Fri Jan 4 22:31:14 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 154808 Delivered-To: patch@linaro.org Received: by 2002:a2e:299d:0:0:0:0:0 with SMTP id
p29-v6csp1095858ljp; Fri, 4 Jan 2019 14:40:20 -0800 (PST) X-Google-Smtp-Source: ALg8bN6uSlG6b3heFKQ/SV0VsYzD6SA0kSXcnm5YsQuFgIa0v8c87rgNFSWIg0GYQEYDKQpZPRCV X-Received: by 2002:adf:f785:: with SMTP id q5mr48121885wrp.9.1546641620846; Fri, 04 Jan 2019 14:40:20 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1546641620; cv=none; d=google.com; s=arc-20160816; b=RJ+wzY607v7ssSC+Eyjxp3oEeyZizYABuZZaOXcCAeR9Oo/sEBJKEGshcyoBPv597Q 2JGi2lnjDyzCGtlwX8qKlATgMFb3Od0CdEe227obQ4fGzFINJLkrgLvmJqbmVkyWJ7Q/ zt+G3Wdu0XuH7Q16Qp6CMqK0Z9xGLfKg8RVK5JGI5MhwldjCaTyITacR9cP1BpYyAkV/ iCTGNiVlFFGlB4KhQzcjIsPk/WksbOt9jEdrebmcO0iesnN3K5oxNc5kljuV9/881X/B GcKlioVHOjA00veFwNq0/juTsxwdQ1Jw1o+7uQuZS6djpN2bIIPGAs1eQNkI5DK1oKw7 PvoQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature; bh=hAwJsayMvDe4O8fXPQhl9uUYFAm/hD4Vqu6cubinGYk=; b=KcXiXIpKO8SD6shh9lfx13JKTkicbvS3di6NjP59cZ8M2rx6NkydxGUs9SqHA2PPy8 TxNHtvRZD7REFH4aLwNiW2lPQDVGPCf7GyOW1vWlny6NBUMQaNqmQUslnHYDS5HWU5j1 VIx2iZvg7lXnxpRGVRUjX6tuzZbt11MsCqVwqk4zASQO8dDsd7i77rmKgY/GlVJTwekr QLG9POhDf4LikJ9bBR+cWpdHSN1oatSRYcacsybDM++vAvtkhRU+DuzpDhXB0pZ+D9T2 LXgK79dHoGYtsXHXSm0t8lL1D9mY4rCY8kb/NpLVBzPirHf3acY2CzL4CdVvoQHsIZZ6 5gQQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=dzVRsv61; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from listsout.gnu.org (listsout.gnu.org. 
[208.118.235.17]) by mx.google.com with ESMTPS id w129si1483423wmb.25.2019.01.04.14.40.20 for (version=TLS1 cipher=AES128-SHA bits=128/128); Fri, 04 Jan 2019 14:40:20 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=dzVRsv61; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([127.0.0.1]:58342 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY8Z-0005LD-PF for patch@linaro.org; Fri, 04 Jan 2019 17:40:19 -0500 Received: from eggsout.gnu.org ([209.51.188.92]:55066 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY0M-0002yf-7m for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:51 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gfY0L-00026o-2y for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:50 -0500 Received: from mail-io1-xd41.google.com ([2607:f8b0:4864:20::d41]:41916) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gfY0K-00024L-Tw for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:49 -0500 Received: by mail-io1-xd41.google.com with SMTP id s22so30699537ioc.8 for ; Fri, 04 Jan 2019 14:31:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=hAwJsayMvDe4O8fXPQhl9uUYFAm/hD4Vqu6cubinGYk=; b=dzVRsv61gb/RrEUgFVJ2EdjqU6w8PMeXiQafEDjbqNB8BxjoFbIIscKPovPAKX9HOE gyF25eG1GSoHYqME0sbImL/zd2hB8fkt5aC9JJRAtWHuX82fzLsHHG4RJVS/9JWxhvya xXaGY3pi14BUilxRklo+HakxjOS/jX2dtFEJg= 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=hAwJsayMvDe4O8fXPQhl9uUYFAm/hD4Vqu6cubinGYk=; b=cuoUWbDpuHA43PfHTXZGr7guBmnvW5JbIxJ3XqLtA654b/PQWkUp0HyYAgC/AGPUWs hwXnGGGf47QNyJGLDR6er1xGYzggqjdAKsNNG72RyDRu3Ixc51eTEXYqeO2yHpC+JIAG H29GeX0NF71rctoPc9NMpRXfmuPo1ikKI87C1gWHba1nkhBV5pa6fsHPB3bN87LIxhAT Np3yde1lF/TNQ6dsxUMZ3zky1GCbn0PgeQKGP3qzK5vzOqHPvmM5H4RnVTqPeA1CJwil FRVpqE2OYeeaVHgqcR/NVQJUUUIpDjkKpVuMz/NEheX5J40H3VWXl/M7AV1jcC4ExZbg +6sw== X-Gm-Message-State: AJcUukchpLjWSFV9VwC3Z25UEYC8FYXA8H+cQ5zUPKiyDr8X/9QATvCF I1U32FqyZCMsMWBrIER5WiQ30KmzvwU= X-Received: by 2002:a6b:5902:: with SMTP id n2mr17552308iob.16.1546641107899; Fri, 04 Jan 2019 14:31:47 -0800 (PST) Received: from cloudburst.twiddle.net ([172.56.12.23]) by smtp.gmail.com with ESMTPSA id t6sm27793259ioc.87.2019.01.04.14.31.45 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 04 Jan 2019 14:31:47 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 5 Jan 2019 08:31:14 +1000 Message-Id: <20190104223116.14037-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org> References: <20190104223116.14037-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::d41 Subject: [Qemu-devel] [PATCH v2 08/10] tcg/i386: Implement vector minmax arithmetic X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" The avx instruction set does not directly provide MO_64. We can still implement 64-bit with comparison and vpblendvb. 
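As a rough scalar illustration of the compare-and-blend idea for one 64-bit lane: build an all-ones/all-zeros mask from a greater-than compare, then use it to pick between the two inputs. The names below (gt_mask_s64, gt_mask_u64, select64, smax64, smin64, umax64, umin64) are hypothetical and exist only for this sketch. The operand order of select64 is an assumption chosen for readability and is not meant to mirror the operand order of the blend instruction.

```c
#include <stdint.h>

/* Hypothetical helpers, one 64-bit lane each; not part of the patch. */

/* All-ones if a > b as signed values, else zero. */
static inline uint64_t gt_mask_s64(uint64_t a, uint64_t b)
{
    return (int64_t)a > (int64_t)b ? UINT64_MAX : 0;
}

/* All-ones if a > b as unsigned values, else zero. */
static inline uint64_t gt_mask_u64(uint64_t a, uint64_t b)
{
    return a > b ? UINT64_MAX : 0;
}

/* Bitwise select: bits of y where mask is set, bits of x elsewhere. */
static inline uint64_t select64(uint64_t x, uint64_t y, uint64_t mask)
{
    return (x & ~mask) | (y & mask);
}

/* max: take a where a > b, otherwise b. */
static inline uint64_t smax64(uint64_t a, uint64_t b)
{
    return select64(b, a, gt_mask_s64(a, b));
}

/* min: same mask, data operands swapped. */
static inline uint64_t smin64(uint64_t a, uint64_t b)
{
    return select64(a, b, gt_mask_s64(a, b));
}

static inline uint64_t umax64(uint64_t a, uint64_t b)
{
    return select64(b, a, gt_mask_u64(a, b));
}

static inline uint64_t umin64(uint64_t a, uint64_t b)
{
    return select64(a, b, gt_mask_u64(a, b));
}
```

Min and max share one compare and differ only in which input the mask selects, so a single expansion helper with a swap can cover both.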
Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.h | 2 +- tcg/i386/tcg-target.inc.c | 81 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 82 insertions(+), 1 deletion(-) -- 2.17.2 diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index efbd5a6fc9..7995fe3eab 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -186,7 +186,7 @@ extern bool have_avx2; #define TCG_TARGET_HAS_cmp_vec 1 #define TCG_TARGET_HAS_mul_vec 1 #define TCG_TARGET_HAS_sat_vec 1 -#define TCG_TARGET_HAS_minmax_vec 0 +#define TCG_TARGET_HAS_minmax_vec 1 #define TCG_TARGET_deposit_i32_valid(ofs, len) \ (((ofs) == 0 && (len) == 8) || ((ofs) == 8 && (len) == 8) || \ diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index feec40a412..94007c7aa5 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -392,6 +392,18 @@ static inline int tcg_target_const_match(tcg_target_long val, TCGType type, #define OPC_PCMPGTW (0x65 | P_EXT | P_DATA16) #define OPC_PCMPGTD (0x66 | P_EXT | P_DATA16) #define OPC_PCMPGTQ (0x37 | P_EXT38 | P_DATA16) +#define OPC_PMAXSB (0x3c | P_EXT38 | P_DATA16) +#define OPC_PMAXSW (0xee | P_EXT | P_DATA16) +#define OPC_PMAXSD (0x3d | P_EXT38 | P_DATA16) +#define OPC_PMAXUB (0xde | P_EXT | P_DATA16) +#define OPC_PMAXUW (0x3e | P_EXT38 | P_DATA16) +#define OPC_PMAXUD (0x3f | P_EXT38 | P_DATA16) +#define OPC_PMINSB (0x38 | P_EXT38 | P_DATA16) +#define OPC_PMINSW (0xea | P_EXT | P_DATA16) +#define OPC_PMINSD (0x39 | P_EXT38 | P_DATA16) +#define OPC_PMINUB (0xda | P_EXT | P_DATA16) +#define OPC_PMINUW (0x3a | P_EXT38 | P_DATA16) +#define OPC_PMINUD (0x3b | P_EXT38 | P_DATA16) #define OPC_PMOVSXBW (0x20 | P_EXT38 | P_DATA16) #define OPC_PMOVSXWD (0x23 | P_EXT38 | P_DATA16) #define OPC_PMOVSXDQ (0x25 | P_EXT38 | P_DATA16) @@ -2638,6 +2650,18 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, static int const packus_insn[4] = { OPC_PACKUSWB, OPC_PACKUSDW, OPC_UD2, OPC_UD2 }; + static int const smin_insn[4] = { + 
OPC_PMINSB, OPC_PMINSW, OPC_PMINSD, OPC_UD2 + }; + static int const smax_insn[4] = { + OPC_PMAXSB, OPC_PMAXSW, OPC_PMAXSD, OPC_UD2 + }; + static int const umin_insn[4] = { + OPC_PMINUB, OPC_PMINUW, OPC_PMINUD, OPC_UD2 + }; + static int const umax_insn[4] = { + OPC_PMAXUB, OPC_PMAXUW, OPC_PMAXUD, OPC_UD2 + }; TCGType type = vecl + TCG_TYPE_V64; int insn, sub; @@ -2678,6 +2702,18 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_xor_vec: insn = OPC_PXOR; goto gen_simd; + case INDEX_op_smin_vec: + insn = smin_insn[vece]; + goto gen_simd; + case INDEX_op_umin_vec: + insn = umin_insn[vece]; + goto gen_simd; + case INDEX_op_smax_vec: + insn = smax_insn[vece]; + goto gen_simd; + case INDEX_op_umax_vec: + insn = umax_insn[vece]; + goto gen_simd; case INDEX_op_x86_punpckl_vec: insn = punpckl_insn[vece]; goto gen_simd; @@ -3043,6 +3079,10 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_usadd_vec: case INDEX_op_sssub_vec: case INDEX_op_ussub_vec: + case INDEX_op_smin_vec: + case INDEX_op_umin_vec: + case INDEX_op_smax_vec: + case INDEX_op_umax_vec: case INDEX_op_cmp_vec: case INDEX_op_x86_shufps_vec: case INDEX_op_x86_blend_vec: @@ -3115,6 +3155,11 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_sssub_vec: case INDEX_op_ussub_vec: return vece <= MO_16; + case INDEX_op_smin_vec: + case INDEX_op_smax_vec: + case INDEX_op_umin_vec: + case INDEX_op_umax_vec: + return vece <= MO_32 ? 
1 : -1; default: return 0; @@ -3343,6 +3388,25 @@ static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, } } +static void expand_vec_minmax(TCGType type, unsigned vece, + TCGCond cond, bool min, + TCGv_vec v0, TCGv_vec v1, TCGv_vec v2) +{ + TCGv_vec t1 = tcg_temp_new_vec(type); + + tcg_debug_assert(vece == MO_64); + + tcg_gen_cmp_vec(cond, vece, t1, v1, v2); + if (min) { + TCGv_vec t2; + t2 = v1, v1 = v2, v2 = t2; + } + vec_gen_4(INDEX_op_x86_vpblendvb_vec, type, vece, + tcgv_vec_arg(v0), tcgv_vec_arg(v1), + tcgv_vec_arg(v2), tcgv_vec_arg(t1)); + tcg_temp_free_vec(t1); +} + void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, TCGArg a0, ...) { @@ -3375,6 +3439,23 @@ void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, expand_vec_cmp(type, vece, v0, v1, v2, va_arg(va, TCGArg)); break; + case INDEX_op_smin_vec: + v2 = temp_tcgv_vec(arg_temp(a2)); + expand_vec_minmax(type, vece, TCG_COND_GT, true, v0, v1, v2); + break; + case INDEX_op_smax_vec: + v2 = temp_tcgv_vec(arg_temp(a2)); + expand_vec_minmax(type, vece, TCG_COND_GT, false, v0, v1, v2); + break; + case INDEX_op_umin_vec: + v2 = temp_tcgv_vec(arg_temp(a2)); + expand_vec_minmax(type, vece, TCG_COND_GTU, true, v0, v1, v2); + break; + case INDEX_op_umax_vec: + v2 = temp_tcgv_vec(arg_temp(a2)); + expand_vec_minmax(type, vece, TCG_COND_GTU, false, v0, v1, v2); + break; + default: break; } From patchwork Fri Jan 4 22:31:15 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 154809 Delivered-To: patch@linaro.org Received: by 2002:a2e:299d:0:0:0:0:0 with SMTP id p29-v6csp1095981ljp; Fri, 4 Jan 2019 14:40:32 -0800 (PST) X-Google-Smtp-Source: ALg8bN6x+2Zzsk6IUFTB/7e9U7Gblm9C0US91nhoOvsDBPaty1IJcF58+fDU3UENTt6tmZEvG5l5 X-Received: by 2002:a5d:558a:: with SMTP id i10mr43918060wrv.287.1546641632349; Fri, 04 Jan 2019 14:40:32 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1546641632; 
cv=none; d=google.com; s=arc-20160816; b=UckWhc1XsSGzALSGOluf++CzoPZtLTfrBzXppe23jVr4H26tfhwITIuASluLg3cgpG D3Orzlk/R0ud/oUoJVFHAhEhHCYaxMRumU1NWp3LMgKOpsphoY1BubQo5Fb2o2XqkGFe NaI6L+eD2bm/lum6L9UWlrr1wNysbUsctvEKCEaxLMCgLvrtueO3w0DYedrKAcogEDQR Cn1NqxBIVVgPq0kaBO1Dx+zR8yI2IZ8lG2QEv4AL0Hc1l/HTZNA8e1fJ0MtOd4AUY3jw R+cc0A5Px+K7HKSfF08InmHEN1Tv+CgHC/Wcl8Y7wAmMZnqKikZhO35yLvo7/q4Czrs0 Xh6w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature; bh=bjBWCHW6CsJ6peflKASsBY6j3Bp2s4dyGDxZEiyy4vc=; b=BYgGVlwsqgS0j0oyNS2ZwpiUneYBbOVVNR9rKiTmyUzNu0uaXYUGWczmwGDhMFCPif iy0OgRah7b3WmCpr0uln+Lq/Yjjc9orDsgD09T2lrUQikt066WRtkki7EbJmtG3GvP8u UlO5h85yPid/x5Ls9Uqvnr+Oqb4l2ST33L9yd+3G2/QgZ+gCzU0eYXvNv9Gor6aTmo5d vwrTsZl+/DCcNuKmL/H34qTm0zOpaNLVtxe6fzu2LzyzcnyYBhJnvQLhiK5kKWI9KM7p Q04m/PCDPIKBXIwrmOJXjMsghPdqvxOz8PU62uzySo7u84NglS/DCdKQuZnxF2nzHX0G U4Tg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=i6zdQcgN; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from listsout.gnu.org (listsout.gnu.org. 
[208.118.235.17]) by mx.google.com with ESMTPS id g197si1340007wmd.114.2019.01.04.14.40.32 for (version=TLS1 cipher=AES128-SHA bits=128/128); Fri, 04 Jan 2019 14:40:32 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=i6zdQcgN; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([127.0.0.1]:58387 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY8l-0005Ys-8Z for patch@linaro.org; Fri, 04 Jan 2019 17:40:31 -0500 Received: from eggsout.gnu.org ([209.51.188.92]:55082 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY0O-00031k-Ay for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:53 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gfY0N-00029c-Fz for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:52 -0500 Received: from mail-io1-xd43.google.com ([2607:f8b0:4864:20::d43]:36634) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gfY0N-00029C-B6 for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:51 -0500 Received: by mail-io1-xd43.google.com with SMTP id m19so30708174ioh.3 for ; Fri, 04 Jan 2019 14:31:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=bjBWCHW6CsJ6peflKASsBY6j3Bp2s4dyGDxZEiyy4vc=; b=i6zdQcgN8zn4dBlv6w7PMsz9tOUMQ0HkshxeCND8wJVAYreOEWGQWURdNcyBznaAGL 2PuUrFuNnNuZeQLAx4ws4UWkUoFHA2foC6ke5+dT0Fav4fuTfWTHCj28RA/RNijoSn+Z K2A36UIAOpEOHI2b09CJH6ZEYUoXWroDwK+9o= 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=bjBWCHW6CsJ6peflKASsBY6j3Bp2s4dyGDxZEiyy4vc=; b=Dv6LNzHX5Y82OIqfxgjO13QfxMzo+OfD+XpaUdJBSRxPreN9uor7hY3pmdDldJIYMf WP92vXxS6y79EzBlJm4qMBF1opNP2HmgNzowLnbQRmupxq/CWK62XUUGfUxfyOe/Lilo 0WA0+dwNpXaxkbaDNYWN5MJ2aDdZ/+FNlMHcNy23yT0q8+A/Hv2e0eM9ze7tdjWqUkCU ggaa6s90j0tLemxzMPPnoVUbWWn5/fKzRmGql3QO3+7pcqrzD4lqjklgKpi964ikCThv Ah1JYFFXuZYuZVsAiSbBmR/dFh3zaF/Mw2Kv4OZricyRXH/qHlfYln7a0nWF22Gv9wUd 9Pcg== X-Gm-Message-State: AJcUukfSXlIISt+WswYyqeD534391Jkknk2L67tOmA/EheC1NDiZxpc1 GKanBH0yqev0zXy9wkELntNILU8sZtc= X-Received: by 2002:a6b:600b:: with SMTP id r11mr40122957iog.259.1546641110355; Fri, 04 Jan 2019 14:31:50 -0800 (PST) Received: from cloudburst.twiddle.net ([172.56.12.23]) by smtp.gmail.com with ESMTPSA id t6sm27793259ioc.87.2019.01.04.14.31.48 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 04 Jan 2019 14:31:49 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 5 Jan 2019 08:31:15 +1000 Message-Id: <20190104223116.14037-10-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org> References: <20190104223116.14037-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:4864:20::d43 Subject: [Qemu-devel] [PATCH v2 09/10] tcg/aarch64: Implement vector saturating arithmetic X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.h | 2 +- tcg/aarch64/tcg-target.inc.c | 24 ++++++++++++++++++++++++ 2 files changed, 25 insertions(+), 1 deletion(-) -- 2.17.2 diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h index 545a6eec75..a1884543d0 100644 --- a/tcg/aarch64/tcg-target.h +++ b/tcg/aarch64/tcg-target.h @@ -135,7 +135,7 @@ typedef enum { #define TCG_TARGET_HAS_shv_vec 0 #define TCG_TARGET_HAS_cmp_vec 1 #define TCG_TARGET_HAS_mul_vec 1 -#define TCG_TARGET_HAS_sat_vec 0 +#define TCG_TARGET_HAS_sat_vec 1 #define TCG_TARGET_HAS_minmax_vec 0 #define TCG_TARGET_DEFAULT_MO (0) diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index 0562e0aa40..b2b011f130 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -528,6 +528,10 @@ typedef enum { I3616_CMHI = 0x2e203400, I3616_CMHS = 0x2e203c00, I3616_CMEQ = 0x2e208c00, + I3616_SQADD = 0x0e200c00, + I3616_SQSUB = 0x0e202c00, + I3616_UQADD = 0x2e200c00, + I3616_UQSUB = 0x2e202c00, /* AdvSIMD two-reg misc. 
*/ I3617_CMGT0 = 0x0e208800, @@ -2137,6 +2141,18 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_orc_vec: tcg_out_insn(s, 3616, ORN, is_q, 0, a0, a1, a2); break; + case INDEX_op_ssadd_vec: + tcg_out_insn(s, 3616, SQADD, is_q, vece, a0, a1, a2); + break; + case INDEX_op_sssub_vec: + tcg_out_insn(s, 3616, SQSUB, is_q, vece, a0, a1, a2); + break; + case INDEX_op_usadd_vec: + tcg_out_insn(s, 3616, UQADD, is_q, vece, a0, a1, a2); + break; + case INDEX_op_ussub_vec: + tcg_out_insn(s, 3616, UQSUB, is_q, vece, a0, a1, a2); + break; case INDEX_op_not_vec: tcg_out_insn(s, 3617, NOT, is_q, 0, a0, a1); break; @@ -2207,6 +2223,10 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_shli_vec: case INDEX_op_shri_vec: case INDEX_op_sari_vec: + case INDEX_op_ssadd_vec: + case INDEX_op_sssub_vec: + case INDEX_op_usadd_vec: + case INDEX_op_ussub_vec: return 1; case INDEX_op_mul_vec: return vece < MO_64; @@ -2386,6 +2406,10 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_xor_vec: case INDEX_op_andc_vec: case INDEX_op_orc_vec: + case INDEX_op_ssadd_vec: + case INDEX_op_sssub_vec: + case INDEX_op_usadd_vec: + case INDEX_op_ussub_vec: return &w_w_w; case INDEX_op_not_vec: case INDEX_op_neg_vec: From patchwork Fri Jan 4 22:31:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 154810 Delivered-To: patch@linaro.org Received: by 2002:a2e:299d:0:0:0:0:0 with SMTP id p29-v6csp1097939ljp; Fri, 4 Jan 2019 14:43:42 -0800 (PST) X-Google-Smtp-Source: ALg8bN6TWIvkKWUwJAkTzj9WCylSWBv6UWlgfxqg+TnJrWUipwjHy6GyX85VfjI9TPz+5A/R28Wf X-Received: by 2002:a19:db54:: with SMTP id s81mr15969480lfg.102.1546641822840; Fri, 04 Jan 2019 14:43:42 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1546641822; cv=none; d=google.com; s=arc-20160816; b=nU2mw2JGdVvnRauaeuPV+E25ehJD3xaUNRtEcgA6nKT2LZM2YyuWCeaXzhfVlT49Q3 
4qcnnSIta5bCefBHm8HSvewJxj1UghKZct/Ka1y+g0JEYRf/IHyYOSHPOzcPGQ5SxR6v KYAdjiKAen8FbglXr3EZZL7ZwogMWvfWe70FiMx2ICVAM6mXwPHNjV47NtY4uoTjZ6nY CdYcNJWkONdg8KPHtSOlw2NK8guoKLAgCmKXArGz5oO0/SWcNcRTwdlLYdbIyoYtkeWI BA0uQ7QoBb5VfIqHI6pbD96PGD6DoxZ5xdK/LstlHEhZfxZwjVIRQe3eFsMb6+BVgbEm A9ag== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature; bh=1RVpvwteSuePPKo6wNsZRQ/0lJBH3U2L6QXC6WM1wUc=; b=SPNJXeZmIGYtRbnBFDuGe3BBG1l6dPj+PkL+3yR4GB/uDV36nZrUnaapKHncKunWL/ hf9v4UiRIHm5nqc5Phkwoq883pFSZ5RUY5P73YkwPK/NDQH91P56/Vexat4FaDrA4b8i o2akX/rxGOgN4/OVWgj2pKET6nE7udiseIsrxC/ju2wrrL1c2vQg8SSSALCdbhkc0rmX lLEVpqz+VOJgQJo7tg3L18iE+EU5r7LkbJQpiI8p6ZR9gr/B1Abdx29Xq42e7MP++MGH /NwlAzr4oFkPBYWzmOTnWvbP4m2lSuHzFR2LX6CtIg0APlpjT3wBmiNbmw4tOHbrIXcy nnUQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Rn81FQ8f; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from listsout.gnu.org (listsout.gnu.org. 
[208.118.235.17]) by mx.google.com with ESMTPS id g83si45187367lfl.30.2019.01.04.14.43.42 for (version=TLS1 cipher=AES128-SHA bits=128/128); Fri, 04 Jan 2019 14:43:42 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Rn81FQ8f; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([127.0.0.1]:59152 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfYBp-00018h-Ie for patch@linaro.org; Fri, 04 Jan 2019 17:43:41 -0500 Received: from eggsout.gnu.org ([209.51.188.92]:55099 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY0Q-00036k-O7 for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gfY0P-0002GI-RK for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:54 -0500 Received: from mail-io1-xd42.google.com ([2607:f8b0:4864:20::d42]:42690) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gfY0P-0002F0-Mz for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:53 -0500 Received: by mail-io1-xd42.google.com with SMTP id x6so30675046ioa.9 for ; Fri, 04 Jan 2019 14:31:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=1RVpvwteSuePPKo6wNsZRQ/0lJBH3U2L6QXC6WM1wUc=; b=Rn81FQ8ffSMh/Ve/JS9/8VI/EGe6C8+v8O7nzsWMcWxT+plSqu4+C2M4aTFcuOXH5E bcwFQhCeVrz/Js8losiHtU+gfXxF1l+oV9Y8qRO4d/wHPI2nhj4tTyfMpdUfjdEz6VoZ W1tK6yD+04f31SK3iZxamC5sgV6fG1sFsuhwE= 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=1RVpvwteSuePPKo6wNsZRQ/0lJBH3U2L6QXC6WM1wUc=; b=RUF4oxrCWUCgOOxiXOvjmw3+0x7EQfQTA87onxK7SAtpvL7XXrYM6AGMSL6BbEQPsq tNCnDQa2zGd+s8RWFQqw+OruPHEmwMJAiv9OQYVTRtd9yCb8qIppYWhTZI7tBjUmpj/4 ex1mAItWxkHaFOa+99ULO9WI/IFYkxhSzZo1/7NYREd1SawrThwgQtcWZ/Fty/OwUZeg TWfU/6bUW4FlxRaUgxjxlbvXa8A4zA189j5icXXBIHlDdhvvtq/ZhfOoXhgXbYQ5C+Yo xmNcEQdsW5hF78G/CnhncmPrej6QlsL+NALSM8X4PMIwlHMIqD6tp5j33Ku2fTrvdyMJ SqlQ== X-Gm-Message-State: AJcUukeVMcY/Q3R8+d1DKirL7W9nS3J/oDotI5uZa3vFKFDwyQNy1BCu HzEKN0HtQvhgb8gHnt7GLZPJJLBOm/c= X-Received: by 2002:a6b:7402:: with SMTP id s2mr35341564iog.219.1546641112844; Fri, 04 Jan 2019 14:31:52 -0800 (PST) Received: from cloudburst.twiddle.net ([172.56.12.23]) by smtp.gmail.com with ESMTPSA id t6sm27793259ioc.87.2019.01.04.14.31.50 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 04 Jan 2019 14:31:52 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 5 Jan 2019 08:31:16 +1000 Message-Id: <20190104223116.14037-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org> References: <20190104223116.14037-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:4864:20::d42 Subject: [Qemu-devel] [PATCH v2 10/10] tcg/aarch64: Implement vector minmax arithmetic X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.h | 2 +- tcg/aarch64/tcg-target.inc.c | 24 ++++++++++++++++++++++++ 2 files changed, 25 insertions(+), 1 deletion(-) -- 2.17.2 diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h index a1884543d0..2d93cf404e 100644 --- a/tcg/aarch64/tcg-target.h +++ b/tcg/aarch64/tcg-target.h @@ -136,7 +136,7 @@ typedef enum { #define TCG_TARGET_HAS_cmp_vec 1 #define TCG_TARGET_HAS_mul_vec 1 #define TCG_TARGET_HAS_sat_vec 1 -#define TCG_TARGET_HAS_minmax_vec 0 +#define TCG_TARGET_HAS_minmax_vec 1 #define TCG_TARGET_DEFAULT_MO (0) #define TCG_TARGET_HAS_MEMORY_BSWAP 1 diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index b2b011f130..ee0d5819af 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -528,8 +528,12 @@ typedef enum { I3616_CMHI = 0x2e203400, I3616_CMHS = 0x2e203c00, I3616_CMEQ = 0x2e208c00, + I3616_SMAX = 0x0e206400, + I3616_SMIN = 0x0e206c00, I3616_SQADD = 0x0e200c00, I3616_SQSUB = 0x0e202c00, + I3616_UMAX = 0x2e206400, + I3616_UMIN = 0x2e206c00, I3616_UQADD = 0x2e200c00, I3616_UQSUB = 0x2e202c00, @@ -2153,6 +2157,18 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_ussub_vec: tcg_out_insn(s, 3616, UQSUB, is_q, vece, a0, a1, a2); break; + case INDEX_op_smax_vec: + tcg_out_insn(s, 3616, SMAX, is_q, vece, a0, a1, a2); + break; + case INDEX_op_smin_vec: + tcg_out_insn(s, 3616, SMIN, is_q, vece, a0, a1, a2); + break; + case INDEX_op_umax_vec: + tcg_out_insn(s, 3616, UMAX, is_q, vece, a0, a1, a2); + break; + case INDEX_op_umin_vec: + tcg_out_insn(s, 3616, UMIN, is_q, 
vece, a0, a1, a2); + break; case INDEX_op_not_vec: tcg_out_insn(s, 3617, NOT, is_q, 0, a0, a1); break; @@ -2227,6 +2243,10 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_sssub_vec: case INDEX_op_usadd_vec: case INDEX_op_ussub_vec: + case INDEX_op_smax_vec: + case INDEX_op_smin_vec: + case INDEX_op_umax_vec: + case INDEX_op_umin_vec: return 1; case INDEX_op_mul_vec: return vece < MO_64; @@ -2410,6 +2430,10 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_sssub_vec: case INDEX_op_usadd_vec: case INDEX_op_ussub_vec: + case INDEX_op_smax_vec: + case INDEX_op_smin_vec: + case INDEX_op_umax_vec: + case INDEX_op_umin_vec: return &w_w_w; case INDEX_op_not_vec: case INDEX_op_neg_vec: