From patchwork Thu Sep 24 01:24:46 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 272916
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 1/8] softfloat: Use mulu64 for mul64To128
Date: Wed, 23 Sep 2020 18:24:46 -0700
Message-Id: <20200924012453.659757-2-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20200924012453.659757-1-richard.henderson@linaro.org>
References: <20200924012453.659757-1-richard.henderson@linaro.org>
Cc: bharata@linux.ibm.com, alex.bennee@linaro.org, david@redhat.com

Via host-utils.h, we use a host widening multiply for
64-bit hosts, and a common subroutine for 32-bit hosts.

Signed-off-by: Richard Henderson
Reviewed-by: David Hildenbrand
---
 include/fpu/softfloat-macros.h | 24 ++++--------------------
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/include/fpu/softfloat-macros.h b/include/fpu/softfloat-macros.h
index a35ec2893a..57845f8af0 100644
--- a/include/fpu/softfloat-macros.h
+++ b/include/fpu/softfloat-macros.h
@@ -83,6 +83,7 @@ this code that are retained.
 #define FPU_SOFTFLOAT_MACROS_H
 
 #include "fpu/softfloat-types.h"
+#include "qemu/host-utils.h"
 
 /*----------------------------------------------------------------------------
 | Shifts `a' right by the number of bits given in `count'. If any nonzero
@@ -515,27 +516,10 @@ static inline void
 | `z0Ptr' and `z1Ptr'.
 *----------------------------------------------------------------------------*/
 
-static inline void mul64To128( uint64_t a, uint64_t b, uint64_t *z0Ptr, uint64_t *z1Ptr )
+static inline void
+mul64To128(uint64_t a, uint64_t b, uint64_t *z0Ptr, uint64_t *z1Ptr)
 {
-    uint32_t aHigh, aLow, bHigh, bLow;
-    uint64_t z0, zMiddleA, zMiddleB, z1;
-
-    aLow = a;
-    aHigh = a>>32;
-    bLow = b;
-    bHigh = b>>32;
-    z1 = ( (uint64_t) aLow ) * bLow;
-    zMiddleA = ( (uint64_t) aLow ) * bHigh;
-    zMiddleB = ( (uint64_t) aHigh ) * bLow;
-    z0 = ( (uint64_t) aHigh ) * bHigh;
-    zMiddleA += zMiddleB;
-    z0 += ( ( (uint64_t) ( zMiddleA < zMiddleB ) )<<32 ) + ( zMiddleA>>32 );
-    zMiddleA <<= 32;
-    z1 += zMiddleA;
-    z0 += ( z1 < zMiddleA );
-    *z1Ptr = z1;
-    *z0Ptr = z0;
-
+    mulu64(z1Ptr, z0Ptr, a, b);
 }
 
 /*----------------------------------------------------------------------------

From patchwork Thu Sep 24 01:24:47 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 272914
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 2/8] softfloat: Use int128.h for some operations
Date: Wed, 23 Sep 2020 18:24:47 -0700
Message-Id: <20200924012453.659757-3-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20200924012453.659757-1-richard.henderson@linaro.org>
References: <20200924012453.659757-1-richard.henderson@linaro.org>
Cc: bharata@linux.ibm.com, alex.bennee@linaro.org, david@redhat.com

Use our Int128, which wraps the compiler's __int128_t,
instead of open-coding left shifts and arithmetic.
We'd need to extend Int128 to have unsigned operations
to replace more than these three.

Signed-off-by: Richard Henderson
Reviewed-by: David Hildenbrand
---
 include/fpu/softfloat-macros.h | 39 +++++++++++++++++-----------------
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/include/fpu/softfloat-macros.h b/include/fpu/softfloat-macros.h
index 57845f8af0..95d88d05b8 100644
--- a/include/fpu/softfloat-macros.h
+++ b/include/fpu/softfloat-macros.h
@@ -84,6 +84,7 @@ this code that are retained.
 
 #include "fpu/softfloat-types.h"
 #include "qemu/host-utils.h"
+#include "qemu/int128.h"
 
 /*----------------------------------------------------------------------------
 | Shifts `a' right by the number of bits given in `count'. If any nonzero
@@ -352,13 +353,11 @@ static inline void shortShift128Left(uint64_t a0, uint64_t a1, int count,
 static inline void shift128Left(uint64_t a0, uint64_t a1, int count,
                                 uint64_t *z0Ptr, uint64_t *z1Ptr)
 {
-    if (count < 64) {
-        *z1Ptr = a1 << count;
-        *z0Ptr = count == 0 ? a0 : (a0 << count) | (a1 >> (-count & 63));
-    } else {
-        *z1Ptr = 0;
-        *z0Ptr = a1 << (count - 64);
-    }
+    Int128 a = int128_make128(a1, a0);
+    Int128 z = int128_lshift(a, count);
+
+    *z0Ptr = int128_gethi(z);
+    *z1Ptr = int128_getlo(z);
 }
 
 /*----------------------------------------------------------------------------
@@ -405,15 +404,15 @@ static inline void
 *----------------------------------------------------------------------------*/
 
 static inline void
- add128(
-     uint64_t a0, uint64_t a1, uint64_t b0, uint64_t b1, uint64_t *z0Ptr, uint64_t *z1Ptr )
+add128(uint64_t a0, uint64_t a1, uint64_t b0, uint64_t b1,
+       uint64_t *z0Ptr, uint64_t *z1Ptr)
 {
-    uint64_t z1;
-
-    z1 = a1 + b1;
-    *z1Ptr = z1;
-    *z0Ptr = a0 + b0 + ( z1 < a1 );
+    Int128 a = int128_make128(a1, a0);
+    Int128 b = int128_make128(b1, b0);
+    Int128 z = int128_add(a, b);
 
+    *z0Ptr = int128_gethi(z);
+    *z1Ptr = int128_getlo(z);
 }
 
 /*----------------------------------------------------------------------------
@@ -463,13 +462,15 @@ static inline void
 *----------------------------------------------------------------------------*/
 
 static inline void
- sub128(
-     uint64_t a0, uint64_t a1, uint64_t b0, uint64_t b1, uint64_t *z0Ptr, uint64_t *z1Ptr )
+sub128(uint64_t a0, uint64_t a1, uint64_t b0, uint64_t b1,
+       uint64_t *z0Ptr, uint64_t *z1Ptr)
 {
+    Int128 a = int128_make128(a1, a0);
+    Int128 b = int128_make128(b1, b0);
+    Int128 z = int128_sub(a, b);
 
-    *z1Ptr = a1 - b1;
-    *z0Ptr = a0 - b0 - ( a1 < b1 );
-
+    *z0Ptr = int128_gethi(z);
+    *z1Ptr = int128_getlo(z);
 }
 
 /*----------------------------------------------------------------------------

From patchwork Thu Sep 24 01:24:48 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 272915
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 3/8] softfloat: Tidy a * b + inf return
Date: Wed, 23 Sep 2020 18:24:48 -0700
Message-Id: <20200924012453.659757-4-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20200924012453.659757-1-richard.henderson@linaro.org>
References: <20200924012453.659757-1-richard.henderson@linaro.org>
Cc: bharata@linux.ibm.com, alex.bennee@linaro.org, david@redhat.com

No reason to set values in 'a', when we already
have float_class_inf in 'c', and can flip that sign.

Signed-off-by: Richard Henderson
Reviewed-by: David Hildenbrand
---
 fpu/softfloat.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 67cfa0fd82..9db55d2b11 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1380,9 +1380,8 @@ static FloatParts muladd_floats(FloatParts a, FloatParts b, FloatParts c,
             s->float_exception_flags |= float_flag_invalid;
             return parts_default_nan(s);
         } else {
-            a.cls = float_class_inf;
-            a.sign = c.sign ^ sign_flip;
-            return a;
+            c.sign ^= sign_flip;
+            return c;
         }
     }
 

From patchwork Thu Sep 24 01:24:49 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 304591
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 4/8] softfloat: Add float_cmask and constants
Date: Wed, 23 Sep 2020 18:24:49 -0700
Message-Id: <20200924012453.659757-5-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20200924012453.659757-1-richard.henderson@linaro.org>
References: <20200924012453.659757-1-richard.henderson@linaro.org>
Cc: bharata@linux.ibm.com, alex.bennee@linaro.org, david@redhat.com

Testing more than one class at a time is better done with masks.
This reduces the static branch count.

Signed-off-by: Richard Henderson
Reviewed-by: David Hildenbrand
---
 fpu/softfloat.c | 31 ++++++++++++++++++++++++-------
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 9db55d2b11..3e625c47cd 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -469,6 +469,20 @@ typedef enum __attribute__ ((__packed__)) {
     float_class_snan,
 } FloatClass;
 
+#define float_cmask(bit) (1u << (bit))
+
+enum {
+    float_cmask_zero = float_cmask(float_class_zero),
+    float_cmask_normal = float_cmask(float_class_normal),
+    float_cmask_inf = float_cmask(float_class_inf),
+    float_cmask_qnan = float_cmask(float_class_qnan),
+    float_cmask_snan = float_cmask(float_class_snan),
+
+    float_cmask_infzero = float_cmask_zero | float_cmask_inf,
+    float_cmask_anynan = float_cmask_qnan | float_cmask_snan,
+};
+
+
 /* Simple helpers for checking if, or what kind of, NaN we have */
 static inline __attribute__((unused)) bool is_nan(FloatClass c)
 {
@@ -1335,24 +1349,27 @@ bfloat16 QEMU_FLATTEN bfloat16_mul(bfloat16 a, bfloat16 b, float_status *status)
 static FloatParts muladd_floats(FloatParts a, FloatParts b, FloatParts c,
                                 int flags, float_status *s)
 {
-    bool inf_zero = ((1 << a.cls) | (1 << b.cls)) ==
-                    ((1 << float_class_inf) | (1 << float_class_zero));
-    bool p_sign;
+    bool inf_zero, p_sign;
     bool sign_flip = flags & float_muladd_negate_result;
     FloatClass p_class;
     uint64_t hi, lo;
     int p_exp;
+    int ab_mask, abc_mask;
+
+    ab_mask = float_cmask(a.cls) | float_cmask(b.cls);
+    abc_mask = float_cmask(c.cls) | ab_mask;
+    inf_zero = ab_mask == float_cmask_infzero;
 
     /* It is implementation-defined whether the cases of (0,inf,qnan)
      * and (inf,0,qnan) raise InvalidOperation or not (and what QNaN
      * they return if they do), so we have to hand this information
      * off to the target-specific pick-a-NaN routine.
      */
-    if (is_nan(a.cls) || is_nan(b.cls) || is_nan(c.cls)) {
+    if (unlikely(abc_mask & float_cmask_anynan)) {
         return pick_nan_muladd(a, b, c, inf_zero, s);
     }
 
-    if (inf_zero) {
+    if (unlikely(inf_zero)) {
         s->float_exception_flags |= float_flag_invalid;
         return parts_default_nan(s);
     }
@@ -1367,9 +1384,9 @@ static FloatParts muladd_floats(FloatParts a, FloatParts b, FloatParts c,
         p_sign ^= 1;
     }
 
-    if (a.cls == float_class_inf || b.cls == float_class_inf) {
+    if (ab_mask & float_cmask_inf) {
         p_class = float_class_inf;
-    } else if (a.cls == float_class_zero || b.cls == float_class_zero) {
+    } else if (ab_mask & float_cmask_zero) {
         p_class = float_class_zero;
     } else {
         p_class = float_class_normal;

From patchwork Thu Sep 24 01:24:50 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 304590
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 5/8] softfloat: Inline pick_nan_muladd into its caller
Date: Wed, 23 Sep 2020 18:24:50 -0700
Message-Id: <20200924012453.659757-6-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20200924012453.659757-1-richard.henderson@linaro.org>
References: <20200924012453.659757-1-richard.henderson@linaro.org>
That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: bharata@linux.ibm.com, alex.bennee@linaro.org, david@redhat.com Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Because of FloatParts, there will only ever be one caller. Inlining allows us to re-use abc_mask for the snan test. Signed-off-by: Richard Henderson Reviewed-by: David Hildenbrand --- fpu/softfloat.c | 75 +++++++++++++++++++++++-------------------------- 1 file changed, 35 insertions(+), 40 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 3e625c47cd..e038434a07 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -929,45 +929,6 @@ static FloatParts pick_nan(FloatParts a, FloatParts b, float_status *s) return a; } -static FloatParts pick_nan_muladd(FloatParts a, FloatParts b, FloatParts c, - bool inf_zero, float_status *s) -{ - int which; - - if (is_snan(a.cls) || is_snan(b.cls) || is_snan(c.cls)) { - s->float_exception_flags |= float_flag_invalid; - } - - which = pickNaNMulAdd(a.cls, b.cls, c.cls, inf_zero, s); - - if (s->default_nan_mode) { - /* Note that this check is after pickNaNMulAdd so that function - * has an opportunity to set the Invalid flag. 
- */ - which = 3; - } - - switch (which) { - case 0: - break; - case 1: - a = b; - break; - case 2: - a = c; - break; - case 3: - return parts_default_nan(s); - default: - g_assert_not_reached(); - } - - if (is_snan(a.cls)) { - return parts_silence_nan(a, s); - } - return a; -} - /* * Returns the result of adding or subtracting the values of the * floating-point values `a' and `b'. The operation is performed @@ -1366,7 +1327,41 @@ static FloatParts muladd_floats(FloatParts a, FloatParts b, FloatParts c, * off to the target-specific pick-a-NaN routine. */ if (unlikely(abc_mask & float_cmask_anynan)) { - return pick_nan_muladd(a, b, c, inf_zero, s); + int which; + + if (unlikely(abc_mask & float_cmask_snan)) { + float_raise(float_flag_invalid, s); + } + + which = pickNaNMulAdd(a.cls, b.cls, c.cls, inf_zero, s); + + if (s->default_nan_mode) { + /* + * Note that this check is after pickNaNMulAdd so that function + * has an opportunity to set the Invalid flag for inf_zero. + */ + which = 3; + } + + switch (which) { + case 0: + break; + case 1: + a = b; + break; + case 2: + a = c; + break; + case 3: + return parts_default_nan(s); + default: + g_assert_not_reached(); + } + + if (is_snan(a.cls)) { + return parts_silence_nan(a, s); + } + return a; } if (unlikely(inf_zero)) { From patchwork Thu Sep 24 01:24:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 304589 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 
8220BC4363D for ; Thu, 24 Sep 2020 01:29:23 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C87AB21D20 for ; Thu, 24 Sep 2020 01:29:22 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="WAB3sCo+" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C87AB21D20 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:50828 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kLG4Y-0003Fk-00 for qemu-devel@archiver.kernel.org; Wed, 23 Sep 2020 21:29:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43780) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kLG0R-0004wS-AI for qemu-devel@nongnu.org; Wed, 23 Sep 2020 21:25:07 -0400 Received: from mail-pf1-x42c.google.com ([2607:f8b0:4864:20::42c]:42676) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kLG0O-0005hz-Sg for qemu-devel@nongnu.org; Wed, 23 Sep 2020 21:25:06 -0400 Received: by mail-pf1-x42c.google.com with SMTP id d6so822775pfn.9 for ; Wed, 23 Sep 2020 18:25:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=98iXaPYRLY+m1y4PoyF25H+hGcwsxS4GdvBUZnInWc0=; b=WAB3sCo+8HKLC2N/WUmOQHgeS1BJxs6WZhSgYBvu3wUqZGm0uh9Mj3lji+gS1+6yX2 FgwTpYSqUia2OVUG23FByuOqNJJwBTVehj7FuYQc/ABsmiW+2M7f44FXzHnHIrCdtJPa 
bkJrDWUO73ea6BuHyo8BQKD+czDWD9st5121KqXOveQF1cNzUNlYhik8Zy4HhQk+Q6SY 5qaZQN2EVmT5k0psrW9AEwibYr1yP3AaZw+uZZhxMd0MBcdnpKJQGNK2Ft8aV7RCdXCq SzJg1LMi8ep7TRIAXfYfKj5zoTBSKy7jX9AYebqE9lY4LC8gXIjxAuOJLubQAreILSO8 TwPw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=98iXaPYRLY+m1y4PoyF25H+hGcwsxS4GdvBUZnInWc0=; b=VKdfrQKfX53XANnhsc+tPMTDh9MJLghTOvR8Lep+Rpo5W75eiYv1H5jHvLg9f6MH9e neA86/0gkfsx1N/uWpVJJLVqjbCN3vbATaJadPXdss8rKZzS91RH7qAQE0QCYtpa6ZAi TnzrsBntvknoCN4pQvjn67mHItNzcKClId1dDposdeSyhWAJvLmptJcqwzDxLWvKLG1n ib7iJTZF4z/DCgh4CEmtwy8Pkg5rAmc9bZ8BYbZvOBGDtwYlHmnkAbMNe/WI5CJANdhW WjadvO+3cb2XP7y8/kksl/Udxoze4Fjy4qFMrnl2eNMfZrfaAxOJ4CkRKpTQLdVykJOb lNmw== X-Gm-Message-State: AOAM532HUO2Tb8X3hlWEfmQUmlqOUj6C3evVwTZXJZVa+xESgdliKEox K8Mo6O284gVUO98895MnCW07Oh0XekPsqA== X-Google-Smtp-Source: ABdhPJzMybFSHt83rnDOLZL3l79NtHFGW9MsWyOOgpfn1knbRKQBjFNgnSbySV1mwZWG22NIgg6LQA== X-Received: by 2002:a63:fd08:: with SMTP id d8mr2015949pgh.223.1600910702658; Wed, 23 Sep 2020 18:25:02 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id k27sm938432pgm.29.2020.09.23.18.25.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Sep 2020 18:25:01 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 6/8] softfloat: Implement float128_muladd Date: Wed, 23 Sep 2020 18:24:51 -0700 Message-Id: <20200924012453.659757-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200924012453.659757-1-richard.henderson@linaro.org> References: <20200924012453.659757-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42c; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42c.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. 
Cc: bharata@linux.ibm.com, alex.bennee@linaro.org, david@redhat.com

Signed-off-by: Richard Henderson
---
 include/fpu/softfloat.h |   2 +
 fpu/softfloat.c         | 356 +++++++++++++++++++++++++++++++++++++++-
 tests/fp/fp-test.c      |   2 +-
 tests/fp/wrap.c.inc     |  12 ++
 4 files changed, 370 insertions(+), 2 deletions(-)

diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 78ad5ca738..a38433deb4 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -1196,6 +1196,8 @@ float128 float128_sub(float128, float128, float_status *status);
 float128 float128_mul(float128, float128, float_status *status);
 float128 float128_div(float128, float128, float_status *status);
 float128 float128_rem(float128, float128, float_status *status);
+float128 float128_muladd(float128, float128, float128, int,
+                         float_status *status);
 float128 float128_sqrt(float128, float_status *status);
 FloatRelation float128_compare(float128, float128, float_status *status);
 FloatRelation float128_compare_quiet(float128, float128, float_status *status);
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index e038434a07..5b714fbd82 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -512,11 +512,19 @@ static inline __attribute__((unused)) bool is_qnan(FloatClass c)
 
 typedef struct {
     uint64_t frac;
-    int32_t  exp;
+    int32_t exp;
     FloatClass cls;
     bool sign;
 } FloatParts;
 
+/* Similar for float128. */
+typedef struct {
+    uint64_t frac0, frac1;
+    int32_t exp;
+    FloatClass cls;
+    bool sign;
+} FloatParts128;
+
 #define DECOMPOSED_BINARY_POINT (64 - 2)
 #define DECOMPOSED_IMPLICIT_BIT (1ull << DECOMPOSED_BINARY_POINT)
 #define DECOMPOSED_OVERFLOW_BIT (DECOMPOSED_IMPLICIT_BIT << 1)
@@ -4574,6 +4582,46 @@ static void
 
 }
 
+/*----------------------------------------------------------------------------
+| Returns the parts of floating-point value `a'.
+*----------------------------------------------------------------------------*/
+
+static void float128_unpack(FloatParts128 *p, float128 a, float_status *status)
+{
+    p->sign = extractFloat128Sign(a);
+    p->exp = extractFloat128Exp(a);
+    p->frac0 = extractFloat128Frac0(a);
+    p->frac1 = extractFloat128Frac1(a);
+
+    if (p->exp == 0) {
+        if ((p->frac0 | p->frac1) == 0) {
+            p->cls = float_class_zero;
+        } else if (status->flush_inputs_to_zero) {
+            float_raise(float_flag_input_denormal, status);
+            p->cls = float_class_zero;
+            p->frac0 = p->frac1 = 0;
+        } else {
+            normalizeFloat128Subnormal(p->frac0, p->frac1, &p->exp,
+                                       &p->frac0, &p->frac1);
+            p->exp -= 0x3fff;
+            p->cls = float_class_normal;
+        }
+    } else if (p->exp == 0x7fff) {
+        if ((p->frac0 | p->frac1) == 0) {
+            p->cls = float_class_inf;
+        } else if (float128_is_signaling_nan(a, status)) {
+            p->cls = float_class_snan;
+        } else {
+            p->cls = float_class_qnan;
+        }
+    } else {
+        /* Add the implicit bit. */
+        p->frac0 |= UINT64_C(0x0001000000000000);
+        p->exp -= 0x3fff;
+        p->cls = float_class_normal;
+    }
+}
+
 /*----------------------------------------------------------------------------
 | Packs the sign `zSign', the exponent `zExp', and the significand formed
 | by the concatenation of `zSig0' and `zSig1' into a quadruple-precision
@@ -7205,6 +7253,312 @@ float128 float128_mul(float128 a, float128 b, float_status *status)
 
 }
 
+static void shortShift256Left(uint64_t p[4], unsigned count)
+{
+    int negcount = -count & 63;
+
+    if (count == 0) {
+        return;
+    }
+    g_assert(count < 64);
+    p[0] = (p[0] << count) | (p[1] >> negcount);
+    p[1] = (p[1] << count) | (p[2] >> negcount);
+    p[2] = (p[2] << count) | (p[3] >> negcount);
+    p[3] = (p[3] << count);
+}
+
+static void shift256RightJamming(uint64_t p[4], int count)
+{
+    uint64_t in = 0;
+
+    g_assert(count >= 0);
+
+    count = MIN(count, 256);
+    for (; count >= 64; count -= 64) {
+        in |= p[3];
+        p[3] = p[2];
+        p[2] = p[1];
+        p[1] = p[0];
+        p[0] = 0;
+    }
+
+    if (count) {
+        int negcount = -count & 63;
+
+        in |= p[3] << negcount;
+        p[3] = (p[2] << negcount) | (p[3] >> count);
+        p[2] = (p[1] << negcount) | (p[2] >> count);
+        p[1] = (p[0] << negcount) | (p[1] >> count);
+        p[0] = p[0] >> count;
+    }
+    p[3] |= (in != 0);
+}
+
+/* R = A - B */
+static void sub256(uint64_t r[4], uint64_t a[4], uint64_t b[4])
+{
+    bool borrow = false;
+
+    for (int i = 3; i >= 0; --i) {
+        if (borrow) {
+            borrow = a[i] <= b[i];
+            r[i] = a[i] - b[i] - 1;
+        } else {
+            borrow = a[i] < b[i];
+            r[i] = a[i] - b[i];
+        }
+    }
+}
+
+/* A = -A */
+static void neg256(uint64_t a[4])
+{
+    a[3] = -a[3];
+    if (likely(a[3])) {
+        goto not2;
+    }
+    a[2] = -a[2];
+    if (likely(a[2])) {
+        goto not1;
+    }
+    a[1] = -a[1];
+    if (likely(a[1])) {
+        goto not0;
+    }
+    a[0] = -a[0];
+    return;
+ not2:
+    a[2] = ~a[2];
+ not1:
+    a[1] = ~a[1];
+ not0:
+    a[0] = ~a[0];
+}
+
+/* A += B */
+static void add256(uint64_t a[4], uint64_t b[4])
+{
+    bool carry = false;
+
+    for (int i = 3; i >= 0; --i) {
+        uint64_t t = a[i] + b[i];
+        if (carry) {
+            t += 1;
+            carry = t <= a[i];
+        } else {
+            carry = t < a[i];
+        }
+        a[i] = t;
+    }
+}
+
+float128 float128_muladd(float128 a_f, float128 b_f, float128 c_f,
+                         int flags, float_status *status)
+{
+    bool inf_zero, p_sign, sign_flip;
+    uint64_t p_frac[4];
+    FloatParts128 a, b, c;
+    int p_exp, exp_diff, shift, ab_mask, abc_mask;
+    FloatClass p_cls;
+
+    float128_unpack(&a, a_f, status);
+    float128_unpack(&b, b_f, status);
+    float128_unpack(&c, c_f, status);
+
+    ab_mask = float_cmask(a.cls) | float_cmask(b.cls);
+    abc_mask = float_cmask(c.cls) | ab_mask;
+    inf_zero = ab_mask == float_cmask_infzero;
+
+    /* If any input is a NaN, select the required result. */
+    if (unlikely(abc_mask & float_cmask_anynan)) {
+        if (unlikely(abc_mask & float_cmask_snan)) {
+            float_raise(float_flag_invalid, status);
+        }
+
+        int which = pickNaNMulAdd(a.cls, b.cls, c.cls, inf_zero, status);
+        if (status->default_nan_mode) {
+            which = 3;
+        }
+        switch (which) {
+        case 0:
+            break;
+        case 1:
+            a_f = b_f;
+            a.cls = b.cls;
+            break;
+        case 2:
+            a_f = c_f;
+            a.cls = c.cls;
+            break;
+        case 3:
+            return float128_default_nan(status);
+        }
+        if (is_snan(a.cls)) {
+            return float128_silence_nan(a_f, status);
+        }
+        return a_f;
+    }
+
+    /* After dealing with input NaNs, look for Inf * Zero. */
+    if (unlikely(inf_zero)) {
+        float_raise(float_flag_invalid, status);
+        return float128_default_nan(status);
+    }
+
+    p_sign = a.sign ^ b.sign;
+
+    if (flags & float_muladd_negate_c) {
+        c.sign ^= 1;
+    }
+    if (flags & float_muladd_negate_product) {
+        p_sign ^= 1;
+    }
+    sign_flip = (flags & float_muladd_negate_result);
+
+    if (ab_mask & float_cmask_inf) {
+        p_cls = float_class_inf;
+    } else if (ab_mask & float_cmask_zero) {
+        p_cls = float_class_zero;
+    } else {
+        p_cls = float_class_normal;
+    }
+
+    if (c.cls == float_class_inf) {
+        if (p_cls == float_class_inf && p_sign != c.sign) {
+            /* +Inf + -Inf = NaN */
+            float_raise(float_flag_invalid, status);
+            return float128_default_nan(status);
+        }
+        /* Inf + Inf = Inf of the proper sign; reuse the return below. */
+        p_cls = float_class_inf;
+        p_sign = c.sign;
+    }
+
+    if (p_cls == float_class_inf) {
+        return packFloat128(p_sign ^ sign_flip, 0x7fff, 0, 0);
+    }
+
+    if (p_cls == float_class_zero) {
+        if (c.cls == float_class_zero) {
+            if (p_sign != c.sign) {
+                p_sign = status->float_rounding_mode == float_round_down;
+            }
+            return packFloat128(p_sign ^ sign_flip, 0, 0, 0);
+        }
+
+        if (flags & float_muladd_halve_result) {
+            c.exp -= 1;
+        }
+        return roundAndPackFloat128(c.sign ^ sign_flip,
+                                    c.exp + 0x3fff - 1,
+                                    c.frac0, c.frac1, 0, status);
+    }
+
+    /* a & b should be normals now... */
+    assert(a.cls == float_class_normal && b.cls == float_class_normal);
+
+    /* Multiply of 2 113-bit numbers produces a 226-bit result. */
+    mul128To256(a.frac0, a.frac1, b.frac0, b.frac1,
+                &p_frac[0], &p_frac[1], &p_frac[2], &p_frac[3]);
+
+    /* Realign the binary point at bit 48 of p_frac[0]. */
+    shift = clz64(p_frac[0]) - 15;
+    g_assert(shift == 15 || shift == 16);
+    shortShift256Left(p_frac, shift);
+    p_exp = a.exp + b.exp - (shift - 16);
+    exp_diff = p_exp - c.exp;
+
+    uint64_t c_frac[4] = { c.frac0, c.frac1, 0, 0 };
+
+    /* Add or subtract C from the intermediate product. */
+    if (c.cls == float_class_zero) {
+        /* Fall through to rounding after addition (with zero). */
+    } else if (p_sign != c.sign) {
+        /* Subtraction */
+        if (exp_diff < 0) {
+            shift256RightJamming(p_frac, -exp_diff);
+            sub256(p_frac, c_frac, p_frac);
+            p_exp = c.exp;
+            p_sign ^= 1;
+        } else if (exp_diff > 0) {
+            shift256RightJamming(c_frac, exp_diff);
+            sub256(p_frac, p_frac, c_frac);
+        } else {
+            /* Low 128 bits of C are known to be zero. */
+            sub128(p_frac[0], p_frac[1], c_frac[0], c_frac[1],
+                   &p_frac[0], &p_frac[1]);
+            /*
+             * Since we have normalized to bit 48 of p_frac[0],
+             * a negative result means C > P and we need to invert.
+             */
+            if ((int64_t)p_frac[0] < 0) {
+                neg256(p_frac);
+                p_sign ^= 1;
+            }
+        }
+
+        /*
+         * Gross normalization of the 256-bit subtraction result.
+         * Fine tuning below shared with addition.
+         */
+        if (p_frac[0] != 0) {
+            /* nothing to do */
+        } else if (p_frac[1] != 0) {
+            p_exp -= 64;
+            p_frac[0] = p_frac[1];
+            p_frac[1] = p_frac[2];
+            p_frac[2] = p_frac[3];
+            p_frac[3] = 0;
+        } else if (p_frac[2] != 0) {
+            p_exp -= 128;
+            p_frac[0] = p_frac[2];
+            p_frac[1] = p_frac[3];
+            p_frac[2] = 0;
+            p_frac[3] = 0;
+        } else if (p_frac[3] != 0) {
+            p_exp -= 192;
+            p_frac[0] = p_frac[3];
+            p_frac[1] = 0;
+            p_frac[2] = 0;
+            p_frac[3] = 0;
+        } else {
+            /* Subtraction was exact: result is zero. */
+            p_sign = status->float_rounding_mode == float_round_down;
+            return packFloat128(p_sign ^ sign_flip, 0, 0, 0);
+        }
+    } else {
+        /* Addition */
+        if (exp_diff <= 0) {
+            shift256RightJamming(p_frac, -exp_diff);
+            /* Low 128 bits of C are known to be zero. */
+            add128(p_frac[0], p_frac[1], c_frac[0], c_frac[1],
+                   &p_frac[0], &p_frac[1]);
+            p_exp = c.exp;
+        } else {
+            shift256RightJamming(c_frac, exp_diff);
+            add256(p_frac, c_frac);
+        }
+    }
+
+    /* Fine normalization of the 256-bit result: p_frac[0] != 0. */
+    shift = clz64(p_frac[0]) - 15;
+    if (shift < 0) {
+        shift256RightJamming(p_frac, -shift);
+    } else if (shift > 0) {
+        shortShift256Left(p_frac, shift);
+    }
+    p_exp -= shift;
+
+    if (flags & float_muladd_halve_result) {
+        p_exp -= 1;
+    }
+    return roundAndPackFloat128(p_sign ^ sign_flip,
+                                p_exp + 0x3fff - 1,
+                                p_frac[0], p_frac[1],
+                                p_frac[2] | (p_frac[3] != 0),
+                                status);
+}
+
 /*----------------------------------------------------------------------------
 | Returns the result of dividing the quadruple-precision floating-point value
 | `a' by the corresponding value `b'. The operation is performed according to
diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
index 06ffebd6db..9bbb0dba67 100644
--- a/tests/fp/fp-test.c
+++ b/tests/fp/fp-test.c
@@ -717,7 +717,7 @@ static void do_testfloat(int op, int rmode, bool exact)
         test_abz_f128(true_abz_f128M, subj_abz_f128M);
         break;
     case F128_MULADD:
-        not_implemented();
+        test_abcz_f128(slow_f128M_mulAdd, qemu_f128_mulAdd);
         break;
     case F128_SQRT:
         test_az_f128(slow_f128M_sqrt, qemu_f128M_sqrt);
diff --git a/tests/fp/wrap.c.inc b/tests/fp/wrap.c.inc
index 0cbd20013e..65a713deae 100644
--- a/tests/fp/wrap.c.inc
+++ b/tests/fp/wrap.c.inc
@@ -574,6 +574,18 @@ WRAP_MULADD(qemu_f32_mulAdd, float32_muladd, float32)
 WRAP_MULADD(qemu_f64_mulAdd, float64_muladd, float64)
 #undef WRAP_MULADD
 
+static void qemu_f128_mulAdd(const float128_t *ap, const float128_t *bp,
+                             const float128_t *cp, float128_t *res)
+{
+    float128 a, b, c, ret;
+
+    a = soft_to_qemu128(*ap);
+    b = soft_to_qemu128(*bp);
+    c = soft_to_qemu128(*cp);
+    ret = float128_muladd(a, b, c, 0, &qsf);
+    *res = qemu_to_soft128(ret);
+}
+
 #define WRAP_CMP16(name, func, retcond) \
 static bool name(float16_t a, float16_t b) \
 { \

From patchwork Thu Sep 24 01:24:52 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 272912
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 7/8] softfloat: Use x86_64 assembly for {add,sub}{192,256}
Date: Wed, 23 Sep 2020 18:24:52 -0700
Message-Id: <20200924012453.659757-8-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20200924012453.659757-1-richard.henderson@linaro.org>
References: <20200924012453.659757-1-richard.henderson@linaro.org>
Cc: bharata@linux.ibm.com, alex.bennee@linaro.org, david@redhat.com

The compiler cannot chain more than two additions together.
Use inline assembly for 3 or 4 additions.

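For comparison, the carry chain can be sketched in portable C, where every carry has to be recovered with an unsigned comparison instead of being kept in the flags register as the add/adc sequence does. This is only an illustration under stated assumptions: add192_portable is a hypothetical helper, not a function in QEMU, and it takes arrays rather than the separate a0/a1/a2 arguments that add192() uses.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): z = a + b on 192-bit values.
 * Index 0 is the most significant word, matching the a0/a1/a2 naming
 * of add192() in softfloat-macros.h.
 */
static void add192_portable(const uint64_t a[3], const uint64_t b[3],
                            uint64_t z[3])
{
    uint64_t carry = 0;

    for (int i = 2; i >= 0; --i) {
        uint64_t t = a[i] + b[i];   /* may wrap around */
        uint64_t c1 = t < a[i];     /* carry out of a[i] + b[i] */
        uint64_t s = t + carry;     /* add the incoming carry */
        uint64_t c2 = s < t;        /* carry out of the carry-in add */

        z[i] = s;
        carry = c1 | c2;            /* the two carries cannot both be set */
    }
}
```

Each word costs two additions and two comparisons here, whereas the assembly versions express the same step as a single add-with-carry instruction per word.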
Signed-off-by: Richard Henderson
---
 include/fpu/softfloat-macros.h | 18 ++++++++++++++++--
 fpu/softfloat.c                | 28 ++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/include/fpu/softfloat-macros.h b/include/fpu/softfloat-macros.h
index 95d88d05b8..99fa124e56 100644
--- a/include/fpu/softfloat-macros.h
+++ b/include/fpu/softfloat-macros.h
@@ -436,6 +436,13 @@ static inline void
      uint64_t *z2Ptr
  )
 {
+#ifdef __x86_64__
+    asm("add %5, %2\n\t"
+        "adc %4, %1\n\t"
+        "adc %3, %0"
+        : "=&r"(*z0Ptr), "=&r"(*z1Ptr), "=&r"(*z2Ptr)
+        : "rm"(b0), "rm"(b1), "rm"(b2), "0"(a0), "1"(a1), "2"(a2));
+#else
     uint64_t z0, z1, z2;
     int8_t carry0, carry1;
 
@@ -450,7 +457,7 @@ static inline void
     *z2Ptr = z2;
     *z1Ptr = z1;
     *z0Ptr = z0;
-
+#endif
 }
 
 /*----------------------------------------------------------------------------
@@ -494,6 +501,13 @@ static inline void
      uint64_t *z2Ptr
  )
 {
+#ifdef __x86_64__
+    asm("sub %5, %2\n\t"
+        "sbb %4, %1\n\t"
+        "sbb %3, %0"
+        : "=&r"(*z0Ptr), "=&r"(*z1Ptr), "=&r"(*z2Ptr)
+        : "rm"(b0), "rm"(b1), "rm"(b2), "0"(a0), "1"(a1), "2"(a2));
+#else
     uint64_t z0, z1, z2;
     int8_t borrow0, borrow1;
 
@@ -508,7 +522,7 @@ static inline void
     *z2Ptr = z2;
     *z1Ptr = z1;
     *z0Ptr = z0;
-
+#endif
 }
 
 /*----------------------------------------------------------------------------
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 5b714fbd82..d8e5d90fd7 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -7297,6 +7297,15 @@ static void shift256RightJamming(uint64_t p[4], int count)
 /* R = A - B */
 static void sub256(uint64_t r[4], uint64_t a[4], uint64_t b[4])
 {
+#if defined(__x86_64__)
+    asm("sub %7, %3\n\t"
+        "sbb %6, %2\n\t"
+        "sbb %5, %1\n\t"
+        "sbb %4, %0"
+        : "=&r"(r[0]), "=&r"(r[1]), "=&r"(r[2]), "=&r"(r[3])
+        : "rme"(b[0]), "rme"(b[1]), "rme"(b[2]), "rme"(b[3]),
+          "0"(a[0]), "1"(a[1]), "2"(a[2]), "3"(a[3]));
+#else
     bool borrow = false;
 
     for (int i = 3; i >= 0; --i) {
@@ -7308,11 +7317,20 @@ static void sub256(uint64_t r[4], uint64_t a[4], uint64_t b[4])
             r[i] = a[i] - b[i];
         }
     }
+#endif
 }
 
 /* A = -A */
 static void neg256(uint64_t a[4])
 {
+#if defined(__x86_64__)
+    asm("negq %3\n\t"
+        "sbb %6, %2\n\t"
+        "sbb %5, %1\n\t"
+        "sbb %4, %0"
+        : "=&r"(a[0]), "=&r"(a[1]), "=&r"(a[2]), "+rm"(a[3])
+        : "rme"(a[0]), "rme"(a[1]), "rme"(a[2]), "0"(0), "1"(0), "2"(0));
+#else
     a[3] = -a[3];
     if (likely(a[3])) {
         goto not2;
@@ -7333,11 +7351,20 @@ static void neg256(uint64_t a[4])
     a[1] = ~a[1];
  not0:
     a[0] = ~a[0];
+#endif
 }
 
 /* A += B */
 static void add256(uint64_t a[4], uint64_t b[4])
 {
+#if defined(__x86_64__)
+    asm("add %7, %3\n\t"
+        "adc %6, %2\n\t"
+        "adc %5, %1\n\t"
+        "adc %4, %0"
+        : "+r"(a[0]), "+r"(a[1]), "+r"(a[2]), "+r"(a[3])
+        : "rme"(b[0]), "rme"(b[1]), "rme"(b[2]), "rme"(b[3]));
+#else
     bool carry = false;
 
     for (int i = 3; i >= 0; --i) {
@@ -7350,6 +7377,7 @@ static void add256(uint64_t a[4], uint64_t b[4])
         }
         a[i] = t;
     }
+#endif
 }
 
 float128 float128_muladd(float128 a_f, float128 b_f, float128 c_f,

From patchwork Thu Sep 24 01:24:53 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 272913
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 8/8] softfloat: Use aarch64 assembly for {add,sub}{192,256}
Date: Wed, 23 Sep 2020 18:24:53 -0700
Message-Id: <20200924012453.659757-9-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20200924012453.659757-1-richard.henderson@linaro.org>
References: <20200924012453.659757-1-richard.henderson@linaro.org>
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1,
 RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001
 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: bharata@linux.ibm.com, alex.bennee@linaro.org, david@redhat.com
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: "Qemu-devel"

The compiler cannot chain more than two additions together.
Use inline assembly for 3 or 4 additions.

Signed-off-by: Richard Henderson
---
 include/fpu/softfloat-macros.h | 14 ++++++++++++++
 fpu/softfloat.c                | 25 +++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/include/fpu/softfloat-macros.h b/include/fpu/softfloat-macros.h
index 99fa124e56..969a486fd2 100644
--- a/include/fpu/softfloat-macros.h
+++ b/include/fpu/softfloat-macros.h
@@ -442,6 +442,13 @@ static inline void
         "adc %3, %0"
         : "=&r"(*z0Ptr), "=&r"(*z1Ptr), "=&r"(*z2Ptr)
         : "rm"(b0), "rm"(b1), "rm"(b2), "0"(a0), "1"(a1), "2"(a2));
+#elif defined(__aarch64__)
+    asm("adds %2, %x5, %x8\n\t"
+        "adcs %1, %x4, %x7\n\t"
+        "adc %0, %x3, %x6"
+        : "=&r"(*z0Ptr), "=&r"(*z1Ptr), "=&r"(*z2Ptr)
+        : "rZ"(a0), "rZ"(a1), "rZ"(a2), "rZ"(b0), "rZ"(b1), "rZ"(b2)
+        : "cc");
 #else
     uint64_t z0, z1, z2;
     int8_t carry0, carry1;
@@ -507,6 +514,13 @@ static inline void
         "sbb %3, %0"
         : "=&r"(*z0Ptr), "=&r"(*z1Ptr), "=&r"(*z2Ptr)
         : "rm"(b0), "rm"(b1), "rm"(b2), "0"(a0), "1"(a1), "2"(a2));
+#elif defined(__aarch64__)
+    asm("subs %2, %x5, %x8\n\t"
+        "sbcs %1, %x4, %x7\n\t"
+        "sbc %0, %x3, %x6"
+        : "=&r"(*z0Ptr), "=&r"(*z1Ptr), "=&r"(*z2Ptr)
+        : "rZ"(a0), "rZ"(a1), "rZ"(a2), "rZ"(b0), "rZ"(b1), "rZ"(b2)
+        : "cc");
 #else
     uint64_t z0, z1, z2;
     int8_t borrow0, borrow1;
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index d8e5d90fd7..1601095d60 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -7305,6 +7305,16 @@ static void sub256(uint64_t r[4], uint64_t a[4], uint64_t b[4])
         : "=&r"(r[0]), "=&r"(r[1]), "=&r"(r[2]), "=&r"(r[3])
         : "rme"(b[0]), "rme"(b[1]), "rme"(b[2]), "rme"(b[3]),
           "0"(a[0]), "1"(a[1]), "2"(a[2]), "3"(a[3]));
+#elif defined(__aarch64__)
+    asm("subs %[r3], %x[a3], %x[b3]\n\t"
+        "sbcs %[r2], %x[a2], %x[b2]\n\t"
+        "sbcs %[r1], %x[a1], %x[b1]\n\t"
+        "sbc %[r0], %x[a0], %x[b0]"
+        : [r0] "=&r"(r[0]), [r1] "=&r"(r[1]),
+          [r2] "=&r"(r[2]), [r3] "=&r"(r[3])
+        : [a0] "rZ"(a[0]), [a1] "rZ"(a[1]), [a2] "rZ"(a[2]), [a3] "rZ"(a[3]),
+          [b0] "rZ"(b[0]), [b1] "rZ"(b[1]), [b2] "rZ"(b[2]), [b3] "rZ"(b[3])
+        : "cc");
 #else
     bool borrow = false;

@@ -7330,6 +7340,13 @@ static void neg256(uint64_t a[4])
         "sbb %4, %0"
         : "=&r"(a[0]), "=&r"(a[1]), "=&r"(a[2]), "+rm"(a[3])
         : "rme"(a[0]), "rme"(a[1]), "rme"(a[2]), "0"(0), "1"(0), "2"(0));
+#elif defined(__aarch64__)
+    asm("negs %3, %3\n\t"
+        "ngcs %2, %2\n\t"
+        "ngcs %1, %1\n\t"
+        "ngc %0, %0"
+        : "+r"(a[0]), "+r"(a[1]), "+r"(a[2]), "+r"(a[3])
+        : : "cc");
 #else
     a[3] = -a[3];
     if (likely(a[3])) {
@@ -7364,6 +7381,14 @@ static void add256(uint64_t a[4], uint64_t b[4])
         "adc %4, %0"
         : "+r"(a[0]), "+r"(a[1]), "+r"(a[2]), "+r"(a[3])
         : "rme"(b[0]), "rme"(b[1]), "rme"(b[2]), "rme"(b[3]));
+#elif defined(__aarch64__)
+    asm("adds %3, %3, %x7\n\t"
+        "adcs %2, %2, %x6\n\t"
+        "adcs %1, %1, %x5\n\t"
+        "adc %0, %0, %x4"
+        : "+r"(a[0]), "+r"(a[1]), "+r"(a[2]), "+r"(a[3])
+        : "rZ"(b[0]), "rZ"(b[1]), "rZ"(b[2]), "rZ"(b[3])
+        : "cc");
 #else
     bool carry = false;
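
For readers who want to check the carry chain by hand, here is a minimal sketch (not part of the patch) of what the three-limb addition computes. The names add192_ref, add192_chain and add192_selfcheck are hypothetical and exist only for this illustration. Limb 0 is assumed to be the most significant, which matches the operand order in the add192 asm above, where the low limbs (%x5 and %x8) are added first. On aarch64 the sketch reuses the adds/adcs/adc sequence from the patch; on any other host it simply falls back to the portable reference, so the comparison is only meaningful on aarch64.

```c
#include <assert.h>
#include <stdint.h>

/* Portable reference: (z0:z1:z2) = (a0:a1:a2) + (b0:b1:b2), limb 0 most significant. */
static inline void add192_ref(uint64_t a0, uint64_t a1, uint64_t a2,
                              uint64_t b0, uint64_t b1, uint64_t b2,
                              uint64_t *z0, uint64_t *z1, uint64_t *z2)
{
    uint64_t s2 = a2 + b2;
    uint64_t c2 = s2 < a2;      /* carry out of the low limb */
    uint64_t t1 = a1 + b1;
    uint64_t c1 = t1 < a1;      /* carry from a1 + b1 */
    uint64_t s1 = t1 + c2;
    c1 |= s1 < t1;              /* carry from adding the incoming carry */

    *z0 = a0 + b0 + c1;         /* carry out of the top limb is discarded */
    *z1 = s1;
    *z2 = s2;
}

/* Same operation using the aarch64 sequence from the patch where available. */
static inline void add192_chain(uint64_t a0, uint64_t a1, uint64_t a2,
                                uint64_t b0, uint64_t b1, uint64_t b2,
                                uint64_t *z0, uint64_t *z1, uint64_t *z2)
{
#if defined(__aarch64__)
    /* __asm__ rather than asm so the sketch also builds with -std=c99. */
    __asm__("adds %2, %x5, %x8\n\t"
            "adcs %1, %x4, %x7\n\t"
            "adc %0, %x3, %x6"
            : "=&r"(*z0), "=&r"(*z1), "=&r"(*z2)
            : "rZ"(a0), "rZ"(a1), "rZ"(a2), "rZ"(b0), "rZ"(b1), "rZ"(b2)
            : "cc");
#else
    /* Assumption: on other hosts there is nothing to compare, use the reference. */
    add192_ref(a0, a1, a2, b0, b1, b2, z0, z1, z2);
#endif
}

/* Check one input where the carry ripples through both lower limbs. */
static inline void add192_selfcheck(void)
{
    uint64_t r0, r1, r2, c0, c1, c2;

    add192_ref(0, UINT64_MAX, UINT64_MAX, 0, 0, 1, &r0, &r1, &r2);
    add192_chain(0, UINT64_MAX, UINT64_MAX, 0, 0, 1, &c0, &c1, &c2);
    assert(r0 == 1 && r1 == 0 && r2 == 0);
    assert(c0 == r0 && c1 == r1 && c2 == r2);
}
```

Calling add192_selfcheck() from a test harness exercises the case where a carry out of the low limb must propagate through an all-ones middle limb, which is the case a two-addition chain cannot express.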