From patchwork Tue Jan 8 16:21:51 2019
X-Patchwork-Submitter: Alex Bennée
X-Patchwork-Id: 154986
From: Alex Bennée
To: qemu-devel@nongnu.org
Date: Tue, 8 Jan 2019 16:21:51 +0000
Message-Id: <20190108162154.22259-4-alex.bennee@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190108162154.22259-1-alex.bennee@linaro.org>
References: <20190108162154.22259-1-alex.bennee@linaro.org>
Subject: [Qemu-devel] [PATCH v1 3/6] softfloat: enforce softfloat if the host's FMA is broken
Cc: Peter Maydell, cota@braap.org, Alex Bennée, Aurelien Jarno

From: "Emilio G. Cota"

The added branch to the FMA ops is marked as unlikely and therefore
its impact on performance (measured with fp-bench) is within noise range
when measured on an Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz.

In addition, when the host doesn't have a hardware FMA instruction
we force the use of softfloat, since whatever the libc does (e.g. checking
the host's FP flags) is unlikely to be faster than our softfloat
implementation. For instance, on an i386 machine with no hardware
support for FMA, we get:

  $ for precision in single double; do
        ./fp-bench -o mulAdd -p $precision
    done

- before:
5.07 MFlops
1.85 MFlops

- after:
12.65 MFlops
10.05 MFlops

Reported-by: Laurent Desnogues
Suggested-by: Richard Henderson
Signed-off-by: Emilio G. Cota
---
 fpu/softfloat.c      | 85 ++++++++++++++++++++++++++++++++++++++++++++
 include/qemu/cpuid.h |  6 ++++
 2 files changed, 91 insertions(+)

-- 
2.17.1

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 59eac97d10..ccaed85b0f 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1542,6 +1542,8 @@ soft_f64_muladd(float64 a, float64 b, float64 c, int flags,
     return float64_round_pack_canonical(pr, status);
 }
 
+static bool force_soft_fma;
+
 float32 QEMU_FLATTEN
 float32_muladd(float32 xa, float32 xb, float32 xc, int flags, float_status *s)
 {
@@ -1562,6 +1564,11 @@ float32_muladd(float32 xa, float32 xb, float32 xc, int flags, float_status *s)
     if (unlikely(!f32_is_zon3(ua, ub, uc))) {
         goto soft;
     }
+
+    if (unlikely(force_soft_fma)) {
+        goto soft;
+    }
+
     /*
      * When (a || b) == 0, there's no need to check for under/over flow,
      * since we know the addend is (normal || 0) and the product is 0.
@@ -1623,6 +1630,11 @@ float64_muladd(float64 xa, float64 xb, float64 xc, int flags, float_status *s)
     if (unlikely(!f64_is_zon3(ua, ub, uc))) {
         goto soft;
     }
+
+    if (unlikely(force_soft_fma)) {
+        goto soft;
+    }
+
     /*
      * When (a || b) == 0, there's no need to check for under/over flow,
      * since we know the addend is (normal || 0) and the product is 0.
@@ -7974,3 +7986,76 @@ float128 float128_scalbn(float128 a, int n, float_status *status)
                                          , status);
 
 }
+
+#ifdef CONFIG_CPUID_H
+#include "qemu/cpuid.h"
+#endif
+
+static void check_host_hw_fma(void)
+{
+#ifdef CONFIG_CPUID_H
+    int max = __get_cpuid_max(0, NULL);
+    int a, b, c, d;
+    bool has_fma3 = false;
+    bool has_fma4 = false;
+    bool has_avx = false;
+
+    if (max >= 1) {
+        __cpuid(1, a, b, c, d);
+
+        /* check whether avx is usable */
+        if (c & bit_OSXSAVE) {
+            int bv;
+
+            __asm("xgetbv" : "=a"(bv), "=d"(d) : "c"(0));
+            if ((bv & 6) == 6) {
+                has_avx = c & bit_AVX;
+            }
+        }
+
+        if (has_avx) {
+            /* fma3 */
+            has_fma3 = c & bit_FMA3;
+
+            /* fma4 */
+            __cpuid(0x80000000, a, b, c, d);
+            if (a >= 0x80000001) {
+                __cpuid(0x80000001, a, b, c, d);
+
+                has_fma4 = c & bit_FMA4;
+            }
+        }
+    }
+    /*
+     * Without HW FMA, whatever the libc does is probably slower than our
+     * softfloat implementation.
+     */
+    if (!has_fma3 && !has_fma4) {
+        force_soft_fma = true;
+    }
+#endif
+}
+
+static void __attribute__((constructor)) softfloat_init(void)
+{
+    union_float64 ua, ub, uc, ur;
+
+    if (QEMU_NO_HARDFLOAT) {
+        return;
+    }
+
+    /*
+     * Test that the host's FMA is not obviously broken. For example,
+     * glibc < 2.23 can perform an incorrect FMA on certain hosts; see
+     *   https://sourceware.org/bugzilla/show_bug.cgi?id=13304
+     */
+    ua.s = 0x0020000000000001ULL;
+    ub.s = 0x3ca0000000000000ULL;
+    uc.s = 0x0020000000000000ULL;
+    ur.h = fma(ua.h, ub.h, uc.h);
+    if (ur.s != 0x0020000000000001ULL) {
+        force_soft_fma = true;
+    }
+
+    check_host_hw_fma();
+}
diff --git a/include/qemu/cpuid.h b/include/qemu/cpuid.h
index 69301700bd..320926ffe0 100644
--- a/include/qemu/cpuid.h
+++ b/include/qemu/cpuid.h
@@ -25,6 +25,9 @@
 #endif
 
 /* Leaf 1, %ecx */
+#ifndef bit_FMA3
+#define bit_FMA3        (1 << 12)
+#endif
 #ifndef bit_SSE4_1
 #define bit_SSE4_1      (1 << 19)
 #endif
@@ -53,5 +56,8 @@
 #ifndef bit_LZCNT
 #define bit_LZCNT       (1 << 5)
 #endif
+#ifndef bit_FMA4
+#define bit_FMA4        (1 << 16)
+#endif
 
 #endif /* QEMU_CPUID_H */
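
Reviewer note on the test vector in softfloat_init(): the three constants
appear to be chosen so that the exact product lands just above half an ulp of
the addend, which is where a fused and a non-fused multiply-add disagree. The
sketch below is not part of the patch. It decodes the raw binary64 bit
patterns with integer arithmetic only, and the helper names (f64_unbiased_exp,
f64_frac, f64_ulp_exp, product_just_over_half_ulp) are hypothetical,
introduced here purely for illustration. It assumes all operands are normal
and that b is a power of two.

```c
#include <stdint.h>

/* Unbiased exponent of a normal binary64 value given as raw bits. */
static int f64_unbiased_exp(uint64_t bits)
{
    return (int)((bits >> 52) & 0x7ff) - 1023;
}

/* 52-bit fraction field (the implicit leading 1 is not included). */
static uint64_t f64_frac(uint64_t bits)
{
    return bits & ((UINT64_C(1) << 52) - 1);
}

/* Exponent of one unit in the last place of a normal binary64 value. */
static int f64_ulp_exp(uint64_t bits)
{
    return f64_unbiased_exp(bits) - 52;
}

/*
 * Returns 1 when a * b lies strictly between half an ulp and one ulp of c.
 * Simplifying assumptions (not checked): a, b and c are normal, and the
 * caller only passes a power of two for b.
 */
static int product_just_over_half_ulp(uint64_t a, uint64_t b, uint64_t c)
{
    int prod_exp = f64_unbiased_exp(a) + f64_unbiased_exp(b);

    return f64_frac(b) == 0 &&              /* b is a power of two */
           f64_frac(a) != 0 &&              /* so a * b > 2^prod_exp */
           prod_exp == f64_ulp_exp(c) - 1;  /* 2^prod_exp is half an ulp of c */
}
```

For the patch's operands, a = (1 + 2^-52) * 2^-1021, b = 2^-53 and
c = 2^-1021, so product_just_over_half_ulp(ua.s, ub.s, uc.s) returns 1. A
correctly rounded fma() therefore rounds up to c plus one ulp, which is the
expected 0x0020000000000001. An implementation that rounds the product first
gets exactly 2^-1074, hits a tie when adding it to c, and rounds to even,
yielding 0x0020000000000000 instead.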