From patchwork Mon Dec 17 10:56:48 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Alex Bennée
X-Patchwork-Id: 153995
From: Alex Bennée
To: peter.maydell@linaro.org
Date: Mon, 17 Dec 2018 10:56:48 +0000
Message-Id: <20181217105650.27361-14-alex.bennee@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181217105650.27361-1-alex.bennee@linaro.org>
References: <20181217105650.27361-1-alex.bennee@linaro.org>
Subject: [Qemu-devel] [PULL v3 13/15] hardfloat: implement float32/64 fused multiply-add
Cc: "Emilio G. Cota", Alex Bennée, qemu-devel@nongnu.org, Aurelien Jarno

From: "Emilio G. Cota"

Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
fma-single: 74.73 MFlops
fma-double: 74.54 MFlops
- after:
fma-single: 203.37 MFlops
fma-double: 169.37 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
fma-single: 23.24 MFlops
fma-double: 23.70 MFlops
- after:
fma-single: 66.14 MFlops
fma-double: 63.10 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
fma-single: 37.26 MFlops
fma-double: 37.29 MFlops
- after:
fma-single: 48.90 MFlops
fma-double: 59.51 MFlops

Here having 3FP64 set to 1 pays off for x86_64:
[1] 170.15 vs [0] 153.12 MFlops

Reviewed-by: Alex Bennée
Signed-off-by: Emilio G. Cota
Signed-off-by: Alex Bennée
--
2.17.1

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 82294458fe..7554d63495 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1518,8 +1518,9 @@ float16 QEMU_FLATTEN float16_muladd(float16 a, float16 b, float16 c,
     return float16_round_pack_canonical(pr, status);
 }
 
-float32 QEMU_FLATTEN float32_muladd(float32 a, float32 b, float32 c,
-                                    int flags, float_status *status)
+static float32 QEMU_SOFTFLOAT_ATTR
+soft_f32_muladd(float32 a, float32 b, float32 c, int flags,
+                float_status *status)
 {
     FloatParts pa = float32_unpack_canonical(a, status);
     FloatParts pb = float32_unpack_canonical(b, status);
@@ -1529,8 +1530,9 @@ float32 QEMU_FLATTEN float32_muladd(float32 a, float32 b, float32 c,
     return float32_round_pack_canonical(pr, status);
 }
 
-float64 QEMU_FLATTEN float64_muladd(float64 a, float64 b, float64 c,
-                                    int flags, float_status *status)
+static float64 QEMU_SOFTFLOAT_ATTR
+soft_f64_muladd(float64 a, float64 b, float64 c, int flags,
+                float_status *status)
 {
     FloatParts pa = float64_unpack_canonical(a, status);
     FloatParts pb = float64_unpack_canonical(b, status);
@@ -1540,6 +1542,128 @@ float64 QEMU_FLATTEN float64_muladd(float64 a, float64 b, float64 c,
     return float64_round_pack_canonical(pr, status);
 }
 
+float32 QEMU_FLATTEN
+float32_muladd(float32 xa, float32 xb, float32 xc, int flags, float_status *s)
+{
+    union_float32 ua, ub, uc, ur;
+
+    ua.s = xa;
+    ub.s = xb;
+    uc.s = xc;
+
+    if (unlikely(!can_use_fpu(s))) {
+        goto soft;
+    }
+    if (unlikely(flags & float_muladd_halve_result)) {
+        goto soft;
+    }
+
+    float32_input_flush3(&ua.s, &ub.s, &uc.s, s);
+    if (unlikely(!f32_is_zon3(ua, ub, uc))) {
+        goto soft;
+    }
+    /*
+     * When (a || b) == 0, there's no need to check for under/over flow,
+     * since we know the addend is (normal || 0) and the product is 0.
+     */
+    if (float32_is_zero(ua.s) || float32_is_zero(ub.s)) {
+        union_float32 up;
+        bool prod_sign;
+
+        prod_sign = float32_is_neg(ua.s) ^ float32_is_neg(ub.s);
+        prod_sign ^= !!(flags & float_muladd_negate_product);
+        up.s = float32_set_sign(float32_zero, prod_sign);
+
+        if (flags & float_muladd_negate_c) {
+            uc.h = -uc.h;
+        }
+        ur.h = up.h + uc.h;
+    } else {
+        if (flags & float_muladd_negate_product) {
+            ua.h = -ua.h;
+        }
+        if (flags & float_muladd_negate_c) {
+            uc.h = -uc.h;
+        }
+
+        ur.h = fmaf(ua.h, ub.h, uc.h);
+
+        if (unlikely(f32_is_inf(ur))) {
+            s->float_exception_flags |= float_flag_overflow;
+        } else if (unlikely(fabsf(ur.h) <= FLT_MIN)) {
+            goto soft;
+        }
+    }
+    if (flags & float_muladd_negate_result) {
+        return float32_chs(ur.s);
+    }
+    return ur.s;
+
+ soft:
+    return soft_f32_muladd(ua.s, ub.s, uc.s, flags, s);
+}
+
+float64 QEMU_FLATTEN
+float64_muladd(float64 xa, float64 xb, float64 xc, int flags, float_status *s)
+{
+    union_float64 ua, ub, uc, ur;
+
+    ua.s = xa;
+    ub.s = xb;
+    uc.s = xc;
+
+    if (unlikely(!can_use_fpu(s))) {
+        goto soft;
+    }
+    if (unlikely(flags & float_muladd_halve_result)) {
+        goto soft;
+    }
+
+    float64_input_flush3(&ua.s, &ub.s, &uc.s, s);
+    if (unlikely(!f64_is_zon3(ua, ub, uc))) {
+        goto soft;
+    }
+    /*
+     * When (a || b) == 0, there's no need to check for under/over flow,
+     * since we know the addend is (normal || 0) and the product is 0.
+     */
+    if (float64_is_zero(ua.s) || float64_is_zero(ub.s)) {
+        union_float64 up;
+        bool prod_sign;
+
+        prod_sign = float64_is_neg(ua.s) ^ float64_is_neg(ub.s);
+        prod_sign ^= !!(flags & float_muladd_negate_product);
+        up.s = float64_set_sign(float64_zero, prod_sign);
+
+        if (flags & float_muladd_negate_c) {
+            uc.h = -uc.h;
+        }
+        ur.h = up.h + uc.h;
+    } else {
+        if (flags & float_muladd_negate_product) {
+            ua.h = -ua.h;
+        }
+        if (flags & float_muladd_negate_c) {
+            uc.h = -uc.h;
+        }
+
+        ur.h = fma(ua.h, ub.h, uc.h);
+
+        if (unlikely(f64_is_inf(ur))) {
+            s->float_exception_flags |= float_flag_overflow;
+        } else if (unlikely(fabs(ur.h) <= FLT_MIN)) {
+            goto soft;
+        }
+    }
+    if (flags & float_muladd_negate_result) {
+        return float64_chs(ur.s);
+    }
+    return ur.s;
+
+ soft:
+    return soft_f64_muladd(ua.s, ub.s, uc.s, flags, s);
+}
+
 /*
  * Returns the result of dividing the floating-point value `a' by the
  * corresponding value `b'. The operation is performed according to