From patchwork Mon Dec 17 10:56:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Alex_Benn=C3=A9e?= X-Patchwork-Id: 153982 Delivered-To: patch@linaro.org Received: by 2002:a2e:299d:0:0:0:0:0 with SMTP id p29-v6csp2327991ljp; Mon, 17 Dec 2018 03:02:04 -0800 (PST) X-Google-Smtp-Source: AFSGD/X3hzXFdWhfW4O56cntNnLa4/mLGhPWaM4zyrXCK6RE46wij4objbGYx/loRCa02+gPOyhv X-Received: by 2002:aed:2122:: with SMTP id 31mr12926094qtc.270.1545044524839; Mon, 17 Dec 2018 03:02:04 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1545044524; cv=none; d=google.com; s=arc-20160816; b=PoEmCZpepBC4PtxUwM6iKyVPBKRcewBk1SANq3fnO7koWmfIsyFCc5DPIt7kVFxyBu GE6PUbYZvT+dDIc27N2a9e/U3ih4GNozdeQk76l2n5bebGI8hUbwf3wEP2Wkgbuz3y4Q fc/BQs9LX7e4/y+XVRHROSJx/v3WBlrHa6lrk3Y+hxacKTjPLLl9feFb1ZRQxHyV0s0C h9XssIrvfpsK+PmKtoRtGQHi7VoxBcZWuxzvluZPHb7KO4/S/KX88aENdgytMbXjjlhV Fl6LNG0dnXiKxf30O+/7Mr9zqVbkJr/GbiTa/384lD85M3e4UqhxR9TgoqN8ErTqRFe+ sPyw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature; bh=gOXFTj5uqzApKHFUrx/GP3fxC64evZtaKBwy5sxWSzs=; b=sr3GCYelYxBlYwbZ7bBSuckTvcoleAj21enNnHecv7A8fKQE0fKkbPI2WthbW/LceR +Xe96VE4wILLJ4NDmLkt4b310WH3+r+iA3jMw509HTZ+cifUZSGKVjXVjwY/qdEy2bbz G0mp6N6CPoIR0P+PmN5wdNUuk1bLi/1EfPMEeTt67RiaiNyB98HjbdTP9/KmpZknnYXQ PgJD77+HBEgr4jcV2C3OmzBjQpcuEn5HjLrYg7R5xwmI/RP+Is0skCDQ5ZzLvHKrxmvA l+PDX3WLOmzSzVL9BuqGNOoPZ4HQBKt9y7wXLx3vtTJJRkqAud2k24jkz+BafpuggwOO 0cJw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Bb9KVEqS; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id b62si843900qkf.59.2018.12.17.03.02.04 for (version=TLS1 cipher=AES128-SHA bits=128/128); Mon, 17 Dec 2018 03:02:04 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Bb9KVEqS; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:45913 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gYqey-0004we-6z for patch@linaro.org; Mon, 17 Dec 2018 06:02:04 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:37274) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gYqbL-0002Y7-92 for qemu-devel@nongnu.org; Mon, 17 Dec 2018 05:58:23 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gYqaB-0000DI-1b for qemu-devel@nongnu.org; Mon, 17 Dec 2018 05:57:09 -0500 Received: from mail-wm1-x341.google.com ([2a00:1450:4864:20::341]:34148) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gYqaA-0000CE-3Q for qemu-devel@nongnu.org; Mon, 17 Dec 2018 05:57:06 -0500 Received: by mail-wm1-x341.google.com with SMTP id y185so4430730wmd.1 for ; Mon, 17 Dec 2018 02:57:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=gOXFTj5uqzApKHFUrx/GP3fxC64evZtaKBwy5sxWSzs=; b=Bb9KVEqSTcHyk7bl7obciPKjCq7p1ar0Da6t8fY/wyP3B3I9vWuFY6PLxJDxTcr4mp c1l+V99UNqCTmRGO2btd+QqMBaNhHKhAF1klHbOwOKl7eZcgjAuER0jIM+fSpRQt4IMo juG4k5/r82AqXlt0zRI2gT638qTYLpHmpgZlw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=gOXFTj5uqzApKHFUrx/GP3fxC64evZtaKBwy5sxWSzs=; b=BJ8MtYk60HMVSxls9mCkTQrDgGOELDyg0qzSFcRcbHTgr/xYqnwR7rQK4rJaBNJLY/ TbKy7fZby1oB88AElkpT5wUUNQGNso4q2ZE4kn/jGuePYkTgCpSn6XDirZsGFqRGjQZl SDlQ62iaDMj7B20+aXnlbt0xY64svfN/K6Ywy4YgkuqsSmDVAwfwpSUTZyY3Nxk7vNlI 8Bka60XbIGuhwxfByL5ncczjyKXs8Iw0h2Z3tA5r6T8ndUXNA9lLzNiwIQ0ErkwTGREp t8Cuw7UgHMJ2MgzZ9F3SxD023FNsE9gpij7C+oxLoiKP3L2NS7TQnQm2tqjE5w2CXeVc lL3g== X-Gm-Message-State: AA+aEWYADojibmdiPlN2DnrdSuIUOQ61EJSh/k3tJEuBG9DabyVwLCj5 Ca5OAZeVYwMn1vPS2gjehpWGQQ== X-Received: by 2002:a1c:ef11:: with SMTP id n17mr10773271wmh.112.1545044224023; Mon, 17 Dec 2018 02:57:04 -0800 (PST) Received: from zen.linaro.local ([81.128.185.34]) by smtp.gmail.com with ESMTPSA id n6sm9250255wmk.9.2018.12.17.02.56.55 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 17 Dec 2018 02:57:03 -0800 (PST) Received: from zen.linaroharston (localhost [127.0.0.1]) by zen.linaro.local (Postfix) with ESMTP id CE7EE3E063A; Mon, 17 Dec 2018 10:56:51 +0000 (GMT) From: =?utf-8?q?Alex_Benn=C3=A9e?= To: peter.maydell@linaro.org Date: Mon, 17 Dec 2018 10:56:50 +0000 Message-Id: <20181217105650.27361-16-alex.bennee@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181217105650.27361-1-alex.bennee@linaro.org> References: <20181217105650.27361-1-alex.bennee@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2a00:1450:4864:20::341 Subject: [Qemu-devel] [PULL v3 15/15] hardfloat: implement float32/64 comparison X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Emilio G. Cota" , =?utf-8?q?Alex_Benn=C3=A9e?= , qemu-devel@nongnu.org, Aurelien Jarno Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" From: "Emilio G. Cota" Performance results for fp-bench: Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: cmp-single: 110.98 MFlops cmp-double: 107.12 MFlops - after: cmp-single: 506.28 MFlops cmp-double: 524.77 MFlops Note that flattening both eq and eq_signaling versions would give us extra performance (695v506, 615v524 Mflops for single/double, respectively) but this would emit two essentially identical functions for each eq/signaling pair, which is a waste. Aggregate performance improvement for the last few patches: [ all charts in png: https://imgur.com/a/4yV8p ] 1. Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz qemu-aarch64 NBench score; higher is better Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz 16 +-+-----------+-------------+----===-------+---===-------+-----------+-+ 14 +-+..........................@@@&&.=.......@@@&&.=...................+-+ 12 +-+..........................@.@.&.=.......@.@.&.=.....+befor=== +-+ 10 +-+..........................@.@.&.=.......@.@.&.=.....+ad@@&& = +-+ 8 +-+.......................$$$%.@.&.=.......@.@.&.=.....+ @@u& = +-+ 6 +-+............@@@&&=+***##.$%.@.&.=***##$$%+@.&.=..###$$%%@i& = +-+ 4 +-+.......###$%%.@.&=.*.*.#.$%.@.&.=*.*.#.$%.@.&.=+**.#+$ +@m& = +-+ 2 +-+.....***.#$.%.@.&=.*.*.#.$%.@.&.=*.*.#.$%.@.&.=.**.#+$+sqr& = +-+ 0 +-+-----***##$%%@@&&=-***##$$%@@&&==***##$$%@@&&==-**##$$%+cmp==-----+-+ FOURIER NEURAL NELU DECOMPOSITION gmean qemu-aarch64 SPEC06fp (test set) speedup over QEMU 4c2c1015905 Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz error bars: 95% confidence interval 4.5 +-+---+-----+----+-----+-----+-&---+-----+----+-----+-----+-----+----+-----+-----+-----+-----+----+-----+---+-+ 4 +-+..........................+@@+...........................................................................+-+ 3.5 +-+..............%%@&.........@@..............%%@&............................................+++dsub +-+ 2.5 +-+....&&+.......%%@&.......+%%@..+%%&+..@@&+.%%@&....................................+%%&+.+%@&++%%@& +-+ 2 +-+..+%%&..+%@&+.%%@&...+++..%%@...%%&.+$$@&..%%@&..%%@&.......+%%&+.%%@&+......+%%@&.+%%&++$$@&++d%@& %%@&+-+ 1.5 +-+**#$%&**#$@&**#%@&**$%@**#$%@**#$%&**#$@&**$%@&*#$%@**#$%@**#$%&**#%@&**$%@&*#$%@**#$%&**#$@&*+f%@&**$%@&+-+ 0.5 +-+**#$%&**#$@&**#%@&**$%@**#$%@**#$%&**#$@&**$%@&*#$%@**#$%@**#$%&**#%@&**$%@&*#$%@**#$%&**#$@&+sqr@&**$%@&+-+ 0 +-+**#$%&**#$@&**#%@&**$%@**#$%@**#$%&**#$@&**$%@&*#$%@**#$%@**#$%&**#%@&**$%@&*#$%@**#$%&**#$@&*+cmp&**$%@&+-+ 410.bw416.gam433.434.z435.436.cac437.lesli444.447.de450.so453454.ca459.GemsF465.tont470.lb4482.sphinxgeomean 2. Host: ARM Aarch64 A57 @ 2.4GHz qemu-aarch64 NBench score; higher is better Host: Applied Micro X-Gene, Aarch64 A57 @ 2.4 GHz 5 +-+-----------+-------------+-------------+-------------+-----------+-+ 4.5 +-+........................................@@@&==...................+-+ 3 4 +-+..........................@@@&==........@.@&.=.....+before +-+ 3 +-+..........................@.@&.=........@.@&.=.....+ad@@@&== +-+ 2.5 +-+.....................##$$%%.@&.=........@.@&.=.....+ @m@& = +-+ 2 +-+............@@@&==.***#.$.%.@&.=.***#$$%%.@&.=.***#$$%%d@& = +-+ 1.5 +-+.....***#$$%%.@&.=.*.*#.$.%.@&.=.*.*#.$.%.@&.=.*.*#+$ +f@& = +-+ 0.5 +-+.....*.*#.$.%.@&.=.*.*#.$.%.@&.=.*.*#.$.%.@&.=.*.*#+$+sqr& = +-+ 0 +-+-----***#$$%%@@&==-***#$$%%@@&==-***#$$%%@@&==-***#$$%+cmp==-----+-+ FOURIER NEURAL NLU DECOMPOSITION gmean Reviewed-by: Alex Bennée Signed-off-by: Emilio G. Cota Signed-off-by: Alex Bennée -- 2.17.1 diff --git a/fpu/softfloat.c b/fpu/softfloat.c index fbd66fd8dc..59eac97d10 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -2903,28 +2903,109 @@ static int compare_floats(FloatParts a, FloatParts b, bool is_quiet, } } -#define COMPARE(sz) \ -int float ## sz ## _compare(float ## sz a, float ## sz b, \ - float_status *s) \ +#define COMPARE(name, attr, sz) \ +static int attr \ +name(float ## sz a, float ## sz b, bool is_quiet, float_status *s) \ { \ FloatParts pa = float ## sz ## _unpack_canonical(a, s); \ FloatParts pb = float ## sz ## _unpack_canonical(b, s); \ - return compare_floats(pa, pb, false, s); \ -} \ -int float ## sz ## _compare_quiet(float ## sz a, float ## sz b, \ - float_status *s) \ -{ \ - FloatParts pa = float ## sz ## _unpack_canonical(a, s); \ - FloatParts pb = float ## sz ## _unpack_canonical(b, s); \ - return compare_floats(pa, pb, true, s); \ + return compare_floats(pa, pb, is_quiet, s); \ } -COMPARE(16) -COMPARE(32) -COMPARE(64) +COMPARE(soft_f16_compare, QEMU_FLATTEN, 16) +COMPARE(soft_f32_compare, QEMU_SOFTFLOAT_ATTR, 32) +COMPARE(soft_f64_compare, QEMU_SOFTFLOAT_ATTR, 64) #undef COMPARE +int float16_compare(float16 a, float16 b, float_status *s) +{ + return soft_f16_compare(a, b, false, s); +} + +int float16_compare_quiet(float16 a, float16 b, float_status *s) +{ + return soft_f16_compare(a, b, true, s); +} + +static int QEMU_FLATTEN +f32_compare(float32 xa, float32 xb, bool is_quiet, float_status *s) +{ + union_float32 ua, ub; + + ua.s = xa; + ub.s = xb; + + if (QEMU_NO_HARDFLOAT) { + goto soft; + } + + float32_input_flush2(&ua.s, &ub.s, s); + if (isgreaterequal(ua.h, ub.h)) { + if (isgreater(ua.h, ub.h)) { + return float_relation_greater; + } + return float_relation_equal; + } + if (likely(isless(ua.h, ub.h))) { + return float_relation_less; + } + /* The only condition remaining is unordered. + * Fall through to set flags. + */ + soft: + return soft_f32_compare(ua.s, ub.s, is_quiet, s); +} + +int float32_compare(float32 a, float32 b, float_status *s) +{ + return f32_compare(a, b, false, s); +} + +int float32_compare_quiet(float32 a, float32 b, float_status *s) +{ + return f32_compare(a, b, true, s); +} + +static int QEMU_FLATTEN +f64_compare(float64 xa, float64 xb, bool is_quiet, float_status *s) +{ + union_float64 ua, ub; + + ua.s = xa; + ub.s = xb; + + if (QEMU_NO_HARDFLOAT) { + goto soft; + } + + float64_input_flush2(&ua.s, &ub.s, s); + if (isgreaterequal(ua.h, ub.h)) { + if (isgreater(ua.h, ub.h)) { + return float_relation_greater; + } + return float_relation_equal; + } + if (likely(isless(ua.h, ub.h))) { + return float_relation_less; + } + /* The only condition remaining is unordered. + * Fall through to set flags. + */ + soft: + return soft_f64_compare(ua.s, ub.s, is_quiet, s); +} + +int float64_compare(float64 a, float64 b, float_status *s) +{ + return f64_compare(a, b, false, s); +} + +int float64_compare_quiet(float64 a, float64 b, float_status *s) +{ + return f64_compare(a, b, true, s); +} + /* Multiply A by 2 raised to the power N. */ static FloatParts scalbn_decomposed(FloatParts a, int n, float_status *s) {