From patchwork Tue Jul 30 16:02:59 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 815310
From: Peter Maydell
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 1/8] target/arm: Allow setting the FPCR.EBF bit for FEAT_EBF16
Date: Tue, 30 Jul 2024 17:02:59 +0100
Message-Id: <20240730160306.2959745-2-peter.maydell@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240730160306.2959745-1-peter.maydell@linaro.org>
References: <20240730160306.2959745-1-peter.maydell@linaro.org>

FEAT_EBF16 adds one new bit to the FPCR floating point control
register. Allow this bit to be read and written when the ID
registers indicate the presence of the feature.

Note that because this new bit is not in FPSCR_FPCR_MASK the bit
is not visible in the AArch32 FPSCR, and FPSCR writes do not affect it.

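As a standalone illustration of the gating described above (not part of
the patch itself): the sketch below assumes the BF16 field occupies
ID_AA64ISAR1 bits [47:44] and uses hypothetical sketch_* names in place
of QEMU's FIELD_EX64() and cpu_isar_feature() machinery. It clears
FPCR.EBF on a write unless the ID field reports a value above 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names only; QEMU's real definitions live in cpu.h. */
#define SKETCH_FPCR_EBF         (UINT32_C(1) << 13)  /* FPCR.EBF, bit 13 as in the patch */
#define SKETCH_ISAR1_BF16_SHIFT 44                   /* assumed: BF16 is bits [47:44] */
#define SKETCH_ISAR1_BF16_MASK  UINT64_C(0xf)

static_assert(SKETCH_FPCR_EBF == 0x2000, "EBF is FPCR bit 13");

/* Extract the 4-bit BF16 field from a raw ID_AA64ISAR1 value. */
static unsigned sketch_bf16_field(uint64_t id_aa64isar1)
{
    return (unsigned)((id_aa64isar1 >> SKETCH_ISAR1_BF16_SHIFT) &
                      SKETCH_ISAR1_BF16_MASK);
}

/* FEAT_EBF16 is indicated by a BF16 field value greater than 1. */
static bool sketch_has_ebf16(uint64_t id_aa64isar1)
{
    return sketch_bf16_field(id_aa64isar1) > 1;
}

/* Drop the EBF bit from a value being written if the feature is absent. */
static uint32_t sketch_filter_fpcr_write(uint64_t id_aa64isar1, uint32_t val)
{
    if (!sketch_has_ebf16(id_aa64isar1)) {
        val &= ~SKETCH_FPCR_EBF;
    }
    return val;
}
```

With a BF16 field of 1 (plain FEAT_BF16) a write of 0x2000 is filtered
to 0; with a field of 2 the bit is kept.
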
Signed-off-by: Peter Maydell
Reviewed-by: Richard Henderson
---
 target/arm/cpu-features.h | 5 +++++
 target/arm/cpu.h          | 1 +
 target/arm/vfp_helper.c   | 8 ++++++--
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu-features.h b/target/arm/cpu-features.h
index c59ca104fe1..cfb82c23cad 100644
--- a/target/arm/cpu-features.h
+++ b/target/arm/cpu-features.h
@@ -556,6 +556,11 @@ static inline bool isar_feature_aa64_bf16(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, BF16) != 0;
 }
 
+static inline bool isar_feature_aa64_ebf16(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, BF16) > 1;
+}
+
 static inline bool isar_feature_aa64_rcpc_8_3(const ARMISARegisters *id)
 {
     return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, LRCPC) != 0;
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index a12859fc533..34df9d7e39b 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1707,6 +1707,7 @@ void vfp_set_fpscr(CPUARMState *env, uint32_t val);
 #define FPCR_OFE    (1 << 10)   /* Overflow exception trap enable */
 #define FPCR_UFE    (1 << 11)   /* Underflow exception trap enable */
 #define FPCR_IXE    (1 << 12)   /* Inexact exception trap enable */
+#define FPCR_EBF    (1 << 13)   /* Extended BFloat16 behaviors */
 #define FPCR_IDE    (1 << 15)   /* Input Denormal exception trap enable */
 #define FPCR_LEN_MASK (7 << 16) /* LEN, A-profile only */
 #define FPCR_FZ16   (1 << 19)   /* ARMv8.2+, FP16 flush-to-zero */
diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
index b3698da8ca7..203d37303bd 100644
--- a/target/arm/vfp_helper.c
+++ b/target/arm/vfp_helper.c
@@ -254,6 +254,10 @@ static void vfp_set_fpcr_masked(CPUARMState *env, uint32_t val, uint32_t mask)
         val &= ~FPCR_FZ16;
     }
 
+    if (!cpu_isar_feature(aa64_ebf16, cpu)) {
+        val &= ~FPCR_EBF;
+    }
+
     vfp_set_fpcr_to_host(env, val, mask);
 
     if (mask & (FPCR_LEN_MASK | FPCR_STRIDE_MASK)) {
@@ -278,12 +282,12 @@ static void vfp_set_fpcr_masked(CPUARMState *env, uint32_t val, uint32_t mask)
      * We don't implement trapped exception handling, so the
      * trap enable bits, IDE|IXE|UFE|OFE|DZE|IOE are all RAZ/WI (not RES0!)
      *
-     * The FPCR bits we keep in vfp.fpcr are AHP, DN, FZ, RMode
+     * The FPCR bits we keep in vfp.fpcr are AHP, DN, FZ, RMode, EBF
      * and FZ16. Len, Stride and LTPSIZE we just handled. Store those bits
      * there, and zero any of the other FPCR bits and the RES0 and RAZ/WI
      * bits.
      */
-    val &= FPCR_AHP | FPCR_DN | FPCR_FZ | FPCR_RMODE_MASK | FPCR_FZ16;
+    val &= FPCR_AHP | FPCR_DN | FPCR_FZ | FPCR_RMODE_MASK | FPCR_FZ16 | FPCR_EBF;
     env->vfp.fpcr &= ~mask;
     env->vfp.fpcr |= val;
 }

From patchwork Tue Jul 30 16:03:00 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 815307
From: Peter Maydell
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 2/8] target/arm: Pass env pointer through to sme_bfmopa helper
Date: Tue, 30 Jul 2024 17:03:00 +0100
Message-Id: <20240730160306.2959745-3-peter.maydell@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240730160306.2959745-1-peter.maydell@linaro.org>
References: <20240730160306.2959745-1-peter.maydell@linaro.org>

To implement the FEAT_EBF16 semantics, we are going to need
the CPUARMState env pointer in every helper function which calls
bfdotadd().

Pass the env pointer through from generated code to the sme_bfmopa
helper. (We'll add the code that uses it when we've adjusted
all the helpers to have access to the env pointer.)

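A rough illustration of why the helper needs env (not code from this
series): the sketch below uses a hypothetical one-field SketchCPUState
in place of CPUARMState, and a hypothetical helper that only reports
whether FPCR.EBF is set. The real helpers in this series do not consume
env until later patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_FPCR_EBF (UINT32_C(1) << 13)  /* FPCR.EBF, bit 13 as in patch 1 */

static_assert(SKETCH_FPCR_EBF == 0x2000, "EBF is FPCR bit 13");

/* Hypothetical stand-in for CPUARMState: only the field this sketch reads. */
typedef struct SketchCPUState {
    uint32_t fpcr;
} SketchCPUState;

/*
 * Hypothetical helper taking the env pointer as its first argument,
 * mirroring the new sme_bfmopa signature. Without env the helper has
 * no way to see the guest's FPCR, so it could not select between the
 * legacy and the extended BFloat16 behaviour.
 */
static bool sketch_helper_wants_ebf(const SketchCPUState *env, uint32_t desc)
{
    (void)desc; /* the simd descriptor is unused in this sketch */
    return (env->fpcr & SKETCH_FPCR_EBF) != 0;
}
```

Passing a const pointer here is a choice made for the sketch only; it
says nothing about how the real helpers treat env.
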
Signed-off-by: Peter Maydell
Reviewed-by: Richard Henderson
---
 target/arm/tcg/helper-sme.h    | 4 ++--
 target/arm/tcg/sme_helper.c    | 4 ++--
 target/arm/tcg/translate-sme.c | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/target/arm/tcg/helper-sme.h b/target/arm/tcg/helper-sme.h
index 659867a1faf..f12d903aa44 100644
--- a/target/arm/tcg/helper-sme.h
+++ b/target/arm/tcg/helper-sme.h
@@ -126,8 +126,8 @@ DEF_HELPER_FLAGS_7(sme_fmopa_s, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_7(sme_fmopa_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, ptr, ptr, i32)
-DEF_HELPER_FLAGS_6(sme_bfmopa, TCG_CALL_NO_RWG,
-                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_7(sme_bfmopa, TCG_CALL_NO_RWG,
+                   void, env, ptr, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_6(sme_smopa_s, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_6(sme_umopa_s, TCG_CALL_NO_RWG,
diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c
index 2af2b957cb6..f172225b2f2 100644
--- a/target/arm/tcg/sme_helper.c
+++ b/target/arm/tcg/sme_helper.c
@@ -1080,8 +1080,8 @@ void HELPER(sme_fmopa_h)(CPUARMState *env,
     }
 }
 
-void HELPER(sme_bfmopa)(void *vza, void *vzn, void *vzm, void *vpn,
-                        void *vpm, uint32_t desc)
+void HELPER(sme_bfmopa)(CPUARMState *env, void *vza, void *vzn, void *vzm,
+                        void *vpn, void *vpm, uint32_t desc)
 {
     intptr_t row, col, oprsz = simd_maxsz(desc);
     uint32_t neg = simd_data(desc) * 0x80008000u;
diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index 8e9332f1898..bcb502feb05 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -355,7 +355,7 @@ TRANS_FEAT(FMOPA_d, aa64_sme_f64f64, do_outprod_fpst, a,
            MO_64, FPST_FPCR, gen_helper_sme_fmopa_d)
 
 /* TODO: FEAT_EBF16 */
-TRANS_FEAT(BFMOPA, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_bfmopa)
+TRANS_FEAT(BFMOPA, aa64_sme, do_outprod_env, a, MO_32, gen_helper_sme_bfmopa)
 
 TRANS_FEAT(SMOPA_s, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_smopa_s)
 TRANS_FEAT(UMOPA_s, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_umopa_s)

From patchwork Tue Jul 30 16:03:01 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 815311
From: Peter Maydell
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 3/8] target/arm: Pass env pointer through to gvec_bfdot helper
Date: Tue, 30 Jul 2024 17:03:01 +0100
Message-Id: <20240730160306.2959745-4-peter.maydell@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240730160306.2959745-1-peter.maydell@linaro.org>
References: <20240730160306.2959745-1-peter.maydell@linaro.org>

Pass the env pointer through to the gvec_bfdot helper,
so we can use it to add support for FEAT_EBF16.

Signed-off-by: Peter Maydell
Reviewed-by: Richard Henderson
---
 target/arm/helper.h             |  4 ++--
 target/arm/tcg/translate-a64.c  | 27 ++++++++++++++++++++++++-
 target/arm/tcg/translate-neon.c | 35 +++++++++++++++++++++++++++++++--
 target/arm/tcg/translate-sve.c  | 15 +++++++++++++-
 target/arm/tcg/vec_helper.c     |  3 ++-
 5 files changed, 77 insertions(+), 7 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 970d059dec5..aece9fd4aa7 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -1027,8 +1027,8 @@ DEF_HELPER_FLAGS_5(gvec_ummla_b, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_usmmla_b, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
-DEF_HELPER_FLAGS_5(gvec_bfdot, TCG_CALL_NO_RWG,
-                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(gvec_bfdot, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_bfdot_idx, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 148be2826ec..4aef8b9211a 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -735,6 +735,22 @@ static void gen_gvec_op4_ool(DisasContext *s, bool is_q, int rd, int rn,
                        is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
 }
 
+/*
+ * Expand a 4-operand operation using an out-of-line helper that takes
+ * a pointer to the CPU env.
+ */
+static void gen_gvec_op4_env(DisasContext *s, bool is_q, int rd, int rn,
+                             int rm, int ra, int data,
+                             gen_helper_gvec_4_ptr *fn)
+{
+    tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, rd),
+                       vec_full_reg_offset(s, rn),
+                       vec_full_reg_offset(s, rm),
+                       vec_full_reg_offset(s, ra),
+                       tcg_env,
+                       is_q ? 16 : 8, vec_full_reg_size(s), data, fn);
+}
+
 /*
  * Expand a 4-operand + fpstatus pointer + simd data value operation using
  * an out-of-line helper.
@@ -5601,10 +5617,19 @@ static bool do_dot_vector(DisasContext *s, arg_qrrr_e *a,
     return true;
 }
 
+static bool do_dot_vector_env(DisasContext *s, arg_qrrr_e *a,
+                              gen_helper_gvec_4_ptr *fn)
+{
+    if (fp_access_check(s)) {
+        gen_gvec_op4_env(s, a->q, a->rd, a->rn, a->rm, a->rd, 0, fn);
+    }
+    return true;
+}
+
 TRANS_FEAT(SDOT_v, aa64_dp, do_dot_vector, a, gen_helper_gvec_sdot_b)
 TRANS_FEAT(UDOT_v, aa64_dp, do_dot_vector, a, gen_helper_gvec_udot_b)
 TRANS_FEAT(USDOT_v, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_usdot_b)
-TRANS_FEAT(BFDOT_v, aa64_bf16, do_dot_vector, a, gen_helper_gvec_bfdot)
+TRANS_FEAT(BFDOT_v, aa64_bf16, do_dot_vector_env, a, gen_helper_gvec_bfdot)
 TRANS_FEAT(BFMMLA, aa64_bf16, do_dot_vector, a, gen_helper_gvec_bfmmla)
 TRANS_FEAT(SMMLA, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_smmla_b)
 TRANS_FEAT(UMMLA, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_ummla_b)
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
index 915c9e56db5..454380f01d7 100644
--- a/target/arm/tcg/translate-neon.c
+++ b/target/arm/tcg/translate-neon.c
@@ -148,6 +148,37 @@ static bool do_neon_ddda(DisasContext *s, int q, int vd, int vn, int vm,
     return true;
 }
 
+static bool do_neon_ddda_env(DisasContext *s, int q, int vd, int vn, int vm,
+                             int data, gen_helper_gvec_4_ptr *fn_gvec)
+{
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (((vd | vn | vm) & 0x10) && !dc_isar_feature(aa32_simd_r32, s)) {
+        return false;
+    }
+
+    /*
+     * UNDEF accesses to odd registers for each bit of Q.
+     * Q will be 0b111 for all Q-reg instructions, otherwise
+     * when we have mixed Q- and D-reg inputs.
+     */
+    if (((vd & 1) * 4 | (vn & 1) * 2 | (vm & 1)) & q) {
+        return false;
+    }
+
+    if (!vfp_access_check(s)) {
+        return true;
+    }
+
+    int opr_sz = q ? 16 : 8;
+    tcg_gen_gvec_4_ptr(vfp_reg_offset(1, vd),
+                       vfp_reg_offset(1, vn),
+                       vfp_reg_offset(1, vm),
+                       vfp_reg_offset(1, vd),
+                       tcg_env,
+                       opr_sz, opr_sz, data, fn_gvec);
+    return true;
+}
+
 static bool do_neon_ddda_fpst(DisasContext *s, int q, int vd, int vn, int vm,
                               int data, ARMFPStatusFlavour fp_flavour,
                               gen_helper_gvec_4_ptr *fn_gvec_ptr)
@@ -266,8 +297,8 @@ static bool trans_VDOT_b16(DisasContext *s, arg_VDOT_b16 *a)
     if (!dc_isar_feature(aa32_bf16, s)) {
         return false;
     }
-    return do_neon_ddda(s, a->q * 7, a->vd, a->vn, a->vm, 0,
-                        gen_helper_gvec_bfdot);
+    return do_neon_ddda_env(s, a->q * 7, a->vd, a->vn, a->vm, 0,
+                            gen_helper_gvec_bfdot);
 }
 
 static bool trans_VFML(DisasContext *s, arg_VFML *a)
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 798ab2bfb13..4fb0bd077b4 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -238,6 +238,19 @@ static bool gen_gvec_fpst_zzzz(DisasContext *s, gen_helper_gvec_4_ptr *fn,
     return ret;
 }
 
+static bool gen_gvec_env_zzzz(DisasContext *s, gen_helper_gvec_4_ptr *fn,
+                              int rd, int rn, int rm, int ra,
+                              int data)
+{
+    return gen_gvec_ptr_zzzz(s, fn, rd, rn, rm, ra, data, tcg_env);
+}
+
+static bool gen_gvec_env_arg_zzzz(DisasContext *s, gen_helper_gvec_4_ptr *fn,
+                                  arg_rrrr_esz *a, int data)
+{
+    return gen_gvec_env_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, data);
+}
+
 /* Invoke an out-of-line helper on 4 Zregs, 1 Preg, plus fpst. */
 static bool gen_gvec_fpst_zzzzp(DisasContext *s, gen_helper_gvec_5_ptr *fn,
                                 int rd, int rn, int rm, int ra, int pg,
@@ -7099,7 +7112,7 @@ TRANS_FEAT_NONSTREAMING(USMMLA, aa64_sve_i8mm, gen_gvec_ool_arg_zzzz,
 TRANS_FEAT_NONSTREAMING(UMMLA, aa64_sve_i8mm, gen_gvec_ool_arg_zzzz,
                         gen_helper_gvec_ummla_b, a, 0)
 
-TRANS_FEAT(BFDOT_zzzz, aa64_sve_bf16, gen_gvec_ool_arg_zzzz,
+TRANS_FEAT(BFDOT_zzzz, aa64_sve_bf16, gen_gvec_env_arg_zzzz,
            gen_helper_gvec_bfdot, a, 0)
 TRANS_FEAT(BFDOT_zzxz, aa64_sve_bf16, gen_gvec_ool_arg_zzxz,
            gen_helper_gvec_bfdot_idx, a)
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index 98604d170fd..37aad4be4b0 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -2814,7 +2814,8 @@ float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2)
     return t1;
 }
 
-void HELPER(gvec_bfdot)(void *vd, void *vn, void *vm, void *va, uint32_t desc)
+void HELPER(gvec_bfdot)(void *vd, void *vn, void *vm, void *va,
+                        void *envp, uint32_t desc)
 {
     intptr_t i, opr_sz = simd_oprsz(desc);
     float32 *d = vd, *a = va;

From patchwork Tue Jul 30 16:03:02 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 815309
r2mOz2di7l1Gqwk69zYwJJ52RT+ckbd+la8h3FDXxK4s3I+/3rNN680pJbUzskxdNbj0 akzncUY8g186td7hGNEQoYkMy28pn+Cyvx86zFcLRA4cXZ1KeyKTng9PzJNzaGaACDgw RH4ZdEUHR8nWtOrzYgDZwVUCFg1eeVF/pE1b2sAlkqYz6Xvp6OXYGJHm7mM+dEXgq6oo +d2omKqJg9o1FI/PhCkG4AKtgMPu0ATuIRaaV3I72/jyQGoWV0HWmgCyY5SI9dxJCgJ5 lzhQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1722355398; x=1722960198; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=HZaZuKw6MIHzrJiZQtNTqpnwcvq+BC4Z/9tdatLyulU=; b=cbWobvdA3og4JYUq41wpqU7Y6pVLWl1/Pn0vptO61Xnun4JhJzVdSRYtGAF9efugHe TmN4fg+u00kXpWGvJAvpcNDzKOzzLeopPtdVa6FNcc6zEdwLOqr8DXrreVeJdkVsCoZ7 UADcnSqOwlBxsg9sCnbwzSWMT8+r6fBu3vQ1KZP/3d2H/7d9wXZP5mExibJAO5do8+Gw Cw2Rw4cBF+/VimJ2ZUYkF4KSvoaf6MWzL0B1iBjv+uoW+8sEN6kqneOFG5loZgzvmPFQ OuB7Kje08AuuOCyyW6DFxFPqFE1Yc6KocGQxX3fxkusLMj6IMYZlU5P0GWGiWIX0xHRT mfog== X-Forwarded-Encrypted: i=1; AJvYcCULadMRyktjy4Ov54CS+8BDIPGdziulpTrgb2j+Hn4cthvRUzhMktU7EfYU2HHdn6T30g9ElxZosS4XztA2FY7tjTRu1/k= X-Gm-Message-State: AOJu0YyhkOyas2pHB7PlMXtNx21sS0ee21SfcA2y95sK5CUmvHLP2Ni8 weQ7qgtheX6ZDsDww+8wFARKuBdp6BE7dC2WZr//JlaMOu27k3M9f76TaxMCnDNLBvG0ai+26vm K X-Received: by 2002:a5d:698b:0:b0:367:9088:fecd with SMTP id ffacd0b85a97d-36b5cee2e4emr7735591f8f.7.1722355390195; Tue, 30 Jul 2024 09:03:10 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[2001:8b0:1d0::2]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-36b3685810csm15001676f8f.71.2024.07.30.09.03.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Jul 2024 09:03:09 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH 4/8] target/arm: Pass env pointer through to gvec_bfdot_idx helper Date: Tue, 30 Jul 2024 17:03:02 +0100 Message-Id: <20240730160306.2959745-5-peter.maydell@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240730160306.2959745-1-peter.maydell@linaro.org> References: <20240730160306.2959745-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::329; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x329.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org Pass the env pointer through to the gvec_bfdot_idx helper, so we can use it to add support for FEAT_EBF16. 
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper.h | 4 ++-- target/arm/tcg/translate-a64.c | 11 ++++++++++- target/arm/tcg/translate-neon.c | 4 ++-- target/arm/tcg/translate-sve.c | 8 +++++++- target/arm/tcg/vec_helper.c | 2 +- 5 files changed, 22 insertions(+), 7 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index aece9fd4aa7..386cf8686ea 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -1029,8 +1029,8 @@ DEF_HELPER_FLAGS_5(gvec_usmmla_b, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(gvec_bfdot, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(gvec_bfdot_idx, TCG_CALL_NO_RWG, - void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_bfdot_idx, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_bfmmla, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c index 4aef8b9211a..a4e9740c921 100644 --- a/target/arm/tcg/translate-a64.c +++ b/target/arm/tcg/translate-a64.c @@ -6403,13 +6403,22 @@ static bool do_dot_vector_idx(DisasContext *s, arg_qrrx_e *a, return true; } +static bool do_dot_vector_idx_env(DisasContext *s, arg_qrrx_e *a, + gen_helper_gvec_4_ptr *fn) +{ + if (fp_access_check(s)) { + gen_gvec_op4_env(s, a->q, a->rd, a->rn, a->rm, a->rd, a->idx, fn); + } + return true; +} + TRANS_FEAT(SDOT_vi, aa64_dp, do_dot_vector_idx, a, gen_helper_gvec_sdot_idx_b) TRANS_FEAT(UDOT_vi, aa64_dp, do_dot_vector_idx, a, gen_helper_gvec_udot_idx_b) TRANS_FEAT(SUDOT_vi, aa64_i8mm, do_dot_vector_idx, a, gen_helper_gvec_sudot_idx_b) TRANS_FEAT(USDOT_vi, aa64_i8mm, do_dot_vector_idx, a, gen_helper_gvec_usdot_idx_b) -TRANS_FEAT(BFDOT_vi, aa64_bf16, do_dot_vector_idx, a, +TRANS_FEAT(BFDOT_vi, aa64_bf16, do_dot_vector_idx_env, a, gen_helper_gvec_bfdot_idx) static bool trans_BFMLAL_vi(DisasContext *s, arg_qrrx_e *a) diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c index 
454380f01d7..7de157c539c 100644 --- a/target/arm/tcg/translate-neon.c +++ b/target/arm/tcg/translate-neon.c @@ -391,8 +391,8 @@ static bool trans_VDOT_b16_scal(DisasContext *s, arg_VDOT_b16_scal *a) if (!dc_isar_feature(aa32_bf16, s)) { return false; } - return do_neon_ddda(s, a->q * 6, a->vd, a->vn, a->vm, a->index, - gen_helper_gvec_bfdot_idx); + return do_neon_ddda_env(s, a->q * 6, a->vd, a->vn, a->vm, a->index, + gen_helper_gvec_bfdot_idx); } static bool trans_VFML_scalar(DisasContext *s, arg_VFML_scalar *a) diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c index 4fb0bd077b4..8876d1f91a9 100644 --- a/target/arm/tcg/translate-sve.c +++ b/target/arm/tcg/translate-sve.c @@ -251,6 +251,12 @@ static bool gen_gvec_env_arg_zzzz(DisasContext *s, gen_helper_gvec_4_ptr *fn, return gen_gvec_env_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, data); } +static bool gen_gvec_env_arg_zzxz(DisasContext *s, gen_helper_gvec_4_ptr *fn, + arg_rrxr_esz *a) +{ + return gen_gvec_env_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, a->index); +} + /* Invoke an out-of-line helper on 4 Zregs, 1 Preg, plus fpst. 
*/ static bool gen_gvec_fpst_zzzzp(DisasContext *s, gen_helper_gvec_5_ptr *fn, int rd, int rn, int rm, int ra, int pg, @@ -7114,7 +7120,7 @@ TRANS_FEAT_NONSTREAMING(UMMLA, aa64_sve_i8mm, gen_gvec_ool_arg_zzzz, TRANS_FEAT(BFDOT_zzzz, aa64_sve_bf16, gen_gvec_env_arg_zzzz, gen_helper_gvec_bfdot, a, 0) -TRANS_FEAT(BFDOT_zzxz, aa64_sve_bf16, gen_gvec_ool_arg_zzxz, +TRANS_FEAT(BFDOT_zzxz, aa64_sve_bf16, gen_gvec_env_arg_zzxz, gen_helper_gvec_bfdot_idx, a) TRANS_FEAT_NONSTREAMING(BFMMLA, aa64_sve_bf16, gen_gvec_ool_arg_zzzz, diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c index 37aad4be4b0..1edde9792f0 100644 --- a/target/arm/tcg/vec_helper.c +++ b/target/arm/tcg/vec_helper.c @@ -2828,7 +2828,7 @@ void HELPER(gvec_bfdot)(void *vd, void *vn, void *vm, void *va, } void HELPER(gvec_bfdot_idx)(void *vd, void *vn, void *vm, - void *va, uint32_t desc) + void *va, void *envp, uint32_t desc) { intptr_t i, j, opr_sz = simd_oprsz(desc); intptr_t index = simd_data(desc); From patchwork Tue Jul 30 16:03:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 815312 Delivered-To: patch@linaro.org Received: by 2002:a5d:4acf:0:b0:367:895a:4699 with SMTP id y15csp365783wrs; Tue, 30 Jul 2024 09:04:47 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCXJuJ4a3Aeocs3vZNkY+tJFArkSsZt+0wRNLL+3ED5wDHZI15RRvtR66gL5w4jC7oO4Ln7putWH3c7DyGR4plRS X-Google-Smtp-Source: AGHT+IH9FDJgFGTiN9f8UK97Jw4zpKau0OfBd9APG7+5wKh2pj7m4Jl4mPv0/i99iHpunjCCirjm X-Received: by 2002:ac8:5a08:0:b0:446:5568:a6dd with SMTP id d75a77b69052e-45004d6fea4mr130592451cf.7.1722355487528; Tue, 30 Jul 2024 09:04:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1722355487; cv=none; d=google.com; s=arc-20160816; b=F8wbIO4rjur6+1nn2rlsJWa8G39X4bHqGe7TEov1yDccZC7UjP9uHf6zkfvKWw3wIM 6Qt2+p4THo+ykU8HImTg4ny3D7kloBwNeth120+uqzj4XxLSPUWWdIQjM9tXTz1D627D qVNn0TiOQODHBqZ4J3QPnv7MPLHzDy/z0CsANa9t840la0dYLyCIfoOgfxf7q9JAfaDF 
From: Peter Maydell
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 5/8] target/arm: Pass env pointer through to gvec_bfmmla helper
Date: Tue, 30 Jul 2024 17:03:03 +0100
Message-Id: <20240730160306.2959745-6-peter.maydell@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240730160306.2959745-1-peter.maydell@linaro.org>
References: <20240730160306.2959745-1-peter.maydell@linaro.org>

Pass the env pointer through to the gvec_bfmmla helper, so we can
use it to add support for FEAT_EBF16.

Signed-off-by: Peter Maydell
Reviewed-by: Richard Henderson
---
 target/arm/helper.h             | 4 ++--
 target/arm/tcg/translate-a64.c  | 2 +-
 target/arm/tcg/translate-neon.c | 4 ++--
 target/arm/tcg/translate-sve.c  | 2 +-
 target/arm/tcg/vec_helper.c     | 3 ++-
 5 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 386cf8686ea..93b830d2cce 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -1032,8 +1032,8 @@ DEF_HELPER_FLAGS_6(gvec_bfdot, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_6(gvec_bfdot_idx, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, ptr, i32)

-DEF_HELPER_FLAGS_5(gvec_bfmmla, TCG_CALL_NO_RWG,
-                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(gvec_bfmmla, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)

 DEF_HELPER_FLAGS_6(gvec_bfmlal, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index a4e9740c921..33d49f524f4 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -5630,7 +5630,7 @@ TRANS_FEAT(SDOT_v, aa64_dp, do_dot_vector, a, gen_helper_gvec_sdot_b)
 TRANS_FEAT(UDOT_v, aa64_dp, do_dot_vector, a, gen_helper_gvec_udot_b)
 TRANS_FEAT(USDOT_v, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_usdot_b)
 TRANS_FEAT(BFDOT_v, aa64_bf16, do_dot_vector_env, a, gen_helper_gvec_bfdot)
-TRANS_FEAT(BFMMLA, aa64_bf16, do_dot_vector, a, gen_helper_gvec_bfmmla)
+TRANS_FEAT(BFMMLA, aa64_bf16, do_dot_vector_env, a, gen_helper_gvec_bfmmla)
 TRANS_FEAT(SMMLA, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_smmla_b)
 TRANS_FEAT(UMMLA, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_ummla_b)
 TRANS_FEAT(USMMLA, aa64_i8mm, do_dot_vector, a, gen_helper_gvec_usmmla_b)
diff --git a/target/arm/tcg/translate-neon.c b/target/arm/tcg/translate-neon.c
index 7de157c539c..13cd31aad42 100644
--- a/target/arm/tcg/translate-neon.c
+++ b/target/arm/tcg/translate-neon.c
@@ -3730,8 +3730,8 @@ static bool trans_VMMLA_b16(DisasContext *s, arg_VMMLA_b16 *a)
     if (!dc_isar_feature(aa32_bf16, s)) {
         return false;
     }
-    return do_neon_ddda(s, 7, a->vd, a->vn, a->vm, 0,
-                        gen_helper_gvec_bfmmla);
+    return do_neon_ddda_env(s, 7, a->vd, a->vn, a->vm, 0,
+                            gen_helper_gvec_bfmmla);
 }

 static bool trans_VFMA_b16(DisasContext *s, arg_VFMA_b16 *a)
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 8876d1f91a9..95e938662ed 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -7123,7 +7123,7 @@ TRANS_FEAT(BFDOT_zzzz, aa64_sve_bf16, gen_gvec_env_arg_zzzz,
 TRANS_FEAT(BFDOT_zzxz, aa64_sve_bf16, gen_gvec_env_arg_zzxz,
            gen_helper_gvec_bfdot_idx, a)

-TRANS_FEAT_NONSTREAMING(BFMMLA, aa64_sve_bf16, gen_gvec_ool_arg_zzzz,
+TRANS_FEAT_NONSTREAMING(BFMMLA, aa64_sve_bf16, gen_gvec_env_arg_zzzz,
                         gen_helper_gvec_bfmmla, a, 0)

 static bool do_BFMLAL_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sel)
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index 1edde9792f0..77efb5f47d8 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -2847,7 +2847,8 @@ void HELPER(gvec_bfdot_idx)(void *vd, void *vn, void *vm,
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }

-void HELPER(gvec_bfmmla)(void *vd, void *vn, void *vm, void *va, uint32_t desc)
+void HELPER(gvec_bfmmla)(void *vd, void *vn, void *vm, void *va,
+                         void *envp, uint32_t desc)
 {
     intptr_t s, opr_sz = simd_oprsz(desc);
     float32 *d = vd, *a = va;

From patchwork Tue Jul 30 16:03:04 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 815313
From: Peter Maydell
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 6/8] target/arm: Prepare bfdotadd() callers for FEAT_EBF support
Date: Tue, 30 Jul 2024 17:03:04 +0100
Message-Id: <20240730160306.2959745-7-peter.maydell@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240730160306.2959745-1-peter.maydell@linaro.org>
References: <20240730160306.2959745-1-peter.maydell@linaro.org>

We use bfdotadd() in four callsites for various helper functions. Currently
this all assumes that we have the FPCR.EBF=0 semantics. For FPCR.EBF=1
we will need to:
 * call a different routine to bfdotadd() because we need to do a
   fused multiply-add rather than separate multiply and add steps
 * use a different float_status that honours the FPCR rounding mode
   and denormal-flushing fields
 * pass in an extra float_status that has been set up to perform
   round-to-odd rounding

To prepare for this, refactor all the callsites so that instead of
   for (...) {
       x = bfdotadd(...);
   }

they are:
   float_status fpst, fpst_odd;
   if (is_ebf(env, &fpst, &fpst_odd)) {
       for (...) {
           x = bfdotadd_ebf(..., &fpst, &fpst_odd);
       }
   } else {
       for (...) {
           x = bfdotadd(..., &fpst);
       }
   }

For the moment the is_ebf() function always returns false, sets up
fpst for EBF=0 semantics and never sets up fpst_odd; bfdotadd_ebf()
will assert if called. We'll fill in the handling for EBF=1 in the
next commit.

This change should be a zero-behaviour-change refactor.

Signed-off-by: Peter Maydell
Reviewed-by: Richard Henderson
---
 target/arm/tcg/vec_internal.h |  37 ++++++++-
 target/arm/tcg/sme_helper.c   |  74 ++++++++++++------
 target/arm/tcg/vec_helper.c   | 141 +++++++++++++++++++++++++---------
 3 files changed, 192 insertions(+), 60 deletions(-)

diff --git a/target/arm/tcg/vec_internal.h b/target/arm/tcg/vec_internal.h
index 3ca1b94ccf9..094f5c169ca 100644
--- a/target/arm/tcg/vec_internal.h
+++ b/target/arm/tcg/vec_internal.h
@@ -223,13 +223,46 @@ int64_t do_sqrdmlah_d(int64_t, int64_t, int64_t, bool, bool);
  * bfdotadd:
  * @sum: addend
  * @e1, @e2: multiplicand vectors
+ * @fpst: floating-point status to use
  *
  * BFloat16 2-way dot product of @e1 & @e2, accumulating with @sum.
  * The @e1 and @e2 operands correspond to the 32-bit source vector
  * slots and contain two Bfloat16 values each.
  *
- * Corresponds to the ARM pseudocode function BFDotAdd.
+ * Corresponds to the ARM pseudocode function BFDotAdd, specialized
+ * for the FPCR.EBF == 0 case.
  */
-float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2);
+float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2, float_status *fpst);
+/**
+ * bfdotadd_ebf:
+ * @sum: addend
+ * @e1, @e2: multiplicand vectors
+ * @fpst: floating-point status to use
+ * @fpst_odd: floating-point status to use for round-to-odd operations
+ *
+ * BFloat16 2-way dot product of @e1 & @e2, accumulating with @sum.
+ * The @e1 and @e2 operands correspond to the 32-bit source vector
+ * slots and contain two Bfloat16 values each.
+ *
+ * Corresponds to the ARM pseudocode function BFDotAdd, specialized
+ * for the FPCR.EBF == 1 case.
+ */
+float32 bfdotadd_ebf(float32 sum, uint32_t e1, uint32_t e2,
+                     float_status *fpst, float_status *fpst_odd);
+
+/**
+ * is_ebf:
+ * @env: CPU state
+ * @statusp: pointer to floating point status to fill in
+ * @oddstatusp: pointer to floating point status to fill in for round-to-odd
+ *
+ * Determine whether a BFDotAdd operation should use FPCR.EBF = 0
+ * or FPCR.EBF = 1 semantics. On return, has initialized *statusp
+ * and *oddstatusp to suitable float_status arguments to use with either
+ * bfdotadd() or bfdotadd_ebf().
+ * Returns true for EBF = 1, false for EBF = 0. (The caller should use this
+ * to decide whether to call bfdotadd() or bfdotadd_ebf().)
+ */
+bool is_ebf(CPUARMState *env, float_status *statusp, float_status *oddstatusp);

 #endif /* TARGET_ARM_VEC_INTERNAL_H */
diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c
index f172225b2f2..e3fbfa98fa5 100644
--- a/target/arm/tcg/sme_helper.c
+++ b/target/arm/tcg/sme_helper.c
@@ -1086,32 +1086,62 @@ void HELPER(sme_bfmopa)(CPUARMState *env, void *vza, void *vzn, void *vzm,
     intptr_t row, col, oprsz = simd_maxsz(desc);
     uint32_t neg = simd_data(desc) * 0x80008000u;
     uint16_t *pn = vpn, *pm = vpm;
+    float_status fpst, fpst_odd;

-    for (row = 0; row < oprsz; ) {
-        uint16_t prow = pn[H2(row >> 4)];
-        do {
-            void *vza_row = vza + tile_vslice_offset(row);
-            uint32_t n = *(uint32_t *)(vzn + H1_4(row));
+    if (is_ebf(env, &fpst, &fpst_odd)) {
+        for (row = 0; row < oprsz; ) {
+            uint16_t prow = pn[H2(row >> 4)];
+            do {
+                void *vza_row = vza + tile_vslice_offset(row);
+                uint32_t n = *(uint32_t *)(vzn + H1_4(row));

-            n = f16mop_adj_pair(n, prow, neg);
+                n = f16mop_adj_pair(n, prow, neg);

-            for (col = 0; col < oprsz; ) {
-                uint16_t pcol = pm[H2(col >> 4)];
-                do {
-                    if (prow & pcol & 0b0101) {
-                        uint32_t *a = vza_row + H1_4(col);
-                        uint32_t m = *(uint32_t *)(vzm + H1_4(col));
+                for (col = 0; col < oprsz; ) {
+                    uint16_t pcol = pm[H2(col >> 4)];
+                    do {
+                        if (prow & pcol & 0b0101) {
+                            uint32_t *a = vza_row + H1_4(col);
+                            uint32_t m = *(uint32_t *)(vzm + H1_4(col));

-                        m = f16mop_adj_pair(m, pcol, 0);
-                        *a = bfdotadd(*a, n, m);
-                    }
-                    col += 4;
-                    pcol >>= 4;
-                } while (col & 15);
-            }
-            row += 4;
-            prow >>= 4;
-        } while (row & 15);
+                            m = f16mop_adj_pair(m, pcol, 0);
+                            *a = bfdotadd_ebf(*a, n, m, &fpst, &fpst_odd);
+                        }
+                        col += 4;
+                        pcol >>= 4;
+                    } while (col & 15);
+                }
+                row += 4;
+                prow >>= 4;
+            } while (row & 15);
+        }
+    } else {
+        for (row = 0; row < oprsz; ) {
+            uint16_t prow = pn[H2(row >> 4)];
+            do {
+                void *vza_row = vza + tile_vslice_offset(row);
+                uint32_t n = *(uint32_t *)(vzn + H1_4(row));
+
+                n = f16mop_adj_pair(n, prow, neg);
+
+                for (col = 0; col < oprsz; ) {
+                    uint16_t pcol = pm[H2(col >> 4)];
+                    do {
+                        if (prow & pcol & 0b0101) {
+                            uint32_t *a = vza_row + H1_4(col);
+                            uint32_t m = *(uint32_t *)(vzm + H1_4(col));
+
+                            m = f16mop_adj_pair(m, pcol, 0);
+                            *a = bfdotadd(*a, n, m, &fpst);
+                        }
+                        col += 4;
+                        pcol >>= 4;
+                    } while (col & 15);
+                }
+                row += 4;
+                prow >>= 4;
+            } while (row & 15);
+        }
     }
 }

diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index 77efb5f47d8..baf04a0561b 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -2790,7 +2790,7 @@ DO_MMLA_B(gvec_usmmla_b, do_usmmla_b)
  * BFloat16 Dot Product
  */

-float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2)
+bool is_ebf(CPUARMState *env, float_status *statusp, float_status *oddstatusp)
 {
     /* FPCR is ignored for BFDOT and BFMMLA. */
     float_status bf_status = {
@@ -2800,29 +2800,50 @@ float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2)
         .flush_inputs_to_zero = true,
         .default_nan_mode = true,
     };
+
+    *statusp = bf_status;
+    return false;
+}
+
+float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2, float_status *fpst)
+{
     float32 t1, t2;

     /*
      * Extract each BFloat16 from the element pair, and shift
      * them such that they become float32.
      */
-    t1 = float32_mul(e1 << 16, e2 << 16, &bf_status);
-    t2 = float32_mul(e1 & 0xffff0000u, e2 & 0xffff0000u, &bf_status);
-    t1 = float32_add(t1, t2, &bf_status);
-    t1 = float32_add(sum, t1, &bf_status);
+    t1 = float32_mul(e1 << 16, e2 << 16, fpst);
+    t2 = float32_mul(e1 & 0xffff0000u, e2 & 0xffff0000u, fpst);
+    t1 = float32_add(t1, t2, fpst);
+    t1 = float32_add(sum, t1, fpst);

     return t1;
 }

+float32 bfdotadd_ebf(float32 sum, uint32_t e1, uint32_t e2,
+                     float_status *fpst, float_status *fpst_odd)
+{
+    g_assert_not_reached();
+}
+
 void HELPER(gvec_bfdot)(void *vd, void *vn, void *vm, void *va,
                         void *envp, uint32_t desc)
 {
+    CPUARMState *env = envp;
     intptr_t i, opr_sz = simd_oprsz(desc);
     float32 *d = vd, *a = va;
     uint32_t *n = vn, *m = vm;
+    float_status fpst, fpst_odd;

-    for (i = 0; i < opr_sz / 4; ++i) {
-        d[i] = bfdotadd(a[i], n[i], m[i]);
+    if (is_ebf(env, &fpst, &fpst_odd)) {
+        for (i = 0; i < opr_sz / 4; ++i) {
+            d[i] = bfdotadd_ebf(a[i], n[i], m[i], &fpst, &fpst_odd);
+        }
+    } else {
+        for (i = 0; i < opr_sz / 4; ++i) {
+            d[i] = bfdotadd(a[i], n[i], m[i], &fpst);
+        }
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
@@ -2830,18 +2851,30 @@ void HELPER(gvec_bfdot)(void *vd, void *vn, void *vm, void *va,
 void HELPER(gvec_bfdot_idx)(void *vd, void *vn, void *vm,
                             void *va, void *envp, uint32_t desc)
 {
+    CPUARMState *env = envp;
     intptr_t i, j, opr_sz = simd_oprsz(desc);
     intptr_t index = simd_data(desc);
     intptr_t elements = opr_sz / 4;
     intptr_t eltspersegment = MIN(16 / 4, elements);
     float32 *d = vd, *a = va;
     uint32_t *n = vn, *m = vm;
+    float_status fpst, fpst_odd;

-    for (i = 0; i < elements; i += eltspersegment) {
-        uint32_t m_idx = m[i + H4(index)];
+    if (is_ebf(env, &fpst, &fpst_odd)) {
+        for (i = 0; i < elements; i += eltspersegment) {
+            uint32_t m_idx = m[i + H4(index)];

-        for (j = i; j < i + eltspersegment; j++) {
-            d[j] = bfdotadd(a[j], n[j], m_idx);
+            for (j = i; j < i + eltspersegment; j++) {
+                d[j] = bfdotadd_ebf(a[j], n[j], m_idx, &fpst, &fpst_odd);
+            }
+        }
+    }
else { + for (i = 0; i < elements; i += eltspersegment) { + uint32_t m_idx = m[i + H4(index)]; + + for (j = i; j < i + eltspersegment; j++) { + d[j] = bfdotadd(a[j], n[j], m_idx, &fpst); + } } } clear_tail(d, opr_sz, simd_maxsz(desc)); @@ -2850,40 +2883,76 @@ void HELPER(gvec_bfdot_idx)(void *vd, void *vn, void *vm, void HELPER(gvec_bfmmla)(void *vd, void *vn, void *vm, void *va, void *envp, uint32_t desc) { + CPUARMState *env = envp; intptr_t s, opr_sz = simd_oprsz(desc); float32 *d = vd, *a = va; uint32_t *n = vn, *m = vm; + float_status fpst, fpst_odd; - for (s = 0; s < opr_sz / 4; s += 4) { - float32 sum00, sum01, sum10, sum11; + if (is_ebf(env, &fpst, &fpst_odd)) { + for (s = 0; s < opr_sz / 4; s += 4) { + float32 sum00, sum01, sum10, sum11; - /* - * Process the entire segment at once, writing back the - * results only after we've consumed all of the inputs. - * - * Key to indices by column: - * i j i k j k - */ - sum00 = a[s + H4(0 + 0)]; - sum00 = bfdotadd(sum00, n[s + H4(0 + 0)], m[s + H4(0 + 0)]); - sum00 = bfdotadd(sum00, n[s + H4(0 + 1)], m[s + H4(0 + 1)]); + /* + * Process the entire segment at once, writing back the + * results only after we've consumed all of the inputs. 
+ * + * Key to indices by column: + * i j i k j k + */ + sum00 = a[s + H4(0 + 0)]; + sum00 = bfdotadd_ebf(sum00, n[s + H4(0 + 0)], m[s + H4(0 + 0)], &fpst, &fpst_odd); + sum00 = bfdotadd_ebf(sum00, n[s + H4(0 + 1)], m[s + H4(0 + 1)], &fpst, &fpst_odd); - sum01 = a[s + H4(0 + 1)]; - sum01 = bfdotadd(sum01, n[s + H4(0 + 0)], m[s + H4(2 + 0)]); - sum01 = bfdotadd(sum01, n[s + H4(0 + 1)], m[s + H4(2 + 1)]); + sum01 = a[s + H4(0 + 1)]; + sum01 = bfdotadd_ebf(sum01, n[s + H4(0 + 0)], m[s + H4(2 + 0)], &fpst, &fpst_odd); + sum01 = bfdotadd_ebf(sum01, n[s + H4(0 + 1)], m[s + H4(2 + 1)], &fpst, &fpst_odd); - sum10 = a[s + H4(2 + 0)]; - sum10 = bfdotadd(sum10, n[s + H4(2 + 0)], m[s + H4(0 + 0)]); - sum10 = bfdotadd(sum10, n[s + H4(2 + 1)], m[s + H4(0 + 1)]); + sum10 = a[s + H4(2 + 0)]; + sum10 = bfdotadd_ebf(sum10, n[s + H4(2 + 0)], m[s + H4(0 + 0)], &fpst, &fpst_odd); + sum10 = bfdotadd_ebf(sum10, n[s + H4(2 + 1)], m[s + H4(0 + 1)], &fpst, &fpst_odd); - sum11 = a[s + H4(2 + 1)]; - sum11 = bfdotadd(sum11, n[s + H4(2 + 0)], m[s + H4(2 + 0)]); - sum11 = bfdotadd(sum11, n[s + H4(2 + 1)], m[s + H4(2 + 1)]); + sum11 = a[s + H4(2 + 1)]; + sum11 = bfdotadd_ebf(sum11, n[s + H4(2 + 0)], m[s + H4(2 + 0)], &fpst, &fpst_odd); + sum11 = bfdotadd_ebf(sum11, n[s + H4(2 + 1)], m[s + H4(2 + 1)], &fpst, &fpst_odd); - d[s + H4(0 + 0)] = sum00; - d[s + H4(0 + 1)] = sum01; - d[s + H4(2 + 0)] = sum10; - d[s + H4(2 + 1)] = sum11; + d[s + H4(0 + 0)] = sum00; + d[s + H4(0 + 1)] = sum01; + d[s + H4(2 + 0)] = sum10; + d[s + H4(2 + 1)] = sum11; + } + } else { + for (s = 0; s < opr_sz / 4; s += 4) { + float32 sum00, sum01, sum10, sum11; + + /* + * Process the entire segment at once, writing back the + * results only after we've consumed all of the inputs. 
+ * + * Key to indices by column: + * i j i k j k + */ + sum00 = a[s + H4(0 + 0)]; + sum00 = bfdotadd(sum00, n[s + H4(0 + 0)], m[s + H4(0 + 0)], &fpst); + sum00 = bfdotadd(sum00, n[s + H4(0 + 1)], m[s + H4(0 + 1)], &fpst); + + sum01 = a[s + H4(0 + 1)]; + sum01 = bfdotadd(sum01, n[s + H4(0 + 0)], m[s + H4(2 + 0)], &fpst); + sum01 = bfdotadd(sum01, n[s + H4(0 + 1)], m[s + H4(2 + 1)], &fpst); + + sum10 = a[s + H4(2 + 0)]; + sum10 = bfdotadd(sum10, n[s + H4(2 + 0)], m[s + H4(0 + 0)], &fpst); + sum10 = bfdotadd(sum10, n[s + H4(2 + 1)], m[s + H4(0 + 1)], &fpst); + + sum11 = a[s + H4(2 + 1)]; + sum11 = bfdotadd(sum11, n[s + H4(2 + 0)], m[s + H4(2 + 0)], &fpst); + sum11 = bfdotadd(sum11, n[s + H4(2 + 1)], m[s + H4(2 + 1)], &fpst); + + d[s + H4(0 + 0)] = sum00; + d[s + H4(0 + 1)] = sum01; + d[s + H4(2 + 0)] = sum10; + d[s + H4(2 + 1)] = sum11; + } } clear_tail(d, opr_sz, simd_maxsz(desc)); } From patchwork Tue Jul 30 16:03:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 815314 Delivered-To: patch@linaro.org Received: by 2002:a5d:4acf:0:b0:367:895a:4699 with SMTP id y15csp365827wrs; Tue, 30 Jul 2024 09:04:52 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCU7Jal800dLqJXUQEHUidzO0C1kyJ/vUFCCvQLW3LAQ84AXPbCNHLIp/BdA6yTb48YbcBBDBC+AdoAQZbgADWht X-Google-Smtp-Source: AGHT+IFZYJe9WLgTLEdHAXkJ3bA46PVtwlJ4y4rcHFbv5V1EW/CdGQVD/OuK+ow57y2pM2vIZPRq X-Received: by 2002:a05:6214:1311:b0:6b0:86ab:fe89 with SMTP id 6a1803df08f44-6bb55a8df3dmr164785056d6.33.1722355492226; Tue, 30 Jul 2024 09:04:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1722355492; cv=none; d=google.com; s=arc-20160816; b=aRPb4p07PkaV4G4aCif382So1DK5IBSZ6qF4NMPEoRKOeuzzwp+FiuyiR6jUdpy3zB NJnNGQ5DNpnEPG7zw2pAnpAjM1iv1VbhsDXbgmyue19qsDrB0Pd8/qdyQ+gBnQI3ELZg EWUqsiJKUvnoKScOAip9t+qZtjNj7OU3hZdkghTn5wMh69I0Y8Shp9Q1pYuptOAHTrqY +saIoULZvoCv1//VGD7W62lJdbGb97qJ690XVsM1YmxpEhKDwGN3rE4Mw/49REsgmkFx 
DbF0Xu3EGPWHjlcg6wXy0LNw4sf9sZCJfB3DoMzftwwpXnoY3xSOkliPpcrYClGtIvd4 fcgQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=iDXGs+7guPqwtDoRL/y+LlQlY9IQZ4XWQ35haRtARdg=; fh=7myv0VtI++P9bBhxG9jh0i3zVi83RaUmxq8CNWe7lPw=; b=pL3297awdUQhN4gya/vD/tdoLKxqqy39rXm+pyme6d9oPlt7KsLPvciiYKM35fn7HU IT6aLnbCoppD8nze19hECFlFC+25llG7JpTfv18dZNlVroSUKVyT9Swh/arRLQ5SKnxk 1g3j9BCD/72pMkZ6yy+F4Ta81uM1aQwD69c99p3TDqvd3+NsSJYkR4t9/8w6fPuN/Ff9 7Yc2oNXwjyKW8BJZBMK5/txGMVZYlsQcu6YOajTdx2ZCC22cjGEM15l3/lSWspNDc4iM 6wvF+hYw1Gz7bWC2OI4gHVZCxYcdfEnu3tOlDE/b+UzXrLFgDkUhdfCfu78lylrC+4EH hRiw==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=eITtdWou; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org; dara=neutral header.i=@linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. 
[209.51.188.17]) by mx.google.com with ESMTPS id 6a1803df08f44-6bb3fa881acsi131088036d6.190.2024.07.30.09.04.52 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Tue, 30 Jul 2024 09:04:52 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=eITtdWou; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org; dara=neutral header.i=@linaro.org Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1sYpJu-00040M-Bv; Tue, 30 Jul 2024 12:03:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1sYpJs-0003pv-OS for qemu-devel@nongnu.org; Tue, 30 Jul 2024 12:03:24 -0400 Received: from mail-wm1-x32d.google.com ([2a00:1450:4864:20::32d]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1sYpJp-0000I1-1O for qemu-devel@nongnu.org; Tue, 30 Jul 2024 12:03:24 -0400 Received: by mail-wm1-x32d.google.com with SMTP id 5b1f17b1804b1-428163f7635so27352085e9.2 for ; Tue, 30 Jul 2024 09:03:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1722355399; x=1722960199; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=iDXGs+7guPqwtDoRL/y+LlQlY9IQZ4XWQ35haRtARdg=; b=eITtdWourb0qB8xke3jBw6nplm/fK6GREgpsvwHe6njMIgwxpTUgSDsRJ8EoPUcth4 
u2vL/Kxm5e57ShgOxBqlVZ9LS8VG1Pf5Pq7fQolmxTAZqQpTj0+UksUWuA2ibHnwB/Kr iXRH/NojnLFaketdiclGYtKFFa9q8y9Yvo6KIVDqV3ygI7WKu2J5N41Yhroq9COpyKHV GM/lHSuhYJNEFhftYQXNXJVhZJTFKda/+gSyLwK30NpeSrTJH59IZYKy3Cbw6xL5iRby vPc+oWPSMp1LmQ2uGqsMoq1rDqrkSXjCKSTXoXGKSqSzy8nalCheyWpRSkrYUjXecZnP Dx5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1722355399; x=1722960199; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=iDXGs+7guPqwtDoRL/y+LlQlY9IQZ4XWQ35haRtARdg=; b=wQfSv49EVE5RCQE7AzJIl/rrbLZU+0hen6GcN5e533RWyOv15Ge8SQtkedT+9JdYMp 0xB+oFnfZo/mMKKowfA8ftgp0Jp6UGWXo9jJ24+liNbhE4L0I+dcYMyxX1hDotmpcrkR dDSYzSCMdBnePvamhSHgCrcfoJJHR7CxrGtZE5Ifsww2sR2Dw8SN8NRxXxYThxTuF0TM npO+1Nrxr/KiHdD2ekK6HoARmZK4JrjDZo7tgakpE07Ka52ONkQDbC/TAX94mV+3zhpR hZGzMM7TzPrQlxJE8fOnOhkKnhqi3pPi+N1xKu52kz5pSYd8MzhmbwnuydOwCcBFmMQI hDbQ== X-Forwarded-Encrypted: i=1; AJvYcCWOIsp40flcWDeNmpQSZZH1kAHN8uz5HxogUecsZysFWA+8FoffEayXVnDIV1Hg2ugbIFIlEwWu3Q4GXNg10gqxUVgDe/M= X-Gm-Message-State: AOJu0YzbuXiqIfYrbRf7Qfk08VY2ZomPSuWEmioL73ZtdYBds/LrxGLV jmuEZNQe0UefnoaF9mgd4kVh/bt71mW9zY1SbNZclpDjm4Tv6jQZob1KzONY6mckdcGFXOTiohb + X-Received: by 2002:a05:600c:468e:b0:426:66a2:b200 with SMTP id 5b1f17b1804b1-42811a8f351mr84164095e9.0.1722355399506; Tue, 30 Jul 2024 09:03:19 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[2001:8b0:1d0::2]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-36b3685810csm15001676f8f.71.2024.07.30.09.03.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Jul 2024 09:03:19 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH 7/8] target/arm: Implement FPCR.EBF=1 semantics for bfdotadd() Date: Tue, 30 Jul 2024 17:03:05 +0100 Message-Id: <20240730160306.2959745-8-peter.maydell@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240730160306.2959745-1-peter.maydell@linaro.org> References: <20240730160306.2959745-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::32d; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x32d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org Implement the FPCR.EBF=1 semantics for bfdotadd() operations: * is_ebf() sets up fpst and fpst_odd * bfdotadd_ebf() implements the fused paired-multiply-and-add operation that we need The paired-multiply-and-add is similar to f16_dotadd() and we use the same trick here as in that function, but the inputs here are bfloat16 rather than float16. 
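
As a rough illustration of why the fused form matters, here is a host-C
sketch (not QEMU code) of the two shapes of the operation. All helper
names below are hypothetical. It uses the host's float and double
arithmetic instead of softfloat, so it ignores the FPCR rounding mode,
denormal flushing and default-NaN handling that the real helpers set up
in float_status. The "fused" variant simply sums the two products in
double, which can double-round in rare cases; the patch avoids that by
doing the first multiply in round-to-odd and using float64r32_muladd().

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a 32-bit pattern as a host float. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* A bfloat16 is the top half of a float32, so widening is exact. */
static float bf16_lo(uint32_t pair) { return bits_to_float(pair << 16); }
static float bf16_hi(uint32_t pair) { return bits_to_float(pair & 0xffff0000u); }

/*
 * EBF=0-like shape: every product and sum is rounded to float32.
 * The volatile temporaries force each intermediate down to float.
 */
static float dotadd_unfused(float sum, uint32_t e1, uint32_t e2)
{
    volatile float t1 = bf16_lo(e1) * bf16_lo(e2);
    volatile float t2 = bf16_hi(e1) * bf16_hi(e2);
    volatile float t = t1 + t2;
    return sum + t;
}

/*
 * EBF=1-like shape: the products are kept exact (they fit in double)
 * and only their sum is rounded to float32. The final accumulation
 * into sum is a separate, unfused add, as in the patch.
 */
static float dotadd_fused(float sum, uint32_t e1, uint32_t e2)
{
    double p1 = (double)bf16_lo(e1) * (double)bf16_lo(e2);
    double p2 = (double)bf16_hi(e1) * (double)bf16_hi(e2);
    float t = (float)(p1 + p2);
    return sum + t;
}
```

With the pairs e1 = 0xf1807180 (high = -2^100, low = 2^100) and
e2 = 0x71807180 (both 2^100), each product overflows float32, so the
unfused form produces inf + -inf = NaN, while the fused form sees the
products cancel exactly and returns the accumulator unchanged.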
Signed-off-by: Peter Maydell
Reviewed-by: Richard Henderson
---
 target/arm/tcg/vec_helper.c | 57 +++++++++++++++++++++++++++++++++++--
 1 file changed, 54 insertions(+), 3 deletions(-)

diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index baf04a0561b..64076c1c595 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -2792,7 +2792,20 @@ DO_MMLA_B(gvec_usmmla_b, do_usmmla_b)
 
 bool is_ebf(CPUARMState *env, float_status *statusp, float_status *oddstatusp)
 {
-    /* FPCR is ignored for BFDOT and BFMMLA. */
+    /*
+     * For BFDOT, BFMMLA, etc, the behaviour depends on FPCR.EBF.
+     * For EBF = 0, we ignore the FPCR bits which determine rounding
+     * mode and denormal-flushing, and we do unfused multiplies and
+     * additions with intermediate rounding of all products and sums.
+     * For EBF = 1, we honour FPCR rounding mode and denormal-flushing bits,
+     * and we perform a fused two-way sum-of-products without intermediate
+     * rounding of the products.
+     * In either case, we don't set fp exception flags.
+     *
+     * EBF is AArch64 only, so even if it's set in the FPCR it has
+     * no effect on AArch32 instructions.
+     */
+    bool ebf = is_a64(env) && env->vfp.fpcr & FPCR_EBF;
     float_status bf_status = {
         .tininess_before_rounding = float_tininess_before_rounding,
         .float_rounding_mode = float_round_to_odd_inf,
@@ -2801,8 +2814,19 @@ bool is_ebf(CPUARMState *env, float_status *statusp, float_status *oddstatusp)
         .default_nan_mode = true,
     };
 
+    if (ebf) {
+        float_status *fpst = &env->vfp.fp_status;
+        set_flush_to_zero(get_flush_to_zero(fpst), &bf_status);
+        set_flush_inputs_to_zero(get_flush_inputs_to_zero(fpst), &bf_status);
+        set_float_rounding_mode(get_float_rounding_mode(fpst), &bf_status);
+
+        /* EBF=1 needs to do a step with round-to-odd semantics */
+        *oddstatusp = bf_status;
+        set_float_rounding_mode(float_round_to_odd, oddstatusp);
+    }
+
     *statusp = bf_status;
-    return false;
+    return ebf;
 }
 
 float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2, float_status *fpst)
@@ -2824,7 +2848,34 @@ float32 bfdotadd(float32 sum, uint32_t e1, uint32_t e2, float_status *fpst)
 float32 bfdotadd_ebf(float32 sum, uint32_t e1, uint32_t e2,
                      float_status *fpst, float_status *fpst_odd)
 {
-    g_assert_not_reached();
+    /*
+     * Compare f16_dotadd() in sme_helper.c, but here we have
+     * bfloat16 inputs. In particular that means that we do not
+     * want the FPCR.FZ16 flush semantics, so we use the normal
+     * float_status for the input handling here.
+     */
+    float64 e1r = float32_to_float64(e1 << 16, fpst);
+    float64 e1c = float32_to_float64(e1 & 0xffff0000u, fpst);
+    float64 e2r = float32_to_float64(e2 << 16, fpst);
+    float64 e2c = float32_to_float64(e2 & 0xffff0000u, fpst);
+    float64 t64;
+    float32 t32;
+
+    /*
+     * The ARM pseudocode function FPDot performs both multiplies
+     * and the add with a single rounding operation. Emulate this
+     * by performing the first multiply in round-to-odd, then doing
+     * the second multiply as fused multiply-add, and rounding to
+     * float32 all in one step.
+     */
+    t64 = float64_mul(e1r, e2r, fpst_odd);
+    t64 = float64r32_muladd(e1c, e2c, t64, 0, fpst);
+
+    /* This conversion is exact, because we've already rounded. */
+    t32 = float64_to_float32(t64, fpst);
+
+    /* The final accumulation step is not fused. */
+    return float32_add(sum, t32, fpst);
 }
 
 void HELPER(gvec_bfdot)(void *vd, void *vn, void *vm, void *va,

From patchwork Tue Jul 30 16:03:06 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 815315
From: Peter Maydell
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 8/8] target/arm: Enable FEAT_EBF16 in the "max" CPU
Date: Tue, 30 Jul 2024 17:03:06 +0100
Message-Id: <20240730160306.2959745-9-peter.maydell@linaro.org>
In-Reply-To: <20240730160306.2959745-1-peter.maydell@linaro.org>
References: <20240730160306.2959745-1-peter.maydell@linaro.org>

Now that we've implemented the required behaviour for FEAT_EBF16,
we can enable it for the "max" CPU type, list it in our documentation,
and delete a TODO comment about it being missing.
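
For readers unfamiliar with the ID register encoding, the following is
a minimal standalone sketch of what bumping the BF16 fields from 1 to 2
means. The helpers are hypothetical stand-ins written for this note;
they are not QEMU's FIELD_DP64() macro or its isar_feature_*() tests.
The bit positions are taken from my reading of the Arm ARM
(ID_AA64ISAR1_EL1.BF16 at bits [47:44], ID_AA64ZFR0_EL1.BF16 at bits
[23:20]) and should be treated as assumptions to check against the
register descriptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field positions; verify against the Arm ARM. */
#define ISAR1_BF16_SHIFT 44   /* ID_AA64ISAR1_EL1.BF16, bits [47:44] */
#define ZFR0_BF16_SHIFT  20   /* ID_AA64ZFR0_EL1.BF16,  bits [23:20] */

/* Write a 4-bit ID field, leaving the other bits of reg untouched. */
static uint64_t field4_deposit(uint64_t reg, unsigned shift, uint64_t val)
{
    uint64_t mask = UINT64_C(0xf) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Read a 4-bit ID field. */
static unsigned field4_extract(uint64_t reg, unsigned shift)
{
    return (unsigned)((reg >> shift) & 0xf);
}

/* Field value 1 advertises FEAT_BF16; value 2 adds FEAT_EBF16. */
static bool has_bf16(uint64_t isar1)
{
    return field4_extract(isar1, ISAR1_BF16_SHIFT) >= 1;
}

static bool has_ebf16(uint64_t isar1)
{
    return field4_extract(isar1, ISAR1_BF16_SHIFT) >= 2;
}
```

A guest that reads the field as 1 would see plain BF16 support only;
with the value 2 set by this patch it may also set FPCR.EBF.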
Signed-off-by: Peter Maydell
Reviewed-by: Richard Henderson
---
 docs/system/arm/emulation.rst  | 1 +
 target/arm/tcg/cpu64.c         | 4 ++--
 target/arm/tcg/translate-sme.c | 1 -
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index 3ab6e726679..35f52a54b1c 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -45,6 +45,7 @@ the following architecture extensions:
 - FEAT_DotProd (Advanced SIMD dot product instructions)
 - FEAT_DoubleFault (Double Fault Extension)
 - FEAT_E0PD (Preventing EL0 access to halves of address maps)
+- FEAT_EBF16 (AArch64 Extended BFloat16 instructions)
 - FEAT_ECV (Enhanced Counter Virtualization)
 - FEAT_EL0 (Support for execution at EL0)
 - FEAT_EL1 (Support for execution at EL1)
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index fe232eb3069..79258a7c928 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -1160,7 +1160,7 @@ void aarch64_max_tcg_initfn(Object *obj)
     t = FIELD_DP64(t, ID_AA64ISAR1, FRINTTS, 1);  /* FEAT_FRINTTS */
     t = FIELD_DP64(t, ID_AA64ISAR1, SB, 1);       /* FEAT_SB */
     t = FIELD_DP64(t, ID_AA64ISAR1, SPECRES, 1);  /* FEAT_SPECRES */
-    t = FIELD_DP64(t, ID_AA64ISAR1, BF16, 1);     /* FEAT_BF16 */
+    t = FIELD_DP64(t, ID_AA64ISAR1, BF16, 2);     /* FEAT_BF16, FEAT_EBF16 */
     t = FIELD_DP64(t, ID_AA64ISAR1, DGH, 1);      /* FEAT_DGH */
     t = FIELD_DP64(t, ID_AA64ISAR1, I8MM, 1);     /* FEAT_I8MM */
     cpu->isar.id_aa64isar1 = t;
@@ -1244,7 +1244,7 @@ void aarch64_max_tcg_initfn(Object *obj)
     t = FIELD_DP64(t, ID_AA64ZFR0, SVEVER, 1);
     t = FIELD_DP64(t, ID_AA64ZFR0, AES, 2);       /* FEAT_SVE_PMULL128 */
     t = FIELD_DP64(t, ID_AA64ZFR0, BITPERM, 1);   /* FEAT_SVE_BitPerm */
-    t = FIELD_DP64(t, ID_AA64ZFR0, BFLOAT16, 1);  /* FEAT_BF16 */
+    t = FIELD_DP64(t, ID_AA64ZFR0, BFLOAT16, 2);  /* FEAT_BF16, FEAT_EBF16 */
     t = FIELD_DP64(t, ID_AA64ZFR0, SHA3, 1);      /* FEAT_SVE_SHA3 */
     t = FIELD_DP64(t, ID_AA64ZFR0, SM4, 1);       /* FEAT_SVE_SM4 */
     t = FIELD_DP64(t, ID_AA64ZFR0, I8MM, 1);      /* FEAT_I8MM */
diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index bcb502feb05..760c200e622 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -354,7 +354,6 @@ TRANS_FEAT(FMOPA_s, aa64_sme, do_outprod_fpst, a,
 TRANS_FEAT(FMOPA_d, aa64_sme_f64f64, do_outprod_fpst, a,
            MO_64, FPST_FPCR, gen_helper_sme_fmopa_d)
 
-/* TODO: FEAT_EBF16 */
 TRANS_FEAT(BFMOPA, aa64_sme, do_outprod_env, a, MO_32, gen_helper_sme_bfmopa)
 
 TRANS_FEAT(SMOPA_s, aa64_sme, do_outprod, a, MO_32, gen_helper_sme_smopa_s)