From patchwork Mon May 5 16:14:07 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiaqi Yan
X-Patchwork-Id: 887653
Date: Mon, 5 May 2025 16:14:07 +0000
In-Reply-To: <20250505161412.1926643-1-jiaqiyan@google.com>
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org
References: <20250505161412.1926643-1-jiaqiyan@google.com>
X-Mailer: git-send-email 2.49.0.967.g6a0df3ecc3-goog
Message-ID: <20250505161412.1926643-2-jiaqiyan@google.com>
Subject: [PATCH v1 1/6] KVM: arm64: VM exit to userspace to handle SEA
From: Jiaqi Yan
To: maz@kernel.org, oliver.upton@linux.dev
Cc: joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com,
 catalin.marinas@arm.com, will@kernel.org, pbonzini@redhat.com,
 corbet@lwn.net, shuah@kernel.org, kvm@vger.kernel.org,
 kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-kselftest@vger.kernel.org, duenwen@google.com, rananta@google.com,
 jthoughton@google.com, Jiaqi Yan

When APEI fails to handle a stage-2 abort that is a synchronous external
abort (SEA), KVM today directly injects an async SError into the VCPU and
then resumes it, which usually results in a guest kernel panic.

One major cause of guest SEA is the vCPU consuming a recoverable
uncorrected memory error (UER). Although the SError and the guest kernel
panic effectively stop the propagation of corrupted memory, there is
still room to recover from a memory UER in a more graceful manner.

Alternatively, KVM can redirect the synchronous SEA event to the VMM to
- Reduce the blast radius if possible. The VMM can inject a SEA into the
  VCPU via KVM's existing KVM_SET_VCPU_EVENTS API. If the memory poison
  consumption or fault is not from the guest kernel, the blast radius
  can be limited to the triggering thread in guest userspace, so the VM
  can keep running.
- Let the VMM protect against future memory poison consumption by
  unmapping the page from stage-2 with KVM userfault [1]. The VMM can
  also track the SEA events that the VM customer cares about, restart
  the VM when a certain number of distinct poison events have happened,
  and provide observability to customers [2].

Introduce the following userspace-visible features to let the VMM handle
SEA:
- KVM_CAP_ARM_SEA_TO_USER. As the alternative fallback behavior when
  host APEI fails to claim a SEA, userspace can opt in to this new
  capability to let KVM exit to userspace during a synchronous abort.
- KVM_EXIT_ARM_SEA. A new exit reason is introduced for this, and KVM
  fills kvm_run.arm_sea with as much information about the SEA as
  possible, including
  - ESR_EL2.
  - Whether the faulting guest virtual and physical addresses are
    available.
  - Faulting guest virtual address if available.
  - Faulting guest physical address if available.

[1] https://lpc.events/event/18/contributions/1757/attachments/1442/3073/LPC_%20KVM%20Userfault.pdf
[2] https://cloud.google.com/solutions/sap/docs/manage-host-errors

Signed-off-by: Jiaqi Yan
---
 arch/arm64/include/asm/kvm_emulate.h | 12 +++++++
 arch/arm64/include/asm/kvm_host.h    |  8 +++++
 arch/arm64/include/asm/kvm_ras.h     | 21 ++++-------
 arch/arm64/kvm/Makefile              |  3 +-
 arch/arm64/kvm/arm.c                 |  5 +++
 arch/arm64/kvm/kvm_ras.c             | 54 ++++++++++++++++++++++++++++
 arch/arm64/kvm/mmu.c                 | 12 ++-----
 include/uapi/linux/kvm.h             | 11 ++++++
 8 files changed, 101 insertions(+), 25 deletions(-)
 create mode 100644 arch/arm64/kvm/kvm_ras.c

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index bd020fc28aa9c..a9de30478a088 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -429,6 +429,18 @@ static __always_inline bool kvm_vcpu_abt_issea(const struct kvm_vcpu *vcpu)
 	}
 }
 
+/* Return true if FAR holds valid faulting guest virtual address. */
+static inline bool kvm_vcpu_sea_far_valid(const struct kvm_vcpu *vcpu)
+{
+	return !(kvm_vcpu_get_esr(vcpu) & ESR_ELx_FnV);
+}
+
+/* Return true if HPFAR_EL2 holds valid faulting guest physical address. */
+static inline bool kvm_vcpu_sea_ipa_valid(const struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.fault.hpfar_el2 & HPFAR_EL2_NS;
+}
+
 static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
 {
 	u64 esr = kvm_vcpu_get_esr(vcpu);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 73b7762b0e7d1..e0129f9799f80 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -342,6 +342,14 @@ struct kvm_arch {
 #define KVM_ARCH_FLAG_GUEST_HAS_SVE 9
 	/* MIDR_EL1, REVIDR_EL1, and AIDR_EL1 are writable from userspace */
 #define KVM_ARCH_FLAG_WRITABLE_IMP_ID_REGS 10
+	/*
+	 * When APEI failed to claim stage-2 synchronous external abort
+	 * (SEA) return to userspace with fault information. Userspace
+	 * can opt in this feature if KVM_CAP_ARM_SEA_TO_USER is
+	 * supported. Userspace is encouraged to handle this VM exit
+	 * by injecting a SEA to VCPU before resume the VCPU.
+	 */
+#define KVM_ARCH_FLAG_RETURN_SEA_TO_USER 11
 	unsigned long flags;
 
 	/* VM-wide vCPU feature set */
diff --git a/arch/arm64/include/asm/kvm_ras.h b/arch/arm64/include/asm/kvm_ras.h
index 9398ade632aaf..a2fd91af8f97e 100644
--- a/arch/arm64/include/asm/kvm_ras.h
+++ b/arch/arm64/include/asm/kvm_ras.h
@@ -4,22 +4,15 @@
 #ifndef __ARM64_KVM_RAS_H__
 #define __ARM64_KVM_RAS_H__
 
-#include
-#include
-#include
-
-#include
+#include
 
 /*
- * Was this synchronous external abort a RAS notification?
- * Returns '0' for errors handled by some RAS subsystem, or -ENOENT.
+ * Handle stage2 synchronous external abort (SEA) in the following order:
+ * 1. Delegate to APEI/GHES and if they can claim SEA, resume guest.
+ * 2. If userspace opt-ed in KVM_CAP_ARM_SEA_TO_USER, exit to userspace
+ *    with details about the SEA.
+ * 3. Otherwise, inject async SError into the VCPU and resume guest.
  */
-static inline int kvm_handle_guest_sea(void)
-{
-	/* apei_claim_sea(NULL) expects to mask interrupts itself */
-	lockdep_assert_irqs_enabled();
-
-	return apei_claim_sea(NULL);
-}
+int kvm_handle_guest_sea(struct kvm_vcpu *vcpu);
 
 #endif /* __ARM64_KVM_RAS_H__ */
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 209bc76263f10..785d568411e88 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -23,7 +23,8 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
 	 vgic/vgic-v3.o vgic/vgic-v4.o \
 	 vgic/vgic-mmio.o vgic/vgic-mmio-v2.o \
 	 vgic/vgic-mmio-v3.o vgic/vgic-kvm-device.o \
-	 vgic/vgic-its.o vgic/vgic-debug.o vgic/vgic-v3-nested.o
+	 vgic/vgic-its.o vgic/vgic-debug.o vgic/vgic-v3-nested.o \
+	 kvm_ras.o
 
 kvm-$(CONFIG_HW_PERF_EVENTS) += pmu-emul.o pmu.o
 kvm-$(CONFIG_ARM64_PTR_AUTH) += pauth.o
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 19ca57def6292..47544945fba45 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -133,6 +133,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		}
 		mutex_unlock(&kvm->lock);
 		break;
+	case KVM_CAP_ARM_SEA_TO_USER:
+		r = 0;
+		set_bit(KVM_ARCH_FLAG_RETURN_SEA_TO_USER, &kvm->arch.flags);
+		break;
 	default:
 		break;
 	}
@@ -322,6 +326,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_IRQFD_RESAMPLE:
 	case KVM_CAP_COUNTER_OFFSET:
 	case KVM_CAP_ARM_WRITABLE_IMP_ID_REGS:
+	case KVM_CAP_ARM_SEA_TO_USER:
 		r = 1;
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
diff --git a/arch/arm64/kvm/kvm_ras.c b/arch/arm64/kvm/kvm_ras.c
new file mode 100644
index 0000000000000..83f2731c95d77
--- /dev/null
+++ b/arch/arm64/kvm/kvm_ras.c
@@ -0,0 +1,54 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+/*
+ * Was this synchronous external abort a RAS notification?
+ * Returns 0 for errors handled by some RAS subsystem, or -ENOENT.
+ */
+static int kvm_delegate_guest_sea(void)
+{
+	/* apei_claim_sea(NULL) expects to mask interrupts itself. */
+	lockdep_assert_irqs_enabled();
+	return apei_claim_sea(NULL);
+}
+
+int kvm_handle_guest_sea(struct kvm_vcpu *vcpu)
+{
+	struct kvm_run *run = vcpu->run;
+	bool exit = test_bit(KVM_ARCH_FLAG_RETURN_SEA_TO_USER,
+			     &vcpu->kvm->arch.flags);
+
+	/* For RAS the host kernel may handle this abort. */
+	if (kvm_delegate_guest_sea() == 0)
+		return 1;
+
+	if (!exit) {
+		/* Fallback behavior prior to KVM_EXIT_ARM_SEA. */
+		kvm_inject_vabt(vcpu);
+		return 1;
+	}
+
+	run->exit_reason = KVM_EXIT_ARM_SEA;
+	run->arm_sea.esr = kvm_vcpu_get_esr(vcpu);
+	run->arm_sea.flags = 0ULL;
+	run->arm_sea.gva = 0ULL;
+	run->arm_sea.gpa = 0ULL;
+
+	if (kvm_vcpu_sea_far_valid(vcpu)) {
+		run->arm_sea.flags |= KVM_EXIT_ARM_SEA_FLAG_GVA_VALID;
+		run->arm_sea.gva = kvm_vcpu_get_hfar(vcpu);
+	}
+
+	if (kvm_vcpu_sea_ipa_valid(vcpu)) {
+		run->arm_sea.flags |= KVM_EXIT_ARM_SEA_FLAG_GPA_VALID;
+		run->arm_sea.gpa = kvm_vcpu_get_fault_ipa(vcpu);
+	}
+
+	return 0;
+}
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 754f2fe0cc673..a605ee56fa150 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1795,16 +1795,8 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 	int ret, idx;
 
 	/* Synchronous External Abort? */
-	if (kvm_vcpu_abt_issea(vcpu)) {
-		/*
-		 * For RAS the host kernel may handle this abort.
-		 * There is no need to pass the error into the guest.
-		 */
-		if (kvm_handle_guest_sea())
-			kvm_inject_vabt(vcpu);
-
-		return 1;
-	}
+	if (kvm_vcpu_abt_issea(vcpu))
+		return kvm_handle_guest_sea(vcpu);
 
 	esr = kvm_vcpu_get_esr(vcpu);
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index b6ae8ad8934b5..79dc4676ff74b 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -178,6 +178,7 @@ struct kvm_xen_exit {
 #define KVM_EXIT_NOTIFY 37
 #define KVM_EXIT_LOONGARCH_IOCSR 38
 #define KVM_EXIT_MEMORY_FAULT 39
+#define KVM_EXIT_ARM_SEA 40
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -446,6 +447,15 @@ struct kvm_run {
 			__u64 gpa;
 			__u64 size;
 		} memory_fault;
+		/* KVM_EXIT_ARM_SEA */
+		struct {
+			__u64 esr;
+#define KVM_EXIT_ARM_SEA_FLAG_GVA_VALID (1ULL << 0)
+#define KVM_EXIT_ARM_SEA_FLAG_GPA_VALID (1ULL << 1)
+			__u64 flags;
+			__u64 gva;
+			__u64 gpa;
+		} arm_sea;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
@@ -930,6 +940,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
 #define KVM_CAP_X86_GUEST_MODE 238
 #define KVM_CAP_ARM_WRITABLE_IMP_ID_REGS 239
+#define KVM_CAP_ARM_SEA_TO_USER 240
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;

From patchwork Mon May 5 16:14:08 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiaqi Yan
X-Patchwork-Id: 888355
Date: Mon, 5 May 2025 16:14:08 +0000
In-Reply-To: <20250505161412.1926643-1-jiaqiyan@google.com>
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org
References: <20250505161412.1926643-1-jiaqiyan@google.com>
X-Mailer: git-send-email 2.49.0.967.g6a0df3ecc3-goog
Message-ID: <20250505161412.1926643-3-jiaqiyan@google.com>
Subject: [PATCH v1 2/6] KVM: arm64: Set FnV for VCPU when FAR_EL2 is invalid
From: Jiaqi Yan
To: maz@kernel.org, oliver.upton@linux.dev
Cc: joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com,
 catalin.marinas@arm.com, will@kernel.org, pbonzini@redhat.com,
 corbet@lwn.net, shuah@kernel.org, kvm@vger.kernel.org,
 kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-kselftest@vger.kernel.org, duenwen@google.com, rananta@google.com,
 jthoughton@google.com, Jiaqi Yan

Certain microarchitectures (e.g. Neoverse V2) do not keep track of the
faulting address for a memory load that consumes poisoned data and
results in a synchronous external abort (SEA). This means the faulting
guest physical address is unavailable when KVM handles such a SEA in
EL2, and FAR_EL2 just holds a garbage value.

In case the VMM later asks KVM to synchronously inject a SEA into the
guest, KVM should set the FnV bit
- in the VCPU's ESR_EL1, to let the guest kernel know that FAR_EL1 is
  invalid and holds a garbage value
- in the VCPU's ESR_EL2, to let nested virtualization know that FAR_EL2
  is invalid and holds a garbage value

Signed-off-by: Jiaqi Yan
---
 arch/arm64/kvm/inject_fault.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index a640e839848e6..b4f9a09952ead 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -81,6 +81,9 @@ static void inject_abt64(struct kvm_vcpu *vcpu, bool is_iabt, unsigned long addr
 	if (!is_iabt)
 		esr |= ESR_ELx_EC_DABT_LOW << ESR_ELx_EC_SHIFT;
 
+	if (!kvm_vcpu_sea_far_valid(vcpu))
+		esr |= ESR_ELx_FnV;
+
 	esr |= ESR_ELx_FSC_EXTABT;
 
 	if (match_target_el(vcpu, unpack_vcpu_flag(EXCEPT_AA64_EL1_SYNC))) {

From patchwork Mon May 5 16:14:09 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiaqi Yan
X-Patchwork-Id: 887652
Date: Mon, 5 May 2025 16:14:09 +0000
In-Reply-To: <20250505161412.1926643-1-jiaqiyan@google.com>
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org
References: <20250505161412.1926643-1-jiaqiyan@google.com>
X-Mailer: git-send-email 2.49.0.967.g6a0df3ecc3-goog
Message-ID: <20250505161412.1926643-4-jiaqiyan@google.com>
Subject: [PATCH v1 3/6] KVM: arm64: Allow userspace to inject external
 instruction aborts
From: Jiaqi Yan
To: maz@kernel.org, oliver.upton@linux.dev
Cc: joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com,
 catalin.marinas@arm.com, will@kernel.org, pbonzini@redhat.com,
 corbet@lwn.net, shuah@kernel.org, kvm@vger.kernel.org,
 kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-kselftest@vger.kernel.org, duenwen@google.com, rananta@google.com,
 jthoughton@google.com, Jiaqi Yan

From: Raghavendra Rao Ananta

When KVM returns to userspace for KVM_EXIT_ARM_SEA, userspace is
encouraged to inject the abort into the guest via KVM_SET_VCPU_EVENTS.

KVM_SET_VCPU_EVENTS currently only allows injecting external data
aborts. However, the synchronous external abort that caused
KVM_EXIT_ARM_SEA may also be an instruction abort. Userspace is already
able to tell whether an abort is due to data or instruction via
kvm_run.arm_sea.esr, by checking its Exception Class value.

Extend the KVM_SET_VCPU_EVENTS ioctl to allow injecting an instruction
abort into the guest.
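
As an illustration of the Exception Class check described above, a VMM
could pick the abort type roughly as follows. This is only a sketch under
stated assumptions: struct sea_events is a hypothetical local stand-in
that mirrors the exception fields of struct kvm_vcpu_events as proposed
in this patch, sea_fill_events() is an invented helper name, and the
result would still have to be copied into a real struct kvm_vcpu_events
and passed to KVM_SET_VCPU_EVENTS. The EC encodings (0x20 and 0x24 in
ESR bits [31:26]) are the architectural values for instruction and data
aborts from a lower exception level.

```c
#include <stdint.h>
#include <string.h>

/* Architectural ESR_ELx layout: Exception Class lives in bits [31:26]. */
#define SEA_ESR_EC_SHIFT	26
#define SEA_ESR_EC_MASK		0x3fULL
#define SEA_ESR_EC_IABT_LOW	0x20	/* instruction abort from a lower EL */
#define SEA_ESR_EC_DABT_LOW	0x24	/* data abort from a lower EL */

/*
 * Hypothetical stand-in for the "exception" member of struct
 * kvm_vcpu_events with the ext_iabt_pending field proposed here.
 */
struct sea_events {
	uint8_t serror_pending;
	uint8_t serror_has_esr;
	uint8_t ext_dabt_pending;
	uint8_t ext_iabt_pending;
	uint8_t pad[4];
	uint64_t serror_esr;
};

/*
 * Decide which external abort to request based on the ESR reported in
 * kvm_run.arm_sea.esr. Returns 0 on success, or -1 if the Exception
 * Class is neither a data nor an instruction abort from a lower EL.
 */
static int sea_fill_events(uint64_t esr, struct sea_events *ev)
{
	uint64_t ec = (esr >> SEA_ESR_EC_SHIFT) & SEA_ESR_EC_MASK;

	memset(ev, 0, sizeof(*ev));

	switch (ec) {
	case SEA_ESR_EC_DABT_LOW:
		ev->ext_dabt_pending = 1;
		return 0;
	case SEA_ESR_EC_IABT_LOW:
		ev->ext_iabt_pending = 1;
		return 0;
	default:
		return -1;
	}
}
```

Only one of the two pending fields is ever set, which matches the
-EINVAL check this patch adds for requests that set both.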

Signed-off-by: Jiaqi Yan
Signed-off-by: Raghavendra Rao Ananta
---
 arch/arm64/include/uapi/asm/kvm.h |  3 ++-
 arch/arm64/kvm/arm.c              |  1 +
 arch/arm64/kvm/guest.c            | 13 ++++++++++---
 include/uapi/linux/kvm.h          |  1 +
 4 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index ed5f3892674c7..643e8c4825451 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -184,8 +184,9 @@ struct kvm_vcpu_events {
 		__u8 serror_pending;
 		__u8 serror_has_esr;
 		__u8 ext_dabt_pending;
+		__u8 ext_iabt_pending;
 		/* Align it to 8 bytes */
-		__u8 pad[5];
+		__u8 pad[4];
 		__u64 serror_esr;
 	} exception;
 	__u32 reserved[12];
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 47544945fba45..dc2efb627f450 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -319,6 +319,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_ARM_IRQ_LINE_LAYOUT_2:
 	case KVM_CAP_ARM_NISV_TO_USER:
 	case KVM_CAP_ARM_INJECT_EXT_DABT:
+	case KVM_CAP_ARM_INJECT_EXT_IABT:
 	case KVM_CAP_SET_GUEST_DEBUG:
 	case KVM_CAP_VCPU_ATTRIBUTES:
 	case KVM_CAP_PTP_KVM:
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 2196979a24a32..4917361ecf5cb 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -825,9 +825,9 @@ int __kvm_arm_vcpu_get_events(struct kvm_vcpu *vcpu,
 		events->exception.serror_esr = vcpu_get_vsesr(vcpu);
 
 	/*
-	 * We never return a pending ext_dabt here because we deliver it to
-	 * the virtual CPU directly when setting the event and it's no longer
-	 * 'pending' at this point.
+	 * We never return a pending ext_dabt or ext_iabt here because we
+	 * deliver it to the virtual CPU directly when setting the event
+	 * and it's no longer 'pending' at this point.
 	 */
 
 	return 0;
@@ -839,6 +839,7 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
 	bool serror_pending = events->exception.serror_pending;
 	bool has_esr = events->exception.serror_has_esr;
 	bool ext_dabt_pending = events->exception.ext_dabt_pending;
+	bool ext_iabt_pending = events->exception.ext_iabt_pending;
 
 	if (serror_pending && has_esr) {
 		if (!cpus_have_final_cap(ARM64_HAS_RAS_EXTN))
@@ -852,8 +853,14 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
 		kvm_inject_vabt(vcpu);
 	}
 
+	/* DABT and IABT cannot happen at the same time. */
+	if (ext_dabt_pending && ext_iabt_pending)
+		return -EINVAL;
+
 	if (ext_dabt_pending)
 		kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
+	else if (ext_iabt_pending)
+		kvm_inject_pabt(vcpu, kvm_vcpu_get_hfar(vcpu));
 
 	return 0;
 }
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 79dc4676ff74b..bcf2b95b79123 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -941,6 +941,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_X86_GUEST_MODE 238
 #define KVM_CAP_ARM_WRITABLE_IMP_ID_REGS 239
 #define KVM_CAP_ARM_SEA_TO_USER 240
+#define KVM_CAP_ARM_INJECT_EXT_IABT 241
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;

From patchwork Mon May 5 16:14:10 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiaqi Yan
X-Patchwork-Id: 887651
b=jsmrJhQm0XQ4KeebZV3idOXVSjB8xjug271BpUOWmzNClyRaSzQHB8IkFVlnH0jQ+xgLfjPPXRPoc0h6khABSk2m7rhRevxCYEkcrs/gBL/IwXUcywbdkERi3W0hpxuBiwKF6FBEdXPKgpy5RXnbpCoZOwrSV4uUTkprIDTJLlc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746461665; c=relaxed/simple; bh=TA/VLKIsq4UUQwvxrC4ZVZvjB+NeMfRDMShqDmxRKko=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=A25C90vwWLRLkxta73+ndpGMNlA/0WzsOvvlsM9SGCmG5o8AKTUnK9KIyWOqSZIQn5jzNXd+LVjpPWnraHyyPy82efAE6GbcJ4YSubpIIZODuyMWPv/PpJC5QKQhHEQGDF6wUlebHTcfQPLnY+nFJWUi9QD1adNILapfQgI90Jo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=iyCaXB2j; arc=none smtp.client-ip=209.85.210.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="iyCaXB2j" Received: by mail-pf1-f202.google.com with SMTP id d2e1a72fcca58-736b2a25d9fso3052257b3a.0 for ; Mon, 05 May 2025 09:14:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1746461661; x=1747066461; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=yvZJykt7Uf4dVmRhJKmJi6MP1scq5omIxYcDpxOQw64=; b=iyCaXB2jUbX1a1VRSdqXQTNBuqFOQMsX0/aofnZG+zQlzOQAkkKwxfY97cO+S0azz8 11st0K4Fud+EvGocPWNmiLAIEOF71zSEMsnUefTHp2vys0bAQYfdc9bqGHThwZyJwQvd 2UhAnO+QZpAKK/qvt8CmXSaql3KtdR+eS/PvmV8TzV2NX6X0qY5skNd3jx0f5ORm/5iN 5e1XxqEohUi1HRnTyT5xxIFhGLRGeqiE+nbqnweo4w6bTxmcl6ZXeTgr6h6v4jVA6nkk 
4tsw1Q+GHrky7X8aLjf1al+wCrpzh4TCyHsQj022zsne7rebjCzObMRXa6tzcXfEKhfV TEaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1746461661; x=1747066461; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=yvZJykt7Uf4dVmRhJKmJi6MP1scq5omIxYcDpxOQw64=; b=GbYRQyo6X7w8dQrcGd5eavsTRxBrpWQQ7pmZRz9pTJjLydEy1zBg84uKnJ9EsUdmUr 5E6a9pQJkKKScJ7KYeJySdQp2c5K0Cjb2Kw8t+cWEdkrYgQEkyanzrIrRa2yPF0HfgJS zTfbdkyuGUx7xtoOjzboVpFWzyu4yzCK3i9RTadBkfO3n+8n1Q3tmfNbERtajPh98zC6 IZNk+JQTVlWY2+DwRwzcCzChu5SVgbVD2JMEWNPSn2SxpRrHvvQ7+qWU2bCT/tKFNeGa ErhrTJJm3X5OZVxl9cZ62Lzio1RUAdiTBsbOih6miDiT/N1kkqKCoIYrt51hPofkiOd1 zzdg== X-Forwarded-Encrypted: i=1; AJvYcCVKTUL+Zn1H7GliM/wDj13r/zuMpBievMqNR1vamaJz96tseNqgQNkOo5jIOLJbX53tqp0nIxPdkdB8WBcYnv4=@vger.kernel.org X-Gm-Message-State: AOJu0Ywe49DgHt/ljDVnUmti0LiR/1INoJem6Y59VbLy36lzpr7PYq+e ej1YtFqOTcm2t73EopoFrS0ciiE2AeZBg6OuSIBRlS2UFXaKlH9/TRAcwSPhWt3SUVvks9DZoSh mxoVmXtobHg== X-Google-Smtp-Source: AGHT+IHn8tBuCeZUAxg8gs/dBzpH1HCCE85DCDtaZxk6M+ZfRc5ag2yPjrWYgrHQohzrfHegE2cATmHScRjqUA== X-Received: from pghg13.prod.google.com ([2002:a63:e60d:0:b0:b1f:bc65:a8df]) (user=jiaqiyan job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a21:32a2:b0:1f5:8da5:ffe9 with SMTP id adf61e73a8af0-20e966057d7mr9762805637.12.1746461661200; Mon, 05 May 2025 09:14:21 -0700 (PDT) Date: Mon, 5 May 2025 16:14:10 +0000 In-Reply-To: <20250505161412.1926643-1-jiaqiyan@google.com> Precedence: bulk X-Mailing-List: linux-kselftest@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250505161412.1926643-1-jiaqiyan@google.com> X-Mailer: git-send-email 2.49.0.967.g6a0df3ecc3-goog Message-ID: <20250505161412.1926643-5-jiaqiyan@google.com> Subject: [PATCH v1 4/6] KVM: selftests: Test for KVM_EXIT_ARM_SEA and KVM_CAP_ARM_SEA_TO_USER From: Jiaqi Yan To: maz@kernel.org, oliver.upton@linux.dev Cc: joey.gouly@arm.com, 
suzuki.poulose@arm.com, yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org, pbonzini@redhat.com, corbet@lwn.net, shuah@kernel.org, kvm@vger.kernel.org, kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, duenwen@google.com, rananta@google.com, jthoughton@google.com, Jiaqi Yan Test how KVM handles a guest stage-2 SEA when APEI is unable to claim it. The SEA is triggered by consuming a recoverable memory error (UER) injected via EINJ. The test asserts two things: 1. KVM returns to userspace with the KVM_EXIT_ARM_SEA exit reason and provides correct fault information, e.g. esr, flags, gva, gpa. 2. Userspace is able to handle KVM_EXIT_ARM_SEA by injecting a SEA into the guest, and KVM injects the expected SEA into the VCPU. Tested on a data center server running a Siryn AmpereOne processor. Things to note before attempting to run this selftest: - The test relies on EINJ support in both firmware and kernel to inject the UER. Otherwise the test will be skipped. - The APEI of the platform under test must be unable to claim the SEA. Otherwise the test will be skipped. - Some platforms don't support notrigger in EINJ, which may cause APEI and GHES to offline the memory before the guest can consume the injected UER, making the test unable to trigger the SEA. 
Signed-off-by: Jiaqi Yan --- tools/testing/selftests/kvm/Makefile.kvm | 1 + .../testing/selftests/kvm/arm64/sea_to_user.c | 324 ++++++++++++++++++ tools/testing/selftests/kvm/lib/kvm_util.c | 1 + 3 files changed, 326 insertions(+) create mode 100644 tools/testing/selftests/kvm/arm64/sea_to_user.c diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm index f62b0a5aba35a..16d2e9f32619f 100644 --- a/tools/testing/selftests/kvm/Makefile.kvm +++ b/tools/testing/selftests/kvm/Makefile.kvm @@ -151,6 +151,7 @@ TEST_GEN_PROGS_arm64 += arm64/hypercalls TEST_GEN_PROGS_arm64 += arm64/mmio_abort TEST_GEN_PROGS_arm64 += arm64/page_fault_test TEST_GEN_PROGS_arm64 += arm64/psci_test +TEST_GEN_PROGS_arm64 += arm64/sea_to_user TEST_GEN_PROGS_arm64 += arm64/set_id_regs TEST_GEN_PROGS_arm64 += arm64/smccc_filter TEST_GEN_PROGS_arm64 += arm64/vcpu_width_config diff --git a/tools/testing/selftests/kvm/arm64/sea_to_user.c b/tools/testing/selftests/kvm/arm64/sea_to_user.c new file mode 100644 index 0000000000000..9490cdbad3466 --- /dev/null +++ b/tools/testing/selftests/kvm/arm64/sea_to_user.c @@ -0,0 +1,324 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Test that KVM returns to userspace with KVM_EXIT_ARM_SEA if host APEI fails + * to handle SEA and userspace has opted in to KVM_CAP_ARM_SEA_TO_USER. + * + * After reaching userspace with expected arm_sea info, also test userspace + * injecting a synchronous external data abort into the guest. + * + * This test utilizes EINJ to generate a REAL synchronous external data + * abort by consuming a recoverable uncorrectable memory error. Therefore + * the device under test must support EINJ in both firmware and host kernel, + * including the notrigger feature. Otherwise the test will be skipped. + * The under-test platform's APEI should be unable to claim SEA. Otherwise + * the test will also be skipped. 
+ */ + +#include +#include +#include +#include + +#include "test_util.h" +#include "kvm_util.h" +#include "processor.h" +#include "guest_modes.h" + +#define PAGE_PRESENT (1ULL << 63) +#define PAGE_PHYSICAL 0x007fffffffffffffULL +#define PAGE_ADDR_MASK (~(0xfffULL)) + +/* Value for "Recoverable state (UER)". */ +#define ESR_ELx_SET_UER 0U + +#define EINJ_ETYPE "/sys/kernel/debug/apei/einj/error_type" +#define EINJ_ADDR "/sys/kernel/debug/apei/einj/param1" +#define EINJ_MASK "/sys/kernel/debug/apei/einj/param2" +#define EINJ_FLAGS "/sys/kernel/debug/apei/einj/flags" +#define EINJ_NOTRIGGER "/sys/kernel/debug/apei/einj/notrigger" +#define EINJ_DOIT "/sys/kernel/debug/apei/einj/error_inject" +/* Memory Uncorrectable non-fatal. */ +#define ERROR_TYPE_MEMORY_UER 0x10 +/* Memory address and mask valid (param1 and param2). */ +#define MASK_MEMORY_UER 0b10 + +/* Guest virtual address region = [2G, 3G). */ +#define START_GVA 0x80000000UL +#define VM_MEM_SIZE 0x40000000UL +/* Note: EINJ_OFFSET must < VM_MEM_SIZE. 
*/ +#define EINJ_OFFSET 0x05234badUL +#define EINJ_GVA ((START_GVA) + (EINJ_OFFSET)) + +static vm_paddr_t einj_gpa; +static void *einj_hva; +static uint64_t einj_hpa; +static bool far_invalid; + +static uint64_t translate_to_host_paddr(unsigned long vaddr) +{ + uint64_t pinfo; + int64_t offset = vaddr / getpagesize() * sizeof(pinfo); + int fd; + uint64_t page_addr; + uint64_t paddr; + + fd = open("/proc/self/pagemap", O_RDONLY); + if (fd < 0) + ksft_exit_fail_perror("Failed to open /proc/self/pagemap"); + if (pread(fd, &pinfo, sizeof(pinfo), offset) != sizeof(pinfo)) { + close(fd); + ksft_exit_fail_perror("Failed to read /proc/self/pagemap"); + } + + close(fd); + + if ((pinfo & PAGE_PRESENT) == 0) + ksft_exit_fail_msg("Page not present\n"); + + page_addr = (pinfo & PAGE_PHYSICAL) * getpagesize(); + paddr = page_addr + (vaddr & (getpagesize() - 1)); + return paddr; +} + +static void write_einj_entry(const char *einj_path, uint64_t val) +{ + char cmd[256] = {0}; + FILE *cmdfile = NULL; + + snprintf(cmd, sizeof(cmd), "echo %#lx > %s", val, einj_path); + cmdfile = popen(cmd, "r"); + + if (cmdfile != NULL && pclose(cmdfile) == 0) + ksft_print_msg("echo %#lx > %s - done\n", val, einj_path); + else + ksft_exit_fail_perror("Failed to write EINJ entry"); +} + +static void inject_uer(uint64_t paddr) +{ + if (access("/sys/firmware/acpi/tables/EINJ", R_OK) == -1) + ksft_exit_skip("EINJ table not available in firmware\n"); + + if (access(EINJ_ETYPE, R_OK | W_OK) == -1) + ksft_exit_skip("EINJ module probably not loaded?\n"); + + write_einj_entry(EINJ_ETYPE, ERROR_TYPE_MEMORY_UER); + write_einj_entry(EINJ_FLAGS, MASK_MEMORY_UER); + write_einj_entry(EINJ_ADDR, paddr); + write_einj_entry(EINJ_MASK, ~0x0UL); + write_einj_entry(EINJ_NOTRIGGER, 1); + write_einj_entry(EINJ_DOIT, 1); +} + +/* + * When host APEI successfully claims the SEA caused by guest_code, the kernel + * sends a SIGBUS signal with BUS_MCEERR_AR to the test thread. + * + * We set up this SIGBUS handler to skip the test for that case. 
+ */ +static void sigbus_signal_handler(int sig, siginfo_t *si, void *v) +{ + ksft_print_msg("SIGBUS (%d) received, dumping siginfo...\n", sig); + ksft_print_msg("si_signo=%d, si_errno=%d, si_code=%d, si_addr=%p\n", + si->si_signo, si->si_errno, si->si_code, si->si_addr); + if (si->si_code == BUS_MCEERR_AR) + ksft_test_result_skip("SEA is claimed by host APEI\n"); + else + ksft_test_result_fail("Exit with signal unhandled\n"); + + exit(0); +} + +static void setup_sigbus_handler(void) +{ + struct sigaction act; + + memset(&act, 0, sizeof(act)); + sigemptyset(&act.sa_mask); + act.sa_sigaction = sigbus_signal_handler; + act.sa_flags = SA_SIGINFO; + TEST_ASSERT(sigaction(SIGBUS, &act, NULL) == 0, + "Failed to setup SIGBUS handler"); +} + +static void guest_code(void) +{ + uint64_t guest_data; + + /* Consuming the error will cause a SEA. */ + guest_data = *(uint64_t *)EINJ_GVA; + + GUEST_FAIL("Data corruption not prevented by SEA: gva=%#lx, data=%#lx", + EINJ_GVA, guest_data); +} + +static void expect_sea_handler(struct ex_regs *regs) +{ + u64 esr = read_sysreg(esr_el1); + u64 far = read_sysreg(far_el1); + bool expect_far_invalid = far_invalid; + + GUEST_PRINTF("Guest SEA esr_el1=%#lx, far_el1=%#lx\n", esr, far); + + GUEST_ASSERT_EQ(ESR_ELx_EC(esr), ESR_ELx_EC_DABT_CUR); + GUEST_ASSERT_EQ(esr & ESR_ELx_FSC_TYPE, ESR_ELx_FSC_EXTABT); + + if (expect_far_invalid) { + GUEST_ASSERT(esr & ESR_ELx_FnV); + GUEST_PRINTF("Guest observed garbage value in FAR\n"); + } else { + GUEST_ASSERT(!(esr & ESR_ELx_FnV)); + GUEST_ASSERT_EQ(far, EINJ_GVA); + } + + GUEST_DONE(); +} + +static void vcpu_inject_sea(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_events events = {}; + + events.exception.ext_dabt_pending = true; + vcpu_events_set(vcpu, &events); +} + +static void run_vm(struct kvm_vm *vm, struct kvm_vcpu *vcpu) +{ + struct ucall uc; + bool guest_done = false; + struct kvm_run *run = vcpu->run; + + /* Resume the vCPU after error injection to consume the error. 
*/ + vcpu_run(vcpu); + + ksft_print_msg("Dump kvm_run info about KVM_EXIT_%s\n", + exit_reason_str(run->exit_reason)); + ksft_print_msg("kvm_run.arm_sea: esr=%#llx, flags=%#llx\n", + run->arm_sea.esr, run->arm_sea.flags); + ksft_print_msg("kvm_run.arm_sea: gva=%#llx, gpa=%#llx\n", + run->arm_sea.gva, run->arm_sea.gpa); + + /* Validate the KVM_EXIT. */ + TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_ARM_SEA); + TEST_ASSERT_EQ(ESR_ELx_EC(run->arm_sea.esr), ESR_ELx_EC_DABT_LOW); + TEST_ASSERT_EQ(run->arm_sea.esr & ESR_ELx_FSC_TYPE, ESR_ELx_FSC_EXTABT); + TEST_ASSERT_EQ(run->arm_sea.esr & ESR_ELx_SET_MASK, ESR_ELx_SET_UER); + + if (run->arm_sea.flags & KVM_EXIT_ARM_SEA_FLAG_GVA_VALID) + TEST_ASSERT_EQ(run->arm_sea.gva, EINJ_GVA); + + if (run->arm_sea.flags & KVM_EXIT_ARM_SEA_FLAG_GPA_VALID) + TEST_ASSERT_EQ(run->arm_sea.gpa, einj_gpa & PAGE_ADDR_MASK); + + far_invalid = run->arm_sea.esr & ESR_ELx_FnV; + + /* Inject a SEA into guest and expect handled in SEA handler. */ + vcpu_inject_sea(vcpu); + + /* Expect the guest to reach GUEST_DONE gracefully. 
*/ + do { + vcpu_run(vcpu); + switch (get_ucall(vcpu, &uc)) { + case UCALL_PRINTF: + ksft_print_msg("From guest: %s", uc.buffer); + break; + case UCALL_DONE: + ksft_print_msg("Guest done gracefully!\n"); + guest_done = 1; + break; + case UCALL_ABORT: + ksft_print_msg("Guest aborted!\n"); + guest_done = 1; + REPORT_GUEST_ASSERT(uc); + break; + default: + TEST_FAIL("Unexpected ucall: %lu\n", uc.cmd); + } + } while (!guest_done); +} + +static struct kvm_vm *vm_create_with_sea_handler(struct kvm_vcpu **vcpu) +{ + size_t backing_page_size; + size_t guest_page_size; + size_t alignment; + uint64_t num_guest_pages; + vm_paddr_t start_gpa; + enum vm_mem_backing_src_type src_type = VM_MEM_SRC_ANONYMOUS_HUGETLB_1GB; + struct kvm_vm *vm; + + backing_page_size = get_backing_src_pagesz(src_type); + guest_page_size = vm_guest_mode_params[VM_MODE_DEFAULT].page_size; + alignment = max(backing_page_size, guest_page_size); + num_guest_pages = VM_MEM_SIZE / guest_page_size; + + vm = __vm_create_with_one_vcpu(vcpu, num_guest_pages, guest_code); + vm_init_descriptor_tables(vm); + vcpu_init_descriptor_tables(*vcpu); + + vm_install_sync_handler(vm, + /*vector=*/VECTOR_SYNC_CURRENT, + /*ec=*/ESR_ELx_EC_DABT_CUR, + /*handler=*/expect_sea_handler); + + start_gpa = (vm->max_gfn - num_guest_pages) * guest_page_size; + start_gpa = align_down(start_gpa, alignment); + + vm_userspace_mem_region_add( + /*vm=*/vm, + /*src_type=*/src_type, + /*guest_paddr=*/start_gpa, + /*slot=*/1, + /*npages=*/num_guest_pages, + /*flags=*/0); + + virt_map(vm, START_GVA, start_gpa, num_guest_pages); + + ksft_print_msg("Mapped %#lx pages: gva=%#lx to gpa=%#lx\n", + num_guest_pages, START_GVA, start_gpa); + return vm; +} + +static void vm_inject_memory_uer(struct kvm_vm *vm) +{ + uint64_t guest_data; + + einj_gpa = addr_gva2gpa(vm, EINJ_GVA); + einj_hva = addr_gva2hva(vm, EINJ_GVA); + + /* Populate certain data before injecting UER. 
*/ + *(uint64_t *)einj_hva = 0xBAADCAFE; + guest_data = *(uint64_t *)einj_hva; + ksft_print_msg("Before EINJect: data=%#lx\n", + guest_data); + + einj_hpa = translate_to_host_paddr((unsigned long)einj_hva); + + ksft_print_msg("EINJ_GVA=%#lx, einj_gpa=%#lx, einj_hva=%p, einj_hpa=%#lx\n", + EINJ_GVA, einj_gpa, einj_hva, einj_hpa); + + inject_uer(einj_hpa); + ksft_print_msg("Memory UER EINJected\n"); +} + +int main(int argc, char *argv[]) +{ + struct kvm_vm *vm; + struct kvm_vcpu *vcpu; + + TEST_REQUIRE(kvm_has_cap(KVM_CAP_ARM_SEA_TO_USER)); + + setup_sigbus_handler(); + + vm = vm_create_with_sea_handler(&vcpu); + + vm_enable_cap(vm, KVM_CAP_ARM_SEA_TO_USER, 0); + + vm_inject_memory_uer(vm); + + run_vm(vm, vcpu); + + kvm_vm_free(vm); + + return 0; +} diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 815bc45dd8dc6..bc9fcf6c3295a 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -2021,6 +2021,7 @@ static struct exit_reason { KVM_EXIT_STRING(NOTIFY), KVM_EXIT_STRING(LOONGARCH_IOCSR), KVM_EXIT_STRING(MEMORY_FAULT), + KVM_EXIT_STRING(ARM_SEA), }; /* From patchwork Mon May 5 16:14:11 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiaqi Yan X-Patchwork-Id: 888354 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6E36326C3A4 for ; Mon, 5 May 2025 16:14:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746461665; cv=none; 
b=Uteg11d5IY9CEmL/ELXrSWMUgGDxwtm/IaeGRQ6lClK8JV+XZwlAHjylgA0+WoghYtERXymWEP1J8zHpCvml6gnJiHcKaobPlLPy9nehYHU5w41iQ46HnSfRxYLIHNeai9pItteq7JJ208JBF92tLUdLQljvZ3xuOD11KN6z900= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746461665; c=relaxed/simple; bh=0xPFNmQ2aYdSjFcVuUL7sWXmakZUXXwCI8TdRZJDz9g=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=OtWFJSk1IpqaR/VRNazirIk95jJRYFUt2z+lCYuKMKJIwobUbheOxVyI0pwxGSLtHrTeVxRHjm90JWjWFMnzZy8WsDXAqQnEMJnR0CrbFL6zkeFRj+F5JzheQ28xHQkQfYPUXfMcwZ1beLI+e/yE5QOjBEBguEe2SWtpOjBTLjI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=Cc/uRD0N; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="Cc/uRD0N" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-224347aef79so63718245ad.2 for ; Mon, 05 May 2025 09:14:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1746461663; x=1747066463; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=qoM9DcHdqZBj+rk9l05sHHopuoPZxH677juEX9RJpMg=; b=Cc/uRD0NkHMrvPWyT0Y0tQDFxYje4u+lAlj51UllWvP+vb8F75TLhDZfMk/s9tqp7e By63/13gFQ5NcWX7qRtzSQjeih5tCGR1c4iv2BjHSclUscQqkiqAuSwKsxd0IHNDsbN4 lJ9rEWZRh/3i2Z8c49kE1kkVvhByLqyf3sx4WUcRy4ipDV13h2nIem54t3B5v+pL2lTJ YETibOULRG4lMzgU1+AZWkqKJztkgRlfKGqXeceMHAucShGOXji7IaI+ZjWesRpq+mRV 
9m0qNeK9j2gO5EmLahmYFSfuVz6COBeV5t8a+2SDHuUFsRcf/qxHnR97MySv8E7pPpl7 59vw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1746461663; x=1747066463; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=qoM9DcHdqZBj+rk9l05sHHopuoPZxH677juEX9RJpMg=; b=Vorpnymqh825ROxniJhKRTR5Q+eNs63/UsltLPNJMdUWJr5UAYuodtOxlhvtzrtktr b+kvr8dWOE1+ET90z+lnrn778O0LPpHUDgAcapMBG0l8B1Ow5rWJSvi4sxuWD5YnGbGR Iidbe+OVwieRfwqUotErD1RPapUHj5d6irFjacfyuq0cH7USejNJgITe5utDvefpIid1 i1fkMtqSdbjh+FkzkBKV6Kxm9fQ/lZUSbOPauPVxioaxN4zvgoZcvjeqjcMkWT8NP4ye lm6uOYvBNhGWjFesu2sFLi+QOsSuEcx+ZA7LbYWJJdxsVbpbH51oHyrC7s6IiukT9Qdd ejlQ== X-Forwarded-Encrypted: i=1; AJvYcCWkTdGUYbfl6XVnlA6/JGbd4JnjBCYUhT0IMtpM5MMtkzVtvnND45W8qZxAHtPCQRO1UM1jR/dH0lUOySnRyAE=@vger.kernel.org X-Gm-Message-State: AOJu0YzBDMErI6QKKXSD2Cl/3KbV4q4NkIJpBLCrQfisLUqwAS4m+9i2 GAIQu5pNWuf59Ufzx4lI25Rv+Jv2+AxWwsP8KJWtmZXe2aNZJFEc0+9wXUhLLyxkRz2e+lJXw3j Kt2iS/pk+HQ== X-Google-Smtp-Source: AGHT+IFZMxB5zKbz0DCB224XcS28TLNRLHuTk3G5CmkpWtkGxQ7uQKT12C81gz2CRUWIcDMNDv5x3jsR5vZQHg== X-Received: from pfbdh13.prod.google.com ([2002:a05:6a00:478d:b0:739:485f:c33e]) (user=jiaqiyan job=prod-delivery.src-stubby-dispatcher) by 2002:a17:903:2383:b0:21f:52e:939e with SMTP id d9443c01a7336-22e18bc4033mr159804055ad.28.1746461662677; Mon, 05 May 2025 09:14:22 -0700 (PDT) Date: Mon, 5 May 2025 16:14:11 +0000 In-Reply-To: <20250505161412.1926643-1-jiaqiyan@google.com> Precedence: bulk X-Mailing-List: linux-kselftest@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250505161412.1926643-1-jiaqiyan@google.com> X-Mailer: git-send-email 2.49.0.967.g6a0df3ecc3-goog Message-ID: <20250505161412.1926643-6-jiaqiyan@google.com> Subject: [PATCH v1 5/6] KVM: selftests: Test for KVM_CAP_INJECT_EXT_IABT From: Jiaqi Yan To: maz@kernel.org, oliver.upton@linux.dev Cc: joey.gouly@arm.com, 
suzuki.poulose@arm.com, yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org, pbonzini@redhat.com, corbet@lwn.net, shuah@kernel.org, kvm@vger.kernel.org, kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, duenwen@google.com, rananta@google.com, jthoughton@google.com, Jiaqi Yan Test that userspace can use KVM_SET_VCPU_EVENTS to inject an external instruction abort into the guest. The test injects the instruction abort at an arbitrary time, without a real SEA happening in the guest VCPU, so only the ESR_EL1 value can be checked; no particular FAR_EL1 value can be expected. Signed-off-by: Jiaqi Yan --- tools/arch/arm64/include/uapi/asm/kvm.h | 3 +- tools/testing/selftests/kvm/Makefile.kvm | 1 + .../testing/selftests/kvm/arm64/inject_iabt.c | 101 ++++++++++++++++++ 3 files changed, 104 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/kvm/arm64/inject_iabt.c diff --git a/tools/arch/arm64/include/uapi/asm/kvm.h b/tools/arch/arm64/include/uapi/asm/kvm.h index af9d9acaf9975..d3a4530846311 100644 --- a/tools/arch/arm64/include/uapi/asm/kvm.h +++ b/tools/arch/arm64/include/uapi/asm/kvm.h @@ -184,8 +184,9 @@ struct kvm_vcpu_events { __u8 serror_pending; __u8 serror_has_esr; __u8 ext_dabt_pending; + __u8 ext_iabt_pending; /* Align it to 8 bytes */ - __u8 pad[5]; + __u8 pad[4]; __u64 serror_esr; } exception; __u32 reserved[12]; diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm index 16d2e9f32619f..708fd126a36dd 100644 --- a/tools/testing/selftests/kvm/Makefile.kvm +++ b/tools/testing/selftests/kvm/Makefile.kvm @@ -148,6 +148,7 @@ TEST_GEN_PROGS_arm64 += arm64/aarch32_id_regs TEST_GEN_PROGS_arm64 += arm64/arch_timer_edge_cases TEST_GEN_PROGS_arm64 += arm64/debug-exceptions TEST_GEN_PROGS_arm64 += arm64/hypercalls +TEST_GEN_PROGS_arm64 += arm64/inject_iabt TEST_GEN_PROGS_arm64 += arm64/mmio_abort TEST_GEN_PROGS_arm64 += 
arm64/page_fault_test TEST_GEN_PROGS_arm64 += arm64/psci_test diff --git a/tools/testing/selftests/kvm/arm64/inject_iabt.c b/tools/testing/selftests/kvm/arm64/inject_iabt.c new file mode 100644 index 0000000000000..43b701e9143c2 --- /dev/null +++ b/tools/testing/selftests/kvm/arm64/inject_iabt.c @@ -0,0 +1,101 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * inject_iabt.c - Tests for injecting instruction aborts into guest. + */ + +#include "processor.h" +#include "test_util.h" + +static void expect_iabt_handler(struct ex_regs *regs) +{ + u64 esr = read_sysreg(esr_el1); + + GUEST_PRINTF("Guest SEA esr_el1=%#lx\n", esr); + GUEST_ASSERT_EQ(ESR_ELx_EC(esr), ESR_ELx_EC_IABT_CUR); + GUEST_ASSERT_EQ(esr & ESR_ELx_FSC_TYPE, ESR_ELx_FSC_EXTABT); + /* + * We inject IABT but there is no SEA in guest at all, + * so guest should see FnV == 1, which is set by KVM. + */ + GUEST_ASSERT(esr & ESR_ELx_FnV); + + GUEST_DONE(); +} + +static void guest_code(void) +{ + GUEST_FAIL("Guest should only run SEA handler"); +} + +static void vcpu_run_expect_done(struct kvm_vcpu *vcpu) +{ + struct ucall uc; + bool guest_done = false; + + do { + vcpu_run(vcpu); + switch (get_ucall(vcpu, &uc)) { + case UCALL_ABORT: + REPORT_GUEST_ASSERT(uc); + break; + case UCALL_PRINTF: + ksft_print_msg("From guest: %s", uc.buffer); + break; + case UCALL_DONE: + ksft_print_msg("Guest done gracefully!\n"); + guest_done = true; + break; + default: + TEST_FAIL("Unexpected ucall: %lu", uc.cmd); + } + } while (!guest_done); +} + +static void vcpu_inject_ext_iabt(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_events events = {}; + + events.exception.ext_iabt_pending = true; + vcpu_events_set(vcpu, &events); +} + +static void vcpu_inject_invalid_abt(struct kvm_vcpu *vcpu) +{ + struct kvm_vcpu_events events = {}; + int r; + + events.exception.ext_iabt_pending = true; + events.exception.ext_dabt_pending = true; + + ksft_print_msg("Injecting invalid external abort events\n"); + r = __vcpu_ioctl(vcpu, KVM_SET_VCPU_EVENTS, 
&events); + TEST_ASSERT(r && errno == EINVAL, + KVM_IOCTL_ERROR(KVM_SET_VCPU_EVENTS, r)); +} + +static void test_inject_iabt(void) +{ + struct kvm_vcpu *vcpu; + struct kvm_vm *vm; + + vm = vm_create_with_one_vcpu(&vcpu, guest_code); + + vm_init_descriptor_tables(vm); + vcpu_init_descriptor_tables(vcpu); + + vm_install_sync_handler(vm, VECTOR_SYNC_CURRENT, + ESR_ELx_EC_IABT_CUR, expect_iabt_handler); + + vcpu_inject_invalid_abt(vcpu); + + vcpu_inject_ext_iabt(vcpu); + vcpu_run_expect_done(vcpu); + + kvm_vm_free(vm); +} + +int main(void) +{ + test_inject_iabt(); + return 0; +} From patchwork Mon May 5 16:14:12 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiaqi Yan X-Patchwork-Id: 888353 Received: from mail-pf1-f202.google.com (mail-pf1-f202.google.com [209.85.210.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DE15426D4E0 for ; Mon, 5 May 2025 16:14:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746461667; cv=none; b=BjF0XMDipKrOIcyHIKQ4sRblxIFir8t5oNP7j8b5j9gkiFIExaGxfhMEy7F7a619cEpUVocycoJX/RILVfnp4JHiW0VFW8QrpODag9uI8HMft/G2ziXX+djO8TdZr12ocfSh9KyK0cFnBDtpIMstcP9eVfhu0Ec31g2d9UTb7nM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746461667; c=relaxed/simple; bh=fyGzyoomoTYo9P/p+ApF12DGGrn8h/VhXiAUSg88UfE=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=sQq+8TlDD3HhetDhiyK7g3ckH3+A6g5SzzJ4jrjUCM3Tk4xzuXZdZZMOGaruO7/fe3KLlgQvAAD4QyZMSGqcfNkl8LS/5e5YvAgf74Qbte4hZVA6smSEU3tJJnjjQgg9CW9s3KY1wcvi6d1KWRY2+Z/UTsj09Wa3X+lGcq0QqCM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass 
smtp.mailfrom=flex--jiaqiyan.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=JjLX4CbE; arc=none smtp.client-ip=209.85.210.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="JjLX4CbE" Received: by mail-pf1-f202.google.com with SMTP id d2e1a72fcca58-736cd36189bso6538250b3a.2 for ; Mon, 05 May 2025 09:14:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1746461664; x=1747066464; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=qCRTPaLL/ddSZCO1i6dVdhSDdFDF5EjuW/HmAYg1Ry0=; b=JjLX4CbEoI+2S1VRPgO5cgh/LAQ5gedaYVo9IK9XWCblD8R069jq+QHSdrX9ijz9Na x5onDG5LYYy2sjhiu/VwW59NpFhwJ08fm82kGvNJAmx5x+dFABnWuk77ggmHZQhx39Lh rtdk9X5PD+H92zftPKpwBRzxKQMQ+cQ/1KllcM/zbSL7zy7nMuuAX2uEqjL5jmUq3Q66 rp9x983JaJX1PSoMe9RlMvRMfgPIA/vU+DJxwdSbOh8rjuM9Saei/fSXrTXj84kW/Cz2 Xv+E+kjORA+R6knHiZ+W8ycm5lp30v6pRW5Bzf1RmZMRzkR0bsKS7jPD2KudT05bFDf7 cS5Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1746461664; x=1747066464; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=qCRTPaLL/ddSZCO1i6dVdhSDdFDF5EjuW/HmAYg1Ry0=; b=wG+nV5m/5t7oPU8SMzX/PV1QvtnisIEth63q+V4+kvxy7JU37BpvL75RrYI1BDnERG AyY4kQCjvWXOJQLy0D8VRd4ASSvQW7mxnzJ96afMY9JxflzACFN/tafvtLaaOH5h4KT5 ZYvtyxJ/ZKCvhEDX7DSs+GOLNQTzHS7XSFkl92q4SoS8MTf8ysDDG9rl9I/GHMTV7PQL hdH4DQuQG235sZ4Uq92k9V15bZIMjr1J/J6qRcZPmeNelS/M6xqKE+N+EHuz66Z8K75a baaMCrDP48ZiDMj/GkZu/GNEhvY2Rg4qLnyt/bu/4dkRt6zCOcGV8WpnHQ/L3smSnkS8 Urgw== X-Forwarded-Encrypted: i=1; 
AJvYcCX6Zw0wdzcfeXT2GBGl/ZYHoKrZ5b041zRoF2xhankymLTQjDGqUiR1HcpPx7dRJA9sAQL/GcUEKqI1H6l0u+M=@vger.kernel.org X-Gm-Message-State: AOJu0YyUaNRnHe5ENGdNSYN9u5La71shtSQzsgMBkgovlZwBt5uUzGyA C53BdaJoKvq+EP78yyZnvSVtzL9w7rLI5drGipumXMtDpUUH3alhhMqS3eXzrrnzPezhGDltxCe PKMq1mC+AOQ== X-Google-Smtp-Source: AGHT+IE1SVeU3n4ai6CrxrMJm0u9dBQoZvk8Si1b7xa5PwydPXT87LL3o3Oqy2HVQLl3IBAbqfHn31O9UmD3SA== X-Received: from pfbki23.prod.google.com ([2002:a05:6a00:9497:b0:73d:b1c4:5d7f]) (user=jiaqiyan job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a00:2e9d:b0:740:6f7f:7645 with SMTP id d2e1a72fcca58-7406f7f7a1amr12841654b3a.8.1746461664178; Mon, 05 May 2025 09:14:24 -0700 (PDT) Date: Mon, 5 May 2025 16:14:12 +0000 In-Reply-To: <20250505161412.1926643-1-jiaqiyan@google.com> Precedence: bulk X-Mailing-List: linux-kselftest@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250505161412.1926643-1-jiaqiyan@google.com> X-Mailer: git-send-email 2.49.0.967.g6a0df3ecc3-goog Message-ID: <20250505161412.1926643-7-jiaqiyan@google.com> Subject: [PATCH v1 6/6] Documentation: kvm: new uAPI for handling SEA From: Jiaqi Yan To: maz@kernel.org, oliver.upton@linux.dev Cc: joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org, pbonzini@redhat.com, corbet@lwn.net, shuah@kernel.org, kvm@vger.kernel.org, kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, duenwen@google.com, rananta@google.com, jthoughton@google.com, Jiaqi Yan Document the new userspace-visible features and APIs for handling synchronous external abort (SEA) - KVM_CAP_ARM_SEA_TO_USER: How userspace enables the new feature. - KVM_EXIT_ARM_SEA: When userspace needs to handle SEA and what userspace gets while taking the SEA. - KVM_CAP_ARM_INJECT_EXT_(D|I)ABT: How userspace injects SEA to guest while taking the SEA. 
Signed-off-by: Jiaqi Yan --- Documentation/virt/kvm/api.rst | 120 +++++++++++++++++++++++++++++---- 1 file changed, 107 insertions(+), 13 deletions(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 47c7c3f92314e..fa91a123e1b88 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -1236,8 +1236,9 @@ directly to the virtual CPU). __u8 serror_pending; __u8 serror_has_esr; __u8 ext_dabt_pending; + __u8 ext_iabt_pending; /* Align it to 8 bytes */ - __u8 pad[5]; + __u8 pad[4]; __u64 serror_esr; } exception; __u32 reserved[12]; @@ -1292,20 +1293,52 @@ ARM64: User space may need to inject several types of events to the guest. +Inject SError +~~~~~~~~~~~~~ + Set the pending SError exception state for this VCPU. It is not possible to 'cancel' an Serror that has been made pending. -If the guest performed an access to I/O memory which could not be handled by -userspace, for example because of missing instruction syndrome decode -information or because there is no device mapped at the accessed IPA, then -userspace can ask the kernel to inject an external abort using the address -from the exiting fault on the VCPU. It is a programming error to set -ext_dabt_pending after an exit which was not either KVM_EXIT_MMIO or -KVM_EXIT_ARM_NISV. This feature is only available if the system supports -KVM_CAP_ARM_INJECT_EXT_DABT. This is a helper which provides commonality in -how userspace reports accesses for the above cases to guests, across different -userspace implementations. Nevertheless, userspace can still emulate all Arm -exceptions by manipulating individual registers using the KVM_SET_ONE_REG API. +Inject SEA (synchronous external abort) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- If the guest performed an access to I/O memory which could not be handled by + userspace, for example because of missing instruction syndrome decode + information or because there is no device mapped at the accessed IPA. 
+ +- If the guest consumed an uncorrected memory error, and the RAS extension in + the Trusted Firmware chooses to notify the PE with SEA, KVM has to handle it + when host APEI is unable to claim the SEA. For the following types of faults, + if userspace enabled KVM_CAP_ARM_SEA_TO_USER, KVM returns to userspace with + KVM_EXIT_ARM_SEA: + + - Synchronous external abort, not on translation table walk or hardware + update of translation table. + + - Synchronous external abort on translation table walk or hardware update of + translation table, including all levels. + + - Synchronous parity or ECC error on memory access, not on translation table + walk. + + - Synchronous parity or ECC error on memory access on translation table walk + or hardware update of translation table, including all levels. + +For the cases above, userspace can ask the kernel to replay either an external +data abort (by setting ext_dabt_pending) or an external instruction abort +(by setting ext_iabt_pending) into the faulting VCPU. KVM will use the address +from the exiting fault on the VCPU. Setting both ext_dabt_pending and +ext_iabt_pending at the same time will return -EINVAL. + +It is a programming error to set ext_dabt_pending or ext_iabt_pending after an +exit which was not KVM_EXIT_MMIO, KVM_EXIT_ARM_NISV or KVM_EXIT_ARM_SEA. +Injecting SEA for data and instruction abort is only available if KVM supports +KVM_CAP_ARM_INJECT_EXT_DABT and KVM_CAP_ARM_INJECT_EXT_IABT respectively. + +This is a helper which provides commonality in how userspace reports accesses +for the above cases to guests, across different userspace implementations. +Nevertheless, userspace can still emulate all Arm exceptions by manipulating +individual registers using the KVM_SET_ONE_REG API. See KVM_GET_VCPU_EVENTS for the data structure. @@ -7151,6 +7184,55 @@ The valid value for 'flags' is: - KVM_NOTIFY_CONTEXT_INVALID -- the VM context is corrupted and not valid in VMCS. 
It would run into unknown result if resume the target VM. +:: + + /* KVM_EXIT_ARM_SEA */ + struct { + __u64 esr; + #define KVM_EXIT_ARM_SEA_FLAG_GVA_VALID (1ULL << 0) + #define KVM_EXIT_ARM_SEA_FLAG_GPA_VALID (1ULL << 1) + __u64 flags; + __u64 gva; + __u64 gpa; + } arm_sea; + +Used on arm64 systems. When the VM capability KVM_CAP_ARM_SEA_TO_USER is +enabled, a VM exit is generated if the guest caused a synchronous external +abort (SEA) and the host APEI fails to handle the SEA. + +Historically KVM handles SEA by first delegating the SEA to host APEI as there +is a high chance that the SEA is caused by consuming an uncorrected memory +error. However, not all platforms support SEA handling in APEI, and KVM's +fallback handling is to inject an async SError into the guest, which usually +panics the guest kernel. As an alternative, userspace can participate in +the SEA handling by enabling KVM_CAP_ARM_SEA_TO_USER at VM creation, after +querying the capability. Once enabled, when KVM has to handle a guest-caused +SEA, it returns to userspace with KVM_EXIT_ARM_SEA, with details +about the SEA available in 'arm_sea'. + +The 'esr' field holds the value of the exception syndrome register (ESR) when +KVM took the SEA, which tells userspace the character of the current SEA, +such as its Exception Class, Synchronous Error Type, Fault Specific Code and +so on. For more details on ESR, check the Arm Architecture Registers +documentation. + +The 'flags' field indicates if the faulting addresses are available while +taking the SEA: + + - KVM_EXIT_ARM_SEA_FLAG_GVA_VALID -- the faulting guest virtual address + is valid and userspace can get its value in the 'gva' field. + - KVM_EXIT_ARM_SEA_FLAG_GPA_VALID -- the faulting guest physical address + is valid and userspace can get its value in the 'gpa' field. + +Userspace needs to take actions to handle guest SEA synchronously, namely in +the same thread that runs KVM_RUN and receives KVM_EXIT_ARM_SEA. 
One of the +encouraged approaches is to use KVM_SET_VCPU_EVENTS to inject the SEA into +the faulting VCPU. This way, the guest has the opportunity to keep running +and limit the blast radius of the SEA to the particular guest application that +caused the SEA. If the Exception Class indicated by 'esr' field in 'arm_sea' +is data abort, userspace should inject data abort. If the Exception Class is +instruction abort, userspace should inject instruction abort. + :: /* Fix the size of the union. */ @@ -8478,7 +8560,7 @@ ENOSYS for the others. When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request. -7.37 KVM_CAP_ARM_WRITABLE_IMP_ID_REGS +7.42 KVM_CAP_ARM_WRITABLE_IMP_ID_REGS ------------------------------------- :Architectures: arm64 @@ -8496,6 +8578,18 @@ aforementioned registers before the first KVM_RUN. These registers are VM scoped, meaning that the same set of values are presented on all vCPUs in a given VM. +7.43 KVM_CAP_ARM_SEA_TO_USER +---------------------------- + +:Architectures: arm64 +:Target: VM +:Parameters: none +:Returns: 0 on success, -EINVAL if unsupported. + +This capability, if KVM_CHECK_EXTENSION indicates that it is available, means +that KVM has an implementation that allows userspace to participate in handling +synchronous external aborts caused by the VM, via a KVM_EXIT_ARM_SEA exit. + 8. Other capabilities. ======================
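For readers of the series, the rule documented above for KVM_EXIT_ARM_SEA (inject a data abort when the Exception Class in 'esr' is a data abort, an instruction abort when it is an instruction abort) can be sketched in userspace roughly as follows. This is a minimal illustration and not part of the patches: `struct sea_exception_sketch` and `sea_fill_injection()` are hypothetical names, and the struct is only a local mirror of the `exception` member of `struct kvm_vcpu_events` as extended here. A real VMM would include <linux/kvm.h> from a kernel carrying this series and pass the full structure to KVM_SET_VCPU_EVENTS. The Exception Class encodings are the architectural ESR_ELx values for aborts taken from a lower exception level.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Architectural ESR_ELx.EC field: bits [31:26]. */
#define ESR_EC_SHIFT     26
#define ESR_EC_MASK      0x3FULL
#define ESR_EC_IABT_LOW  0x20ULL /* instruction abort from a lower EL */
#define ESR_EC_DABT_LOW  0x24ULL /* data abort from a lower EL */

/*
 * Hypothetical local mirror of kvm_vcpu_events.exception with the
 * ext_iabt_pending field added by this series (assumption: layout as
 * shown in the uapi diff above).
 */
struct sea_exception_sketch {
	uint8_t serror_pending;
	uint8_t serror_has_esr;
	uint8_t ext_dabt_pending;
	uint8_t ext_iabt_pending;
	uint8_t pad[4];
	uint64_t serror_esr;
};

/*
 * Pick the abort type to inject from the ESR reported in arm_sea.esr.
 * Returns 0 on success, -1 if the Exception Class is neither a data
 * abort nor an instruction abort from a lower EL.
 */
int sea_fill_injection(uint64_t esr, struct sea_exception_sketch *exc)
{
	uint64_t ec = (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;

	memset(exc, 0, sizeof(*exc));

	switch (ec) {
	case ESR_EC_DABT_LOW:
		exc->ext_dabt_pending = 1;
		return 0;
	case ESR_EC_IABT_LOW:
		exc->ext_iabt_pending = 1;
		return 0;
	default:
		return -1;
	}
}
```

Only one of the two pending flags is ever set, which matches the documented behaviour that setting both at once is rejected with -EINVAL.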