From patchwork Fri May 12 08:04:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi-De Wu X-Patchwork-Id: 681383 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 82181C77B75 for ; Fri, 12 May 2023 08:05:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240363AbjELIFT (ORCPT ); Fri, 12 May 2023 04:05:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37256 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240393AbjELIEq (ORCPT ); Fri, 12 May 2023 04:04:46 -0400 Received: from mailgw01.mediatek.com (unknown [60.244.123.138]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7E6C6100DF; Fri, 12 May 2023 01:04:14 -0700 (PDT) X-UUID: 92b4b9a6f09b11ed9cb5633481061a41-20230512 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mediatek.com; s=dk; h=Content-Type:MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:CC:To:From; bh=kxpTPW4fUnQXTn1qEYtOfV7mHPepHDW5YLASuODBSPw=; b=pUJkpCwYnxrEPNZ28czpI4uPNzFWcGR/7Z/rFp9iTp45eUJ6meOk0qGUBNWmDpAgFUxHwUh2WtOfkALvBPn/SRRECZLltVR4GDanXZt/BcWvb/VrEmU2ck6sBNKN0F0ArWiVaQOvun/38qEL3iVVkoXOA5oBDTHavFcgV6uF0F0=; X-CID-P-RULE: Release_Ham X-CID-O-INFO: VERSION:1.1.24, REQID:43fff8e7-6256-465e-9ae8-9fff7299eda7, IP:0, U RL:0,TC:0,Content:-25,EDM:0,RT:0,SF:0,FILE:0,BULK:0,RULE:Release_Ham,ACTIO N:release,TS:-25 X-CID-META: VersionHash:178d4d4, CLOUDID:96bca1c0-e32c-4c97-918d-fbb3fc224d4e, B ulkID:nil,BulkQuantity:0,Recheck:0,SF:102,TC:nil,Content:0,EDM:-3,IP:nil,U RL:0,File:nil,Bulk:nil,QS:nil,BEC:nil,COL:0,OSI:0,OSA:0,AV:0 X-CID-BVR: 0 X-CID-BAS: 0,_,0,_ X-UUID: 92b4b9a6f09b11ed9cb5633481061a41-20230512 Received: from mtkmbs10n2.mediatek.inc [(172.21.101.183)] by mailgw01.mediatek.com (envelope-from ) (Generic MTA with TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 256/256) with ESMTP id 978832527; Fri, 12 May 2023 16:04:09 +0800 Received: from mtkmbs13n1.mediatek.inc (172.21.101.194) by mtkmbs11n1.mediatek.inc (172.21.101.185) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.26; Fri, 12 May 2023 16:04:08 +0800 Received: from mtksdccf07.mediatek.inc (172.21.84.99) by mtkmbs13n1.mediatek.inc (172.21.101.73) with Microsoft SMTP Server id 15.2.1118.26 via Frontend Transport; Fri, 12 May 2023 16:04:08 +0800 From: Yi-De Wu To: Yingshiuan Pan , Ze-Yu Wang , Yi-De Wu , Rob Herring , Krzysztof Kozlowski , Conor Dooley , Jonathan Corbet , Catalin Marinas , Will Deacon , Arnd Bergmann , Matthias Brugger , AngeloGioacchino Del Regno CC: , , , , , , Trilok Soni , David Bradil , Jade Shih , Miles Chen , Ivan Tseng , My Chuang , Shawn Hsiao , PeiLun Suei , Liju Chen Subject: [PATCH v3 5/7] virt: geniezone: Add irqchip support for virtual interrupt injection Date: Fri, 12 May 2023 16:04:03 +0800 Message-ID: <20230512080405.12043-6-yi-de.wu@mediatek.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20230512080405.12043-1-yi-de.wu@mediatek.com> References: <20230512080405.12043-1-yi-de.wu@mediatek.com> MIME-Version: 1.0 X-MTK: N Precedence: bulk List-ID: X-Mailing-List: devicetree@vger.kernel.org From: "Yingshiuan Pan" Enable GenieZone to handle virtual interrupt injection request. Signed-off-by: Yingshiuan Pan Signed-off-by: Yi-De Wu --- arch/arm64/geniezone/Makefile | 2 +- arch/arm64/geniezone/vgic.c | 91 +++++++++++++++++++++++++++ drivers/virt/geniezone/Makefile | 2 +- drivers/virt/geniezone/gzvm_common.h | 12 ++++ drivers/virt/geniezone/gzvm_irqchip.c | 13 ++++ drivers/virt/geniezone/gzvm_vm.c | 80 ++++++++++++++++++++++- include/linux/gzvm_drv.h | 4 ++ include/uapi/linux/gzvm.h | 38 ++++++++++- 8 files changed, 238 insertions(+), 4 deletions(-) create mode 100644 arch/arm64/geniezone/vgic.c create mode 100644 drivers/virt/geniezone/gzvm_common.h create mode 100644 drivers/virt/geniezone/gzvm_irqchip.c diff --git a/arch/arm64/geniezone/Makefile b/arch/arm64/geniezone/Makefile index 69b0a4abeab0..0e4f1087f9de 100644 --- a/arch/arm64/geniezone/Makefile +++ b/arch/arm64/geniezone/Makefile @@ -4,6 +4,6 @@ # include $(srctree)/drivers/virt/geniezone/Makefile -gzvm-y += vm.o vcpu.o +gzvm-y += vm.o vcpu.o vgic.o obj-$(CONFIG_MTK_GZVM) += gzvm.o diff --git a/arch/arm64/geniezone/vgic.c b/arch/arm64/geniezone/vgic.c new file mode 100644 index 000000000000..7e26a800b9d0 --- /dev/null +++ b/arch/arm64/geniezone/vgic.c @@ -0,0 +1,91 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2023 MediaTek Inc. + */ + +#include +#include + +#include +#include +#include "gzvm_arch_common.h" + +/* is_irq_valid() - Check the irq number and irq_type are matched */ +static bool is_irq_valid(u32 irq, u32 irq_type) +{ + switch (irq_type) { + case GZVM_IRQ_TYPE_CPU: + /* 0 ~ 15: SGI */ + if (likely(irq <= GZVM_IRQ_CPU_FIQ)) + return true; + break; + case GZVM_IRQ_TYPE_PPI: + /* 16 ~ 31: PPI */ + if (likely(irq >= VGIC_NR_SGIS && irq < VGIC_NR_PRIVATE_IRQS)) + return true; + break; + case GZVM_IRQ_TYPE_SPI: + /* 32 ~ : SPT */ + if (likely(irq >= VGIC_NR_PRIVATE_IRQS)) + return true; + break; + default: + return false; + } + return false; +} + +/** + * gzvm_vgic_inject_irq() - Inject virtual interrupt to a VM + * @vcpu_idx: vcpu index, only valid if PPI + * @irq: irq number + * @level: 1 if true else 0 + */ +static int gzvm_vgic_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx, u32 irq_type, + u32 irq, bool level) +{ + unsigned long a1 = assemble_vm_vcpu_tuple(gzvm->vm_id, vcpu_idx); + struct arm_smccc_res res; + + if (!unlikely(is_irq_valid(irq, irq_type))) + return -EINVAL; + + gzvm_hypcall_wrapper(MT_HVC_GZVM_IRQ_LINE, a1, irq, level, + 0, 0, 0, 0, &res); + if (res.a0) { + pr_err("Failed to set IRQ level (%d) to irq#%u on vcpu %d with ret=%d\n", + level, irq, vcpu_idx, (int)res.a0); + return -EFAULT; + } + + return 0; +} + +/** + * gzvm_vgic_inject_spi() - Inject virtual spi interrupt + * + * @spi_irq: This is spi interrupt number (starts from 0 instead of 32) + * + * Return 0 if succeed else other negative values indicating each errors + */ +static int gzvm_vgic_inject_spi(struct gzvm *gzvm, unsigned int vcpu_idx, + u32 spi_irq, bool level) +{ + return gzvm_vgic_inject_irq(gzvm, 0, GZVM_IRQ_TYPE_SPI, + spi_irq + VGIC_NR_PRIVATE_IRQS, level); +} + +int gzvm_arch_create_device(gzvm_id_t vm_id, struct gzvm_create_device *gzvm_dev) +{ + struct arm_smccc_res res; + + return gzvm_hypcall_wrapper(MT_HVC_GZVM_CREATE_DEVICE, vm_id, + virt_to_phys(gzvm_dev), 0, 0, 0, 0, 0, &res); +} + +int gzvm_arch_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx, u32 irq_type, + u32 irq, bool level) +{ + /* default use spi */ + return gzvm_vgic_inject_spi(gzvm, vcpu_idx, irq, level); +} diff --git a/drivers/virt/geniezone/Makefile b/drivers/virt/geniezone/Makefile index 8ebf2db0c970..67ba3ed76ea7 100644 --- a/drivers/virt/geniezone/Makefile +++ b/drivers/virt/geniezone/Makefile @@ -7,5 +7,5 @@ GZVM_DIR ?= ../../../drivers/virt/geniezone gzvm-y := $(GZVM_DIR)/gzvm_main.o $(GZVM_DIR)/gzvm_vm.o \ - $(GZVM_DIR)/gzvm_vcpu.o + $(GZVM_DIR)/gzvm_vcpu.o $(GZVM_DIR)/gzvm_irqchip.o diff --git a/drivers/virt/geniezone/gzvm_common.h b/drivers/virt/geniezone/gzvm_common.h new file mode 100644 index 000000000000..d0e39ded79e6 --- /dev/null +++ b/drivers/virt/geniezone/gzvm_common.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2023 MediaTek Inc. + */ + +#ifndef __GZ_COMMON_H__ +#define __GZ_COMMON_H__ + +int gzvm_irqchip_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx, + u32 irq_type, u32 irq, bool level); + +#endif /* __GZVM_COMMON_H__ */ diff --git a/drivers/virt/geniezone/gzvm_irqchip.c b/drivers/virt/geniezone/gzvm_irqchip.c new file mode 100644 index 000000000000..134bca3ab247 --- /dev/null +++ b/drivers/virt/geniezone/gzvm_irqchip.c @@ -0,0 +1,13 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2023 MediaTek Inc. + */ + +#include +#include "gzvm_common.h" + +int gzvm_irqchip_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx, + u32 irq_type, u32 irq, bool level) +{ + return gzvm_arch_inject_irq(gzvm, vcpu_idx, irq_type, irq, level); +} diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c index dc9992f0c68a..7ed4e4c3c1ee 100644 --- a/drivers/virt/geniezone/gzvm_vm.c +++ b/drivers/virt/geniezone/gzvm_vm.c @@ -12,6 +12,7 @@ #include #include #include +#include "gzvm_common.h" static DEFINE_MUTEX(gzvm_list_lock); static LIST_HEAD(gzvm_list); @@ -28,7 +29,8 @@ static LIST_HEAD(gzvm_list); * * 0 - Succeed * * -EFAULT - Failed to convert */ -static int gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot, u64 gfn, u64 *pfn) +static int gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot, u64 gfn, + u64 *pfn) { hfn_t __pfn; struct kvm_memory_slot kvm_slot = {0}; @@ -186,6 +188,68 @@ gzvm_vm_ioctl_set_memory_region(struct gzvm *gzvm, return register_memslot_addr_range(gzvm, memslot); } +static int gzvm_vm_ioctl_irq_line(struct gzvm *gzvm, + struct gzvm_irq_level *irq_level) +{ + u32 irq = irq_level->irq; + unsigned int irq_type, vcpu_idx, irq_num; + bool level = irq_level->level; + + irq_type = (irq >> GZVM_IRQ_TYPE_SHIFT) & GZVM_IRQ_TYPE_MASK; + vcpu_idx = (irq >> GZVM_IRQ_VCPU_SHIFT) & GZVM_IRQ_VCPU_MASK; + vcpu_idx += ((irq >> GZVM_IRQ_VCPU2_SHIFT) & GZVM_IRQ_VCPU2_MASK) * + (GZVM_IRQ_VCPU_MASK + 1); + irq_num = (irq >> GZVM_IRQ_NUM_SHIFT) & GZVM_IRQ_NUM_MASK; + + return gzvm_irqchip_inject_irq(gzvm, vcpu_idx, irq_type, irq_num, + level); +} + +static int gzvm_vm_ioctl_create_device(struct gzvm *gzvm, void __user *argp) +{ + struct gzvm_create_device *gzvm_dev; + void *dev_data = NULL; + int ret; + + gzvm_dev = (struct gzvm_create_device *)alloc_pages_exact(PAGE_SIZE, + GFP_KERNEL); + if (!gzvm_dev) + return -ENOMEM; + if (copy_from_user(gzvm_dev, argp, sizeof(*gzvm_dev))) { + ret = -EFAULT; + goto err_free_dev; + } + + if (gzvm_dev->attr_addr != 0 && gzvm_dev->attr_size != 0) { + size_t attr_size = gzvm_dev->attr_size; + void __user *attr_addr = (void __user *)gzvm_dev->attr_addr; + + /* Size of device specific data should not be over a page. */ + if (attr_size > PAGE_SIZE) + return -EINVAL; + + dev_data = alloc_pages_exact(attr_size, GFP_KERNEL); + if (!dev_data) { + ret = -ENOMEM; + goto err_free_dev; + } + + if (copy_from_user(dev_data, attr_addr, attr_size)) { + ret = -EFAULT; + goto err_free_dev_data; + } + gzvm_dev->attr_addr = virt_to_phys(dev_data); + } + + ret = gzvm_arch_create_device(gzvm->vm_id, gzvm_dev); +err_free_dev_data: + if (dev_data) + free_pages_exact(dev_data, 0); +err_free_dev: + free_pages_exact(gzvm_dev, 0); + return ret; +} + static int gzvm_vm_ioctl_enable_cap(struct gzvm *gzvm, struct gzvm_enable_cap *cap, void __user *argp) @@ -220,6 +284,20 @@ static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl, ret = gzvm_vm_ioctl_set_memory_region(gzvm, &userspace_mem); break; } + case GZVM_IRQ_LINE: { + struct gzvm_irq_level irq_event; + + ret = -EFAULT; + if (copy_from_user(&irq_event, argp, sizeof(irq_event))) + goto out; + + ret = gzvm_vm_ioctl_irq_line(gzvm, &irq_event); + break; + } + case GZVM_CREATE_DEVICE: { + ret = gzvm_vm_ioctl_create_device(gzvm, argp); + break; + } case GZVM_ENABLE_CAP: { struct gzvm_enable_cap cap; diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h index 5736ddf97741..1e7c81597e9a 100644 --- a/include/linux/gzvm_drv.h +++ b/include/linux/gzvm_drv.h @@ -107,6 +107,10 @@ int gzvm_arch_create_vcpu(gzvm_id_t vm_id, int vcpuid, void *run); int gzvm_arch_vcpu_run(struct gzvm_vcpu *vcpu, __u64 *exit_reason); int gzvm_arch_destroy_vcpu(gzvm_id_t vm_id, int vcpuid); +int gzvm_arch_create_device(gzvm_id_t vm_id, struct gzvm_create_device *gzvm_dev); +int gzvm_arch_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx, u32 irq_type, + u32 irq, bool level); + extern struct platform_device *gzvm_debug_dev; #endif /* __GZVM_DRV_H__ */ diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h index dbf74d63379b..b39ace47d589 100644 --- a/include/uapi/linux/gzvm.h +++ b/include/uapi/linux/gzvm.h @@ -83,7 +83,43 @@ struct gzvm_userspace_memory_region { #define GZVM_IRQ_CPU_IRQ 0 #define GZVM_IRQ_CPU_FIQ 1 -/* ioctls for vcpu fds */ +struct gzvm_irq_level { + union { + __u32 irq; + __s32 status; + }; + __u32 level; +}; + +#define GZVM_IRQ_LINE _IOW(GZVM_IOC_MAGIC, 0x61, \ + struct gzvm_irq_level) + +enum gzvm_device_type { + GZVM_DEV_TYPE_ARM_VGIC_V3_DIST, + GZVM_DEV_TYPE_ARM_VGIC_V3_REDIST, + GZVM_DEV_TYPE_MAX, +}; + +struct gzvm_create_device { + __u32 dev_type; /* device type */ + __u32 id; /* out: device id */ + __u64 flags; /* device specific flags */ + __u64 dev_addr; /* device ipa address in VM's view */ + __u64 dev_reg_size; /* device register range size */ + /* + * If user -> kernel, this is user virtual address of device specific + * attributes (if needed). If kernel->hypervisor, this is ipa. + */ + __u64 attr_addr; + __u64 attr_size; /* size of device specific attributes */ +}; + +#define GZVM_CREATE_DEVICE _IOWR(GZVM_IOC_MAGIC, 0xe0, \ + struct gzvm_create_device) + +/* + * ioctls for vcpu fds + */ #define GZVM_RUN _IO(GZVM_IOC_MAGIC, 0x80) /* VM exit reason */ From patchwork Fri May 12 08:04:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi-De Wu X-Patchwork-Id: 681381 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E48DC77B75 for ; Fri, 12 May 2023 08:05:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240402AbjELIFu (ORCPT ); Fri, 12 May 2023 04:05:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36390 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240413AbjELIEy (ORCPT ); Fri, 12 May 2023 04:04:54 -0400 Received: from mailgw02.mediatek.com (unknown [210.61.82.184]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 18A8211B78; Fri, 12 May 2023 01:04:16 -0700 (PDT) X-UUID: 932c4fc0f09b11edb20a276fd37b9834-20230512 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mediatek.com; s=dk; h=Content-Type:MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:CC:To:From; bh=WvavxkIXfFSmquLoSHWtRn8G3Upwg2chJaG1DCsaQQk=; b=Ra6jF50PgSzauvfm38De24ftn08tmNcMZCKhADH8DAO6LvBS87uKTlsRVyK9FY0YhYD8cduLrFQH/7q+8LAEJMkOJobmU+rWp4f0R8ecO8cW/+undQxPXWuaMiUGQl1xQ7ZXRZpZoz+IhQJig1okbSM6mSqOTBdZbyThem26c/E=; X-CID-P-RULE: Release_Ham X-CID-O-INFO: VERSION:1.1.24, REQID:d4ec292b-d171-4d74-848a-35241cee5577, IP:0, U RL:0,TC:0,Content:-25,EDM:0,RT:0,SF:100,FILE:0,BULK:0,RULE:Release_Ham,ACT ION:release,TS:75 X-CID-INFO: VERSION:1.1.24, REQID:d4ec292b-d171-4d74-848a-35241cee5577, IP:0, URL :0,TC:0,Content:-25,EDM:0,RT:0,SF:100,FILE:0,BULK:0,RULE:Spam_GS981B3D,ACT ION:quarantine,TS:75 X-CID-META: VersionHash:178d4d4, CLOUDID:1365f53a-de1e-4348-bc35-c96f92f1dcbb, B ulkID:230512160411S4XULEVQ,BulkQuantity:1,Recheck:0,SF:19|48|38|29|28|17,T C:nil,Content:0,EDM:-3,IP:nil,URL:0,File:nil,Bulk:43,QS:nil,BEC:nil,COL:0, OSI:0,OSA:0,AV:0 X-CID-BVR: 0 X-CID-BAS: 0,_,0,_ X-UUID: 932c4fc0f09b11edb20a276fd37b9834-20230512 Received: from mtkmbs13n1.mediatek.inc [(172.21.101.193)] by mailgw02.mediatek.com (envelope-from ) (Generic MTA with TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 256/256) with ESMTP id 85647774; Fri, 12 May 2023 16:04:09 +0800 Received: from mtkmbs13n1.mediatek.inc (172.21.101.193) by mtkmbs13n1.mediatek.inc (172.21.101.193) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.26; Fri, 12 May 2023 16:04:08 +0800 Received: from mtksdccf07.mediatek.inc (172.21.84.99) by mtkmbs13n1.mediatek.inc (172.21.101.73) with Microsoft SMTP Server id 15.2.1118.26 via Frontend Transport; Fri, 12 May 2023 16:04:08 +0800 From: Yi-De Wu To: Yingshiuan Pan , Ze-Yu Wang , Yi-De Wu , Rob Herring , Krzysztof Kozlowski , Conor Dooley , Jonathan Corbet , Catalin Marinas , Will Deacon , Arnd Bergmann , Matthias Brugger , AngeloGioacchino Del Regno CC: , , , , , , Trilok Soni , David Bradil , Jade Shih , Miles Chen , Ivan Tseng , My Chuang , Shawn Hsiao , PeiLun Suei , Liju Chen Subject: [PATCH v3 6/7] virt: geniezone: Add irqfd support Date: Fri, 12 May 2023 16:04:04 +0800 Message-ID: <20230512080405.12043-7-yi-de.wu@mediatek.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20230512080405.12043-1-yi-de.wu@mediatek.com> References: <20230512080405.12043-1-yi-de.wu@mediatek.com> MIME-Version: 1.0 X-MTK: N Precedence: bulk List-ID: X-Mailing-List: devicetree@vger.kernel.org From: "Yingshiuan Pan" irqfd enables other threads than vcpu threads to inject virtual interrupt through irqfd asynchronously rather through ioctl interface. This interface is necessary for VMM which creates separated thread for IO handling or uses vhost devices. Signed-off-by: Yingshiuan Pan Signed-off-by: Yi-De Wu --- arch/arm64/geniezone/gzvm_arch_common.h | 6 + drivers/virt/geniezone/Makefile | 4 +- drivers/virt/geniezone/gzvm_irqfd.c | 537 ++++++++++++++++++++++++ drivers/virt/geniezone/gzvm_main.c | 5 + drivers/virt/geniezone/gzvm_vcpu.c | 1 + drivers/virt/geniezone/gzvm_vm.c | 18 + include/linux/gzvm_drv.h | 27 ++ include/uapi/linux/gzvm.h | 20 +- 8 files changed, 615 insertions(+), 3 deletions(-) create mode 100644 drivers/virt/geniezone/gzvm_irqfd.c diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h index 1b315264bf24..5affa28b935a 100644 --- a/arch/arm64/geniezone/gzvm_arch_common.h +++ b/arch/arm64/geniezone/gzvm_arch_common.h @@ -46,6 +46,7 @@ enum { #define MT_HVC_GZVM_CREATE_DEVICE GZVM_HCALL_ID(GZVM_FUNC_CREATE_DEVICE) #define MT_HVC_GZVM_PROBE GZVM_HCALL_ID(GZVM_FUNC_PROBE) #define MT_HVC_GZVM_ENABLE_CAP GZVM_HCALL_ID(GZVM_FUNC_ENABLE_CAP) +#define GIC_V3_NR_LRS 16 /** * gzvm_hypercall_wrapper() @@ -72,6 +73,11 @@ static inline gzvm_vcpu_id_t get_vcpuid_from_tuple(unsigned int tuple) return (gzvm_vcpu_id_t)(tuple & 0xffff); } +struct gzvm_vcpu_hwstate { + __u32 nr_lrs; + __u64 lr[GIC_V3_NR_LRS]; +}; + static inline unsigned int assemble_vm_vcpu_tuple(gzvm_id_t vmid, gzvm_vcpu_id_t vcpuid) { diff --git a/drivers/virt/geniezone/Makefile b/drivers/virt/geniezone/Makefile index 67ba3ed76ea7..aa52cee3ca8e 100644 --- a/drivers/virt/geniezone/Makefile +++ b/drivers/virt/geniezone/Makefile @@ -7,5 +7,5 @@ GZVM_DIR ?= ../../../drivers/virt/geniezone gzvm-y := $(GZVM_DIR)/gzvm_main.o $(GZVM_DIR)/gzvm_vm.o \ - $(GZVM_DIR)/gzvm_vcpu.o $(GZVM_DIR)/gzvm_irqchip.o - + $(GZVM_DIR)/gzvm_vcpu.o $(GZVM_DIR)/gzvm_irqchip.o \ + $(GZVM_DIR)/gzvm_irqfd.o diff --git a/drivers/virt/geniezone/gzvm_irqfd.c b/drivers/virt/geniezone/gzvm_irqfd.c new file mode 100644 index 000000000000..3a395b972fdf --- /dev/null +++ b/drivers/virt/geniezone/gzvm_irqfd.c @@ -0,0 +1,537 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2023 MediaTek Inc. + */ + +#include +#include +#include +#include "gzvm_common.h" + +struct gzvm_irq_ack_notifier { + struct hlist_node link; + unsigned int gsi; + void (*irq_acked)(struct gzvm_irq_ack_notifier *ian); +}; + +/** + * struct gzvm_kernel_irqfd_resampler - irqfd resampler descriptor. + * @gzvm: Poiner to gzvm. + * @list_head list: List of resampling struct _irqfd objects sharing this gsi. + * RCU list modified under gzvm->irqfds.resampler_lock. + * @notifier: gzvm irq ack notifier. + * @list_head link: Entry in list of gzvm->irqfd.resampler_list. + * Use for sharing esamplers among irqfds on the same gsi. + * Accessed and modified under gzvm->irqfds.resampler_lock. + * + * Resampling irqfds are a special variety of irqfds used to emulate + * level triggered interrupts. The interrupt is asserted on eventfd + * trigger. On acknowledgment through the irq ack notifier, the + * interrupt is de-asserted and userspace is notified through the + * resamplefd. All resamplers on the same gsi are de-asserted + * together, so we don't need to track the state of each individual + * user. We can also therefore share the same irq source ID. + */ +struct gzvm_kernel_irqfd_resampler { + struct gzvm *gzvm; + + struct list_head list; + struct gzvm_irq_ack_notifier notifier; + + struct list_head link; +}; + +/** + * struct gzvm_kernel_irqfd: gzvm kernel irqfd descriptor. + * @gzvm: Pointer to gzvm. + * @wait: Wait queue entry. + * @gsi: Used for level IRQ fast-path. + * @resampler: The resampler used by this irqfd (resampler-only). + * @resamplefd: Eventfd notified on resample (resampler-only). + * @resampler_link: Entry in list of irqfds for a resampler (resampler-only). + * @eventfd: Used for setup/shutdown. + */ +struct gzvm_kernel_irqfd { + struct gzvm *gzvm; + wait_queue_entry_t wait; + + int gsi; + + struct gzvm_kernel_irqfd_resampler *resampler; + + struct eventfd_ctx *resamplefd; + + struct list_head resampler_link; + + struct eventfd_ctx *eventfd; + struct list_head list; + poll_table pt; + struct work_struct shutdown; +}; + +static struct workqueue_struct *irqfd_cleanup_wq; + +/** + * irqfd_set_spi(): irqfd to inject virtual interrupt. + * @gzvm: Pointer to gzvm. + * @irq_source_id: irq source id. + * @irq: This is spi interrupt number (starts from 0 instead of 32). + * @level: irq triggered level. + * @line_status: irq status. + */ +static void irqfd_set_spi(struct gzvm *gzvm, int irq_source_id, u32 irq, + int level, bool line_status) +{ + if (level) + gzvm_irqchip_inject_irq(gzvm, irq_source_id, 0, irq, level); +} + +/** + * irqfd_resampler_ack() - Notify all of the resampler irqfds using this GSI + * when IRQ de-assert once. + * @ian: Pointer to gzvm_irq_ack_notifier. + * + * Since resampler irqfds share an IRQ source ID, we de-assert once + * then notify all of the resampler irqfds using this GSI. We can't + * do multiple de-asserts or we risk racing with incoming re-asserts. + */ +static void irqfd_resampler_ack(struct gzvm_irq_ack_notifier *ian) +{ + struct gzvm_kernel_irqfd_resampler *resampler; + struct gzvm *gzvm; + struct gzvm_kernel_irqfd *irqfd; + int idx; + + resampler = container_of(ian, + struct gzvm_kernel_irqfd_resampler, notifier); + gzvm = resampler->gzvm; + + irqfd_set_spi(gzvm, GZVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID, + resampler->notifier.gsi, 0, false); + + idx = srcu_read_lock(&gzvm->irq_srcu); + + list_for_each_entry_srcu(irqfd, &resampler->list, resampler_link, + srcu_read_lock_held(&gzvm->irq_srcu)) { + eventfd_signal(irqfd->resamplefd, 1); + } + + srcu_read_unlock(&gzvm->irq_srcu, idx); +} + +static void gzvm_register_irq_ack_notifier(struct gzvm *gzvm, + struct gzvm_irq_ack_notifier *ian) +{ + mutex_lock(&gzvm->irq_lock); + hlist_add_head_rcu(&ian->link, &gzvm->irq_ack_notifier_list); + mutex_unlock(&gzvm->irq_lock); +} + +static void gzvm_unregister_irq_ack_notifier(struct gzvm *gzvm, + struct gzvm_irq_ack_notifier *ian) +{ + mutex_lock(&gzvm->irq_lock); + hlist_del_init_rcu(&ian->link); + mutex_unlock(&gzvm->irq_lock); + synchronize_srcu(&gzvm->irq_srcu); +} + +static void irqfd_resampler_shutdown(struct gzvm_kernel_irqfd *irqfd) +{ + struct gzvm_kernel_irqfd_resampler *resampler = irqfd->resampler; + struct gzvm *gzvm = resampler->gzvm; + + mutex_lock(&gzvm->irqfds.resampler_lock); + + list_del_rcu(&irqfd->resampler_link); + synchronize_srcu(&gzvm->irq_srcu); + + if (list_empty(&resampler->list)) { + list_del(&resampler->link); + gzvm_unregister_irq_ack_notifier(gzvm, &resampler->notifier); + irqfd_set_spi(gzvm, GZVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID, + resampler->notifier.gsi, 0, false); + kfree(resampler); + } + + mutex_unlock(&gzvm->irqfds.resampler_lock); +} + +/** + * irqfd_shutdown() - Race-free decouple logic (ordering is critical). + * @work: Pointer to work_struct. + */ +static void irqfd_shutdown(struct work_struct *work) +{ + struct gzvm_kernel_irqfd *irqfd = + container_of(work, struct gzvm_kernel_irqfd, shutdown); + struct gzvm *gzvm = irqfd->gzvm; + u64 cnt; + + /* Make sure irqfd has been initialized in assign path. */ + synchronize_srcu(&gzvm->irq_srcu); + + /* + * Synchronize with the wait-queue and unhook ourselves to prevent + * further events. + */ + eventfd_ctx_remove_wait_queue(irqfd->eventfd, &irqfd->wait, &cnt); + + if (irqfd->resampler) { + irqfd_resampler_shutdown(irqfd); + eventfd_ctx_put(irqfd->resamplefd); + } + + /* + * It is now safe to release the object's resources + */ + eventfd_ctx_put(irqfd->eventfd); + kfree(irqfd); +} + +/** + * irqfd_is_active() - Assumes gzvm->irqfds.lock is held. + * @irqfd: Pointer to gzvm_kernel_irqfd. + */ +static bool irqfd_is_active(struct gzvm_kernel_irqfd *irqfd) +{ + return list_empty(&irqfd->list) ? false : true; +} + +/** + * irqfd_deactivate() - Mark the irqfd as inactive and schedule it for removal. + * assumes gzvm->irqfds.lock is held. + * @irqfd: Pointer to gzvm_kernel_irqfd. + */ +static void irqfd_deactivate(struct gzvm_kernel_irqfd *irqfd) +{ + if (!irqfd_is_active(irqfd)) + return; + + list_del_init(&irqfd->list); + + queue_work(irqfd_cleanup_wq, &irqfd->shutdown); +} + +/** + * irqfd_wakeup() - Wake up irqfd to do virtual interrupt injection. + * @wait: Pointer to wait_queue_entry_t. + * @mode: + * @sync: + * @key: + */ +static int irqfd_wakeup(wait_queue_entry_t *wait, unsigned int mode, int sync, + void *key) +{ + struct gzvm_kernel_irqfd *irqfd = + container_of(wait, struct gzvm_kernel_irqfd, wait); + __poll_t flags = key_to_poll(key); + struct gzvm *gzvm = irqfd->gzvm; + + if (flags & EPOLLIN) { + u64 cnt; + + eventfd_ctx_do_read(irqfd->eventfd, &cnt); + /* gzvm's irq injection is not blocked, don't need workq */ + irqfd_set_spi(gzvm, GZVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, + 1, false); + } + + if (flags & EPOLLHUP) { + /* The eventfd is closing, detach from GZVM */ + unsigned long iflags; + + spin_lock_irqsave(&gzvm->irqfds.lock, iflags); + + /* + * Do more check if someone deactivated the irqfd before + * we could acquire the irqfds.lock. + */ + if (irqfd_is_active(irqfd)) + irqfd_deactivate(irqfd); + + spin_unlock_irqrestore(&gzvm->irqfds.lock, iflags); + } + + return 0; +} + +static void irqfd_ptable_queue_proc(struct file *file, wait_queue_head_t *wqh, + poll_table *pt) +{ + struct gzvm_kernel_irqfd *irqfd = + container_of(pt, struct gzvm_kernel_irqfd, pt); + add_wait_queue_priority(wqh, &irqfd->wait); +} + +static int gzvm_irqfd_assign(struct gzvm *gzvm, struct gzvm_irqfd *args) +{ + struct gzvm_kernel_irqfd *irqfd, *tmp; + struct fd f; + struct eventfd_ctx *eventfd = NULL, *resamplefd = NULL; + int ret; + __poll_t events; + int idx; + + irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL_ACCOUNT); + if (!irqfd) + return -ENOMEM; + + irqfd->gzvm = gzvm; + irqfd->gsi = args->gsi; + irqfd->resampler = NULL; + + INIT_LIST_HEAD(&irqfd->list); + INIT_WORK(&irqfd->shutdown, irqfd_shutdown); + + f = fdget(args->fd); + if (!f.file) { + ret = -EBADF; + goto out; + } + + eventfd = eventfd_ctx_fileget(f.file); + if (IS_ERR(eventfd)) { + ret = PTR_ERR(eventfd); + goto fail; + } + + irqfd->eventfd = eventfd; + + if (args->flags & GZVM_IRQFD_FLAG_RESAMPLE) { + struct gzvm_kernel_irqfd_resampler *resampler; + + resamplefd = eventfd_ctx_fdget(args->resamplefd); + if (IS_ERR(resamplefd)) { + ret = PTR_ERR(resamplefd); + goto fail; + } + + irqfd->resamplefd = resamplefd; + INIT_LIST_HEAD(&irqfd->resampler_link); + + mutex_lock(&gzvm->irqfds.resampler_lock); + + list_for_each_entry(resampler, + &gzvm->irqfds.resampler_list, link) { + if (resampler->notifier.gsi == irqfd->gsi) { + irqfd->resampler = resampler; + break; + } + } + + if (!irqfd->resampler) { + resampler = kzalloc(sizeof(*resampler), + GFP_KERNEL_ACCOUNT); + if (!resampler) { + ret = -ENOMEM; + mutex_unlock(&gzvm->irqfds.resampler_lock); + goto fail; + } + + resampler->gzvm = gzvm; + INIT_LIST_HEAD(&resampler->list); + resampler->notifier.gsi = irqfd->gsi; + resampler->notifier.irq_acked = irqfd_resampler_ack; + INIT_LIST_HEAD(&resampler->link); + + list_add(&resampler->link, &gzvm->irqfds.resampler_list); + gzvm_register_irq_ack_notifier(gzvm, + &resampler->notifier); + irqfd->resampler = resampler; + } + + list_add_rcu(&irqfd->resampler_link, &irqfd->resampler->list); + synchronize_srcu(&gzvm->irq_srcu); + + mutex_unlock(&gzvm->irqfds.resampler_lock); + } + + /* + * Install our own custom wake-up handling so we are notified via + * a callback whenever someone signals the underlying eventfd + */ + init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup); + init_poll_funcptr(&irqfd->pt, irqfd_ptable_queue_proc); + + spin_lock_irq(&gzvm->irqfds.lock); + + ret = 0; + list_for_each_entry(tmp, &gzvm->irqfds.items, list) { + if (irqfd->eventfd != tmp->eventfd) + continue; + /* This fd is used for another irq already. */ + pr_err("already used: gsi=%d fd=%d\n", args->gsi, args->fd); + ret = -EBUSY; + spin_unlock_irq(&gzvm->irqfds.lock); + goto fail; + } + + idx = srcu_read_lock(&gzvm->irq_srcu); + + list_add_tail(&irqfd->list, &gzvm->irqfds.items); + + spin_unlock_irq(&gzvm->irqfds.lock); + + /* + * Check if there was an event already pending on the eventfd + * before we registered, and trigger it as if we didn't miss it. + */ + events = vfs_poll(f.file, &irqfd->pt); + + /* In case there is already a pending event */ + if (events & EPOLLIN) + irqfd_set_spi(gzvm, GZVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID, + irqfd->gsi, 1, false); + + srcu_read_unlock(&gzvm->irq_srcu, idx); + + /* + * do not drop the file until the irqfd is fully initialized, otherwise + * we might race against the EPOLLHUP + */ + fdput(f); + return 0; + +fail: + if (irqfd->resampler) + irqfd_resampler_shutdown(irqfd); + + if (resamplefd && !IS_ERR(resamplefd)) + eventfd_ctx_put(resamplefd); + + if (eventfd && !IS_ERR(eventfd)) + eventfd_ctx_put(eventfd); + + fdput(f); + +out: + kfree(irqfd); + return ret; +} + +static void gzvm_notify_acked_gsi(struct gzvm *gzvm, int gsi) +{ + struct gzvm_irq_ack_notifier *gian; + + hlist_for_each_entry_srcu(gian, &gzvm->irq_ack_notifier_list, + link, srcu_read_lock_held(&gzvm->irq_srcu)) + if (gian->gsi == gsi) + gian->irq_acked(gian); +} + +void gzvm_notify_acked_irq(struct gzvm *gzvm, unsigned int gsi) +{ + int idx; + + idx = srcu_read_lock(&gzvm->irq_srcu); + gzvm_notify_acked_gsi(gzvm, gsi); + srcu_read_unlock(&gzvm->irq_srcu, idx); +} + +/** + * gzvm_irqfd_deassign() - Shutdown any irqfd's that match fd+gsi. + * @gzvm: Pointer to gzvm. + * @args: Pointer to gzvm_irqfd. + */ +static int gzvm_irqfd_deassign(struct gzvm *gzvm, struct gzvm_irqfd *args) +{ + struct gzvm_kernel_irqfd *irqfd, *tmp; + struct eventfd_ctx *eventfd; + + eventfd = eventfd_ctx_fdget(args->fd); + if (IS_ERR(eventfd)) + return PTR_ERR(eventfd); + + spin_lock_irq(&gzvm->irqfds.lock); + + list_for_each_entry_safe(irqfd, tmp, &gzvm->irqfds.items, list) { + if (irqfd->eventfd == eventfd && irqfd->gsi == args->gsi) + irqfd_deactivate(irqfd); + } + + spin_unlock_irq(&gzvm->irqfds.lock); + eventfd_ctx_put(eventfd); + + /* + * Block until we know all outstanding shutdown jobs have completed + * so that we guarantee there will not be any more interrupts on this + * gsi once this deassign function returns. + */ + flush_workqueue(irqfd_cleanup_wq); + + return 0; +} + +int gzvm_irqfd(struct gzvm *gzvm, struct gzvm_irqfd *args) +{ + if (args->flags & + ~(GZVM_IRQFD_FLAG_DEASSIGN | GZVM_IRQFD_FLAG_RESAMPLE)) + return -EINVAL; + + if (args->flags & GZVM_IRQFD_FLAG_DEASSIGN) + return gzvm_irqfd_deassign(gzvm, args); + + return gzvm_irqfd_assign(gzvm, args); +} + +/** + * gzvm_vm_irqfd_init() - Initialize irqfd data structure per VM + */ +int gzvm_vm_irqfd_init(struct gzvm *gzvm) +{ + mutex_init(&gzvm->irq_lock); + + spin_lock_init(&gzvm->irqfds.lock); + INIT_LIST_HEAD(&gzvm->irqfds.items); + INIT_LIST_HEAD(&gzvm->irqfds.resampler_list); + if (init_srcu_struct(&gzvm->irq_srcu)) + return -EINVAL; + INIT_HLIST_HEAD(&gzvm->irq_ack_notifier_list); + mutex_init(&gzvm->irqfds.resampler_lock); + + return 0; +} + +/** + * gzvm_vm_irqfd_release() - This function is called as the gzvm VM fd is being + * released. Shutdown all irqfds that still remain open. + * @gzvm: Pointer to gzvm. + */ +void gzvm_vm_irqfd_release(struct gzvm *gzvm) +{ + struct gzvm_kernel_irqfd *irqfd, *tmp; + + spin_lock_irq(&gzvm->irqfds.lock); + + list_for_each_entry_safe(irqfd, tmp, &gzvm->irqfds.items, list) + irqfd_deactivate(irqfd); + + spin_unlock_irq(&gzvm->irqfds.lock); + + /* + * Block until we know all outstanding shutdown jobs have completed. + */ + flush_workqueue(irqfd_cleanup_wq); +} + +/** + * gzvm_drv_irqfd_init() - Erase flushing work items when a VM exits. + * + * Create a host-wide workqueue for issuing deferred shutdown requests + * aggregated from all vm* instances. We need our own isolated + * queue to ease flushing work items when a VM exits. + */ +int gzvm_drv_irqfd_init(void) +{ + irqfd_cleanup_wq = alloc_workqueue("gzvm-irqfd-cleanup", 0, 0); + if (!irqfd_cleanup_wq) + return -ENOMEM; + + return 0; +} + +void gzvm_drv_irqfd_exit(void) +{ + destroy_workqueue(irqfd_cleanup_wq); +} diff --git a/drivers/virt/geniezone/gzvm_main.c b/drivers/virt/geniezone/gzvm_main.c index e3fe3ad9ffce..121816a09c8e 100644 --- a/drivers/virt/geniezone/gzvm_main.c +++ b/drivers/virt/geniezone/gzvm_main.c @@ -113,11 +113,16 @@ static int gzvm_drv_probe(struct platform_device *pdev) return ret; gzvm_debug_dev = pdev; + ret = gzvm_drv_irqfd_init(); + if (ret) + return ret; + return 0; } static int gzvm_drv_remove(struct platform_device *pdev) { + gzvm_drv_irqfd_exit(); destroy_all_vm(); misc_deregister(&gzvm_dev); return 0; diff --git a/drivers/virt/geniezone/gzvm_vcpu.c b/drivers/virt/geniezone/gzvm_vcpu.c index d1bb2cba1893..fdf6e6297e66 100644 --- a/drivers/virt/geniezone/gzvm_vcpu.c +++ b/drivers/virt/geniezone/gzvm_vcpu.c @@ -211,6 +211,7 @@ int gzvm_vm_ioctl_create_vcpu(struct gzvm *gzvm, u32 cpuid) ret = -ENOMEM; goto free_vcpu; } + vcpu->hwstate = (void *)vcpu->run + PAGE_SIZE; vcpu->vcpuid = cpuid; vcpu->gzvm = gzvm; mutex_init(&vcpu->lock); diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c index 7ed4e4c3c1ee..403192731597 100644 --- a/drivers/virt/geniezone/gzvm_vm.c +++ b/drivers/virt/geniezone/gzvm_vm.c @@ -298,6 +298,15 @@ static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl, ret = gzvm_vm_ioctl_create_device(gzvm, argp); break; } + case GZVM_IRQFD: { + struct gzvm_irqfd data; + + ret = -EFAULT; + if (copy_from_user(&data, argp, sizeof(data))) + goto out; + ret = gzvm_irqfd(gzvm, &data); + break; + } case GZVM_ENABLE_CAP: { struct gzvm_enable_cap cap; @@ -322,6 +331,7 @@ static void gzvm_destroy_vm(struct gzvm *gzvm) mutex_lock(&gzvm->lock); + gzvm_vm_irqfd_release(gzvm); gzvm_destroy_vcpus(gzvm); gzvm_arch_destroy_vm(gzvm->vm_id); @@ -367,6 +377,14 @@ static struct gzvm *gzvm_create_vm(unsigned long vm_type) gzvm->mm = current->mm; mutex_init(&gzvm->lock); + ret = gzvm_vm_irqfd_init(gzvm); + if (ret) { + dev_err(&gzvm_debug_dev->dev, + "Failed to initialize irqfd\n"); + kfree(gzvm); + return ERR_PTR(ret); + } + mutex_lock(&gzvm_list_lock); list_add(&gzvm->vm_list, &gzvm_list); mutex_unlock(&gzvm_list_lock); diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h index 1e7c81597e9a..a54a7915c514 100644 --- a/include/linux/gzvm_drv.h +++ b/include/linux/gzvm_drv.h @@ -10,6 +10,7 @@ #include #include #include +#include #define MODULE_NAME "gzvm" #define GZVM_VCPU_MMAP_SIZE PAGE_SIZE @@ -25,6 +26,8 @@ #define ERR_NOT_SUPPORTED (-24) #define ERR_NOT_IMPLEMENTED (-27) #define ERR_FAULT (-40) +#define GZVM_USERSPACE_IRQ_SOURCE_ID 0 +#define GZVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID 1 /* * The following data structures are for data transferring between driver and @@ -68,6 +71,7 @@ struct gzvm_vcpu { /* lock of vcpu*/ struct mutex lock; struct gzvm_vcpu_run *run; + struct gzvm_vcpu_hwstate *hwstate; }; struct gzvm { @@ -77,8 +81,23 @@ struct gzvm { struct gzvm_memslot memslot[GZVM_MAX_MEM_REGION]; /* lock for list_add*/ struct mutex lock; + + struct { + /* lock for irqfds list operation */ + spinlock_t lock; + struct list_head items; + struct list_head resampler_list; + /* lock for irqfds resampler */ + struct mutex resampler_lock; + } irqfds; + struct list_head vm_list; gzvm_id_t vm_id; + + struct hlist_head irq_ack_notifier_list; + struct srcu_struct irq_srcu; + /* lock for irq injection */ + struct mutex irq_lock; }; long gzvm_dev_ioctl_check_extension(struct gzvm *gzvm, unsigned long args); @@ -111,6 +130,14 @@ int gzvm_arch_create_device(gzvm_id_t vm_id, struct gzvm_create_device *gzvm_dev int gzvm_arch_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx, u32 irq_type, u32 irq, bool level); +void gzvm_notify_acked_irq(struct gzvm *gzvm, unsigned int gsi); +int gzvm_irqfd(struct gzvm *gzvm, struct gzvm_irqfd *args); +int gzvm_drv_irqfd_init(void); +void gzvm_drv_irqfd_exit(void); +int gzvm_vm_irqfd_init(struct gzvm *gzvm); +void gzvm_vm_irqfd_release(struct gzvm *gzvm); +void gzvm_sync_hwstate(struct gzvm_vcpu *vcpu); + extern struct platform_device *gzvm_debug_dev; #endif /* __GZVM_DRV_H__ */ diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h index b39ace47d589..0751f9f4f76f 100644 --- a/include/uapi/linux/gzvm.h +++ b/include/uapi/linux/gzvm.h @@ -226,4 +226,22 @@ struct gzvm_one_reg { #define GZVM_REG_GENERIC 0x0000000000000000ULL -#endif /* __GZVM_H__ */ +#define GZVM_IRQFD_FLAG_DEASSIGN (1 << 0) +/** + * GZVM_IRQFD_FLAG_RESAMPLE indicates resamplefd is valid and specifies + * the irqfd to operate in resampling mode for level triggered interrupt + * emulation. + */ +#define GZVM_IRQFD_FLAG_RESAMPLE (1 << 1) + +struct gzvm_irqfd { + __u32 fd; + __u32 gsi; + __u32 flags; + __u32 resamplefd; + __u8 pad[16]; +}; + +#define GZVM_IRQFD _IOW(GZVM_IOC_MAGIC, 0x76, struct gzvm_irqfd) + +#endif /* __GZVM__H__ */ From patchwork Fri May 12 08:04:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi-De Wu X-Patchwork-Id: 681382 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9FF51C77B7C for ; Fri, 12 May 2023 08:05:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240112AbjELIFj (ORCPT ); Fri, 12 May 2023 04:05:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36196 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240401AbjELIEs (ORCPT ); Fri, 12 May 2023 04:04:48 -0400 Received: from mailgw02.mediatek.com (unknown [210.61.82.184]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3F42211B74; Fri, 12 May 2023 01:04:15 -0700 (PDT) X-UUID: 932f71aaf09b11edb20a276fd37b9834-20230512 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mediatek.com; s=dk; h=Content-Type:MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:CC:To:From; bh=JjgYCtFV/9DpjvnW+raRsijyxlEVvjaJ25KjfOxBbm8=; b=AT8yj85169IPpe66r1X7p8uC0BIKOnsleu9YmW/epYxIksuvTe4prs0D73WbFEA8BWJmiJYvfGAd8Ui71Nhjhp5Jet+TjSPt5YsKph1J76uUbVXRF/RITYKsxyno797bIM/2qrQOvFBDkx4PBbQnD/xOzOhk3vjm7zimPqQ+qm4=; X-CID-P-RULE: Release_Ham X-CID-O-INFO: VERSION:1.1.24, REQID:2a664c66-c3d3-4f76-9c5f-0fbb8fc75dd8, IP:0, U RL:0,TC:0,Content:-25,EDM:0,RT:0,SF:100,FILE:0,BULK:0,RULE:Release_Ham,ACT ION:release,TS:75 X-CID-INFO: VERSION:1.1.24, REQID:2a664c66-c3d3-4f76-9c5f-0fbb8fc75dd8, IP:0, URL :0,TC:0,Content:-25,EDM:0,RT:0,SF:100,FILE:0,BULK:0,RULE:Spam_GS981B3D,ACT ION:quarantine,TS:75 X-CID-META: VersionHash:178d4d4, CLOUDID:1165f53a-de1e-4348-bc35-c96f92f1dcbb, B ulkID:230512160411S4XULEVQ,BulkQuantity:0,Recheck:0,SF:29|28|17|19|48|38,T C:nil,Content:0,EDM:-3,IP:nil,URL:11|1,File:nil,Bulk:nil,QS:nil,BEC:nil,CO L:0,OSI:0,OSA:0,AV:0 X-CID-BVR: 0 X-CID-BAS: 0,_,0,_ X-UUID: 932f71aaf09b11edb20a276fd37b9834-20230512 Received: from mtkmbs13n1.mediatek.inc [(172.21.101.193)] by mailgw02.mediatek.com (envelope-from ) (Generic MTA with TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 256/256) with ESMTP id 1446368848; Fri, 12 May 2023 16:04:09 +0800 Received: from mtkmbs13n1.mediatek.inc (172.21.101.193) by mtkmbs13n1.mediatek.inc (172.21.101.193) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.26; Fri, 12 May 2023 16:04:08 +0800 Received: from mtksdccf07.mediatek.inc (172.21.84.99) by mtkmbs13n1.mediatek.inc (172.21.101.73) with Microsoft SMTP Server id 15.2.1118.26 via Frontend Transport; Fri, 12 May 2023 16:04:08 +0800 From: Yi-De Wu To: Yingshiuan Pan , Ze-Yu Wang , Yi-De Wu , Rob Herring , Krzysztof Kozlowski , Conor Dooley , Jonathan Corbet , Catalin Marinas , Will Deacon , Arnd Bergmann , Matthias Brugger , AngeloGioacchino Del Regno CC: , , , , , , Trilok Soni , David Bradil , Jade Shih , Miles Chen , Ivan Tseng , My Chuang , Shawn Hsiao , PeiLun Suei , Liju Chen Subject: [PATCH v3 7/7] virt: geniezone: Add ioeventfd support Date: Fri, 12 May 2023 16:04:05 +0800 Message-ID: <20230512080405.12043-8-yi-de.wu@mediatek.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20230512080405.12043-1-yi-de.wu@mediatek.com> References: <20230512080405.12043-1-yi-de.wu@mediatek.com> MIME-Version: 1.0 X-MTK: N Precedence: bulk List-ID: X-Mailing-List: devicetree@vger.kernel.org From: "Yingshiuan Pan" Ioeventfd leverages eventfd to provide asynchronous notification mechanism for VMM. VMM can register a mmio address and bind with an eventfd. Once a mmio trap occurs on this registered region, its corresponding eventfd will be notified. Signed-off-by: Yingshiuan Pan Signed-off-by: Yi-De Wu --- drivers/virt/geniezone/Makefile | 2 +- drivers/virt/geniezone/gzvm_ioeventfd.c | 263 ++++++++++++++++++++++++ drivers/virt/geniezone/gzvm_vcpu.c | 29 ++- drivers/virt/geniezone/gzvm_vm.c | 17 ++ include/linux/gzvm_drv.h | 11 + include/uapi/linux/gzvm.h | 23 +++ 6 files changed, 342 insertions(+), 3 deletions(-) create mode 100644 drivers/virt/geniezone/gzvm_ioeventfd.c diff --git a/drivers/virt/geniezone/Makefile b/drivers/virt/geniezone/Makefile index aa52cee3ca8e..25493a4d1c63 100644 --- a/drivers/virt/geniezone/Makefile +++ b/drivers/virt/geniezone/Makefile @@ -8,4 +8,4 @@ GZVM_DIR ?= ../../../drivers/virt/geniezone gzvm-y := $(GZVM_DIR)/gzvm_main.o $(GZVM_DIR)/gzvm_vm.o \ $(GZVM_DIR)/gzvm_vcpu.o $(GZVM_DIR)/gzvm_irqchip.o \ - $(GZVM_DIR)/gzvm_irqfd.o + $(GZVM_DIR)/gzvm_irqfd.o $(GZVM_DIR)/gzvm_ioeventfd.o diff --git a/drivers/virt/geniezone/gzvm_ioeventfd.c b/drivers/virt/geniezone/gzvm_ioeventfd.c new file mode 100644 index 000000000000..f5664cab98c3 --- /dev/null +++ b/drivers/virt/geniezone/gzvm_ioeventfd.c @@ -0,0 +1,263 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2023 MediaTek Inc. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct gzvm_ioevent { + struct list_head list; + __u64 addr; + __u32 len; + struct eventfd_ctx *evt_ctx; + __u64 datamatch; + bool wildcard; +}; + +/** + * ioeventfd_check_collision() - Check collison assumes gzvm->slots_lock held. + * @gzvm: Pointer to gzvm. + * @p: Pointer to gzvm_ioevent. + */ +static bool ioeventfd_check_collision(struct gzvm *gzvm, struct gzvm_ioevent *p) +{ + struct gzvm_ioevent *_p; + + list_for_each_entry(_p, &gzvm->ioevents, list) + if (_p->addr == p->addr && + (!_p->len || !p->len || + (_p->len == p->len && + (_p->wildcard || p->wildcard || + _p->datamatch == p->datamatch)))) + return true; + + return false; +} + +static void gzvm_ioevent_release(struct gzvm_ioevent *p) +{ + eventfd_ctx_put(p->evt_ctx); + list_del(&p->list); + kfree(p); +} + +static bool gzvm_ioevent_in_range(struct gzvm_ioevent *p, __u64 addr, int len, + const void *val) +{ + u64 _val; + + if (addr != p->addr) + /* address must be precise for a hit */ + return false; + + if (!p->len) + /* length = 0 means only look at the address, so always a hit */ + return true; + + if (len != p->len) + /* address-range must be precise for a hit */ + return false; + + if (p->wildcard) + /* all else equal, wildcard is always a hit */ + return true; + + /* otherwise, we have to actually compare the data */ + + WARN_ON_ONCE(!IS_ALIGNED((unsigned long)val, len)); + + switch (len) { + case 1: + _val = *(u8 *)val; + break; + case 2: + _val = *(u16 *)val; + break; + case 4: + _val = *(u32 *)val; + break; + case 8: + _val = *(u64 *)val; + break; + default: + return false; + } + + return _val == p->datamatch; +} + +static int gzvm_deassign_ioeventfd(struct gzvm *gzvm, + struct gzvm_ioeventfd *args) +{ + struct gzvm_ioevent *p, *tmp; + struct eventfd_ctx *evt_ctx; + int ret = -ENOENT; + bool wildcard; + + evt_ctx = eventfd_ctx_fdget(args->fd); + if (IS_ERR(evt_ctx)) + return PTR_ERR(evt_ctx); + + wildcard = !(args->flags & GZVM_IOEVENTFD_FLAG_DATAMATCH); + + mutex_lock(&gzvm->lock); + + list_for_each_entry_safe(p, tmp, &gzvm->ioevents, list) { + if (p->evt_ctx != evt_ctx || + p->addr != args->addr || + p->len != args->len || + p->wildcard != wildcard) + continue; + + if (!p->wildcard && p->datamatch != args->datamatch) + continue; + + gzvm_ioevent_release(p); + ret = 0; + break; + } + + mutex_unlock(&gzvm->lock); + + /* got in the front of this function */ + eventfd_ctx_put(evt_ctx); + + return ret; +} + +static int gzvm_assign_ioeventfd(struct gzvm *gzvm, struct gzvm_ioeventfd *args) +{ + struct eventfd_ctx *evt_ctx; + struct gzvm_ioevent *evt; + int ret; + + evt_ctx = eventfd_ctx_fdget(args->fd); + if (IS_ERR(evt_ctx)) + return PTR_ERR(evt_ctx); + + evt = kmalloc(sizeof(*evt), GFP_KERNEL); + if (!evt) + return -ENOMEM; + *evt = (struct gzvm_ioevent) { + .addr = args->addr, + .len = args->len, + .evt_ctx = evt_ctx, + }; + if (args->flags & GZVM_IOEVENTFD_FLAG_DATAMATCH) { + evt->datamatch = args->datamatch; + evt->wildcard = false; + } else { + evt->wildcard = true; + } + + if (ioeventfd_check_collision(gzvm, evt)) { + ret = -EEXIST; + goto err_free; + } + + mutex_lock(&gzvm->lock); + list_add_tail(&evt->list, &gzvm->ioevents); + mutex_unlock(&gzvm->lock); + + return 0; + +err_free: + kfree(evt); + eventfd_ctx_put(evt_ctx); + return ret; +} + +/** + * gzvm_ioeventfd_check_valid() - Check user arguments is valid. + * @args: Pointer to gzvm_ioeventfd. + * + * Return true if user arguments are valid. + * Return false if user arguments are invalid. + */ +static bool gzvm_ioeventfd_check_valid(struct gzvm_ioeventfd *args) +{ + /* must be natural-word sized, or 0 to ignore length */ + switch (args->len) { + case 0: + case 1: + case 2: + case 4: + case 8: + break; + default: + return false; + } + + /* check for range overflow */ + if (args->addr + args->len < args->addr) + return false; + + /* check for extra flags that we don't understand */ + if (args->flags & ~GZVM_IOEVENTFD_VALID_FLAG_MASK) + return false; + + /* ioeventfd with no length can't be combined with DATAMATCH */ + if (!args->len && (args->flags & GZVM_IOEVENTFD_FLAG_DATAMATCH)) + return false; + + /* gzvm does not support pio bus ioeventfd */ + if (args->flags & GZVM_IOEVENTFD_FLAG_PIO) + return false; + + return true; +} + +/** + * gzvm_ioeventfd() - Register ioevent to ioevent list. + * @gzvm: Pointer to gzvm. + * @args: Pointer to gzvm_ioeventfd. + */ +int gzvm_ioeventfd(struct gzvm *gzvm, struct gzvm_ioeventfd *args) +{ + if (gzvm_ioeventfd_check_valid(args) == false) + return -EINVAL; + + if (args->flags & GZVM_IOEVENTFD_FLAG_DEASSIGN) + return gzvm_deassign_ioeventfd(gzvm, args); + return gzvm_assign_ioeventfd(gzvm, args); +} + +/** + * gzvm_ioevent_write() - Travers this vm's registered ioeventfd to see if + * need notifying it. + * @vcpu: Pointer to vcpu. + * @addr: mmio address. + * @len: mmio size. + * @val: Pointer to void. + * + * Return true if this io is already sent to ioeventfd's listener. + * Return false if we cannot find any ioeventfd registering this mmio write. + */ +bool gzvm_ioevent_write(struct gzvm_vcpu *vcpu, __u64 addr, int len, + const void *val) +{ + struct gzvm_ioevent *e; + + list_for_each_entry(e, &vcpu->gzvm->ioevents, list) { + if (gzvm_ioevent_in_range(e, addr, len, val)) { + eventfd_signal(e->evt_ctx, 1); + return true; + } + } + return false; +} + +int gzvm_init_ioeventfd(struct gzvm *gzvm) +{ + INIT_LIST_HEAD(&gzvm->ioevents); + + return 0; +} diff --git a/drivers/virt/geniezone/gzvm_vcpu.c b/drivers/virt/geniezone/gzvm_vcpu.c index fdf6e6297e66..d57385395a21 100644 --- a/drivers/virt/geniezone/gzvm_vcpu.c +++ b/drivers/virt/geniezone/gzvm_vcpu.c @@ -49,10 +49,34 @@ static long gzvm_vcpu_update_one_reg(struct gzvm_vcpu *vcpu, void * __user argp, return 0; } +/** + * gzvm_vcpu_handle_mmio() - Handle mmio in kernel space. + * @vcpu: Pointer to vcpu. + * + * Return: + * * true - This mmio exit has been processed. + * * false - This mmio exit has not been processed, require userspace. + */ +static bool gzvm_vcpu_handle_mmio(struct gzvm_vcpu *vcpu) +{ + __u64 addr; + __u32 len; + const void *val_ptr; + + /* So far, we don't have in-kernel mmio read handler */ + if (!vcpu->run->mmio.is_write) + return false; + addr = vcpu->run->mmio.phys_addr; + len = vcpu->run->mmio.size; + val_ptr = &vcpu->run->mmio.data; + + return gzvm_ioevent_write(vcpu, addr, len, val_ptr); +} + /** * gzvm_vcpu_run() - Handle vcpu run ioctl, entry point to guest and exit * point from guest - * @argp: pointer to struct gzvm_vcpu_run in userspace + * @argp: Pointer to struct gzvm_vcpu_run in userspace */ static long gzvm_vcpu_run(struct gzvm_vcpu *vcpu, void * __user argp) { @@ -70,7 +94,8 @@ static long gzvm_vcpu_run(struct gzvm_vcpu *vcpu, void * __user argp) switch (exit_reason) { case GZVM_EXIT_MMIO: - need_userspace = true; + if (!gzvm_vcpu_handle_mmio(vcpu)) + need_userspace = true; break; /** * it's geniezone's responsibility to fill corresponding data diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c index 403192731597..c90111e4b23e 100644 --- a/drivers/virt/geniezone/gzvm_vm.c +++ b/drivers/virt/geniezone/gzvm_vm.c @@ -307,6 +307,15 @@ static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl, ret = gzvm_irqfd(gzvm, &data); break; } + case GZVM_IOEVENTFD: { + struct gzvm_ioeventfd data; + + ret = -EFAULT; + if (copy_from_user(&data, argp, sizeof(data))) + goto out; + ret = gzvm_ioeventfd(gzvm, &data); + break; + } case GZVM_ENABLE_CAP: { struct gzvm_enable_cap cap; @@ -385,6 +394,14 @@ static struct gzvm *gzvm_create_vm(unsigned long vm_type) return ERR_PTR(ret); } + ret = gzvm_init_ioeventfd(gzvm); + if (ret) { + dev_err(&gzvm_debug_dev->dev, + "Failed to initialize ioeventfd\n"); + kfree(gzvm); + return ERR_PTR(ret); + } + mutex_lock(&gzvm_list_lock); list_add(&gzvm->vm_list, &gzvm_list); mutex_unlock(&gzvm_list_lock); diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h index a54a7915c514..3c9f617d6bf1 100644 --- a/include/linux/gzvm_drv.h +++ b/include/linux/gzvm_drv.h @@ -6,6 +6,7 @@ #ifndef __GZVM_DRV_H__ #define __GZVM_DRV_H__ +#include #include #include #include @@ -91,6 +92,8 @@ struct gzvm { struct mutex resampler_lock; } irqfds; + struct list_head ioevents; + struct list_head vm_list; gzvm_id_t vm_id; @@ -140,4 +143,12 @@ void gzvm_sync_hwstate(struct gzvm_vcpu *vcpu); extern struct platform_device *gzvm_debug_dev; +int gzvm_init_ioeventfd(struct gzvm *gzvm); +int gzvm_ioeventfd(struct gzvm *gzvm, struct gzvm_ioeventfd *args); +bool gzvm_ioevent_write(struct gzvm_vcpu *vcpu, __u64 addr, int len, + const void *val); +void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt); +struct vm_area_struct *vma_lookup(struct mm_struct *mm, unsigned long addr); +void add_wait_queue_priority(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry); + #endif /* __GZVM_DRV_H__ */ diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h index 0751f9f4f76f..cf4d3cf296d3 100644 --- a/include/uapi/linux/gzvm.h +++ b/include/uapi/linux/gzvm.h @@ -244,4 +244,27 @@ struct gzvm_irqfd { #define GZVM_IRQFD _IOW(GZVM_IOC_MAGIC, 0x76, struct gzvm_irqfd) +enum { + gzvm_ioeventfd_flag_nr_datamatch, + gzvm_ioeventfd_flag_nr_pio, + gzvm_ioeventfd_flag_nr_deassign, + gzvm_ioeventfd_flag_nr_max, +}; + +#define GZVM_IOEVENTFD_FLAG_DATAMATCH (1 << gzvm_ioeventfd_flag_nr_datamatch) +#define GZVM_IOEVENTFD_FLAG_PIO (1 << gzvm_ioeventfd_flag_nr_pio) +#define GZVM_IOEVENTFD_FLAG_DEASSIGN (1 << gzvm_ioeventfd_flag_nr_deassign) +#define GZVM_IOEVENTFD_VALID_FLAG_MASK ((1 << gzvm_ioeventfd_flag_nr_max) - 1) + +struct gzvm_ioeventfd { + __u64 datamatch; + __u64 addr; /* legal pio/mmio address */ + __u32 len; /* 1, 2, 4, or 8 bytes; or 0 to ignore length */ + __s32 fd; + __u32 flags; + __u8 pad[36]; +}; + +#define GZVM_IOEVENTFD _IOW(GZVM_IOC_MAGIC, 0x79, struct gzvm_ioeventfd) + #endif /* __GZVM__H__ */