From patchwork Mon Jun 2 07:49:29 2014
X-Patchwork-Submitter: Auger Eric
X-Patchwork-Id: 31255
From: Eric Auger
To: eric.auger@st.com, christoffer.dall@linaro.org, qemu-devel@nongnu.org, kim.phillips@freescale.com, a.rigo@virtualopensystems.com
Cc: eric.auger@linaro.org, christophe.barnichon@st.com, kvmarm@lists.cs.columbia.edu, alex.williamson@redhat.com, agraf@suse.de, peter.maydell@linaro.org, stuart.yoder@freescale.com,
 a.motakis@virtualopensystems.com, patches@linaro.org
Subject: [RFC v3 05/10] vfio: Add initial IRQ support in platform device
Date: Mon, 2 Jun 2014 08:49:29 +0100
Message-Id: <1401695374-4287-6-git-send-email-eric.auger@linaro.org>
X-Mailer: git-send-email 1.8.3.2
In-Reply-To: <1401695374-4287-1-git-send-email-eric.auger@linaro.org>
References: <1401695374-4287-1-git-send-email-eric.auger@linaro.org>

This patch adds initial support for assigning device IRQs to a KVM
guest. The code is modeled on the PCI INTx code.

General principle of IRQ handling:
when a physical IRQ occurs, the VFIO driver signals an eventfd that was
registered by the QEMU VFIO platform device. The eventfd handler
(vfio_intp_interrupt) injects the IRQ through QEMU/KVM and also
disables the MMIO region fast path (where MMIO regions are mapped as
RAM). The purpose is to trap the guest's reset of the IRQ status
register. The physical interrupt, which the VFIO driver masked at the
instant it signaled the eventfd, is unmasked on the first read/write
access to any MMIO region.

Only a single IRQ can be forwarded to the guest at a time, i.e. before
a new virtual IRQ can be injected, the previous active one must have
completed.

When no IRQ is pending anymore, the fast path can be restored. This is
done when the mmap_timer fires.

irqfd support will be added in a subsequent patch.
irqfd provides a framework where the eventfd is handled on the kernel
side instead of in user space as is currently done, which improves
performance.

Although the code is prepared to support multiple IRQs, this has not
been tested at this stage.

Tested on the Calxeda Midway xgmac, which can be directly assigned to
one guest (unfortunately only the main IRQ is exercised). A KVM patch
is required to invalidate stage2 entries on RAM memory region
destruction (https://patches.linaro.org/27691/). Without that patch,
the slow/fast path switch cannot work.

Changes v2 -> v3:
- Move mmap_timer and mmap_timeout into the new VFIODevice struct as
  part of the PCI/platform factorization.
- Multiple IRQ handling (a pending IRQ queue is added), not tested.
- Create vfio_mmap_set_enabled as in the PCI code.
- Name of the irq changed in virt.

Signed-off-by: Eric Auger
---
 hw/arm/virt.c         |  13 +-
 hw/vfio/pci.c         |  22 ++--
 hw/vfio/platform.c    | 323 ++++++++++++++++++++++++++++++++++++++++++++++++--
 hw/vfio/vfio-common.h |  10 +-
 4 files changed, 346 insertions(+), 22 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index becd76b..f5693aa 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -112,6 +112,7 @@ static const MemMapEntry a15memmap[] = {
 static const int a15irqmap[] = {
     [VIRT_UART] = 1,
     [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
+    [VIRT_ETHERNET] = 77,
 };
 
 static VirtBoardInfo machines[] = {
@@ -348,8 +349,14 @@ static void create_ethernet(const VirtBoardInfo *vbi, qemu_irq *pic)
     hwaddr base = vbi->memmap[VIRT_ETHERNET].base;
     hwaddr size = vbi->memmap[VIRT_ETHERNET].size;
     const char compat[] = "calxeda,hb-xgmac";
+    int main_irq = vbi->irqmap[VIRT_ETHERNET];
+    int power_irq = main_irq+1;
+    int low_power_irq = main_irq+2;
 
-    sysbus_create_simple("vfio-platform", base, NULL);
+    sysbus_create_varargs("vfio-platform", base,
+                          pic[main_irq],
+                          pic[power_irq],
+                          pic[low_power_irq], NULL);
 
     nodename = g_strdup_printf("/ethernet@%" PRIx64, base);
     qemu_fdt_add_subnode(vbi->fdt, nodename);
@@ -357,6 +364,10 @@ static void create_ethernet(const VirtBoardInfo *vbi, qemu_irq *pic)
     /* Note that we can't use setprop_string because of the embedded NUL */
     qemu_fdt_setprop(vbi->fdt, nodename, "compatible", compat, sizeof(compat));
     qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg", 2, base, 2, size);
+    qemu_fdt_setprop_cells(vbi->fdt, nodename, "interrupts",
+                           0x0, main_irq, 0x4,
+                           0x0, power_irq, 0x4,
+                           0x0, low_power_irq, 0x4);
 
     g_free(nodename);
 }
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index ad0c2a0..1b49205 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -83,8 +83,6 @@ typedef struct VFIOINTx {
     EventNotifier interrupt; /* eventfd triggered on interrupt */
     EventNotifier unmask; /* eventfd for unmask on QEMU bypass */
     PCIINTxRoute route; /* routing info for QEMU bypass */
-    uint32_t mmap_timeout; /* delay to re-enable mmaps after interrupt */
-    QEMUTimer *mmap_timer; /* enable mmaps after periods w/o interrupts */
 } VFIOINTx;
 
 typedef struct VFIOMSIVector {
@@ -196,8 +194,8 @@ static void vfio_intx_mmap_enable(void *opaque)
     VFIOPCIDevice *vdev = opaque;
 
     if (vdev->intx.pending) {
-        timer_mod(vdev->intx.mmap_timer,
-                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + vdev->intx.mmap_timeout);
+        timer_mod(vdev->vdev.mmap_timer,
+                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + vdev->vdev.mmap_timeout);
         return;
     }
 
@@ -217,9 +215,9 @@ static void vfio_intx_interrupt(void *opaque)
     vdev->intx.pending = true;
     pci_irq_assert(&vdev->pdev);
     vfio_mmap_set_enabled(vdev, false);
-    if (vdev->intx.mmap_timeout) {
-        timer_mod(vdev->intx.mmap_timer,
-                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + vdev->intx.mmap_timeout);
+    if (vdev->vdev.mmap_timeout) {
+        timer_mod(vdev->vdev.mmap_timer,
+                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + vdev->vdev.mmap_timeout);
     }
 }
 
@@ -457,7 +455,7 @@ static void vfio_disable_intx(VFIOPCIDevice *vdev)
 {
     int fd;
 
-    timer_del(vdev->intx.mmap_timer);
+    timer_del(vdev->vdev.mmap_timer);
     vfio_disable_intx_kvm(vdev);
     vfio_disable_irqindex(&vdev->vdev, VFIO_PCI_INTX_IRQ_INDEX);
     vdev->intx.pending = false;
@@ -3079,7 +3077,7 @@ static int vfio_initfn(PCIDevice *pdev)
     }
 
     if (vfio_pci_read_config(&vdev->pdev, PCI_INTERRUPT_PIN, 1)) {
-        vdev->intx.mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
+        vdev->vdev.mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
                                                   vfio_intx_mmap_enable, vdev);
         pci_device_set_intx_routing_notifier(&vdev->pdev, vfio_update_irq);
         ret = vfio_enable_intx(vdev);
@@ -3112,8 +3110,8 @@ static void vfio_exitfn(PCIDevice *pdev)
     vfio_unregister_err_notifier(vdev);
     pci_device_set_intx_routing_notifier(&vdev->pdev, NULL);
     vfio_disable_interrupts(vdev);
-    if (vdev->intx.mmap_timer) {
-        timer_free(vdev->intx.mmap_timer);
+    if (vdev->vdev.mmap_timer) {
+        timer_free(vdev->vdev.mmap_timer);
     }
     vfio_teardown_msi(vdev);
     vfio_unmap_bars(vdev);
@@ -3158,7 +3156,7 @@ post_reset:
 static Property vfio_pci_dev_properties[] = {
     DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host),
     DEFINE_PROP_UINT32("x-intx-mmap-timeout-ms", VFIOPCIDevice,
-                       intx.mmap_timeout, 1100),
+                       vdev.mmap_timeout, 1100),
     DEFINE_PROP_BIT("x-vga", VFIOPCIDevice, features,
                     VFIO_FEATURE_ENABLE_VGA_BIT, false),
     DEFINE_PROP_INT32("bootindex", VFIOPCIDevice, bootindex, -1),
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index 646aa53..5b9451f 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -24,11 +24,25 @@
 
 #include "vfio-common.h"
 
+typedef struct VFIOINTp {
+    QLIST_ENTRY(VFIOINTp) next; /* entry for IRQ list */
+    QSIMPLEQ_ENTRY(VFIOINTp) pqnext; /* entry for pending IRQ queue */
+    EventNotifier interrupt; /* eventfd triggered on interrupt */
+    EventNotifier unmask; /* eventfd for unmask on QEMU bypass */
+    qemu_irq qemuirq;
+    struct VFIOPlatformDevice *vdev; /* back pointer to device */
+    int state; /* inactive, pending, active */
+    bool kvm_accel; /* set when QEMU bypass through KVM enabled */
+    uint8_t pin; /* index */
+} VFIOINTp;
+
 typedef struct VFIOPlatformDevice {
     SysBusDevice sbdev;
     VFIODevice vdev; /* not a QOM object */
-/* interrupts to come later on */
+    QLIST_HEAD(, VFIOINTp) intp_list; /* list of IRQ */
+    /* queue of pending IRQ */
+    QSIMPLEQ_HEAD(pending_intp_queue, VFIOINTp) pending_intp_queue;
 } VFIOPlatformDevice;
 
 
 
@@ -38,9 +52,11 @@ static const MemoryRegionOps vfio_region_ops = {
     .endianness = DEVICE_NATIVE_ENDIAN,
 };
 
+static void vfio_intp_interrupt(void *opaque);
+
 /*
  * It is mandatory to pass a VFIOPlatformDevice since VFIODevice
- * is not an Object and cannot be passed to memory region functions
+ * is not a QOM Object and cannot be passed to memory region functions
  */
 
 static void vfio_map_region(VFIOPlatformDevice *vdev, int nr)
@@ -51,7 +67,7 @@ static void vfio_map_region(VFIOPlatformDevice *vdev, int nr)
 
     snprintf(name, sizeof(name), "VFIO %s region %d", vdev->vdev.name, nr);
 
-    /* A "slow" read/write mapping underlies all regions */
+    /* A "slow" read/write mapping underlies all regions */
     memory_region_init_io(&region->mem, OBJECT(vdev), &vfio_region_ops,
                           region, name, size);
 
@@ -145,18 +161,292 @@ static int vfio_platform_hot_reset_multi(VFIODevice *vdev)
     return 0;
 }
 
+/*
+ * eoi function is called on the first access to any MMIO region
+ * after an IRQ was triggered. It is assumed this access corresponds
+ * to the IRQ status register reset.
+ * With such a mechanism, a single IRQ can be handled at a time since
+ * there is no way to know which IRQ was completed by the guest.
+ * (we would need additional details about the IRQ status register mask)
+ */
+
+static void vfio_platform_eoi(VFIODevice *vdev)
+{
+    VFIOINTp *intp;
+    VFIOPlatformDevice *vplatdev = container_of(vdev, VFIOPlatformDevice, vdev);
+    bool eoi_done = false;
+
+    QLIST_FOREACH(intp, &vplatdev->intp_list, next) {
+        if (intp->state == VFIO_IRQ_ACTIVE) {
+            if (eoi_done) {
+                error_report("several IRQ pending: "
+                             "this case should not happen!\n");
+            }
+            DPRINTF("EOI IRQ #%d fd=%d\n",
+                    intp->pin, event_notifier_get_fd(&intp->interrupt));
+            intp->state = VFIO_IRQ_INACTIVE;
+
+            /* deassert the virtual IRQ and unmask physical one */
+            qemu_set_irq(intp->qemuirq, 0);
+            vfio_unmask_irqindex(vdev, intp->pin);
+            eoi_done = true;
+        }
+    }
+
+    /*
+     * in case there are pending IRQs, handle them one at a time */
+    if (!QSIMPLEQ_EMPTY(&vplatdev->pending_intp_queue)) {
+        intp = QSIMPLEQ_FIRST(&vplatdev->pending_intp_queue);
+        vfio_intp_interrupt(intp);
+        QSIMPLEQ_REMOVE_HEAD(&vplatdev->pending_intp_queue, pqnext);
+    }
+
+    return;
+}
+
+/*
+ * enable/disable the fast path mode
+ * fast path = MMIO region is mmaped (no KVM TRAP)
+ * slow path = MMIO region is trapped and region callbacks are called
+ * slow path enables to trap the IRQ status register guest reset
+*/
+
+static void vfio_mmap_set_enabled(VFIODevice *vdev, bool enabled)
+{
+    VFIORegion *region;
+    int i;
+
+    DPRINTF("fast path = %d\n", enabled);
+
+    for (i = 0; i < vdev->num_regions; i++) {
+        region = vdev->regions[i];
+
+        /* register space is unmapped to trap EOI */
+        memory_region_set_enabled(&region->mmap_mem, enabled);
+    }
+}
+
+/*
+ * Checks whether the IRQ is still pending. In the negative
+ * the fast path mode (where reg space is mmaped) can be restored.
+ * if the IRQ is still pending, we must keep on trapping IRQ status
+ * register reset with mmap disabled (slow path).
+ * the function is called on mmap_timer event.
+ * by construction a single fd is handled at a time. See EOI comment
+ * for additional details.
+ */
+
+
+static void vfio_intp_mmap_enable(void *opaque)
+{
+    VFIOINTp *tmp;
+    VFIODevice *vdev = (VFIODevice *)opaque;
+    VFIOPlatformDevice *vplatdev = container_of(vdev, VFIOPlatformDevice, vdev);
+    bool one_active_irq = false;
+
+    QLIST_FOREACH(tmp, &vplatdev->intp_list, next) {
+        if (tmp->state == VFIO_IRQ_ACTIVE) {
+            if (one_active_irq) {
+                error_report("several active IRQ: "
+                             "this case should not happen!\n");
+            }
+            DPRINTF("IRQ #%d still pending, stay in slow path\n",
+                    tmp->pin);
+            timer_mod(vdev->mmap_timer,
+                      qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+                          vdev->mmap_timeout);
+            one_active_irq = true;
+        }
+    }
+    if (one_active_irq) {
+        return;
+    }
+
+    DPRINTF("no pending IRQ, restore fast path\n");
+    vfio_mmap_set_enabled(vdev, true);
+}
+
+/*
+ * The fd handler
+ */
+
+static void vfio_intp_interrupt(void *opaque)
+{
+    int ret;
+    VFIOINTp *tmp, *intp = (VFIOINTp *)opaque;
+    VFIOPlatformDevice *vplatdev = intp->vdev;
+    VFIODevice *vdev = &vplatdev->vdev;
+    bool one_active_irq = false;
+
+    /*
+     * first check whether there is a pending IRQ
+     * in the positive the new IRQ cannot be handled until the
+     * active one is not completed.
+     * by construction the same IRQ as the pending one cannot hit
+     * since the physical IRQ was disabled by the VFIO driver
+     */
+    QLIST_FOREACH(tmp, &vplatdev->intp_list, next) {
+        if (tmp->state == VFIO_IRQ_ACTIVE) {
+            one_active_irq = true;
+        }
+    }
+    if (one_active_irq) {
+        /*
+         * the new IRQ gets a pending status and is pushed in
+         * the pending queue
+         */
+        intp->state = VFIO_IRQ_PENDING;
+        QSIMPLEQ_INSERT_TAIL(&vplatdev->pending_intp_queue,
+                             intp, pqnext);
+        return;
+    }
+
+    /* no active IRQ, the new IRQ can be forwarded to guest */
+    DPRINTF("Handle IRQ #%d (fd = %d)\n",
+            intp->pin, event_notifier_get_fd(&intp->interrupt));
+
+    ret = event_notifier_test_and_clear(&intp->interrupt);
+    if (!ret) {
+        DPRINTF("Error when clearing fd=%d\n",
+                event_notifier_get_fd(&intp->interrupt));
+    }
+
+    intp->state = VFIO_IRQ_ACTIVE;
+
+    /* sets slow path */
+    vfio_mmap_set_enabled(vdev, false);
+
+    /* trigger the virtual IRQ */
+    qemu_set_irq(intp->qemuirq, 1);
+
+    /* schedule the mmap timer which will restore mmap path after EOI*/
+    if (vdev->mmap_timeout) {
+        timer_mod(vdev->mmap_timer,
+                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + vdev->mmap_timeout);
+    }
+
+}
+
+static int vfio_enable_intp(VFIODevice *vdev, unsigned int index)
+{
+    struct vfio_irq_set *irq_set;
+    int32_t *pfd;
+    int ret, argsz;
+    int device = vdev->fd;
+    VFIOPlatformDevice *vplatdev = container_of(vdev, VFIOPlatformDevice, vdev);
+    SysBusDevice *sbdev = SYS_BUS_DEVICE(vplatdev);
+
+    /* allocate and populate a new VFIOINTp structure put in a queue list */
+    VFIOINTp *intp = g_malloc0(sizeof(*intp));
+    intp->vdev = vplatdev;
+    intp->pin = index;
+    intp->state = VFIO_IRQ_INACTIVE;
+
+    sysbus_init_irq(sbdev, &intp->qemuirq);
+
+    ret = event_notifier_init(&intp->interrupt, 0);
+    if (ret) {
+        error_report("vfio: Error: event_notifier_init failed ");
+        return ret;
+    }
+    /* build the irq_set to be passed to the vfio kernel driver */
+
+    argsz = sizeof(*irq_set) + sizeof(*pfd);
+
+    irq_set = g_malloc0(argsz);
+    irq_set->argsz = argsz;
+    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
+    irq_set->index = index;
+    irq_set->start = 0;
+    irq_set->count = 1;
+    pfd = (int32_t *)&irq_set->data;
+
+    *pfd = event_notifier_get_fd(&intp->interrupt);
+
+    DPRINTF("register fd=%d/irq index=%d to kernel\n", *pfd, index);
+
+    qemu_set_fd_handler(*pfd, vfio_intp_interrupt, NULL, intp);
+
+    /*
+     * pass the index/fd binding to the kernel driver so that it
+     * triggers this fd on HW IRQ
+     */
+    ret = ioctl(device, VFIO_DEVICE_SET_IRQS, irq_set);
+    g_free(irq_set);
+    if (ret) {
+        error_report("vfio: Error: Failed to pass IRQ fd to the driver: %m");
+        qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
+        close(*pfd); /* TO DO : replace by event_notifier_cleanup */
+        return -errno;
+    }
+
+    /* store the new intp in qlist */
+
+    QLIST_INSERT_HEAD(&vplatdev->intp_list, intp, next);
+
+    return 0;
+}
+
-/* not implemented yet */
 static int vfio_platform_get_device_interrupts(VFIODevice *vdev)
 {
+    struct vfio_irq_info irq = { .argsz = sizeof(irq) };
+    int i, ret;
+    VFIOPlatformDevice *vplatdev = container_of(vdev, VFIOPlatformDevice, vdev);
+
+    /*
+     * mmap timeout = 1100 ms, PCI default value
+     * this will become a user-defined value in subsequent patch
+     */
+    vdev->mmap_timeout = 1100;
+    vdev->mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
+                                    vfio_intp_mmap_enable, vdev);
+
+    QSIMPLEQ_INIT(&vplatdev->pending_intp_queue);
+
+    for (i = 0; i < vdev->num_irqs; i++) {
+        irq.index = i;
+
+        DPRINTF("Retrieve IRQ info from vfio platform driver ...\n");
+
+        ret = ioctl(vdev->fd, VFIO_DEVICE_GET_IRQ_INFO, &irq);
+        if (ret) {
+            error_printf("vfio: error getting device %s irq info",
+                         vdev->name);
+        }
+        DPRINTF("- IRQ index %d: count %d, flags=0x%x\n",
+                irq.index, irq.count, irq.flags);
+
+        vfio_enable_intp(vdev, irq.index);
+    }
     return 0;
 }
 
-/* not implemented yet */
-static void vfio_platform_eoi(VFIODevice *vdev)
+
+static void vfio_disable_intp(VFIODevice *vdev)
 {
+    VFIOINTp *intp;
+    VFIOPlatformDevice *vplatdev = container_of(vdev, VFIOPlatformDevice, vdev);
+    int fd;
+
+    QLIST_FOREACH(intp, &vplatdev->intp_list, next) {
+        fd = event_notifier_get_fd(&intp->interrupt);
+        DPRINTF("close IRQ pin=%d fd=%d\n", intp->pin, fd);
+
+        vfio_disable_irqindex(vdev, intp->pin);
+        intp->state = VFIO_IRQ_INACTIVE;
+        qemu_set_irq(intp->qemuirq, 0);
+
+        qemu_set_fd_handler(fd, NULL, NULL, NULL);
+        event_notifier_cleanup(&intp->interrupt);
+    }
+
+    /* restore fast path */
+    vfio_mmap_set_enabled(vdev, true);
+
 }
 
+
 static VFIODeviceOps vfio_platform_ops = {
     .vfio_eoi = vfio_platform_eoi,
     .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
@@ -194,9 +484,11 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
 static void vfio_platform_unrealize(DeviceState *dev, Error **errp)
 {
     int i;
+    VFIOINTp *intp, *next_intp;
     SysBusDevice *sbdev = SYS_BUS_DEVICE(dev);
-    VFIOPlatformDevice *vdev = container_of(sbdev, VFIOPlatformDevice, sbdev);
-    VFIODevice *vbasedev = &vdev->vdev;
+    VFIOPlatformDevice *vplatdev = container_of(sbdev,
+                                                VFIOPlatformDevice, sbdev);
+    VFIODevice *vbasedev = &vplatdev->vdev;
     VFIOGroup *group = vbasedev->group;
     /*
      * placeholder for
@@ -205,6 +497,21 @@ static void vfio_platform_unrealize(DeviceState *dev, Error **errp)
      * timer free
      * g_free vdev dynamic fields
      */
+    vfio_disable_intp(vbasedev);
+
+    while (!QSIMPLEQ_EMPTY(&vplatdev->pending_intp_queue)) {
+        QSIMPLEQ_REMOVE_HEAD(&vplatdev->pending_intp_queue, pqnext);
+    }
+
+    QLIST_FOREACH_SAFE(intp, &vplatdev->intp_list, next, next_intp) {
+        QLIST_REMOVE(intp, next);
+        g_free(intp);
+    }
+
+    if (vbasedev->mmap_timer) {
+        timer_free(vbasedev->mmap_timer);
+    }
+
     vfio_unmap_regions(vbasedev);
 
     for (i = 0; i < vbasedev->num_regions; i++) {
diff --git a/hw/vfio/vfio-common.h b/hw/vfio/vfio-common.h
index 2699fba..7139d81 100644
--- a/hw/vfio/vfio-common.h
+++ b/hw/vfio/vfio-common.h
@@ -42,6 +42,13 @@ enum {
     VFIO_DEVICE_TYPE_PLATFORM = 1,
 };
 
+enum {
+    VFIO_IRQ_INACTIVE = 0,
+    VFIO_IRQ_PENDING = 1,
+    VFIO_IRQ_ACTIVE = 2,
+    /* VFIO_IRQ_ACTIVE_AND_PENDING cannot happen with VFIO */
+};
+
 struct VFIOGroup;
 struct VFIODevice;
 
@@ -61,7 +68,6 @@ typedef struct VFIORegion {
     uint8_t nr; /* cache the region number for debug */
 } VFIORegion;
 
-
 /* Base Class for a VFIO device */
 
 typedef struct VFIODevice {
@@ -75,6 +81,8 @@ typedef struct VFIODevice {
     int type;
     bool reset_works;
     bool needs_reset;
+    uint32_t mmap_timeout; /* delay to re-enable mmaps after interrupt */
+    QEMUTimer *mmap_timer; /* enable mmaps after periods w/o interrupts */
     VFIODeviceOps *ops;
 } VFIODevice;
 