From patchwork Wed Jun 10 11:53:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 280919 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.5 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6931EC433DF for ; Wed, 10 Jun 2020 11:56:27 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 337EF2072E for ; Wed, 10 Jun 2020 11:56:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="aRLnoQY+" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 337EF2072E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:40614 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jizLG-0004TR-DB for qemu-devel@archiver.kernel.org; Wed, 10 Jun 2020 07:56:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43268) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jizJg-0002Sd-4q for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:54:48 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:22677 helo=us-smtp-1.mimecast.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1jizJe-0001tb-0V for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:54:47 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1591790084; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=U68jIOHuvNFPHE8HVjN9BLo12zDx2a4qnBp0EnzJr9o=; b=aRLnoQY+2mAPRl1CjeDp1dCzEVOJmubuDo+p5Qe0Ndub/Ywv0z0K8uq4UCF5y4rDVxA7o6 2GG3iXBcf3w9CWw8nuzCrjnyhDultn2rWo/SQZ1hjqzvCZDWQTBUCnUyVfEYYcnOHu8srt XV/jxZwGDkE77Pn0yCUoW7dss7w5qdo= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-283-LYF1y1dlMwuABbJGim0Zvg-1; Wed, 10 Jun 2020 07:54:40 -0400 X-MC-Unique: LYF1y1dlMwuABbJGim0Zvg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 9C04E193F560; Wed, 10 Jun 2020 11:54:39 +0000 (UTC) Received: from t480s.redhat.com (ovpn-114-42.ams2.redhat.com [10.36.114.42]) by smtp.corp.redhat.com (Postfix) with ESMTP id BC3F25D9D3; Wed, 10 Jun 2020 11:54:37 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v4 01/21] exec: Introduce ram_block_discard_(disable|require)() Date: Wed, 10 Jun 2020 13:53:59 +0200 Message-Id: <20200610115419.51688-2-david@redhat.com> In-Reply-To: <20200610115419.51688-1-david@redhat.com> References: <20200610115419.51688-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Received-SPF: pass client-ip=207.211.31.120; envelope-from=david@redhat.com; helo=us-smtp-1.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/09 21:17:20 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -30 X-Spam_score: -3.1 X-Spam_bar: --- X-Spam_report: (-3.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Eduardo Habkost , kvm@vger.kernel.org, "Michael S . Tsirkin" , David Hildenbrand , "Dr . David Alan Gilbert" , qemu-s390x@nongnu.org, Paolo Bonzini , Richard Henderson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" We want to replace qemu_balloon_inhibit() by something more generic. Especially, we want to make sure that technologies that really rely on RAM block discards to work reliably to run mutual exclusive with technologies that effectively break it. E.g., vfio will usually pin all guest memory, turning the virtio-balloon basically useless and make the VM consume more memory than reported via the balloon. While the balloon is special already (=> no guarantees, same behavior possible afer reboots and with huge pages), this will be different, especially, with virtio-mem. Let's implement a way such that we can make both types of technology run mutually exclusive. We'll convert existing balloon inhibitors in successive patches and add some new ones. Add the check to qemu_balloon_is_inhibited() for now. We might want to make virtio-balloon an acutal inhibitor in the future - however, that requires more thought to not break existing setups. Reviewed-by: Dr. David Alan Gilbert Cc: "Michael S. Tsirkin" Cc: Richard Henderson Cc: Paolo Bonzini Signed-off-by: David Hildenbrand --- balloon.c | 3 ++- exec.c | 52 +++++++++++++++++++++++++++++++++++++++++++ include/exec/memory.h | 41 ++++++++++++++++++++++++++++++++++ 3 files changed, 95 insertions(+), 1 deletion(-) diff --git a/balloon.c b/balloon.c index f104b42961..5fff79523a 100644 --- a/balloon.c +++ b/balloon.c @@ -40,7 +40,8 @@ static int balloon_inhibit_count; bool qemu_balloon_is_inhibited(void) { - return atomic_read(&balloon_inhibit_count) > 0; + return atomic_read(&balloon_inhibit_count) > 0 || + ram_block_discard_is_disabled(); } void qemu_balloon_inhibit(bool state) diff --git a/exec.c b/exec.c index be4be2df3a..c4c1d9df84 100644 --- a/exec.c +++ b/exec.c @@ -4051,4 +4051,56 @@ void mtree_print_dispatch(AddressSpaceDispatch *d, MemoryRegion *root) } } +/* + * If positive, discarding RAM is disabled. If negative, discarding RAM is + * required to work and cannot be disabled. + */ +static int ram_block_discard_disabled; + +int ram_block_discard_disable(bool state) +{ + int old; + + if (!state) { + atomic_dec(&ram_block_discard_disabled); + return 0; + } + + do { + old = atomic_read(&ram_block_discard_disabled); + if (old < 0) { + return -EBUSY; + } + } while (atomic_cmpxchg(&ram_block_discard_disabled, old, old + 1) != old); + return 0; +} + +int ram_block_discard_require(bool state) +{ + int old; + + if (!state) { + atomic_inc(&ram_block_discard_disabled); + return 0; + } + + do { + old = atomic_read(&ram_block_discard_disabled); + if (old > 0) { + return -EBUSY; + } + } while (atomic_cmpxchg(&ram_block_discard_disabled, old, old - 1) != old); + return 0; +} + +bool ram_block_discard_is_disabled(void) +{ + return atomic_read(&ram_block_discard_disabled) > 0; +} + +bool ram_block_discard_is_required(void) +{ + return atomic_read(&ram_block_discard_disabled) < 0; +} + #endif diff --git a/include/exec/memory.h b/include/exec/memory.h index 3e00cdbbfa..eea7d284b9 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -2474,6 +2474,47 @@ static inline MemOp devend_memop(enum device_endian end) } #endif +/* + * Inhibit technologies that require discarding of pages in RAM blocks, e.g., + * to manage the actual amount of memory consumed by the VM (then, the memory + * provided by RAM blocks might be bigger than the desired memory consumption). + * This *must* be set if: + * - Discarding parts of a RAM blocks does not result in the change being + * reflected in the VM and the pages getting freed. + * - All memory in RAM blocks is pinned or duplicated, invaldiating any previous + * discards blindly. + * - Discarding parts of a RAM blocks will result in integrity issues (e.g., + * encrypted VMs). + * Technologies that only temporarily pin the current working set of a + * driver are fine, because we don't expect such pages to be discarded + * (esp. based on guest action like balloon inflation). + * + * This is *not* to be used to protect from concurrent discards (esp., + * postcopy). + * + * Returns 0 if successful. Returns -EBUSY if a technology that relies on + * discards to work reliably is active. + */ +int ram_block_discard_disable(bool state); + +/* + * Inhibit technologies that disable discarding of pages in RAM blocks. + * + * Returns 0 if successful. Returns -EBUSY if discards are already set to + * broken. + */ +int ram_block_discard_require(bool state); + +/* + * Test if discarding of memory in ram blocks is disabled. + */ +bool ram_block_discard_is_disabled(void); + +/* + * Test if discarding of memory in ram blocks is required to work reliably. + */ +bool ram_block_discard_is_required(void); + #endif #endif From patchwork Wed Jun 10 11:54:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 280917 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.5 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CE676C433DF for ; Wed, 10 Jun 2020 11:58:35 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 87BF52072E for ; Wed, 10 Jun 2020 11:58:35 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="IcDxchOh" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 87BF52072E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:49196 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jizNK-0008D6-Ox for qemu-devel@archiver.kernel.org; Wed, 10 Jun 2020 07:58:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43324) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jizJs-0002sf-4N for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:55:00 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:26898 helo=us-smtp-1.mimecast.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1jizJq-0001vj-KU for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:54:59 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1591790098; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=u3TnvMHIMVg+ynIb51APci4iCUQNAE73bAVqqwa7/LQ=; b=IcDxchOhHZu6tYlykfSrDZW00NEDgPDKxM8Jy1HoOfgrmjFBNNANZvlcm7sfkxQty9nC44 WGMwdKdh+H+mIXnYTVj1XTL0JqukvUD6egpMlJADnJEH6M6uyHkZpYwoU4mUTI4cxxkgx6 njUWel4P1TmJusOPyi6JBCOeIDGuEH0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-1-kUj70tKQP2ug4p5YQ87IUg-1; Wed, 10 Jun 2020 07:54:52 -0400 X-MC-Unique: kUj70tKQP2ug4p5YQ87IUg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 453A1107ACCA; Wed, 10 Jun 2020 11:54:51 +0000 (UTC) Received: from t480s.redhat.com (ovpn-114-42.ams2.redhat.com [10.36.114.42]) by smtp.corp.redhat.com (Postfix) with ESMTP id EE79C5D9D3; Wed, 10 Jun 2020 11:54:39 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v4 02/21] vfio: Convert to ram_block_discard_disable() Date: Wed, 10 Jun 2020 13:54:00 +0200 Message-Id: <20200610115419.51688-3-david@redhat.com> In-Reply-To: <20200610115419.51688-1-david@redhat.com> References: <20200610115419.51688-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Received-SPF: pass client-ip=207.211.31.120; envelope-from=david@redhat.com; helo=us-smtp-1.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/09 21:17:20 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -30 X-Spam_score: -3.1 X-Spam_bar: --- X-Spam_report: (-3.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tony Krowiak , Eric Farman , Cornelia Huck , Alex Williamson , Eduardo Habkost , kvm@vger.kernel.org, "Michael S . Tsirkin" , Pierre Morel , David Hildenbrand , "Dr . David Alan Gilbert" , Halil Pasic , Christian Borntraeger , qemu-s390x@nongnu.org, Paolo Bonzini , Richard Henderson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" VFIO is (except devices without a physical IOMMU or some mediated devices) incompatible with discarding of RAM. The kernel will pin basically all VM memory. Let's convert to ram_block_discard_disable(), which can now fail, in contrast to qemu_balloon_inhibit(). Leave "x-balloon-allowed" named as it is for now. Cc: Cornelia Huck Cc: Alex Williamson Cc: Christian Borntraeger Cc: Tony Krowiak Cc: Halil Pasic Cc: Pierre Morel Cc: Eric Farman Signed-off-by: David Hildenbrand --- hw/vfio/ap.c | 10 +++---- hw/vfio/ccw.c | 11 ++++---- hw/vfio/common.c | 53 +++++++++++++++++++---------------- hw/vfio/pci.c | 6 ++-- include/hw/vfio/vfio-common.h | 4 +-- 5 files changed, 45 insertions(+), 39 deletions(-) diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c index 95564c17ed..d0b1bc7581 100644 --- a/hw/vfio/ap.c +++ b/hw/vfio/ap.c @@ -105,12 +105,12 @@ static void vfio_ap_realize(DeviceState *dev, Error **errp) vapdev->vdev.dev = dev; /* - * vfio-ap devices operate in a way compatible with - * memory ballooning, as no pages are pinned in the host. - * This needs to be set before vfio_get_device() for vfio common to - * handle the balloon inhibitor. + * vfio-ap devices operate in a way compatible discarding of memory in + * RAM blocks, as no pages are pinned in the host. This needs to be + * set before vfio_get_device() for vfio common to handle + * ram_block_discard_disable(). */ - vapdev->vdev.balloon_allowed = true; + vapdev->vdev.ram_block_discard_allowed = true; ret = vfio_get_device(vfio_group, mdevid, &vapdev->vdev, errp); if (ret) { diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c index 63406184d2..82857f1615 100644 --- a/hw/vfio/ccw.c +++ b/hw/vfio/ccw.c @@ -418,12 +418,13 @@ static void vfio_ccw_get_device(VFIOGroup *group, VFIOCCWDevice *vcdev, /* * All vfio-ccw devices are believed to operate in a way compatible with - * memory ballooning, ie. pages pinned in the host are in the current - * working set of the guest driver and therefore never overlap with pages - * available to the guest balloon driver. This needs to be set before - * vfio_get_device() for vfio common to handle the balloon inhibitor. + * discarding of memory in RAM blocks, ie. pages pinned in the host are + * in the current working set of the guest driver and therefore never + * overlap e.g., with pages available to the guest balloon driver. This + * needs to be set before vfio_get_device() for vfio common to handle + * ram_block_discard_disable(). */ - vcdev->vdev.balloon_allowed = true; + vcdev->vdev.ram_block_discard_allowed = true; if (vfio_get_device(group, vcdev->cdev.mdevid, &vcdev->vdev, errp)) { goto out_err; diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 0b3593b3c0..33357140b8 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -33,7 +33,6 @@ #include "qemu/error-report.h" #include "qemu/main-loop.h" #include "qemu/range.h" -#include "sysemu/balloon.h" #include "sysemu/kvm.h" #include "sysemu/reset.h" #include "trace.h" @@ -1215,31 +1214,36 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, space = vfio_get_address_space(as); /* - * VFIO is currently incompatible with memory ballooning insofar as the + * VFIO is currently incompatible with discarding of RAM insofar as the * madvise to purge (zap) the page from QEMU's address space does not * interact with the memory API and therefore leaves stale virtual to * physical mappings in the IOMMU if the page was previously pinned. We - * therefore add a balloon inhibit for each group added to a container, + * therefore set discarding broken for each group added to a container, * whether the container is used individually or shared. This provides * us with options to allow devices within a group to opt-in and allow - * ballooning, so long as it is done consistently for a group (for instance + * discarding, so long as it is done consistently for a group (for instance * if the device is an mdev device where it is known that the host vendor * driver will never pin pages outside of the working set of the guest - * driver, which would thus not be ballooning candidates). + * driver, which would thus not be discarding candidates). * * The first opportunity to induce pinning occurs here where we attempt to * attach the group to existing containers within the AddressSpace. If any - * pages are already zapped from the virtual address space, such as from a - * previous ballooning opt-in, new pinning will cause valid mappings to be + * pages are already zapped from the virtual address space, such as from + * previous discards, new pinning will cause valid mappings to be * re-established. Likewise, when the overall MemoryListener for a new * container is registered, a replay of mappings within the AddressSpace * will occur, re-establishing any previously zapped pages as well. * - * NB. Balloon inhibiting does not currently block operation of the - * balloon driver or revoke previously pinned pages, it only prevents - * calling madvise to modify the virtual mapping of ballooned pages. + * Especially virtio-balloon is currently only prevented from discarding + * new memory, it will not yet set ram_block_discard_set_required() and + * therefore, neither stops us here or deals with the sudden memory + * consumption of inflated memory. */ - qemu_balloon_inhibit(true); + ret = ram_block_discard_disable(true); + if (ret) { + error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken"); + return ret; + } QLIST_FOREACH(container, &space->containers, next) { if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) { @@ -1405,7 +1409,7 @@ close_fd_exit: close(fd); put_space_exit: - qemu_balloon_inhibit(false); + ram_block_discard_disable(false); vfio_put_address_space(space); return ret; @@ -1526,8 +1530,8 @@ void vfio_put_group(VFIOGroup *group) return; } - if (!group->balloon_allowed) { - qemu_balloon_inhibit(false); + if (!group->ram_block_discard_allowed) { + ram_block_discard_disable(false); } vfio_kvm_device_del_group(group); vfio_disconnect_container(group); @@ -1565,22 +1569,23 @@ int vfio_get_device(VFIOGroup *group, const char *name, } /* - * Clear the balloon inhibitor for this group if the driver knows the - * device operates compatibly with ballooning. Setting must be consistent - * per group, but since compatibility is really only possible with mdev - * currently, we expect singleton groups. + * Set discarding of RAM as not broken for this group if the driver knows + * the device operates compatibly with discarding. Setting must be + * consistent per group, but since compatibility is really only possible + * with mdev currently, we expect singleton groups. */ - if (vbasedev->balloon_allowed != group->balloon_allowed) { + if (vbasedev->ram_block_discard_allowed != + group->ram_block_discard_allowed) { if (!QLIST_EMPTY(&group->device_list)) { - error_setg(errp, - "Inconsistent device balloon setting within group"); + error_setg(errp, "Inconsistent setting of support for discarding " + "RAM (e.g., balloon) within group"); close(fd); return -1; } - if (!group->balloon_allowed) { - group->balloon_allowed = true; - qemu_balloon_inhibit(false); + if (!group->ram_block_discard_allowed) { + group->ram_block_discard_allowed = true; + ram_block_discard_disable(false); } } diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 342dd6b912..c33c11b7e4 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -2796,7 +2796,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) } /* - * Mediated devices *might* operate compatibly with memory ballooning, but + * Mediated devices *might* operate compatibly with discarding of RAM, but * we cannot know for certain, it depends on whether the mdev vendor driver * stays in sync with the active working set of the guest driver. Prevent * the x-balloon-allowed option unless this is minimally an mdev device. @@ -2809,7 +2809,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) trace_vfio_mdev(vdev->vbasedev.name, is_mdev); - if (vdev->vbasedev.balloon_allowed && !is_mdev) { + if (vdev->vbasedev.ram_block_discard_allowed && !is_mdev) { error_setg(errp, "x-balloon-allowed only potentially compatible " "with mdev devices"); vfio_put_group(group); @@ -3163,7 +3163,7 @@ static Property vfio_pci_dev_properties[] = { VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false), DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false), DEFINE_PROP_BOOL("x-balloon-allowed", VFIOPCIDevice, - vbasedev.balloon_allowed, false), + vbasedev.ram_block_discard_allowed, false), DEFINE_PROP_BOOL("x-no-kvm-intx", VFIOPCIDevice, no_kvm_intx, false), DEFINE_PROP_BOOL("x-no-kvm-msi", VFIOPCIDevice, no_kvm_msi, false), DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false), diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index fd564209ac..c78f3ff559 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -108,7 +108,7 @@ typedef struct VFIODevice { bool reset_works; bool needs_reset; bool no_mmap; - bool balloon_allowed; + bool ram_block_discard_allowed; VFIODeviceOps *ops; unsigned int num_irqs; unsigned int num_regions; @@ -128,7 +128,7 @@ typedef struct VFIOGroup { QLIST_HEAD(, VFIODevice) device_list; QLIST_ENTRY(VFIOGroup) next; QLIST_ENTRY(VFIOGroup) container_next; - bool balloon_allowed; + bool ram_block_discard_allowed; } VFIOGroup; typedef struct VFIODMABuf { From patchwork Wed Jun 10 11:54:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 280915 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.5 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7B37DC433DF for ; Wed, 10 Jun 2020 12:01:16 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 45D602072E for ; Wed, 10 Jun 2020 12:01:16 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Ea6ADcck" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 45D602072E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:57962 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jizPv-0003eB-Ex for qemu-devel@archiver.kernel.org; Wed, 10 Jun 2020 08:01:15 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43348) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jizJw-00030S-EY for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:55:04 -0400 Received: from us-smtp-1.mimecast.com ([205.139.110.61]:57493 helo=us-smtp-delivery-1.mimecast.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1jizJu-0001wT-As for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:55:03 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1591790101; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=VUu0CYVglzX/Ll9iBfJF7XJmcLsniLRcSuurJhAdw+U=; b=Ea6ADcck4+L0T/aY0Dc5tihNuThdNN1WZHcLDNUcMnTm7NfhBlAnuVVKNkl1B9/ZYWveQK ngsBgzM1508m5EL/MkIZVeP75obvDG3LVvdy5099Bsw8BGFVGrwZ6z7n96taZahGmkzLp+ y3ivy4F3cpuFUuS6Gg/1hL2n4fzMxIg= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-198-nG8aSuP-M3m0pAJ0p-BRKg-1; Wed, 10 Jun 2020 07:54:57 -0400 X-MC-Unique: nG8aSuP-M3m0pAJ0p-BRKg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 57E4C8014D4; Wed, 10 Jun 2020 11:54:56 +0000 (UTC) Received: from t480s.redhat.com (ovpn-114-42.ams2.redhat.com [10.36.114.42]) by smtp.corp.redhat.com (Postfix) with ESMTP id CEBC65D9D3; Wed, 10 Jun 2020 11:54:53 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v4 04/21] s390x/pv: Convert to ram_block_discard_disable() Date: Wed, 10 Jun 2020 13:54:02 +0200 Message-Id: <20200610115419.51688-5-david@redhat.com> In-Reply-To: <20200610115419.51688-1-david@redhat.com> References: <20200610115419.51688-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Received-SPF: pass client-ip=205.139.110.61; envelope-from=david@redhat.com; helo=us-smtp-delivery-1.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/09 23:51:15 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -30 X-Spam_score: -3.1 X-Spam_bar: --- X-Spam_report: (-3.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Cornelia Huck , Eduardo Habkost , kvm@vger.kernel.org, "Michael S . Tsirkin" , David Hildenbrand , "Dr . David Alan Gilbert" , Halil Pasic , Christian Borntraeger , qemu-s390x@nongnu.org, Janosch Frank , Paolo Bonzini , Richard Henderson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Discarding RAM does not work as expected with protected VMs. Let's switch to ram_block_discard_disable() for now, as we want to get rid of qemu_balloon_inhibit(). Note that it will currently never fail, but might fail in the future with new technologies (e.g., virtio-mem). Cc: Richard Henderson Cc: Cornelia Huck Cc: Halil Pasic Cc: Christian Borntraeger Cc: Janosch Frank Signed-off-by: David Hildenbrand --- hw/s390x/s390-virtio-ccw.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c index 60b16fef77..c985bb56eb 100644 --- a/hw/s390x/s390-virtio-ccw.c +++ b/hw/s390x/s390-virtio-ccw.c @@ -43,7 +43,6 @@ #include "hw/qdev-properties.h" #include "hw/s390x/tod.h" #include "sysemu/sysemu.h" -#include "sysemu/balloon.h" #include "hw/s390x/pv.h" #include "migration/blocker.h" @@ -329,7 +328,7 @@ static void s390_machine_unprotect(S390CcwMachineState *ms) ms->pv = false; migrate_del_blocker(pv_mig_blocker); error_free_or_abort(&pv_mig_blocker); - qemu_balloon_inhibit(false); + ram_block_discard_disable(false); } static int s390_machine_protect(S390CcwMachineState *ms) @@ -338,17 +337,22 @@ static int s390_machine_protect(S390CcwMachineState *ms) int rc; /* - * Ballooning on protected VMs needs support in the guest for - * sharing and unsharing balloon pages. Block ballooning for - * now, until we have a solution to make at least Linux guests - * either support it or fail gracefully. + * Discarding of memory in RAM blocks does not work as expected with + * protected VMs. Sharing and unsharing pages would be required. Disable + * it for now, until until we have a solution to make at least Linux + * guests either support it (e.g., virtio-balloon) or fail gracefully. */ - qemu_balloon_inhibit(true); + rc = ram_block_discard_disable(true); + if (rc) { + error_report("protected VMs: cannot disable RAM discard"); + return rc; + } + error_setg(&pv_mig_blocker, "protected VMs are currently not migrateable."); rc = migrate_add_blocker(pv_mig_blocker, &local_err); if (rc) { - qemu_balloon_inhibit(false); + ram_block_discard_disable(false); error_report_err(local_err); error_free_or_abort(&pv_mig_blocker); return rc; @@ -357,7 +361,7 @@ static int s390_machine_protect(S390CcwMachineState *ms) /* Create SE VM */ rc = s390_pv_vm_enable(); if (rc) { - qemu_balloon_inhibit(false); + ram_block_discard_disable(false); migrate_del_blocker(pv_mig_blocker); error_free_or_abort(&pv_mig_blocker); return rc; From patchwork Wed Jun 10 11:54:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 280916 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.5 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 15440C433DF for ; Wed, 10 Jun 2020 11:58:46 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D4CE02074B for ; Wed, 10 Jun 2020 11:58:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="c8fEJXLg" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D4CE02074B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:50288 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jizNV-0000Dk-1g for qemu-devel@archiver.kernel.org; Wed, 10 Jun 2020 07:58:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43376) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jizK4-0003CC-LA for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:55:12 -0400 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:39561 helo=us-smtp-1.mimecast.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1jizK3-0001xb-TN for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:55:12 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1591790111; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/bRpzWo8mGWvq3enPFtqLN32egzNgjYQizck8xugJ+M=; b=c8fEJXLguJXrmaV5V32A0rFBmkq81eRUOB4jKG/hqeNXxU8QTzj/AkCO5hHiCXoVnSPdrc 5vpsUcVN9GTB5/7yOiw3171odUG64gxnvHXZ+ham9MJRbBbr5bos8BBO+udCgIzHTSqGVN +MlClYt3y4yuAXC0+bodchkTqsC9xN0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-502-DLX1pjx1OeOHbokAJF5eVg-1; Wed, 10 Jun 2020 07:55:07 -0400 X-MC-Unique: DLX1pjx1OeOHbokAJF5eVg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 7E40A461; Wed, 10 Jun 2020 11:55:06 +0000 (UTC) Received: from t480s.redhat.com (ovpn-114-42.ams2.redhat.com [10.36.114.42]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0AE695D9D3; Wed, 10 Jun 2020 11:54:58 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v4 06/21] target/i386: sev: Use ram_block_discard_disable() Date: Wed, 10 Jun 2020 13:54:04 +0200 Message-Id: <20200610115419.51688-7-david@redhat.com> In-Reply-To: <20200610115419.51688-1-david@redhat.com> References: <20200610115419.51688-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Received-SPF: pass client-ip=205.139.110.120; envelope-from=david@redhat.com; helo=us-smtp-1.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/09 23:51:15 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -30 X-Spam_score: -3.1 X-Spam_bar: --- X-Spam_report: (-3.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Eduardo Habkost , kvm@vger.kernel.org, "Michael S . Tsirkin" , David Hildenbrand , "Dr . David Alan Gilbert" , qemu-s390x@nongnu.org, Paolo Bonzini , Richard Henderson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" AMD SEV will pin all guest memory, mark discarding of RAM broken. At the time this is called, we cannot have anyone active that relies on discards to work properly - let's still implement error handling. Reviewed-by: Dr. David Alan Gilbert Cc: "Michael S. Tsirkin" Cc: Paolo Bonzini Cc: Richard Henderson Cc: Eduardo Habkost Signed-off-by: David Hildenbrand --- target/i386/sev.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/target/i386/sev.c b/target/i386/sev.c index 51cdbe5496..4a4863db28 100644 --- a/target/i386/sev.c +++ b/target/i386/sev.c @@ -649,6 +649,12 @@ sev_guest_init(const char *id) uint32_t host_cbitpos; struct sev_user_data_status status = {}; + ret = ram_block_discard_disable(true); + if (ret) { + error_report("%s: cannot disable RAM discard", __func__); + return NULL; + } + sev_state = s = g_new0(SEVState, 1); s->sev_info = lookup_sev_guest_info(id); if (!s->sev_info) { @@ -724,6 +730,7 @@ sev_guest_init(const char *id) err: g_free(sev_state); sev_state = NULL; + ram_block_discard_disable(false); return NULL; } From patchwork Wed Jun 10 11:54:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 280914 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.5 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1DF1C433E5 for ; Wed, 10 Jun 2020 12:01:58 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A0C0F20760 for ; Wed, 10 Jun 2020 12:01:58 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="TAnec0pD" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A0C0F20760 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:59194 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jizQb-0004CA-Ma for qemu-devel@archiver.kernel.org; Wed, 10 Jun 2020 08:01:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43418) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jizKC-0003H7-Pf for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:55:20 -0400 Received: from us-smtp-1.mimecast.com ([205.139.110.61]:26859 helo=us-smtp-delivery-1.mimecast.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1jizK9-000216-Mq for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:55:20 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1591790116; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3bFU0P1MN91YjNPtnZUSniyH3X2xNqVX/h+DSb3PV28=; b=TAnec0pDqo0p3B1UI6RlDt4RiYoPSGhQrb1I1vnMp38qnfgXcP27YHdywt+3uhIbBXbdxb iSs0YsQQdovb0YFpm0OmVZ6kWj03y80Ad1WHO9o6RhP9hASafC+FoJu5rrkdtf2VCfmkln cT4iOiLCxcDK34/WhzvAlM/1Dhal1r8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-26-c7OyNNCUO5iwegenYJpFYg-1; Wed, 10 Jun 2020 07:55:12 -0400 X-MC-Unique: c7OyNNCUO5iwegenYJpFYg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 80386107ACCA; Wed, 10 Jun 2020 11:55:11 +0000 (UTC) Received: from t480s.redhat.com (ovpn-114-42.ams2.redhat.com [10.36.114.42]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2EC0B5D9D3; Wed, 10 Jun 2020 11:55:09 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v4 08/21] migration/colo: Use ram_block_discard_disable() Date: Wed, 10 Jun 2020 13:54:06 +0200 Message-Id: <20200610115419.51688-9-david@redhat.com> In-Reply-To: <20200610115419.51688-1-david@redhat.com> References: <20200610115419.51688-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Received-SPF: pass client-ip=205.139.110.61; envelope-from=david@redhat.com; helo=us-smtp-delivery-1.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/09 23:51:15 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -30 X-Spam_score: -3.1 X-Spam_bar: --- X-Spam_report: (-3.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lukas Straub , Eduardo Habkost , kvm@vger.kernel.org, "Michael S . Tsirkin" , David Hildenbrand , "Dr . David Alan Gilbert" , Juan Quintela , qemu-s390x@nongnu.org, Hailiang Zhang , Paolo Bonzini , Richard Henderson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" COLO will copy all memory in a RAM block, disable discarding of RAM. Reviewed-by: Dr. David Alan Gilbert Tested-by: Lukas Straub Cc: "Michael S. Tsirkin" Cc: Hailiang Zhang Cc: Juan Quintela Cc: "Dr. David Alan Gilbert" Signed-off-by: David Hildenbrand --- include/migration/colo.h | 2 +- migration/migration.c | 8 +++++++- migration/savevm.c | 11 +++++++++-- 3 files changed, 17 insertions(+), 4 deletions(-) diff --git a/include/migration/colo.h b/include/migration/colo.h index 1636e6f907..768e1f04c3 100644 --- a/include/migration/colo.h +++ b/include/migration/colo.h @@ -25,7 +25,7 @@ void migrate_start_colo_process(MigrationState *s); bool migration_in_colo_state(void); /* loadvm */ -void migration_incoming_enable_colo(void); +int migration_incoming_enable_colo(void); void migration_incoming_disable_colo(void); bool migration_incoming_colo_enabled(void); void *colo_process_incoming_thread(void *opaque); diff --git a/migration/migration.c b/migration/migration.c index 14856cc930..0f6799f5d2 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -338,12 +338,18 @@ bool migration_incoming_colo_enabled(void) void migration_incoming_disable_colo(void) { + ram_block_discard_disable(false); migration_colo_enabled = false; } -void migration_incoming_enable_colo(void) +int migration_incoming_enable_colo(void) { + if (ram_block_discard_disable(true)) { + error_report("COLO: cannot disable RAM discard"); + return -EBUSY; + } migration_colo_enabled = true; + return 0; } void migrate_add_address(SocketAddress *address) diff --git a/migration/savevm.c b/migration/savevm.c index c00a6807d9..19b4f9600d 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -2111,8 +2111,15 @@ static int loadvm_handle_recv_bitmap(MigrationIncomingState *mis, static int loadvm_process_enable_colo(MigrationIncomingState *mis) { - migration_incoming_enable_colo(); - return colo_init_ram_cache(); + int ret = migration_incoming_enable_colo(); + + if (!ret) { + ret = colo_init_ram_cache(); + if (ret) { + migration_incoming_disable_colo(); + } + } + return ret; } /* From patchwork Wed Jun 10 11:54:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 280913 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.5 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CAAF8C433DF for ; Wed, 10 Jun 2020 12:04:34 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 83D4D2074B for ; Wed, 10 Jun 2020 12:04:34 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="QGF8+40T" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 83D4D2074B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:39202 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jizT7-0007cG-Gz for qemu-devel@archiver.kernel.org; Wed, 10 Jun 2020 08:04:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43460) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jizKS-0003c4-AN for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:55:36 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:27981 helo=us-smtp-1.mimecast.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1jizKN-0002AD-6I for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:55:35 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1591790130; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=T5v3hYYm9DIstRnaIeNybGzu7FzMeXYFYKVxfwtLhBE=; b=QGF8+40TlKEo5NqynLOS7O8Kqy8W4GAyWRqw8VxNaK+bf2+9bYfMSG3V51J4rcHhXM2E18 WSgHBfecJlEDb99PBTIKaapqoiowEdpw8hkAf20J7iGmLXXnU//NkdnSCJkxky2CA1Pka/ BPSuTG2kJwV7q1kBF3Dzo2YkoOpCQqI= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-175-kgMDn9g2NhuBjnXkg83unw-1; Wed, 10 Jun 2020 07:55:28 -0400 X-MC-Unique: kgMDn9g2NhuBjnXkg83unw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 34C88DC09; Wed, 10 Jun 2020 11:55:27 +0000 (UTC) Received: from t480s.redhat.com (ovpn-114-42.ams2.redhat.com [10.36.114.42]) by smtp.corp.redhat.com (Postfix) with ESMTP id CFE295D9D3; Wed, 10 Jun 2020 11:55:21 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v4 10/21] virtio-mem: Paravirtualized memory hot(un)plug Date: Wed, 10 Jun 2020 13:54:08 +0200 Message-Id: <20200610115419.51688-11-david@redhat.com> In-Reply-To: <20200610115419.51688-1-david@redhat.com> References: <20200610115419.51688-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Received-SPF: pass client-ip=207.211.31.120; envelope-from=david@redhat.com; helo=us-smtp-1.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/09 21:17:20 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -30 X-Spam_score: -3.1 X-Spam_bar: --- X-Spam_report: (-3.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Eduardo Habkost , kvm@vger.kernel.org, "Michael S . Tsirkin" , David Hildenbrand , "Dr . David Alan Gilbert" , Markus Armbruster , qemu-s390x@nongnu.org, Igor Mammedov , Paolo Bonzini , Richard Henderson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" This is the very basic/initial version of virtio-mem. An introduction to virtio-mem can be found in the Linux kernel driver [1]. While it can be used in the current state for hotplug of a smaller amount of memory, it will heavily benefit from resizeable memory regions in the future. Each virtio-mem device manages a memory region (provided via a memory backend). After requested by the hypervisor ("requested-size"), the guest can try to plug/unplug blocks of memory within that region, in order to reach the requested size. Initially, and after a reboot, all memory is unplugged (except in special cases - reboot during postcopy). The guest may only try to plug/unplug blocks of memory within the usable region size. The usable region size is a little bigger than the requested size, to give the device driver some flexibility. The usable region size will only grow, except on reboots or when all memory is requested to get unplugged. The guest can never plug more memory than requested. Unplugged memory will get zapped/discarded, similar to in a balloon device. The block size is variable, however, it is always chosen in a way such that THP splits are avoided (e.g., 2MB). The state of each block (plugged/unplugged) is tracked in a bitmap. As virtio-mem devices (e.g., virtio-mem-pci) will be memory devices, we now expose "VirtioMEMDeviceInfo" via "query-memory-devices". -------------------------------------------------------------------------- There are two important follow-up items that are in the works: 1. Resizeable memory regions: Use resizeable allocations/RAM blocks to grow/shrink along with the usable region size. This avoids creating initially very big VMAs, RAM blocks, and KVM slots. 2. Protection of unplugged memory: Make sure the gust cannot actually make use of unplugged memory. Other follow-up items that are in the works: 1. Exclude unplugged memory during migration (via precopy notifier). 2. Handle remapping of memory. 3. Support for other architectures. -------------------------------------------------------------------------- Example usage (virtio-mem-pci is introduced in follow-up patches): Start QEMU with two virtio-mem devices (one per NUMA node): $ qemu-system-x86_64 -m 4G,maxmem=20G \ -smp sockets=2,cores=2 \ -numa node,nodeid=0,cpus=0-1 -numa node,nodeid=1,cpus=2-3 \ [...] -object memory-backend-ram,id=mem0,size=8G \ -device virtio-mem-pci,id=vm0,memdev=mem0,node=0,requested-size=0M \ -object memory-backend-ram,id=mem1,size=8G \ -device virtio-mem-pci,id=vm1,memdev=mem1,node=1,requested-size=1G Query the configuration: (qemu) info memory-devices Memory device [virtio-mem]: "vm0" memaddr: 0x140000000 node: 0 requested-size: 0 size: 0 max-size: 8589934592 block-size: 2097152 memdev: /objects/mem0 Memory device [virtio-mem]: "vm1" memaddr: 0x340000000 node: 1 requested-size: 1073741824 size: 1073741824 max-size: 8589934592 block-size: 2097152 memdev: /objects/mem1 Add some memory to node 0: (qemu) qom-set vm0 requested-size 500M Remove some memory from node 1: (qemu) qom-set vm1 requested-size 200M Query the configuration again: (qemu) info memory-devices Memory device [virtio-mem]: "vm0" memaddr: 0x140000000 node: 0 requested-size: 524288000 size: 524288000 max-size: 8589934592 block-size: 2097152 memdev: /objects/mem0 Memory device [virtio-mem]: "vm1" memaddr: 0x340000000 node: 1 requested-size: 209715200 size: 209715200 max-size: 8589934592 block-size: 2097152 memdev: /objects/mem1 [1] https://lkml.kernel.org/r/20200311171422.10484-1-david@redhat.com Cc: "Michael S. Tsirkin" Cc: Eric Blake Cc: Markus Armbruster Cc: "Dr. David Alan Gilbert" Cc: Igor Mammedov Signed-off-by: David Hildenbrand --- hw/virtio/Kconfig | 11 + hw/virtio/Makefile.objs | 1 + hw/virtio/virtio-mem.c | 724 +++++++++++++++++++++++++++++++++ include/hw/virtio/virtio-mem.h | 78 ++++ qapi/misc.json | 39 +- 5 files changed, 852 insertions(+), 1 deletion(-) create mode 100644 hw/virtio/virtio-mem.c create mode 100644 include/hw/virtio/virtio-mem.h diff --git a/hw/virtio/Kconfig b/hw/virtio/Kconfig index 83122424fa..0eda25c4e1 100644 --- a/hw/virtio/Kconfig +++ b/hw/virtio/Kconfig @@ -47,3 +47,14 @@ config VIRTIO_PMEM depends on VIRTIO depends on VIRTIO_PMEM_SUPPORTED select MEM_DEVICE + +config VIRTIO_MEM_SUPPORTED + bool + +config VIRTIO_MEM + bool + default y + depends on VIRTIO + depends on LINUX + depends on VIRTIO_MEM_SUPPORTED + select MEM_DEVICE diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs index 4e4d39a0a4..7df70e977e 100644 --- a/hw/virtio/Makefile.objs +++ b/hw/virtio/Makefile.objs @@ -18,6 +18,7 @@ common-obj-$(call land,$(CONFIG_VIRTIO_PMEM),$(CONFIG_VIRTIO_PCI)) += virtio-pme obj-$(call land,$(CONFIG_VHOST_USER_FS),$(CONFIG_VIRTIO_PCI)) += vhost-user-fs-pci.o obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o obj-$(CONFIG_VHOST_VSOCK) += vhost-vsock.o +obj-$(CONFIG_VIRTIO_MEM) += virtio-mem.o ifeq ($(CONFIG_VIRTIO_PCI),y) obj-$(CONFIG_VHOST_VSOCK) += vhost-vsock-pci.o diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c new file mode 100644 index 0000000000..d8a0c974d3 --- /dev/null +++ b/hw/virtio/virtio-mem.c @@ -0,0 +1,724 @@ +/* + * Virtio MEM device + * + * Copyright (C) 2020 Red Hat, Inc. + * + * Authors: + * David Hildenbrand + * + * This work is licensed under the terms of the GNU GPL, version 2. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" +#include "qemu/iov.h" +#include "qemu/cutils.h" +#include "qemu/error-report.h" +#include "qemu/units.h" +#include "sysemu/numa.h" +#include "sysemu/sysemu.h" +#include "sysemu/reset.h" +#include "hw/virtio/virtio.h" +#include "hw/virtio/virtio-bus.h" +#include "hw/virtio/virtio-access.h" +#include "hw/virtio/virtio-mem.h" +#include "qapi/error.h" +#include "qapi/visitor.h" +#include "exec/ram_addr.h" +#include "migration/misc.h" +#include "hw/boards.h" +#include "hw/qdev-properties.h" +#include "config-devices.h" + +/* + * Use QEMU_VMALLOC_ALIGN, so no THP will have to be split when unplugging + * memory (e.g., 2MB on x86_64). + */ +#define VIRTIO_MEM_MIN_BLOCK_SIZE QEMU_VMALLOC_ALIGN +/* + * Size the usable region bigger than the requested size if possible. Esp. + * Linux guests will only add (aligned) memory blocks in case they fully + * fit into the usable region, but plug+online only a subset of the pages. + * The memory block size corresponds mostly to the section size. + * + * This allows e.g., to add 20MB with a section size of 128MB on x86_64, and + * a section size of 1GB on arm64 (as long as the start address is properly + * aligned, similar to ordinary DIMMs). + * + * We can change this at any time and maybe even make it configurable if + * necessary (as the section size can change). But it's more likely that the + * section size will rather get smaller and not bigger over time. + */ +#if defined(__x86_64__) +#define VIRTIO_MEM_USABLE_EXTENT (2 * (128 * MiB)) +#else +#error VIRTIO_MEM_USABLE_EXTENT not defined +#endif + +static bool virtio_mem_is_busy(void) +{ + /* + * Postcopy cannot handle concurrent discards and we don't want to migrate + * pages on-demand with stale content when plugging new blocks. + */ + return migration_in_incoming_postcopy(); +} + +static bool virtio_mem_test_bitmap(VirtIOMEM *vmem, uint64_t start_gpa, + uint64_t size, bool plugged) +{ + const unsigned long first_bit = (start_gpa - vmem->addr) / vmem->block_size; + const unsigned long last_bit = first_bit + (size / vmem->block_size) - 1; + unsigned long found_bit; + + /* We fake a shorter bitmap to avoid searching too far. */ + if (plugged) { + found_bit = find_next_zero_bit(vmem->bitmap, last_bit + 1, first_bit); + } else { + found_bit = find_next_bit(vmem->bitmap, last_bit + 1, first_bit); + } + return found_bit > last_bit; +} + +static void virtio_mem_set_bitmap(VirtIOMEM *vmem, uint64_t start_gpa, + uint64_t size, bool plugged) +{ + const unsigned long bit = (start_gpa - vmem->addr) / vmem->block_size; + const unsigned long nbits = size / vmem->block_size; + + if (plugged) { + bitmap_set(vmem->bitmap, bit, nbits); + } else { + bitmap_clear(vmem->bitmap, bit, nbits); + } +} + +static void virtio_mem_send_response(VirtIOMEM *vmem, VirtQueueElement *elem, + struct virtio_mem_resp *resp) +{ + VirtIODevice *vdev = VIRTIO_DEVICE(vmem); + VirtQueue *vq = vmem->vq; + + iov_from_buf(elem->in_sg, elem->in_num, 0, resp, sizeof(*resp)); + + virtqueue_push(vq, elem, sizeof(*resp)); + virtio_notify(vdev, vq); +} + +static void virtio_mem_send_response_simple(VirtIOMEM *vmem, + VirtQueueElement *elem, + uint16_t type) +{ + struct virtio_mem_resp resp = { + .type = cpu_to_le16(type), + }; + + virtio_mem_send_response(vmem, elem, &resp); +} + +static bool virtio_mem_valid_range(VirtIOMEM *vmem, uint64_t gpa, uint64_t size) +{ + if (!QEMU_IS_ALIGNED(gpa, vmem->block_size)) { + return false; + } + if (gpa + size < gpa || !size) { + return false; + } + if (gpa < vmem->addr || gpa >= vmem->addr + vmem->usable_region_size) { + return false; + } + if (gpa + size > vmem->addr + vmem->usable_region_size) { + return false; + } + return true; +} + +static int virtio_mem_set_block_state(VirtIOMEM *vmem, uint64_t start_gpa, + uint64_t size, bool plug) +{ + const uint64_t offset = start_gpa - vmem->addr; + int ret; + + if (virtio_mem_is_busy()) { + return -EBUSY; + } + + if (!plug) { + ret = ram_block_discard_range(vmem->memdev->mr.ram_block, offset, size); + if (ret) { + error_report("Unexpected error discarding RAM: %s", + strerror(-ret)); + return -EBUSY; + } + } + virtio_mem_set_bitmap(vmem, start_gpa, size, plug); + return 0; +} + +static int virtio_mem_state_change_request(VirtIOMEM *vmem, uint64_t gpa, + uint16_t nb_blocks, bool plug) +{ + const uint64_t size = nb_blocks * vmem->block_size; + int ret; + + if (!virtio_mem_valid_range(vmem, gpa, size)) { + return VIRTIO_MEM_RESP_ERROR; + } + + if (plug && (vmem->size + size > vmem->requested_size)) { + return VIRTIO_MEM_RESP_NACK; + } + + /* test if really all blocks are in the opposite state */ + if (!virtio_mem_test_bitmap(vmem, gpa, size, !plug)) { + return VIRTIO_MEM_RESP_ERROR; + } + + ret = virtio_mem_set_block_state(vmem, gpa, size, plug); + if (ret) { + return VIRTIO_MEM_RESP_BUSY; + } + if (plug) { + vmem->size += size; + } else { + vmem->size -= size; + } + return VIRTIO_MEM_RESP_ACK; +} + +static void virtio_mem_plug_request(VirtIOMEM *vmem, VirtQueueElement *elem, + struct virtio_mem_req *req) +{ + const uint64_t gpa = le64_to_cpu(req->u.plug.addr); + const uint16_t nb_blocks = le16_to_cpu(req->u.plug.nb_blocks); + uint16_t type; + + type = virtio_mem_state_change_request(vmem, gpa, nb_blocks, true); + virtio_mem_send_response_simple(vmem, elem, type); +} + +static void virtio_mem_unplug_request(VirtIOMEM *vmem, VirtQueueElement *elem, + struct virtio_mem_req *req) +{ + const uint64_t gpa = le64_to_cpu(req->u.unplug.addr); + const uint16_t nb_blocks = le16_to_cpu(req->u.unplug.nb_blocks); + uint16_t type; + + type = virtio_mem_state_change_request(vmem, gpa, nb_blocks, false); + virtio_mem_send_response_simple(vmem, elem, type); +} + +static void virtio_mem_resize_usable_region(VirtIOMEM *vmem, + uint64_t requested_size, + bool can_shrink) +{ + uint64_t newsize = MIN(memory_region_size(&vmem->memdev->mr), + requested_size + VIRTIO_MEM_USABLE_EXTENT); + + if (!requested_size) { + newsize = 0; + } + + if (newsize < vmem->usable_region_size && !can_shrink) { + return; + } + + vmem->usable_region_size = newsize; +} + +static int virtio_mem_unplug_all(VirtIOMEM *vmem) +{ + RAMBlock *rb = vmem->memdev->mr.ram_block; + int ret; + + if (virtio_mem_is_busy()) { + return -EBUSY; + } + + ret = ram_block_discard_range(rb, 0, qemu_ram_get_used_length(rb)); + if (ret) { + error_report("Unexpected error discarding RAM: %s", strerror(-ret)); + return -EBUSY; + } + bitmap_clear(vmem->bitmap, 0, vmem->bitmap_size); + vmem->size = 0; + + virtio_mem_resize_usable_region(vmem, vmem->requested_size, true); + return 0; +} + +static void virtio_mem_unplug_all_request(VirtIOMEM *vmem, + VirtQueueElement *elem) +{ + if (virtio_mem_unplug_all(vmem)) { + virtio_mem_send_response_simple(vmem, elem, VIRTIO_MEM_RESP_BUSY); + } else { + virtio_mem_send_response_simple(vmem, elem, VIRTIO_MEM_RESP_ACK); + } +} + +static void virtio_mem_state_request(VirtIOMEM *vmem, VirtQueueElement *elem, + struct virtio_mem_req *req) +{ + const uint16_t nb_blocks = le16_to_cpu(req->u.state.nb_blocks); + const uint64_t gpa = le64_to_cpu(req->u.state.addr); + const uint64_t size = nb_blocks * vmem->block_size; + struct virtio_mem_resp resp = { + .type = cpu_to_le16(VIRTIO_MEM_RESP_ACK), + }; + + if (!virtio_mem_valid_range(vmem, gpa, size)) { + virtio_mem_send_response_simple(vmem, elem, VIRTIO_MEM_RESP_ERROR); + return; + } + + if (virtio_mem_test_bitmap(vmem, gpa, size, true)) { + resp.u.state.state = cpu_to_le16(VIRTIO_MEM_STATE_PLUGGED); + } else if (virtio_mem_test_bitmap(vmem, gpa, size, false)) { + resp.u.state.state = cpu_to_le16(VIRTIO_MEM_STATE_UNPLUGGED); + } else { + resp.u.state.state = cpu_to_le16(VIRTIO_MEM_STATE_MIXED); + } + virtio_mem_send_response(vmem, elem, &resp); +} + +static void virtio_mem_handle_request(VirtIODevice *vdev, VirtQueue *vq) +{ + const int len = sizeof(struct virtio_mem_req); + VirtIOMEM *vmem = VIRTIO_MEM(vdev); + VirtQueueElement *elem; + struct virtio_mem_req req; + uint16_t type; + + while (true) { + elem = virtqueue_pop(vq, sizeof(VirtQueueElement)); + if (!elem) { + return; + } + + if (iov_to_buf(elem->out_sg, elem->out_num, 0, &req, len) < len) { + virtio_error(vdev, "virtio-mem protocol violation: invalid request" + " size: %d", len); + g_free(elem); + return; + } + + if (iov_size(elem->in_sg, elem->in_num) < + sizeof(struct virtio_mem_resp)) { + virtio_error(vdev, "virtio-mem protocol violation: not enough space" + " for response: %zu", + iov_size(elem->in_sg, elem->in_num)); + g_free(elem); + return; + } + + type = le16_to_cpu(req.type); + switch (type) { + case VIRTIO_MEM_REQ_PLUG: + virtio_mem_plug_request(vmem, elem, &req); + break; + case VIRTIO_MEM_REQ_UNPLUG: + virtio_mem_unplug_request(vmem, elem, &req); + break; + case VIRTIO_MEM_REQ_UNPLUG_ALL: + virtio_mem_unplug_all_request(vmem, elem); + break; + case VIRTIO_MEM_REQ_STATE: + virtio_mem_state_request(vmem, elem, &req); + break; + default: + virtio_error(vdev, "virtio-mem protocol violation: unknown request" + " type: %d", type); + g_free(elem); + return; + } + + g_free(elem); + } +} + +static void virtio_mem_get_config(VirtIODevice *vdev, uint8_t *config_data) +{ + VirtIOMEM *vmem = VIRTIO_MEM(vdev); + struct virtio_mem_config *config = (void *) config_data; + + config->block_size = cpu_to_le64(vmem->block_size); + config->node_id = cpu_to_le16(vmem->node); + config->requested_size = cpu_to_le64(vmem->requested_size); + config->plugged_size = cpu_to_le64(vmem->size); + config->addr = cpu_to_le64(vmem->addr); + config->region_size = cpu_to_le64(memory_region_size(&vmem->memdev->mr)); + config->usable_region_size = cpu_to_le64(vmem->usable_region_size); +} + +static uint64_t virtio_mem_get_features(VirtIODevice *vdev, uint64_t features, + Error **errp) +{ + MachineState *ms = MACHINE(qdev_get_machine()); + + if (ms->numa_state) { +#if defined(CONFIG_ACPI) + virtio_add_feature(&features, VIRTIO_MEM_F_ACPI_PXM); +#endif + } + return features; +} + +static void virtio_mem_system_reset(void *opaque) +{ + VirtIOMEM *vmem = VIRTIO_MEM(opaque); + + /* + * During usual resets, we will unplug all memory and shrink the usable + * region size. This is, however, not possible in all scenarios. Then, + * the guest has to deal with this manually (VIRTIO_MEM_REQ_UNPLUG_ALL). + */ + virtio_mem_unplug_all(vmem); +} + +static void virtio_mem_device_realize(DeviceState *dev, Error **errp) +{ + MachineState *ms = MACHINE(qdev_get_machine()); + int nb_numa_nodes = ms->numa_state ? ms->numa_state->num_nodes : 0; + VirtIODevice *vdev = VIRTIO_DEVICE(dev); + VirtIOMEM *vmem = VIRTIO_MEM(dev); + uint64_t page_size; + RAMBlock *rb; + int ret; + + if (!vmem->memdev) { + error_setg(errp, "'%s' property is not set", VIRTIO_MEM_MEMDEV_PROP); + return; + } else if (host_memory_backend_is_mapped(vmem->memdev)) { + char *path = object_get_canonical_path_component(OBJECT(vmem->memdev)); + + error_setg(errp, "'%s' property specifies a busy memdev: %s", + VIRTIO_MEM_MEMDEV_PROP, path); + g_free(path); + return; + } else if (!memory_region_is_ram(&vmem->memdev->mr) || + memory_region_is_rom(&vmem->memdev->mr) || + !vmem->memdev->mr.ram_block) { + error_setg(errp, "'%s' property specifies an unsupported memdev", + VIRTIO_MEM_MEMDEV_PROP); + return; + } + + if ((nb_numa_nodes && vmem->node >= nb_numa_nodes) || + (!nb_numa_nodes && vmem->node)) { + error_setg(errp, "'%s' property has value '%" PRIu32 "', which exceeds" + "the number of numa nodes: %d", VIRTIO_MEM_NODE_PROP, + vmem->node, nb_numa_nodes ? nb_numa_nodes : 1); + return; + } + + if (enable_mlock) { + error_setg(errp, "Incompatible with mlock"); + return; + } + + rb = vmem->memdev->mr.ram_block; + page_size = qemu_ram_pagesize(rb); + + if (vmem->block_size < page_size) { + error_setg(errp, "'%s' property has to be at least the page size (0x%" + PRIx64 ")", VIRTIO_MEM_BLOCK_SIZE_PROP, page_size); + return; + } else if (!QEMU_IS_ALIGNED(vmem->requested_size, vmem->block_size)) { + error_setg(errp, "'%s' property has to be multiples of '%s' (0x%" PRIx64 + ")", VIRTIO_MEM_REQUESTED_SIZE_PROP, + VIRTIO_MEM_BLOCK_SIZE_PROP, vmem->block_size); + return; + } else if (!QEMU_IS_ALIGNED(memory_region_size(&vmem->memdev->mr), + vmem->block_size)) { + error_setg(errp, "'%s' property memdev size has to be multiples of" + "'%s' (0x%" PRIx64 ")", VIRTIO_MEM_MEMDEV_PROP, + VIRTIO_MEM_BLOCK_SIZE_PROP, vmem->block_size); + return; + } + + if (ram_block_discard_require(true)) { + error_setg(errp, "Discarding RAM is disabled"); + return; + } + + ret = ram_block_discard_range(rb, 0, qemu_ram_get_used_length(rb)); + if (ret) { + error_setg_errno(errp, -ret, "Unexpected error discarding RAM"); + ram_block_discard_require(false); + return; + } + + virtio_mem_resize_usable_region(vmem, vmem->requested_size, true); + + vmem->bitmap_size = memory_region_size(&vmem->memdev->mr) / + vmem->block_size; + vmem->bitmap = bitmap_new(vmem->bitmap_size); + + virtio_init(vdev, TYPE_VIRTIO_MEM, VIRTIO_ID_MEM, + sizeof(struct virtio_mem_config)); + vmem->vq = virtio_add_queue(vdev, 128, virtio_mem_handle_request); + + host_memory_backend_set_mapped(vmem->memdev, true); + vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem)); + qemu_register_reset(virtio_mem_system_reset, vmem); +} + +static void virtio_mem_device_unrealize(DeviceState *dev) +{ + VirtIODevice *vdev = VIRTIO_DEVICE(dev); + VirtIOMEM *vmem = VIRTIO_MEM(dev); + + qemu_unregister_reset(virtio_mem_system_reset, vmem); + vmstate_unregister_ram(&vmem->memdev->mr, DEVICE(vmem)); + host_memory_backend_set_mapped(vmem->memdev, false); + virtio_del_queue(vdev, 0); + virtio_cleanup(vdev); + g_free(vmem->bitmap); + ram_block_discard_require(false); +} + +static int virtio_mem_restore_unplugged(VirtIOMEM *vmem) +{ + RAMBlock *rb = vmem->memdev->mr.ram_block; + unsigned long first_zero_bit, last_zero_bit; + uint64_t offset, length; + int ret; + + /* Find consecutive unplugged blocks and discard the consecutive range. */ + first_zero_bit = find_first_zero_bit(vmem->bitmap, vmem->bitmap_size); + while (first_zero_bit < vmem->bitmap_size) { + offset = first_zero_bit * vmem->block_size; + last_zero_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size, + first_zero_bit + 1) - 1; + length = (last_zero_bit - first_zero_bit + 1) * vmem->block_size; + + ret = ram_block_discard_range(rb, offset, length); + if (ret) { + error_report("Unexpected error discarding RAM: %s", + strerror(-ret)); + return -EINVAL; + } + first_zero_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size, + last_zero_bit + 2); + } + return 0; +} + +static int virtio_mem_post_load(void *opaque, int version_id) +{ + if (migration_in_incoming_postcopy()) { + return 0; + } + + return virtio_mem_restore_unplugged(VIRTIO_MEM(opaque)); +} + +static const VMStateDescription vmstate_virtio_mem_device = { + .name = "virtio-mem-device", + .minimum_version_id = 1, + .version_id = 1, + .post_load = virtio_mem_post_load, + .fields = (VMStateField[]) { + VMSTATE_UINT64(usable_region_size, VirtIOMEM), + VMSTATE_UINT64(size, VirtIOMEM), + VMSTATE_UINT64(requested_size, VirtIOMEM), + VMSTATE_BITMAP(bitmap, VirtIOMEM, 0, bitmap_size), + VMSTATE_END_OF_LIST() + }, +}; + +static const VMStateDescription vmstate_virtio_mem = { + .name = "virtio-mem", + .minimum_version_id = 1, + .version_id = 1, + .fields = (VMStateField[]) { + VMSTATE_VIRTIO_DEVICE, + VMSTATE_END_OF_LIST() + }, +}; + +static void virtio_mem_fill_device_info(const VirtIOMEM *vmem, + VirtioMEMDeviceInfo *vi) +{ + vi->memaddr = vmem->addr; + vi->node = vmem->node; + vi->requested_size = vmem->requested_size; + vi->size = vmem->size; + vi->max_size = memory_region_size(&vmem->memdev->mr); + vi->block_size = vmem->block_size; + vi->memdev = object_get_canonical_path(OBJECT(vmem->memdev)); +} + +static MemoryRegion *virtio_mem_get_memory_region(VirtIOMEM *vmem, Error **errp) +{ + if (!vmem->memdev) { + error_setg(errp, "'%s' property must be set", VIRTIO_MEM_MEMDEV_PROP); + return NULL; + } + + return &vmem->memdev->mr; +} + +static void virtio_mem_get_size(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ + const VirtIOMEM *vmem = VIRTIO_MEM(obj); + uint64_t value = vmem->size; + + visit_type_size(v, name, &value, errp); +} + +static void virtio_mem_get_requested_size(Object *obj, Visitor *v, + const char *name, void *opaque, + Error **errp) +{ + const VirtIOMEM *vmem = VIRTIO_MEM(obj); + uint64_t value = vmem->requested_size; + + visit_type_size(v, name, &value, errp); +} + +static void virtio_mem_set_requested_size(Object *obj, Visitor *v, + const char *name, void *opaque, + Error **errp) +{ + VirtIOMEM *vmem = VIRTIO_MEM(obj); + Error *err = NULL; + uint64_t value; + + visit_type_size(v, name, &value, &err); + if (err) { + error_propagate(errp, err); + return; + } + + /* + * The block size and memory backend are not fixed until the device was + * realized. realize() will verify these properties then. + */ + if (DEVICE(obj)->realized) { + if (!QEMU_IS_ALIGNED(value, vmem->block_size)) { + error_setg(errp, "'%s' has to be multiples of '%s' (0x%" PRIx64 + ")", name, VIRTIO_MEM_BLOCK_SIZE_PROP, + vmem->block_size); + return; + } else if (value > memory_region_size(&vmem->memdev->mr)) { + error_setg(errp, "'%s' cannot exceed the memory backend size" + "(0x%" PRIx64 ")", name, + memory_region_size(&vmem->memdev->mr)); + return; + } + + if (value != vmem->requested_size) { + virtio_mem_resize_usable_region(vmem, value, false); + vmem->requested_size = value; + } + /* + * Trigger a config update so the guest gets notified. We trigger + * even if the size didn't change (especially helpful for debugging). + */ + virtio_notify_config(VIRTIO_DEVICE(vmem)); + } else { + vmem->requested_size = value; + } +} + +static void virtio_mem_get_block_size(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ + const VirtIOMEM *vmem = VIRTIO_MEM(obj); + uint64_t value = vmem->block_size; + + visit_type_size(v, name, &value, errp); +} + +static void virtio_mem_set_block_size(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ + VirtIOMEM *vmem = VIRTIO_MEM(obj); + Error *err = NULL; + uint64_t value; + + if (DEVICE(obj)->realized) { + error_setg(errp, "'%s' cannot be changed", name); + return; + } + + visit_type_size(v, name, &value, &err); + if (err) { + error_propagate(errp, err); + return; + } + + if (value < VIRTIO_MEM_MIN_BLOCK_SIZE) { + error_setg(errp, "'%s' property has to be at least 0x%" PRIx32, name, + VIRTIO_MEM_MIN_BLOCK_SIZE); + return; + } else if (!is_power_of_2(value)) { + error_setg(errp, "'%s' property has to be a power of two", name); + return; + } + vmem->block_size = value; +} + +static void virtio_mem_instance_init(Object *obj) +{ + VirtIOMEM *vmem = VIRTIO_MEM(obj); + + vmem->block_size = VIRTIO_MEM_MIN_BLOCK_SIZE; + + object_property_add(obj, VIRTIO_MEM_SIZE_PROP, "size", virtio_mem_get_size, + NULL, NULL, NULL); + object_property_add(obj, VIRTIO_MEM_REQUESTED_SIZE_PROP, "size", + virtio_mem_get_requested_size, + virtio_mem_set_requested_size, NULL, NULL); + object_property_add(obj, VIRTIO_MEM_BLOCK_SIZE_PROP, "size", + virtio_mem_get_block_size, virtio_mem_set_block_size, + NULL, NULL); +} + +static Property virtio_mem_properties[] = { + DEFINE_PROP_UINT64(VIRTIO_MEM_ADDR_PROP, VirtIOMEM, addr, 0), + DEFINE_PROP_UINT32(VIRTIO_MEM_NODE_PROP, VirtIOMEM, node, 0), + DEFINE_PROP_LINK(VIRTIO_MEM_MEMDEV_PROP, VirtIOMEM, memdev, + TYPE_MEMORY_BACKEND, HostMemoryBackend *), + DEFINE_PROP_END_OF_LIST(), +}; + +static void virtio_mem_class_init(ObjectClass *klass, void *data) +{ + DeviceClass *dc = DEVICE_CLASS(klass); + VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass); + VirtIOMEMClass *vmc = VIRTIO_MEM_CLASS(klass); + + device_class_set_props(dc, virtio_mem_properties); + dc->vmsd = &vmstate_virtio_mem; + + set_bit(DEVICE_CATEGORY_MISC, dc->categories); + vdc->realize = virtio_mem_device_realize; + vdc->unrealize = virtio_mem_device_unrealize; + vdc->get_config = virtio_mem_get_config; + vdc->get_features = virtio_mem_get_features; + vdc->vmsd = &vmstate_virtio_mem_device; + + vmc->fill_device_info = virtio_mem_fill_device_info; + vmc->get_memory_region = virtio_mem_get_memory_region; +} + +static const TypeInfo virtio_mem_info = { + .name = TYPE_VIRTIO_MEM, + .parent = TYPE_VIRTIO_DEVICE, + .instance_size = sizeof(VirtIOMEM), + .instance_init = virtio_mem_instance_init, + .class_init = virtio_mem_class_init, + .class_size = sizeof(VirtIOMEMClass), +}; + +static void virtio_register_types(void) +{ + type_register_static(&virtio_mem_info); +} + +type_init(virtio_register_types) diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h new file mode 100644 index 0000000000..6981096f7c --- /dev/null +++ b/include/hw/virtio/virtio-mem.h @@ -0,0 +1,78 @@ +/* + * Virtio MEM device + * + * Copyright (C) 2020 Red Hat, Inc. + * + * Authors: + * David Hildenbrand + * + * This work is licensed under the terms of the GNU GPL, version 2. + * See the COPYING file in the top-level directory. + */ + +#ifndef HW_VIRTIO_MEM_H +#define HW_VIRTIO_MEM_H + +#include "standard-headers/linux/virtio_mem.h" +#include "hw/virtio/virtio.h" +#include "qapi/qapi-types-misc.h" +#include "sysemu/hostmem.h" + +#define TYPE_VIRTIO_MEM "virtio-mem" + +#define VIRTIO_MEM(obj) \ + OBJECT_CHECK(VirtIOMEM, (obj), TYPE_VIRTIO_MEM) +#define VIRTIO_MEM_CLASS(oc) \ + OBJECT_CLASS_CHECK(VirtIOMEMClass, (oc), TYPE_VIRTIO_MEM) +#define VIRTIO_MEM_GET_CLASS(obj) \ + OBJECT_GET_CLASS(VirtIOMEMClass, (obj), TYPE_VIRTIO_MEM) + +#define VIRTIO_MEM_MEMDEV_PROP "memdev" +#define VIRTIO_MEM_NODE_PROP "node" +#define VIRTIO_MEM_SIZE_PROP "size" +#define VIRTIO_MEM_REQUESTED_SIZE_PROP "requested-size" +#define VIRTIO_MEM_BLOCK_SIZE_PROP "block-size" +#define VIRTIO_MEM_ADDR_PROP "memaddr" + +typedef struct VirtIOMEM { + VirtIODevice parent_obj; + + /* guest -> host request queue */ + VirtQueue *vq; + + /* bitmap used to track unplugged memory */ + int32_t bitmap_size; + unsigned long *bitmap; + + /* assigned memory backend and memory region */ + HostMemoryBackend *memdev; + + /* NUMA node */ + uint32_t node; + + /* assigned address of the region in guest physical memory */ + uint64_t addr; + + /* usable region size (<= region_size) */ + uint64_t usable_region_size; + + /* actual size (how much the guest plugged) */ + uint64_t size; + + /* requested size */ + uint64_t requested_size; + + /* block size and alignment */ + uint64_t block_size; +} VirtIOMEM; + +typedef struct VirtIOMEMClass { + /* private */ + VirtIODevice parent; + + /* public */ + void (*fill_device_info)(const VirtIOMEM *vmen, VirtioMEMDeviceInfo *vi); + MemoryRegion *(*get_memory_region)(VirtIOMEM *vmem, Error **errp); +} VirtIOMEMClass; + +#endif diff --git a/qapi/misc.json b/qapi/misc.json index 99b90ac80b..e1c5547b65 100644 --- a/qapi/misc.json +++ b/qapi/misc.json @@ -1354,19 +1354,56 @@ } } +## +# @VirtioMEMDeviceInfo: +# +# VirtioMEMDevice state information +# +# @id: device's ID +# +# @memaddr: physical address in memory, where device is mapped +# +# @requested-size: the user requested size of the device +# +# @size: the (current) size of memory that the device provides +# +# @max-size: the maximum size of memory that the device can provide +# +# @block-size: the block size of memory that the device provides +# +# @node: NUMA node number where device is assigned to +# +# @memdev: memory backend linked with the region +# +# Since: 5.1 +## +{ 'struct': 'VirtioMEMDeviceInfo', + 'data': { '*id': 'str', + 'memaddr': 'size', + 'requested-size': 'size', + 'size': 'size', + 'max-size': 'size', + 'block-size': 'size', + 'node': 'int', + 'memdev': 'str' + } +} + ## # @MemoryDeviceInfo: # # Union containing information about a memory device # # nvdimm is included since 2.12. virtio-pmem is included since 4.1. +# virtio-mem is included since 5.1. # # Since: 2.1 ## { 'union': 'MemoryDeviceInfo', 'data': { 'dimm': 'PCDIMMDeviceInfo', 'nvdimm': 'PCDIMMDeviceInfo', - 'virtio-pmem': 'VirtioPMEMDeviceInfo' + 'virtio-pmem': 'VirtioPMEMDeviceInfo', + 'virtio-mem': 'VirtioMEMDeviceInfo' } } From patchwork Wed Jun 10 11:54:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 280911 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.5 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DB7ACC433E0 for ; Wed, 10 Jun 2020 12:07:04 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A513020734 for ; Wed, 10 Jun 2020 12:07:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="NsuruMne" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A513020734 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:48180 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jizVX-00034q-RR for qemu-devel@archiver.kernel.org; Wed, 10 Jun 2020 08:07:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43482) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jizKU-0003iK-Vy for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:55:39 -0400 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:26229 helo=us-smtp-1.mimecast.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1jizKU-0002C1-6O for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:55:38 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1591790137; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Gfo8N/39HuNuGms67Iz3gfpzyYs5Q5599cbmKQTi7kc=; b=NsuruMnef/SeXrInDcbk1Tv3QbOVo9piPty2T2GrMBeTlv5O0l0SJlPL2l9783FMZCnADS M1SfiRiEEfFRQRYGoqtSdD1gVT6U/UCUaTuzFRewy5bRaqFIyNhHi7fDrl8SKGt5gSkB1b Js2UjyyUfW+ZKO7WLK3sewJlvgUYbXg= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-434-PRFe2t6NMSCLqb1PnzQQoQ-1; Wed, 10 Jun 2020 07:55:33 -0400 X-MC-Unique: PRFe2t6NMSCLqb1PnzQQoQ-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 67AF0100A8EB; Wed, 10 Jun 2020 11:55:32 +0000 (UTC) Received: from t480s.redhat.com (ovpn-114-42.ams2.redhat.com [10.36.114.42]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3D5E75D9D3; Wed, 10 Jun 2020 11:55:30 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v4 12/21] MAINTAINERS: Add myself as virtio-mem maintainer Date: Wed, 10 Jun 2020 13:54:10 +0200 Message-Id: <20200610115419.51688-13-david@redhat.com> In-Reply-To: <20200610115419.51688-1-david@redhat.com> References: <20200610115419.51688-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Received-SPF: pass client-ip=205.139.110.120; envelope-from=david@redhat.com; helo=us-smtp-1.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/09 23:51:15 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -30 X-Spam_score: -3.1 X-Spam_bar: --- X-Spam_report: (-3.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Eduardo Habkost , kvm@vger.kernel.org, "Michael S . Tsirkin" , David Hildenbrand , "Dr . David Alan Gilbert" , Markus Armbruster , qemu-s390x@nongnu.org, Paolo Bonzini , Richard Henderson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Let's make sure patches/bug reports find the right person. Reviewed-by: Dr. David Alan Gilbert Cc: "Michael S. Tsirkin" Cc: Peter Maydell Cc: Markus Armbruster Signed-off-by: David Hildenbrand --- MAINTAINERS | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 3abe3faa4e..4889485e6c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1762,6 +1762,14 @@ F: hw/virtio/virtio-crypto.c F: hw/virtio/virtio-crypto-pci.c F: include/hw/virtio/virtio-crypto.h +virtio-mem +M: David Hildenbrand +S: Supported +F: hw/virtio/virtio-mem.c +F: hw/virtio/virtio-mem-pci.h +F: hw/virtio/virtio-mem-pci.c +F: include/hw/virtio/virtio-mem.h + nvme M: Keith Busch L: qemu-block@nongnu.org From patchwork Wed Jun 10 11:54:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 280912 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.5 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AEC10C433DF for ; Wed, 10 Jun 2020 12:05:06 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 76F9A20734 for ; Wed, 10 Jun 2020 12:05:06 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="a1VZUERH" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 76F9A20734 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:41342 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jizTd-0008Ul-JY for qemu-devel@archiver.kernel.org; Wed, 10 Jun 2020 08:05:05 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43508) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jizKa-0003st-Fq for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:55:44 -0400 Received: from us-smtp-2.mimecast.com ([207.211.31.81]:29394 helo=us-smtp-delivery-1.mimecast.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1jizKZ-0002CT-FQ for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:55:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1591790142; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=q646sUohGiaUOuqh4dFa0vnY8G1fxPWotwWGJuhM3Gg=; b=a1VZUERHVVyEPcV5sEU/YHslbHybnOXukxGnhA2RY5LAheInMs7aeJXSxuCWzd0d2GnxSe X0KbgDzQJ/y5kAzjBt+44XBrhWQcdq0PMUF7cyK1KfjopFlsm6x67AbJL2rYtD2d1AwFZe 0MvKZThAFqdAu6+Tm/AI17heS1w1HqI= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-327-zqAp5p9cM8qZNzUbrb8Wlg-1; Wed, 10 Jun 2020 07:55:41 -0400 X-MC-Unique: zqAp5p9cM8qZNzUbrb8Wlg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id CFFC01B18BC0; Wed, 10 Jun 2020 11:55:39 +0000 (UTC) Received: from t480s.redhat.com (ovpn-114-42.ams2.redhat.com [10.36.114.42]) by smtp.corp.redhat.com (Postfix) with ESMTP id E748A5D9D3; Wed, 10 Jun 2020 11:55:34 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v4 14/21] numa: Handle virtio-mem in NUMA stats Date: Wed, 10 Jun 2020 13:54:12 +0200 Message-Id: <20200610115419.51688-15-david@redhat.com> In-Reply-To: <20200610115419.51688-1-david@redhat.com> References: <20200610115419.51688-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Received-SPF: pass client-ip=207.211.31.81; envelope-from=david@redhat.com; helo=us-smtp-delivery-1.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/09 23:22:15 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -30 X-Spam_score: -3.1 X-Spam_bar: --- X-Spam_report: (-3.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Pankaj Gupta , Eduardo Habkost , kvm@vger.kernel.org, "Michael S . Tsirkin" , David Hildenbrand , "Dr . David Alan Gilbert" , qemu-s390x@nongnu.org, Paolo Bonzini , Richard Henderson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Account the memory to the configured nid. Reviewed-by: Pankaj Gupta Cc: Eduardo Habkost Cc: Marcel Apfelbaum Cc: "Michael S. Tsirkin" Signed-off-by: David Hildenbrand --- hw/core/numa.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/hw/core/numa.c b/hw/core/numa.c index 316bc50d75..06960918e7 100644 --- a/hw/core/numa.c +++ b/hw/core/numa.c @@ -812,6 +812,7 @@ static void numa_stat_memory_devices(NumaNodeMem node_mem[]) MemoryDeviceInfoList *info; PCDIMMDeviceInfo *pcdimm_info; VirtioPMEMDeviceInfo *vpi; + VirtioMEMDeviceInfo *vmi; for (info = info_list; info; info = info->next) { MemoryDeviceInfo *value = info->value; @@ -832,6 +833,11 @@ static void numa_stat_memory_devices(NumaNodeMem node_mem[]) node_mem[0].node_mem += vpi->size; node_mem[0].node_plugged_mem += vpi->size; break; + case MEMORY_DEVICE_INFO_KIND_VIRTIO_MEM: + vmi = value->u.virtio_mem.data; + node_mem[vmi->node].node_mem += vmi->size; + node_mem[vmi->node].node_plugged_mem += vmi->size; + break; default: g_assert_not_reached(); } From patchwork Wed Jun 10 11:54:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 280910 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.5 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 024B8C433E0 for ; Wed, 10 Jun 2020 12:07:27 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C1ECB20734 for ; Wed, 10 Jun 2020 12:07:26 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="EiCs9KOv" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C1ECB20734 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:50020 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jizVt-0003wA-Ut for qemu-devel@archiver.kernel.org; Wed, 10 Jun 2020 08:07:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43528) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jizKj-0004Bs-Ip for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:55:53 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:49401 helo=us-smtp-1.mimecast.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1jizKg-0002Ct-Rz for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:55:53 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1591790150; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kV1TOZCW/a8NmT94QFWKHAqddj4smNK0o5RBU/7Yaho=; b=EiCs9KOvGrRflj1yPvKKK3Q4YcqNWzQbMDi9KkMXvEjjwayJRmDWzm73yhofyT5dvTPjqS brQQRhsYDn6EGzq9magWGQ6msaroxwBJVJjRNv5RKQKtUvjsDpp0fJmDWjHsBYWxPPYTYw YjB8ateH/wDLTnjeMTFkKmzrG1iiHWg= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-367-UTbbtVl4MaOp9Esn4ry03Q-1; Wed, 10 Jun 2020 07:55:46 -0400 X-MC-Unique: UTbbtVl4MaOp9Esn4ry03Q-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 71FBA835BBF; Wed, 10 Jun 2020 11:55:45 +0000 (UTC) Received: from t480s.redhat.com (ovpn-114-42.ams2.redhat.com [10.36.114.42]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2AA085D9D3; Wed, 10 Jun 2020 11:55:40 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v4 15/21] pc: Support for virtio-mem-pci Date: Wed, 10 Jun 2020 13:54:13 +0200 Message-Id: <20200610115419.51688-16-david@redhat.com> In-Reply-To: <20200610115419.51688-1-david@redhat.com> References: <20200610115419.51688-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Received-SPF: pass client-ip=207.211.31.120; envelope-from=david@redhat.com; helo=us-smtp-1.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/09 21:17:20 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -30 X-Spam_score: -3.1 X-Spam_bar: --- X-Spam_report: (-3.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Pankaj Gupta , Eduardo Habkost , kvm@vger.kernel.org, "Michael S . Tsirkin" , David Hildenbrand , "Dr . David Alan Gilbert" , Markus Armbruster , qemu-s390x@nongnu.org, Paolo Bonzini , Richard Henderson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Let's wire it up similar to virtio-pmem. Also disallow unplug, so it's harder for users to shoot themselves into the foot. Reviewed-by: Pankaj Gupta Cc: "Michael S. Tsirkin" Cc: Marcel Apfelbaum Cc: Paolo Bonzini Cc: Richard Henderson Cc: Eduardo Habkost Cc: Eric Blake Cc: Markus Armbruster Signed-off-by: David Hildenbrand --- hw/i386/Kconfig | 1 + hw/i386/pc.c | 49 ++++++++++++++++++++++++++++--------------------- 2 files changed, 29 insertions(+), 21 deletions(-) diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig index c93f32f657..03e347b207 100644 --- a/hw/i386/Kconfig +++ b/hw/i386/Kconfig @@ -35,6 +35,7 @@ config PC select ACPI_PCI select ACPI_VMGENID select VIRTIO_PMEM_SUPPORTED + select VIRTIO_MEM_SUPPORTED config PC_PCI bool diff --git a/hw/i386/pc.c b/hw/i386/pc.c index c740495eb6..ee6368915b 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -86,6 +86,7 @@ #include "hw/net/ne2000-isa.h" #include "standard-headers/asm-x86/bootparam.h" #include "hw/virtio/virtio-pmem-pci.h" +#include "hw/virtio/virtio-mem-pci.h" #include "hw/mem/memory-device.h" #include "sysemu/replay.h" #include "qapi/qmp/qerror.h" @@ -1657,8 +1658,8 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev, numa_cpu_pre_plug(cpu_slot, dev, errp); } -static void pc_virtio_pmem_pci_pre_plug(HotplugHandler *hotplug_dev, - DeviceState *dev, Error **errp) +static void pc_virtio_md_pci_pre_plug(HotplugHandler *hotplug_dev, + DeviceState *dev, Error **errp) { HotplugHandler *hotplug_dev2 = qdev_get_bus_hotplug_handler(dev); Error *local_err = NULL; @@ -1669,7 +1670,8 @@ static void pc_virtio_pmem_pci_pre_plug(HotplugHandler *hotplug_dev, * order. We should never reach this point when hotplugging on x86, * however, better add a safety net. */ - error_setg(errp, "virtio-pmem-pci hotplug not supported on this bus."); + error_setg(errp, "hotplug of virtio based memory devices not supported" + " on this bus."); return; } /* @@ -1684,8 +1686,8 @@ static void pc_virtio_pmem_pci_pre_plug(HotplugHandler *hotplug_dev, error_propagate(errp, local_err); } -static void pc_virtio_pmem_pci_plug(HotplugHandler *hotplug_dev, - DeviceState *dev, Error **errp) +static void pc_virtio_md_pci_plug(HotplugHandler *hotplug_dev, + DeviceState *dev, Error **errp) { HotplugHandler *hotplug_dev2 = qdev_get_bus_hotplug_handler(dev); Error *local_err = NULL; @@ -1705,17 +1707,17 @@ static void pc_virtio_pmem_pci_plug(HotplugHandler *hotplug_dev, error_propagate(errp, local_err); } -static void pc_virtio_pmem_pci_unplug_request(HotplugHandler *hotplug_dev, - DeviceState *dev, Error **errp) +static void pc_virtio_md_pci_unplug_request(HotplugHandler *hotplug_dev, + DeviceState *dev, Error **errp) { - /* We don't support virtio pmem hot unplug */ - error_setg(errp, "virtio pmem device unplug not supported."); + /* We don't support hot unplug of virtio based memory devices */ + error_setg(errp, "virtio based memory devices cannot be unplugged."); } -static void pc_virtio_pmem_pci_unplug(HotplugHandler *hotplug_dev, - DeviceState *dev, Error **errp) +static void pc_virtio_md_pci_unplug(HotplugHandler *hotplug_dev, + DeviceState *dev, Error **errp) { - /* We don't support virtio pmem hot unplug */ + /* We don't support hot unplug of virtio based memory devices */ } static void pc_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev, @@ -1725,8 +1727,9 @@ static void pc_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev, pc_memory_pre_plug(hotplug_dev, dev, errp); } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { pc_cpu_pre_plug(hotplug_dev, dev, errp); - } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI)) { - pc_virtio_pmem_pci_pre_plug(hotplug_dev, dev, errp); + } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI) || + object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MEM_PCI)) { + pc_virtio_md_pci_pre_plug(hotplug_dev, dev, errp); } } @@ -1737,8 +1740,9 @@ static void pc_machine_device_plug_cb(HotplugHandler *hotplug_dev, pc_memory_plug(hotplug_dev, dev, errp); } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { pc_cpu_plug(hotplug_dev, dev, errp); - } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI)) { - pc_virtio_pmem_pci_plug(hotplug_dev, dev, errp); + } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI) || + object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MEM_PCI)) { + pc_virtio_md_pci_plug(hotplug_dev, dev, errp); } } @@ -1749,8 +1753,9 @@ static void pc_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev, pc_memory_unplug_request(hotplug_dev, dev, errp); } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { pc_cpu_unplug_request_cb(hotplug_dev, dev, errp); - } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI)) { - pc_virtio_pmem_pci_unplug_request(hotplug_dev, dev, errp); + } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI) || + object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MEM_PCI)) { + pc_virtio_md_pci_unplug_request(hotplug_dev, dev, errp); } else { error_setg(errp, "acpi: device unplug request for not supported device" " type: %s", object_get_typename(OBJECT(dev))); @@ -1764,8 +1769,9 @@ static void pc_machine_device_unplug_cb(HotplugHandler *hotplug_dev, pc_memory_unplug(hotplug_dev, dev, errp); } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { pc_cpu_unplug_cb(hotplug_dev, dev, errp); - } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI)) { - pc_virtio_pmem_pci_unplug(hotplug_dev, dev, errp); + } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI) || + object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MEM_PCI)) { + pc_virtio_md_pci_unplug(hotplug_dev, dev, errp); } else { error_setg(errp, "acpi: device unplug for not supported device" " type: %s", object_get_typename(OBJECT(dev))); @@ -1777,7 +1783,8 @@ static HotplugHandler *pc_get_hotplug_handler(MachineState *machine, { if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) || object_dynamic_cast(OBJECT(dev), TYPE_CPU) || - object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI)) { + object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_PMEM_PCI) || + object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MEM_PCI)) { return HOTPLUG_HANDLER(machine); } From patchwork Wed Jun 10 11:54:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 280909 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.5 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EE828C433E0 for ; Wed, 10 Jun 2020 12:09:15 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B80DE20734 for ; Wed, 10 Jun 2020 12:09:15 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="QmPNMJ4V" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B80DE20734 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:57030 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jizXe-000756-Vc for qemu-devel@archiver.kernel.org; Wed, 10 Jun 2020 08:09:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43576) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jizKt-0004S0-Ut for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:56:04 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:58615 helo=us-smtp-1.mimecast.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1jizKt-0002Eh-2a for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:56:03 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1591790162; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=LNpYa81sj5PjQyv9U3qiC+uKU04GSUhnVRclqJQiLmE=; b=QmPNMJ4VmwPgY370O6zCWVUNwseQgIBH8cTbcd2AJ8VH1g+IEwo3RXnM4UBJQhjzBdfpa0 w98WXcsZu6kCBUYuO+ue91o8+/dzcxcqNR0wjuDeihOZWRJkNuOS+KmpdDzNKTeTiZcUgE k8fLV4LGE2xVF+U87iwMvpsfNVfZ5fc= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-257-WwlQQFjoPvGbDLkts7bPog-1; Wed, 10 Jun 2020 07:55:58 -0400 X-MC-Unique: WwlQQFjoPvGbDLkts7bPog-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 899CC461; Wed, 10 Jun 2020 11:55:57 +0000 (UTC) Received: from t480s.redhat.com (ovpn-114-42.ams2.redhat.com [10.36.114.42]) by smtp.corp.redhat.com (Postfix) with ESMTP id 075FA5D9D3; Wed, 10 Jun 2020 11:55:52 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v4 19/21] virtio-mem: Add trace events Date: Wed, 10 Jun 2020 13:54:17 +0200 Message-Id: <20200610115419.51688-20-david@redhat.com> In-Reply-To: <20200610115419.51688-1-david@redhat.com> References: <20200610115419.51688-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Received-SPF: pass client-ip=207.211.31.120; envelope-from=david@redhat.com; helo=us-smtp-1.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/09 21:17:20 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -30 X-Spam_score: -3.1 X-Spam_bar: --- X-Spam_report: (-3.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Eduardo Habkost , kvm@vger.kernel.org, "Michael S . Tsirkin" , David Hildenbrand , "Dr . David Alan Gilbert" , qemu-s390x@nongnu.org, Paolo Bonzini , Richard Henderson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Let's add some trace events that might come in handy later. Cc: "Michael S. Tsirkin" Cc: "Dr. David Alan Gilbert" Signed-off-by: David Hildenbrand --- hw/virtio/trace-events | 10 ++++++++++ hw/virtio/virtio-mem.c | 10 +++++++++- 2 files changed, 19 insertions(+), 1 deletion(-) diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index e83500bee9..c40ad5ea27 100644 --- a/hw/virtio/trace-events +++ b/hw/virtio/trace-events @@ -73,3 +73,13 @@ virtio_iommu_get_domain(uint32_t domain_id) "Alloc domain=%d" virtio_iommu_put_domain(uint32_t domain_id) "Free domain=%d" virtio_iommu_translate_out(uint64_t virt_addr, uint64_t phys_addr, uint32_t sid) "0x%"PRIx64" -> 0x%"PRIx64 " for sid=%d" virtio_iommu_report_fault(uint8_t reason, uint32_t flags, uint32_t endpoint, uint64_t addr) "FAULT reason=%d flags=%d endpoint=%d address =0x%"PRIx64 + +# virtio-mem.c +virtio_mem_send_response(uint16_t type) "type=%" PRIu16 +virtio_mem_plug_request(uint64_t addr, uint16_t nb_blocks) "addr=0x%" PRIx64 " nb_blocks=%" PRIu16 +virtio_mem_unplug_request(uint64_t addr, uint16_t nb_blocks) "addr=0x%" PRIx64 " nb_blocks=%" PRIu16 +virtio_mem_unplugged_all(void) "" +virtio_mem_unplug_all_request(void) "" +virtio_mem_resized_usable_region(uint64_t old_size, uint64_t new_size) "old_size=0x%" PRIx64 "new_size=0x%" PRIx64 +virtio_mem_state_request(uint64_t addr, uint16_t nb_blocks) "addr=0x%" PRIx64 " nb_blocks=%" PRIu16 +virtio_mem_state_response(uint16_t state) "state=%" PRIu16 diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c index 450b8dc49d..468fb4170f 100644 --- a/hw/virtio/virtio-mem.c +++ b/hw/virtio/virtio-mem.c @@ -30,6 +30,7 @@ #include "hw/boards.h" #include "hw/qdev-properties.h" #include "config-devices.h" +#include "trace.h" /* * Use QEMU_VMALLOC_ALIGN, so no THP will have to be split when unplugging @@ -100,6 +101,7 @@ static void virtio_mem_send_response(VirtIOMEM *vmem, VirtQueueElement *elem, VirtIODevice *vdev = VIRTIO_DEVICE(vmem); VirtQueue *vq = vmem->vq; + trace_virtio_mem_send_response(le16_to_cpu(resp->type)); iov_from_buf(elem->in_sg, elem->in_num, 0, resp, sizeof(*resp)); virtqueue_push(vq, elem, sizeof(*resp)); @@ -195,6 +197,7 @@ static void virtio_mem_plug_request(VirtIOMEM *vmem, VirtQueueElement *elem, const uint16_t nb_blocks = le16_to_cpu(req->u.plug.nb_blocks); uint16_t type; + trace_virtio_mem_plug_request(gpa, nb_blocks); type = virtio_mem_state_change_request(vmem, gpa, nb_blocks, true); virtio_mem_send_response_simple(vmem, elem, type); } @@ -206,6 +209,7 @@ static void virtio_mem_unplug_request(VirtIOMEM *vmem, VirtQueueElement *elem, const uint16_t nb_blocks = le16_to_cpu(req->u.unplug.nb_blocks); uint16_t type; + trace_virtio_mem_unplug_request(gpa, nb_blocks); type = virtio_mem_state_change_request(vmem, gpa, nb_blocks, false); virtio_mem_send_response_simple(vmem, elem, type); } @@ -225,6 +229,7 @@ static void virtio_mem_resize_usable_region(VirtIOMEM *vmem, return; } + trace_virtio_mem_resized_usable_region(vmem->usable_region_size, newsize); vmem->usable_region_size = newsize; } @@ -247,7 +252,7 @@ static int virtio_mem_unplug_all(VirtIOMEM *vmem) vmem->size = 0; notifier_list_notify(&vmem->size_change_notifiers, &vmem->size); } - + trace_virtio_mem_unplugged_all(); virtio_mem_resize_usable_region(vmem, vmem->requested_size, true); return 0; } @@ -255,6 +260,7 @@ static int virtio_mem_unplug_all(VirtIOMEM *vmem) static void virtio_mem_unplug_all_request(VirtIOMEM *vmem, VirtQueueElement *elem) { + trace_virtio_mem_unplug_all_request(); if (virtio_mem_unplug_all(vmem)) { virtio_mem_send_response_simple(vmem, elem, VIRTIO_MEM_RESP_BUSY); } else { @@ -272,6 +278,7 @@ static void virtio_mem_state_request(VirtIOMEM *vmem, VirtQueueElement *elem, .type = cpu_to_le16(VIRTIO_MEM_RESP_ACK), }; + trace_virtio_mem_state_request(gpa, nb_blocks); if (!virtio_mem_valid_range(vmem, gpa, size)) { virtio_mem_send_response_simple(vmem, elem, VIRTIO_MEM_RESP_ERROR); return; @@ -284,6 +291,7 @@ static void virtio_mem_state_request(VirtIOMEM *vmem, VirtQueueElement *elem, } else { resp.u.state.state = cpu_to_le16(VIRTIO_MEM_STATE_MIXED); } + trace_virtio_mem_state_response(le16_to_cpu(resp.u.state.state)); virtio_mem_send_response(vmem, elem, &resp); } From patchwork Wed Jun 10 11:54:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 280907 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.5 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5EF83C433DF for ; Wed, 10 Jun 2020 12:11:51 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2738720734 for ; Wed, 10 Jun 2020 12:11:51 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="MTl8kkfz" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2738720734 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:38172 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jizaA-0002XA-Ek for qemu-devel@archiver.kernel.org; Wed, 10 Jun 2020 08:11:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43586) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jizKw-0004Vr-Bd for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:56:06 -0400 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:32478 helo=us-smtp-1.mimecast.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1jizKv-0002F2-Af for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:56:06 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1591790164; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wekDDw574QgT8J4I/42gBDbU1WxIpKSkHwqQsu+1Vwk=; b=MTl8kkfzt/qx1HfDlw22FZK5jL5vH6jWKOvGBc12zo1/5tvVrXYmD/EAQ843i9S6yvcKpc WwfcgBEBcmC7lJTn9IdO48ZOG4x2zxRGpvWarurJczuRuhrYc1vvZG0rPIFRlc1txvPrhs qzUlAFjPTkxh3Hm80EUfrvjU8dnkz8E= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-460-4QSvKPO-Mq6I8ILWbBNFhA-1; Wed, 10 Jun 2020 07:56:00 -0400 X-MC-Unique: 4QSvKPO-Mq6I8ILWbBNFhA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id BA1961B18BC0; Wed, 10 Jun 2020 11:55:59 +0000 (UTC) Received: from t480s.redhat.com (ovpn-114-42.ams2.redhat.com [10.36.114.42]) by smtp.corp.redhat.com (Postfix) with ESMTP id DE2CB5D9D3; Wed, 10 Jun 2020 11:55:57 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v4 20/21] virtio-mem: Exclude unplugged memory during migration Date: Wed, 10 Jun 2020 13:54:18 +0200 Message-Id: <20200610115419.51688-21-david@redhat.com> In-Reply-To: <20200610115419.51688-1-david@redhat.com> References: <20200610115419.51688-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Received-SPF: pass client-ip=205.139.110.120; envelope-from=david@redhat.com; helo=us-smtp-1.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/09 23:51:15 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -30 X-Spam_score: -3.1 X-Spam_bar: --- X-Spam_report: (-3.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Eduardo Habkost , kvm@vger.kernel.org, "Michael S . Tsirkin" , David Hildenbrand , "Dr . David Alan Gilbert" , qemu-s390x@nongnu.org, Paolo Bonzini , Richard Henderson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" The content of unplugged memory is undefined and should not be migrated, ever. Exclude all unplugged memory during precopy using the precopy notifier infrastructure introduced for free page hinting in virtio-balloon. Unplugged memory is marked as "not dirty", meaning it won't be considered for migration. Cc: "Michael S. Tsirkin" Cc: "Dr. David Alan Gilbert" Signed-off-by: David Hildenbrand --- hw/virtio/virtio-mem.c | 54 +++++++++++++++++++++++++++++++++- include/hw/virtio/virtio-mem.h | 3 ++ 2 files changed, 56 insertions(+), 1 deletion(-) diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c index 468fb4170f..21a23a3252 100644 --- a/hw/virtio/virtio-mem.c +++ b/hw/virtio/virtio-mem.c @@ -62,8 +62,14 @@ static bool virtio_mem_is_busy(void) /* * Postcopy cannot handle concurrent discards and we don't want to migrate * pages on-demand with stale content when plugging new blocks. + * + * For precopy, we don't want unplugged blocks in our migration stream, and + * when plugging new blocks, the page content might differ between source + * and destination (observable by the guest when not initializing pages + * after plugging them) until we're running on the destination (as we didn't + * migrate these blocks when they were unplugged). */ - return migration_in_incoming_postcopy(); + return migration_in_incoming_postcopy() || !migration_is_idle(); } static bool virtio_mem_test_bitmap(VirtIOMEM *vmem, uint64_t start_gpa, @@ -475,6 +481,7 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp) host_memory_backend_set_mapped(vmem->memdev, true); vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem)); qemu_register_reset(virtio_mem_system_reset, vmem); + precopy_add_notifier(&vmem->precopy_notifier); } static void virtio_mem_device_unrealize(DeviceState *dev) @@ -482,6 +489,7 @@ static void virtio_mem_device_unrealize(DeviceState *dev) VirtIODevice *vdev = VIRTIO_DEVICE(dev); VirtIOMEM *vmem = VIRTIO_MEM(dev); + precopy_remove_notifier(&vmem->precopy_notifier); qemu_unregister_reset(virtio_mem_system_reset, vmem); vmstate_unregister_ram(&vmem->memdev->mr, DEVICE(vmem)); host_memory_backend_set_mapped(vmem->memdev, false); @@ -756,12 +764,56 @@ static void virtio_mem_set_block_size(Object *obj, Visitor *v, const char *name, vmem->block_size = value; } +static void virtio_mem_precopy_exclude_unplugged(VirtIOMEM *vmem) +{ + void * const host = qemu_ram_get_host_addr(vmem->memdev->mr.ram_block); + unsigned long first_zero_bit, last_zero_bit; + uint64_t offset, length; + + /* + * Find consecutive unplugged blocks and exclude them from migration. + * + * Note: Blocks cannot get (un)plugged during precopy, no locking needed. + */ + first_zero_bit = find_first_zero_bit(vmem->bitmap, vmem->bitmap_size); + while (first_zero_bit < vmem->bitmap_size) { + offset = first_zero_bit * vmem->block_size; + last_zero_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size, + first_zero_bit + 1) - 1; + length = (last_zero_bit - first_zero_bit + 1) * vmem->block_size; + + qemu_guest_free_page_hint(host + offset, length); + first_zero_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size, + last_zero_bit + 2); + } +} + +static int virtio_mem_precopy_notify(NotifierWithReturn *n, void *data) +{ + VirtIOMEM *vmem = container_of(n, VirtIOMEM, precopy_notifier); + PrecopyNotifyData *pnd = data; + + switch (pnd->reason) { + case PRECOPY_NOTIFY_SETUP: + precopy_enable_free_page_optimization(); + break; + case PRECOPY_NOTIFY_AFTER_BITMAP_SYNC: + virtio_mem_precopy_exclude_unplugged(vmem); + break; + default: + break; + } + + return 0; +} + static void virtio_mem_instance_init(Object *obj) { VirtIOMEM *vmem = VIRTIO_MEM(obj); vmem->block_size = VIRTIO_MEM_MIN_BLOCK_SIZE; notifier_list_init(&vmem->size_change_notifiers); + vmem->precopy_notifier.notify = virtio_mem_precopy_notify; object_property_add(obj, VIRTIO_MEM_SIZE_PROP, "size", virtio_mem_get_size, NULL, NULL, NULL); diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h index b74c77cd42..0778224964 100644 --- a/include/hw/virtio/virtio-mem.h +++ b/include/hw/virtio/virtio-mem.h @@ -67,6 +67,9 @@ typedef struct VirtIOMEM { /* notifiers to notify when "size" changes */ NotifierList size_change_notifiers; + + /* don't migrate unplugged memory */ + NotifierWithReturn precopy_notifier; } VirtIOMEM; typedef struct VirtIOMEMClass { From patchwork Wed Jun 10 11:54:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 280906 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.5 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B1EBC433DF for ; Wed, 10 Jun 2020 12:14:12 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6404920734 for ; Wed, 10 Jun 2020 12:14:12 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Vc6+AD0A" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6404920734 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:45200 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jizcR-0005lK-Kl for qemu-devel@archiver.kernel.org; Wed, 10 Jun 2020 08:14:11 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43614) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jizL4-0004oG-Gv for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:56:15 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:24030 helo=us-smtp-1.mimecast.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1jizL2-0002Fu-VP for qemu-devel@nongnu.org; Wed, 10 Jun 2020 07:56:14 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1591790172; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OMm4of9YaGzegN5bSIFM+q+axawQAsBn39Dg4QO3Ssc=; b=Vc6+AD0AMdlrA2DwSOk6vaiZ0+BJ6swksuGsbIOqkd6I740uCsYXTe0H7sDHnrO+YSkvWN no/gsWZqJcHaAQn8aRi545YN4YcQZkCKnYjESCn63h+kR3c3eLW2zdicsA/4ZG67kuGqsG NhGY5FhwoaXtVPLzsZjXPBEbAeie1cU= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-237-iZtHDbZuMtaMTb9TjcRE4w-1; Wed, 10 Jun 2020 07:56:10 -0400 X-MC-Unique: iZtHDbZuMtaMTb9TjcRE4w-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 607EF1B18BC2; Wed, 10 Jun 2020 11:56:08 +0000 (UTC) Received: from t480s.redhat.com (ovpn-114-42.ams2.redhat.com [10.36.114.42]) by smtp.corp.redhat.com (Postfix) with ESMTP id 188245D9D3; Wed, 10 Jun 2020 11:55:59 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v4 21/21] numa: Auto-enable NUMA when any memory devices are possible Date: Wed, 10 Jun 2020 13:54:19 +0200 Message-Id: <20200610115419.51688-22-david@redhat.com> In-Reply-To: <20200610115419.51688-1-david@redhat.com> References: <20200610115419.51688-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Received-SPF: pass client-ip=207.211.31.120; envelope-from=david@redhat.com; helo=us-smtp-1.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/09 21:17:20 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -30 X-Spam_score: -3.1 X-Spam_bar: --- X-Spam_report: (-3.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Sergio Lopez , Eduardo Habkost , kvm@vger.kernel.org, "Michael S . Tsirkin" , David Hildenbrand , "Dr . David Alan Gilbert" , qemu-s390x@nongnu.org, "qemu-arm @ nongnu . org" , Igor Mammedov , Paolo Bonzini , Alex Shi , Richard Henderson Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Let's auto-enable it also when maxmem is specified but no slots are defined. This will result in us properly creating ACPI srat tables, indicating the maximum possible PFN to the guest OS. Based on this, e.g., Linux will enable the swiotlb properly. This avoids having to manually force the switolb on (swiotlb=force) in Linux in case we're booting only using DMA memory (e.g., 2GB on x86-64), and virtio-mem adds memory later on that really needs the swiotlb to be used for DMA. Let's take care of backwards compatibility if somebody has a setup that specifies "maxram" without "slots". Reported-by: Alex Shi Cc: Peter Maydell Cc: Eduardo Habkost Cc: Marcel Apfelbaum Cc: Sergio Lopez Cc: Paolo Bonzini Cc: Richard Henderson Cc: "Michael S. Tsirkin" Cc: Igor Mammedov Cc: qemu-arm@nongnu.org Signed-off-by: David Hildenbrand --- hw/arm/virt.c | 2 ++ hw/core/numa.c | 11 ++++++----- hw/i386/microvm.c | 1 + hw/i386/pc.c | 1 + hw/i386/pc_piix.c | 1 + hw/i386/pc_q35.c | 1 + include/hw/boards.h | 1 + 7 files changed, 13 insertions(+), 5 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 37462a6f78..d50ad6f841 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -2265,6 +2265,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data) mc->numa_mem_supported = true; mc->nvdimm_supported = true; mc->auto_enable_numa_with_memhp = true; + mc->auto_enable_numa_with_memdev = true; mc->default_ram_id = "mach-virt.ram"; object_class_property_add(oc, "acpi", "OnOffAuto", @@ -2375,6 +2376,7 @@ DEFINE_VIRT_MACHINE_AS_LATEST(5, 1) static void virt_machine_5_0_options(MachineClass *mc) { virt_machine_5_1_options(mc); + mc->auto_enable_numa_with_memdev = false; } DEFINE_VIRT_MACHINE(5, 0) diff --git a/hw/core/numa.c b/hw/core/numa.c index 06960918e7..b9923cdd45 100644 --- a/hw/core/numa.c +++ b/hw/core/numa.c @@ -681,8 +681,9 @@ void numa_complete_configuration(MachineState *ms) NodeInfo *numa_info = ms->numa_state->nodes; /* - * If memory hotplug is enabled (slots > 0) but without '-numa' - * options explicitly on CLI, guestes will break. + * If memory hotplug is enabled (slot > 0) or memory devices are enabled + * (ms->maxram_size > ram_size) but without '-numa' options explicitly on + * CLI, guests will break. * * Windows: won't enable memory hotplug without SRAT table at all * @@ -697,9 +698,9 @@ void numa_complete_configuration(MachineState *ms) * assume there is just one node with whole RAM. */ if (ms->numa_state->num_nodes == 0 && - ((ms->ram_slots > 0 && - mc->auto_enable_numa_with_memhp) || - mc->auto_enable_numa)) { + ((ms->ram_slots && mc->auto_enable_numa_with_memhp) || + (ms->maxram_size > ms->ram_size && mc->auto_enable_numa_with_memdev) || + mc->auto_enable_numa)) { NumaNodeOptions node = { }; parse_numa_node(ms, &node, &error_abort); numa_info[0].node_mem = ram_size; diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c index 937db10ae6..2c052ee2e5 100644 --- a/hw/i386/microvm.c +++ b/hw/i386/microvm.c @@ -497,6 +497,7 @@ static void microvm_class_init(ObjectClass *oc, void *data) mc->max_cpus = 288; mc->has_hotpluggable_cpus = false; mc->auto_enable_numa_with_memhp = false; + mc->auto_enable_numa_with_memdev = false; mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE; mc->nvdimm_supported = false; mc->default_ram_id = "microvm.ram"; diff --git a/hw/i386/pc.c b/hw/i386/pc.c index ee6368915b..ec93870de5 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1955,6 +1955,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data) mc->get_default_cpu_node_id = x86_get_default_cpu_node_id; mc->possible_cpu_arch_ids = x86_possible_cpu_arch_ids; mc->auto_enable_numa_with_memhp = true; + mc->auto_enable_numa_with_memdev = true; mc->has_hotpluggable_cpus = true; mc->default_boot_order = "cad"; mc->hot_add_cpu = pc_hot_add_cpu; diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index f66e1d73ce..1bc68c3841 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -440,6 +440,7 @@ static void pc_i440fx_5_0_machine_options(MachineClass *m) m->is_default = false; compat_props_add(m->compat_props, hw_compat_5_0, hw_compat_5_0_len); compat_props_add(m->compat_props, pc_compat_5_0, pc_compat_5_0_len); + m->auto_enable_numa_with_memdev = false; } DEFINE_I440FX_MACHINE(v5_0, "pc-i440fx-5.0", NULL, diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 4ba8ac8774..a17459e37c 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -368,6 +368,7 @@ static void pc_q35_5_0_machine_options(MachineClass *m) m->alias = NULL; compat_props_add(m->compat_props, hw_compat_5_0, hw_compat_5_0_len); compat_props_add(m->compat_props, pc_compat_5_0, pc_compat_5_0_len); + m->auto_enable_numa_with_memhp = false; } DEFINE_Q35_MACHINE(v5_0, "pc-q35-5.0", NULL, diff --git a/include/hw/boards.h b/include/hw/boards.h index 18815d9be2..426ce5f625 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -207,6 +207,7 @@ struct MachineClass { const char **valid_cpu_types; strList *allowed_dynamic_sysbus_devices; bool auto_enable_numa_with_memhp; + bool auto_enable_numa_with_memdev; void (*numa_auto_assign_ram)(MachineClass *mc, NodeInfo *nodes, int nb_nodes, ram_addr_t size); bool ignore_boot_device_suffixes;