From patchwork Thu Sep 24 16:04:18 2020
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 304491
From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Subject: [PATCH PROTOTYPE 1/6] memory: Introduce sparse RAM handler for memory regions
Date: Thu, 24 Sep 2020 18:04:18 +0200
Message-Id: <20200924160423.106747-2-david@redhat.com>
In-Reply-To: <20200924160423.106747-1-david@redhat.com>
References: <20200924160423.106747-1-david@redhat.com>
Cc: Pankaj Gupta, David Hildenbrand, "Michael S. Tsirkin",
    "Dr. David Alan Gilbert", Peter Xu, Luiz Capitulino, Auger Eric,
    Alex Williamson, Wei Yang, Igor Mammedov, Paolo Bonzini

We have some special RAM memory regions (managed by paravirtualized memory
devices - virtio-mem), whereby the guest agreed to only use selected memory
ranges. This results in "sparse" mmaps, "sparse" RAMBlocks, and "sparse" RAM
memory regions. In most cases, we currently don't care about that - e.g., in
KVM, we simply have a single KVM memory slot (and as the number of slots is
fairly limited, we'll have to keep it like that). However, in the case of
vfio, registering the whole region with the kernel results in all pages
getting pinned, and therefore in unexpectedly high memory consumption. This
is the main reason why vfio is incompatible with memory ballooning.

Let's introduce a way to communicate the actually accessible/mapped (meaning,
not discarded) pieces of such a sparse memory region, and to get notified on
changes (e.g., a virtio-mem device plugging/unplugging memory).
We expect that the SparseRAMHandler is set for a memory region before it is
mapped into guest physical address space (so before any memory listeners get
notified about the addition), and that it isn't unset before the memory
region has been unmapped from guest physical address space (so after all
memory listeners have been notified about the removal). This is somewhat
similar to the iommu memory region notifier mechanism.

TODO:
- Better documentation.
- Better naming?
- Handle it on RAMBlocks?
- SPAPR special handling required (virtio-mem only supports x86-64 for now)?
- Catch mapping errors during hotplug in a nice way.
- Fail early when a certain number of mappings would be exceeded (instead of
  eventually consuming too many, leaving none for others).
- Resizeable memory region handling (future).
- Callback to check the state of a block.

Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Wei Yang
Cc: Dr. David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Signed-off-by: David Hildenbrand
---
 include/exec/memory.h | 115 ++++++++++++++++++++++++++++++++++++++++++
 softmmu/memory.c      |   7 +++
 2 files changed, 122 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index f1bb2a7df5..2931ead730 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -42,6 +42,12 @@ typedef struct IOMMUMemoryRegionClass IOMMUMemoryRegionClass;
 DECLARE_OBJ_CHECKERS(IOMMUMemoryRegion, IOMMUMemoryRegionClass,
                      IOMMU_MEMORY_REGION, TYPE_IOMMU_MEMORY_REGION)
 
+#define TYPE_SPARSE_RAM_HANDLER "sparse-ram-handler"
+typedef struct SparseRAMHandlerClass SparseRAMHandlerClass;
+typedef struct SparseRAMHandler SparseRAMHandler;
+DECLARE_OBJ_CHECKERS(SparseRAMHandler, SparseRAMHandlerClass,
+                     SPARSE_RAM_HANDLER, TYPE_SPARSE_RAM_HANDLER)
+
 extern bool global_dirty_log;
 
 typedef struct MemoryRegionOps MemoryRegionOps;
@@ -136,6 +142,28 @@ static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
     n->iommu_idx = iommu_idx;
 }
 
+struct SparseRAMNotifier;
+typedef int (*SparseRAMNotifyMap)(struct SparseRAMNotifier *notifier,
+                                  const MemoryRegion *mr, uint64_t mr_offset,
+                                  uint64_t size);
+typedef void (*SparseRAMNotifyUnmap)(struct SparseRAMNotifier *notifier,
+                                     const MemoryRegion *mr,
+                                     uint64_t mr_offset, uint64_t size);
+
+typedef struct SparseRAMNotifier {
+    SparseRAMNotifyMap notify_map;
+    SparseRAMNotifyUnmap notify_unmap;
+    QLIST_ENTRY(SparseRAMNotifier) next;
+} SparseRAMNotifier;
+
+static inline void sparse_ram_notifier_init(SparseRAMNotifier *notifier,
+                                            SparseRAMNotifyMap map_fn,
+                                            SparseRAMNotifyUnmap unmap_fn)
+{
+    notifier->notify_map = map_fn;
+    notifier->notify_unmap = unmap_fn;
+}
+
 /*
  * Memory region callbacks
  */
@@ -352,6 +380,36 @@ struct IOMMUMemoryRegionClass {
     int (*num_indexes)(IOMMUMemoryRegion *iommu);
 };
 
+struct SparseRAMHandlerClass {
+    /* private */
+    InterfaceClass parent_class;
+
+    /*
+     * Returns the minimum granularity in which granularity-aligned pieces
+     * within the memory region can become either mapped or unmapped.
+     */
+    uint64_t (*get_granularity)(const SparseRAMHandler *srh,
+                                const MemoryRegion *mr);
+
+    /*
+     * Register a listener for mapping changes.
+     */
+    void (*register_listener)(SparseRAMHandler *srh, const MemoryRegion *mr,
+                              SparseRAMNotifier *notifier);
+
+    /*
+     * Unregister a listener for mapping changes.
+     */
+    void (*unregister_listener)(SparseRAMHandler *srh, const MemoryRegion *mr,
+                                SparseRAMNotifier *notifier);
+
+    /*
+     * Replay notifications for mapped RAM.
+     */
+    int (*replay_mapped)(SparseRAMHandler *srh, const MemoryRegion *mr,
+                         SparseRAMNotifier *notifier);
+};
+
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
 typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
 
@@ -399,6 +457,7 @@ struct MemoryRegion {
     const char *name;
     unsigned ioeventfd_nb;
     MemoryRegionIoeventfd *ioeventfds;
+    SparseRAMHandler *srh; /* For RAM only */
 };
 
 struct IOMMUMemoryRegion {
@@ -1889,6 +1948,62 @@ bool memory_region_present(MemoryRegion *container, hwaddr addr);
  */
 bool memory_region_is_mapped(MemoryRegion *mr);
 
+static inline SparseRAMHandler *memory_region_get_sparse_ram_handler(
+                                                              MemoryRegion *mr)
+{
+    return mr->srh;
+}
+
+static inline bool memory_region_is_sparse_ram(MemoryRegion *mr)
+{
+    return memory_region_get_sparse_ram_handler(mr) != NULL;
+}
+
+static inline void memory_region_set_sparse_ram_handler(MemoryRegion *mr,
+                                                        SparseRAMHandler *srh)
+{
+    g_assert(memory_region_is_ram(mr));
+    mr->srh = srh;
+}
+
+static inline void memory_region_register_sparse_ram_notifier(
+                                                          MemoryRegion *mr,
+                                                          SparseRAMNotifier *n)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    srhc->register_listener(srh, mr, n);
+}
+
+static inline void memory_region_unregister_sparse_ram_notifier(
+                                                          MemoryRegion *mr,
+                                                          SparseRAMNotifier *n)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    srhc->unregister_listener(srh, mr, n);
+}
+
+static inline uint64_t memory_region_sparse_ram_get_granularity(
+                                                          MemoryRegion *mr)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    return srhc->get_granularity(srh, mr);
+}
+
+static inline int memory_region_sparse_ram_replay_mapped(MemoryRegion *mr,
+                                                         SparseRAMNotifier *n)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    return srhc->replay_mapped(srh, mr, n);
+}
+
 /**
  * memory_region_find: translate an address/size relative to a
  * MemoryRegion into a #MemoryRegionSection.
diff --git a/softmmu/memory.c b/softmmu/memory.c
index d030eb6f7c..89649f52f7 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -3241,10 +3241,17 @@ static const TypeInfo iommu_memory_region_info = {
     .abstract = true,
 };
 
+static const TypeInfo sparse_ram_handler_info = {
+    .parent = TYPE_INTERFACE,
+    .name = TYPE_SPARSE_RAM_HANDLER,
+    .class_size = sizeof(SparseRAMHandlerClass),
+};
+
 static void memory_register_types(void)
 {
     type_register_static(&memory_region_info);
     type_register_static(&iommu_memory_region_info);
+    type_register_static(&sparse_ram_handler_info);
 }
 
 type_init(memory_register_types)

From patchwork Thu Sep 24 16:04:19 2020
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 272815
From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Subject: [PATCH PROTOTYPE 2/6] virtio-mem: Implement SparseRAMHandler interface
Date: Thu, 24 Sep 2020 18:04:19 +0200
Message-Id: <20200924160423.106747-3-david@redhat.com>
In-Reply-To: <20200924160423.106747-1-david@redhat.com>
References: <20200924160423.106747-1-david@redhat.com>

Let's properly notify when (un)plugging blocks.
Handle errors from notifiers gracefully when mapping, rolling back the
change and telling the guest that the VM is busy.

Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Wei Yang
Cc: Dr. David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Signed-off-by: David Hildenbrand
---
 hw/virtio/virtio-mem.c         | 158 ++++++++++++++++++++++++++++++++-
 include/hw/virtio/virtio-mem.h |   3 +
 2 files changed, 160 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 8fbec77ccc..e23969eaed 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -72,6 +72,64 @@ static bool virtio_mem_is_busy(void)
     return migration_in_incoming_postcopy() || !migration_is_idle();
 }
 
+static void virtio_mem_srh_notify_unmap(VirtIOMEM *vmem, uint64_t offset,
+                                        uint64_t size)
+{
+    SparseRAMNotifier *notifier;
+
+    QLIST_FOREACH(notifier, &vmem->sram_notify, next) {
+        notifier->notify_unmap(notifier, &vmem->memdev->mr, offset, size);
+    }
+}
+
+static int virtio_mem_srh_notify_map(VirtIOMEM *vmem, uint64_t offset,
+                                     uint64_t size)
+{
+    SparseRAMNotifier *notifier, *notifier2;
+    int ret = 0;
+
+    QLIST_FOREACH(notifier, &vmem->sram_notify, next) {
+        ret = notifier->notify_map(notifier, &vmem->memdev->mr, offset, size);
+        if (ret) {
+            break;
+        }
+    }
+
+    /* In case any notifier failed, undo the whole operation. */
+    if (ret) {
+        QLIST_FOREACH(notifier2, &vmem->sram_notify, next) {
+            if (notifier2 == notifier) {
+                break;
+            }
+            notifier2->notify_unmap(notifier2, &vmem->memdev->mr, offset,
+                                    size);
+        }
+    }
+    return ret;
+}
+
+/*
+ * TODO: Maybe we could notify directly that everything is unmapped/discarded;
+ * at least vfio should be able to deal with that.
+ */
+static void virtio_mem_srh_notify_unplug_all(VirtIOMEM *vmem)
+{
+    unsigned long first_zero_bit, last_zero_bit;
+    uint64_t offset, length;
+
+    /* Find consecutive unplugged blocks and notify */
+    first_zero_bit = find_first_zero_bit(vmem->bitmap, vmem->bitmap_size);
+    while (first_zero_bit < vmem->bitmap_size) {
+        offset = first_zero_bit * vmem->block_size;
+        last_zero_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size,
+                                      first_zero_bit + 1) - 1;
+        length = (last_zero_bit - first_zero_bit + 1) * vmem->block_size;
+
+        virtio_mem_srh_notify_unmap(vmem, offset, length);
+        first_zero_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size,
+                                            last_zero_bit + 2);
+    }
+}
+
 static bool virtio_mem_test_bitmap(VirtIOMEM *vmem, uint64_t start_gpa,
                                    uint64_t size, bool plugged)
 {
@@ -146,7 +204,7 @@ static int virtio_mem_set_block_state(VirtIOMEM *vmem, uint64_t start_gpa,
                                       uint64_t size, bool plug)
 {
     const uint64_t offset = start_gpa - vmem->addr;
-    int ret;
+    int ret, ret2;
 
     if (virtio_mem_is_busy()) {
         return -EBUSY;
@@ -159,6 +217,23 @@ static int virtio_mem_set_block_state(VirtIOMEM *vmem, uint64_t start_gpa,
                          strerror(-ret));
             return -EBUSY;
         }
+        /*
+         * We'll notify *after* discarding succeeded, because we might not be
+         * able to map again ...
+         */
+        virtio_mem_srh_notify_unmap(vmem, offset, size);
+    } else if (virtio_mem_srh_notify_map(vmem, offset, size)) {
+        /*
+         * It could be that a mapping attempt already resulted in memory
+         * getting populated.
+         */
+        ret2 = ram_block_discard_range(vmem->memdev->mr.ram_block, offset,
+                                       size);
+        if (ret2) {
+            error_report("Unexpected error discarding RAM: %s",
+                         strerror(-ret2));
+        }
+        return -EBUSY;
     }
     virtio_mem_set_bitmap(vmem, start_gpa, size, plug);
     return 0;
@@ -253,6 +328,8 @@ static int virtio_mem_unplug_all(VirtIOMEM *vmem)
         error_report("Unexpected error discarding RAM: %s", strerror(-ret));
         return -EBUSY;
     }
+    virtio_mem_srh_notify_unplug_all(vmem);
+
     bitmap_clear(vmem->bitmap, 0, vmem->bitmap_size);
     if (vmem->size) {
         vmem->size = 0;
@@ -480,6 +557,13 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
     vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem));
     qemu_register_reset(virtio_mem_system_reset, vmem);
     precopy_add_notifier(&vmem->precopy_notifier);
+
+    /*
+     * Set it to sparse, so everybody is aware of it before the plug handler
+     * exposes the region to the system.
+     */
+    memory_region_set_sparse_ram_handler(&vmem->memdev->mr,
+                                         SPARSE_RAM_HANDLER(vmem));
 }
 
 static void virtio_mem_device_unrealize(DeviceState *dev)
@@ -487,6 +571,7 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOMEM *vmem = VIRTIO_MEM(dev);
 
+    memory_region_set_sparse_ram_handler(&vmem->memdev->mr, NULL);
     precopy_remove_notifier(&vmem->precopy_notifier);
     qemu_unregister_reset(virtio_mem_system_reset, vmem);
     vmstate_unregister_ram(&vmem->memdev->mr, DEVICE(vmem));
@@ -813,6 +898,7 @@ static void virtio_mem_instance_init(Object *obj)
     vmem->block_size = VIRTIO_MEM_MIN_BLOCK_SIZE;
     notifier_list_init(&vmem->size_change_notifiers);
     vmem->precopy_notifier.notify = virtio_mem_precopy_notify;
+    QLIST_INIT(&vmem->sram_notify);
 
     object_property_add(obj, VIRTIO_MEM_SIZE_PROP, "size", virtio_mem_get_size,
                         NULL, NULL, NULL);
@@ -832,11 +918,72 @@ static Property virtio_mem_properties[] = {
     DEFINE_PROP_END_OF_LIST(),
 };
 
+static uint64_t virtio_mem_srh_get_granularity(const SparseRAMHandler *srh,
+                                               const MemoryRegion *mr)
+{
+    const VirtIOMEM *vmem = VIRTIO_MEM(srh);
+
+    g_assert(mr == &vmem->memdev->mr);
+    return vmem->block_size;
+}
+
+static void virtio_mem_srh_register_listener(SparseRAMHandler *srh,
+                                             const MemoryRegion *mr,
+                                             SparseRAMNotifier *notifier)
+{
+    VirtIOMEM *vmem = VIRTIO_MEM(srh);
+
+    g_assert(mr == &vmem->memdev->mr);
+    QLIST_INSERT_HEAD(&vmem->sram_notify, notifier, next);
+}
+
+static void virtio_mem_srh_unregister_listener(SparseRAMHandler *srh,
+                                               const MemoryRegion *mr,
+                                               SparseRAMNotifier *notifier)
+{
+    VirtIOMEM *vmem = VIRTIO_MEM(srh);
+
+    g_assert(mr == &vmem->memdev->mr);
+    QLIST_REMOVE(notifier, next);
+}
+
+static int virtio_mem_srh_replay_mapped(SparseRAMHandler *srh,
+                                        const MemoryRegion *mr,
+                                        SparseRAMNotifier *notifier)
+{
+    VirtIOMEM *vmem = VIRTIO_MEM(srh);
+    unsigned long first_bit, last_bit;
+    uint64_t offset, length;
+    int ret = 0;
+
+    g_assert(mr == &vmem->memdev->mr);
+
+    /* Find consecutive plugged blocks and notify */
+    first_bit = find_first_bit(vmem->bitmap, vmem->bitmap_size);
+    while (first_bit < vmem->bitmap_size) {
+        offset = first_bit * vmem->block_size;
+        last_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size,
+                                      first_bit + 1) - 1;
+        length = (last_bit - first_bit + 1) * vmem->block_size;
+
+        ret = notifier->notify_map(notifier, mr, offset, length);
+        if (ret) {
+            break;
+        }
+        first_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size,
+                                  last_bit + 2);
+    }
+
+    /* TODO: cleanup on error if necessary. */
+    return ret;
+}
+
 static void virtio_mem_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
     VirtIOMEMClass *vmc = VIRTIO_MEM_CLASS(klass);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_CLASS(klass);
 
     device_class_set_props(dc, virtio_mem_properties);
     dc->vmsd = &vmstate_virtio_mem;
@@ -852,6 +999,11 @@ static void virtio_mem_class_init(ObjectClass *klass, void *data)
     vmc->get_memory_region = virtio_mem_get_memory_region;
     vmc->add_size_change_notifier = virtio_mem_add_size_change_notifier;
     vmc->remove_size_change_notifier = virtio_mem_remove_size_change_notifier;
+
+    srhc->get_granularity = virtio_mem_srh_get_granularity;
+    srhc->register_listener = virtio_mem_srh_register_listener;
+    srhc->unregister_listener = virtio_mem_srh_unregister_listener;
+    srhc->replay_mapped = virtio_mem_srh_replay_mapped;
 }
 
 static const TypeInfo virtio_mem_info = {
@@ -861,6 +1013,10 @@ static const TypeInfo virtio_mem_info = {
     .instance_init = virtio_mem_instance_init,
     .class_init = virtio_mem_class_init,
     .class_size = sizeof(VirtIOMEMClass),
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_SPARSE_RAM_HANDLER },
+        { }
+    },
 };
 
 static void virtio_register_types(void)
diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h
index 4eeb82d5dd..91d9b48ba0 100644
--- a/include/hw/virtio/virtio-mem.h
+++ b/include/hw/virtio/virtio-mem.h
@@ -67,6 +67,9 @@ struct VirtIOMEM {
 
     /* don't migrate unplugged memory */
     NotifierWithReturn precopy_notifier;
+
+    /* SparseRAMNotifier list to be notified on plug/unplug events. */
+    QLIST_HEAD(, SparseRAMNotifier) sram_notify;
 };
 
 struct VirtIOMEMClass {

From patchwork Thu Sep 24 16:04:20 2020
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 304490
From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Subject: [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions
Date: Thu, 24 Sep 2020 18:04:20 +0200
Message-Id: <20200924160423.106747-4-david@redhat.com>
In-Reply-To: <20200924160423.106747-1-david@redhat.com>
References: <20200924160423.106747-1-david@redhat.com>

Implement support for sparse RAM, to be used by virtio-mem. Handling is
somewhat similar to memory_region_is_iommu() handling, which also notifies
on changes. Instead of mapping the whole region, we only map selected
pieces (and unmap previously selected pieces) when notified by the
SparseRAMHandler.

Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Wei Yang
Cc: Dr. David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Signed-off-by: David Hildenbrand
---
 hw/vfio/common.c              | 155 ++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-common.h |  12 +++
 2 files changed, 167 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 13471ae294..a3aaf70dd8 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -37,6 +37,7 @@
 #include "sysemu/reset.h"
 #include "trace.h"
 #include "qapi/error.h"
+#include "qemu/units.h"
 
 VFIOGroupList vfio_group_list =
     QLIST_HEAD_INITIALIZER(vfio_group_list);
@@ -498,6 +499,143 @@ out:
     rcu_read_unlock();
 }
 
+static int vfio_sparse_ram_notify(SparseRAMNotifier *n, const MemoryRegion *mr,
+                                  uint64_t mr_offset, uint64_t size,
+                                  bool map)
+{
+    VFIOSparseRAM *sram = container_of(n, VFIOSparseRAM, notifier);
+    const hwaddr mr_start = MAX(mr_offset, sram->offset_within_region);
+    const hwaddr mr_end = MIN(mr_offset + size,
+                              sram->offset_within_region + sram->size);
+    const hwaddr iova_start = mr_start + sram->offset_within_address_space;
+    const hwaddr iova_end = mr_end + sram->offset_within_address_space;
+    hwaddr mr_cur, iova_cur, mr_next;
+    void *vaddr;
+    int ret, ret2;
+
+    g_assert(mr == sram->mr);
+
+    /* We get notified about everything; ignore ranges we don't care about. */
+    if (mr_start >= mr_end) {
+        return 0;
+    }
+
+    /* Unmap everything with a single call. */
+    if (!map) {
+        ret = vfio_dma_unmap(sram->container, iova_start,
+                             iova_end - iova_start);
+        if (ret) {
+            error_report("%s: vfio_dma_unmap() failed: %s", __func__,
+                         strerror(-ret));
+        }
+        return 0;
+    }
+
+    /* TODO: fail early if we would exceed a specified number of mappings. */
+
+    /* Map in (aligned within MR) granularity, so we can unmap later. */
+    for (mr_cur = mr_start; mr_cur < mr_end; mr_cur = mr_next) {
+        iova_cur = mr_cur + sram->offset_within_address_space;
+        mr_next = QEMU_ALIGN_UP(mr_cur + 1, sram->granularity);
+        mr_next = MIN(mr_next, mr_end);
+
+        vaddr = memory_region_get_ram_ptr(sram->mr) + mr_cur;
+        ret = vfio_dma_map(sram->container, iova_cur, mr_next - mr_cur,
+                           vaddr, mr->readonly);
+        if (ret) {
+            /* Rollback in case of error. */
+            if (mr_cur != mr_start) {
+                ret2 = vfio_dma_unmap(sram->container, iova_start,
+                                      iova_end - iova_start);
+                if (ret2) {
+                    error_report("%s: vfio_dma_unmap() failed: %s", __func__,
+                                 strerror(-ret2));
+                }
+            }
+            return ret;
+        }
+    }
+    return 0;
+}
+
+static int vfio_sparse_ram_notify_map(SparseRAMNotifier *n,
+                                      const MemoryRegion *mr,
+                                      uint64_t mr_offset, uint64_t size)
+{
+    return vfio_sparse_ram_notify(n, mr, mr_offset, size, true);
+}
+
+static void vfio_sparse_ram_notify_unmap(SparseRAMNotifier *n,
+                                         const MemoryRegion *mr,
+                                         uint64_t mr_offset, uint64_t size)
+{
+    vfio_sparse_ram_notify(n, mr, mr_offset, size, false);
+}
+
+static void vfio_register_sparse_ram(VFIOContainer *container,
+                                     MemoryRegionSection *section)
+{
+    VFIOSparseRAM *sram;
+    int ret;
+
+    sram = g_new0(VFIOSparseRAM, 1);
+    sram->container = container;
+    sram->mr = section->mr;
+    sram->offset_within_region = section->offset_within_region;
+    sram->offset_within_address_space = section->offset_within_address_space;
+    sram->size = int128_get64(section->size);
+    sram->granularity = memory_region_sparse_ram_get_granularity(section->mr);
+
+    /*
+     * TODO: We usually want a bigger granularity (for a lot of added memory,
+     * as we need quite a lot of mappings) - however, this has to be
+     * configured by the user.
+     */
+    g_assert(sram->granularity >= 1 * MiB &&
+             is_power_of_2(sram->granularity));
+
+    /* Register the notifier */
+    sparse_ram_notifier_init(&sram->notifier, vfio_sparse_ram_notify_map,
+                             vfio_sparse_ram_notify_unmap);
+    memory_region_register_sparse_ram_notifier(section->mr, &sram->notifier);
+    QLIST_INSERT_HEAD(&container->sram_list, sram, next);
+
+    /*
+     * Replay mapped blocks - if anything goes wrong (only when hotplugging
+     * vfio devices), report the error for now.
+     *
+     * TODO: Can we catch this earlier?
+     */
+    ret = memory_region_sparse_ram_replay_mapped(section->mr, &sram->notifier);
+    if (ret) {
+        error_report("%s: failed to replay mappings: %s", __func__,
+                     strerror(-ret));
+    }
+}
+
+static void vfio_unregister_sparse_ram(VFIOContainer *container,
+                                       MemoryRegionSection *section)
+{
+    VFIOSparseRAM *sram = NULL;
+
+    QLIST_FOREACH(sram, &container->sram_list, next) {
+        if (sram->mr == section->mr &&
+            sram->offset_within_region == section->offset_within_region &&
+            sram->offset_within_address_space ==
+            section->offset_within_address_space) {
+            break;
+        }
+    }
+
+    if (!sram) {
+        hw_error("vfio: Trying to unregister non-existent sparse RAM");
+    }
+
+    memory_region_unregister_sparse_ram_notifier(section->mr, &sram->notifier);
+    QLIST_REMOVE(sram, next);
+    g_free(sram);
+    /* The caller is expected to vfio_dma_unmap(). */
+}
+
 static void vfio_listener_region_add(MemoryListener *listener,
                                      MemoryRegionSection *section)
 {
@@ -650,6 +788,15 @@ static void vfio_listener_region_add(MemoryListener *listener,
 
     /* Here we assume that memory_region_is_ram(section->mr)==true */
 
+    /*
+     * For sparse RAM, we only want to register the actually mapped
+     * pieces - and update the mapping whenever we're notified about changes.
+ */ + if (memory_region_is_sparse_ram(section->mr)) { + vfio_register_sparse_ram(container, section); + return; + } + vaddr = memory_region_get_ram_ptr(section->mr) + section->offset_within_region + (iova - section->offset_within_address_space); @@ -786,6 +933,13 @@ static void vfio_listener_region_del(MemoryListener *listener, pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1; try_unmap = !((iova & pgmask) || (int128_get64(llsize) & pgmask)); + } else if (memory_region_is_sparse_ram(section->mr)) { + vfio_unregister_sparse_ram(container, section); + /* + * We rely on a single vfio_dma_unmap() call below to clean the whole + * region. + */ + try_unmap = true; } if (try_unmap) { @@ -1275,6 +1429,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container->error = NULL; QLIST_INIT(&container->giommu_list); QLIST_INIT(&container->hostwin_list); + QLIST_INIT(&container->sram_list); ret = vfio_init_container(container, group->fd, errp); if (ret) { diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index c78f3ff559..dfa18dbd8e 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -77,6 +77,7 @@ typedef struct VFIOContainer { QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_HEAD(, VFIOGroup) group_list; + QLIST_HEAD(, VFIOSparseRAM) sram_list; QLIST_ENTRY(VFIOContainer) next; } VFIOContainer; @@ -88,6 +89,17 @@ typedef struct VFIOGuestIOMMU { QLIST_ENTRY(VFIOGuestIOMMU) giommu_next; } VFIOGuestIOMMU; +typedef struct VFIOSparseRAM { + VFIOContainer *container; + MemoryRegion *mr; + hwaddr offset_within_region; + hwaddr offset_within_address_space; + hwaddr size; + uint64_t granularity; + SparseRAMNotifier notifier; + QLIST_ENTRY(VFIOSparseRAM) next; +} VFIOSparseRAM; + typedef struct VFIOHostDMAWindow { hwaddr min_iova; hwaddr max_iova; From patchwork Thu Sep 24 16:04:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
From: David Hildenbrand
To: qemu-devel@nongnu.org
Subject: [PATCH PROTOTYPE 4/6] memory: Extend ram_block_discard_(require|disable) by two discard types
Date: Thu, 24 Sep 2020 18:04:21 +0200
Message-Id: <20200924160423.106747-5-david@redhat.com>
In-Reply-To: <20200924160423.106747-1-david@redhat.com>
References: <20200924160423.106747-1-david@redhat.com>
We want to separate two cases:
- ballooning drivers do random discards on random guest memory (e.g.,
  virtio-balloon) - uncoordinated discards
- paravirtualized memory devices do discards in a well-known granularity,
  and always know which block is currently accessible or inaccessible by
  the guest - coordinated discards

This will be required to get virtio-mem + vfio running - vfio still wants
to block random memory ballooning.

Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Wei Yang
Cc: Dr. David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Signed-off-by: David Hildenbrand
---
 exec.c                | 109 ++++++++++++++++++++++++++++++++++--------
 include/exec/memory.h |  36 ++++++++++++--
 2 files changed, 121 insertions(+), 24 deletions(-)

diff --git a/exec.c b/exec.c
index e34b602bdf..83098e9230 100644
--- a/exec.c
+++ b/exec.c
@@ -4098,52 +4098,121 @@ void mtree_print_dispatch(AddressSpaceDispatch *d, MemoryRegion *root)
  * If positive, discarding RAM is disabled. If negative, discarding RAM is
  * required to work and cannot be disabled.
  */
-static int ram_block_discard_disabled;
+static int uncoordinated_discard_disabled;
+static int coordinated_discard_disabled;

-int ram_block_discard_disable(bool state)
+static int __ram_block_discard_disable(int *counter)
 {
     int old;

-    if (!state) {
-        atomic_dec(&ram_block_discard_disabled);
-        return 0;
-    }
-
     do {
-        old = atomic_read(&ram_block_discard_disabled);
+        old = atomic_read(counter);
         if (old < 0) {
             return -EBUSY;
         }
-    } while (atomic_cmpxchg(&ram_block_discard_disabled, old, old + 1) != old);
+    } while (atomic_cmpxchg(counter, old, old + 1) != old);
+
     return 0;
 }

-int ram_block_discard_require(bool state)
+int ram_block_discard_type_disable(RamBlockDiscardType type, bool state)
 {
-    int old;
+    int ret;

-    if (!state) {
-        atomic_inc(&ram_block_discard_disabled);
-        return 0;
+    if (type & RAM_BLOCK_DISCARD_T_UNCOORDINATED) {
+        if (!state) {
+            atomic_dec(&uncoordinated_discard_disabled);
+        } else {
+            ret = __ram_block_discard_disable(&uncoordinated_discard_disabled);
+            if (ret) {
+                return ret;
+            }
+        }
     }
+    if (type & RAM_BLOCK_DISCARD_T_COORDINATED) {
+        if (!state) {
+            atomic_dec(&coordinated_discard_disabled);
+        } else {
+            ret = __ram_block_discard_disable(&coordinated_discard_disabled);
+            if (ret) {
+                /* Rollback the previous change. */
+                if (type & RAM_BLOCK_DISCARD_T_UNCOORDINATED) {
+                    atomic_dec(&uncoordinated_discard_disabled);
+                }
+                return ret;
+            }
+        }
+    }
+    return 0;
+}
+
+static int __ram_block_discard_require(int *counter)
+{
+    int old;

     do {
-        old = atomic_read(&ram_block_discard_disabled);
+        old = atomic_read(counter);
         if (old > 0) {
             return -EBUSY;
         }
-    } while (atomic_cmpxchg(&ram_block_discard_disabled, old, old - 1) != old);
+    } while (atomic_cmpxchg(counter, old, old - 1) != old);
+
+    return 0;
+}
+
+int ram_block_discard_type_require(RamBlockDiscardType type, bool state)
+{
+    int ret;
+
+    if (type & RAM_BLOCK_DISCARD_T_UNCOORDINATED) {
+        if (!state) {
+            atomic_inc(&uncoordinated_discard_disabled);
+        } else {
+            ret = __ram_block_discard_require(&uncoordinated_discard_disabled);
+            if (ret) {
+                return ret;
+            }
+        }
+    }
+    if (type & RAM_BLOCK_DISCARD_T_COORDINATED) {
+        if (!state) {
+            atomic_inc(&coordinated_discard_disabled);
+        } else {
+            ret = __ram_block_discard_require(&coordinated_discard_disabled);
+            if (ret) {
+                /* Rollback the previous change. */
+                if (type & RAM_BLOCK_DISCARD_T_UNCOORDINATED) {
+                    atomic_inc(&uncoordinated_discard_disabled);
+                }
+                return ret;
+            }
+        }
+    }
     return 0;
 }

-bool ram_block_discard_is_disabled(void)
+bool ram_block_discard_type_is_disabled(RamBlockDiscardType type)
 {
-    return atomic_read(&ram_block_discard_disabled) > 0;
+    if (type & RAM_BLOCK_DISCARD_T_UNCOORDINATED &&
+        atomic_read(&uncoordinated_discard_disabled) > 0) {
+        return true;
+    } else if (type & RAM_BLOCK_DISCARD_T_COORDINATED &&
+               atomic_read(&coordinated_discard_disabled) > 0) {
+        return true;
+    }
+    return false;
 }

-bool ram_block_discard_is_required(void)
+bool ram_block_discard_type_is_required(RamBlockDiscardType type)
 {
-    return atomic_read(&ram_block_discard_disabled) < 0;
+    if (type & RAM_BLOCK_DISCARD_T_UNCOORDINATED &&
+        atomic_read(&uncoordinated_discard_disabled) < 0) {
+        return true;
+    } else if (type & RAM_BLOCK_DISCARD_T_COORDINATED &&
+               atomic_read(&coordinated_discard_disabled) < 0) {
+        return true;
+    }
+    return false;
 }
 #endif
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 2931ead730..3169ebc3d9 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -2588,6 +2588,18 @@ static inline MemOp devend_memop(enum device_endian end)
 }
 #endif

+typedef enum RamBlockDiscardType {
+    /* Uncoordinated discards (e.g., virtio-balloon) */
+    RAM_BLOCK_DISCARD_T_UNCOORDINATED = 1,
+    /*
+     * Coordinated discards on selected memory regions (e.g., virtio-mem via
+     * SparseRamNotifier).
+     */
+    RAM_BLOCK_DISCARD_T_COORDINATED = 2,
+    /* Any type of discards */
+    RAM_BLOCK_DISCARD_T_ANY = 3,
+} RamBlockDiscardType;
+
 /*
  * Inhibit technologies that require discarding of pages in RAM blocks, e.g.,
  * to manage the actual amount of memory consumed by the VM (then, the memory
@@ -2609,7 +2621,11 @@ static inline MemOp devend_memop(enum device_endian end)
  * Returns 0 if successful. Returns -EBUSY if a technology that relies on
  * discards to work reliably is active.
  */
-int ram_block_discard_disable(bool state);
+int ram_block_discard_type_disable(RamBlockDiscardType type, bool state);
+static inline int ram_block_discard_disable(bool state)
+{
+    return ram_block_discard_type_disable(RAM_BLOCK_DISCARD_T_ANY, state);
+}

 /*
  * Inhibit technologies that disable discarding of pages in RAM blocks.
  *
  * Returns 0 if successful. Returns -EBUSY if discards are already set to
  * broken.
  */
@@ -2617,17 +2633,29 @@ int ram_block_discard_disable(bool state);
-int ram_block_discard_require(bool state);
+int ram_block_discard_type_require(RamBlockDiscardType type, bool state);
+static inline int ram_block_discard_require(bool state)
+{
+    return ram_block_discard_type_require(RAM_BLOCK_DISCARD_T_ANY, state);
+}

 /*
  * Test if discarding of memory in ram blocks is disabled.
  */
-bool ram_block_discard_is_disabled(void);
+bool ram_block_discard_type_is_disabled(RamBlockDiscardType type);
+static inline bool ram_block_discard_is_disabled(void)
+{
+    return ram_block_discard_type_is_disabled(RAM_BLOCK_DISCARD_T_ANY);
+}

 /*
  * Test if discarding of memory in ram blocks is required to work reliably.
  */
-bool ram_block_discard_is_required(void);
+bool ram_block_discard_type_is_required(RamBlockDiscardType type);
+static inline bool ram_block_discard_is_required(void)
+{
+    return ram_block_discard_type_is_required(RAM_BLOCK_DISCARD_T_ANY);
+}

 #endif

From patchwork Thu Sep 24 16:04:22 2020
From: David Hildenbrand
To: qemu-devel@nongnu.org
Subject: [PATCH PROTOTYPE 5/6] virtio-mem: Require only RAM_BLOCK_DISCARD_T_COORDINATED discards
Date: Thu, 24 Sep 2020 18:04:22 +0200
Message-Id: <20200924160423.106747-6-david@redhat.com>
In-Reply-To: <20200924160423.106747-1-david@redhat.com>
References: <20200924160423.106747-1-david@redhat.com>

We implement the SparseRamHandler interface and properly communicate
changes by notifying listeners, especially:
- when memory becomes usable by the guest
- when memory becomes unusable by the guest (and we discard memory)

Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Wei Yang
Cc: Dr.
David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Signed-off-by: David Hildenbrand
---
 hw/virtio/virtio-mem.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index e23969eaed..efeff7c64c 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -531,7 +531,7 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
         return;
     }

-    if (ram_block_discard_require(true)) {
+    if (ram_block_discard_type_require(RAM_BLOCK_DISCARD_T_COORDINATED, true)) {
         error_setg(errp, "Discarding RAM is disabled");
         return;
     }
@@ -539,7 +539,7 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
     ret = ram_block_discard_range(rb, 0, qemu_ram_get_used_length(rb));
     if (ret) {
         error_setg_errno(errp, -ret, "Unexpected error discarding RAM");
-        ram_block_discard_require(false);
+        ram_block_discard_type_require(RAM_BLOCK_DISCARD_T_COORDINATED, false);
         return;
     }
@@ -579,7 +579,7 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
     virtio_del_queue(vdev, 0);
     virtio_cleanup(vdev);
     g_free(vmem->bitmap);
-    ram_block_discard_require(false);
+    ram_block_discard_type_require(RAM_BLOCK_DISCARD_T_COORDINATED, false);
 }

 static int virtio_mem_restore_unplugged(VirtIOMEM *vmem)

From patchwork Thu Sep 24 16:04:23 2020
From: David Hildenbrand
To: qemu-devel@nongnu.org
Subject: [PATCH PROTOTYPE 6/6] vfio: Disable only RAM_BLOCK_DISCARD_T_UNCOORDINATED discards
Date: Thu, 24 Sep 2020 18:04:23 +0200
Message-Id: <20200924160423.106747-7-david@redhat.com>
In-Reply-To: <20200924160423.106747-1-david@redhat.com>
References: <20200924160423.106747-1-david@redhat.com>

This unlocks virtio-mem with vfio. A virtio-mem device properly notifies
about all accessible/mapped blocks inside a managed memory region -
whenever blocks become accessible and whenever blocks become inaccessible.

Note: The block size of a virtio-mem device has to be set to sane sizes,
depending on the maximum hotplug size - to not run out of vfio mappings.
The default virtio-mem block size is usually in the range of a couple of
MBs. Linux kernels (x86-64) don't support block sizes > 128 MB with an
initial memory size of < 64 MB - and above that only in some cases 2 GB.
The larger the blocks, the less likely that a lot of memory can get
unplugged again. The smaller the blocks, the slower memory hot(un)plug
will be. Assume you want to hotplug 256 GB - the block size would have to
be at least 8 MB (resulting in 32768 distinct mappings). It's expected
that the block size will be comparatively large when virtio-mem is used
with vfio in the future (e.g., 128 MB, 1 GB, 2 GB) - something Linux
guests will have to be optimized for.

Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Wei Yang
Cc: Dr. David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Signed-off-by: David Hildenbrand
---
 hw/vfio/common.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index a3aaf70dd8..4d82296967 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1392,8 +1392,12 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
      * new memory, it will not yet set ram_block_discard_set_required() and
      * therefore, neither stops us here or deals with the sudden memory
      * consumption of inflated memory.
+     *
+     * We do support discarding for memory regions where accessible pieces
+     * are coordinated via the SparseRAMNotifier.
      */
-    ret = ram_block_discard_disable(true);
+    ret = ram_block_discard_type_disable(RAM_BLOCK_DISCARD_T_UNCOORDINATED,
+                                         true);
     if (ret) {
         error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken");
         return ret;
     }
@@ -1564,7 +1568,7 @@ close_fd_exit:
     close(fd);

 put_space_exit:
-    ram_block_discard_disable(false);
+    ram_block_discard_type_disable(RAM_BLOCK_DISCARD_T_UNCOORDINATED, false);
     vfio_put_address_space(space);
     return ret;
@@ -1686,7 +1690,8 @@ void vfio_put_group(VFIOGroup *group)
     }

     if (!group->ram_block_discard_allowed) {
-        ram_block_discard_disable(false);
+        ram_block_discard_type_disable(RAM_BLOCK_DISCARD_T_UNCOORDINATED,
+                                       false);
     }
     vfio_kvm_device_del_group(group);
     vfio_disconnect_container(group);
@@ -1740,7 +1745,8 @@ int vfio_get_device(VFIOGroup *group, const char *name,
     if (!group->ram_block_discard_allowed) {
         group->ram_block_discard_allowed = true;
-        ram_block_discard_disable(false);
+        ram_block_discard_type_disable(RAM_BLOCK_DISCARD_T_UNCOORDINATED,
+                                       false);
     }
 }