Message ID | 25a922038d256e47f3eb99683c5e3bd9c34753ac.1612747873.git.haibo.xu@linaro.org
---|---
State | Superseded |
Series | target/arm: Add MTE support to KVM guest
On 2/7/21 7:20 PM, Haibo Xu wrote:
> +    if (kvm_physical_memory_addr_from_host(kvm_state, addr, &ipa)) {
> +        /* Buffer for the page tags(one byte per tag) */
> +        tag_buf = g_try_malloc0(size);
> +        if (!tag_buf) {
> +            error_report("%s: Error allocating MTE tag_buf", __func__);
> +            return 0;
> +        }
> +
> +        if (kvm_arm_mte_get_tags(ipa, TARGET_PAGE_SIZE, tag_buf) < 0) {
> +            error_report("%s: Can't get MTE tags from guest", __func__);
> +            g_free(tag_buf);
> +            return 0;
> +        }
> +
> +        qemu_put_buffer(f, tag_buf, size);
> +
> +        g_free(tag_buf);
> +
> +        return size;
> +    }

So, in patch 2 you disabled the allocation of tag-memory.  Now you're allocating new memory (admittedly quite a small amount -- 1/16th of a page, small enough to just be a local variable).

Why don't you allocate tag-memory, copy the data into it, and then let migration proceed as normal.  Then you don't have to have a random data block that happens to follow each ram page.

I'm concerned that what you're doing here makes it impossible to migrate between kvm and tcg.

r~
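For scale: with 4 KiB target pages and the 16-byte MTE granule, the buffer in question is TARGET_PAGE_SIZE / MTE_GRANULE_SIZE = 4096 / 16 = 256 bytes per page. A minimal sketch of the local-variable variant suggested above, reusing the kvm_arm_mte_get_tags() helper introduced by this series (the refactoring itself is illustrative, not code from the posted patch):

    static int save_normal_page_mte_tags(QEMUFile *f, uint8_t *addr)
    {
        /*
         * One tag byte per 16-byte granule: 256 bytes for a 4 KiB page.
         * Becomes a VLA if TARGET_PAGE_SIZE is not a compile-time constant.
         */
        uint8_t tag_buf[TARGET_PAGE_SIZE / MTE_GRANULE_SIZE];
        uint64_t ipa;

        if (!kvm_physical_memory_addr_from_host(kvm_state, addr, &ipa)) {
            return 0;   /* not backed by guest RAM known to KVM */
        }

        if (kvm_arm_mte_get_tags(ipa, TARGET_PAGE_SIZE, tag_buf) < 0) {
            error_report("%s: Can't get MTE tags from guest", __func__);
            return 0;
        }

        qemu_put_buffer(f, tag_buf, sizeof(tag_buf));
        return sizeof(tag_buf);
    }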
On Tue, 16 Feb 2021 at 23:31, Richard Henderson <richard.henderson@linaro.org> wrote:
>
> On 2/7/21 7:20 PM, Haibo Xu wrote:
> > +    if (kvm_physical_memory_addr_from_host(kvm_state, addr, &ipa)) {
> > +        /* Buffer for the page tags(one byte per tag) */
> > +        tag_buf = g_try_malloc0(size);
> > +        if (!tag_buf) {
> > +            error_report("%s: Error allocating MTE tag_buf", __func__);
> > +            return 0;
> > +        }
> > +
> > +        if (kvm_arm_mte_get_tags(ipa, TARGET_PAGE_SIZE, tag_buf) < 0) {
> > +            error_report("%s: Can't get MTE tags from guest", __func__);
> > +            g_free(tag_buf);
> > +            return 0;
> > +        }
> > +
> > +        qemu_put_buffer(f, tag_buf, size);
> > +
> > +        g_free(tag_buf);
> > +
> > +        return size;
> > +    }
>
> So, in patch 2 you disabled the allocation of tag-memory.  Now you're allocating new memory (admittedly quite a small amount -- 1/16th of a page, small enough to just be a local variable).
>

Hi Richard! Thanks so much for the comments!

Yes, the memory allocated here is only temporary tag storage. As you mentioned, it could be defined as a local variable and used as the temporary tag buffer.

> Why don't you allocate tag-memory, copy the data into it, and then let migration proceed as normal.  Then you don't have to have a random data block that happens to follow each ram page.
>

As I mentioned in the cover letter, the reason to let the tags go together with the memory data is to make it easier to keep them in sync. I think if we migrate them separately, it would be hard to keep the tags in sync with the data.

Say we migrate all the data first, then the tags. If the data gets dirtied during the migration of the tag memory, we may need to send the data again, or freeze the source VM after the data migration. What's more, the KVM_GET_DIRTY_LOG API may not be able to differentiate between tag changes and data changes.

I'm not sure whether the aforementioned situation actually exists. Please correct me if I've got something wrong!

> I'm concerned that what you're doing here makes it impossible to migrate between kvm and tcg.
>

You mean migrating from a KVM-mode VM to a TCG-mode VM?

> r~

Regards,
Haibo
On 2/22/21 1:46 AM, Haibo Xu wrote:
> As I mentioned in the cover letter, the reason to let the tags go together with the memory data is to make it easier to keep them in sync. I think if we migrate them separately, it would be hard to keep the tags in sync with the data.

Well, maybe, maybe not.  See below.

> Say we migrate all the data first, then the tags. If the data gets dirtied during the migration of the tag memory, we may need to send the data again, or freeze the source VM after the data migration. What's more, the KVM_GET_DIRTY_LOG API may not be able to differentiate between tag changes and data changes.

I would certainly expect KVM_GET_DIRTY_LOG to only care about the normal memory.  That is, pages as viewed by the guest.

I would expect the separate tag_memory block to be private to the host.  If a normal page is dirty, then we would read the tags into the tag_memory and manually mark that dirty.  Migration would continue as normal, and eventually both normal and tag memory would all be clean and migrated.

But I'll admit that it does require that we retain a buffer 1/16 the size of main memory, which is otherwise unused, and thus this is less than ideal.  So if we do it your way, we should arrange for tcg to migrate the tag data in the same way.

I'll still wait for migration experts, of which I am not one.

r~
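One way to read the tag-memory suggestion above, as a hedged sketch: assume each normal RAMBlock has a host-private companion block, tag_ram, holding one tag byte per 16-byte granule. The helper and its names below are hypothetical; only kvm_arm_mte_get_tags() comes from this series, the rest is existing QEMU API:

    /*
     * Pull the MTE tags of one dirty guest page into the host-private tag
     * block and mark the matching tag bytes dirty, so the ordinary RAM
     * migration loop resends them on a later pass.
     */
    static void sync_page_tags_to_tag_memory(RAMBlock *block, RAMBlock *tag_ram,
                                             ram_addr_t page_offset)
    {
        size_t tag_bytes = TARGET_PAGE_SIZE / MTE_GRANULE_SIZE;
        uint8_t *host = block->host + page_offset;
        uint8_t *tag_dst = tag_ram->host + page_offset / MTE_GRANULE_SIZE;
        uint64_t ipa;

        if (!kvm_physical_memory_addr_from_host(kvm_state, host, &ipa)) {
            return;
        }

        /* assumes kvm_arm_mte_get_tags() returns 0 on success, as in the series */
        if (kvm_arm_mte_get_tags(ipa, TARGET_PAGE_SIZE, tag_dst) == 0) {
            /* dirty the tag bytes so normal dirty tracking migrates them */
            memory_region_set_dirty(tag_ram->mr, page_offset / MTE_GRANULE_SIZE,
                                    tag_bytes);
        }
    }

The tag block would then be migrated by the ordinary dirty-page machinery, which is also what would make the same stream layout usable by TCG.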
++ more migration experts!

On Tue, 23 Feb 2021 at 06:47, Richard Henderson <richard.henderson@linaro.org> wrote:
>
> On 2/22/21 1:46 AM, Haibo Xu wrote:
> > As I mentioned in the cover letter, the reason to let the tags go together with the memory data is to make it easier to keep them in sync. I think if we migrate them separately, it would be hard to keep the tags in sync with the data.
>
> Well, maybe, maybe not.  See below.
>
> > Say we migrate all the data first, then the tags. If the data gets dirtied during the migration of the tag memory, we may need to send the data again, or freeze the source VM after the data migration. What's more, the KVM_GET_DIRTY_LOG API may not be able to differentiate between tag changes and data changes.
>
> I would certainly expect KVM_GET_DIRTY_LOG to only care about the normal memory.  That is, pages as viewed by the guest.
>
> I would expect the separate tag_memory block to be private to the host.  If a normal page is dirty, then we would read the tags into the tag_memory and manually mark that dirty.  Migration would continue as normal, and eventually both normal and tag memory would all be clean and migrated.
>
> But I'll admit that it does require that we retain a buffer 1/16 the size of main memory, which is otherwise unused, and thus this is less than ideal.  So if we do it your way, we should arrange for tcg to migrate the tag data in the same way.
>
> I'll still wait for migration experts, of which I am not one.
>
> r~
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 36fcb29641..6182b3e1d1 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -165,6 +165,8 @@ struct VirtMachineState {
     DeviceState *acpi_dev;
     Notifier powerdown_notifier;
     PCIBus *bus;
+    /* migrate memory tags */
+    NotifierWithReturn precopy_notifier;
 };
 
 #define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index bccc1b6b44..005133f471 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -38,6 +38,7 @@ void precopy_add_notifier(NotifierWithReturn *n);
 void precopy_remove_notifier(NotifierWithReturn *n);
 int precopy_notify(PrecopyNotifyReason reason, Error **errp);
 void precopy_enable_free_page_optimization(void);
+void precopy_enable_metadata_migration(void);
 
 void ram_mig_init(void);
 void qemu_guest_free_page_hint(void *addr, size_t len);
diff --git a/migration/ram.c b/migration/ram.c
index 7811cde643..32f764680d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -53,9 +53,11 @@
 #include "block.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/cpu-throttle.h"
+#include "sysemu/kvm.h"
 #include "savevm.h"
 #include "qemu/iov.h"
 #include "multifd.h"
+#include "kvm_arm.h"
 
 /***********************************************************/
 /* ram save/restore */
@@ -75,6 +77,9 @@
 #define RAM_SAVE_FLAG_XBZRLE   0x40
 /* 0x80 is reserved in migration.h start with 0x100 next */
 #define RAM_SAVE_FLAG_COMPRESS_PAGE    0x100
+#define RAM_SAVE_FLAG_MTE              0x200
+
+#define MTE_GRANULE_SIZE   (16)
 
 static inline bool is_zero_range(uint8_t *p, uint64_t size)
 {
@@ -310,6 +315,8 @@ struct RAMState {
     bool ram_bulk_stage;
     /* The free page optimization is enabled */
     bool fpo_enabled;
+    /* The RAM meta data(e.t memory tag) is enabled */
+    bool metadata_enabled;
     /* How many times we have dirty too many pages */
     int dirty_rate_high_cnt;
     /* these variables are used for bitmap sync */
@@ -387,6 +394,15 @@ void precopy_enable_free_page_optimization(void)
     ram_state->fpo_enabled = true;
 }
 
+void precopy_enable_metadata_migration(void)
+{
+    if (!ram_state) {
+        return;
+    }
+
+    ram_state->metadata_enabled = true;
+}
+
 uint64_t ram_bytes_remaining(void)
 {
     return ram_state ? (ram_state->migration_dirty_pages * TARGET_PAGE_SIZE) :
@@ -1127,6 +1143,61 @@ static bool control_save_page(RAMState *rs, RAMBlock *block, ram_addr_t offset,
     return true;
 }
 
+static int save_normal_page_mte_tags(QEMUFile *f, uint8_t *addr)
+{
+    uint8_t *tag_buf = NULL;
+    uint64_t ipa;
+    int size = TARGET_PAGE_SIZE / MTE_GRANULE_SIZE;
+
+    if (kvm_physical_memory_addr_from_host(kvm_state, addr, &ipa)) {
+        /* Buffer for the page tags(one byte per tag) */
+        tag_buf = g_try_malloc0(size);
+        if (!tag_buf) {
+            error_report("%s: Error allocating MTE tag_buf", __func__);
+            return 0;
+        }
+
+        if (kvm_arm_mte_get_tags(ipa, TARGET_PAGE_SIZE, tag_buf) < 0) {
+            error_report("%s: Can't get MTE tags from guest", __func__);
+            g_free(tag_buf);
+            return 0;
+        }
+
+        qemu_put_buffer(f, tag_buf, size);
+
+        g_free(tag_buf);
+
+        return size;
+    }
+
+    return 0;
+}
+
+static void load_normal_page_mte_tags(QEMUFile *f, uint8_t *addr)
+{
+    uint8_t *tag_buf = NULL;
+    uint64_t ipa;
+    int size = TARGET_PAGE_SIZE / MTE_GRANULE_SIZE;
+
+    if (kvm_physical_memory_addr_from_host(kvm_state, addr, &ipa)) {
+        /* Buffer for the page tags(one byte per tag) */
+        tag_buf = g_try_malloc0(size);
+        if (!tag_buf) {
+            error_report("%s: Error allocating MTE tag_buf", __func__);
+            return;
+        }
+
+        qemu_get_buffer(f, tag_buf, size);
+        if (kvm_arm_mte_set_tags(ipa, TARGET_PAGE_SIZE, tag_buf) < 0) {
+            error_report("%s: Can't set MTE tags to guest", __func__);
+        }
+
+        g_free(tag_buf);
+    }
+
+    return;
+}
+
 /*
  * directly send the page to the stream
  *
@@ -1141,6 +1212,10 @@ static bool control_save_page(RAMState *rs, RAMBlock *block, ram_addr_t offset,
 static int save_normal_page(RAMState *rs, RAMBlock *block, ram_addr_t offset,
                             uint8_t *buf, bool async)
 {
+    if (rs->metadata_enabled) {
+        offset |= RAM_SAVE_FLAG_MTE;
+    }
+
     ram_counters.transferred += save_page_header(rs, rs->f, block,
                                                  offset | RAM_SAVE_FLAG_PAGE);
     if (async) {
@@ -1152,6 +1227,11 @@ static int save_normal_page(RAMState *rs, RAMBlock *block, ram_addr_t offset,
     }
     ram_counters.transferred += TARGET_PAGE_SIZE;
     ram_counters.normal++;
+
+    if (rs->metadata_enabled) {
+        ram_counters.transferred += save_normal_page_mte_tags(rs->f, buf);
+    }
+
     return 1;
 }
 
@@ -1905,6 +1985,7 @@ static void ram_state_reset(RAMState *rs)
     rs->last_version = ram_list.version;
     rs->ram_bulk_stage = true;
     rs->fpo_enabled = false;
+    rs->metadata_enabled = false;
 }
 
 #define MAX_WAIT 50 /* ms, half buffered_file limit */
@@ -3492,7 +3573,7 @@ static int ram_load_precopy(QEMUFile *f)
             trace_ram_load_loop(block->idstr, (uint64_t)addr, flags, host);
         }
 
-        switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
+        switch (flags & ~(RAM_SAVE_FLAG_CONTINUE | RAM_SAVE_FLAG_MTE)) {
         case RAM_SAVE_FLAG_MEM_SIZE:
             /* Synchronize RAM block list */
             total_ram_bytes = addr;
@@ -3562,6 +3643,9 @@ static int ram_load_precopy(QEMUFile *f)
 
         case RAM_SAVE_FLAG_PAGE:
             qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
+            if (flags & RAM_SAVE_FLAG_MTE) {
+                load_normal_page_mte_tags(f, host);
+            }
             break;
 
         case RAM_SAVE_FLAG_COMPRESS_PAGE:
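The diffstat below only touches virt.h, misc.h and ram.c, so the code that registers the new precopy_notifier and calls precopy_enable_metadata_migration() presumably lives in another patch of the series. A plausible, purely illustrative shape for that wiring, assuming it only needs to flip the RAM-state flag at setup time (the notifier body and the point where MTE use is known are assumptions; precopy_add_notifier(), PRECOPY_NOTIFY_SETUP and PrecopyNotifyData are existing migration API):

    static int virt_mte_precopy_notify(NotifierWithReturn *n, void *data)
    {
        PrecopyNotifyData *pnd = data;

        if (pnd->reason == PRECOPY_NOTIFY_SETUP) {
            /* ask migration/ram.c to append MTE tags to every saved page */
            precopy_enable_metadata_migration();
        }
        return 0;
    }

    /* ... at machine init, once MTE is known to be in use: */
    vms->precopy_notifier.notify = virt_mte_precopy_notify;
    precopy_add_notifier(&vms->precopy_notifier);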
To make it easier to keep the page tags in sync with the page data, tags for one page are appended to the data during the ram save iteration.

This patch only adds pre-copy migration support. Post-copy and compress, as well as zero page saving, are not supported yet.

Signed-off-by: Haibo Xu <haibo.xu@linaro.org>
---
 include/hw/arm/virt.h    |  2 +
 include/migration/misc.h |  1 +
 migration/ram.c          | 86 +++++++++++++++++++++++++++++++++++++++-
 3 files changed, 88 insertions(+), 1 deletion(-)

-- 
2.17.1
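In terms of the stream produced by the ram.c changes, each normal page record simply gains a trailing tag blob when RAM_SAVE_FLAG_MTE is set; assuming 4 KiB target pages, a saved page looks roughly like:

    [ page header: addr with RAM_SAVE_FLAG_PAGE and RAM_SAVE_FLAG_MTE set ]
    [ 4096 bytes of page data ]
    [ 256 bytes of tags, one per 16-byte granule ]

The loader masks RAM_SAVE_FLAG_MTE out of the switch in ram_load_precopy() and uses it only to decide whether the extra tag bytes follow the page data.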