[PULL,00/10] migration queue

Message ID 20201008191046.272549-1-dgilbert@redhat.com

Message

Dr. David Alan Gilbert Oct. 8, 2020, 7:10 p.m. UTC
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

The following changes since commit e64cf4d569f6461d6b9072e00d6e78d0ab8bd4a7:

  Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20201008' into staging (2020-10-08 17:18:46 +0100)

are available in the Git repository at:

  git://github.com/dagrh/qemu.git tags/pull-migration-20201008a

for you to fetch changes up to ee02b58c82749adb486ef2ae7efdc8e05b093cfd:

  migration/dirtyrate: present dirty rate only when querying the rate has completed (2020-10-08 19:57:00 +0100)

----------------------------------------------------------------
Migration and virtiofs pull 2020-10-08

v2 (from yesterday's)
  Updated types in comparison to fix mingw build
  rebased

-
Migration:
  Dirtyrate measurement API cleanup
  Postcopy recovery fixes

Virtiofsd:
  Missing qemu_init_exec_dir call
  Support for setting the group on socket creation
  Stop a gcc warning
  Avoid tempdir in sandboxing

----------------------------------------------------------------
Alex Bennée (1):
      tools/virtiofsd: add support for --socket-group

Chuan Zheng (2):
      migration/dirtyrate: record start_time and calc_time while at the measuring state
      migration/dirtyrate: present dirty rate only when querying the rate has completed

Dr. David Alan Gilbert (2):
      virtiofsd: Silence gcc warning
      virtiofsd: Call qemu_init_exec_dir

Peter Xu (4):
      migration: Pass incoming state into qemu_ufd_copy_ioctl()
      migration: Introduce migrate_send_rp_message_req_pages()
      migration: Maintain postcopy faulted addresses
      migration: Sync requested pages after postcopy recovery

Stefan Hajnoczi (1):
      virtiofsd: avoid /proc/self/fd tempdir

 docs/tools/virtiofsd.rst         |  4 +++
 migration/dirtyrate.c            | 16 ++++++-----
 migration/migration.c            | 49 ++++++++++++++++++++++++++++++++--
 migration/migration.h            | 21 ++++++++++++++-
 migration/postcopy-ram.c         | 25 +++++++++++++-----
 migration/savevm.c               | 57 ++++++++++++++++++++++++++++++++++++++++
 migration/trace-events           |  3 +++
 qapi/migration.json              |  8 +++---
 tools/virtiofsd/fuse_i.h         |  1 +
 tools/virtiofsd/fuse_lowlevel.c  |  6 +++++
 tools/virtiofsd/fuse_virtio.c    | 21 +++++++++++++--
 tools/virtiofsd/passthrough_ll.c | 38 ++++++++++-----------------
 12 files changed, 203 insertions(+), 46 deletions(-)

Comments

Peter Maydell Oct. 11, 2020, 6:29 p.m. UTC | #1
On Thu, 8 Oct 2020 at 20:13, Dr. David Alan Gilbert (git)
<dgilbert@redhat.com> wrote:
>
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> The following changes since commit e64cf4d569f6461d6b9072e00d6e78d0ab8bd4a7:
>
>   Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20201008' into staging (2020-10-08 17:18:46 +0100)
>
> are available in the Git repository at:
>
>   git://github.com/dagrh/qemu.git tags/pull-migration-20201008a
>
> for you to fetch changes up to ee02b58c82749adb486ef2ae7efdc8e05b093cfd:
>
>   migration/dirtyrate: present dirty rate only when querying the rate has completed (2020-10-08 19:57:00 +0100)
>
> ----------------------------------------------------------------
> Migration and virtiofs pull 2020-10-08
>
> v2 (from yesterday's)
>   Updated types in comparison to fix mingw build
>   rebased
>
> -
> Migration:
>   Dirtyrate measurement API cleanup
>   Postcopy recovery fixes
>
> Virtiofsd:
>   Missing qemu_init_exec_dir call
>   Support for setting the group on socket creation
>   Stop a gcc warning
>   Avoid tempdir in sandboxing


This seems to hang in 'make check' trying to run
tests/qtest/migration-test on s390x and ppc, ie
the big-endian hosts.

thanks
-- PMM
Dr. David Alan Gilbert Oct. 12, 2020, 8:12 a.m. UTC | #2
* Peter Maydell (peter.maydell@linaro.org) wrote:
> On Thu, 8 Oct 2020 at 20:13, Dr. David Alan Gilbert (git)
> <dgilbert@redhat.com> wrote:
> >
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >
> > The following changes since commit e64cf4d569f6461d6b9072e00d6e78d0ab8bd4a7:
> >
> >   Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20201008' into staging (2020-10-08 17:18:46 +0100)
> >
> > are available in the Git repository at:
> >
> >   git://github.com/dagrh/qemu.git tags/pull-migration-20201008a
> >
> > for you to fetch changes up to ee02b58c82749adb486ef2ae7efdc8e05b093cfd:
> >
> >   migration/dirtyrate: present dirty rate only when querying the rate has completed (2020-10-08 19:57:00 +0100)
> >
> > ----------------------------------------------------------------
> > Migration and virtiofs pull 2020-10-08
> >
> > v2 (from yesterday's)
> >   Updated types in comparison to fix mingw build
> >   rebased
> >
> > -
> > Migration:
> >   Dirtyrate measurement API cleanup
> >   Postcopy recovery fixes
> >
> > Virtiofsd:
> >   Missing qemu_init_exec_dir call
> >   Support for setting the group on socket creation
> >   Stop a gcc warning
> >   Avoid tempdir in sandboxing
> 
> This seems to hang in 'make check' trying to run
> tests/qtest/migration-test on s390x and ppc, ie
> the big-endian hosts.

OK, I'll give it a try.

Dave

> thanks
> -- PMM
>
Peter Xu Oct. 14, 2020, 8:01 p.m. UTC | #3
On Sun, Oct 11, 2020 at 07:29:25PM +0100, Peter Maydell wrote:
> > Migration:
> >   Dirtyrate measurement API cleanup
> >   Postcopy recovery fixes
> >
> > Virtiofsd:
> >   Missing qemu_init_exec_dir call
> >   Support for setting the group on socket creation
> >   Stop a gcc warning
> >   Avoid tempdir in sandboxing
> 
> This seems to hang in 'make check' trying to run
> tests/qtest/migration-test on s390x and ppc, ie
> the big-endian hosts.


Hi, Peter,

Do you know what's the page size on both platforms?

Asking because after debugging I do see a bug in one of the patches; however,
it's not about endianness but page size.  Something like:

-------8<----------
diff --git a/migration/migration.c b/migration/migration.c
index d8a5c0de44..ca18b1cf17 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -370,7 +370,7 @@ int migrate_send_rp_message_req_pages(MigrationIncomingState *mis,
 int migrate_send_rp_req_pages(MigrationIncomingState *mis,
                               RAMBlock *rb, ram_addr_t start, uint64_t haddr)
 {
-    void *aligned = (void *)(uintptr_t)(haddr & (-qemu_target_page_size()));
+    void *aligned = (void *)(uintptr_t)(haddr & qemu_real_host_page_mask);
     bool received;
 
     WITH_QEMU_LOCK_GUARD(&mis->page_request_mutex) {
-------8<----------

When I reproduced this issue I was running x86_64 guests on ppc64be hosts (which
have host page size == 64k).  The above fix works for me there.  So I want to
confirm with you on the s390x failure you mentioned.  If both of them have
special page sizes then that's probably it (AFAICT, the bug could trigger when
the guest page size is smaller than the host page size).  But I'd like to double
check with you before I repost, just in case there's another bug hidden.
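
For illustration only (this is not code from the series): a minimal sketch of
why the two alignments can disagree when the guest page size is smaller than
the host page size.  The function names are hypothetical, and the 4k/64k sizes
in the usage note are assumptions taken from the x86_64-on-ppc64be case above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round addr down to a multiple of page_size (must be a power of two). */
static uint64_t page_align_down(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);   /* same mask as -page_size */
}

/*
 * True when aligning haddr to the guest page size yields a different
 * address than aligning it to the host page size, i.e. the guest-aligned
 * address is not on a host page boundary.
 */
static bool guest_alignment_misses_host_page(uint64_t haddr,
                                             uint64_t guest_page_size,
                                             uint64_t host_page_size)
{
    return page_align_down(haddr, guest_page_size) !=
           page_align_down(haddr, host_page_size);
}
```

With a 4k guest page and a 64k host page, a fault at 0x12345 aligns to 0x12000
by guest page size but to 0x10000 by host page size, so the two disagree; when
both sizes are 4k they always agree.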

I'm also trying to find an s390x host to give it a shot.  However, I decided to
also ask out loud, since that might be even faster.

Thanks,

-- 
Peter Xu
Thomas Huth Oct. 15, 2020, 9:13 a.m. UTC | #4
On 14/10/2020 22.01, Peter Xu wrote:
> On Sun, Oct 11, 2020 at 07:29:25PM +0100, Peter Maydell wrote:
>>> Migration:
>>>   Dirtyrate measurement API cleanup
>>>   Postcopy recovery fixes
>>>
>>> Virtiofsd:
>>>   Missing qemu_init_exec_dir call
>>>   Support for setting the group on socket creation
>>>   Stop a gcc warning
>>>   Avoid tempdir in sandboxing
>>
>> This seems to hang in 'make check' trying to run
>> tests/qtest/migration-test on s390x and ppc, ie
>> the big-endian hosts.
> 
> Hi, Peter,
> 
> Do you know what's the page size on both platforms?


s390x uses 4k page size by default. Only huge-pages are different.
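
As an aside, one way to check the base page size on a given host is the POSIX
sysconf() call; a small sketch follows (the wrapper name host_page_size is
made up for this example).  It reports the base page size only, not any
huge-page size.

```c
#include <unistd.h>

/* Return the host's base page size in bytes, or -1 on error. */
static long host_page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}
```

Printing the result of host_page_size() on each test host would show directly
whether its base page size differs from the guest's.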

> I'm also trying to find an s390x host to give it a shot.  However, I decided to
> also ask out loud, since that might be even faster.


The easiest way to test on s390x is likely to use Travis. If you already have a
GitHub or GitLab account, you can simply clone the qemu repository there and
add Travis (from the Marketplace in GitHub; not sure exactly how it works
with GitLab) to your cloned repo. If you then push commits to a branch,
Travis should trigger automatically, including runs on s390x, see e.g.:

 https://travis-ci.com/github/huth/qemu/jobs/399317194

 HTH,
  Thomas
Peter Xu Oct. 15, 2020, 6:58 p.m. UTC | #5
On Thu, Oct 15, 2020 at 11:13:44AM +0200, Thomas Huth wrote:
> On 14/10/2020 22.01, Peter Xu wrote:
> > On Sun, Oct 11, 2020 at 07:29:25PM +0100, Peter Maydell wrote:
> >>> Migration:
> >>>   Dirtyrate measurement API cleanup
> >>>   Postcopy recovery fixes
> >>>
> >>> Virtiofsd:
> >>>   Missing qemu_init_exec_dir call
> >>>   Support for setting the group on socket creation
> >>>   Stop a gcc warning
> >>>   Avoid tempdir in sandboxing
> >>
> >> This seems to hang in 'make check' trying to run
> >> tests/qtest/migration-test on s390x and ppc, ie
> >> the big-endian hosts.
> > 
> > Hi, Peter,
> > 
> > Do you know what's the page size on both platforms?
> 
> s390x uses 4k page size by default. Only huge-pages are different.

Hmm... Then I can't explain it.  Maybe there are two bugs, or maybe there's
something I've overlooked.

> 
> > I'm also trying to find an s390x host to give it a shot.  However, I decided to
> > also ask out loud, since that might be even faster.
> 
> The easiest way to test on s390x is likely to use Travis. If you already have a
> GitHub or GitLab account, you can simply clone the qemu repository there and
> add Travis (from the Marketplace in GitHub; not sure exactly how it works
> with GitLab) to your cloned repo. If you then push commits to a branch,
> Travis should trigger automatically, including runs on s390x, see e.g.:
> 
>  https://travis-ci.com/github/huth/qemu/jobs/399317194

I finally set up the CI this time and it's quite handy.  I should probably have
done this earlier, thanks Thomas!

Anyway, I'll see whether my fix passes Travis (still running), and then decide
whether I should repost.