Message ID | 20201008160330.130431-1-dgilbert@redhat.com |
---|---|
State | New |
Series | tests/migration: Allow longer timeouts |
On 08/10/2020 18.03, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> In travis, with gcov and gprof we're seeing timeouts; hopefully fix
> this by increasing the test timeouts a bit, but for xbzrle ensure it
> really does get a couple of cycles through to test the cache.
>
> I think the problem in travis is we have about 2 host CPU threads,
> in the test we have at least 3:
>   a) The vCPU thread (100% flat out)
>   b) The source migration thread
>   c) The destination migration thread
>
> if (b) & (c) are slow for any reason - gcov+gperf or a slow host -
> then they're sharing one host CPU thread so limit the migration
> bandwidth.
>
> Tested on my laptop with:
>   taskset -c 0,1 ./tests/qtest/migration-test -p /x86_64/migration
>
> Reported-by: Alex Bennée <alex.bennee@linaro.org>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  tests/qtest/migration-test.c | 21 +++++++++++----------
>  1 file changed, 11 insertions(+), 10 deletions(-)

This seems to fix the gcov/gprof test indeed:

 https://travis-ci.com/github/huth/qemu/jobs/398270396

Thus:

Tested-by: Thomas Huth <thuth@redhat.com>

I'm also queuing this to my qtest-next branch (in case you don't plan a
migration pull request within the next days):

 https://gitlab.com/huth/qemu/-/commits/qtest-next/

 Thomas
On 12/10/2020 15.13, Thomas Huth wrote:
> On 08/10/2020 18.03, Dr. David Alan Gilbert (git) wrote:
>> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>>
>> In travis, with gcov and gprof we're seeing timeouts; hopefully fix
>> this by increasing the test timeouts a bit, but for xbzrle ensure it
>> really does get a couple of cycles through to test the cache.
>>
>> I think the problem in travis is we have about 2 host CPU threads,
>> in the test we have at least 3:
>>   a) The vCPU thread (100% flat out)
>>   b) The source migration thread
>>   c) The destination migration thread
>>
>> if (b) & (c) are slow for any reason - gcov+gperf or a slow host -
>> then they're sharing one host CPU thread so limit the migration
>> bandwidth.
>>
>> Tested on my laptop with:
>>   taskset -c 0,1 ./tests/qtest/migration-test -p /x86_64/migration
>>
>> Reported-by: Alex Bennée <alex.bennee@linaro.org>
>> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> ---
>>  tests/qtest/migration-test.c | 21 +++++++++++----------
>>  1 file changed, 11 insertions(+), 10 deletions(-)
>
> This seems to fix the gcov/gprof test indeed:
>
>  https://travis-ci.com/github/huth/qemu/jobs/398270396
>
> Thus:
>
> Tested-by: Thomas Huth <thuth@redhat.com>
>
> I'm also queuing this to my qtest-next branch (in case you don't plan a
> migration pull request within the next days):
>
>  https://gitlab.com/huth/qemu/-/commits/qtest-next/

FYI, this patch fails to build on non-Linux systems:

 https://cirrus-ci.com/task/5951706225704960?command=main#L6076

The #define needs to be moved out of the #if defined(__linux__) block.

I can fixup the patch here locally, but if you want to include it in your
next migration pull request instead, you should do that, too.

 Cheers,
  Thomas
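For illustration only, the fixup described in the reply above could look roughly like the sketch below: the macro sits at file scope, ahead of the platform guard, so non-Linux hosts see it too. This is not the fixup that was actually applied. The helper names ufd_check_available() and converge_downtime_ms() are hypothetical stand-ins; only CONVERGE_DOWNTIME and the #if defined(__linux__) guard come from the thread.

```c
#include <stdbool.h>

/* A downtime where the test really should converge.
 * Defined before any platform guard so every host can use it. */
#define CONVERGE_DOWNTIME 1000

#if defined(__linux__)
/* Hypothetical stand-in for the Linux-only userfaultfd code,
 * which would stay inside the guard. */
static bool ufd_check_available(void)
{
    return true;
}
#else
/* Non-Linux hosts get a stub, but still see CONVERGE_DOWNTIME. */
static bool ufd_check_available(void)
{
    return false;
}
#endif

/* Hypothetical accessor: code outside the guard can use the macro
 * regardless of the host OS. */
static int converge_downtime_ms(void)
{
    return CONVERGE_DOWNTIME;
}
```

With this layout, a call such as migrate_set_parameter_int(from, "downtime-limit", CONVERGE_DOWNTIME) compiles the same way on Linux and non-Linux hosts.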
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 00a233cd8c..481db4e929 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -44,6 +44,9 @@ static bool uffd_feature_thread_id;
 #include <sys/ioctl.h>
 #include <linux/userfaultfd.h>
 
+/* A downtime where the test really should converge */
+#define CONVERGE_DOWNTIME 1000
+
 static bool ufd_version_check(void)
 {
     struct uffdio_api api_struct;
@@ -864,8 +867,7 @@ static void test_precopy_unix(void)
 
     wait_for_migration_pass(from);
 
-    /* 300 ms should converge */
-    migrate_set_parameter_int(from, "downtime-limit", 300);
+    migrate_set_parameter_int(from, "downtime-limit", CONVERGE_DOWNTIME);
 
     if (!got_stop) {
         qtest_qmp_eventwait(from, "STOP");
@@ -946,10 +948,12 @@ static void test_xbzrle(const char *uri)
 
     migrate_qmp(from, uri, "{}");
 
+    wait_for_migration_pass(from);
+    /* Make sure we have 2 passes, so the xbzrle cache gets a workout */
     wait_for_migration_pass(from);
 
-    /* 300ms should converge */
-    migrate_set_parameter_int(from, "downtime-limit", 300);
+    /* 1000ms should converge */
+    migrate_set_parameter_int(from, "downtime-limit", 1000);
 
     if (!got_stop) {
         qtest_qmp_eventwait(from, "STOP");
@@ -999,8 +1003,7 @@ static void test_precopy_tcp(void)
 
     wait_for_migration_pass(from);
 
-    /* 300ms should converge */
-    migrate_set_parameter_int(from, "downtime-limit", 300);
+    migrate_set_parameter_int(from, "downtime-limit", CONVERGE_DOWNTIME);
 
     if (!got_stop) {
         qtest_qmp_eventwait(from, "STOP");
@@ -1068,8 +1071,7 @@ static void test_migrate_fd_proto(void)
 
     wait_for_migration_pass(from);
 
-    /* 300ms should converge */
-    migrate_set_parameter_int(from, "downtime-limit", 300);
+    migrate_set_parameter_int(from, "downtime-limit", CONVERGE_DOWNTIME);
 
     if (!got_stop) {
         qtest_qmp_eventwait(from, "STOP");
@@ -1304,8 +1306,7 @@ static void test_multifd_tcp(const char *method)
 
     wait_for_migration_pass(from);
 
-    /* 300ms it should converge */
-    migrate_set_parameter_int(from, "downtime-limit", 300);
+    migrate_set_parameter_int(from, "downtime-limit", CONVERGE_DOWNTIME);
 
     if (!got_stop) {
         qtest_qmp_eventwait(from, "STOP");
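As a rough illustration of the reasoning in the commit message (not code from the patch or from QEMU), the sketch below models why a larger downtime-limit helps when migration bandwidth is starved. It assumes a simplified rule: precopy can finish once the remaining dirty data fits into what the link can move within the allowed downtime. The function names and the figures in the usage note are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model: bytes that can be transferred during the permitted
 * downtime at the measured bandwidth. */
static uint64_t threshold_bytes(uint64_t bandwidth_bytes_per_ms,
                                uint64_t downtime_limit_ms)
{
    return bandwidth_bytes_per_ms * downtime_limit_ms;
}

/* Under this model, migration can complete once the remaining dirty
 * data is no larger than the threshold. */
static bool can_converge(uint64_t remaining_bytes,
                         uint64_t bandwidth_bytes_per_ms,
                         uint64_t downtime_limit_ms)
{
    return remaining_bytes <=
           threshold_bytes(bandwidth_bytes_per_ms, downtime_limit_ms);
}
```

With made-up numbers, a bandwidth of 102400 bytes/ms and 50 MiB (52428800 bytes) still dirty, can_converge() is false for a 300 ms limit (threshold 30720000 bytes) and true for a 1000 ms limit (threshold 102400000 bytes). A slower host lowers the bandwidth term, which is why the old 300 ms value could fail to converge in CI.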