[RFC,v1,0/7] *** Add Multifd support for TLS migration ***

Message ID 1599663177-53993-1-git-send-email-zhengchuan@huawei.com

Message

Zheng Chuan Sept. 9, 2020, 2:52 p.m. UTC
TLS migration easily becomes CPU-bound because encryption and decryption
both run in the migration thread.
In our test, TLS migration reached only 300MB/s on a link with 500MB/s of
available bandwidth.

Inspired by multifd, we add multifd support for TLS migration so that it can
make full use of the available network bandwidth at the cost of additional CPUs.
With a 4U16G test VM this cuts migration time by nearly half (48.798s down to
25.334s in the table below).

Migration time was evaluated on a VM with the following specifications:
- the VM uses 4K pages;
- the number of vCPUs is 4;
- the total memory is 16GB;
- the 'mempress' tool is used to put memory pressure on the VM (mempress 4096 100);
- the migration flags are 73755 (8219 + 65536 (TLS)) vs 204827 (8219 + 65536 (TLS) + 131072 (Multifd))
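
As a small sanity check on the two flag values above, the sketch below rebuilds them from their parts. The macro and function names are hypothetical and only mirror the numbers quoted in the list; 65536 and 131072 are the single bits 1 << 16 and 1 << 17, and because 8219 is below 65536 the bitwise OR gives the same result as the sums.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the values are the ones quoted in the cover letter. */
#define DEMO_FLAG_BASE    UINT32_C(8219)
#define DEMO_FLAG_TLS     (UINT32_C(1) << 16)   /* 65536  */
#define DEMO_FLAG_MULTIFD (UINT32_C(1) << 17)   /* 131072 */

/* Combine the base flags with the optional TLS and multifd bits. */
static uint32_t demo_flags(int use_tls, int use_multifd)
{
    uint32_t flags = DEMO_FLAG_BASE;

    /* The base value stays below bit 16, so OR and addition agree here. */
    assert((DEMO_FLAG_BASE & (DEMO_FLAG_TLS | DEMO_FLAG_MULTIFD)) == 0);

    if (use_tls) {
        flags |= DEMO_FLAG_TLS;
    }
    if (use_multifd) {
        flags |= DEMO_FLAG_MULTIFD;
    }
    return flags;
}
```

Calling demo_flags(1, 0) corresponds to the TLS-only run and demo_flags(1, 1) to the MultiFD + TLS run.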

+----------------------+-----------------------+-----------------------------------+
|                      |         TLS           |      MultiFD + TLS (2 channels)   |
+----------------------+-----------------------+-----------------------------------+
| mempress 1024 120    |       25.035s         |           15.067s                 |
+----------------------+-----------------------+-----------------------------------+
| mempress 1024 200    |       48.798s         |           25.334s                 |
+----------------------+-----------------------+-----------------------------------+
| mempress 1024 300    |   Migration Failed    |           25.617s                 |
+----------------------+-----------------------+-----------------------------------+
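
To put the table in relative terms, the helper below (a hypothetical name, not part of the series) computes the percentage by which migration time drops between the two columns. Using only the timings above, the first row comes out at roughly 39.8% and the second at roughly 48.1%; the third row has no TLS-only time to compare against.

```c
#include <assert.h>

/*
 * Percentage by which migration time drops when going from the
 * TLS-only time (tls_s) to the MultiFD + TLS time (multifd_s).
 */
static double demo_time_reduction_pct(double tls_s, double multifd_s)
{
    assert(tls_s > 0.0);
    return (tls_s - multifd_s) / tls_s * 100.0;
}
```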

Chuan Zheng (7):
  migration/tls: save hostname into MigrationState
  migration/tls: extract migration_tls_client_create for common-use
  migration/tls: add MigrationState into MultiFDSendParams
  migration/tls: extract cleanup function for common-use
  migration/tls: add support for tls check
  migration/tls: add support for multifd tls-handshake
  migration/tls: add trace points for multifd-tls

 migration/channel.c    |   5 ++
 migration/migration.c  |   1 +
 migration/migration.h  |   5 ++
 migration/multifd.c    | 121 +++++++++++++++++++++++++++++++++++++++++++------
 migration/multifd.h    |   2 +
 migration/tls.c        |  26 +++++++----
 migration/tls.h        |   6 +++
 migration/trace-events |   5 ++
 8 files changed, 149 insertions(+), 22 deletions(-)

Comments

Daniel P. Berrangé Sept. 10, 2020, 1:11 p.m. UTC | #1
On Wed, Sep 09, 2020 at 10:52:51PM +0800, Chuan Zheng wrote:
> hostname is needed in multifd-tls; save hostname into MigrationState
> 
> Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>
> Signed-off-by: Yan Jin <jinyan12@huawei.com>
> ---
>  migration/channel.c   | 5 +++++
>  migration/migration.c | 1 +
>  migration/migration.h | 5 +++++
>  3 files changed, 11 insertions(+)
> 
> diff --git a/migration/channel.c b/migration/channel.c
> index 20e4c8e..2af3069 100644
> --- a/migration/channel.c
> +++ b/migration/channel.c
> @@ -66,6 +66,11 @@ void migration_channel_connect(MigrationState *s,
>      trace_migration_set_outgoing_channel(
>          ioc, object_get_typename(OBJECT(ioc)), hostname, error);
>  
> +    /* Save hostname into MigrationState for handshake */
> +    if (hostname) {
> +        s->hostname = g_strdup(hostname);
> +    }
> +
>      if (!error) {
>          if (s->parameters.tls_creds &&
>              *s->parameters.tls_creds &&
> diff --git a/migration/migration.c b/migration/migration.c
> index 58a5452..e20b778 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1883,6 +1883,7 @@ void migrate_init(MigrationState *s)
>      s->migration_thread_running = false;
>      error_free(s->error);
>      s->error = NULL;
> +    s->hostname = NULL;
>  
>      migrate_set_state(&s->state, MIGRATION_STATUS_NONE, MIGRATION_STATUS_SETUP);
>  
> diff --git a/migration/migration.h b/migration/migration.h
> index ae497bd..758f803 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -261,6 +261,11 @@ struct MigrationState
>       * (which is in 4M chunk).
>       */
>      uint8_t clear_bitmap_shift;
> +
> +    /*
> +     * This save hostname when out-going migration starts
> +     */
> +    char *hostname;
>  };

Something needs to free(hostname) at the appropriate time, otherwise
we'll have a memory leak if we run migration multiple times.
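
A minimal sketch of the ownership pattern this implies, not taken from the series: DemoState and both helpers are hypothetical stand-ins for MigrationState and its setup and cleanup paths, and plain malloc()/free() are used so the example builds without GLib (the patch itself uses g_strdup(), whose counterpart is g_free()). The old copy is released before a new one is stored, and again at teardown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the hostname field added to MigrationState. */
typedef struct DemoState {
    char *hostname;
} DemoState;

/* Store a copy of hostname, releasing any copy left by an earlier run. */
static void demo_set_hostname(DemoState *s, const char *hostname)
{
    assert(s != NULL);
    free(s->hostname);              /* free(NULL) is a no-op */
    s->hostname = NULL;
    if (hostname) {
        size_t len = strlen(hostname) + 1;
        s->hostname = malloc(len);
        assert(s->hostname != NULL);
        memcpy(s->hostname, hostname, len);
    }
}

/* Release the saved hostname when the migration is torn down. */
static void demo_cleanup(DemoState *s)
{
    assert(s != NULL);
    free(s->hostname);
    s->hostname = NULL;             /* avoids a dangling pointer on reuse */
}
```

With this shape, running the setup path twice in a row no longer leaks the first copy, and simply assigning NULL during re-initialisation would only be safe after the cleanup helper has run.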


Regards,
Daniel
Daniel P. Berrangé Sept. 10, 2020, 1:37 p.m. UTC | #2
On Wed, Sep 09, 2020 at 10:52:57PM +0800, Chuan Zheng wrote:
> add trace points for multifd-tls for debug.
> 
> Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>
> Signed-off-by: Yan Jin <jinyan12@huawei.com>
> ---
>  migration/multifd.c    | 10 +++++++++-
>  migration/trace-events |  5 +++++
>  2 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 2509187..26935b6 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -730,7 +730,11 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
>      QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));
>      Error *err = NULL;
>  
> -    qio_task_propagate_error(task, &err);
> +    if (qio_task_propagate_error(task, &err)) {
> +        trace_multifd_tls_outgoing_handshake_error(error_get_pretty(err));
> +    } else {
> +        trace_multifd_tls_outgoing_handshake_complete();
> +    }
>      multifd_channel_connect(p, ioc, err);
>  }
>  
> @@ -747,6 +751,7 @@ static void multifd_tls_channel_connect(MultiFDSendParams *p,
>          return;
>      }
>  
> +    trace_multifd_tls_outgoing_handshake_start(hostname);
>      qio_channel_set_name(QIO_CHANNEL(tioc), "multifd-tls-outgoing");
>      qio_channel_tls_handshake(tioc,
>                                multifd_tls_outgoing_handshake,
> @@ -762,6 +767,9 @@ static bool multifd_channel_connect(MultiFDSendParams *p,
>  {
>      MigrationState *s = p->s;
>  
> +    trace_multifd_set_outgoing_channel(
> +        ioc, object_get_typename(OBJECT(ioc)), s->hostname, error);
> +
>      if (!error) {
>          if (s->parameters.tls_creds &&
>              *s->parameters.tls_creds &&
> diff --git a/migration/trace-events b/migration/trace-events
> index 4ab0a50..860d2c4 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -109,6 +109,11 @@ multifd_send_sync_main_wait(uint8_t id) "channel %d"
>  multifd_send_terminate_threads(bool error) "error %d"
>  multifd_send_thread_end(uint8_t id, uint64_t packets, uint64_t pages) "channel %d packets %" PRIu64 " pages %"  PRIu64
>  multifd_send_thread_start(uint8_t id) "%d"
> +multifd_tls_outgoing_handshake_start(const char *hostname) "hostname=%s"
> +multifd_tls_outgoing_handshake_error(const char *err) "err=%s"
> +multifd_tls_outgoing_handshake_complete(void) ""

I'd suggest adding 'void *ioc' for all of these to make it clearer to
correlate the traces.
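
Roughly what that would look like, as a standalone sketch rather than real trace-events output: the demo_* helpers are hypothetical stand-ins that format into a buffer, and each takes the channel pointer first so that every line from one handshake starts with the same "ioc=..." prefix.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the three handshake trace points. */
static int demo_trace_handshake_start(char *buf, size_t len,
                                      void *ioc, const char *hostname)
{
    assert(buf != NULL && hostname != NULL);
    return snprintf(buf, len, "ioc=%p hostname=%s", ioc, hostname);
}

static int demo_trace_handshake_error(char *buf, size_t len,
                                      void *ioc, const char *err)
{
    assert(buf != NULL && err != NULL);
    return snprintf(buf, len, "ioc=%p err=%s", ioc, err);
}

static int demo_trace_handshake_complete(char *buf, size_t len, void *ioc)
{
    assert(buf != NULL);
    return snprintf(buf, len, "ioc=%p", ioc);
}
```

With several multifd channels handshaking concurrently, grouping the output by that leading pointer value separates the start, error, and complete events of each channel.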

> +multifd_set_outgoing_channel(void *ioc, const char *ioctype, const char *hostname, void *err)  "ioc=%p ioctype=%s hostname=%s err=%p"
> +
>  ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx"
>  ram_load_loop(const char *rbname, uint64_t addr, int flags, void *host) "%s: addr: 0x%" PRIx64 " flags: 0x%x host: %p"
>  ram_load_postcopy_loop(uint64_t addr, int flags) "@%" PRIx64 " %x"
> -- 
> 1.8.3.1
> 

Regards,
Daniel
Zheng Chuan Sept. 10, 2020, 1:56 p.m. UTC | #3
On 2020/9/10 21:25, Daniel P. Berrangé wrote:
> On Wed, Sep 09, 2020 at 10:52:56PM +0800, Chuan Zheng wrote:
>> add support for multifd tls-handshake
>>
>> Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>
>> Signed-off-by: Yan Jin <jinyan12@huawei.com>
>> ---
>>  migration/multifd.c | 32 +++++++++++++++++++++++++++++++-
>>  1 file changed, 31 insertions(+), 1 deletion(-)
>>
>> diff --git a/migration/multifd.c b/migration/multifd.c
>> index b2076d7..2509187 100644
>> --- a/migration/multifd.c
>> +++ b/migration/multifd.c
>> @@ -719,11 +719,41 @@ out:
>>      return NULL;
>>  }
>>  
>> +static bool multifd_channel_connect(MultiFDSendParams *p,
>> +                                    QIOChannel *ioc,
>> +                                    Error *error);
>> +
>> +static void multifd_tls_outgoing_handshake(QIOTask *task,
>> +                                           gpointer opaque)
>> +{
>> +    MultiFDSendParams *p = opaque;
>> +    QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));
>> +    Error *err = NULL;
>> +
>> +    qio_task_propagate_error(task, &err);
>> +    multifd_channel_connect(p, ioc, err);
>> +}
>> +
>>  static void multifd_tls_channel_connect(MultiFDSendParams *p,
>>                                      QIOChannel *ioc,
>>                                      Error **errp)
>>  {
>> -    /* TODO */
>> +    MigrationState *s = p->s;
>> +    const char *hostname = s->hostname;
>> +    QIOChannelTLS *tioc;
>> +
>> +    tioc = migration_tls_client_create(s, ioc, hostname, errp);
>> +    if (!tioc) {
>> +        return;
>> +    }
>> +
>> +    qio_channel_set_name(QIO_CHANNEL(tioc), "multifd-tls-outgoing");
>> +    qio_channel_tls_handshake(tioc,
>> +                              multifd_tls_outgoing_handshake,
>> +                              p,
>> +                              NULL,
>> +                              NULL);
>> +
>>  }
> 
> 
> Please squash this back into the previous patch; the two are
> interdependent, so it doesn't make sense to split them.
> 
> Assuming it is squashed in
> 
> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
> 
OK, will squash it in v2.
> 
> 
> Regards,
> Daniel
>