Message ID: 20200601181118.579-1-vsementsov@virtuozzo.com
Series: backup performance: block_status + async
17.07.2020 16:45, Max Reitz wrote:
> On 01.06.20 20:11, Vladimir Sementsov-Ogievskiy wrote:
>> Refactor common path to use BlockCopyCallState pointer as parameter, to
>> prepare it for use in asynchronous block-copy (at least, we'll need to
>> run block-copy in a coroutine, passing the whole parameters as one
>> pointer).
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>  block/block-copy.c | 51 ++++++++++++++++++++++++++++++++++------------
>>  1 file changed, 38 insertions(+), 13 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index 43a018d190..75882a094c 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
>
> [...]
>
>> @@ -646,16 +653,16 @@ out:
>>   * it means that some I/O operation failed in context of _this_ block_copy call,
>>   * not some parallel operation.
>>   */
>> -int coroutine_fn block_copy(BlockCopyState *s, int64_t offset, int64_t bytes,
>> -                            bool *error_is_read)
>> +static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>>  {
>>      int ret;
>>
>>      do {
>> -        ret = block_copy_dirty_clusters(s, offset, bytes, error_is_read);
>> +        ret = block_copy_dirty_clusters(call_state);
>
> It’s possible that much of this code will change in a future patch of
> this series, but as it is, it might be nice to make
> block_copy_dirty_clusters’s argument a const pointer so it’s clear that
> the call to block_copy_wait_one() below will use the original @offset
> and @bytes values.

Hm. I'm trying this, and it doesn't work:

block_copy_task_entry() wants to change call_state:

    t->call_state->failed = true;

> *shrug*
>
> Reviewed-by: Max Reitz <mreitz@redhat.com>
>
>>      if (ret == 0) {
>> -        ret = block_copy_wait_one(s, offset, bytes);
>> +        ret = block_copy_wait_one(call_state->s, call_state->offset,
>> +                                  call_state->bytes);
>>      }
>>
>>      /*
On 18.09.20 22:11, Vladimir Sementsov-Ogievskiy wrote:
> 17.07.2020 16:45, Max Reitz wrote:
>> On 01.06.20 20:11, Vladimir Sementsov-Ogievskiy wrote:
>>> Refactor common path to use BlockCopyCallState pointer as parameter, to
>>> prepare it for use in asynchronous block-copy (at least, we'll need to
>>> run block-copy in a coroutine, passing the whole parameters as one
>>> pointer).
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>  block/block-copy.c | 51 ++++++++++++++++++++++++++++++++++------------
>>>  1 file changed, 38 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>> index 43a018d190..75882a094c 100644
>>> --- a/block/block-copy.c
>>> +++ b/block/block-copy.c
>>
>> [...]
>>
>>> @@ -646,16 +653,16 @@ out:
>>>   * it means that some I/O operation failed in context of _this_ block_copy call,
>>>   * not some parallel operation.
>>>   */
>>> -int coroutine_fn block_copy(BlockCopyState *s, int64_t offset, int64_t bytes,
>>> -                            bool *error_is_read)
>>> +static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
>>>  {
>>>      int ret;
>>>
>>>      do {
>>> -        ret = block_copy_dirty_clusters(s, offset, bytes, error_is_read);
>>> +        ret = block_copy_dirty_clusters(call_state);
>>
>> It’s possible that much of this code will change in a future patch of
>> this series, but as it is, it might be nice to make
>> block_copy_dirty_clusters’s argument a const pointer so it’s clear that
>> the call to block_copy_wait_one() below will use the original @offset
>> and @bytes values.
>
> Hm. I'm trying this, and it doesn't work:
>
> block_copy_task_entry() wants to change call_state:
>
>     t->call_state->failed = true;

Too bad :)

>> *shrug*
>>
>> Reviewed-by: Max Reitz <mreitz@redhat.com>
>>
>>>      if (ret == 0) {
>>> -        ret = block_copy_wait_one(s, offset, bytes);
>>> +        ret = block_copy_wait_one(call_state->s, call_state->offset,
>>> +                                  call_state->bytes);
>>>      }
>>>
>>>      /*
23.07.2020 12:47, Max Reitz wrote:
> On 01.06.20 20:11, Vladimir Sementsov-Ogievskiy wrote:
>> This brings async request handling and block-status driven chunk sizes
>> to backup out of the box, which improves backup performance.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>  include/block/block-copy.h |   9 +--
>>  block/backup.c             | 145 +++++++++++++++++++------------------
>>  block/block-copy.c         |  21 ++----
>>  3 files changed, 83 insertions(+), 92 deletions(-)
>
> This patch feels like it should be multiple ones. I don’t see why a
> patch that lets backup use block-copy (more) should have to modify the
> block-copy code.
>
> More specifically: I think that block_copy_set_progress_callback()
> should be removed in a separate patch. Also, moving
> @cb_opaque/@progress_opaque from BlockCopyState into BlockCopyCallState
> seems like a separate patch to me, too.
>
> (I presume if the cb had had its own opaque object from patch 5 on,
> there wouldn’t be this problem now. We’d just stop using the progress
> callback in backup, and remove it in one separate patch.)
>
> [...]
>
>> diff --git a/block/backup.c b/block/backup.c
>> index ec2676abc2..59c00f5293 100644
>> --- a/block/backup.c
>> +++ b/block/backup.c
>> @@ -44,42 +44,19 @@ typedef struct BackupBlockJob {
>>      BlockdevOnError on_source_error;
>>      BlockdevOnError on_target_error;
>>      uint64_t len;
>> -    uint64_t bytes_read;
>>      int64_t cluster_size;
>>      int max_workers;
>>      int64_t max_chunk;
>>
>>      BlockCopyState *bcs;
>> +
>> +    BlockCopyCallState *bcs_call;
>
> Can this be more descriptive? E.g. background_bcs? bg_bcs_call? bg_bcsc?
>
>> +    int ret;
>> +    bool error_is_read;
>>  } BackupBlockJob;
>>
>>  static const BlockJobDriver backup_job_driver;
>>
>
> [...]
>
>>  static int coroutine_fn backup_loop(BackupBlockJob *job)
>>  {
>> -    bool error_is_read;
>> -    int64_t offset;
>> -    BdrvDirtyBitmapIter *bdbi;
>> -    int ret = 0;
>> +    while (true) { /* retry loop */
>> +        assert(!job->bcs_call);
>> +        job->bcs_call = block_copy_async(job->bcs, 0,
>> +                                         QEMU_ALIGN_UP(job->len,
>> +                                                       job->cluster_size),
>> +                                         true, job->max_workers, job->max_chunk,
>> +                                         backup_block_copy_callback, job);
>>
>> -    bdbi = bdrv_dirty_iter_new(block_copy_dirty_bitmap(job->bcs));
>> -    while ((offset = bdrv_dirty_iter_next(bdbi)) != -1) {
>> -        do {
>> -            if (yield_and_check(job)) {
>> -                goto out;
>> +        while (job->bcs_call && !job->common.job.cancelled) {
>> +            /* wait and handle pauses */
>
> Doesn’t someone need to reset BlockCopyCallState.cancelled when the job
> is unpaused? I can’t see anyone doing that.
>
> Well, or even just reentering the block-copy operation after
> backup_pause() has cancelled it. Is there some magic I’m missing?
>
> Does pausing (which leads to cancelling the block-copy operation) just
> yield to the CB being invoked, so the copy operation is considered done,
> and then we just enter the next iteration of the loop and try again?

Yes, that's my idea: we cancel on pause and just run a new block_copy operation on resume.

> But cancelling the block-copy operation leads to it returning 0, AFAIR,
> so...

Looks like that should be fixed, either by returning -ECANCELED or by checking the bitmap.

>
>> +
>> +            job_pause_point(&job->common.job);
>> +
>> +            if (job->bcs_call && !job->common.job.cancelled) {
>> +                job_yield(&job->common.job);
>>              }
>> -            ret = backup_do_cow(job, offset, job->cluster_size, &error_is_read);
>> -            if (ret < 0 && backup_error_action(job, error_is_read, -ret) ==
>> -                BLOCK_ERROR_ACTION_REPORT)
>> -            {
>> -                goto out;
>> +        }
>> +
>> +        if (!job->bcs_call && job->ret == 0) {
>> +            /* Success */
>> +            return 0;
>
> ...I would assume we return here when the job is paused.
>
>> +        }
>> +
>> +        if (job->common.job.cancelled) {
>> +            if (job->bcs_call) {
>> +                block_copy_cancel(job->bcs_call);
>>              }
>> -        } while (ret < 0);
>> +            return 0;
>> +        }
>> +
>> +        if (!job->bcs_call && job->ret < 0 &&
>
> Is it even possible for bcs_call to be non-NULL here?
>
>> +            (backup_error_action(job, job->error_is_read, -job->ret) ==
>> +             BLOCK_ERROR_ACTION_REPORT))
>> +        {
>> +            return job->ret;
>> +        }
>
> So if we get an error, and the error action isn’t REPORT, we just try
> the whole operation again? That doesn’t sound very IGNORE-y to me.

Not the whole operation: we have the dirty bitmap, so clusters that were already copied are not touched again.
22.07.2020 15:22, Max Reitz wrote:
> On 01.06.20 20:11, Vladimir Sementsov-Ogievskiy wrote:
>> Add new parameters to configure future backup features. The patch
>> doesn't introduce aio backup requests (so we actually have only one
>> worker) neither requests larger than one cluster. Still, formally we
>> satisfy these maximums anyway, so add the parameters now, to facilitate
>> further patch which will really change backup job behavior.
>>
>> Options are added with x- prefixes, as the only use for them are some
>> very conservative iotests which will be updated soon.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>  qapi/block-core.json      |  9 ++++++++-
>>  include/block/block_int.h |  7 +++++++
>>  block/backup.c            | 21 +++++++++++++++++++++
>>  block/replication.c       |  2 +-
>>  blockdev.c                |  5 +++++
>>  5 files changed, 42 insertions(+), 2 deletions(-)

[..]

>> @@ -422,6 +436,11 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>>      if (cluster_size < 0) {
>>          goto error;
>>      }
>> +    if (max_chunk && max_chunk < cluster_size) {
>> +        error_setg(errp, "Required max-chunk (%" PRIi64") is less than backup "
>
> (missing a space after PRIi64)
>
>> +                   "cluster size (%" PRIi64 ")", max_chunk, cluster_size);
>
> Should this be noted in the QAPI documentation?

Hmm.. It makes sense, but I don't know what to write: should it be ">= job cluster_size"? But then I'll have to describe the logic of backup_calculate_cluster_size()...

> (And perhaps the fact
> that without copy offloading, we’ll never copy anything bigger than one
> cluster at a time anyway?)

This is a parameter for background copying. Look at block_copy_task_create(): if call_state has its own max_chunk, it's used instead of the common copy_size (derived from cluster_size). But at the moment of this patch, the background process through async block-copy is not yet implemented, and the parameter doesn't work, which is described in the commit message.

>
>> +        return NULL;
>> +    }
>>
>>      /*

[..]
23.07.2020 11:35, Max Reitz wrote:
> On 01.06.20 20:11, Vladimir Sementsov-Ogievskiy wrote:
>> The further change of moving backup to be a on block-copy call will
>
> -on?
>
>> make copying chunk-size and cluster-size a separate things. So, even
>
> s/a/two/
>
>> with 64k cluster sized qcow2 image, default chunk would be 1M.
>> Test 219 depends on specified chunk-size. Update it for explicit
>> chunk-size for backup as for mirror.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>  tests/qemu-iotests/219 | 13 +++++++------
>>  1 file changed, 7 insertions(+), 6 deletions(-)
>>
>> diff --git a/tests/qemu-iotests/219 b/tests/qemu-iotests/219
>> index db272c5249..2bbed28f39 100755
>> --- a/tests/qemu-iotests/219
>> +++ b/tests/qemu-iotests/219
>> @@ -203,13 +203,13 @@ with iotests.FilePath('disk.img') as disk_path, \
>>      # but related to this also automatic state transitions like job
>>      # completion), but still get pause points often enough to avoid making this
>>      # test very slow, it's important to have the right ratio between speed and
>> -    # buf_size.
>> +    # copy-chunk-size.
>>      #
>> -    # For backup, buf_size is hard-coded to the source image cluster size (64k),
>> -    # so we'll pick the same for mirror. The slice time, i.e. the granularity
>> -    # of the rate limiting is 100ms. With a speed of 256k per second, we can
>> -    # get four pause points per second. This gives us 250ms per iteration,
>> -    # which should be enough to stay deterministic.
>> +    # Chose 64k copy-chunk-size both for mirror (by buf_size) and backup (by
>> +    # x-max-chunk). The slice time, i.e. the granularity of the rate limiting
>> +    # is 100ms. With a speed of 256k per second, we can get four pause points
>> +    # per second. This gives us 250ms per iteration, which should be enough to
>> +    # stay deterministic.
>
> Don’t we also have to limit the number of workers to 1 so we actually
> keep 250 ms per iteration instead of just finishing four requests
> immediately, then pausing for a second?

The block-copy rate limiter works well enough: it will not start too many requests.

>
>>      test_job_lifecycle(vm, 'drive-mirror', has_ready=True, job_args={
>>          'device': 'drive0-node',
>> @@ -226,6 +226,7 @@ with iotests.FilePath('disk.img') as disk_path, \
>>          'target': copy_path,
>>          'sync': 'full',
>>          'speed': 262144,
>> +        'x-max-chunk': 65536,
>>          'auto-finalize': auto_finalize,
>>          'auto-dismiss': auto_dismiss,
>>      })
23.07.2020 12:47, Max Reitz wrote:
>> +static void coroutine_fn backup_set_speed(BlockJob *job, int64_t speed)
>> +{
>> +    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
>> +
>> +    if (s->bcs) {
>> +        /* In block_job_create we yet don't have bcs */
>
> Shouldn’t hurt to make it conditional, but how can we get here in
> block_job_create()?

block_job_set_speed() is called from block_job_create().
On 22.10.20 22:35, Vladimir Sementsov-Ogievskiy wrote:
> 22.07.2020 15:22, Max Reitz wrote:
>> On 01.06.20 20:11, Vladimir Sementsov-Ogievskiy wrote:
>>> Add new parameters to configure future backup features. The patch
>>> doesn't introduce aio backup requests (so we actually have only one
>>> worker) neither requests larger than one cluster. Still, formally we
>>> satisfy these maximums anyway, so add the parameters now, to facilitate
>>> further patch which will really change backup job behavior.
>>>
>>> Options are added with x- prefixes, as the only use for them are some
>>> very conservative iotests which will be updated soon.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>  qapi/block-core.json      |  9 ++++++++-
>>>  include/block/block_int.h |  7 +++++++
>>>  block/backup.c            | 21 +++++++++++++++++++++
>>>  block/replication.c       |  2 +-
>>>  blockdev.c                |  5 +++++
>>>  5 files changed, 42 insertions(+), 2 deletions(-)
>
> [..]
>
>>> @@ -422,6 +436,11 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>>>      if (cluster_size < 0) {
>>>          goto error;
>>>      }
>>> +    if (max_chunk && max_chunk < cluster_size) {
>>> +        error_setg(errp, "Required max-chunk (%" PRIi64") is less than backup "
>>
>> (missing a space after PRIi64)
>>
>>> +                   "cluster size (%" PRIi64 ")", max_chunk, cluster_size);
>>
>> Should this be noted in the QAPI documentation?
>
> Hmm.. It makes sense, but I don't know what to write: should be >= job
> cluster_size? But then I'll have to describe the logic of
> backup_calculate_cluster_size()...

Isn’t the logic basically just “cluster size of the target image file (min 64k)”? The other cases just cover error cases, and they always return 64k, which would effectively be documented by stating that 64k is the minimum.

But in the common cases (qcow2 or raw), we’ll never get an error anyway.
I think it’d be good to describe the cluster size somewhere, because otherwise I find it a bit malicious to throw an error at the user that they could not have anticipated, because they have no idea what valid values for the parameter are (until they try).

>> (And perhaps the fact
>> that without copy offloading, we’ll never copy anything bigger than one
>> cluster at a time anyway?)
>
> This is a parameter for background copying. Look at
> block_copy_task_create(), if call_state has own max_chunk, it's used
> instead of common copy_size (derived from cluster_size). But at a moment
> of this patch background process through async block-copy is not yet
> implemented, and the parameter doesn't work, which is described in
> commit message.

Ah, OK. Right.

Max
On 26.10.20 16:18, Vladimir Sementsov-Ogievskiy wrote:
> 23.07.2020 12:47, Max Reitz wrote:
>>> +static void coroutine_fn backup_set_speed(BlockJob *job, int64_t speed)
>>> +{
>>> +    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
>>> +
>>> +    if (s->bcs) {
>>> +        /* In block_job_create we yet don't have bcs */
>>
>> Shouldn’t hurt to make it conditional, but how can we get here in
>> block_job_create()?
>
> block_job_set_speed is called from block_job_create.

Ah, right.

Max
04.11.2020 20:19, Max Reitz wrote:
> On 22.10.20 22:35, Vladimir Sementsov-Ogievskiy wrote:
>> 22.07.2020 15:22, Max Reitz wrote:
>>> On 01.06.20 20:11, Vladimir Sementsov-Ogievskiy wrote:
>>>> Add new parameters to configure future backup features. The patch
>>>> doesn't introduce aio backup requests (so we actually have only one
>>>> worker) neither requests larger than one cluster. Still, formally we
>>>> satisfy these maximums anyway, so add the parameters now, to facilitate
>>>> further patch which will really change backup job behavior.
>>>>
>>>> Options are added with x- prefixes, as the only use for them are some
>>>> very conservative iotests which will be updated soon.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>  qapi/block-core.json      |  9 ++++++++-
>>>>  include/block/block_int.h |  7 +++++++
>>>>  block/backup.c            | 21 +++++++++++++++++++++
>>>>  block/replication.c       |  2 +-
>>>>  blockdev.c                |  5 +++++
>>>>  5 files changed, 42 insertions(+), 2 deletions(-)
>>
>> [..]
>>
>>>> @@ -422,6 +436,11 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
>>>>      if (cluster_size < 0) {
>>>>          goto error;
>>>>      }
>>>> +    if (max_chunk && max_chunk < cluster_size) {
>>>> +        error_setg(errp, "Required max-chunk (%" PRIi64") is less than backup "
>>>
>>> (missing a space after PRIi64)
>>>
>>>> +                   "cluster size (%" PRIi64 ")", max_chunk, cluster_size);
>>>
>>> Should this be noted in the QAPI documentation?
>>
>> Hmm.. It makes sense, but I don't know what to write: should be >= job
>> cluster_size? But then I'll have to describe the logic of
>> backup_calculate_cluster_size()...
>
> Isn’t the logic basically just “cluster size of the target image file
> (min 64k)”? The other cases just cover error cases, and they always
> return 64k, which would effectively be documented by stating that 64k is
> the minimum.
>
> But in the common cases (qcow2 or raw), we’ll never get an error anyway.
> I think it’d be good to describe the cluster size somewhere, because
> otherwise I find it a bit malicious to throw an error at the user that
> they could not have anticipated because they have no idea what valid
> values for the parameter are (until they try).

OK

>>> (And perhaps the fact
>>> that without copy offloading, we’ll never copy anything bigger than one
>>> cluster at a time anyway?)
>>
>> This is a parameter for background copying. Look at
>> block_copy_task_create(), if call_state has own max_chunk, it's used
>> instead of common copy_size (derived from cluster_size). But at a moment
>> of this patch background process through async block-copy is not yet
>> implemented, and the parameter doesn't work, which is described in
>> commit message.
>
> Ah, OK. Right.
>
> Max

-- 
Best regards,
Vladimir