Message ID | 20170403124524.10824-9-alex.bennee@linaro.org |
---|---|
State | Superseded |
Headers | show |
Series | MTTCG and record/replay fixes for rc3 | expand |
On 03/04/2017 14:45, Alex Bennée wrote: > Outside of the vCPU thread icount time will only be tracked against > timers_state.qemu_icount. We no longer credit cycles until they have > completed the run. Inside the vCPU thread we adjust for passage of > time by looking at how many have run so far. This is only valid inside > the vCPU thread while it is running. > > Signed-off-by: Alex Bennée <alex.bennee@linaro.org> I suspect icount_budget and icount_extra could be merged into one, but not for 2.9. For now this looks nice! Paolo > --- > cpus.c | 27 +++++++++++++++++++++------ > include/qom/cpu.h | 1 + > 2 files changed, 22 insertions(+), 6 deletions(-) > > diff --git a/cpus.c b/cpus.c > index 87638a75d2..3d18374b0e 100644 > --- a/cpus.c > +++ b/cpus.c > @@ -223,6 +223,15 @@ void qemu_tcg_configure(QemuOpts *opts, Error **errp) > } > } > > +/* The current number of executed instructions is based on what we > + * originally budgeted minus the current state of the decrementing > + * icount counters in extra/u16.low. > + */ > +static int64_t cpu_get_icount_executed(CPUState *cpu) > +{ > + return cpu->icount_budget - (cpu->icount_decr.u16.low + cpu->icount_extra); > +} > + > int64_t cpu_get_icount_raw(void) > { > int64_t icount; > @@ -234,7 +243,8 @@ int64_t cpu_get_icount_raw(void) > fprintf(stderr, "Bad icount read\n"); > exit(1); > } > - icount -= (cpu->icount_decr.u16.low + cpu->icount_extra); > + /* Take into account what has run */ > + icount += cpu_get_icount_executed(cpu); > } > return icount; > } > @@ -1195,7 +1205,10 @@ static void prepare_icount_for_run(CPUState *cpu) > > count = tcg_get_icount_limit(); > > - timers_state.qemu_icount += count; > + /* To calculate what we have executed so far we need to know > + * what we originally budgeted to run this cycle */ > + cpu->icount_budget = count; > + > decr = (count > 0xffff) ? 0xffff : count; > count -= decr; > cpu->icount_decr.u16.low = decr; > @@ -1206,16 +1219,18 @@ static void prepare_icount_for_run(CPUState *cpu) > static void process_icount_data(CPUState *cpu) > { > if (use_icount) { > - /* Fold pending instructions back into the > - instruction counter, and clear the interrupt flag. */ > - timers_state.qemu_icount -= (cpu->icount_decr.u16.low > - + cpu->icount_extra); > + /* Account for executed instructions */ > + timers_state.qemu_icount += cpu_get_icount_executed(cpu); > > /* We must be under BQL here as cpu_exit can tweak > icount_decr.u32 */ > g_assert(qemu_mutex_iothread_locked()); > + > + /* Reset the counters */ > cpu->icount_decr.u32 = 0; > cpu->icount_extra = 0; > + cpu->icount_budget = 0; > + > replay_account_executed_instructions(); > } > } > diff --git a/include/qom/cpu.h b/include/qom/cpu.h > index c3292efe1c..5d10359c8f 100644 > --- a/include/qom/cpu.h > +++ b/include/qom/cpu.h > @@ -332,6 +332,7 @@ struct CPUState { > /* updates protected by BQL */ > uint32_t interrupt_request; > int singlestep_enabled; > + int64_t icount_budget; > int64_t icount_extra; > sigjmp_buf jmp_env; > >
> From: Alex Bennée [mailto:alex.bennee@linaro.org] > Outside of the vCPU thread icount time will only be tracked against > timers_state.qemu_icount. We no longer credit cycles until they have > completed the run. Inside the vCPU thread we adjust for passage of > time by looking at how many have run so far. This is only valid inside > the vCPU thread while it is running. > > Signed-off-by: Alex Bennée <alex.bennee@linaro.org> > --- > cpus.c | 27 +++++++++++++++++++++------ > include/qom/cpu.h | 1 + > 2 files changed, 22 insertions(+), 6 deletions(-) > > diff --git a/cpus.c b/cpus.c > index 87638a75d2..3d18374b0e 100644 > --- a/cpus.c > +++ b/cpus.c > @@ -223,6 +223,15 @@ void qemu_tcg_configure(QemuOpts *opts, Error **errp) > } > } > > +/* The current number of executed instructions is based on what we > + * originally budgeted minus the current state of the decrementing > + * icount counters in extra/u16.low. > + */ > +static int64_t cpu_get_icount_executed(CPUState *cpu) > +{ > + return cpu->icount_budget - (cpu->icount_decr.u16.low + cpu->icount_extra); > +} > + > int64_t cpu_get_icount_raw(void) > { > int64_t icount; > @@ -234,7 +243,8 @@ int64_t cpu_get_icount_raw(void) > fprintf(stderr, "Bad icount read\n"); > exit(1); > } > - icount -= (cpu->icount_decr.u16.low + cpu->icount_extra); > + /* Take into account what has run */ > + icount += cpu_get_icount_executed(cpu); > } > return icount; As far, as I understand, this one will return the same value in iothread until vCPU thread finishes cpu_exec? This value will not jump forward and backward, but still will not allow making execution deterministic. Consider the following scenarios: First: vCPU iothread access HW ---- ... access HW in timer Second: vCPU iothread ... access HW in timer access HW ---- These scenarios will generate the same order of events in the log. Synchronization checkpoint in iothread will try to write already executed instructions, but it does not have access to current_cpu and the icount value will point to the "past" - it will have less instructions than already executed. That is why you see "negative" instruction count event. > } > @@ -1195,7 +1205,10 @@ static void prepare_icount_for_run(CPUState *cpu) > > count = tcg_get_icount_limit(); > > - timers_state.qemu_icount += count; > + /* To calculate what we have executed so far we need to know > + * what we originally budgeted to run this cycle */ > + cpu->icount_budget = count; > + > decr = (count > 0xffff) ? 0xffff : count; > count -= decr; > cpu->icount_decr.u16.low = decr; > @@ -1206,16 +1219,18 @@ static void prepare_icount_for_run(CPUState *cpu) > static void process_icount_data(CPUState *cpu) > { > if (use_icount) { > - /* Fold pending instructions back into the > - instruction counter, and clear the interrupt flag. */ > - timers_state.qemu_icount -= (cpu->icount_decr.u16.low > - + cpu->icount_extra); > + /* Account for executed instructions */ > + timers_state.qemu_icount += cpu_get_icount_executed(cpu); > > /* We must be under BQL here as cpu_exit can tweak > icount_decr.u32 */ > g_assert(qemu_mutex_iothread_locked()); > + > + /* Reset the counters */ > cpu->icount_decr.u32 = 0; > cpu->icount_extra = 0; > + cpu->icount_budget = 0; > + > replay_account_executed_instructions(); > } > } > diff --git a/include/qom/cpu.h b/include/qom/cpu.h > index c3292efe1c..5d10359c8f 100644 > --- a/include/qom/cpu.h > +++ b/include/qom/cpu.h > @@ -332,6 +332,7 @@ struct CPUState { > /* updates protected by BQL */ > uint32_t interrupt_request; > int singlestep_enabled; > + int64_t icount_budget; > int64_t icount_extra; > sigjmp_buf jmp_env; > > -- > 2.11.0 Pavel Dovgalyuk
On 04/04/2017 07:37, Pavel Dovgalyuk wrote: >> - icount -= (cpu->icount_decr.u16.low + cpu->icount_extra); >> + /* Take into account what has run */ >> + icount += cpu_get_icount_executed(cpu); >> } >> return icount; > As far, as I understand, this one will return the same value in iothread > until vCPU thread finishes cpu_exec? > This value will not jump forward and backward, but still will not allow > making execution deterministic. > > Consider the following scenarios: > > First: > vCPU iothread > access HW ---- > ... access HW in timer > > Second: > vCPU iothread > ... access HW in timer > access HW ---- > > These scenarios will generate the same order of events in the log. > Synchronization checkpoint in iothread will try to write already > executed instructions, but it does not have access to current_cpu > and the icount value will point to the "past" - it will have less > instructions than already executed. The actual access should be covered by a lock, but I think you're right that the two threads can be nondeterministically off by one instruction, even if we make gen_io_start update timers_state.qemu_icount atomically. Paolo
On 03/04/2017 14:45, Alex Bennée wrote: > Outside of the vCPU thread icount time will only be tracked against > timers_state.qemu_icount. We no longer credit cycles until they have > completed the run. Inside the vCPU thread we adjust for passage of > time by looking at how many have run so far. This is only valid inside > the vCPU thread while it is running. > > Signed-off-by: Alex Bennée <alex.bennee@linaro.org> > --- > cpus.c | 27 +++++++++++++++++++++------ > include/qom/cpu.h | 1 + > 2 files changed, 22 insertions(+), 6 deletions(-) > > diff --git a/cpus.c b/cpus.c > index 87638a75d2..3d18374b0e 100644 > --- a/cpus.c > +++ b/cpus.c > @@ -223,6 +223,15 @@ void qemu_tcg_configure(QemuOpts *opts, Error **errp) > } > } > > +/* The current number of executed instructions is based on what we > + * originally budgeted minus the current state of the decrementing > + * icount counters in extra/u16.low. > + */ > +static int64_t cpu_get_icount_executed(CPUState *cpu) > +{ > + return cpu->icount_budget - (cpu->icount_decr.u16.low + cpu->icount_extra); > +} > + > int64_t cpu_get_icount_raw(void) > { > int64_t icount; > @@ -234,7 +243,8 @@ int64_t cpu_get_icount_raw(void) > fprintf(stderr, "Bad icount read\n"); > exit(1); > } > - icount -= (cpu->icount_decr.u16.low + cpu->icount_extra); > + /* Take into account what has run */ > + icount += cpu_get_icount_executed(cpu); > } > return icount; > } > @@ -1195,7 +1205,10 @@ static void prepare_icount_for_run(CPUState *cpu) > > count = tcg_get_icount_limit(); > > - timers_state.qemu_icount += count; > + /* To calculate what we have executed so far we need to know > + * what we originally budgeted to run this cycle */ > + cpu->icount_budget = count; > + > decr = (count > 0xffff) ? 0xffff : count; > count -= decr; > cpu->icount_decr.u16.low = decr; > @@ -1206,16 +1219,18 @@ static void prepare_icount_for_run(CPUState *cpu) > static void process_icount_data(CPUState *cpu) > { > if (use_icount) { > - /* Fold pending instructions back into the > - instruction counter, and clear the interrupt flag. */ > - timers_state.qemu_icount -= (cpu->icount_decr.u16.low > - + cpu->icount_extra); > + /* Account for executed instructions */ > + timers_state.qemu_icount += cpu_get_icount_executed(cpu); > > /* We must be under BQL here as cpu_exit can tweak > icount_decr.u32 */ > g_assert(qemu_mutex_iothread_locked()); > + > + /* Reset the counters */ > cpu->icount_decr.u32 = 0; If you only reset u16.low, then the assertion can be removed. After all prepare_icount_for_run is also checking for u16.low to be zero. Thanks, Paolo > cpu->icount_extra = 0; > + cpu->icount_budget = 0; > + > replay_account_executed_instructions(); > } > } > diff --git a/include/qom/cpu.h b/include/qom/cpu.h > index c3292efe1c..5d10359c8f 100644 > --- a/include/qom/cpu.h > +++ b/include/qom/cpu.h > @@ -332,6 +332,7 @@ struct CPUState { > /* updates protected by BQL */ > uint32_t interrupt_request; > int singlestep_enabled; > + int64_t icount_budget; > int64_t icount_extra; > sigjmp_buf jmp_env; > >
> From: mttcg-request@listserver.greensocs.com [mailto:mttcg-request@listserver.greensocs.com] > On 04/04/2017 07:37, Pavel Dovgalyuk wrote: > >> - icount -= (cpu->icount_decr.u16.low + cpu->icount_extra); > >> + /* Take into account what has run */ > >> + icount += cpu_get_icount_executed(cpu); > >> } > >> return icount; > > As far, as I understand, this one will return the same value in iothread > > until vCPU thread finishes cpu_exec? > > This value will not jump forward and backward, but still will not allow > > making execution deterministic. > > > > Consider the following scenarios: > > > > First: > > vCPU iothread > > access HW ---- > > ... access HW in timer > > > > Second: > > vCPU iothread > > ... access HW in timer > > access HW ---- > > > > These scenarios will generate the same order of events in the log. > > Synchronization checkpoint in iothread will try to write already > > executed instructions, but it does not have access to current_cpu > > and the icount value will point to the "past" - it will have less > > instructions than already executed. > > The actual access should be covered by a lock, but I think you're right > that the two threads can be nondeterministically off by one instruction, > even if we make gen_io_start update timers_state.qemu_icount atomically. Yes. The actual problem is that icount is updated once for the whole TB. But TB execution is not atomic and machine state is different in different parts of TB. Therefore iothread may expose different behavior depending on moment when it is activated. Pavel Dovgalyuk
diff --git a/cpus.c b/cpus.c index 87638a75d2..3d18374b0e 100644 --- a/cpus.c +++ b/cpus.c @@ -223,6 +223,15 @@ void qemu_tcg_configure(QemuOpts *opts, Error **errp) } } +/* The current number of executed instructions is based on what we + * originally budgeted minus the current state of the decrementing + * icount counters in extra/u16.low. + */ +static int64_t cpu_get_icount_executed(CPUState *cpu) +{ + return cpu->icount_budget - (cpu->icount_decr.u16.low + cpu->icount_extra); +} + int64_t cpu_get_icount_raw(void) { int64_t icount; @@ -234,7 +243,8 @@ int64_t cpu_get_icount_raw(void) fprintf(stderr, "Bad icount read\n"); exit(1); } - icount -= (cpu->icount_decr.u16.low + cpu->icount_extra); + /* Take into account what has run */ + icount += cpu_get_icount_executed(cpu); } return icount; } @@ -1195,7 +1205,10 @@ static void prepare_icount_for_run(CPUState *cpu) count = tcg_get_icount_limit(); - timers_state.qemu_icount += count; + /* To calculate what we have executed so far we need to know + * what we originally budgeted to run this cycle */ + cpu->icount_budget = count; + decr = (count > 0xffff) ? 0xffff : count; count -= decr; cpu->icount_decr.u16.low = decr; @@ -1206,16 +1219,18 @@ static void prepare_icount_for_run(CPUState *cpu) static void process_icount_data(CPUState *cpu) { if (use_icount) { - /* Fold pending instructions back into the - instruction counter, and clear the interrupt flag. */ - timers_state.qemu_icount -= (cpu->icount_decr.u16.low - + cpu->icount_extra); + /* Account for executed instructions */ + timers_state.qemu_icount += cpu_get_icount_executed(cpu); /* We must be under BQL here as cpu_exit can tweak icount_decr.u32 */ g_assert(qemu_mutex_iothread_locked()); + + /* Reset the counters */ cpu->icount_decr.u32 = 0; cpu->icount_extra = 0; + cpu->icount_budget = 0; + replay_account_executed_instructions(); } } diff --git a/include/qom/cpu.h b/include/qom/cpu.h index c3292efe1c..5d10359c8f 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -332,6 +332,7 @@ struct CPUState { /* updates protected by BQL */ uint32_t interrupt_request; int singlestep_enabled; + int64_t icount_budget; int64_t icount_extra; sigjmp_buf jmp_env;
Outside of the vCPU thread icount time will only be tracked against timers_state.qemu_icount. We no longer credit cycles until they have completed the run. Inside the vCPU thread we adjust for passage of time by looking at how many have run so far. This is only valid inside the vCPU thread while it is running. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> --- cpus.c | 27 +++++++++++++++++++++------ include/qom/cpu.h | 1 + 2 files changed, 22 insertions(+), 6 deletions(-) -- 2.11.0