mbox series

[v2,00/11] MTTCG fixups for 2.9

Message ID 20170302195337.31558-1-alex.bennee@linaro.org
Headers show
Series MTTCG fixups for 2.9 | expand

Message

Alex Bennée March 2, 2017, 7:53 p.m. UTC
Hi Peter,

So perhaps predictably the merging of MTTCG has thrown up some issues
during soft-freeze. The following patches fix up some of those:

The following where in v1 and just have r-b tags:

  vl/cpus: be smarter with icount and MTTCG
  target/i386/cpu.h: declare TCG_GUEST_DEFAULT_MO
  cpus.c: add additional error_report when !TARGET_SUPPORT_MTTCG

The next change downgrades the asserts to tcg_debug_asserts. This will
reduce the instances of production builds failing on non-MTTCG enabled
guests. The races still need to be fixed but you now have to
--enable-debug-tcg to go hunting for them:

  translate: downgrade IRQ BQL asserts to tcg_debug_assert

This never came up in my testing but it seems some guests can
translate while doing a tlb_fill. The call to cpu_restore_state never
worked before (as we are translating the block you are looking for)
but by bugging out we avoid the double tb_lock().

  translate-all: exit cpu_restore_state early if translating

Then we have a bunch of fixes from the reports on the list. They are
safe to merge but I'll leave that up to the maintainers. There may be
an argument for holding these patches back until the guest is
converted to be MTTCG aware and a full audit of the CPU<->HW emulation
boundaries is done for each one:

  sparc/sparc64: grab BQL before calling cpu_check_irqs
  s390x/misc_helper.c: wrap IO instructions in BQL
  target/xtensa: hold BQL for interrupt processing
  target/mips/op_helper: hold BQL before calling cpu_mips_get_count

Finally these are just tiny debug fixes while investigating the icount
regression:

  target/arm/helper: make it clear the EC field is also in hex
  hw/intc/arm_gic: modernise the DPRINTF

Going forward I think 1-5 are ready to be merged now. For 6-9 we
should await decisions from the target maintainers. The last two are
entirely up to you.

It looks like the -icount regression is a poor interaction between
icount timers and the BQL but this is going to require discussion with
Paolo in the morning.

Cheers,


Alex Bennée (11):
  vl/cpus: be smarter with icount and MTTCG
  target/i386/cpu.h: declare TCG_GUEST_DEFAULT_MO
  cpus.c: add additional error_report when !TARGET_SUPPORT_MTTCG
  translate: downgrade IRQ BQL asserts to tcg_debug_assert
  translate-all: exit cpu_restore_state early if translating
  sparc/sparc64: grab BQL before calling cpu_check_irqs
  s390x/misc_helper.c: wrap IO instructions in BQL
  target/xtensa: hold BQL for interrupt processing
  target/mips/op_helper: hold BQL before calling cpu_mips_get_count
  target/arm/helper: make it clear the EC field is also in hex
  hw/intc/arm_gic: modernise the DPRINTF

 cpus.c                      | 11 +++++++----
 hw/intc/arm_gic.c           | 13 +++++++++----
 hw/sparc/sun4m.c            |  3 +++
 hw/sparc64/sparc64.c        |  3 +++
 target/arm/helper.c         |  2 +-
 target/i386/cpu.h           |  3 +++
 target/mips/op_helper.c     | 17 ++++++++++++++---
 target/s390x/misc_helper.c  | 21 +++++++++++++++++++++
 target/sparc/int64_helper.c |  3 +++
 target/sparc/win_helper.c   | 11 +++++++++++
 target/xtensa/helper.c      |  1 +
 target/xtensa/op_helper.c   |  7 +++++++
 translate-all.c             |  9 ++++++++-
 translate-common.c          |  3 ++-
 vl.c                        |  7 ++-----
 15 files changed, 95 insertions(+), 19 deletions(-)

-- 
2.11.0

Comments

Frederic Konrad March 3, 2017, 5:38 p.m. UTC | #1
Hi All,

I've a strangeness with the "drop iolock" patch.
It seems it has an impact on the MMIO execution patch-set I'm working
on:

Basically modifying the memory tree is causing a Bad Ram Address error.
I wonder if this can't happen with hotplug/unplug model as well.

I'll look into this.
Shoot if you have any evidence of why that doesn't work :).

Thanks,
Fred

On 03/02/2017 08:53 PM, Alex Bennée wrote:
> Hi Peter,

> 

> So perhaps predictably the merging of MTTCG has thrown up some issues

> during soft-freeze. The following patches fix up some of those:

> 

> The following where in v1 and just have r-b tags:

> 

>   vl/cpus: be smarter with icount and MTTCG

>   target/i386/cpu.h: declare TCG_GUEST_DEFAULT_MO

>   cpus.c: add additional error_report when !TARGET_SUPPORT_MTTCG

> 

> The next change downgrades the asserts to tcg_debug_asserts. This will

> reduce the instances of production builds failing on non-MTTCG enabled

> guests. The races still need to be fixed but you now have to

> --enable-debug-tcg to go hunting for them:

> 

>   translate: downgrade IRQ BQL asserts to tcg_debug_assert

> 

> This never came up in my testing but it seems some guests can

> translate while doing a tlb_fill. The call to cpu_restore_state never

> worked before (as we are translating the block you are looking for)

> but by bugging out we avoid the double tb_lock().

> 

>   translate-all: exit cpu_restore_state early if translating

> 

> Then we have a bunch of fixes from the reports on the list. They are

> safe to merge but I'll leave that up to the maintainers. There may be

> an argument for holding these patches back until the guest is

> converted to be MTTCG aware and a full audit of the CPU<->HW emulation

> boundaries is done for each one:

> 

>   sparc/sparc64: grab BQL before calling cpu_check_irqs

>   s390x/misc_helper.c: wrap IO instructions in BQL

>   target/xtensa: hold BQL for interrupt processing

>   target/mips/op_helper: hold BQL before calling cpu_mips_get_count

> 

> Finally these are just tiny debug fixes while investigating the icount

> regression:

> 

>   target/arm/helper: make it clear the EC field is also in hex

>   hw/intc/arm_gic: modernise the DPRINTF

> 

> Going forward I think 1-5 are ready to be merged now. For 6-9 we

> should await decisions from the target maintainers. The last two are

> entirely up to you.

> 

> It looks like the -icount regression is a poor interaction between

> icount timers and the BQL but this is going to require discussion with

> Paolo in the morning.

> 

> Cheers,

> 

> 

> Alex Bennée (11):

>   vl/cpus: be smarter with icount and MTTCG

>   target/i386/cpu.h: declare TCG_GUEST_DEFAULT_MO

>   cpus.c: add additional error_report when !TARGET_SUPPORT_MTTCG

>   translate: downgrade IRQ BQL asserts to tcg_debug_assert

>   translate-all: exit cpu_restore_state early if translating

>   sparc/sparc64: grab BQL before calling cpu_check_irqs

>   s390x/misc_helper.c: wrap IO instructions in BQL

>   target/xtensa: hold BQL for interrupt processing

>   target/mips/op_helper: hold BQL before calling cpu_mips_get_count

>   target/arm/helper: make it clear the EC field is also in hex

>   hw/intc/arm_gic: modernise the DPRINTF

> 

>  cpus.c                      | 11 +++++++----

>  hw/intc/arm_gic.c           | 13 +++++++++----

>  hw/sparc/sun4m.c            |  3 +++

>  hw/sparc64/sparc64.c        |  3 +++

>  target/arm/helper.c         |  2 +-

>  target/i386/cpu.h           |  3 +++

>  target/mips/op_helper.c     | 17 ++++++++++++++---

>  target/s390x/misc_helper.c  | 21 +++++++++++++++++++++

>  target/sparc/int64_helper.c |  3 +++

>  target/sparc/win_helper.c   | 11 +++++++++++

>  target/xtensa/helper.c      |  1 +

>  target/xtensa/op_helper.c   |  7 +++++++

>  translate-all.c             |  9 ++++++++-

>  translate-common.c          |  3 ++-

>  vl.c                        |  7 ++-----

>  15 files changed, 95 insertions(+), 19 deletions(-)

>
Alex Bennée March 6, 2017, 9:43 a.m. UTC | #2
Frederic Konrad <fred.konrad@greensocs.com> writes:

> Hi All,

>

> I've a strangeness with the "drop iolock" patch.

> It seems it has an impact on the MMIO execution patch-set I'm working

> on:


Hmm the only real change is the BQL (implicit or explict) being taken on
entry to the mmio functions.

> Basically modifying the memory tree is causing a Bad Ram Address error.

> I wonder if this can't happen with hotplug/unplug model as well.


Isn't this all done under an RCU? I don't think any of this should have
changed for the MTTCG patch series.

> I'll look into this.

> Shoot if you have any evidence of why that doesn't work :).


--enable-tcg-debug will turn on more lock checking (at a performance
  hit). You could try that.

>

> Thanks,

> Fred

>

> On 03/02/2017 08:53 PM, Alex Bennée wrote:

>> Hi Peter,

>>

>> So perhaps predictably the merging of MTTCG has thrown up some issues

>> during soft-freeze. The following patches fix up some of those:

>>

>> The following where in v1 and just have r-b tags:

>>

>>   vl/cpus: be smarter with icount and MTTCG

>>   target/i386/cpu.h: declare TCG_GUEST_DEFAULT_MO

>>   cpus.c: add additional error_report when !TARGET_SUPPORT_MTTCG

>>

>> The next change downgrades the asserts to tcg_debug_asserts. This will

>> reduce the instances of production builds failing on non-MTTCG enabled

>> guests. The races still need to be fixed but you now have to

>> --enable-debug-tcg to go hunting for them:

>>

>>   translate: downgrade IRQ BQL asserts to tcg_debug_assert

>>

>> This never came up in my testing but it seems some guests can

>> translate while doing a tlb_fill. The call to cpu_restore_state never

>> worked before (as we are translating the block you are looking for)

>> but by bugging out we avoid the double tb_lock().

>>

>>   translate-all: exit cpu_restore_state early if translating

>>

>> Then we have a bunch of fixes from the reports on the list. They are

>> safe to merge but I'll leave that up to the maintainers. There may be

>> an argument for holding these patches back until the guest is

>> converted to be MTTCG aware and a full audit of the CPU<->HW emulation

>> boundaries is done for each one:

>>

>>   sparc/sparc64: grab BQL before calling cpu_check_irqs

>>   s390x/misc_helper.c: wrap IO instructions in BQL

>>   target/xtensa: hold BQL for interrupt processing

>>   target/mips/op_helper: hold BQL before calling cpu_mips_get_count

>>

>> Finally these are just tiny debug fixes while investigating the icount

>> regression:

>>

>>   target/arm/helper: make it clear the EC field is also in hex

>>   hw/intc/arm_gic: modernise the DPRINTF

>>

>> Going forward I think 1-5 are ready to be merged now. For 6-9 we

>> should await decisions from the target maintainers. The last two are

>> entirely up to you.

>>

>> It looks like the -icount regression is a poor interaction between

>> icount timers and the BQL but this is going to require discussion with

>> Paolo in the morning.

>>

>> Cheers,

>>

>>

>> Alex Bennée (11):

>>   vl/cpus: be smarter with icount and MTTCG

>>   target/i386/cpu.h: declare TCG_GUEST_DEFAULT_MO

>>   cpus.c: add additional error_report when !TARGET_SUPPORT_MTTCG

>>   translate: downgrade IRQ BQL asserts to tcg_debug_assert

>>   translate-all: exit cpu_restore_state early if translating

>>   sparc/sparc64: grab BQL before calling cpu_check_irqs

>>   s390x/misc_helper.c: wrap IO instructions in BQL

>>   target/xtensa: hold BQL for interrupt processing

>>   target/mips/op_helper: hold BQL before calling cpu_mips_get_count

>>   target/arm/helper: make it clear the EC field is also in hex

>>   hw/intc/arm_gic: modernise the DPRINTF

>>

>>  cpus.c                      | 11 +++++++----

>>  hw/intc/arm_gic.c           | 13 +++++++++----

>>  hw/sparc/sun4m.c            |  3 +++

>>  hw/sparc64/sparc64.c        |  3 +++

>>  target/arm/helper.c         |  2 +-

>>  target/i386/cpu.h           |  3 +++

>>  target/mips/op_helper.c     | 17 ++++++++++++++---

>>  target/s390x/misc_helper.c  | 21 +++++++++++++++++++++

>>  target/sparc/int64_helper.c |  3 +++

>>  target/sparc/win_helper.c   | 11 +++++++++++

>>  target/xtensa/helper.c      |  1 +

>>  target/xtensa/op_helper.c   |  7 +++++++

>>  translate-all.c             |  9 ++++++++-

>>  translate-common.c          |  3 ++-

>>  vl.c                        |  7 ++-----

>>  15 files changed, 95 insertions(+), 19 deletions(-)

>>



--
Alex Bennée
Frederic Konrad March 6, 2017, 10:45 a.m. UTC | #3
On 03/06/2017 10:43 AM, Alex Bennée wrote:
> 

> Frederic Konrad <fred.konrad@greensocs.com> writes:

> 

>> Hi All,

>>

>> I've a strangeness with the "drop iolock" patch.

>> It seems it has an impact on the MMIO execution patch-set I'm working

>> on:

> 

> Hmm the only real change is the BQL (implicit or explict) being taken on

> entry to the mmio functions.


Yes maybe it just trigger a bug more easily..

Fred

> 

>> Basically modifying the memory tree is causing a Bad Ram Address error.

>> I wonder if this can't happen with hotplug/unplug model as well.

> 

> Isn't this all done under an RCU? I don't think any of this should have

> changed for the MTTCG patch series.

> 

>> I'll look into this.

>> Shoot if you have any evidence of why that doesn't work :).

> 

> --enable-tcg-debug will turn on more lock checking (at a performance

>   hit). You could try that.

> 

>>

>> Thanks,

>> Fred

>>

>> On 03/02/2017 08:53 PM, Alex Bennée wrote:

>>> Hi Peter,

>>>

>>> So perhaps predictably the merging of MTTCG has thrown up some issues

>>> during soft-freeze. The following patches fix up some of those:

>>>

>>> The following where in v1 and just have r-b tags:

>>>

>>>   vl/cpus: be smarter with icount and MTTCG

>>>   target/i386/cpu.h: declare TCG_GUEST_DEFAULT_MO

>>>   cpus.c: add additional error_report when !TARGET_SUPPORT_MTTCG

>>>

>>> The next change downgrades the asserts to tcg_debug_asserts. This will

>>> reduce the instances of production builds failing on non-MTTCG enabled

>>> guests. The races still need to be fixed but you now have to

>>> --enable-debug-tcg to go hunting for them:

>>>

>>>   translate: downgrade IRQ BQL asserts to tcg_debug_assert

>>>

>>> This never came up in my testing but it seems some guests can

>>> translate while doing a tlb_fill. The call to cpu_restore_state never

>>> worked before (as we are translating the block you are looking for)

>>> but by bugging out we avoid the double tb_lock().

>>>

>>>   translate-all: exit cpu_restore_state early if translating

>>>

>>> Then we have a bunch of fixes from the reports on the list. They are

>>> safe to merge but I'll leave that up to the maintainers. There may be

>>> an argument for holding these patches back until the guest is

>>> converted to be MTTCG aware and a full audit of the CPU<->HW emulation

>>> boundaries is done for each one:

>>>

>>>   sparc/sparc64: grab BQL before calling cpu_check_irqs

>>>   s390x/misc_helper.c: wrap IO instructions in BQL

>>>   target/xtensa: hold BQL for interrupt processing

>>>   target/mips/op_helper: hold BQL before calling cpu_mips_get_count

>>>

>>> Finally these are just tiny debug fixes while investigating the icount

>>> regression:

>>>

>>>   target/arm/helper: make it clear the EC field is also in hex

>>>   hw/intc/arm_gic: modernise the DPRINTF

>>>

>>> Going forward I think 1-5 are ready to be merged now. For 6-9 we

>>> should await decisions from the target maintainers. The last two are

>>> entirely up to you.

>>>

>>> It looks like the -icount regression is a poor interaction between

>>> icount timers and the BQL but this is going to require discussion with

>>> Paolo in the morning.

>>>

>>> Cheers,

>>>

>>>

>>> Alex Bennée (11):

>>>   vl/cpus: be smarter with icount and MTTCG

>>>   target/i386/cpu.h: declare TCG_GUEST_DEFAULT_MO

>>>   cpus.c: add additional error_report when !TARGET_SUPPORT_MTTCG

>>>   translate: downgrade IRQ BQL asserts to tcg_debug_assert

>>>   translate-all: exit cpu_restore_state early if translating

>>>   sparc/sparc64: grab BQL before calling cpu_check_irqs

>>>   s390x/misc_helper.c: wrap IO instructions in BQL

>>>   target/xtensa: hold BQL for interrupt processing

>>>   target/mips/op_helper: hold BQL before calling cpu_mips_get_count

>>>   target/arm/helper: make it clear the EC field is also in hex

>>>   hw/intc/arm_gic: modernise the DPRINTF

>>>

>>>  cpus.c                      | 11 +++++++----

>>>  hw/intc/arm_gic.c           | 13 +++++++++----

>>>  hw/sparc/sun4m.c            |  3 +++

>>>  hw/sparc64/sparc64.c        |  3 +++

>>>  target/arm/helper.c         |  2 +-

>>>  target/i386/cpu.h           |  3 +++

>>>  target/mips/op_helper.c     | 17 ++++++++++++++---

>>>  target/s390x/misc_helper.c  | 21 +++++++++++++++++++++

>>>  target/sparc/int64_helper.c |  3 +++

>>>  target/sparc/win_helper.c   | 11 +++++++++++

>>>  target/xtensa/helper.c      |  1 +

>>>  target/xtensa/op_helper.c   |  7 +++++++

>>>  translate-all.c             |  9 ++++++++-

>>>  translate-common.c          |  3 ++-

>>>  vl.c                        |  7 ++-----

>>>  15 files changed, 95 insertions(+), 19 deletions(-)

>>>

> 

> 

> --

> Alex Bennée

>