[for-5.0] hw/ppc/ppc440_uc.c: Remove incorrect iothread locking from dcr_write_pcie()

Message ID 20200330125228.24994-1-peter.maydell@linaro.org
State Superseded

Commit Message

Peter Maydell March 30, 2020, 12:52 p.m. UTC
In dcr_write_pcie() we take the iothread lock around a call to
pcie_host_mmcfg_update().  This is an incorrect attempt to deal with
the bug fixed in commit 235352ee6e73d7716, where we were not taking
the iothread lock before calling device dcr read/write functions.
(It's not sufficient locking, because although the other cases in the
switch statement won't assert, there is no locking which prevents
multiple guest CPUs from trying to access the PPC460EXPCIEState
struct at the same time and corrupting data.)

Unfortunately with commit 235352ee6e73d7716 we are now trying
to recursively take the iothread lock, which will assert:

  $ qemu-system-ppc -M sam460ex --display none
  **
  ERROR:/home/petmay01/linaro/qemu-from-laptop/qemu/cpus.c:1830:qemu_mutex_lock_iothread_impl: assertion failed: (!qemu_mutex_iothread_locked())
  Aborted (core dumped)

Remove the locking within dcr_write_pcie().

Fixes: 235352ee6e73d7716
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

---
I did a grep of hw/ppc and didn't see anything else that was doing
its own locking inside a dcr read/write fn.
---
 hw/ppc/ppc440_uc.c | 3 ---
 1 file changed, 3 deletions(-)

-- 
2.20.1
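
The failure mode described in the commit message can be sketched in isolation. This is only an illustration under stated assumptions, not QEMU's implementation: `big_lock_*`, `DemoState`, `demo_region_update()`, `demo_dcr_write()` and `demo_dispatch_write()` are hypothetical names standing in for the iothread lock, the device state, a helper that requires the lock, the device's dcr write function, and the generic dcr dispatch code respectively.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a non-recursive "big lock". */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local bool big_lock_held;   /* per-thread ownership flag */

static bool big_lock_locked(void)
{
    return big_lock_held;
}

static void big_lock_acquire(void)
{
    /* Taking the lock twice on the same thread is treated as a bug. */
    assert(!big_lock_locked());
    pthread_mutex_lock(&big_lock);
    big_lock_held = true;
}

static void big_lock_release(void)
{
    assert(big_lock_locked());
    big_lock_held = false;
    pthread_mutex_unlock(&big_lock);
}

/* Hypothetical device state. */
typedef struct {
    uint32_t cfg_mask;
    int updates;
} DemoState;

/* Stand-in for a helper that insists the caller already holds the lock. */
static void demo_region_update(DemoState *s)
{
    assert(big_lock_locked());
    s->updates++;
}

/*
 * Device write function: does no locking of its own. Calling
 * big_lock_acquire() here would trip the assertion above whenever the
 * dispatch layer has already taken the lock.
 */
static void demo_dcr_write(DemoState *s, uint32_t val)
{
    s->cfg_mask = val;
    demo_region_update(s);
}

/* Dispatch layer: takes the lock once, around the whole device access. */
static void demo_dispatch_write(DemoState *s, uint32_t val)
{
    big_lock_acquire();
    demo_dcr_write(s, val);
    big_lock_release();
}
```

With the lock taken only in `demo_dispatch_write()`, every field of the state is covered by the same critical section, and the helper's "lock held" assertion is satisfied without any recursive acquisition.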

Comments

BALATON Zoltan March 30, 2020, 1:17 p.m. UTC | #1
On Mon, 30 Mar 2020, Peter Maydell wrote:
> In dcr_write_pcie() we take the iothread lock around a call to
> pcie_host_mmcfg_update().  This is an incorrect attempt to deal with
> the bug fixed in commit 235352ee6e73d7716, where we were not taking
> the iothread lock before calling device dcr read/write functions.
> (It's not sufficient locking, because although the other cases in the
> switch statement won't assert, there is no locking which prevents
> multiple guest CPUs from trying to access the PPC460EXPCIEState
> struct at the same time and corrupting data.)

Even though there's only a single CPU on sam460ex and PCIe is mostly
unused, with this patch I could no longer reproduce a problem that we had
before, with some programs crashing within the guest under AmigaOS for an
unknown reason. That problem happened randomly (although I could reproduce
it before), so I'm not sure if this fixed it or something else did (more
likely commit 235352ee6e), or if it will just resurface later, but at least
this seems to work so

Tested-by: BALATON Zoltan <balaton@eik.bme.hu>


Thanks for fixing it.

> Unfortunately with commit 235352ee6e73d7716 we are now trying
> to recursively take the iothread lock, which will assert:
>
>  $ qemu-system-ppc -M sam460ex --display none
>  **
>  ERROR:/home/petmay01/linaro/qemu-from-laptop/qemu/cpus.c:1830:qemu_mutex_lock_iothread_impl: assertion failed: (!qemu_mutex_iothread_locked())
>  Aborted (core dumped)
>
> Remove the locking within dcr_write_pcie().
>
> Fixes: 235352ee6e73d7716
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> I did a grep of hw/ppc and didn't see anything else that was doing
> its own locking inside a dcr read/write fn.

I think we needed to add locking here because it asserted otherwise but I 
don't remember the details now.

Regards,
BALATON Zoltan

> ---
> hw/ppc/ppc440_uc.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
> index d5ea962249f..b30e093cbb0 100644
> --- a/hw/ppc/ppc440_uc.c
> +++ b/hw/ppc/ppc440_uc.c
> @@ -13,7 +13,6 @@
> #include "qemu/error-report.h"
> #include "qapi/error.h"
> #include "qemu/log.h"
> -#include "qemu/main-loop.h"
> #include "qemu/module.h"
> #include "cpu.h"
> #include "hw/irq.h"
> @@ -1183,9 +1182,7 @@ static void dcr_write_pcie(void *opaque, int dcrn, uint32_t val)
>     case PEGPL_CFGMSK:
>         s->cfg_mask = val;
>         size = ~(val & 0xfffffffe) + 1;
> -        qemu_mutex_lock_iothread();
>         pcie_host_mmcfg_update(PCIE_HOST_BRIDGE(s), val & 1, s->cfg_base, size);
> -        qemu_mutex_unlock_iothread();
>         break;
>     case PEGPL_MSGBAH:
>         s->msg_base = ((uint64_t)val << 32) | (s->msg_base & 0xffffffff);
>
Peter Maydell March 30, 2020, 1:24 p.m. UTC | #2
On Mon, 30 Mar 2020 at 14:17, BALATON Zoltan <balaton@eik.bme.hu> wrote:
>
> On Mon, 30 Mar 2020, Peter Maydell wrote:
> > In dcr_write_pcie() we take the iothread lock around a call to
> > pcie_host_mmcfg_update().  This is an incorrect attempt to deal with
> > the bug fixed in commit 235352ee6e73d7716, where we were not taking
> > the iothread lock before calling device dcr read/write functions.
> > (It's not sufficient locking, because although the other cases in the
> > switch statement won't assert, there is no locking which prevents
> > multiple guest CPUs from trying to access the PPC460EXPCIEState
> > struct at the same time and corrupting data.)
>
> Even though there's only a single CPU on sam460ex and PCIe is mostly
> unused, with this patch I could no longer reproduce a problem that we had
> before, with some programs crashing within the guest under AmigaOS for an
> unknown reason. That problem happened randomly (although I could reproduce
> it before), so I'm not sure if this fixed it or something else did (more
> likely commit 235352ee6e), or if it will just resurface later, but at least
> this seems to work so
>
> Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
>
> Thanks for fixing it.

Thanks for the testing. I'm not sure why a single-cpu setup
would have problems but I guess some device has a bottom-half or
timer callback that will run in the iothread context, in which
case it could race with the vcpu thread doing a dcr access. As
you say, probably 235352ee6e rather than this change that's fixed it,
assuming we really have fixed it.

> > Unfortunately with commit 235352ee6e73d7716 we are now trying
> > to recursively take the iothread lock, which will assert:
> >
> >  $ qemu-system-ppc -M sam460ex --display none
> >  **
> >  ERROR:/home/petmay01/linaro/qemu-from-laptop/qemu/cpus.c:1830:qemu_mutex_lock_iothread_impl: assertion failed: (!qemu_mutex_iothread_locked())
> >  Aborted (core dumped)
> >
> > Remove the locking within dcr_write_pcie().
> >
> > Fixes: 235352ee6e73d7716
> > Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> > ---
> > I did a grep of hw/ppc and didn't see anything else that was doing
> > its own locking inside a dcr read/write fn.
>
> I think we needed to add locking here because it asserted otherwise but I
> don't remember the details now.

Yeah, the memory-region adjustment done under the pcie_host_mmcfg_update()
function will assert that the iothread lock is held. The locking just
needs to be one level further up in the callstack.

thanks
-- PMM
David Gibson March 30, 2020, 11:41 p.m. UTC | #3
On Mon, Mar 30, 2020 at 01:52:28PM +0100, Peter Maydell wrote:
> In dcr_write_pcie() we take the iothread lock around a call to
> pcie_host_mmcfg_update().  This is an incorrect attempt to deal with
> the bug fixed in commit 235352ee6e73d7716, where we were not taking
> the iothread lock before calling device dcr read/write functions.
> (It's not sufficient locking, because although the other cases in the
> switch statement won't assert, there is no locking which prevents
> multiple guest CPUs from trying to access the PPC460EXPCIEState
> struct at the same time and corrupting data.)
>
> Unfortunately with commit 235352ee6e73d7716 we are now trying
> to recursively take the iothread lock, which will assert:
>
>   $ qemu-system-ppc -M sam460ex --display none
>   **
>   ERROR:/home/petmay01/linaro/qemu-from-laptop/qemu/cpus.c:1830:qemu_mutex_lock_iothread_impl: assertion failed: (!qemu_mutex_iothread_locked())
>   Aborted (core dumped)
>
> Remove the locking within dcr_write_pcie().
>
> Fixes: 235352ee6e73d7716
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> I did a grep of hw/ppc and didn't see anything else that was doing
> its own locking inside a dcr read/write fn.
> ---

Applied to ppc-for-5.0, thanks.

>  hw/ppc/ppc440_uc.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
> index d5ea962249f..b30e093cbb0 100644
> --- a/hw/ppc/ppc440_uc.c
> +++ b/hw/ppc/ppc440_uc.c
> @@ -13,7 +13,6 @@
>  #include "qemu/error-report.h"
>  #include "qapi/error.h"
>  #include "qemu/log.h"
> -#include "qemu/main-loop.h"
>  #include "qemu/module.h"
>  #include "cpu.h"
>  #include "hw/irq.h"
> @@ -1183,9 +1182,7 @@ static void dcr_write_pcie(void *opaque, int dcrn, uint32_t val)
>      case PEGPL_CFGMSK:
>          s->cfg_mask = val;
>          size = ~(val & 0xfffffffe) + 1;
> -        qemu_mutex_lock_iothread();
>          pcie_host_mmcfg_update(PCIE_HOST_BRIDGE(s), val & 1, s->cfg_base, size);
> -        qemu_mutex_unlock_iothread();
>          break;
>      case PEGPL_MSGBAH:
>          s->msg_base = ((uint64_t)val << 32) | (s->msg_base & 0xffffffff);

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
Patch

diff --git a/hw/ppc/ppc440_uc.c b/hw/ppc/ppc440_uc.c
index d5ea962249f..b30e093cbb0 100644
--- a/hw/ppc/ppc440_uc.c
+++ b/hw/ppc/ppc440_uc.c
@@ -13,7 +13,6 @@ 
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "qemu/log.h"
-#include "qemu/main-loop.h"
 #include "qemu/module.h"
 #include "cpu.h"
 #include "hw/irq.h"
@@ -1183,9 +1182,7 @@  static void dcr_write_pcie(void *opaque, int dcrn, uint32_t val)
     case PEGPL_CFGMSK:
         s->cfg_mask = val;
         size = ~(val & 0xfffffffe) + 1;
-        qemu_mutex_lock_iothread();
         pcie_host_mmcfg_update(PCIE_HOST_BRIDGE(s), val & 1, s->cfg_base, size);
-        qemu_mutex_unlock_iothread();
         break;
     case PEGPL_MSGBAH:
         s->msg_base = ((uint64_t)val << 32) | (s->msg_base & 0xffffffff);
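
For readers unfamiliar with the PEGPL_CFGMSK case touched by the hunk above, the arithmetic on `val` can be tried in isolation. This is a sketch only: `CfgWindow` and `decode_cfg_mask()` are hypothetical names, the 32-bit type of `size` is an assumption (the declaration is not visible in the hunk), and reading bit 0 as an enable flag is inferred from `val & 1` being passed separately to pcie_host_mmcfg_update().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the two values derived from the register. */
typedef struct {
    bool enable;     /* bit 0 of the written value */
    uint32_t size;   /* window size implied by the mask bits */
} CfgWindow;

/* Mirrors the two expressions used in the PEGPL_CFGMSK case. */
static CfgWindow decode_cfg_mask(uint32_t val)
{
    CfgWindow w;

    w.enable = val & 1;
    /* Clear bit 0, then negate (two's complement) to turn mask into size. */
    w.size = ~(val & 0xfffffffeu) + 1;
    return w;
}
```

For example, a written value of 0xe0000001 gives a mask of 0xe0000000, whose two's complement is 0x20000000 (512 MiB), with bit 0 set; 0xfff00000 gives a size of 0x00100000 (1 MiB) with bit 0 clear.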