diff mbox

[RFC] target-arm/psci.c: wake up sleeping CPUs (MTTCG)

Message ID 1435160084-938-1-git-send-email-alex.bennee@linaro.org
State New
Headers show

Commit Message

Alex Bennée June 24, 2015, 3:34 p.m. UTC
Testing with Alexander's bare metal syncronisation tests fails in MTTCG
leaving one CPU spinning forever waiting for the second CPU to wake up.
We simply need to poke the halt_cond once we have processed the PSCI
power on call.

Tested-by: Alex Bennée <alex.bennee@linaro.org>
CC: Alexander Spyridakis <a.spyridakis@virtualopensystems.com>

---
TODO
  - exactly how does the vexpress wake up it's sleeping CPUs?
---
 target-arm/psci.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Alex Bennée June 24, 2015, 5:18 p.m. UTC | #1
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 24/06/2015 17:34, Alex Bennée wrote:
>> Testing with Alexander's bare metal syncronisation tests fails in MTTCG
>> leaving one CPU spinning forever waiting for the second CPU to wake up.
>> We simply need to poke the halt_cond once we have processed the PSCI
>> power on call.
>> 
>> Tested-by: Alex Bennée <alex.bennee@linaro.org>
>> CC: Alexander Spyridakis <a.spyridakis@virtualopensystems.com>
>> 
>> ---
>> TODO
>>   - exactly how does the vexpress wake up it's sleeping CPUs?
>> ---
>>  target-arm/psci.c | 2 ++
>>  1 file changed, 2 insertions(+)
>> 
>> diff --git a/target-arm/psci.c b/target-arm/psci.c
>> index d8fafab..661ff28 100644
>> --- a/target-arm/psci.c
>> +++ b/target-arm/psci.c
>> @@ -196,6 +196,8 @@ void arm_handle_psci_call(ARMCPU *cpu)
>>          }
>>          target_cpu_class->set_pc(target_cpu_state, entry);
>>  
>> +        qemu_cond_signal(target_cpu_state->halt_cond);
>
> That's called qemu_cpu_kick(target_cpu_state). :)  The patch should be
> acceptable now upstream, I think.

Oh so this might well fail in KVM too?

The qemu_cpu_kick does a qemu_cond_broadcast(cpu->halt_cond) which seems
a little excessive? Won't all sleeping CPUs wake up (and return to sleep)?

>
> Paolo
>
>>          ret = 0;
>>          break;
>>      case QEMU_PSCI_0_1_FN_CPU_OFF:
>>
Alex Bennée June 24, 2015, 6:15 p.m. UTC | #2
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 24/06/2015 19:18, Alex Bennée wrote:
>>>> >> @@ -196,6 +196,8 @@ void arm_handle_psci_call(ARMCPU *cpu)
>>>> >>          }
>>>> >>          target_cpu_class->set_pc(target_cpu_state, entry);
>>>> >>  
>>>> >> +        qemu_cond_signal(target_cpu_state->halt_cond);
>>> >
>>> > That's called qemu_cpu_kick(target_cpu_state). :)  The patch should be
>>> > acceptable now upstream, I think.
>> Oh so this might well fail in KVM too?
>> 
>> The qemu_cpu_kick does a qemu_cond_broadcast(cpu->halt_cond) which seems
>> a little excessive? Won't all sleeping CPUs wake up (and return to sleep)?
>
> On KVM (and I assume on MT-TCG), each CPU has a different halt_cond.

You are right of course, I got my sense the wrong way around. Given it
is per-cpu I wonder if you will ever have multiple threads waiting on
it?

Anyway I'll fix that up and re-submit after I've tested to see of these
test cases break current KVM.

>
> Paolo
Peter Maydell June 24, 2015, 7:12 p.m. UTC | #3
On 24 June 2015 at 18:18, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Paolo Bonzini <pbonzini@redhat.com> writes:
>
>> On 24/06/2015 17:34, Alex Bennée wrote:
>>> Testing with Alexander's bare metal syncronisation tests fails in MTTCG
>>> leaving one CPU spinning forever waiting for the second CPU to wake up.
>>> We simply need to poke the halt_cond once we have processed the PSCI
>>> power on call.
>>>
>>> Tested-by: Alex Bennée <alex.bennee@linaro.org>
>>> CC: Alexander Spyridakis <a.spyridakis@virtualopensystems.com>
>>>
>>> ---
>>> TODO
>>>   - exactly how does the vexpress wake up it's sleeping CPUs?
>>> ---
>>>  target-arm/psci.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/target-arm/psci.c b/target-arm/psci.c
>>> index d8fafab..661ff28 100644
>>> --- a/target-arm/psci.c
>>> +++ b/target-arm/psci.c
>>> @@ -196,6 +196,8 @@ void arm_handle_psci_call(ARMCPU *cpu)
>>>          }
>>>          target_cpu_class->set_pc(target_cpu_state, entry);
>>>
>>> +        qemu_cond_signal(target_cpu_state->halt_cond);
>>
>> That's called qemu_cpu_kick(target_cpu_state). :)  The patch should be
>> acceptable now upstream, I think.
>
> Oh so this might well fail in KVM too?

In KVM we won't use target-arm/psci.c because PSCI calls
are handled in the kernel.

-- PMM
Alex Bennée June 25, 2015, 6:27 a.m. UTC | #4
Alexander Spyridakis <a.spyridakis@virtualopensystems.com> writes:

> On 24 June 2015 at 17:34, Alex Bennée <alex.bennee@linaro.org> wrote:
>> Testing with Alexander's bare metal syncronisation tests fails in MTTCG
>> leaving one CPU spinning forever waiting for the second CPU to wake up.
>> We simply need to poke the halt_cond once we have processed the PSCI
>> power on call.
>
> Thanks Alex. Works for me, also with qemu_cpu_kick(target_cpu_state)
> as Paolo mentioned.
>
> The test seems to stress the current multi-threaded implementation
> quite a lot. With 8 CPUs running, the resulting errors are in the
> range of 500 per vCPU (10 million iterations).

We need to get to the bottom of this one first as obviously the
implementation needs to bullet proof for all the various synchronisation
patterns the CPU can use.

> Performance is another issue as mentioned before, but even more
> pronounced with 8 cores. Upstream QEMU needs around 10 seconds to
> complete, with multi-threading around 100 seconds for the same test.

I'm not overly surprised as this is a high-contention test and the
additional locking implies a lot of extra overhead. It's certainly a
useful test to compare the comparative performance of the various
approaches to atomics/exclusives but I hope in real world tasks we gain
a bunch of performance for normal unlocked code running across multiple
cores.

I wonder if the perf tools can give us some insight to where the extra
latency is coming from?

>
> Best regards.
Alex Bennée June 26, 2015, 7:06 a.m. UTC | #5
Andrew Jones <drjones@redhat.com> writes:

> On Wed, Jun 24, 2015 at 08:12:52PM +0100, Peter Maydell wrote:
>> On 24 June 2015 at 18:18, Alex Bennée <alex.bennee@linaro.org> wrote:
>> >
>> > Paolo Bonzini <pbonzini@redhat.com> writes:
>> >
>> >> On 24/06/2015 17:34, Alex Bennée wrote:
>> >>> Testing with Alexander's bare metal syncronisation tests fails in MTTCG
>> >>> leaving one CPU spinning forever waiting for the second CPU to wake up.
>> >>> We simply need to poke the halt_cond once we have processed the PSCI
>> >>> power on call.
>> >>>
>> >>> Tested-by: Alex Bennée <alex.bennee@linaro.org>
>> >>> CC: Alexander Spyridakis <a.spyridakis@virtualopensystems.com>
>> >>>
>> >>> ---
>> >>> TODO
>> >>>   - exactly how does the vexpress wake up it's sleeping CPUs?
>> >>> ---
>> >>>  target-arm/psci.c | 2 ++
>> >>>  1 file changed, 2 insertions(+)
>> >>>
>> >>> diff --git a/target-arm/psci.c b/target-arm/psci.c
>> >>> index d8fafab..661ff28 100644
>> >>> --- a/target-arm/psci.c
>> >>> +++ b/target-arm/psci.c
>> >>> @@ -196,6 +196,8 @@ void arm_handle_psci_call(ARMCPU *cpu)
>> >>>          }
>> >>>          target_cpu_class->set_pc(target_cpu_state, entry);
>> >>>
>> >>> +        qemu_cond_signal(target_cpu_state->halt_cond);
>> >>
>> >> That's called qemu_cpu_kick(target_cpu_state). :)  The patch should be
>> >> acceptable now upstream, I think.
>> >
>> > Oh so this might well fail in KVM too?
>> 
>> In KVM we won't use target-arm/psci.c because PSCI calls
>> are handled in the kernel.
>>
>
> It's also not valid to use Alexander's test on KVM, as the test
> framework doesn't enable the mmu, and thus {ldr,str}ex won't work
> as expected.
>
> I guess I need to do a better job at advertising/documenting
> kvm-unit-tests/arm, as that framework would suit this test just
> fine. I've attached a patch porting the test over to k-u-t[1].
> After applying the patch, do
>
> ./configure --arch=arm64 --cross-prefix=aarch64-linux-gnu-
> OR
> ./configure --arch=arm --cross-prefix=arm-linux-gnu-
>
> make
>
> arm/run arm/vos-spinlock-test.flat -smp 4 # non-atomic locks
> OR
> arm/run arm/vos-spinlock-test.flat -smp 4 -append atomic # atomic
> locks

Thanks for that. I shall have a play with it today.

>
> Thanks,
> drew
>
> [1] git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git
diff mbox

Patch

diff --git a/target-arm/psci.c b/target-arm/psci.c
index d8fafab..661ff28 100644
--- a/target-arm/psci.c
+++ b/target-arm/psci.c
@@ -196,6 +196,8 @@  void arm_handle_psci_call(ARMCPU *cpu)
         }
         target_cpu_class->set_pc(target_cpu_state, entry);
 
+        qemu_cond_signal(target_cpu_state->halt_cond);
+
         ret = 0;
         break;
     case QEMU_PSCI_0_1_FN_CPU_OFF: