[RFC,v4,13/23] cpus: only take BQL for sleeping threads

Message ID 20180119084417.7100.69568.stgit@pasha-VirtualBox
State New
Headers show
Series
  • Untitled series #8205
Related show

Commit Message

Pavel Dovgalyuk Jan. 19, 2018, 8:44 a.m.
From: Alex Bennée <alex.bennee@linaro.org>


Now the only real need to hold the BQL is for when we sleep on the
cpu->halt conditional. The lock is actually dropped while the thread
sleeps so the actual window for contention is pretty small. This also
means we can remove the special case hack for exclusive work and
simply declare that work no longer has an implicit BQL held. This
isn't a major problem async work is generally only changing things in
the context of its own vCPU. If it needs to work across vCPUs it
should be using the exclusive mechanism or possibly taking the lock
itself.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

Tested-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>

---
 cpus-common.c |   13 +++++--------
 cpus.c        |    7 ++++---
 2 files changed, 9 insertions(+), 11 deletions(-)

Comments

Paolo Bonzini Jan. 19, 2018, 8:59 a.m. | #1
On 19/01/2018 09:44, Pavel Dovgalyuk wrote:
>      while (all_cpu_threads_idle()) {

> +        qemu_mutex_lock_iothread();

>          stop_tcg_kick_timer();

>          qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);

> +        qemu_mutex_unlock_iothread();

>      }


cpu_has_work cannot be called outside BQL yet.  You first need to access
cpu->interrupt_request with atomics.

In general, testing the condition outside the mutex is a very dangerous
pattern (and I'm usually the one who enjoys dangerous patterns).

But also, taking a slightly wider look:

>  static void qemu_tcg_rr_wait_io_event(CPUState *cpu)

>  {

>      while (all_cpu_threads_idle()) {

> +        qemu_mutex_lock_iothread();

>          stop_tcg_kick_timer();

>          qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);

> +        qemu_mutex_unlock_iothread();

>      }

>  

>      start_tcg_kick_timer();

>  

>      qemu_wait_io_event_common(cpu);

> -

> -    qemu_mutex_unlock_iothread();

>  }

>  


You are adding a qemu_mutex_lock_iothread to a function that wasn't
there before.  Either it was broken before, or it is now.

Paolo
Pavel Dovgalyuk Jan. 19, 2018, 12:05 p.m. | #2
> From: Paolo Bonzini [mailto:pbonzini@redhat.com]

> On 19/01/2018 09:44, Pavel Dovgalyuk wrote:

> >      while (all_cpu_threads_idle()) {

> > +        qemu_mutex_lock_iothread();

> >          stop_tcg_kick_timer();

> >          qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);

> > +        qemu_mutex_unlock_iothread();

> >      }

> 

> cpu_has_work cannot be called outside BQL yet.  You first need to access

> cpu->interrupt_request with atomics.

> 

> In general, testing the condition outside the mutex is a very dangerous

> pattern (and I'm usually the one who enjoys dangerous patterns).


It means, that I'll have to fix all the has_work function to avoid races,
because x86_cpu_has_work may have them?

static bool x86_cpu_has_work(CPUState *cs)
{
    X86CPU *cpu = X86_CPU(cs);
    CPUX86State *env = &cpu->env;

    return ((cs->interrupt_request & (CPU_INTERRUPT_HARD |
                                      CPU_INTERRUPT_POLL)) &&
            (env->eflags & IF_MASK)) ||
           (cs->interrupt_request & (CPU_INTERRUPT_NMI |
                                     CPU_INTERRUPT_INIT |
                                     CPU_INTERRUPT_SIPI |
                                     CPU_INTERRUPT_MCE)) ||
           ((cs->interrupt_request & CPU_INTERRUPT_SMI) &&
            !(env->hflags & HF_SMM_MASK));
}

Pavel Dovgalyuk
Paolo Bonzini Jan. 19, 2018, 12:19 p.m. | #3
On 19/01/2018 13:05, Pavel Dovgalyuk wrote:
>> From: Paolo Bonzini [mailto:pbonzini@redhat.com]

>> On 19/01/2018 09:44, Pavel Dovgalyuk wrote:

>>>      while (all_cpu_threads_idle()) {

>>> +        qemu_mutex_lock_iothread();

>>>          stop_tcg_kick_timer();

>>>          qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);

>>> +        qemu_mutex_unlock_iothread();

>>>      }

>>

>> cpu_has_work cannot be called outside BQL yet.  You first need to access

>> cpu->interrupt_request with atomics.

>>

>> In general, testing the condition outside the mutex is a very dangerous

>> pattern (and I'm usually the one who enjoys dangerous patterns).

> 

> It means, that I'll have to fix all the has_work function to avoid races,

> because x86_cpu_has_work may have them?


Why only x86_cpu_has_work?

Even reading cs->interrupt_request outside the mutex is unsafe.

Paolo

> static bool x86_cpu_has_work(CPUState *cs)

> {

>     X86CPU *cpu = X86_CPU(cs);

>     CPUX86State *env = &cpu->env;

> 

>     return ((cs->interrupt_request & (CPU_INTERRUPT_HARD |

>                                       CPU_INTERRUPT_POLL)) &&

>             (env->eflags & IF_MASK)) ||

>            (cs->interrupt_request & (CPU_INTERRUPT_NMI |

>                                      CPU_INTERRUPT_INIT |

>                                      CPU_INTERRUPT_SIPI |

>                                      CPU_INTERRUPT_MCE)) ||

>            ((cs->interrupt_request & CPU_INTERRUPT_SMI) &&

>             !(env->hflags & HF_SMM_MASK));

> }

> 

> Pavel Dovgalyuk

>
Pavel Dovgalyuk Jan. 19, 2018, 12:25 p.m. | #4
> -----Original Message-----

> From: Paolo Bonzini [mailto:pbonzini@redhat.com]

> Sent: Friday, January 19, 2018 3:20 PM

> To: Pavel Dovgalyuk; 'Pavel Dovgalyuk'; qemu-devel@nongnu.org

> Cc: kwolf@redhat.com; peter.maydell@linaro.org; boost.lists@gmail.com; quintela@redhat.com;

> jasowang@redhat.com; mst@redhat.com; zuban32s@gmail.com; maria.klimushenkova@ispras.ru;

> kraxel@redhat.com; alex.bennee@linaro.org

> Subject: Re: [RFC PATCH v4 13/23] cpus: only take BQL for sleeping threads

> 

> On 19/01/2018 13:05, Pavel Dovgalyuk wrote:

> >> From: Paolo Bonzini [mailto:pbonzini@redhat.com]

> >> On 19/01/2018 09:44, Pavel Dovgalyuk wrote:

> >>>      while (all_cpu_threads_idle()) {

> >>> +        qemu_mutex_lock_iothread();

> >>>          stop_tcg_kick_timer();

> >>>          qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);

> >>> +        qemu_mutex_unlock_iothread();

> >>>      }

> >>

> >> cpu_has_work cannot be called outside BQL yet.  You first need to access

> >> cpu->interrupt_request with atomics.

> >>

> >> In general, testing the condition outside the mutex is a very dangerous

> >> pattern (and I'm usually the one who enjoys dangerous patterns).

> >

> > It means, that I'll have to fix all the has_work function to avoid races,

> > because x86_cpu_has_work may have them?

> 

> Why only x86_cpu_has_work?

> 

> Even reading cs->interrupt_request outside the mutex is unsafe.


All the vcpu function that access interrupt controller or peripheral state may be unsafe?
How can it work safely then?

Pavel Dovgalyuk
Paolo Bonzini Jan. 19, 2018, 12:26 p.m. | #5
On 19/01/2018 13:25, Pavel Dovgalyuk wrote:
>>> It means, that I'll have to fix all the has_work function to avoid races,

>>> because x86_cpu_has_work may have them?

>> Why only x86_cpu_has_work?

>>

>> Even reading cs->interrupt_request outside the mutex is unsafe.

> All the vcpu function that access interrupt controller or peripheral state may be unsafe?

> How can it work safely then?


They do it inside the big QEMU lock.  But here you're calling
cpu_has_work (via all_cpu_threads_idle) outside the lock.

Paolo
Pavel Dovgalyuk Jan. 19, 2018, 12:36 p.m. | #6
> From: Paolo Bonzini [mailto:pbonzini@redhat.com]

> Sent: Friday, January 19, 2018 3:26 PM

> To: Pavel Dovgalyuk; 'Pavel Dovgalyuk'; qemu-devel@nongnu.org

> Cc: kwolf@redhat.com; peter.maydell@linaro.org; boost.lists@gmail.com; quintela@redhat.com;

> jasowang@redhat.com; mst@redhat.com; zuban32s@gmail.com; maria.klimushenkova@ispras.ru;

> kraxel@redhat.com; alex.bennee@linaro.org

> Subject: Re: [RFC PATCH v4 13/23] cpus: only take BQL for sleeping threads

> 

> On 19/01/2018 13:25, Pavel Dovgalyuk wrote:

> >>> It means, that I'll have to fix all the has_work function to avoid races,

> >>> because x86_cpu_has_work may have them?

> >> Why only x86_cpu_has_work?

> >>

> >> Even reading cs->interrupt_request outside the mutex is unsafe.

> > All the vcpu function that access interrupt controller or peripheral state may be unsafe?

> > How can it work safely then?

> 

> They do it inside the big QEMU lock.  


Right. Without these patches.
They are within the replay lock. And BQL is not covering vcpu execution with these patches.
Therefore RR will be ok and regular execution may encounter races?
It means that I missed something in Alex ideas, because he prepared the initial patches.

> But here you're calling cpu_has_work (via all_cpu_threads_idle) outside the lock.


Yes, I see, but what we have to do?

Pavel Dovgalyuk
Pavel Dovgalyuk Jan. 19, 2018, 1:20 p.m. | #7
Pavel Dovgalyuk


> -----Original Message-----

> From: Pavel Dovgalyuk [mailto:dovgaluk@ispras.ru]

> Sent: Friday, January 19, 2018 3:37 PM

> To: 'Paolo Bonzini'; 'Pavel Dovgalyuk'; qemu-devel@nongnu.org

> Cc: kwolf@redhat.com; peter.maydell@linaro.org; boost.lists@gmail.com; quintela@redhat.com;

> jasowang@redhat.com; mst@redhat.com; zuban32s@gmail.com; maria.klimushenkova@ispras.ru;

> kraxel@redhat.com; alex.bennee@linaro.org

> Subject: RE: [RFC PATCH v4 13/23] cpus: only take BQL for sleeping threads

> 

> > From: Paolo Bonzini [mailto:pbonzini@redhat.com]

> > Sent: Friday, January 19, 2018 3:26 PM

> > To: Pavel Dovgalyuk; 'Pavel Dovgalyuk'; qemu-devel@nongnu.org

> > Cc: kwolf@redhat.com; peter.maydell@linaro.org; boost.lists@gmail.com; quintela@redhat.com;

> > jasowang@redhat.com; mst@redhat.com; zuban32s@gmail.com; maria.klimushenkova@ispras.ru;

> > kraxel@redhat.com; alex.bennee@linaro.org

> > Subject: Re: [RFC PATCH v4 13/23] cpus: only take BQL for sleeping threads

> >

> > On 19/01/2018 13:25, Pavel Dovgalyuk wrote:

> > >>> It means, that I'll have to fix all the has_work function to avoid races,

> > >>> because x86_cpu_has_work may have them?

> > >> Why only x86_cpu_has_work?

> > >>

> > >> Even reading cs->interrupt_request outside the mutex is unsafe.

> > > All the vcpu function that access interrupt controller or peripheral state may be unsafe?

> > > How can it work safely then?

> >

> > They do it inside the big QEMU lock.

> 

> Right. Without these patches.


Ah, I forgot about unlocking in tcg_cpu_exec.
Now I see, that vcpu is taking the lock when trying to access the peripherals.

Therefore, I have to fix all *_cpu_has_work, right?

> They are within the replay lock. And BQL is not covering vcpu execution with these patches.

> Therefore RR will be ok and regular execution may encounter races?

> It means that I missed something in Alex ideas, because he prepared the initial patches.

> 

> > But here you're calling cpu_has_work (via all_cpu_threads_idle) outside the lock.

> 

> Yes, I see, but what we have to do?



Pavel Dovgalyuk
Paolo Bonzini Jan. 19, 2018, 1:33 p.m. | #8
On 19/01/2018 13:36, Pavel Dovgalyuk wrote:
>> From: Paolo Bonzini [mailto:pbonzini@redhat.com]

>> Sent: Friday, January 19, 2018 3:26 PM

>> To: Pavel Dovgalyuk; 'Pavel Dovgalyuk'; qemu-devel@nongnu.org

>> Cc: kwolf@redhat.com; peter.maydell@linaro.org; boost.lists@gmail.com; quintela@redhat.com;

>> jasowang@redhat.com; mst@redhat.com; zuban32s@gmail.com; maria.klimushenkova@ispras.ru;

>> kraxel@redhat.com; alex.bennee@linaro.org

>> Subject: Re: [RFC PATCH v4 13/23] cpus: only take BQL for sleeping threads

>>

>> On 19/01/2018 13:25, Pavel Dovgalyuk wrote:

>>>>> It means, that I'll have to fix all the has_work function to avoid races,

>>>>> because x86_cpu_has_work may have them?

>>>> Why only x86_cpu_has_work?

>>>>

>>>> Even reading cs->interrupt_request outside the mutex is unsafe.

>>> All the vcpu function that access interrupt controller or peripheral state may be unsafe?

>>> How can it work safely then?

>>

>> They do it inside the big QEMU lock.  

> 

> Right. Without these patches.

> They are within the replay lock. And BQL is not covering vcpu execution with these patches.

> Therefore RR will be ok and regular execution may encounter races?

> It means that I missed something in Alex ideas, because he prepared the initial patches.


Yes.

>> But here you're calling cpu_has_work (via all_cpu_threads_idle) outside the lock.

> 

> Yes, I see, but what we have to do?


I don't know.  But the idiom in these patches,

	while(...) {
		lock()
		cond_wait()
		unlock()
	}

is unsafe as well, so the issue is more than just cpu_has_work.

Paolo
Pavel Dovgalyuk Jan. 22, 2018, 6:47 a.m. | #9
> From: Paolo Bonzini [mailto:pbonzini@redhat.com]

> On 19/01/2018 13:36, Pavel Dovgalyuk wrote:

> >> From: Paolo Bonzini [mailto:pbonzini@redhat.com]

> >> On 19/01/2018 13:25, Pavel Dovgalyuk wrote:

> >>>>> It means, that I'll have to fix all the has_work function to avoid races,

> >>>>> because x86_cpu_has_work may have them?

> >>>> Why only x86_cpu_has_work?

> >>>>

> >>>> Even reading cs->interrupt_request outside the mutex is unsafe.

> >>> All the vcpu function that access interrupt controller or peripheral state may be unsafe?

> >>> How can it work safely then?

> >>

> >> They do it inside the big QEMU lock.

> >

> > Right. Without these patches.

> > They are within the replay lock. And BQL is not covering vcpu execution with these patches.

> > Therefore RR will be ok and regular execution may encounter races?

> > It means that I missed something in Alex ideas, because he prepared the initial patches.

> 

> Yes.

> 

> >> But here you're calling cpu_has_work (via all_cpu_threads_idle) outside the lock.

> >

> > Yes, I see, but what we have to do?

> 

> I don't know.  But the idiom in these patches,

> 

> 	while(...) {

> 		lock()

> 		cond_wait()

> 		unlock()

> 	}

> 

> is unsafe as well, so the issue is more than just cpu_has_work.


Maybe it's better to omit this patch?
It seems that replay and regular execution are ok without it.

Pavel Dovgalyuk

Patch

diff --git a/cpus-common.c b/cpus-common.c
index 59f751e..64661c3 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -310,6 +310,11 @@  void async_safe_run_on_cpu(CPUState *cpu, run_on_cpu_func func,
     queue_work_on_cpu(cpu, wi);
 }
 
+/* Work items run outside of the BQL. This is essential for avoiding a
+ * deadlock for exclusive work but also applies to non-exclusive work.
+ * If the work requires cross-vCPU changes then it should use the
+ * exclusive mechanism.
+ */
 void process_queued_cpu_work(CPUState *cpu)
 {
     struct qemu_work_item *wi;
@@ -327,17 +332,9 @@  void process_queued_cpu_work(CPUState *cpu)
         }
         qemu_mutex_unlock(&cpu->work_mutex);
         if (wi->exclusive) {
-            /* Running work items outside the BQL avoids the following deadlock:
-             * 1) start_exclusive() is called with the BQL taken while another
-             * CPU is running; 2) cpu_exec in the other CPU tries to takes the
-             * BQL, so it goes to sleep; start_exclusive() is sleeping too, so
-             * neither CPU can proceed.
-             */
-            qemu_mutex_unlock_iothread();
             start_exclusive();
             wi->func(cpu, wi->data);
             end_exclusive();
-            qemu_mutex_lock_iothread();
         } else {
             wi->func(cpu, wi->data);
         }
diff --git a/cpus.c b/cpus.c
index ca86d9f..c841333 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1143,23 +1143,24 @@  static void qemu_wait_io_event_common(CPUState *cpu)
 static void qemu_tcg_rr_wait_io_event(CPUState *cpu)
 {
     while (all_cpu_threads_idle()) {
+        qemu_mutex_lock_iothread();
         stop_tcg_kick_timer();
         qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
+        qemu_mutex_unlock_iothread();
     }
 
     start_tcg_kick_timer();
 
     qemu_wait_io_event_common(cpu);
-
-    qemu_mutex_unlock_iothread();
 }
 
 static void qemu_wait_io_event(CPUState *cpu)
 {
-    qemu_mutex_lock_iothread();
 
     while (cpu_thread_is_idle(cpu)) {
+        qemu_mutex_lock_iothread();
         qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
+        qemu_mutex_unlock_iothread();
     }
 
 #ifdef _WIN32