[Xen-devel,PATCH-4.5,v3,10/12] xen/arm: don't protect GICH and lr_queue accesses with gic.lock

Message ID 1393439997-26936-10-git-send-email-stefano.stabellini@eu.citrix.com
State New

Commit Message

Stefano Stabellini Feb. 26, 2014, 6:39 p.m.
GICH is banked, protect accesses by disabling interrupts.
Protect lr_queue accesses with the vgic.lock only.
gic.lock only protects accesses to GICD now.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 xen/arch/arm/gic.c  |   22 +++-------------------
 xen/arch/arm/vgic.c |   12 ++++++++++--
 2 files changed, 13 insertions(+), 21 deletions(-)

Comments

Julien Grall Feb. 26, 2014, 8:17 p.m. | #1
Hi Stefano,

On 26/02/14 18:39, Stefano Stabellini wrote:
> GICH is banked, protect accesses by disabling interrupts.
> Protect lr_queue accesses with the vgic.lock only.

When the interrupt is an SPI, the lr_queue is shared between all VCPUs.
Using only vgic.lock seems wrong to me.

Stefano Stabellini March 18, 2014, 2:59 p.m. | #2
On Wed, 26 Feb 2014, Julien Grall wrote:
> Hi Stefano,
> 
> On 26/02/14 18:39, Stefano Stabellini wrote:
> > GICH is banked, protect accesses by disabling interrupts.
> > Protect lr_queue accesses with the vgic.lock only.
> 
> When the interrupt is an SPI, the lr_queue is shared between all VCPUs. Using
> only vgic.lock seems wrong to me.

Even though lr_queue is a field in a struct that can be per-domain for
irq > 32, the lr_pending queue is always per-vcpu: in fact it keeps
track of the IRQs waiting to go into the LR registers, which are banked.

Julien Grall March 18, 2014, 3:20 p.m. | #3
Hi Stefano,

On 03/18/2014 02:59 PM, Stefano Stabellini wrote:
> On Wed, 26 Feb 2014, Julien Grall wrote:
>> On 26/02/14 18:39, Stefano Stabellini wrote:
>>> GICH is banked, protect accesses by disabling interrupts.
>>> Protect lr_queue accesses with the vgic.lock only.
>>
>> When the interrupt is an SPI, the lr_queue is shared between all VCPUs. Using
>> only vgic.lock seems wrong to me.
> 
> Even though lr_queue is a field in a struct that can be per-domain for
> irq > 32, the lr_pending queue is always per-vcpu: in fact it keeps
> track of the IRQs waiting to go into the LR registers, which are banked.
> 

There might be a race between adding to and removing from lr_queue. For
now it's safe because SPIs are always routed to VCPU0 (even if the guest
asks to route them to VCPU1).

Just by reading the code, nothing correctly protects SPIs between VCPUs:
we only take vgic.lock, which is wrong.
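To make the sharing concrete, a stand-alone sketch (not Xen code; the
struct layout and `model_*` names are invented for illustration) of the
lookup this is about: for irq >= 32 every vcpu of the domain resolves to
the same pending_irq, while each vcpu would be taking a different vgic
lock around it.

```c
/* Stand-alone model, not Xen code.  SPIs (irq >= 32) live in the domain
 * and are shared; SGIs/PPIs (irq < 32) are genuinely per vcpu. */
#include <stddef.h>

enum { NR_LOCAL_IRQS = 32, NR_SPIS = 32 };

struct pending_irq_model { unsigned int irq; };

struct domain_model {
    struct pending_irq_model spi[NR_SPIS];   /* one copy for all vcpus */
};

struct model_vcpu {
    struct domain_model *d;
    struct pending_irq_model local[NR_LOCAL_IRQS];
    /* each vcpu has its own vgic lock -- two vcpus never share it */
};

/* Mirrors the shape of the irq_to_pending() lookup: per-vcpu below 32,
 * per-domain (shared between vcpus!) from 32 up. */
struct pending_irq_model *model_irq_to_pending(struct model_vcpu *v,
                                               unsigned int irq)
{
    if (irq < NR_LOCAL_IRQS)
        return &v->local[irq];
    return &v->d->spi[irq - NR_LOCAL_IRQS];
}
```

So for an SPI, two vcpus holding only their own vgic lock can still reach
the same node, which is the race being pointed out.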

Regards,

Stefano Stabellini March 18, 2014, 6:52 p.m. | #4
On Tue, 18 Mar 2014, Julien Grall wrote:
> Hi Stefano,
> 
> On 03/18/2014 02:59 PM, Stefano Stabellini wrote:
> > On Wed, 26 Feb 2014, Julien Grall wrote:
> >> On 26/02/14 18:39, Stefano Stabellini wrote:
> >>> GICH is banked, protect accesses by disabling interrupts.
> >>> Protect lr_queue accesses with the vgic.lock only.
> >>
> >> When the interrupt is an SPI, the lr_queue is shared between all VCPUs. Using
> >> only vgic.lock seems wrong to me.
> > 
> > Even though lr_queue is a field in a struct that can be per-domain for
> > irq > 32, the lr_pending queue is always per-vcpu: in fact it keeps
> > track of the IRQs waiting to go into the LR registers, which are banked.
> > 
> 
> There might be a race between adding to and removing from lr_queue. For
> now it's safe because SPIs are always routed to VCPU0 (even if the guest
> asks to route them to VCPU1).
> 
> Just by reading the code, nothing correctly protects SPIs between VCPUs:
> we only take vgic.lock, which is wrong.

Like you said, today SPIs are always routed to the same cpu.
When we implement IRQ migration we'll have to make sure that we take
the appropriate locks during the operation, but I don't expect this
patch to make things more difficult.

Julien Grall March 18, 2014, 8:26 p.m. | #5
Hi Stefano,

On 03/18/2014 06:52 PM, Stefano Stabellini wrote:
> On Tue, 18 Mar 2014, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 03/18/2014 02:59 PM, Stefano Stabellini wrote:
>>> On Wed, 26 Feb 2014, Julien Grall wrote:
>>>> On 26/02/14 18:39, Stefano Stabellini wrote:
>>>>> GICH is banked, protect accesses by disabling interrupts.
>>>>> Protect lr_queue accesses with the vgic.lock only.
>>>>
>>>> When the interrupt is an SPI, the lr_queue is shared between all VCPUs. Using
>>>> only vgic.lock seems wrong to me.
>>>
>>> Even though lr_queue is a field in a struct that can be per-domain for
>>> irq > 32, the lr_pending queue is always per-vcpu: in fact it keeps
>>> track of the IRQs waiting to go into the LR registers, which are banked.
>>>
>>
>> There might be a race between adding to and removing from lr_queue. For
>> now it's safe because SPIs are always routed to VCPU0 (even if the guest
>> asks to route them to VCPU1).
>>
>> Just by reading the code, nothing correctly protects SPIs between VCPUs:
>> we only take vgic.lock, which is wrong.
> 
> Like you said, today SPIs are always routed to the same cpu.
> When we implement IRQ migration we'll have to make sure that we take
> the appropriate locks during the operation, but I don't expect this
> patch to make things more difficult.

I'm fine with that. Can you create a TODO somewhere so we don't forget
this possible locking issue later?

Regards,

Stefano Stabellini March 19, 2014, 11:30 a.m. | #6
On Tue, 18 Mar 2014, Julien Grall wrote:
> Hi Stefano,
> 
> On 03/18/2014 06:52 PM, Stefano Stabellini wrote:
> > On Tue, 18 Mar 2014, Julien Grall wrote:
> >> Hi Stefano,
> >>
> >> On 03/18/2014 02:59 PM, Stefano Stabellini wrote:
> >>> On Wed, 26 Feb 2014, Julien Grall wrote:
> >>>> On 26/02/14 18:39, Stefano Stabellini wrote:
> >>>>> GICH is banked, protect accesses by disabling interrupts.
> >>>>> Protect lr_queue accesses with the vgic.lock only.
> >>>>
> >>>> When the interrupt is an SPI, the lr_queue is shared between all VCPUs. Using
> >>>> only vgic.lock seems wrong to me.
> >>>
> >>> Even though lr_queue is a field in a struct that can be per-domain for
> >>> irq > 32, the lr_pending queue is always per-vcpu: in fact it keeps
> >>> track of the IRQs waiting to go into the LR registers, which are banked.
> >>>
> >>
> >> There might be a race between adding to and removing from lr_queue. For
> >> now it's safe because SPIs are always routed to VCPU0 (even if the guest
> >> asks to route them to VCPU1).
> >>
> >> Just by reading the code, nothing correctly protects SPIs between VCPUs:
> >> we only take vgic.lock, which is wrong.
> > 
> > Like you said, today SPIs are always routed to the same cpu.
> > When we implement IRQ migration we'll have to make sure that we take
> > the appropriate locks during the operation, but I don't expect this
> > patch to make things more difficult.
> 
> I'm fine with that. Can you create a TODO somewhere so we don't forget
> this possible locking issue later?

Sure

Patch

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 2dc6386..296d9a7 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -668,17 +668,14 @@  void gic_remove_from_queues(struct vcpu *v, unsigned int virtual_irq)
 {
     struct pending_irq *p = irq_to_pending(v, virtual_irq);
 
-    spin_lock(&gic.lock);
     if ( !list_empty(&p->lr_queue) )
         list_del_init(&p->lr_queue);
-    spin_unlock(&gic.lock);
 }
 
 void gic_raise_guest_irq(struct vcpu *v, unsigned int irq,
                          unsigned int priority)
 {
     int i;
-    unsigned long flags;
     struct pending_irq *n = irq_to_pending(v, irq);
 
     if ( test_bit(GIC_IRQ_GUEST_VISIBLE, &n->status))
@@ -688,23 +685,17 @@  void gic_raise_guest_irq(struct vcpu *v, unsigned int irq,
         return;
     }
 
-    spin_lock_irqsave(&gic.lock, flags);
-
     if ( v == current && list_empty(&v->arch.vgic.lr_pending) )
     {
         i = find_first_zero_bit(&this_cpu(lr_mask), nr_lrs);
         if (i < nr_lrs) {
             set_bit(i, &this_cpu(lr_mask));
             gic_set_lr(v, i, irq, GICH_LR_PENDING, priority);
-            goto out;
+            return;
         }
     }
 
     gic_add_to_lr_pending(v, irq, priority);
-
-out:
-    spin_unlock_irqrestore(&gic.lock, flags);
-    return;
 }
 
 static void _gic_clear_lr(struct vcpu *v, int i)
@@ -726,8 +717,6 @@  static void _gic_clear_lr(struct vcpu *v, int i)
     } else if ( lr & GICH_LR_PENDING ) {
         clear_bit(GIC_IRQ_GUEST_PENDING, &p->status);
     } else {
-        spin_lock(&gic.lock);
-
         GICH[GICH_LR + i] = 0;
         clear_bit(i, &this_cpu(lr_mask));
 
@@ -741,8 +730,6 @@  static void _gic_clear_lr(struct vcpu *v, int i)
             gic_raise_guest_irq(v, irq, p->priority);
         } else
             list_del_init(&p->inflight);
-
-        spin_unlock(&gic.lock);
     }
 }
 
@@ -772,11 +759,11 @@  static void gic_restore_pending_irqs(struct vcpu *v)
         i = find_first_zero_bit(&this_cpu(lr_mask), nr_lrs);
         if ( i >= nr_lrs ) return;
 
-        spin_lock_irqsave(&gic.lock, flags);
+        spin_lock_irqsave(&v->arch.vgic.lock, flags);
         gic_set_lr(v, i, p->irq, GICH_LR_PENDING, p->priority);
         list_del_init(&p->lr_queue);
         set_bit(i, &this_cpu(lr_mask));
-        spin_unlock_irqrestore(&gic.lock, flags);
+        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
     }
 
 }
@@ -784,13 +771,10 @@  static void gic_restore_pending_irqs(struct vcpu *v)
 void gic_clear_pending_irqs(struct vcpu *v)
 {
     struct pending_irq *p, *t;
-    unsigned long flags;
 
-    spin_lock_irqsave(&gic.lock, flags);
     v->arch.lr_mask = 0;
     list_for_each_entry_safe ( p, t, &v->arch.vgic.lr_pending, lr_queue )
         list_del_init(&p->lr_queue);
-    spin_unlock_irqrestore(&gic.lock, flags);
 }
 
 int gic_events_need_delivery(void)
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 981db6c..e9464b2 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -365,12 +365,15 @@  static void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n)
     struct pending_irq *p;
     unsigned int irq;
     int i = 0;
+    unsigned long flags;
 
     while ( (i = find_next_bit((const long unsigned int *) &r, 32, i)) < 32 ) {
         irq = i + (32 * n);
         p = irq_to_pending(v, irq);
+        spin_lock_irqsave(&v->arch.vgic.lock, flags);
         clear_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
         gic_remove_from_queues(v, irq);
+        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
         if ( p->desc != NULL )
             p->desc->handler->disable(p->desc);
         i++;
@@ -391,8 +394,13 @@  static void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n)
              vcpu_info(current, evtchn_upcall_pending) &&
              list_empty(&p->inflight) )
             vgic_vcpu_inject_irq(v, irq);
-        else if ( !list_empty(&p->inflight) && !test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) )
-            gic_raise_guest_irq(v, irq, p->priority);
+        else {
+            unsigned long flags;
+            spin_lock_irqsave(&v->arch.vgic.lock, flags);
+            if ( !list_empty(&p->inflight) && !test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) )
+                gic_raise_guest_irq(v, irq, p->priority);
+            spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
+        }
         if ( p->desc != NULL )
             p->desc->handler->enable(p->desc);
         i++;