
[Xen-devel,v6,1/5] xen/arm: observe itargets setting in vgic_enable_irqs and vgic_disable_irqs

Message ID 1403541463-23734-1-git-send-email-stefano.stabellini@eu.citrix.com
State New

Commit Message

Stefano Stabellini June 23, 2014, 4:37 p.m. UTC
vgic_enable_irqs should enable irq delivery to the vcpu specified by
GICD_ITARGETSR, rather than the vcpu that wrote to GICD_ISENABLER.
Similarly, vgic_disable_irqs should use the target vcpu specified by
itargets to disable irqs.

itargets can be set to a mask, but vgic_get_target_vcpu always returns
the lowest vcpu in the mask.

Correctly initialize itargets for SPIs.

Ignore bits in GICD_ITARGETSR corresponding to invalid vcpus.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

---


Changes in v6:
- add ASSERT and BUG_ON;
- add in-code comments;
- move additional check on itargets writing from the following patch to
this patch;
- sizeof(itargets) instead of 8*sizeof(itargets[0]);
- remove the unneeded cast of &target for find_first_bit.

Changes in v5:
- improve in-code comments;
- use vgic_rank_irq;
- use bit masks to write-ignore GICD_ITARGETSR;
- introduce a version of vgic_get_target_vcpu that doesn't take the
rank lock;
- keep the rank lock while enabling/disabling irqs;
- use find_first_bit instead of find_next_bit;
- check for zero writes to GICD_ITARGETSR.

Changes in v4:
- remove assert that could allow a guest to crash Xen;
- add itargets validation to vgic_distr_mmio_write;
- export vgic_get_target_vcpu.

Changes in v3:
- add assert in get_target_vcpu;
- rename get_target_vcpu to vgic_get_target_vcpu.

Changes in v2:
- refactor the common code in get_target_vcpu;
- unify PPI and SPI paths;
- correctly initialize itargets for SPI;
- use byte_read.
---
 xen/arch/arm/vgic.c       |   78 ++++++++++++++++++++++++++++++++++++++-------
 xen/include/asm-arm/gic.h |    2 ++
 2 files changed, 68 insertions(+), 12 deletions(-)
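
As a side note on the itargets lookup this patch introduces: each vgic rank
covers 32 irqs and packs four 8-bit target masks into each 32-bit itargets
word, so irq N maps to word (N % 32) / 4, byte N % 4. A minimal standalone
sketch of that arithmetic; the two-argument byte_read() and ffs() below are
simplified stand-ins for Xen's byte_read and find_first_bit, and all values
are invented for illustration:

    #include <assert.h>
    #include <stdint.h>
    #include <strings.h>   /* ffs(), standing in for find_first_bit() */

    /* Simplified stand-in for Xen's byte_read: extract byte `offset`. */
    static uint8_t byte_read(uint32_t val, unsigned int offset)
    {
        return (val >> (8 * offset)) & 0xff;
    }

    int main(void)
    {
        uint32_t itargets[8] = { 0 };
        unsigned int irq = 35;                  /* an SPI in the second rank */

        /* irq 35 lives in word (35 % 32) / 4 = 0, byte 35 % 4 = 3 */
        itargets[(irq % 32) / 4] = 0x06u << 24; /* byte 3 = 0x06: vcpus 1 and 2 */

        uint8_t target = byte_read(itargets[(irq % 32) / 4], irq % 4);
        assert(target == 0x06);
        /* Like find_first_bit, pick the lowest vcpu in the mask: vcpu1. */
        assert(ffs(target) - 1 == 1);
        return 0;
    }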

Comments

Julien Grall June 23, 2014, 5:41 p.m. UTC | #1
Hi Stefano,

On 06/23/2014 05:37 PM, Stefano Stabellini wrote:
> +/* the rank lock is already taken */
> +static struct vcpu *_vgic_get_target_vcpu(struct vcpu *v, unsigned int irq)
> +{
> +    unsigned long target;
> +    struct vcpu *v_target;
> +    struct vgic_irq_rank *rank = vgic_rank_irq(v, irq);
> +    ASSERT(spin_is_locked(&rank->lock));
> +
> +    target = byte_read(rank->itargets[(irq%32)/4], 0, irq % 4);
> +    /* 1-N SPI should be delivered as pending to all the vcpus in the
> +     * mask, but here we just return the first vcpu for simplicity and
> +     * because it would be too slow to do otherwise. */
> +    target = find_first_bit(&target, 8);

I'm tempted to ask you to add an ASSERT here, just for catching coding
errors.

[..]

> @@ -536,8 +575,8 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
>          vgic_lock_rank(v, rank);
>          tr = rank->ienable;
>          rank->ienable |= *r;
> -        vgic_unlock_rank(v, rank);
>          vgic_enable_irqs(v, (*r) & (~tr), gicd_reg - GICD_ISENABLER);
> +        vgic_unlock_rank(v, rank);
>          return 1;
>  
>      case GICD_ICENABLER ... GICD_ICENABLERN:
> @@ -547,8 +586,8 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
>          vgic_lock_rank(v, rank);
>          tr = rank->ienable;
>          rank->ienable &= ~*r;
> -        vgic_unlock_rank(v, rank);
>          vgic_disable_irqs(v, (*r) & tr, gicd_reg - GICD_ICENABLER);
> +        vgic_unlock_rank(v, rank);
>          return 1;

Sorry for not having caught this earlier. I don't really like the idea of
extending the rank lock to vgic_{enable/disable}_irqs. The two functions
can take long to execute, as they may touch the GIC distributor.

On the other hand, the rank lock is only taken in the distributor emulation.

Assuming we consider that distributor access may be slow:

Acked-by: Julien Grall <julien.grall@linaro.org>

Aside from that, this made me think about two possible issues (not related
to this patch series) in vgic_inject_irq:
	- We don't take the rank lock to read the priority, even though it's
only a byte access.
	- If an interrupt is injected while it's already active, we don't
update the priority of this interrupt.

Regards,
Stefano Stabellini June 24, 2014, 11:38 a.m. UTC | #2
On Mon, 23 Jun 2014, Julien Grall wrote:
> Hi Stefano,
> 
> On 06/23/2014 05:37 PM, Stefano Stabellini wrote:
> > +/* the rank lock is already taken */
> > +static struct vcpu *_vgic_get_target_vcpu(struct vcpu *v, unsigned int irq)
> > +{
> > +    unsigned long target;
> > +    struct vcpu *v_target;
> > +    struct vgic_irq_rank *rank = vgic_rank_irq(v, irq);
> > +    ASSERT(spin_is_locked(&rank->lock));
> > +
> > +    target = byte_read(rank->itargets[(irq%32)/4], 0, irq % 4);
> > +    /* 1-N SPI should be delivered as pending to all the vcpus in the
> > +     * mask, but here we just return the first vcpu for simplicity and
> > +     * because it would be too slow to do otherwise. */
> > +    target = find_first_bit(&target, 8);
> 
> I'm tempted to ask you to add an ASSERT here, just for catching coding
> errors.
> 

OK


> 
> > @@ -536,8 +575,8 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
> >          vgic_lock_rank(v, rank);
> >          tr = rank->ienable;
> >          rank->ienable |= *r;
> > -        vgic_unlock_rank(v, rank);
> >          vgic_enable_irqs(v, (*r) & (~tr), gicd_reg - GICD_ISENABLER);
> > +        vgic_unlock_rank(v, rank);
> >          return 1;
> >  
> >      case GICD_ICENABLER ... GICD_ICENABLERN:
> > @@ -547,8 +586,8 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
> >          vgic_lock_rank(v, rank);
> >          tr = rank->ienable;
> >          rank->ienable &= ~*r;
> > -        vgic_unlock_rank(v, rank);
> >          vgic_disable_irqs(v, (*r) & tr, gicd_reg - GICD_ICENABLER);
> > +        vgic_unlock_rank(v, rank);
> >          return 1;
> 
> Sorry for not having caught this earlier. I don't really like the idea of
> extending the rank lock to vgic_{enable/disable}_irqs. The two functions
> can take long to execute, as they may touch the GIC distributor.
> 
> On the other hand, the rank lock is only taken in the distributor emulation.
>
> Assuming we consider that distributor access may be slow:

We could end up enabling or disabling the wrong set of interrupts
without this change. I think it is necessary.


> Acked-by: Julien Grall <julien.grall@linaro.org>

Thanks


> Aside from that, this made me think about two possible issues (not related to
> this patch series) in vgic_inject_irq:
> 	- We don't take the rank lock to read the priority, even though it's only
> a byte access.

That's a good point


> 	- If an interrupt is injected while it's already active, we don't
> update the priority of this interrupt.

Yeah, I noticed this before. I'll write a comment to make sure we keep
in mind that we have this limitation.


> Regards,
> 
> -- 
> Julien Grall
>
Julien Grall June 24, 2014, 12:07 p.m. UTC | #3
On 24/06/14 12:38, Stefano Stabellini wrote:
>> Sorry for not having caught this earlier. I don't really like the idea of
>> extending the rank lock to vgic_{enable/disable}_irqs. The two functions
>> can take long to execute, as they may touch the GIC distributor.
>>
>> On the other hand, the rank lock is only taken in the distributor emulation.
>>
>> Assuming we consider that distributor access may be slow:
>
> We could end up enabling or disabling the wrong set of interrupts
> without this change. I think it is necessary.

AFAIU, this lock only protects the rank when we retrieve the target 
VCPU; the other part of the function will still work without it.

What I meant is to call vgic_get_target_vcpu, so the lock will protect 
only what is necessary and not more.
Stefano Stabellini June 24, 2014, 6:04 p.m. UTC | #4
On Tue, 24 Jun 2014, Julien Grall wrote:
> On 24/06/14 12:38, Stefano Stabellini wrote:
> > > Sorry for not having caught this earlier. I don't really like the idea of
> > > extending the rank lock to vgic_{enable/disable}_irqs. The two functions
> > > can take long to execute, as they may touch the GIC distributor.
> > > 
> > > On the other hand, the rank lock is only taken in the distributor emulation.
> > > 
> > > Assuming we consider that distributor access may be slow:
> > 
> > We could end up enabling or disabling the wrong set of interrupts
> > without this change. I think it is necessary.
> 
> AFAIU, this lock only protects the rank when we retrieve the target VCPU; the
> other part of the function will still work without it.
> 
> What I meant is to call vgic_get_target_vcpu, so the lock will protect only
> what is necessary and not more.

I don't think so (unless I misunderstood your suggestion): setting
ienable and enabling the interrupts need to be atomic.  Otherwise this
can happen:

    VCPU0                                       VCPU1
- rank_lock
- write ienable
- get target vcpus
- rank_unlock
                                         - rank_lock
                                         - write ienable
                                         - get target vcpus (some are the same)
                                         - rank_unlock

                                         - vgic_enable_irqs

- vgic_enable_irqs


we now have an inconsistent state: we enabled the irqs written from
vcpu0 but ienable reflects what was written from vcpu1.
Julien Grall June 24, 2014, 6:20 p.m. UTC | #5
On 06/24/2014 07:04 PM, Stefano Stabellini wrote:
> On Tue, 24 Jun 2014, Julien Grall wrote:
>> On 24/06/14 12:38, Stefano Stabellini wrote:
>>>> Sorry for not having caught this earlier. I don't really like the idea of
>>>> extending the rank lock to vgic_{enable/disable}_irqs. The two functions
>>>> can take long to execute, as they may touch the GIC distributor.
>>>>
>>>> On the other hand, the rank lock is only taken in the distributor emulation.
>>>>
>>>> Assuming we consider that distributor access may be slow:
>>>
>>> We could end up enabling or disabling the wrong set of interrupts
>>> without this change. I think it is necessary.
>>
>> AFAIU, this lock only protects the rank when we retrieve the target VCPU; the
>> other part of the function will still work without it.
>>
>> What I meant is to call vgic_get_target_vcpu, so the lock will protect only
>> what is necessary and not more.
> 
> I don't think so (unless I misunderstood your suggestion): setting
> ienable and enabling the interrupts need to be atomic.  Otherwise this
> can happen:
> 
>     VCPU0                                       VCPU1
> - rank_lock
> - write ienable
> - get target vcpus
> - rank_unlock
>                                          - rank_lock
>                                          - write ienable
>                                          - get target vcpus (some are the same)
>                                          - rank_unlock
> 
>                                          - vgic_enable_irqs
> 
> - vgic_enable_irqs
> 
> 
> we now have an inconsistent state: we enabled the irqs written from
> vcpu0 but ienable reflects what was written from vcpu1.

In your example it's not possible because we save the value of ienable
in a temporary variable. So calling vgic_enable_irqs on 2 different
VCPUs with the same range of IRQs won't be a problem.

But... there is a possible race condition between enable and disable. We
need to serialize the accesses, otherwise we may enable/disable an IRQ
by mistake and it will no longer be in sync with the register state.

This issue is also in Xen 4.4. Can you send a single patch which moves
the unlock for this branch?

Thanks,
Stefano Stabellini June 24, 2014, 6:29 p.m. UTC | #6
On Tue, 24 Jun 2014, Julien Grall wrote:
> On 06/24/2014 07:04 PM, Stefano Stabellini wrote:
> > On Tue, 24 Jun 2014, Julien Grall wrote:
> >> On 24/06/14 12:38, Stefano Stabellini wrote:
> >>>> Sorry for not having caught this earlier. I don't really like the idea of
> >>>> extending the rank lock to vgic_{enable/disable}_irqs. The two functions
> >>>> can take long to execute, as they may touch the GIC distributor.
> >>>>
> >>>> On the other hand, the rank lock is only taken in the distributor emulation.
> >>>>
> >>>> Assuming we consider that distributor access may be slow:
> >>>
> >>> We could end up enabling or disabling the wrong set of interrupts
> >>> without this change. I think it is necessary.
> >>
> >> AFAIU, this lock only protects the rank when we retrieve the target VCPU; the
> >> other part of the function will still work without it.
> >>
> >> What I meant is to call vgic_get_target_vcpu, so the lock will protect only
> >> what is necessary and not more.
> > 
> > I don't think so (unless I misunderstood your suggestion): setting
> > ienable and enabling the interrupts need to be atomic.  Otherwise this
> > can happen:
> > 
> >     VCPU0                                       VCPU1
> > - rank_lock
> > - write ienable
> > - get target vcpus
> > - rank_unlock
> >                                          - rank_lock
> >                                          - write ienable
> >                                          - get target vcpus (some are the same)
> >                                          - rank_unlock
> > 
> >                                          - vgic_enable_irqs
> > 
> > - vgic_enable_irqs
> > 
> > 
> > we now have an inconsistent state: we enabled the irqs written from
> > vcpu0 but ienable reflects what was written from vcpu1.
> 
> In your example it's not possible because we save the value of ienable
> in a temporary variable. So calling vgic_enable_irqs on 2 different
> VCPUs with the same range of IRQs won't be a problem.
> 
> But... there is a possible race condition between enable and disable. We
> need to serialize the accesses, otherwise we may enable/disable an IRQ
> by mistake and it will no longer be in sync with the register state.
> 
> This issue is also in Xen 4.4. Can you send a single patch which moves
> the unlock for this branch?

Sure



> Thanks,
> 
> -- 
> Julien Grall
>
Ian Campbell June 27, 2014, 3:17 p.m. UTC | #7
On Mon, 2014-06-23 at 17:37 +0100, Stefano Stabellini wrote:
> @@ -589,12 +628,27 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
>          if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width;
>          rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR);
>          if ( rank == NULL) goto write_ignore;
> +        /* 8-bit vcpu mask for this domain */
> +        BUG_ON(v->domain->max_vcpus > 8);
> +        tr = (1 << v->domain->max_vcpus) - 1;
> +        if ( dabt.size == 2 )
> +            tr = tr | (tr << 8) | (tr << 16) | (tr << 24);
> +        else
> +            tr = (tr << (8 * (offset & 0x3)));
> +        tr &= *r;
> +        /* ignore zero writes */
> +        if ( !tr )
> +            goto write_ignore;

Please can you add a comment here like:

                /* For word writes, ignore those where any single byte is zero */

With that: Acked-by: Ian Campbell <ian.campbell@citrix.com>

Although, might it be more reasonable to use the existing value for such
bytes? (i.e. only ignore the zero bytes, not the whole thing)

Ian.
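
For reference, the masking and write-ignore rules being discussed can be
modelled with a few standalone lines of C. This is a sketch only:
filter_itargets_write is a name local to this example, not part of the
patch, and it returns the filtered value, or 0 when the write would be
ignored:

    #include <assert.h>
    #include <stdint.h>

    /* Model of the GICD_ITARGETSR write filtering from the hunk above. */
    static uint32_t filter_itargets_write(uint32_t r, unsigned int max_vcpus,
                                          unsigned int size, unsigned int offset)
    {
        uint32_t tr = (1u << max_vcpus) - 1;      /* valid vcpu bits per byte */

        if ( size == 2 )                          /* 32-bit access */
            tr = tr | (tr << 8) | (tr << 16) | (tr << 24);
        else                                      /* byte access */
            tr = tr << (8 * (offset & 0x3));
        tr &= r;                                  /* drop invalid vcpu bits */

        if ( !tr )                                /* nothing valid left: ignore */
            return 0;
        /* For word writes, ignore the whole write if any byte became zero. */
        if ( size == 2 &&
             !((tr & 0xffu) && (tr & (0xffu << 8)) &&
               (tr & (0xffu << 16)) && (tr & (0xffu << 24))) )
            return 0;
        return tr;
    }

    int main(void)
    {
        /* max_vcpus = 4: a word write naming vcpus 0-3 in every byte is kept, */
        assert(filter_itargets_write(0x01020304, 4, 2, 0) == 0x01020304);
        /* a word write with one empty byte is ignored entirely, */
        assert(filter_itargets_write(0x01020300, 4, 2, 0) == 0);
        /* and a byte write naming only the invalid vcpu4 is ignored too. */
        assert(filter_itargets_write(0x10, 4, 0, 0) == 0);
        return 0;
    }

The second case makes the trade-off concrete: the patch drops the whole word
rather than merging the non-zero bytes with the existing value, which is the
alternative Ian raises above.
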
Stefano Stabellini July 2, 2014, 3:39 p.m. UTC | #8
On Fri, 27 Jun 2014, Ian Campbell wrote:
> On Mon, 2014-06-23 at 17:37 +0100, Stefano Stabellini wrote:
> > @@ -589,12 +628,27 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
> >          if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width;
> >          rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR);
> >          if ( rank == NULL) goto write_ignore;
> > +        /* 8-bit vcpu mask for this domain */
> > +        BUG_ON(v->domain->max_vcpus > 8);
> > +        tr = (1 << v->domain->max_vcpus) - 1;
> > +        if ( dabt.size == 2 )
> > +            tr = tr | (tr << 8) | (tr << 16) | (tr << 24);
> > +        else
> > +            tr = (tr << (8 * (offset & 0x3)));
> > +        tr &= *r;
> > +        /* ignore zero writes */
> > +        if ( !tr )
> > +            goto write_ignore;
> 
> Please can you add a comment here like:
> 
>                 /* For word writes, ignore those where any single byte is zero */
> 
> With that: Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> Although, might it be more reasonable to use the existing value for such
> bytes? (i.e. only ignore the zero bytes, not the whole thing)

I don't know... I would consider the entire write as invalid, as it
contains some invalid configurations.
Ian Campbell July 2, 2014, 3:58 p.m. UTC | #9
On Wed, 2014-07-02 at 16:39 +0100, Stefano Stabellini wrote:
> On Fri, 27 Jun 2014, Ian Campbell wrote:
> > On Mon, 2014-06-23 at 17:37 +0100, Stefano Stabellini wrote:
> > > @@ -589,12 +628,27 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
> > >          if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width;
> > >          rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR);
> > >          if ( rank == NULL) goto write_ignore;
> > > +        /* 8-bit vcpu mask for this domain */
> > > +        BUG_ON(v->domain->max_vcpus > 8);
> > > +        tr = (1 << v->domain->max_vcpus) - 1;
> > > +        if ( dabt.size == 2 )
> > > +            tr = tr | (tr << 8) | (tr << 16) | (tr << 24);
> > > +        else
> > > +            tr = (tr << (8 * (offset & 0x3)));
> > > +        tr &= *r;
> > > +        /* ignore zero writes */
> > > +        if ( !tr )
> > > +            goto write_ignore;
> > 
> > Please can you add a comment here like:
> > 
> >                 /* For word writes, ignore those where any single byte is zero */
> > 
> > With that: Acked-by: Ian Campbell <ian.campbell@citrix.com>
> > 
> > Although, might it be more reasonable to use the existing value for such
> > bytes? (i.e. only ignore the zero bytes, not the whole thing)
> 
> I don't know... I would consider the entire write as invalid, as it
> contains some invalid configurations.

The question is what does the spec say?

Ian.
Stefano Stabellini July 2, 2014, 6:05 p.m. UTC | #10
On Wed, 2 Jul 2014, Ian Campbell wrote:
> On Wed, 2014-07-02 at 16:39 +0100, Stefano Stabellini wrote:
> > On Fri, 27 Jun 2014, Ian Campbell wrote:
> > > On Mon, 2014-06-23 at 17:37 +0100, Stefano Stabellini wrote:
> > > > @@ -589,12 +628,27 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
> > > >          if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width;
> > > >          rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR);
> > > >          if ( rank == NULL) goto write_ignore;
> > > > +        /* 8-bit vcpu mask for this domain */
> > > > +        BUG_ON(v->domain->max_vcpus > 8);
> > > > +        tr = (1 << v->domain->max_vcpus) - 1;
> > > > +        if ( dabt.size == 2 )
> > > > +            tr = tr | (tr << 8) | (tr << 16) | (tr << 24);
> > > > +        else
> > > > +            tr = (tr << (8 * (offset & 0x3)));
> > > > +        tr &= *r;
> > > > +        /* ignore zero writes */
> > > > +        if ( !tr )
> > > > +            goto write_ignore;
> > > 
> > > Please can you add a comment here like:
> > > 
> > >                 /* For word writes, ignore those where any single byte is zero */
> > > 
> > > With that: Acked-by: Ian Campbell <ian.campbell@citrix.com>
> > > 
> > > Although, might it be more reasonable to use the existing value for such
> > > bytes? (i.e. only ignore the zero bytes, not the whole thing)
> > 
> > I don't know... I would consider the entire write as invalid, as it
> > contains some invalid configurations.
> 
> The question is what does the spec say?

Unfortunately the spec is not clear about it.

Patch

diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 757707e..1e1c244 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -111,7 +111,13 @@  int domain_vgic_init(struct domain *d)
         INIT_LIST_HEAD(&d->arch.vgic.pending_irqs[i].lr_queue);
     }
     for (i=0; i<DOMAIN_NR_RANKS(d); i++)
+    {
         spin_lock_init(&d->arch.vgic.shared_irqs[i].lock);
+        /* By default deliver to CPU0 */
+        memset(d->arch.vgic.shared_irqs[i].itargets,
+               0x1,
+               sizeof(d->arch.vgic.shared_irqs[i].itargets));
+    }
     return 0;
 }
 
@@ -374,6 +380,35 @@  read_as_zero:
     return 1;
 }
 
+/* the rank lock is already taken */
+static struct vcpu *_vgic_get_target_vcpu(struct vcpu *v, unsigned int irq)
+{
+    unsigned long target;
+    struct vcpu *v_target;
+    struct vgic_irq_rank *rank = vgic_rank_irq(v, irq);
+    ASSERT(spin_is_locked(&rank->lock));
+
+    target = byte_read(rank->itargets[(irq%32)/4], 0, irq % 4);
+    /* 1-N SPI should be delivered as pending to all the vcpus in the
+     * mask, but here we just return the first vcpu for simplicity and
+     * because it would be too slow to do otherwise. */
+    target = find_first_bit(&target, 8);
+    v_target = v->domain->vcpu[target];
+    return v_target;
+}
+
+/* takes the rank lock */
+struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int irq)
+{
+    struct vcpu *v_target;
+    struct vgic_irq_rank *rank = vgic_rank_irq(v, irq);
+
+    vgic_lock_rank(v, rank);
+    v_target = _vgic_get_target_vcpu(v, irq);
+    vgic_unlock_rank(v, rank);
+    return v_target;
+}
+
 static void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n)
 {
     const unsigned long mask = r;
@@ -381,12 +416,14 @@  static void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n)
     unsigned int irq;
     unsigned long flags;
     int i = 0;
+    struct vcpu *v_target;
 
     while ( (i = find_next_bit(&mask, 32, i)) < 32 ) {
         irq = i + (32 * n);
-        p = irq_to_pending(v, irq);
+        v_target = _vgic_get_target_vcpu(v, irq);
+        p = irq_to_pending(v_target, irq);
         clear_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
-        gic_remove_from_queues(v, irq);
+        gic_remove_from_queues(v_target, irq);
         if ( p->desc != NULL )
         {
             spin_lock_irqsave(&p->desc->lock, flags);
@@ -404,24 +441,26 @@  static void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n)
     unsigned int irq;
     unsigned long flags;
     int i = 0;
+    struct vcpu *v_target;
 
     while ( (i = find_next_bit(&mask, 32, i)) < 32 ) {
         irq = i + (32 * n);
-        p = irq_to_pending(v, irq);
+        v_target = _vgic_get_target_vcpu(v, irq);
+        p = irq_to_pending(v_target, irq);
         set_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
         /* We need to force the first injection of evtchn_irq because
          * evtchn_upcall_pending is already set by common code on vcpu
          * creation. */
-        if ( irq == v->domain->arch.evtchn_irq &&
+        if ( irq == v_target->domain->arch.evtchn_irq &&
              vcpu_info(current, evtchn_upcall_pending) &&
              list_empty(&p->inflight) )
-            vgic_vcpu_inject_irq(v, irq);
+            vgic_vcpu_inject_irq(v_target, irq);
         else {
             unsigned long flags;
-            spin_lock_irqsave(&v->arch.vgic.lock, flags);
+            spin_lock_irqsave(&v_target->arch.vgic.lock, flags);
             if ( !list_empty(&p->inflight) && !test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) )
-                gic_raise_guest_irq(v, irq, p->priority);
-            spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
+                gic_raise_guest_irq(v_target, irq, p->priority);
+            spin_unlock_irqrestore(&v_target->arch.vgic.lock, flags);
         }
         if ( p->desc != NULL )
         {
@@ -536,8 +575,8 @@  static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
         vgic_lock_rank(v, rank);
         tr = rank->ienable;
         rank->ienable |= *r;
-        vgic_unlock_rank(v, rank);
         vgic_enable_irqs(v, (*r) & (~tr), gicd_reg - GICD_ISENABLER);
+        vgic_unlock_rank(v, rank);
         return 1;
 
     case GICD_ICENABLER ... GICD_ICENABLERN:
@@ -547,8 +586,8 @@  static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
         vgic_lock_rank(v, rank);
         tr = rank->ienable;
         rank->ienable &= ~*r;
-        vgic_unlock_rank(v, rank);
         vgic_disable_irqs(v, (*r) & tr, gicd_reg - GICD_ICENABLER);
+        vgic_unlock_rank(v, rank);
         return 1;
 
     case GICD_ISPENDR ... GICD_ISPENDRN:
@@ -589,12 +628,27 @@  static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
         if ( dabt.size != 0 && dabt.size != 2 ) goto bad_width;
         rank = vgic_rank_offset(v, 8, gicd_reg - GICD_ITARGETSR);
         if ( rank == NULL) goto write_ignore;
+        /* 8-bit vcpu mask for this domain */
+        BUG_ON(v->domain->max_vcpus > 8);
+        tr = (1 << v->domain->max_vcpus) - 1;
+        if ( dabt.size == 2 )
+            tr = tr | (tr << 8) | (tr << 16) | (tr << 24);
+        else
+            tr = (tr << (8 * (offset & 0x3)));
+        tr &= *r;
+        /* ignore zero writes */
+        if ( !tr )
+            goto write_ignore;
+        if ( dabt.size == 2 &&
+            !((tr & 0xff) && (tr & (0xff << 8)) &&
+             (tr & (0xff << 16)) && (tr & (0xff << 24))))
+            goto write_ignore;
         vgic_lock_rank(v, rank);
         if ( dabt.size == 2 )
-            rank->itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR)] = *r;
+            rank->itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR)] = tr;
         else
             byte_write(&rank->itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR)],
-                       *r, offset);
+                       tr, offset);
         vgic_unlock_rank(v, rank);
         return 1;
 
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 8e37ccf..3950554 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -226,6 +226,8 @@  int gic_irq_xlate(const u32 *intspec, unsigned int intsize,
                   unsigned int *out_hwirq, unsigned int *out_type);
 void gic_clear_lrs(struct vcpu *v);
 
+struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int irq);
+
 #endif /* __ASSEMBLY__ */
 #endif
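
With the prototype exported above, later patches in the series can route an
interrupt to the vcpu currently selected by GICD_ITARGETSR. A minimal,
hypothetical caller sketch; vgic_vcpu_inject_spi is an assumed name and is
not introduced by this patch:

    /* Hypothetical sketch only: route an SPI to its current target vcpu. */
    void vgic_vcpu_inject_spi(struct domain *d, unsigned int irq)
    {
        /* SPIs are shared, so any vcpu of the domain can look up the rank. */
        struct vcpu *v_target = vgic_get_target_vcpu(d->vcpu[0], irq);

        vgic_vcpu_inject_irq(v_target, irq);
    }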