[Xen-devel,v6,2/5] xen/arm: inflight irqs during migration

Message ID 1403541463-23734-2-git-send-email-stefano.stabellini@eu.citrix.com

Commit Message

Stefano Stabellini June 23, 2014, 4:37 p.m. UTC
We need to take special care when migrating irqs that are already
inflight from one vcpu to another. See "The effect of changes to a
GICD_ITARGETSR", part of chapter 4.3.12 of the ARM Generic Interrupt
Controller Architecture Specification.

The main issue from the Xen point of view is that the lr_pending and
inflight lists are per-vcpu. The lock we take to protect them is also
per-vcpu.

In order to avoid issues, if the irq is still lr_pending, we can
immediately move it to the new vcpu for injection.

Otherwise if it is in a GICH_LR register, set a new flag
GIC_IRQ_GUEST_MIGRATING, so that we can recognize when we receive an irq
while the previous one is still inflight (given that we are only dealing
with hardware interrupts here, it just means that its LR hasn't been
cleared yet on the old vcpu).  If GIC_IRQ_GUEST_MIGRATING is set, we
only set GIC_IRQ_GUEST_QUEUED and interrupt the old vcpu. When clearing
the LR on the old vcpu, we take special care to inject the interrupt
into the new vcpu. To do that we need to release the old vcpu lock
before taking the new vcpu lock.

In vgic_vcpu_inject_irq, set GIC_IRQ_GUEST_QUEUED before testing
GIC_IRQ_GUEST_MIGRATING to avoid races with gic_update_one_lr.
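
To illustrate the intended pairing, here is a minimal sketch distilled
from the patch below (it assumes Xen's bitop and barrier primitives and
omits the locking around each side):

    /* injecting side, in vgic_vcpu_inject_irq: */
    set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
    smp_wmb();      /* update QUEUED before testing MIGRATING */
    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &n->status) )
        goto out;   /* the old vcpu's LR cleanup will re-inject it */

    /* cleanup side, in gic_update_one_lr, on the old vcpu: */
    m = test_and_clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status);
    smp_rmb();      /* check MIGRATING before QUEUED */
    q = test_bit(GIC_IRQ_GUEST_QUEUED, &p->status);
    if ( m && q )   /* drop the old vcpu lock, then inject anew */
        vgic_vcpu_inject_irq(vgic_get_target_vcpu(v, irq), irq);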

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

---

Changes in v6:
- remove the unnecessary (const unsigned long *) casts on the arguments
of find_first_bit and find_next_bit;
- deal with migrating an irq that is inflight and still lr_pending;
- replace the dsb with smp_wmb and smp_rmb, use them to ensure the order
of accesses to GIC_IRQ_GUEST_QUEUED and GIC_IRQ_GUEST_MIGRATING.

Changes in v5:
- pass unsigned long to find_next_bit for alignment on aarch64;
- call vgic_get_target_vcpu instead of gic_add_to_lr_pending to add the
irq to the right inflight queue;
- add barrier and bit tests to make sure that vgic_migrate_irq and
gic_update_one_lr can run simultaneously on different cpus without
issues;
- rework the loop to identify the new vcpu when ITARGETSR is written;
- use find_first_bit instead of find_next_bit.
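
The v5 change to pass unsigned long matters because find_first_bit and
find_next_bit scan whole unsigned long words, which are 64-bit on
aarch64: handing them a pointer to a 32-bit variable would over-read
past its end. A standalone sketch of the safe pattern (illustration
only, not code from this patch):

    uint32_t reg = *r;                   /* 32-bit value from the MMIO write */
    unsigned long val = reg;             /* widen to a full bitmap word */
    int i = find_next_bit(&val, 32, 0);  /* safely scans the low 32 bits */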
---
 xen/arch/arm/gic.c           |   26 ++++++++++++--
 xen/arch/arm/vgic.c          |   78 ++++++++++++++++++++++++++++++++++++------
 xen/include/asm-arm/domain.h |    4 +++
 3 files changed, 95 insertions(+), 13 deletions(-)

Comments

Julien Grall June 23, 2014, 8:14 p.m. UTC | #1
Hi Stefano,

On 23/06/14 17:37, Stefano Stabellini wrote:
> @@ -786,9 +837,14 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq)
>   
>       spin_lock_irqsave(&v->arch.vgic.lock, flags);
>   
> +    set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
> +    /* update QUEUED before MIGRATING */
> +    smp_wmb();
> +    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &n->status) )
> +        goto out;

Why do you kick the current VCPU here? It looks useless because the migration will take care of it.

> +
>       if ( !list_empty(&n->inflight) )
>       {
> -        set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
>           gic_raise_inflight_irq(v, irq);
>           goto out;
>       }
> @@ -796,6 +852,7 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq)
>       /* vcpu offline */
>       if ( test_bit(_VPF_down, &v->pause_flags) )
>       {
> +        clear_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
>           spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
>           return;

Rather than setting & clearing the GUEST_QUEUED bit, wouldn't it be better to move the if (test_bit(_VPF_down,...)) before the spin_lock?

If the VCPU is down here, it was likely already down before the lock was taken. Hence, the lock doesn't protect pause_flags. This would also make the code clearer.
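
A minimal sketch of the suggested reordering (assuming the surrounding
code of vgic_vcpu_inject_irq; this is not the posted patch):

    if ( test_bit(_VPF_down, &v->pause_flags) )
        return;    /* vcpu offline: drop the irq without taking the lock */

    spin_lock_irqsave(&v->arch.vgic.lock, flags);
    set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
    /* ... rest of the injection path; QUEUED never needs undoing ... */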

Regards,
Stefano Stabellini June 24, 2014, 11:57 a.m. UTC | #2
On Mon, 23 Jun 2014, Julien Grall wrote:
> Hi Stefano,
> 
> On 23/06/14 17:37, Stefano Stabellini wrote:
> > @@ -786,9 +837,14 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq)
> >   
> >       spin_lock_irqsave(&v->arch.vgic.lock, flags);
> >   
> > +    set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
> > +    /* update QUEUED before MIGRATING */
> > +    smp_wmb();
> > +    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &n->status) )
> > +        goto out;
> 
> Why do you kick the current VCPU here? It looks useless because the migration will take care of it.

With the physical follow virtual patch, the vcpu_unblock below is going
to be mostly useless but harmless as vgic_vcpu_inject_irq is going to be called on
the pcpu running the destination vcpu. smp_send_event_check_mask won't be called as
v == current.

On the other hand without that patch, the pcpu receiving the physical
interrupt could be different from any of the pcpus running the vcpus
involved in vcpu migration, therefore we would need the kick to wake up
the destination vcpu.


> > +
> >       if ( !list_empty(&n->inflight) )
> >       {
> > -        set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
> >           gic_raise_inflight_irq(v, irq);
> >           goto out;
> >       }
> > @@ -796,6 +852,7 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq)
> >       /* vcpu offline */
> >       if ( test_bit(_VPF_down, &v->pause_flags) )
> >       {
> > +        clear_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
> >           spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
> >           return;
> 
> Rather than setting & clearing the GUEST_QUEUED bit, wouldn't it be better to move the if (test_bit(_VPF_down,...)) before the spin_lock?
> 
> If the VCPU is down here, it was likely already down before the lock was taken. Hence, the lock doesn't protect pause_flags. This would also make the code clearer.

Good suggestion
Julien Grall June 24, 2014, 12:17 p.m. UTC | #3
On 24/06/14 12:57, Stefano Stabellini wrote:
> On Mon, 23 Jun 2014, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 23/06/14 17:37, Stefano Stabellini wrote:
>>> @@ -786,9 +837,14 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq)
>>>
>>>        spin_lock_irqsave(&v->arch.vgic.lock, flags);
>>>
>>> +    set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
>>> +    /* update QUEUED before MIGRATING */
>>> +    smp_wmb();
>>> +    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &n->status) )
>>> +        goto out;
>>
>> Why do you kick the current VCPU here? It looks useless because the migration will take care of it.
>
> With the physical follow virtual patch, the vcpu_unblock below is going
> to be mostly useless but harmless as vgic_vcpu_inject_irq is going to be called on
> the pcpu running the destination vcpu. smp_send_event_check_mask won't be called as
> v == current.
>
> On the other hand without that patch, the pcpu receiving the physical
> interrupt could be different from any of the pcpus running the vcpus
> involved in vcpu migration, therefore we would need the kick to wake up
> the destination vcpu.

If the MIGRATING bit is set it means that an IRQ is already inflight, 
and therefore gic_update_one_lr will take care of injecting this IRQ to 
the new VCPU by calling vgic_vcpu_inject_irq.
So kicking the new VCPU here is pointless and may unschedule another 
VCPU for nothing, when the latter might otherwise have run a bit longer.

With your patch #4 (physical IRQ follow virtual IRQ), there is a tiny 
window in which the VCPU may not run on the same pCPU where the 
physical IRQ has been routed. This is the case when the VCPU is unscheduled.

Regards,
Ian Campbell June 27, 2014, 3:37 p.m. UTC | #4
On Mon, 2014-06-23 at 17:37 +0100, Stefano Stabellini wrote:
> - replace the dsb with smp_wmb and smp_rmb, use them to ensure the order
> of accesses to GIC_IRQ_GUEST_QUEUED and GIC_IRQ_GUEST_MIGRATING.

You access/change those with test_bit et al which already include
appropriate barriers/ordering guarantees, I think.

The __test_foo are the variants without ordering.

IOW I think you can probably do without some/all of those barriers.

> @@ -546,6 +575,8 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
>      int offset = (int)(info->gpa - v->domain->arch.vgic.dbase);
>      int gicd_reg = REG(offset);
>      uint32_t tr;
> +    unsigned long trl;

Can you add /* Need unsigned long for XXX */?

Actually, even better would be to call this variable "target" or
something. (tgt?)

And even better than that would be:

 case GICD_ITARGETSR + 8 ... GICD_ITARGETSRN:
 {
      unsigned long target;
      ... stuff ...
      return 1;
 }

so the scope is limited to the uses.

> +    int i;
>  
>      switch ( gicd_reg )
>      {
[...]
> @@ -786,9 +837,14 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq)
>  
>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>  
> +    set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
> +    /* update QUEUED before MIGRATING */
                               ^testing

(otherwise I wonder why you aren't setting it)

> +    smp_wmb();
> +    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &n->status) )
> +        goto out;
> +
>      if ( !list_empty(&n->inflight) )
>      {
> -        set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
>          gic_raise_inflight_irq(v, irq);
>          goto out;
>      }

Ian.
Ian Campbell June 27, 2014, 3:40 p.m. UTC | #5
On Mon, 2014-06-23 at 17:37 +0100, Stefano Stabellini wrote:

> +        list_del_init(&p->lr_queue);
> +        list_del_init(&p->inflight);
> +        spin_unlock_irqrestore(&old->arch.vgic.lock, flags);
> +        vgic_vcpu_inject_irq(new, irq);
> +        return;
> +    /* if the IRQ is in a GICH_LR register, set GIC_IRQ_GUEST_MIGRATING
> +     * and wait for the EOI */

Is it safe that MIGRATING serves dual purpose -- it indicates an irq
which is actively being moved but it also indicates an irq where we are
awaiting an EOI before migrating it. (Or have I misunderstood what is
going on here?)

I suspect what I'm worried about is something completing the migration
without realising it needs to wait for an EOI?

Ian.
Stefano Stabellini July 2, 2014, 6:22 p.m. UTC | #6
On Fri, 27 Jun 2014, Ian Campbell wrote:
> On Mon, 2014-06-23 at 17:37 +0100, Stefano Stabellini wrote:
> > - replace the dsb with smp_wmb and smp_rmb, use them to ensure the order
> > of accesses to GIC_IRQ_GUEST_QUEUED and GIC_IRQ_GUEST_MIGRATING.
> 
> You access/change those with test_bit et al which already include
> appropriate barriers/ordering guarantees, I think.
> 
> The __test_foo are the variants without ordering.
> 
> IOW I think you can probably do without some/all of those barriers.
> 
> > @@ -546,6 +575,8 @@ static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
> >      int offset = (int)(info->gpa - v->domain->arch.vgic.dbase);
> >      int gicd_reg = REG(offset);
> >      uint32_t tr;
> > +    unsigned long trl;
> 
> Can you add /* Need unsigned long for XXX */?
> 
> Actually, even better would be to call this variable "target" or
> something. (tgt?)
> 
> And even better than that would be:
> 
>  case GICD_ITARGETSR + 8 ... GICD_ITARGETSRN:
>  {
>       unsigned long target;
>       ... stuff ...
>       return 1;
>  }

OK

> so the scope is limited to the uses.
> 
> > +    int i;
> >  
> >      switch ( gicd_reg )
> >      {
> [...]
> > @@ -786,9 +837,14 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq)
> >  
> >      spin_lock_irqsave(&v->arch.vgic.lock, flags);
> >  
> > +    set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
> > +    /* update QUEUED before MIGRATING */
>                                ^testing
> 
> (otherwise I wonder why you aren't setting it)

right

> > +    smp_wmb();
> > +    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &n->status) )
> > +        goto out;
> > +
> >      if ( !list_empty(&n->inflight) )
> >      {
> > -        set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
> >          gic_raise_inflight_irq(v, irq);
> >          goto out;
> >      }
> 
> Ian.
> 
>
Stefano Stabellini July 2, 2014, 6:33 p.m. UTC | #7
On Fri, 27 Jun 2014, Ian Campbell wrote:
> On Mon, 2014-06-23 at 17:37 +0100, Stefano Stabellini wrote:
> 
> > +        list_del_init(&p->lr_queue);
> > +        list_del_init(&p->inflight);
> > +        spin_unlock_irqrestore(&old->arch.vgic.lock, flags);
> > +        vgic_vcpu_inject_irq(new, irq);
> > +        return;
> > +    /* if the IRQ is in a GICH_LR register, set GIC_IRQ_GUEST_MIGRATING
> > +     * and wait for the EOI */
> 
> Is it safe that MIGRATING serves dual purpose -- it indicates an irq
> which is actively being moved but it also indicates an irq where we are
> awaiting an EOI before migrating it. (Or have I misunderstood what is
> going on here?)

As described in the commit message, MIGRATING is just to flag an irq
that has been migrated but for which we are also awaiting an EOI.
Maybe I can rename GIC_IRQ_GUEST_MIGRATING to
GIC_IRQ_GUEST_MIGRATING_EOI to make it more obvious that is targeting a
specific corner case.


> I suspect what I'm worried about is something completing the migration
> without realising it needs to wait for an EOI?

We carefully check the current status of the irq in vgic_migrate_irq
before setting GIC_IRQ_GUEST_MIGRATING. Only if the irq is already in
an LR register do we set GIC_IRQ_GUEST_MIGRATING.
Stefano Stabellini July 2, 2014, 10:27 p.m. UTC | #8
On Tue, 24 Jun 2014, Julien Grall wrote:
> On 24/06/14 12:57, Stefano Stabellini wrote:
> > On Mon, 23 Jun 2014, Julien Grall wrote:
> > > Hi Stefano,
> > > 
> > > On 23/06/14 17:37, Stefano Stabellini wrote:
> > > > @@ -786,9 +837,14 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned
> > > > int irq)
> > > > 
> > > >        spin_lock_irqsave(&v->arch.vgic.lock, flags);
> > > > 
> > > > +    set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
> > > > +    /* update QUEUED before MIGRATING */
> > > > +    smp_wmb();
> > > > +    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &n->status) )
> > > > +        goto out;
> > > 
> > > Why do you kick the current VCPU here? It looks useless because the
> > > migration will take care of it.
> > 
> > With the physical follow virtual patch, the vcpu_unblock below is going
> > to be mostly useless but harmless as vgic_vcpu_inject_irq is going to be
> > called on
> > the pcpu running the destination vcpu. smp_send_event_check_mask won't be
> > called as
> > v == current.
> > 
> > On the other hand without that patch, the pcpu receiving the physical
> > interrupt could be different from any of the pcpus running the vcpus
> > involved in vcpu migration, therefore we would need the kick to wake up
> > the destination vcpu.
> 
> If the MIGRATING bit is set it means that an IRQ is already inflight, and
> therefore gic_update_one_lr will take care of injecting this IRQ to the new
> VCPU by calling vgic_vcpu_inject_irq.
> So kicking the new VCPU here is pointless and may unschedule another VCPU for
> nothing, when the latter might otherwise have run a bit longer.
> 
> With your patch #4 (physical IRQ follow virtual IRQ), there is a tiny window
> in which the VCPU may not run on the same pCPU where the physical IRQ has been routed.
> This is the case when the VCPU is unscheduled.

I think that you are right. I'll remove the vcpu kick.
Also I realize now that I might be able to simplify the code by updating
the affinity of the physical irq only after the MIGRATING virq has been
EOIed.
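
A possible shape for that simplification, as a sketch only (it assumes
an irq_set_affinity() helper for the hardware irq and is not part of
the posted patch):

    /* in gic_update_one_lr, once the migrating irq's LR is cleared: */
    if ( test_and_clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
    {
        struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
        /* re-route the hardware irq only now that the old LR is done */
        irq_set_affinity(p->desc, cpumask_of(v_target->processor));
    }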
Ian Campbell July 3, 2014, 9:29 a.m. UTC | #9
On Wed, 2014-07-02 at 19:33 +0100, Stefano Stabellini wrote:
> On Fri, 27 Jun 2014, Ian Campbell wrote:
> > On Mon, 2014-06-23 at 17:37 +0100, Stefano Stabellini wrote:
> > 
> > > +        list_del_init(&p->lr_queue);
> > > +        list_del_init(&p->inflight);
> > > +        spin_unlock_irqrestore(&old->arch.vgic.lock, flags);
> > > +        vgic_vcpu_inject_irq(new, irq);
> > > +        return;
> > > +    /* if the IRQ is in a GICH_LR register, set GIC_IRQ_GUEST_MIGRATING
> > > +     * and wait for the EOI */
> > 
> > Is it safe that MIGRATING serves dual purpose -- it indicates an irq
> > which is actively being moved but it also indicates an irq where we are
> > awaiting an EOI before migrating it. (Or have I misunderstood what is
> > going on here?)
> 
> As described in the commit message, MIGRATING is just to flag an irq
> that has been migrated but for which we are also awaiting an EOI.

Ah, for some reason I thought it was also signalling an IRQ which was
currently being migrated.

How do you handle the race between writing the real itargets register
and inflight (real) interrupts? After the write you might still see some
interrupt on the old processor I think. I thought that was what the bit
was for.

e.g. on x86 I think they don't consider the interrupt to have migrated
until they observe the first one on the target processor. Maybe x86 int
controllers and ARM ones differ wrt this sort of thing though.

> Maybe I can rename GIC_IRQ_GUEST_MIGRATING to
> GIC_IRQ_GUEST_MIGRATING_EOI to make it more obvious that is targeting a
> specific corner case.
> 
> 
> > I suspect what I'm worried about is something completing the migration
> > without realising it needs to wait for an EOI?
> 
> We carefully check the current status of the irq in vgic_migrate_irq
> before setting GIC_IRQ_GUEST_MIGRATING. Only if the irq is already in
> an LR register do we set GIC_IRQ_GUEST_MIGRATING.

Assuming the above doesn't cause a rethink of the implementation then I
think just clarifying the comment:

+     * GIC_IRQ_GUEST_MIGRATING: the irq is being migrated to a different
+     * vcpu.

would be sufficient.

Ian.
Stefano Stabellini July 3, 2014, 2:43 p.m. UTC | #10
On Thu, 3 Jul 2014, Ian Campbell wrote:
> On Wed, 2014-07-02 at 19:33 +0100, Stefano Stabellini wrote:
> > On Fri, 27 Jun 2014, Ian Campbell wrote:
> > > On Mon, 2014-06-23 at 17:37 +0100, Stefano Stabellini wrote:
> > > 
> > > > +        list_del_init(&p->lr_queue);
> > > > +        list_del_init(&p->inflight);
> > > > +        spin_unlock_irqrestore(&old->arch.vgic.lock, flags);
> > > > +        vgic_vcpu_inject_irq(new, irq);
> > > > +        return;
> > > > +    /* if the IRQ is in a GICH_LR register, set GIC_IRQ_GUEST_MIGRATING
> > > > +     * and wait for the EOI */
> > > 
> > > Is it safe that MIGRATING serves dual purpose -- it indicates an irq
> > > which is actively being moved but it also indicates an irq where we are
> > > awaiting an EOI before migrating it. (Or have I misunderstood what is
> > > going on here?)
> > 
> > As described in the commit message, MIGRATING is just to flag an irq
> > that has been migrated but for which we are also awaiting an EOI.
> 
> Ah, for some reason I thought it was also signalling an IRQ which was
> currently being migrated.
> 
> How do you handle the race between writing the real itargets register
> and inflight (real) interrupts? After the write you might still see some
> interrupt on the old processor I think. I thought that was what the bit
> was for.

Not a problem: a new interrupt can still come to the old processor and
the gic driver would know how to inject it properly. It could even
arrive on a processor that is neither the old nor the new one and
everything would still work.
The only case that needs care is when an irq is still on an LR register
on the old processor: the MIGRATING bit is for that.


> e.g. on x86 I think they don't consider the interrupt to have migrated
> until they observe the first one on the target processor. Maybe x86 int
> controllers and ARM ones differ wrt this sort of thing though.

I don't think that is an issue for us either way, as long as we take the
right vcpu lock and we have code for that now.


> > Maybe I can rename GIC_IRQ_GUEST_MIGRATING to
> > GIC_IRQ_GUEST_MIGRATING_EOI to make it more obvious that is targeting a
> > specific corner case.
> > 
> > 
> > > I suspect what I'm worried about is something completing the migration
> > > without realising it needs to wait for an EOI?
> > 
> > We carefully check the current status of the irq in vgic_migrate_irq
> > before setting GIC_IRQ_GUEST_MIGRATING. Only if the irq is already in
> > an LR register do we set GIC_IRQ_GUEST_MIGRATING.
> 
> Assuming the above doesn't cause a rethink of the implementation then I
> think just clarifying the comment:
> 
> +     * GIC_IRQ_GUEST_MIGRATING: the irq is being migrated to a different
> +     * vcpu.
> 
> would be sufficient.

OK

Patch

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 4d2a92d..c7dda9b 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -683,10 +683,32 @@  static void gic_update_one_lr(struct vcpu *v, int i)
         clear_bit(GIC_IRQ_GUEST_ACTIVE, &p->status);
         p->lr = GIC_INVALID_LR;
         if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) &&
-             test_bit(GIC_IRQ_GUEST_QUEUED, &p->status) )
+             test_bit(GIC_IRQ_GUEST_QUEUED, &p->status) &&
+             !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
             gic_raise_guest_irq(v, irq, p->priority);
-        else
+        else {
+            int m, q;
             list_del_init(&p->inflight);
+
+            m = test_and_clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status);
+            /* check MIGRATING before QUEUED */
+            smp_rmb();
+            q = test_bit(GIC_IRQ_GUEST_QUEUED, &p->status);
+            if ( m && q )
+            {
+                struct vcpu *v_target;
+
+                /* It is safe to temporarily drop the lock because we
+                 * are finished dealing with this LR. We'll take the
+                 * lock before reading the next. */
+                spin_unlock(&v->arch.vgic.lock);
+                /* vgic_get_target_vcpu takes the rank lock, ensuring
+                 * consistency with other itarget changes. */
+                v_target = vgic_get_target_vcpu(v, irq);
+                vgic_vcpu_inject_irq(v_target, irq);
+                spin_lock(&v->arch.vgic.lock);
+            }
+        }
     }
 }
 
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 1e1c244..7924629 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -409,6 +409,35 @@  struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int irq)
     return v_target;
 }
 
+static void vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq)
+{
+    unsigned long flags;
+    struct pending_irq *p = irq_to_pending(old, irq);
+
+    /* nothing to do for virtual interrupts */
+    if ( p->desc == NULL )
+        return;
+
+    /* migration already in progress, no need to do anything */
+    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
+        return;
+
+    spin_lock_irqsave(&old->arch.vgic.lock, flags);
+    /* If the IRQ is still lr_pending, re-inject it to the new vcpu */
+    if ( !list_empty(&p->lr_queue) )
+    {
+        list_del_init(&p->lr_queue);
+        list_del_init(&p->inflight);
+        spin_unlock_irqrestore(&old->arch.vgic.lock, flags);
+        vgic_vcpu_inject_irq(new, irq);
+        return;
+    /* if the IRQ is in a GICH_LR register, set GIC_IRQ_GUEST_MIGRATING
+     * and wait for the EOI */
+    } else if ( !list_empty(&p->inflight) )
+        set_bit(GIC_IRQ_GUEST_MIGRATING, &p->status);
+    spin_unlock_irqrestore(&old->arch.vgic.lock, flags);
+}
+
 static void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n)
 {
     const unsigned long mask = r;
@@ -546,6 +575,8 @@  static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
     int offset = (int)(info->gpa - v->domain->arch.vgic.dbase);
     int gicd_reg = REG(offset);
     uint32_t tr;
+    unsigned long trl;
+    int i;
 
     switch ( gicd_reg )
     {
@@ -630,25 +661,45 @@  static int vgic_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
         if ( rank == NULL) goto write_ignore;
         /* 8-bit vcpu mask for this domain */
         BUG_ON(v->domain->max_vcpus > 8);
-        tr = (1 << v->domain->max_vcpus) - 1;
+        trl = (1 << v->domain->max_vcpus) - 1;
         if ( dabt.size == 2 )
-            tr = tr | (tr << 8) | (tr << 16) | (tr << 24);
+            trl = trl | (trl << 8) | (trl << 16) | (trl << 24);
         else
-            tr = (tr << (8 * (offset & 0x3)));
-        tr &= *r;
+            trl = (trl << (8 * (offset & 0x3)));
+        trl &= *r;
         /* ignore zero writes */
-        if ( !tr )
+        if ( !trl )
             goto write_ignore;
         if ( dabt.size == 2 &&
-            !((tr & 0xff) && (tr & (0xff << 8)) &&
-             (tr & (0xff << 16)) && (tr & (0xff << 24))))
+            !((trl & 0xff) && (trl & (0xff << 8)) &&
+             (trl & (0xff << 16)) && (trl & (0xff << 24))))
             goto write_ignore;
         vgic_lock_rank(v, rank);
+        i = 0;
+        while ( (i = find_next_bit(&trl, 32, i)) < 32 )
+        {
+            unsigned int irq, target, old_target;
+            unsigned long old_target_mask;
+            struct vcpu *v_target, *v_old;
+
+            target = i % 8;
+            old_target_mask = byte_read(rank->itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR)], 0, i/8);
+            old_target = find_first_bit(&old_target_mask, 8);
+
+            if ( target != old_target )
+            {
+                irq = offset + (i / 8);
+                v_target = v->domain->vcpu[target];
+                v_old = v->domain->vcpu[old_target];
+                vgic_migrate_irq(v_old, v_target, irq);
+            }
+            i += 8 - target;
+        }
         if ( dabt.size == 2 )
-            rank->itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR)] = tr;
+            rank->itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR)] = trl;
         else
             byte_write(&rank->itargets[REG_RANK_INDEX(8, gicd_reg - GICD_ITARGETSR)],
-                       tr, offset);
+                       trl, offset);
         vgic_unlock_rank(v, rank);
         return 1;
 
@@ -786,9 +837,14 @@  void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq)
 
     spin_lock_irqsave(&v->arch.vgic.lock, flags);
 
+    set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
+    /* update QUEUED before MIGRATING */
+    smp_wmb();
+    if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &n->status) )
+        goto out;
+
     if ( !list_empty(&n->inflight) )
     {
-        set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
         gic_raise_inflight_irq(v, irq);
         goto out;
     }
@@ -796,6 +852,7 @@  void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq)
     /* vcpu offline */
     if ( test_bit(_VPF_down, &v->pause_flags) )
     {
+        clear_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
         spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
         return;
     }
@@ -803,7 +860,6 @@  void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq)
     priority = byte_read(rank->ipriority[REG_RANK_INDEX(8, idx)], 0, byte);
 
     n->irq = irq;
-    set_bit(GIC_IRQ_GUEST_QUEUED, &n->status);
     n->priority = priority;
 
     /* the irq is enabled */
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 2cb6837..12b2a1d 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -55,11 +55,15 @@  struct pending_irq
      * GIC_IRQ_GUEST_ENABLED: the guest IRQ is enabled at the VGICD
      * level (GICD_ICENABLER/GICD_ISENABLER).
      *
+     * GIC_IRQ_GUEST_MIGRATING: the irq is being migrated to a different
+     * vcpu.
+     *
      */
 #define GIC_IRQ_GUEST_QUEUED   0
 #define GIC_IRQ_GUEST_ACTIVE   1
 #define GIC_IRQ_GUEST_VISIBLE  2
 #define GIC_IRQ_GUEST_ENABLED  3
+#define GIC_IRQ_GUEST_MIGRATING   4
     unsigned long status;
     struct irq_desc *desc; /* only set it the irq corresponds to a physical irq */
     int irq;
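
To make the ITARGETSR byte-walk in the vgic.c hunk above concrete, here
is a standalone toy program (an illustration, not Xen code): each byte
of a 32-bit GICD_ITARGETSR write is an 8-bit vcpu mask, and the first
set bit in each non-zero byte picks the new target, which is what the
find_next_bit loop computes.

    #include <stdio.h>
    #include <strings.h>   /* ffs() */

    int main(void)
    {
        /* example guest write: four irqs, one target byte each;
         * bit N set in a byte means "vcpu N is a target" */
        unsigned long write = 0x02010402UL;
        unsigned int byte;

        for ( byte = 0; byte < 4; byte++ )
        {
            unsigned int mask = (write >> (8 * byte)) & 0xff;
            if ( !mask )
                continue;   /* the patch treats zero masks as write_ignore */
            printf("irq offset %u -> vcpu%d\n", byte, ffs(mask) - 1);
        }
        return 0;
    }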