[Xen-devel] xen/arm: Handle platforms with edge-triggered virtual timer

Message ID 1417187826-5491-1-git-send-email-julien.grall@linaro.org
State Accepted, archived

Commit Message

Julien Grall Nov. 28, 2014, 3:17 p.m. UTC
Some platforms (such as Xgene and ARMv8 models) use an edge-triggered interrupt
for the virtual timer. Even if the timer output signal is masked in the
context switch, the GIC will keep track of any interrupts raised
while IRQs are disabled. As soon as IRQs are re-enabled, the virtual
timer interrupt will be injected to Xen.

If an idle VCPU was scheduled next then the interrupt handler doesn't
expect to receive the IRQ and will crash:

(XEN)    [<0000000000228388>] _spin_lock_irqsave+0x28/0x94 (PC)
(XEN)    [<0000000000228380>] _spin_lock_irqsave+0x20/0x94 (LR)
(XEN)    [<0000000000250510>] vgic_vcpu_inject_irq+0x40/0x1b0
(XEN)    [<000000000024bcd0>] vtimer_interrupt+0x4c/0x54
(XEN)    [<0000000000247010>] do_IRQ+0x1a4/0x220
(XEN)    [<0000000000244864>] gic_interrupt+0x50/0xec
(XEN)    [<000000000024fbac>] do_trap_irq+0x20/0x2c
(XEN)    [<0000000000255240>] hyp_irq+0x5c/0x60
(XEN)    [<0000000000241084>] context_switch+0xb8/0xc4
(XEN)    [<000000000022482c>] schedule+0x684/0x6d0
(XEN)    [<000000000022785c>] __do_softirq+0xcc/0xe8
(XEN)    [<00000000002278d4>] do_softirq+0x14/0x1c
(XEN)    [<0000000000240fac>] idle_loop+0x134/0x154
(XEN)    [<000000000024c160>] start_secondary+0x14c/0x15c
(XEN)    [<0000000000000001>] 0000000000000001

The proper solution is to context switch the virtual interrupt state at
the GIC level. This would also avoid masking the output signal which
requires specific handling in the guest OS and more complex code in Xen
to deal with EOIs, and so is desirable for that reason too.

Sadly, this solution requires some refactoring which would not be
suitable for a freeze exception for the Xen 4.5 release.

For now implement a temporary solution which ignores the virtual timer
interrupt when the idle VCPU is running.

Signed-off-by: Julien Grall <julien.grall@linaro.org>

---

Changes in v2:
    - Reword the commit message and comment in the code to explain the
    real bug. Based on Ian's reword.
    - Use unlikely

This patch is a bug fix candidate for Xen 4.5 and a backport candidate for
Xen 4.4. It affects at least the Xgene platform and ARMv8 models, where Xen
may randomly crash.

This patch doesn't inject the virtual timer interrupt if the current VCPU
is the idle one. For now, I think this patch is the safest way to resolve
the problem.

I will work on a proper solution for Xen 4.6.
---
 xen/arch/arm/time.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Comments

Julien Grall Nov. 28, 2014, 3:23 p.m. UTC | #1
I forgot to add "for 4.5" in the commit title.

On 28/11/14 15:17, Julien Grall wrote:
> This patch is a bug fix candidate for Xen 4.5 and backport for Xen 4.4.
> It affects at least Xgene platform and ARMv8 models where Xen may
> randomly crash.

Thinking a bit more about this, I believe it's possible for a guest to
crash the host if the next timer deadline is short enough and it executes
a yield. Might be unlikely ...

Regards,

Julien Grall Dec. 2, 2014, 2:08 p.m. UTC | #2
Hi Ian,

On 02/12/14 13:54, Ian Campbell wrote:
> On Fri, 2014-11-28 at 15:17 +0000, Julien Grall wrote:
>> Some platforms (such as Xgene and ARMv8 models) use an edge-triggered interrupt
>> for the virtual timer. Even if the timer output signal is masked in the
>> context switch, the GIC will keep track of any interrupts raised
>> while IRQs are disabled. As soon as IRQs are re-enabled, the virtual
>> timer interrupt will be injected to Xen.
>>
>> If an idle VCPU was scheduled next then the interrupt handler doesn't
>> expect to receive the IRQ and will crash:
>>
>> (XEN)    [<0000000000228388>] _spin_lock_irqsave+0x28/0x94 (PC)
>> (XEN)    [<0000000000228380>] _spin_lock_irqsave+0x20/0x94 (LR)
>> (XEN)    [<0000000000250510>] vgic_vcpu_inject_irq+0x40/0x1b0
>> (XEN)    [<000000000024bcd0>] vtimer_interrupt+0x4c/0x54
>> (XEN)    [<0000000000247010>] do_IRQ+0x1a4/0x220
>> (XEN)    [<0000000000244864>] gic_interrupt+0x50/0xec
>> (XEN)    [<000000000024fbac>] do_trap_irq+0x20/0x2c
>> (XEN)    [<0000000000255240>] hyp_irq+0x5c/0x60
>> (XEN)    [<0000000000241084>] context_switch+0xb8/0xc4
>> (XEN)    [<000000000022482c>] schedule+0x684/0x6d0
>> (XEN)    [<000000000022785c>] __do_softirq+0xcc/0xe8
>> (XEN)    [<00000000002278d4>] do_softirq+0x14/0x1c
>> (XEN)    [<0000000000240fac>] idle_loop+0x134/0x154
>> (XEN)    [<000000000024c160>] start_secondary+0x14c/0x15c
>> (XEN)    [<0000000000000001>] 0000000000000001
>>
>> The proper solution is to context switch the virtual interrupt state at
>> the GIC level. This would also avoid masking the output signal which
>> requires specific handling in the guest OS and more complex code in Xen
>> to deal with EOIs, and so is desirable for that reason too.
>>
>> Sadly, this solution requires some refactoring which would not be
>> suitable for a freeze exception for the Xen 4.5 release.
>>
>> For now implement a temporary solution which ignores the virtual timer
>> interrupt when the idle VCPU is running.
>>
> 
> When we reschedule the vcpu which caused the spurious interrupt, the IRQ
> will definitely trigger again for real, right?

Xen arms a timer when the domain is saved. Since we received an unexpected
interrupt, that timer has expired, which will make Xen inject the
virtual timer interrupt (see virt_timer_expired).


>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> I'll defer applying until you've said Yes to the above question.
> 
>> diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
>> index a6436f1..471d7a9 100644
>> --- a/xen/arch/arm/time.c
>> +++ b/xen/arch/arm/time.c
>> @@ -169,6 +169,19 @@ static void timer_interrupt(int irq, void *dev_id, struct cpu_user_regs *regs)
>>  
>>  static void vtimer_interrupt(int irq, void *dev_id, struct cpu_user_regs *regs)
>>  {
>> +    /*
>> +     * Edge-triggered interrupt can be used for the virtual timer. Even
> 
> "interrupts"
> 
>> +     * if the timer output signal is masked in the context switch, the
>> +     * GIC will keep track that of any interrupts raised while IRQS as
> 
> s/as/are/
> 
> I'll fix those on commit.

Thanks!

Regards,

Julien Grall Dec. 2, 2014, 2:32 p.m. UTC | #3
On 02/12/14 14:21, Ian Campbell wrote:
> On Tue, 2014-12-02 at 14:08 +0000, Julien Grall wrote:
>> Hi Ian,
>>
>> On 02/12/14 13:54, Ian Campbell wrote:
>>> On Fri, 2014-11-28 at 15:17 +0000, Julien Grall wrote:
>>>
>>> When we reschedule the vcpu which caused the spurious interrupt, the IRQ
>>> will definitely trigger again for real, right?
>>
>> Xen arms a timer when the domain is saved. Since we received an unexpected
>> interrupt, that timer has expired, which will make Xen inject the
>> virtual timer interrupt (see virt_timer_expired).
> 
> Are we sure there is no race here where the software timer doesn't fire
> because it appears to be in the past or something?
> 
> That would correspond to the sequence:
>    disable interrupts
>    h/w timer expires, interrupt raised but masked
>    calculate timeout for s/w timeout => -ve.

The s/w timer holds the absolute value of the deadline, which is
compared to NOW().

> Perhaps Xen s/w timers in the past fire immediately?

The s/w timer is added to the heap and a SOFTIRQ is raised.

When executed, the softirq will notice that the timer has to be fired
and therefore an interrupt will be injected into the guest.
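
To illustrate why a deadline that is already in the past is not lost, here
is a minimal, self-contained sketch of the behaviour described above. It is
a model only: model_timer, model_set_timer, model_timer_softirq and
model_arm_and_run are hypothetical names, not Xen's timer API, and the heap
is reduced to a single timer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a software timer; not Xen's struct timer. */
struct model_timer {
    int64_t expires;   /* absolute deadline, on the same clock as "now" */
    bool armed;
    bool fired;
};

/* Stands in for a raised timer softirq. */
static bool model_softirq_pending;

/* Arming always asks the softirq to re-evaluate deadlines. */
static void model_set_timer(struct model_timer *t, int64_t expires)
{
    t->expires = expires;
    t->armed = true;
    t->fired = false;
    model_softirq_pending = true;
}

/* The softirq compares the absolute deadline against the current time. */
static void model_timer_softirq(struct model_timer *t, int64_t now)
{
    model_softirq_pending = false;
    if ( t->armed && t->expires <= now )
    {
        t->armed = false;
        t->fired = true;   /* this is where the guest IRQ would be injected */
    }
}

/* Arm a timer at "deadline", run the pending softirq at time "now". */
static bool model_arm_and_run(int64_t now, int64_t deadline)
{
    struct model_timer t = { 0 };

    model_set_timer(&t, deadline);
    if ( model_softirq_pending )
        model_timer_softirq(&t, now);
    return t.fired;
}
```

In this model, model_arm_and_run(1000, 990) reports that the timer fired
even though the deadline had passed before arming, because no relative
timeout is ever computed; model_arm_and_run(1000, 1010) does not fire yet.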

Regards,

Patch

diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
index a6436f1..471d7a9 100644
--- a/xen/arch/arm/time.c
+++ b/xen/arch/arm/time.c
@@ -169,6 +169,19 @@  static void timer_interrupt(int irq, void *dev_id, struct cpu_user_regs *regs)
 
 static void vtimer_interrupt(int irq, void *dev_id, struct cpu_user_regs *regs)
 {
+    /*
+     * Edge-triggered interrupts can be used for the virtual timer. Even
+     * if the timer output signal is masked in the context switch, the
+     * GIC will keep track of any interrupts raised while IRQs are
+     * disabled. As soon as IRQs are re-enabled, the virtual interrupt
+     * will be injected to Xen.
+     *
+     * If an IDLE vCPU was scheduled next then we should ignore the
+     * interrupt.
+     */
+    if ( unlikely(is_idle_vcpu(current)) )
+        return;
+
     current->arch.virt_timer.ctl = READ_SYSREG32(CNTV_CTL_EL0);
     WRITE_SYSREG32(current->arch.virt_timer.ctl | CNTx_CTL_MASK, CNTV_CTL_EL0);
     vgic_vcpu_inject_irq(current, current->arch.virt_timer.irq);
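
The scenario the guard handles can be sketched outside Xen as a small
standalone model. Everything below is an assumption made for illustration:
model_gic, model_vcpu and the model_* functions are hypothetical and only
mimic the latching behaviour described in the commit message; they are not
the GIC driver or the vtimer code.

```c
#include <stdbool.h>

/* Hypothetical VCPU: only what the scenario needs. */
struct model_vcpu {
    bool is_idle;
    int injected;          /* virtual timer IRQs forwarded to this VCPU */
};

/* Hypothetical interrupt controller with one edge-triggered line. */
struct model_gic {
    bool irqs_enabled;
    bool vtimer_latched;   /* edge remembered while IRQs were disabled */
};

/* Mirrors the shape of the patched handler: bail out on the idle VCPU. */
static void model_vtimer_interrupt(struct model_vcpu *current)
{
    if ( current->is_idle )
        return;
    current->injected++;
}

/* A timer edge is delivered now, or latched if IRQs are disabled. */
static void model_timer_edge(struct model_gic *gic, struct model_vcpu *current)
{
    if ( gic->irqs_enabled )
        model_vtimer_interrupt(current);
    else
        gic->vtimer_latched = true;
}

/* Re-enabling IRQs delivers any latched edge to whoever is current. */
static void model_enable_irqs(struct model_gic *gic, struct model_vcpu *current)
{
    gic->irqs_enabled = true;
    if ( gic->vtimer_latched )
    {
        gic->vtimer_latched = false;
        model_vtimer_interrupt(current);
    }
}

/*
 * The guest's timer fires during the context switch, with IRQs disabled,
 * and the idle VCPU is current by the time IRQs are re-enabled.
 * Returns the total number of injections performed.
 */
static int model_switch_to_idle(void)
{
    struct model_vcpu guest = { .is_idle = false, .injected = 0 };
    struct model_vcpu idle = { .is_idle = true, .injected = 0 };
    struct model_gic gic = { .irqs_enabled = false, .vtimer_latched = false };

    model_timer_edge(&gic, &guest);    /* edge arrives, gets latched */
    model_enable_irqs(&gic, &idle);    /* delivered with idle as current */
    return guest.injected + idle.injected;
}
```

In the model, model_switch_to_idle() returns 0: the latched interrupt is
dropped rather than injected into the idle VCPU. The guest still gets its
interrupt later through the software timer armed when its state was saved,
as discussed in the thread.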