diff mbox

[Xen-devel] Xen 4.5 random freeze question

Message ID alpine.DEB.2.02.1411181703140.27247@kaball.uk.xensource.com
State New
Headers show

Commit Message

Stefano Stabellini Nov. 18, 2014, 5:51 p.m. UTC
Hello Andrii,
we are getting closer :-)

It would help if you post the output with GIC_DEBUG defined but without
the other change that "fixes" the issue.

I think the problem is probably due to software irqs.
You are getting too many

gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is still lr_pending

messages. That means you are loosing virtual SGIs (guest VCPU to guest
VCPU). It would be best to investigate why, especially if you get many
more of the same messages without the MAINTENANCE_IRQ change I
suggested.

This patch might also help understading the problem more:






On Tue, 18 Nov 2014, Andrii Tseglytskyi wrote:
> Hi Stefano,
> 
> No hangs with this change.
> Complete log is the following:
> 
> U-Boot SPL 2013.10-00499-g062782f (Oct 14 2014 - 11:36:26)
> DRA752 ES1.0
> <ethaddr> not set. Validating first E-fuse MAC
> cpsw
> - UART enabled -
> - CPU 00000000 booting -
> - Xen starting in Hyp mode -
> - Zero BSS -
> - Setting up control registers -
> - Turning on paging -
> - Ready -
> (XEN) Checking for initrd in /chosen
> (XEN) RAM: 0000000080000000 - 000000009fffffff
> (XEN) RAM: 00000000a0000000 - 00000000bfffffff
> (XEN) RAM: 00000000c0000000 - 00000000dfffffff
> (XEN)
> (XEN) MODULE[1]: 00000000c2000000 - 00000000c20069aa
> (XEN) MODULE[2]: 00000000c0000000 - 00000000c2000000
> (XEN) MODULE[3]: 0000000000000000 - 0000000000000000
> (XEN) MODULE[4]: 00000000c3000000 - 00000000c3010000
> (XEN)  RESVD[0]: 00000000ba300000 - 00000000bfd00000
> (XEN)  RESVD[1]: 0000000095800000 - 0000000095900000
> (XEN)  RESVD[2]: 0000000098a00000 - 0000000098b00000
> (XEN)  RESVD[3]: 0000000095f00000 - 0000000098a00000
> (XEN)  RESVD[4]: 0000000095900000 - 0000000095f00000
> (XEN)
> (XEN) Command line: dom0_mem=128M console=dtuart dtuart=serial0
> dom0_max_vcpus=2 bootscrub=0 flask_enforcing=1
> (XEN) Placing Xen at 0x00000000dfe00000-0x00000000e0000000
> (XEN) Xen heap: 00000000d2000000-00000000de000000 (49152 pages)
> (XEN) Dom heap: 344064 pages
> (XEN) Domain heap initialised
> (XEN) Looking for UART console serial0
>  Xen 4.5-unstable
> (XEN) Xen version 4.5-unstable (atseglytskyi@)
> (arm-linux-gnueabihf-gcc (crosstool-NG
> linaro-1.13.1-4.7-2013.04-20130415 - Linaro GCC 2013.04) 4.7.3
> 20130328 (prerelease)) debu4
> (XEN) Latest ChangeSet: Thu Jul 3 12:55:26 2014 +0300 git:3ee354f-dirty
> (XEN) Processor: 412fc0f2: "ARM Limited", variant: 0x2, part 0xc0f, rev 0x2
> (XEN) 32-bit Execution:
> (XEN)   Processor Features: 00001131:00011011
> (XEN)     Instruction Sets: AArch32 Thumb Thumb-2 ThumbEE Jazelle
> (XEN)     Extensions: GenericTimer Security
> (XEN)   Debug Features: 02010555
> (XEN)   Auxiliary Features: 00000000
> (XEN)   Memory Model Features: 10201105 20000000 01240000 02102211
> (XEN)  ISA Features: 02101110 13112111 21232041 11112131 10011142 00000000
> (XEN) Platform: TI DRA7
> (XEN) /psci method must be smc, but is: "hvc"
> (XEN) Set AuxCoreBoot1 to 00000000dfe0004c (0020004c)
> (XEN) Set AuxCoreBoot0 to 0x20
> (XEN) Generic Timer IRQ: phys=30 hyp=26 virt=27
> (XEN) Using generic timer at 6144 KHz
> (XEN) GIC initialization:
> (XEN)         gic_dist_addr=0000000048211000
> (XEN)         gic_cpu_addr=0000000048212000
> (XEN)         gic_hyp_addr=0000000048214000
> (XEN)         gic_vcpu_addr=0000000048216000
> (XEN)         gic_maintenance_irq=25
> (XEN) GIC: 192 lines, 2 cpus, secure (IID 0000043b).
> (XEN) Using scheduler: SMP Credit Scheduler (credit)
> (XEN) I/O virtualisation disabled
> (XEN) Allocated console ring of 16 KiB.
> (XEN) VFP implementer 0x41 architecture 4 part 0x30 variant 0xf rev 0x0
> (XEN) Bringing up CPU1
> - CPU 00000001 booting -
> - Xen starting in Hyp mode -
> - Setting up control registers -
> - Turning on paging -
> - Ready -
> (XEN) CPU 1 booted.
> (XEN) Brought up 2 CPUs
> (XEN) *** LOADING DOMAIN 0 ***
> (XEN) Loading kernel from boot module 2
> (XEN) Populate P2M 0xc8000000->0xd0000000 (1:1 mapping for dom0)
> (XEN) Loading zImage from 00000000c0000040 to 00000000cfc00000-00000000cff50c48
> (XEN) Loading dom0 DTB to 0x00000000cfa00000-0x00000000cfa05ba8
> (XEN) Std. Loglevel: All
> (XEN) Guest Loglevel: All
> (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch
> input to Xen)
> (XEN) Freed 272kB init memory.
> (XEN) gic.c:673:d0v0 trying to inject irq=2 into d0v0, when it is
> already pending in LR0
> (XEN) gic.c:673:d0v0 trying to inject irq=2 into d0v0, when it is
> already pending in LR0
> [    0.000000] /cpus/cpu@0 missing clock-frequency property
> [    0.000000] /cpus/cpu@1 missing clock-frequency property
> [    0.140625] omap-gpmc omap-gpmc: failed to reserve memory
> [    0.187500] omap_l3_noc ocp.3: couldn't find resource 2
> [    0.273437] i2c i2c-1: of_i2c: invalid reg on
> /ocp/i2c@48072000/camera_ov10635
> [    0.437500] ldo3: operation not allowed
> [    0.437500] omapdss HDMI error: can't set the voltage regulator
> [    0.468750] tfc_s9700 display0: tfc_s9700_probe probe
> [    0.468750] ov1063x 1-0030: No deserializer node found
> [    0.468750] ov1063x 1-0030: No serializer node found
> [    0.468750] ov1063x 1-0030: Failed writing register 0x0103!
> [    0.468750] dra7xx-vip vip1-0: Waiting for I2C subdevice 30
> [    0.578125] ahci ahci.0.auto: can't get clock
> [    0.898437] ldc_module_init
> [    1.304687] Missing dual_emac_res_vlan in DT.
> [    1.304687] Using 1 as Reserved VLAN for 0 slave
> [    1.312500] Missing dual_emac_res_vlan in DT.
> [    1.320312] Using 2 as Reserved VLAN for 1 slave
> [    1.382812] Freeing init memory: 236K
> sh: write error: No such device
> Cannot identify '/dev/camera0': 2, No such file or directory
> Parsing config from /xen/images/DomUAndroid.cfg
> XSM Disabled: seclabel not supported
> (XEN) do_physdev_op 16 cmd=13: not implemented yet
> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
> dom1 access to irq 53: Function not implemented
> (XEN) do_physdev_op 16 cmd=13: not implemented yet
> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
> dom1 access to irq 71: Function not implemented
> (XEN) do_physdev_op 16 cmd=13: not implemented yet
> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
> dom1 access to irq 173: Function not implemented
> (XEN) do_physdev_op 16 cmd=13: not implemented yet
> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
> dom1 access to irq 174: Function not implemented
> Turning on vfb in domain 1
> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
> still lr_pending
> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
> still lr_pending
> Parsing config from /xen/images/DomUQNX.cfg
> XSM Disabled: seclabel not supported(XEN) gic.c:617:d0v1 trying to
> inject irq=2 into d0v0, when it is still lr_pending
> 
> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
> still lr_pending
> [    4.304687] dra7-evm-sound sound.17: cpu dai node is invalid
> [    4.312500] dra7-evm-sound sound.17: failed to add bluetooth dai link -22
> xc: error: panic: xc_dom_core.c:644: xc_dom_find_loader: no loader
> found: Invalid kernel
> libxl: error: libxl_dom.c:436:libxl__build_pv: xc_dom_parse_image
> failed: No such file or directory
> libxl: error: libxl_create.c:1030:domcreate_rebuild_done: cannot
> (re-)build domain: -3
> (XEN) gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is
> still lr_pending
> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
> still lr_pending
> Turning on 'vsnd' in domain '1' (dev_id: '0')
> Turning on vkbd in domain 1
> (XEN) gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is
> still lr_pending
> (XEN) gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is
> still lr_pending
> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
> still lr_pending
> 
> Please press Enter to activate this console. (XEN) gic.c:617:d0v1
> trying to inject irq=2 into d0v0, when it is still lr_pending
> 
> On Tue, Nov 18, 2014 at 6:18 PM, Andrii Tseglytskyi
> <andrii.tseglytskyi@globallogic.com> wrote:
> > OK got it. Give me a few mins
> >
> > On Tue, Nov 18, 2014 at 6:14 PM, Stefano Stabellini
> > <stefano.stabellini@eu.citrix.com> wrote:
> >> It is not the same: I would like to set GICH_V2_LR_MAINTENANCE_IRQ only
> >> for non-hardware irqs (desc == NULL) and keep avoiding
> >> GICH_V2_LR_MAINTENANCE_IRQ and setting GICH_LR_HW for hardware irqs.
> >>
> >> Also testing on 394b7e587b05d0f4a5fd6f067b38339ab5a77121 would avoid
> >> other potential bugs introduced later.
> >>
> >> On Tue, 18 Nov 2014, Andrii Tseglytskyi wrote:
> >>> What if I try on top of current master branch the following code:
> >>>
> >>> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
> >>> index 31fb81a..6764ab7 100644
> >>> --- a/xen/arch/arm/gic-v2.c
> >>> +++ b/xen/arch/arm/gic-v2.c
> >>> @@ -36,6 +36,8 @@
> >>>  #include <asm/io.h>
> >>>  #include <asm/gic.h>
> >>>
> >>> +#define GIC_DEBUG 1
> >>> +
> >>>  /*
> >>>   * LR register definitions are GIC v2 specific.
> >>>   * Moved these definitions from header file to here
> >>> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> >>> index bcaded9..c03d6a6 100644
> >>> --- a/xen/arch/arm/gic.c
> >>> +++ b/xen/arch/arm/gic.c
> >>> @@ -41,7 +41,7 @@ static DEFINE_PER_CPU(uint64_t, lr_mask);
> >>>
> >>>  #define lr_all_full() (this_cpu(lr_mask) == ((1 <<
> >>> gic_hw_ops->info->nr_lrs) - 1))
> >>>
> >>> -#undef GIC_DEBUG
> >>> +#define GIC_DEBUG 1
> >>>
> >>>  static void gic_update_one_lr(struct vcpu *v, int i);
> >>>
> >>> It is equivalent to what you proposing - my code contains
> >>> PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI, as result the following lone will
> >>> be executed:
> >>>  lr_reg |= GICH_V2_LR_MAINTENANCE_IRQ; inside gicv2_update_lr() function
> >>>
> >>> regards,
> >>> Andrii
> >>>
> >>> On Tue, Nov 18, 2014 at 5:39 PM, Stefano Stabellini
> >>> <stefano.stabellini@eu.citrix.com> wrote:
> >>> > On Tue, 18 Nov 2014, Andrii Tseglytskyi wrote:
> >>> >> OK, I see that GICH_V2_LR_MAINTENANCE_IRQ must always be set and
> >>> >> everything works fine
> >>> >> The following 2 patches fixes xen/master for my platform.
> >>> >>
> >>> >> Stefano, could you please take a look to these changes?
> >>> >>
> >>> >> commit 3628a0aa35706a8f532af865ed784536ce514eca
> >>> >> Author: Andrii Tseglytskyi <andrii.tseglytskyi@globallogic.com>
> >>> >> Date:   Tue Nov 18 14:20:42 2014 +0200
> >>> >>
> >>> >>     xen/arm: dra7: always set GICH_V2_LR_MAINTENANCE_IRQ flag
> >>> >>
> >>> >>     Change-Id: Ia380b3507a182b11592588f65fd23693d4f87434
> >>> >>     Signed-off-by: Andrii Tseglytskyi <andrii.tseglytskyi@globallogic.com>
> >>> >>
> >>> >> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
> >>> >> index 31fb81a..093ecdb 100644
> >>> >> --- a/xen/arch/arm/gic-v2.c
> >>> >> +++ b/xen/arch/arm/gic-v2.c
> >>> >> @@ -396,13 +396,14 @@ static void gicv2_update_lr(int lr, const struct
> >>> >> pending_irq *p,
> >>> >>                                               << GICH_V2_LR_PRIORITY_SHIFT) |
> >>> >>                ((p->irq & GICH_V2_LR_VIRTUAL_MASK) <<
> >>> >> GICH_V2_LR_VIRTUAL_SHIFT));
> >>> >>
> >>> >> -    if ( p->desc != NULL )
> >>> >> +    if ( platform_has_quirk(PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI) )
> >>> >>      {
> >>> >> -        if ( platform_has_quirk(PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI) )
> >>> >> -            lr_reg |= GICH_V2_LR_MAINTENANCE_IRQ;
> >>> >> -        else
> >>> >> -            lr_reg |= GICH_V2_LR_HW | ((p->desc->irq &
> >>> >> GICH_V2_LR_PHYSICAL_MASK )
> >>> >> -                            << GICH_V2_LR_PHYSICAL_SHIFT);
> >>> >> +        lr_reg |= GICH_V2_LR_MAINTENANCE_IRQ;
> >>> >> +    }
> >>> >> +    else if ( p->desc != NULL )
> >>> >> +    {
> >>> >> +        lr_reg |= GICH_V2_LR_HW | ((p->desc->irq & GICH_V2_LR_PHYSICAL_MASK )
> >>> >> +                       << GICH_V2_LR_PHYSICAL_SHIFT);
> >>> >>      }
> >>> >>
> >>> >>      writel_gich(lr_reg, GICH_LR + lr * 4);
> >>> >
> >>> > Actually in case p->desc == NULL (the irq is not an hardware irq, it
> >>> > could be the virtual timer irq or the evtchn irq), you shouldn't need
> >>> > the maintenance interrupt, if the bug was really due to GICH_LR_HW not
> >>> > working correctly on OMAP5. This changes might only be better at
> >>> > "hiding" the real issue.
> >>> >
> >>> > Maybe the problem is exactly the opposite: the new scheme for avoiding
> >>> > maintenance interrupts doesn't work for software interrupts.
> >>> > The commit that should make them work correctly after the
> >>> > no-maintenance-irq commit is 394b7e587b05d0f4a5fd6f067b38339ab5a77121
> >>> > If you look at the changes to gic_update_one_lr in that commit, you'll
> >>> > see that is going to set a software irq as PENDING if it is already ACTIVE.
> >>> > Maybe that doesn't work correctly on OMAP5.
> >>> >
> >>> > Could you try this patch on top of
> >>> > 394b7e587b05d0f4a5fd6f067b38339ab5a77121?  It should help us understand
> >>> > if the problem is specifically with software irqs.
> >>> >
> >>> >
> >>> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> >>> > index b7516c0..d8a17c9 100644
> >>> > --- a/xen/arch/arm/gic.c
> >>> > +++ b/xen/arch/arm/gic.c
> >>> > @@ -66,7 +66,7 @@ static DEFINE_PER_CPU(u8, gic_cpu_id);
> >>> >  /* Maximum cpu interface per GIC */
> >>> >  #define NR_GIC_CPU_IF 8
> >>> >
> >>> > -#undef GIC_DEBUG
> >>> > +#define GIC_DEBUG 1
> >>> >
> >>> >  static void gic_update_one_lr(struct vcpu *v, int i);
> >>> >
> >>> > @@ -563,6 +563,8 @@ static inline void gic_set_lr(int lr, struct pending_irq *p,
> >>> >          ((p->irq & GICH_LR_VIRTUAL_MASK) << GICH_LR_VIRTUAL_SHIFT);
> >>> >      if ( p->desc != NULL )
> >>> >          lr_val |= GICH_LR_HW | (p->desc->irq << GICH_LR_PHYSICAL_SHIFT);
> >>> > +    else
> >>> > +        lr_val |= GICH_LR_MAINTENANCE_IRQ;
> >>> >
> >>> >      GICH[GICH_LR + lr] = lr_val;
> >>> >
> >>>
> >>>
> >>>
> >>> --
> >>>
> >>> Andrii Tseglytskyi | Embedded Dev
> >>> GlobalLogic
> >>> www.globallogic.com
> >>>
> >
> >
> >
> > --
> >
> > Andrii Tseglytskyi | Embedded Dev
> > GlobalLogic
> > www.globallogic.com
> 
> 
> 
> -- 
> 
> Andrii Tseglytskyi | Embedded Dev
> GlobalLogic
> www.globallogic.com
>

Comments

Andrii Tseglytskyi Nov. 19, 2014, 9:38 a.m. UTC | #1
Hi Stefano,

Thank you for your support.

You are right - with latest change you've proposed I got a continuous
prints during platform hang:

(XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
(XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
(XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
(XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
(XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
(XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
(XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0

Looks line issue needs further deeper debugging.

Regards,
Andrii

On Tue, Nov 18, 2014 at 7:51 PM, Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:
> Hello Andrii,
> we are getting closer :-)
>
> It would help if you post the output with GIC_DEBUG defined but without
> the other change that "fixes" the issue.
>
> I think the problem is probably due to software irqs.
> You are getting too many
>
> gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is still lr_pending
>
> messages. That means you are loosing virtual SGIs (guest VCPU to guest
> VCPU). It would be best to investigate why, especially if you get many
> more of the same messages without the MAINTENANCE_IRQ change I
> suggested.
>
> This patch might also help understading the problem more:
>
>
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index b7516c0..5eaeca2 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -717,7 +717,12 @@ static void gic_restore_pending_irqs(struct vcpu *v)
>      list_for_each_entry_safe ( p, t, &v->arch.vgic.lr_pending, lr_queue )
>      {
>          i = find_first_zero_bit(&this_cpu(lr_mask), nr_lrs);
> -        if ( i >= nr_lrs ) return;
> +        if ( i >= nr_lrs )
> +        {
> +            gdprintk(XENLOG_DEBUG, "LRs full, not injecting irq=%u into d%dv%d\n",
> +                    p->irq, v->domain->domain_id, v->vcpu_id);
> +            continue;
> +        }
>
>          spin_lock_irqsave(&gic.lock, flags);
>          gic_set_lr(i, p, GICH_LR_PENDING);
>
>
>
>
> On Tue, 18 Nov 2014, Andrii Tseglytskyi wrote:
>> Hi Stefano,
>>
>> No hangs with this change.
>> Complete log is the following:
>>
>> U-Boot SPL 2013.10-00499-g062782f (Oct 14 2014 - 11:36:26)
>> DRA752 ES1.0
>> <ethaddr> not set. Validating first E-fuse MAC
>> cpsw
>> - UART enabled -
>> - CPU 00000000 booting -
>> - Xen starting in Hyp mode -
>> - Zero BSS -
>> - Setting up control registers -
>> - Turning on paging -
>> - Ready -
>> (XEN) Checking for initrd in /chosen
>> (XEN) RAM: 0000000080000000 - 000000009fffffff
>> (XEN) RAM: 00000000a0000000 - 00000000bfffffff
>> (XEN) RAM: 00000000c0000000 - 00000000dfffffff
>> (XEN)
>> (XEN) MODULE[1]: 00000000c2000000 - 00000000c20069aa
>> (XEN) MODULE[2]: 00000000c0000000 - 00000000c2000000
>> (XEN) MODULE[3]: 0000000000000000 - 0000000000000000
>> (XEN) MODULE[4]: 00000000c3000000 - 00000000c3010000
>> (XEN)  RESVD[0]: 00000000ba300000 - 00000000bfd00000
>> (XEN)  RESVD[1]: 0000000095800000 - 0000000095900000
>> (XEN)  RESVD[2]: 0000000098a00000 - 0000000098b00000
>> (XEN)  RESVD[3]: 0000000095f00000 - 0000000098a00000
>> (XEN)  RESVD[4]: 0000000095900000 - 0000000095f00000
>> (XEN)
>> (XEN) Command line: dom0_mem=128M console=dtuart dtuart=serial0
>> dom0_max_vcpus=2 bootscrub=0 flask_enforcing=1
>> (XEN) Placing Xen at 0x00000000dfe00000-0x00000000e0000000
>> (XEN) Xen heap: 00000000d2000000-00000000de000000 (49152 pages)
>> (XEN) Dom heap: 344064 pages
>> (XEN) Domain heap initialised
>> (XEN) Looking for UART console serial0
>>  Xen 4.5-unstable
>> (XEN) Xen version 4.5-unstable (atseglytskyi@)
>> (arm-linux-gnueabihf-gcc (crosstool-NG
>> linaro-1.13.1-4.7-2013.04-20130415 - Linaro GCC 2013.04) 4.7.3
>> 20130328 (prerelease)) debu4
>> (XEN) Latest ChangeSet: Thu Jul 3 12:55:26 2014 +0300 git:3ee354f-dirty
>> (XEN) Processor: 412fc0f2: "ARM Limited", variant: 0x2, part 0xc0f, rev 0x2
>> (XEN) 32-bit Execution:
>> (XEN)   Processor Features: 00001131:00011011
>> (XEN)     Instruction Sets: AArch32 Thumb Thumb-2 ThumbEE Jazelle
>> (XEN)     Extensions: GenericTimer Security
>> (XEN)   Debug Features: 02010555
>> (XEN)   Auxiliary Features: 00000000
>> (XEN)   Memory Model Features: 10201105 20000000 01240000 02102211
>> (XEN)  ISA Features: 02101110 13112111 21232041 11112131 10011142 00000000
>> (XEN) Platform: TI DRA7
>> (XEN) /psci method must be smc, but is: "hvc"
>> (XEN) Set AuxCoreBoot1 to 00000000dfe0004c (0020004c)
>> (XEN) Set AuxCoreBoot0 to 0x20
>> (XEN) Generic Timer IRQ: phys=30 hyp=26 virt=27
>> (XEN) Using generic timer at 6144 KHz
>> (XEN) GIC initialization:
>> (XEN)         gic_dist_addr=0000000048211000
>> (XEN)         gic_cpu_addr=0000000048212000
>> (XEN)         gic_hyp_addr=0000000048214000
>> (XEN)         gic_vcpu_addr=0000000048216000
>> (XEN)         gic_maintenance_irq=25
>> (XEN) GIC: 192 lines, 2 cpus, secure (IID 0000043b).
>> (XEN) Using scheduler: SMP Credit Scheduler (credit)
>> (XEN) I/O virtualisation disabled
>> (XEN) Allocated console ring of 16 KiB.
>> (XEN) VFP implementer 0x41 architecture 4 part 0x30 variant 0xf rev 0x0
>> (XEN) Bringing up CPU1
>> - CPU 00000001 booting -
>> - Xen starting in Hyp mode -
>> - Setting up control registers -
>> - Turning on paging -
>> - Ready -
>> (XEN) CPU 1 booted.
>> (XEN) Brought up 2 CPUs
>> (XEN) *** LOADING DOMAIN 0 ***
>> (XEN) Loading kernel from boot module 2
>> (XEN) Populate P2M 0xc8000000->0xd0000000 (1:1 mapping for dom0)
>> (XEN) Loading zImage from 00000000c0000040 to 00000000cfc00000-00000000cff50c48
>> (XEN) Loading dom0 DTB to 0x00000000cfa00000-0x00000000cfa05ba8
>> (XEN) Std. Loglevel: All
>> (XEN) Guest Loglevel: All
>> (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch
>> input to Xen)
>> (XEN) Freed 272kB init memory.
>> (XEN) gic.c:673:d0v0 trying to inject irq=2 into d0v0, when it is
>> already pending in LR0
>> (XEN) gic.c:673:d0v0 trying to inject irq=2 into d0v0, when it is
>> already pending in LR0
>> [    0.000000] /cpus/cpu@0 missing clock-frequency property
>> [    0.000000] /cpus/cpu@1 missing clock-frequency property
>> [    0.140625] omap-gpmc omap-gpmc: failed to reserve memory
>> [    0.187500] omap_l3_noc ocp.3: couldn't find resource 2
>> [    0.273437] i2c i2c-1: of_i2c: invalid reg on
>> /ocp/i2c@48072000/camera_ov10635
>> [    0.437500] ldo3: operation not allowed
>> [    0.437500] omapdss HDMI error: can't set the voltage regulator
>> [    0.468750] tfc_s9700 display0: tfc_s9700_probe probe
>> [    0.468750] ov1063x 1-0030: No deserializer node found
>> [    0.468750] ov1063x 1-0030: No serializer node found
>> [    0.468750] ov1063x 1-0030: Failed writing register 0x0103!
>> [    0.468750] dra7xx-vip vip1-0: Waiting for I2C subdevice 30
>> [    0.578125] ahci ahci.0.auto: can't get clock
>> [    0.898437] ldc_module_init
>> [    1.304687] Missing dual_emac_res_vlan in DT.
>> [    1.304687] Using 1 as Reserved VLAN for 0 slave
>> [    1.312500] Missing dual_emac_res_vlan in DT.
>> [    1.320312] Using 2 as Reserved VLAN for 1 slave
>> [    1.382812] Freeing init memory: 236K
>> sh: write error: No such device
>> Cannot identify '/dev/camera0': 2, No such file or directory
>> Parsing config from /xen/images/DomUAndroid.cfg
>> XSM Disabled: seclabel not supported
>> (XEN) do_physdev_op 16 cmd=13: not implemented yet
>> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
>> dom1 access to irq 53: Function not implemented
>> (XEN) do_physdev_op 16 cmd=13: not implemented yet
>> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
>> dom1 access to irq 71: Function not implemented
>> (XEN) do_physdev_op 16 cmd=13: not implemented yet
>> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
>> dom1 access to irq 173: Function not implemented
>> (XEN) do_physdev_op 16 cmd=13: not implemented yet
>> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
>> dom1 access to irq 174: Function not implemented
>> Turning on vfb in domain 1
>> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
>> still lr_pending
>> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
>> still lr_pending
>> Parsing config from /xen/images/DomUQNX.cfg
>> XSM Disabled: seclabel not supported(XEN) gic.c:617:d0v1 trying to
>> inject irq=2 into d0v0, when it is still lr_pending
>>
>> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
>> still lr_pending
>> [    4.304687] dra7-evm-sound sound.17: cpu dai node is invalid
>> [    4.312500] dra7-evm-sound sound.17: failed to add bluetooth dai link -22
>> xc: error: panic: xc_dom_core.c:644: xc_dom_find_loader: no loader
>> found: Invalid kernel
>> libxl: error: libxl_dom.c:436:libxl__build_pv: xc_dom_parse_image
>> failed: No such file or directory
>> libxl: error: libxl_create.c:1030:domcreate_rebuild_done: cannot
>> (re-)build domain: -3
>> (XEN) gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is
>> still lr_pending
>> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
>> still lr_pending
>> Turning on 'vsnd' in domain '1' (dev_id: '0')
>> Turning on vkbd in domain 1
>> (XEN) gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is
>> still lr_pending
>> (XEN) gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is
>> still lr_pending
>> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
>> still lr_pending
>>
>> Please press Enter to activate this console. (XEN) gic.c:617:d0v1
>> trying to inject irq=2 into d0v0, when it is still lr_pending
>>
>> On Tue, Nov 18, 2014 at 6:18 PM, Andrii Tseglytskyi
>> <andrii.tseglytskyi@globallogic.com> wrote:
>> > OK got it. Give me a few mins
>> >
>> > On Tue, Nov 18, 2014 at 6:14 PM, Stefano Stabellini
>> > <stefano.stabellini@eu.citrix.com> wrote:
>> >> It is not the same: I would like to set GICH_V2_LR_MAINTENANCE_IRQ only
>> >> for non-hardware irqs (desc == NULL) and keep avoiding
>> >> GICH_V2_LR_MAINTENANCE_IRQ and setting GICH_LR_HW for hardware irqs.
>> >>
>> >> Also testing on 394b7e587b05d0f4a5fd6f067b38339ab5a77121 would avoid
>> >> other potential bugs introduced later.
>> >>
>> >> On Tue, 18 Nov 2014, Andrii Tseglytskyi wrote:
>> >>> What if I try on top of current master branch the following code:
>> >>>
>> >>> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
>> >>> index 31fb81a..6764ab7 100644
>> >>> --- a/xen/arch/arm/gic-v2.c
>> >>> +++ b/xen/arch/arm/gic-v2.c
>> >>> @@ -36,6 +36,8 @@
>> >>>  #include <asm/io.h>
>> >>>  #include <asm/gic.h>
>> >>>
>> >>> +#define GIC_DEBUG 1
>> >>> +
>> >>>  /*
>> >>>   * LR register definitions are GIC v2 specific.
>> >>>   * Moved these definitions from header file to here
>> >>> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
>> >>> index bcaded9..c03d6a6 100644
>> >>> --- a/xen/arch/arm/gic.c
>> >>> +++ b/xen/arch/arm/gic.c
>> >>> @@ -41,7 +41,7 @@ static DEFINE_PER_CPU(uint64_t, lr_mask);
>> >>>
>> >>>  #define lr_all_full() (this_cpu(lr_mask) == ((1 <<
>> >>> gic_hw_ops->info->nr_lrs) - 1))
>> >>>
>> >>> -#undef GIC_DEBUG
>> >>> +#define GIC_DEBUG 1
>> >>>
>> >>>  static void gic_update_one_lr(struct vcpu *v, int i);
>> >>>
>> >>> It is equivalent to what you proposing - my code contains
>> >>> PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI, as result the following lone will
>> >>> be executed:
>> >>>  lr_reg |= GICH_V2_LR_MAINTENANCE_IRQ; inside gicv2_update_lr() function
>> >>>
>> >>> regards,
>> >>> Andrii
>> >>>
>> >>> On Tue, Nov 18, 2014 at 5:39 PM, Stefano Stabellini
>> >>> <stefano.stabellini@eu.citrix.com> wrote:
>> >>> > On Tue, 18 Nov 2014, Andrii Tseglytskyi wrote:
>> >>> >> OK, I see that GICH_V2_LR_MAINTENANCE_IRQ must always be set and
>> >>> >> everything works fine
>> >>> >> The following 2 patches fixes xen/master for my platform.
>> >>> >>
>> >>> >> Stefano, could you please take a look to these changes?
>> >>> >>
>> >>> >> commit 3628a0aa35706a8f532af865ed784536ce514eca
>> >>> >> Author: Andrii Tseglytskyi <andrii.tseglytskyi@globallogic.com>
>> >>> >> Date:   Tue Nov 18 14:20:42 2014 +0200
>> >>> >>
>> >>> >>     xen/arm: dra7: always set GICH_V2_LR_MAINTENANCE_IRQ flag
>> >>> >>
>> >>> >>     Change-Id: Ia380b3507a182b11592588f65fd23693d4f87434
>> >>> >>     Signed-off-by: Andrii Tseglytskyi <andrii.tseglytskyi@globallogic.com>
>> >>> >>
>> >>> >> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
>> >>> >> index 31fb81a..093ecdb 100644
>> >>> >> --- a/xen/arch/arm/gic-v2.c
>> >>> >> +++ b/xen/arch/arm/gic-v2.c
>> >>> >> @@ -396,13 +396,14 @@ static void gicv2_update_lr(int lr, const struct
>> >>> >> pending_irq *p,
>> >>> >>                                               << GICH_V2_LR_PRIORITY_SHIFT) |
>> >>> >>                ((p->irq & GICH_V2_LR_VIRTUAL_MASK) <<
>> >>> >> GICH_V2_LR_VIRTUAL_SHIFT));
>> >>> >>
>> >>> >> -    if ( p->desc != NULL )
>> >>> >> +    if ( platform_has_quirk(PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI) )
>> >>> >>      {
>> >>> >> -        if ( platform_has_quirk(PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI) )
>> >>> >> -            lr_reg |= GICH_V2_LR_MAINTENANCE_IRQ;
>> >>> >> -        else
>> >>> >> -            lr_reg |= GICH_V2_LR_HW | ((p->desc->irq &
>> >>> >> GICH_V2_LR_PHYSICAL_MASK )
>> >>> >> -                            << GICH_V2_LR_PHYSICAL_SHIFT);
>> >>> >> +        lr_reg |= GICH_V2_LR_MAINTENANCE_IRQ;
>> >>> >> +    }
>> >>> >> +    else if ( p->desc != NULL )
>> >>> >> +    {
>> >>> >> +        lr_reg |= GICH_V2_LR_HW | ((p->desc->irq & GICH_V2_LR_PHYSICAL_MASK )
>> >>> >> +                       << GICH_V2_LR_PHYSICAL_SHIFT);
>> >>> >>      }
>> >>> >>
>> >>> >>      writel_gich(lr_reg, GICH_LR + lr * 4);
>> >>> >
>> >>> > Actually in case p->desc == NULL (the irq is not an hardware irq, it
>> >>> > could be the virtual timer irq or the evtchn irq), you shouldn't need
>> >>> > the maintenance interrupt, if the bug was really due to GICH_LR_HW not
>> >>> > working correctly on OMAP5. This changes might only be better at
>> >>> > "hiding" the real issue.
>> >>> >
>> >>> > Maybe the problem is exactly the opposite: the new scheme for avoiding
>> >>> > maintenance interrupts doesn't work for software interrupts.
>> >>> > The commit that should make them work correctly after the
>> >>> > no-maintenance-irq commit is 394b7e587b05d0f4a5fd6f067b38339ab5a77121
>> >>> > If you look at the changes to gic_update_one_lr in that commit, you'll
>> >>> > see that is going to set a software irq as PENDING if it is already ACTIVE.
>> >>> > Maybe that doesn't work correctly on OMAP5.
>> >>> >
>> >>> > Could you try this patch on top of
>> >>> > 394b7e587b05d0f4a5fd6f067b38339ab5a77121?  It should help us understand
>> >>> > if the problem is specifically with software irqs.
>> >>> >
>> >>> >
>> >>> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
>> >>> > index b7516c0..d8a17c9 100644
>> >>> > --- a/xen/arch/arm/gic.c
>> >>> > +++ b/xen/arch/arm/gic.c
>> >>> > @@ -66,7 +66,7 @@ static DEFINE_PER_CPU(u8, gic_cpu_id);
>> >>> >  /* Maximum cpu interface per GIC */
>> >>> >  #define NR_GIC_CPU_IF 8
>> >>> >
>> >>> > -#undef GIC_DEBUG
>> >>> > +#define GIC_DEBUG 1
>> >>> >
>> >>> >  static void gic_update_one_lr(struct vcpu *v, int i);
>> >>> >
>> >>> > @@ -563,6 +563,8 @@ static inline void gic_set_lr(int lr, struct pending_irq *p,
>> >>> >          ((p->irq & GICH_LR_VIRTUAL_MASK) << GICH_LR_VIRTUAL_SHIFT);
>> >>> >      if ( p->desc != NULL )
>> >>> >          lr_val |= GICH_LR_HW | (p->desc->irq << GICH_LR_PHYSICAL_SHIFT);
>> >>> > +    else
>> >>> > +        lr_val |= GICH_LR_MAINTENANCE_IRQ;
>> >>> >
>> >>> >      GICH[GICH_LR + lr] = lr_val;
>> >>> >
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>>
>> >>> Andrii Tseglytskyi | Embedded Dev
>> >>> GlobalLogic
>> >>> www.globallogic.com
>> >>>
>> >
>> >
>> >
>> > --
>> >
>> > Andrii Tseglytskyi | Embedded Dev
>> > GlobalLogic
>> > www.globallogic.com
>>
>>
>>
>> --
>>
>> Andrii Tseglytskyi | Embedded Dev
>> GlobalLogic
>> www.globallogic.com
>>
Stefano Stabellini Nov. 19, 2014, 11:12 a.m. UTC | #2
On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
> Hi Stefano,
> 
> Thank you for your support.
> 
> You are right - with latest change you've proposed I got a continuous
> prints during platform hang:
> 
> (XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
> (XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
> (XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
> (XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
> (XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
> (XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
> (XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
> 
> Looks line issue needs further deeper debugging.

Cool! You could simply print what irqs are in all LRs when they are
full, for example you could call gic_dump_info. That would tell us what
is taking all the LRs space we have.

How many LRs are available on omap5 anyway?

I doubt you have so much interrupt traffic to actually fill all the LRs,
so I am thinking that a few LRs might not be cleared properly (that
should happen on hypervisor entry, gic_update_one_lr should take care of
it).


> Regards,
> Andrii
> 
> On Tue, Nov 18, 2014 at 7:51 PM, Stefano Stabellini
> <stefano.stabellini@eu.citrix.com> wrote:
> > Hello Andrii,
> > we are getting closer :-)
> >
> > It would help if you post the output with GIC_DEBUG defined but without
> > the other change that "fixes" the issue.
> >
> > I think the problem is probably due to software irqs.
> > You are getting too many
> >
> > gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is still lr_pending
> >
> > messages. That means you are loosing virtual SGIs (guest VCPU to guest
> > VCPU). It would be best to investigate why, especially if you get many
> > more of the same messages without the MAINTENANCE_IRQ change I
> > suggested.
> >
> > This patch might also help understading the problem more:
> >
> >
> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> > index b7516c0..5eaeca2 100644
> > --- a/xen/arch/arm/gic.c
> > +++ b/xen/arch/arm/gic.c
> > @@ -717,7 +717,12 @@ static void gic_restore_pending_irqs(struct vcpu *v)
> >      list_for_each_entry_safe ( p, t, &v->arch.vgic.lr_pending, lr_queue )
> >      {
> >          i = find_first_zero_bit(&this_cpu(lr_mask), nr_lrs);
> > -        if ( i >= nr_lrs ) return;
> > +        if ( i >= nr_lrs )
> > +        {
> > +            gdprintk(XENLOG_DEBUG, "LRs full, not injecting irq=%u into d%dv%d\n",
> > +                    p->irq, v->domain->domain_id, v->vcpu_id);
> > +            continue;
> > +        }
> >
> >          spin_lock_irqsave(&gic.lock, flags);
> >          gic_set_lr(i, p, GICH_LR_PENDING);
> >
> >
> >
> >
> > On Tue, 18 Nov 2014, Andrii Tseglytskyi wrote:
> >> Hi Stefano,
> >>
> >> No hangs with this change.
> >> Complete log is the following:
> >>
> >> U-Boot SPL 2013.10-00499-g062782f (Oct 14 2014 - 11:36:26)
> >> DRA752 ES1.0
> >> <ethaddr> not set. Validating first E-fuse MAC
> >> cpsw
> >> - UART enabled -
> >> - CPU 00000000 booting -
> >> - Xen starting in Hyp mode -
> >> - Zero BSS -
> >> - Setting up control registers -
> >> - Turning on paging -
> >> - Ready -
> >> (XEN) Checking for initrd in /chosen
> >> (XEN) RAM: 0000000080000000 - 000000009fffffff
> >> (XEN) RAM: 00000000a0000000 - 00000000bfffffff
> >> (XEN) RAM: 00000000c0000000 - 00000000dfffffff
> >> (XEN)
> >> (XEN) MODULE[1]: 00000000c2000000 - 00000000c20069aa
> >> (XEN) MODULE[2]: 00000000c0000000 - 00000000c2000000
> >> (XEN) MODULE[3]: 0000000000000000 - 0000000000000000
> >> (XEN) MODULE[4]: 00000000c3000000 - 00000000c3010000
> >> (XEN)  RESVD[0]: 00000000ba300000 - 00000000bfd00000
> >> (XEN)  RESVD[1]: 0000000095800000 - 0000000095900000
> >> (XEN)  RESVD[2]: 0000000098a00000 - 0000000098b00000
> >> (XEN)  RESVD[3]: 0000000095f00000 - 0000000098a00000
> >> (XEN)  RESVD[4]: 0000000095900000 - 0000000095f00000
> >> (XEN)
> >> (XEN) Command line: dom0_mem=128M console=dtuart dtuart=serial0
> >> dom0_max_vcpus=2 bootscrub=0 flask_enforcing=1
> >> (XEN) Placing Xen at 0x00000000dfe00000-0x00000000e0000000
> >> (XEN) Xen heap: 00000000d2000000-00000000de000000 (49152 pages)
> >> (XEN) Dom heap: 344064 pages
> >> (XEN) Domain heap initialised
> >> (XEN) Looking for UART console serial0
> >>  Xen 4.5-unstable
> >> (XEN) Xen version 4.5-unstable (atseglytskyi@)
> >> (arm-linux-gnueabihf-gcc (crosstool-NG
> >> linaro-1.13.1-4.7-2013.04-20130415 - Linaro GCC 2013.04) 4.7.3
> >> 20130328 (prerelease)) debu4
> >> (XEN) Latest ChangeSet: Thu Jul 3 12:55:26 2014 +0300 git:3ee354f-dirty
> >> (XEN) Processor: 412fc0f2: "ARM Limited", variant: 0x2, part 0xc0f, rev 0x2
> >> (XEN) 32-bit Execution:
> >> (XEN)   Processor Features: 00001131:00011011
> >> (XEN)     Instruction Sets: AArch32 Thumb Thumb-2 ThumbEE Jazelle
> >> (XEN)     Extensions: GenericTimer Security
> >> (XEN)   Debug Features: 02010555
> >> (XEN)   Auxiliary Features: 00000000
> >> (XEN)   Memory Model Features: 10201105 20000000 01240000 02102211
> >> (XEN)  ISA Features: 02101110 13112111 21232041 11112131 10011142 00000000
> >> (XEN) Platform: TI DRA7
> >> (XEN) /psci method must be smc, but is: "hvc"
> >> (XEN) Set AuxCoreBoot1 to 00000000dfe0004c (0020004c)
> >> (XEN) Set AuxCoreBoot0 to 0x20
> >> (XEN) Generic Timer IRQ: phys=30 hyp=26 virt=27
> >> (XEN) Using generic timer at 6144 KHz
> >> (XEN) GIC initialization:
> >> (XEN)         gic_dist_addr=0000000048211000
> >> (XEN)         gic_cpu_addr=0000000048212000
> >> (XEN)         gic_hyp_addr=0000000048214000
> >> (XEN)         gic_vcpu_addr=0000000048216000
> >> (XEN)         gic_maintenance_irq=25
> >> (XEN) GIC: 192 lines, 2 cpus, secure (IID 0000043b).
> >> (XEN) Using scheduler: SMP Credit Scheduler (credit)
> >> (XEN) I/O virtualisation disabled
> >> (XEN) Allocated console ring of 16 KiB.
> >> (XEN) VFP implementer 0x41 architecture 4 part 0x30 variant 0xf rev 0x0
> >> (XEN) Bringing up CPU1
> >> - CPU 00000001 booting -
> >> - Xen starting in Hyp mode -
> >> - Setting up control registers -
> >> - Turning on paging -
> >> - Ready -
> >> (XEN) CPU 1 booted.
> >> (XEN) Brought up 2 CPUs
> >> (XEN) *** LOADING DOMAIN 0 ***
> >> (XEN) Loading kernel from boot module 2
> >> (XEN) Populate P2M 0xc8000000->0xd0000000 (1:1 mapping for dom0)
> >> (XEN) Loading zImage from 00000000c0000040 to 00000000cfc00000-00000000cff50c48
> >> (XEN) Loading dom0 DTB to 0x00000000cfa00000-0x00000000cfa05ba8
> >> (XEN) Std. Loglevel: All
> >> (XEN) Guest Loglevel: All
> >> (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch
> >> input to Xen)
> >> (XEN) Freed 272kB init memory.
> >> (XEN) gic.c:673:d0v0 trying to inject irq=2 into d0v0, when it is
> >> already pending in LR0
> >> (XEN) gic.c:673:d0v0 trying to inject irq=2 into d0v0, when it is
> >> already pending in LR0
> >> [    0.000000] /cpus/cpu@0 missing clock-frequency property
> >> [    0.000000] /cpus/cpu@1 missing clock-frequency property
> >> [    0.140625] omap-gpmc omap-gpmc: failed to reserve memory
> >> [    0.187500] omap_l3_noc ocp.3: couldn't find resource 2
> >> [    0.273437] i2c i2c-1: of_i2c: invalid reg on
> >> /ocp/i2c@48072000/camera_ov10635
> >> [    0.437500] ldo3: operation not allowed
> >> [    0.437500] omapdss HDMI error: can't set the voltage regulator
> >> [    0.468750] tfc_s9700 display0: tfc_s9700_probe probe
> >> [    0.468750] ov1063x 1-0030: No deserializer node found
> >> [    0.468750] ov1063x 1-0030: No serializer node found
> >> [    0.468750] ov1063x 1-0030: Failed writing register 0x0103!
> >> [    0.468750] dra7xx-vip vip1-0: Waiting for I2C subdevice 30
> >> [    0.578125] ahci ahci.0.auto: can't get clock
> >> [    0.898437] ldc_module_init
> >> [    1.304687] Missing dual_emac_res_vlan in DT.
> >> [    1.304687] Using 1 as Reserved VLAN for 0 slave
> >> [    1.312500] Missing dual_emac_res_vlan in DT.
> >> [    1.320312] Using 2 as Reserved VLAN for 1 slave
> >> [    1.382812] Freeing init memory: 236K
> >> sh: write error: No such device
> >> Cannot identify '/dev/camera0': 2, No such file or directory
> >> Parsing config from /xen/images/DomUAndroid.cfg
> >> XSM Disabled: seclabel not supported
> >> (XEN) do_physdev_op 16 cmd=13: not implemented yet
> >> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
> >> dom1 access to irq 53: Function not implemented
> >> (XEN) do_physdev_op 16 cmd=13: not implemented yet
> >> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
> >> dom1 access to irq 71: Function not implemented
> >> (XEN) do_physdev_op 16 cmd=13: not implemented yet
> >> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
> >> dom1 access to irq 173: Function not implemented
> >> (XEN) do_physdev_op 16 cmd=13: not implemented yet
> >> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
> >> dom1 access to irq 174: Function not implemented
> >> Turning on vfb in domain 1
> >> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
> >> still lr_pending
> >> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
> >> still lr_pending
> >> Parsing config from /xen/images/DomUQNX.cfg
> >> XSM Disabled: seclabel not supported(XEN) gic.c:617:d0v1 trying to
> >> inject irq=2 into d0v0, when it is still lr_pending
> >>
> >> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
> >> still lr_pending
> >> [    4.304687] dra7-evm-sound sound.17: cpu dai node is invalid
> >> [    4.312500] dra7-evm-sound sound.17: failed to add bluetooth dai link -22
> >> xc: error: panic: xc_dom_core.c:644: xc_dom_find_loader: no loader
> >> found: Invalid kernel
> >> libxl: error: libxl_dom.c:436:libxl__build_pv: xc_dom_parse_image
> >> failed: No such file or directory
> >> libxl: error: libxl_create.c:1030:domcreate_rebuild_done: cannot
> >> (re-)build domain: -3
> >> (XEN) gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is
> >> still lr_pending
> >> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
> >> still lr_pending
> >> Turning on 'vsnd' in domain '1' (dev_id: '0')
> >> Turning on vkbd in domain 1
> >> (XEN) gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is
> >> still lr_pending
> >> (XEN) gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is
> >> still lr_pending
> >> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
> >> still lr_pending
> >>
> >> Please press Enter to activate this console. (XEN) gic.c:617:d0v1
> >> trying to inject irq=2 into d0v0, when it is still lr_pending
> >>
> >> On Tue, Nov 18, 2014 at 6:18 PM, Andrii Tseglytskyi
> >> <andrii.tseglytskyi@globallogic.com> wrote:
> >> > OK got it. Give me a few mins
> >> >
> >> > On Tue, Nov 18, 2014 at 6:14 PM, Stefano Stabellini
> >> > <stefano.stabellini@eu.citrix.com> wrote:
> >> >> It is not the same: I would like to set GICH_V2_LR_MAINTENANCE_IRQ only
> >> >> for non-hardware irqs (desc == NULL) and keep avoiding
> >> >> GICH_V2_LR_MAINTENANCE_IRQ and setting GICH_LR_HW for hardware irqs.
> >> >>
> >> >> Also testing on 394b7e587b05d0f4a5fd6f067b38339ab5a77121 would avoid
> >> >> other potential bugs introduced later.
> >> >>
> >> >> On Tue, 18 Nov 2014, Andrii Tseglytskyi wrote:
> >> >>> What if I try on top of current master branch the following code:
> >> >>>
> >> >>> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
> >> >>> index 31fb81a..6764ab7 100644
> >> >>> --- a/xen/arch/arm/gic-v2.c
> >> >>> +++ b/xen/arch/arm/gic-v2.c
> >> >>> @@ -36,6 +36,8 @@
> >> >>>  #include <asm/io.h>
> >> >>>  #include <asm/gic.h>
> >> >>>
> >> >>> +#define GIC_DEBUG 1
> >> >>> +
> >> >>>  /*
> >> >>>   * LR register definitions are GIC v2 specific.
> >> >>>   * Moved these definitions from header file to here
> >> >>> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> >> >>> index bcaded9..c03d6a6 100644
> >> >>> --- a/xen/arch/arm/gic.c
> >> >>> +++ b/xen/arch/arm/gic.c
> >> >>> @@ -41,7 +41,7 @@ static DEFINE_PER_CPU(uint64_t, lr_mask);
> >> >>>
> >> >>>  #define lr_all_full() (this_cpu(lr_mask) == ((1 <<
> >> >>> gic_hw_ops->info->nr_lrs) - 1))
> >> >>>
> >> >>> -#undef GIC_DEBUG
> >> >>> +#define GIC_DEBUG 1
> >> >>>
> >> >>>  static void gic_update_one_lr(struct vcpu *v, int i);
> >> >>>
> >> >>> It is equivalent to what you proposing - my code contains
> >> >>> PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI, as result the following lone will
> >> >>> be executed:
> >> >>>  lr_reg |= GICH_V2_LR_MAINTENANCE_IRQ; inside gicv2_update_lr() function
> >> >>>
> >> >>> regards,
> >> >>> Andrii
> >> >>>
> >> >>> On Tue, Nov 18, 2014 at 5:39 PM, Stefano Stabellini
> >> >>> <stefano.stabellini@eu.citrix.com> wrote:
> >> >>> > On Tue, 18 Nov 2014, Andrii Tseglytskyi wrote:
> >> >>> >> OK, I see that GICH_V2_LR_MAINTENANCE_IRQ must always be set and
> >> >>> >> everything works fine
> >> >>> >> The following 2 patches fixes xen/master for my platform.
> >> >>> >>
> >> >>> >> Stefano, could you please take a look to these changes?
> >> >>> >>
> >> >>> >> commit 3628a0aa35706a8f532af865ed784536ce514eca
> >> >>> >> Author: Andrii Tseglytskyi <andrii.tseglytskyi@globallogic.com>
> >> >>> >> Date:   Tue Nov 18 14:20:42 2014 +0200
> >> >>> >>
> >> >>> >>     xen/arm: dra7: always set GICH_V2_LR_MAINTENANCE_IRQ flag
> >> >>> >>
> >> >>> >>     Change-Id: Ia380b3507a182b11592588f65fd23693d4f87434
> >> >>> >>     Signed-off-by: Andrii Tseglytskyi <andrii.tseglytskyi@globallogic.com>
> >> >>> >>
> >> >>> >> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
> >> >>> >> index 31fb81a..093ecdb 100644
> >> >>> >> --- a/xen/arch/arm/gic-v2.c
> >> >>> >> +++ b/xen/arch/arm/gic-v2.c
> >> >>> >> @@ -396,13 +396,14 @@ static void gicv2_update_lr(int lr, const struct
> >> >>> >> pending_irq *p,
> >> >>> >>                                               << GICH_V2_LR_PRIORITY_SHIFT) |
> >> >>> >>                ((p->irq & GICH_V2_LR_VIRTUAL_MASK) <<
> >> >>> >> GICH_V2_LR_VIRTUAL_SHIFT));
> >> >>> >>
> >> >>> >> -    if ( p->desc != NULL )
> >> >>> >> +    if ( platform_has_quirk(PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI) )
> >> >>> >>      {
> >> >>> >> -        if ( platform_has_quirk(PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI) )
> >> >>> >> -            lr_reg |= GICH_V2_LR_MAINTENANCE_IRQ;
> >> >>> >> -        else
> >> >>> >> -            lr_reg |= GICH_V2_LR_HW | ((p->desc->irq &
> >> >>> >> GICH_V2_LR_PHYSICAL_MASK )
> >> >>> >> -                            << GICH_V2_LR_PHYSICAL_SHIFT);
> >> >>> >> +        lr_reg |= GICH_V2_LR_MAINTENANCE_IRQ;
> >> >>> >> +    }
> >> >>> >> +    else if ( p->desc != NULL )
> >> >>> >> +    {
> >> >>> >> +        lr_reg |= GICH_V2_LR_HW | ((p->desc->irq & GICH_V2_LR_PHYSICAL_MASK )
> >> >>> >> +                       << GICH_V2_LR_PHYSICAL_SHIFT);
> >> >>> >>      }
> >> >>> >>
> >> >>> >>      writel_gich(lr_reg, GICH_LR + lr * 4);
> >> >>> >
> >> >>> > Actually in case p->desc == NULL (the irq is not an hardware irq, it
> >> >>> > could be the virtual timer irq or the evtchn irq), you shouldn't need
> >> >>> > the maintenance interrupt, if the bug was really due to GICH_LR_HW not
> >> >>> > working correctly on OMAP5. This changes might only be better at
> >> >>> > "hiding" the real issue.
> >> >>> >
> >> >>> > Maybe the problem is exactly the opposite: the new scheme for avoiding
> >> >>> > maintenance interrupts doesn't work for software interrupts.
> >> >>> > The commit that should make them work correctly after the
> >> >>> > no-maintenance-irq commit is 394b7e587b05d0f4a5fd6f067b38339ab5a77121
> >> >>> > If you look at the changes to gic_update_one_lr in that commit, you'll
> >> >>> > see that is going to set a software irq as PENDING if it is already ACTIVE.
> >> >>> > Maybe that doesn't work correctly on OMAP5.
> >> >>> >
> >> >>> > Could you try this patch on top of
> >> >>> > 394b7e587b05d0f4a5fd6f067b38339ab5a77121?  It should help us understand
> >> >>> > if the problem is specifically with software irqs.
> >> >>> >
> >> >>> >
> >> >>> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> >> >>> > index b7516c0..d8a17c9 100644
> >> >>> > --- a/xen/arch/arm/gic.c
> >> >>> > +++ b/xen/arch/arm/gic.c
> >> >>> > @@ -66,7 +66,7 @@ static DEFINE_PER_CPU(u8, gic_cpu_id);
> >> >>> >  /* Maximum cpu interface per GIC */
> >> >>> >  #define NR_GIC_CPU_IF 8
> >> >>> >
> >> >>> > -#undef GIC_DEBUG
> >> >>> > +#define GIC_DEBUG 1
> >> >>> >
> >> >>> >  static void gic_update_one_lr(struct vcpu *v, int i);
> >> >>> >
> >> >>> > @@ -563,6 +563,8 @@ static inline void gic_set_lr(int lr, struct pending_irq *p,
> >> >>> >          ((p->irq & GICH_LR_VIRTUAL_MASK) << GICH_LR_VIRTUAL_SHIFT);
> >> >>> >      if ( p->desc != NULL )
> >> >>> >          lr_val |= GICH_LR_HW | (p->desc->irq << GICH_LR_PHYSICAL_SHIFT);
> >> >>> > +    else
> >> >>> > +        lr_val |= GICH_LR_MAINTENANCE_IRQ;
> >> >>> >
> >> >>> >      GICH[GICH_LR + lr] = lr_val;
> >> >>> >
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>>
> >> >>> Andrii Tseglytskyi | Embedded Dev
> >> >>> GlobalLogic
> >> >>> www.globallogic.com
> >> >>>
> >> >
> >> >
> >> >
> >> > --
> >> >
> >> > Andrii Tseglytskyi | Embedded Dev
> >> > GlobalLogic
> >> > www.globallogic.com
> >>
> >>
> >>
> >> --
> >>
> >> Andrii Tseglytskyi | Embedded Dev
> >> GlobalLogic
> >> www.globallogic.com
> >>
> 
> 
> 
> -- 
> 
> Andrii Tseglytskyi | Embedded Dev
> GlobalLogic
> www.globallogic.com
>
Andrii Tseglytskyi Nov. 19, 2014, 11:16 a.m. UTC | #3
On Wed, Nov 19, 2014 at 1:12 PM, Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:
> On Wed, 19 Nov 2014, Andrii Tseglytskyi wrote:
>> Hi Stefano,
>>
>> Thank you for your support.
>>
>> You are right - with latest change you've proposed I got a continuous
>> prints during platform hang:
>>
>> (XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
>> (XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
>> (XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
>> (XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
>> (XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
>> (XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
>> (XEN) gic.c:725:d0v0 LRs full, not injecting irq=2 into d0v0
>>
>> Looks line issue needs further deeper debugging.
>
> Cool! You could simply print what irqs are in all LRs when they are
> full, for example you could call gic_dump_info. That would tell us what
> is taking all the LRs space we have.
>
> How many LRs are available on omap5 anyway?

:) Already done this:


(XEN) gic.c:725:d0v0 LRs full, not injecting irq=27 nr_lrs 4 i 4 into d0v0
(XEN) GICH_LRs (vcpu 0) mask=f
(XEN)    HW_LR[0]=1a00001f
(XEN)    HW_LR[1]=9a00e439
(XEN)    HW_LR[2]=1a000002
(XEN)    HW_LR[3]=9a015856
(XEN) Inflight irq=31 lr=0
(XEN) Inflight irq=57 lr=1
(XEN) Inflight irq=2 lr=2
(XEN) Inflight irq=86 lr=3
(XEN) Inflight irq=27 lr=255
(XEN) Pending irq=27


>
> I doubt you have so much interrupt traffic to actually fill all the LRs,
> so I am thinking that a few LRs might not be cleared properly (that
> should happen on hypervisor entry, gic_update_one_lr should take care of
> it).

This actually explains why this happens during domU start - SGI
traffic might be very heavy this time

>
>
>> Regards,
>> Andrii
>>
>> On Tue, Nov 18, 2014 at 7:51 PM, Stefano Stabellini
>> <stefano.stabellini@eu.citrix.com> wrote:
>> > Hello Andrii,
>> > we are getting closer :-)
>> >
>> > It would help if you post the output with GIC_DEBUG defined but without
>> > the other change that "fixes" the issue.
>> >
>> > I think the problem is probably due to software irqs.
>> > You are getting too many
>> >
>> > gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is still lr_pending
>> >
>> > messages. That means you are loosing virtual SGIs (guest VCPU to guest
>> > VCPU). It would be best to investigate why, especially if you get many
>> > more of the same messages without the MAINTENANCE_IRQ change I
>> > suggested.
>> >
>> > This patch might also help understading the problem more:
>> >
>> >
>> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
>> > index b7516c0..5eaeca2 100644
>> > --- a/xen/arch/arm/gic.c
>> > +++ b/xen/arch/arm/gic.c
>> > @@ -717,7 +717,12 @@ static void gic_restore_pending_irqs(struct vcpu *v)
>> >      list_for_each_entry_safe ( p, t, &v->arch.vgic.lr_pending, lr_queue )
>> >      {
>> >          i = find_first_zero_bit(&this_cpu(lr_mask), nr_lrs);
>> > -        if ( i >= nr_lrs ) return;
>> > +        if ( i >= nr_lrs )
>> > +        {
>> > +            gdprintk(XENLOG_DEBUG, "LRs full, not injecting irq=%u into d%dv%d\n",
>> > +                    p->irq, v->domain->domain_id, v->vcpu_id);
>> > +            continue;
>> > +        }
>> >
>> >          spin_lock_irqsave(&gic.lock, flags);
>> >          gic_set_lr(i, p, GICH_LR_PENDING);
>> >
>> >
>> >
>> >
>> > On Tue, 18 Nov 2014, Andrii Tseglytskyi wrote:
>> >> Hi Stefano,
>> >>
>> >> No hangs with this change.
>> >> Complete log is the following:
>> >>
>> >> U-Boot SPL 2013.10-00499-g062782f (Oct 14 2014 - 11:36:26)
>> >> DRA752 ES1.0
>> >> <ethaddr> not set. Validating first E-fuse MAC
>> >> cpsw
>> >> - UART enabled -
>> >> - CPU 00000000 booting -
>> >> - Xen starting in Hyp mode -
>> >> - Zero BSS -
>> >> - Setting up control registers -
>> >> - Turning on paging -
>> >> - Ready -
>> >> (XEN) Checking for initrd in /chosen
>> >> (XEN) RAM: 0000000080000000 - 000000009fffffff
>> >> (XEN) RAM: 00000000a0000000 - 00000000bfffffff
>> >> (XEN) RAM: 00000000c0000000 - 00000000dfffffff
>> >> (XEN)
>> >> (XEN) MODULE[1]: 00000000c2000000 - 00000000c20069aa
>> >> (XEN) MODULE[2]: 00000000c0000000 - 00000000c2000000
>> >> (XEN) MODULE[3]: 0000000000000000 - 0000000000000000
>> >> (XEN) MODULE[4]: 00000000c3000000 - 00000000c3010000
>> >> (XEN)  RESVD[0]: 00000000ba300000 - 00000000bfd00000
>> >> (XEN)  RESVD[1]: 0000000095800000 - 0000000095900000
>> >> (XEN)  RESVD[2]: 0000000098a00000 - 0000000098b00000
>> >> (XEN)  RESVD[3]: 0000000095f00000 - 0000000098a00000
>> >> (XEN)  RESVD[4]: 0000000095900000 - 0000000095f00000
>> >> (XEN)
>> >> (XEN) Command line: dom0_mem=128M console=dtuart dtuart=serial0
>> >> dom0_max_vcpus=2 bootscrub=0 flask_enforcing=1
>> >> (XEN) Placing Xen at 0x00000000dfe00000-0x00000000e0000000
>> >> (XEN) Xen heap: 00000000d2000000-00000000de000000 (49152 pages)
>> >> (XEN) Dom heap: 344064 pages
>> >> (XEN) Domain heap initialised
>> >> (XEN) Looking for UART console serial0
>> >>  Xen 4.5-unstable
>> >> (XEN) Xen version 4.5-unstable (atseglytskyi@)
>> >> (arm-linux-gnueabihf-gcc (crosstool-NG
>> >> linaro-1.13.1-4.7-2013.04-20130415 - Linaro GCC 2013.04) 4.7.3
>> >> 20130328 (prerelease)) debu4
>> >> (XEN) Latest ChangeSet: Thu Jul 3 12:55:26 2014 +0300 git:3ee354f-dirty
>> >> (XEN) Processor: 412fc0f2: "ARM Limited", variant: 0x2, part 0xc0f, rev 0x2
>> >> (XEN) 32-bit Execution:
>> >> (XEN)   Processor Features: 00001131:00011011
>> >> (XEN)     Instruction Sets: AArch32 Thumb Thumb-2 ThumbEE Jazelle
>> >> (XEN)     Extensions: GenericTimer Security
>> >> (XEN)   Debug Features: 02010555
>> >> (XEN)   Auxiliary Features: 00000000
>> >> (XEN)   Memory Model Features: 10201105 20000000 01240000 02102211
>> >> (XEN)  ISA Features: 02101110 13112111 21232041 11112131 10011142 00000000
>> >> (XEN) Platform: TI DRA7
>> >> (XEN) /psci method must be smc, but is: "hvc"
>> >> (XEN) Set AuxCoreBoot1 to 00000000dfe0004c (0020004c)
>> >> (XEN) Set AuxCoreBoot0 to 0x20
>> >> (XEN) Generic Timer IRQ: phys=30 hyp=26 virt=27
>> >> (XEN) Using generic timer at 6144 KHz
>> >> (XEN) GIC initialization:
>> >> (XEN)         gic_dist_addr=0000000048211000
>> >> (XEN)         gic_cpu_addr=0000000048212000
>> >> (XEN)         gic_hyp_addr=0000000048214000
>> >> (XEN)         gic_vcpu_addr=0000000048216000
>> >> (XEN)         gic_maintenance_irq=25
>> >> (XEN) GIC: 192 lines, 2 cpus, secure (IID 0000043b).
>> >> (XEN) Using scheduler: SMP Credit Scheduler (credit)
>> >> (XEN) I/O virtualisation disabled
>> >> (XEN) Allocated console ring of 16 KiB.
>> >> (XEN) VFP implementer 0x41 architecture 4 part 0x30 variant 0xf rev 0x0
>> >> (XEN) Bringing up CPU1
>> >> - CPU 00000001 booting -
>> >> - Xen starting in Hyp mode -
>> >> - Setting up control registers -
>> >> - Turning on paging -
>> >> - Ready -
>> >> (XEN) CPU 1 booted.
>> >> (XEN) Brought up 2 CPUs
>> >> (XEN) *** LOADING DOMAIN 0 ***
>> >> (XEN) Loading kernel from boot module 2
>> >> (XEN) Populate P2M 0xc8000000->0xd0000000 (1:1 mapping for dom0)
>> >> (XEN) Loading zImage from 00000000c0000040 to 00000000cfc00000-00000000cff50c48
>> >> (XEN) Loading dom0 DTB to 0x00000000cfa00000-0x00000000cfa05ba8
>> >> (XEN) Std. Loglevel: All
>> >> (XEN) Guest Loglevel: All
>> >> (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch
>> >> input to Xen)
>> >> (XEN) Freed 272kB init memory.
>> >> (XEN) gic.c:673:d0v0 trying to inject irq=2 into d0v0, when it is
>> >> already pending in LR0
>> >> (XEN) gic.c:673:d0v0 trying to inject irq=2 into d0v0, when it is
>> >> already pending in LR0
>> >> [    0.000000] /cpus/cpu@0 missing clock-frequency property
>> >> [    0.000000] /cpus/cpu@1 missing clock-frequency property
>> >> [    0.140625] omap-gpmc omap-gpmc: failed to reserve memory
>> >> [    0.187500] omap_l3_noc ocp.3: couldn't find resource 2
>> >> [    0.273437] i2c i2c-1: of_i2c: invalid reg on
>> >> /ocp/i2c@48072000/camera_ov10635
>> >> [    0.437500] ldo3: operation not allowed
>> >> [    0.437500] omapdss HDMI error: can't set the voltage regulator
>> >> [    0.468750] tfc_s9700 display0: tfc_s9700_probe probe
>> >> [    0.468750] ov1063x 1-0030: No deserializer node found
>> >> [    0.468750] ov1063x 1-0030: No serializer node found
>> >> [    0.468750] ov1063x 1-0030: Failed writing register 0x0103!
>> >> [    0.468750] dra7xx-vip vip1-0: Waiting for I2C subdevice 30
>> >> [    0.578125] ahci ahci.0.auto: can't get clock
>> >> [    0.898437] ldc_module_init
>> >> [    1.304687] Missing dual_emac_res_vlan in DT.
>> >> [    1.304687] Using 1 as Reserved VLAN for 0 slave
>> >> [    1.312500] Missing dual_emac_res_vlan in DT.
>> >> [    1.320312] Using 2 as Reserved VLAN for 1 slave
>> >> [    1.382812] Freeing init memory: 236K
>> >> sh: write error: No such device
>> >> Cannot identify '/dev/camera0': 2, No such file or directory
>> >> Parsing config from /xen/images/DomUAndroid.cfg
>> >> XSM Disabled: seclabel not supported
>> >> (XEN) do_physdev_op 16 cmd=13: not implemented yet
>> >> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
>> >> dom1 access to irq 53: Function not implemented
>> >> (XEN) do_physdev_op 16 cmd=13: not implemented yet
>> >> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
>> >> dom1 access to irq 71: Function not implemented
>> >> (XEN) do_physdev_op 16 cmd=13: not implemented yet
>> >> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
>> >> dom1 access to irq 173: Function not implemented
>> >> (XEN) do_physdev_op 16 cmd=13: not implemented yet
>> >> libxl: error: libxl_create.c:1092:domcreate_launch_dm: failed give
>> >> dom1 access to irq 174: Function not implemented
>> >> Turning on vfb in domain 1
>> >> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
>> >> still lr_pending
>> >> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
>> >> still lr_pending
>> >> Parsing config from /xen/images/DomUQNX.cfg
>> >> XSM Disabled: seclabel not supported(XEN) gic.c:617:d0v1 trying to
>> >> inject irq=2 into d0v0, when it is still lr_pending
>> >>
>> >> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
>> >> still lr_pending
>> >> [    4.304687] dra7-evm-sound sound.17: cpu dai node is invalid
>> >> [    4.312500] dra7-evm-sound sound.17: failed to add bluetooth dai link -22
>> >> xc: error: panic: xc_dom_core.c:644: xc_dom_find_loader: no loader
>> >> found: Invalid kernel
>> >> libxl: error: libxl_dom.c:436:libxl__build_pv: xc_dom_parse_image
>> >> failed: No such file or directory
>> >> libxl: error: libxl_create.c:1030:domcreate_rebuild_done: cannot
>> >> (re-)build domain: -3
>> >> (XEN) gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is
>> >> still lr_pending
>> >> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
>> >> still lr_pending
>> >> Turning on 'vsnd' in domain '1' (dev_id: '0')
>> >> Turning on vkbd in domain 1
>> >> (XEN) gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is
>> >> still lr_pending
>> >> (XEN) gic.c:617:d0v1 trying to inject irq=2 into d0v0, when it is
>> >> still lr_pending
>> >> (XEN) gic.c:617:d0v0 trying to inject irq=2 into d0v1, when it is
>> >> still lr_pending
>> >>
>> >> Please press Enter to activate this console. (XEN) gic.c:617:d0v1
>> >> trying to inject irq=2 into d0v0, when it is still lr_pending
>> >>
>> >> On Tue, Nov 18, 2014 at 6:18 PM, Andrii Tseglytskyi
>> >> <andrii.tseglytskyi@globallogic.com> wrote:
>> >> > OK got it. Give me a few mins
>> >> >
>> >> > On Tue, Nov 18, 2014 at 6:14 PM, Stefano Stabellini
>> >> > <stefano.stabellini@eu.citrix.com> wrote:
>> >> >> It is not the same: I would like to set GICH_V2_LR_MAINTENANCE_IRQ only
>> >> >> for non-hardware irqs (desc == NULL) and keep avoiding
>> >> >> GICH_V2_LR_MAINTENANCE_IRQ and setting GICH_LR_HW for hardware irqs.
>> >> >>
>> >> >> Also testing on 394b7e587b05d0f4a5fd6f067b38339ab5a77121 would avoid
>> >> >> other potential bugs introduced later.
>> >> >>
>> >> >> On Tue, 18 Nov 2014, Andrii Tseglytskyi wrote:
>> >> >>> What if I try on top of current master branch the following code:
>> >> >>>
>> >> >>> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
>> >> >>> index 31fb81a..6764ab7 100644
>> >> >>> --- a/xen/arch/arm/gic-v2.c
>> >> >>> +++ b/xen/arch/arm/gic-v2.c
>> >> >>> @@ -36,6 +36,8 @@
>> >> >>>  #include <asm/io.h>
>> >> >>>  #include <asm/gic.h>
>> >> >>>
>> >> >>> +#define GIC_DEBUG 1
>> >> >>> +
>> >> >>>  /*
>> >> >>>   * LR register definitions are GIC v2 specific.
>> >> >>>   * Moved these definitions from header file to here
>> >> >>> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
>> >> >>> index bcaded9..c03d6a6 100644
>> >> >>> --- a/xen/arch/arm/gic.c
>> >> >>> +++ b/xen/arch/arm/gic.c
>> >> >>> @@ -41,7 +41,7 @@ static DEFINE_PER_CPU(uint64_t, lr_mask);
>> >> >>>
>> >> >>>  #define lr_all_full() (this_cpu(lr_mask) == ((1 <<
>> >> >>> gic_hw_ops->info->nr_lrs) - 1))
>> >> >>>
>> >> >>> -#undef GIC_DEBUG
>> >> >>> +#define GIC_DEBUG 1
>> >> >>>
>> >> >>>  static void gic_update_one_lr(struct vcpu *v, int i);
>> >> >>>
>> >> >>> It is equivalent to what you proposing - my code contains
>> >> >>> PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI, as result the following lone will
>> >> >>> be executed:
>> >> >>>  lr_reg |= GICH_V2_LR_MAINTENANCE_IRQ; inside gicv2_update_lr() function
>> >> >>>
>> >> >>> regards,
>> >> >>> Andrii
>> >> >>>
>> >> >>> On Tue, Nov 18, 2014 at 5:39 PM, Stefano Stabellini
>> >> >>> <stefano.stabellini@eu.citrix.com> wrote:
>> >> >>> > On Tue, 18 Nov 2014, Andrii Tseglytskyi wrote:
>> >> >>> >> OK, I see that GICH_V2_LR_MAINTENANCE_IRQ must always be set and
>> >> >>> >> everything works fine
>> >> >>> >> The following 2 patches fixes xen/master for my platform.
>> >> >>> >>
>> >> >>> >> Stefano, could you please take a look to these changes?
>> >> >>> >>
>> >> >>> >> commit 3628a0aa35706a8f532af865ed784536ce514eca
>> >> >>> >> Author: Andrii Tseglytskyi <andrii.tseglytskyi@globallogic.com>
>> >> >>> >> Date:   Tue Nov 18 14:20:42 2014 +0200
>> >> >>> >>
>> >> >>> >>     xen/arm: dra7: always set GICH_V2_LR_MAINTENANCE_IRQ flag
>> >> >>> >>
>> >> >>> >>     Change-Id: Ia380b3507a182b11592588f65fd23693d4f87434
>> >> >>> >>     Signed-off-by: Andrii Tseglytskyi <andrii.tseglytskyi@globallogic.com>
>> >> >>> >>
>> >> >>> >> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
>> >> >>> >> index 31fb81a..093ecdb 100644
>> >> >>> >> --- a/xen/arch/arm/gic-v2.c
>> >> >>> >> +++ b/xen/arch/arm/gic-v2.c
>> >> >>> >> @@ -396,13 +396,14 @@ static void gicv2_update_lr(int lr, const struct
>> >> >>> >> pending_irq *p,
>> >> >>> >>                                               << GICH_V2_LR_PRIORITY_SHIFT) |
>> >> >>> >>                ((p->irq & GICH_V2_LR_VIRTUAL_MASK) <<
>> >> >>> >> GICH_V2_LR_VIRTUAL_SHIFT));
>> >> >>> >>
>> >> >>> >> -    if ( p->desc != NULL )
>> >> >>> >> +    if ( platform_has_quirk(PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI) )
>> >> >>> >>      {
>> >> >>> >> -        if ( platform_has_quirk(PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI) )
>> >> >>> >> -            lr_reg |= GICH_V2_LR_MAINTENANCE_IRQ;
>> >> >>> >> -        else
>> >> >>> >> -            lr_reg |= GICH_V2_LR_HW | ((p->desc->irq &
>> >> >>> >> GICH_V2_LR_PHYSICAL_MASK )
>> >> >>> >> -                            << GICH_V2_LR_PHYSICAL_SHIFT);
>> >> >>> >> +        lr_reg |= GICH_V2_LR_MAINTENANCE_IRQ;
>> >> >>> >> +    }
>> >> >>> >> +    else if ( p->desc != NULL )
>> >> >>> >> +    {
>> >> >>> >> +        lr_reg |= GICH_V2_LR_HW | ((p->desc->irq & GICH_V2_LR_PHYSICAL_MASK )
>> >> >>> >> +                       << GICH_V2_LR_PHYSICAL_SHIFT);
>> >> >>> >>      }
>> >> >>> >>
>> >> >>> >>      writel_gich(lr_reg, GICH_LR + lr * 4);
>> >> >>> >
>> >> >>> > Actually in case p->desc == NULL (the irq is not an hardware irq, it
>> >> >>> > could be the virtual timer irq or the evtchn irq), you shouldn't need
>> >> >>> > the maintenance interrupt, if the bug was really due to GICH_LR_HW not
>> >> >>> > working correctly on OMAP5. This changes might only be better at
>> >> >>> > "hiding" the real issue.
>> >> >>> >
>> >> >>> > Maybe the problem is exactly the opposite: the new scheme for avoiding
>> >> >>> > maintenance interrupts doesn't work for software interrupts.
>> >> >>> > The commit that should make them work correctly after the
>> >> >>> > no-maintenance-irq commit is 394b7e587b05d0f4a5fd6f067b38339ab5a77121
>> >> >>> > If you look at the changes to gic_update_one_lr in that commit, you'll
>> >> >>> > see that is going to set a software irq as PENDING if it is already ACTIVE.
>> >> >>> > Maybe that doesn't work correctly on OMAP5.
>> >> >>> >
>> >> >>> > Could you try this patch on top of
>> >> >>> > 394b7e587b05d0f4a5fd6f067b38339ab5a77121?  It should help us understand
>> >> >>> > if the problem is specifically with software irqs.
>> >> >>> >
>> >> >>> >
>> >> >>> > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
>> >> >>> > index b7516c0..d8a17c9 100644
>> >> >>> > --- a/xen/arch/arm/gic.c
>> >> >>> > +++ b/xen/arch/arm/gic.c
>> >> >>> > @@ -66,7 +66,7 @@ static DEFINE_PER_CPU(u8, gic_cpu_id);
>> >> >>> >  /* Maximum cpu interface per GIC */
>> >> >>> >  #define NR_GIC_CPU_IF 8
>> >> >>> >
>> >> >>> > -#undef GIC_DEBUG
>> >> >>> > +#define GIC_DEBUG 1
>> >> >>> >
>> >> >>> >  static void gic_update_one_lr(struct vcpu *v, int i);
>> >> >>> >
>> >> >>> > @@ -563,6 +563,8 @@ static inline void gic_set_lr(int lr, struct pending_irq *p,
>> >> >>> >          ((p->irq & GICH_LR_VIRTUAL_MASK) << GICH_LR_VIRTUAL_SHIFT);
>> >> >>> >      if ( p->desc != NULL )
>> >> >>> >          lr_val |= GICH_LR_HW | (p->desc->irq << GICH_LR_PHYSICAL_SHIFT);
>> >> >>> > +    else
>> >> >>> > +        lr_val |= GICH_LR_MAINTENANCE_IRQ;
>> >> >>> >
>> >> >>> >      GICH[GICH_LR + lr] = lr_val;
>> >> >>> >
>> >> >>>
>> >> >>>
>> >> >>>
>> >> >>> --
>> >> >>>
>> >> >>> Andrii Tseglytskyi | Embedded Dev
>> >> >>> GlobalLogic
>> >> >>> www.globallogic.com
>> >> >>>
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> >
>> >> > Andrii Tseglytskyi | Embedded Dev
>> >> > GlobalLogic
>> >> > www.globallogic.com
>> >>
>> >>
>> >>
>> >> --
>> >>
>> >> Andrii Tseglytskyi | Embedded Dev
>> >> GlobalLogic
>> >> www.globallogic.com
>> >>
>>
>>
>>
>> --
>>
>> Andrii Tseglytskyi | Embedded Dev
>> GlobalLogic
>> www.globallogic.com
>>
diff mbox

Patch

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index b7516c0..5eaeca2 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -717,7 +717,12 @@  static void gic_restore_pending_irqs(struct vcpu *v)
     list_for_each_entry_safe ( p, t, &v->arch.vgic.lr_pending, lr_queue )
     {
         i = find_first_zero_bit(&this_cpu(lr_mask), nr_lrs);
-        if ( i >= nr_lrs ) return;
+        if ( i >= nr_lrs )
+        {
+            gdprintk(XENLOG_DEBUG, "LRs full, not injecting irq=%u into d%dv%d\n",
+                    p->irq, v->domain->domain_id, v->vcpu_id);
+            continue;
+        }
 
         spin_lock_irqsave(&gic.lock, flags);
         gic_set_lr(i, p, GICH_LR_PENDING);