Message ID | 5FC3163CFD30C246ABAA99954A238FA83877FE25@FRAEML521-MBX.china.huawei.com |
---|---|
State | New |
Series | [Xen-devel] Xen Dom0 boot failure on platform that supports ARM GICv4 |
On 03/09/18 15:53, Shameerali Kolothum Thodi wrote:
> Hi,

Hello,

> I am trying to boot xen(stable-4.11) on one of our ARM64 boards which
> has support for GICv4.
>
> But dom0(kernel 4.18) boot fails with the below trap,
>
> XEN) ............done.
> (XEN) Std. Loglevel: All
> (XEN) Guest Loglevel: All
> (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch
> input to Xen)
> (XEN) Freed 304kB init memory.
> (XEN) traps.c:2007:d0v0 HSR=0x93800004 pc=0xffff00000841af04
> gva=0xffff00000b10ffe8 gpa=0x004000aa10ffe8

Which bits of Linux is trying to access the region?

>
> After a bit of debugging, it looks like, the GICR size used in vgic_v3_domain_init()
> is GICv4 GICR size(256K) and this upsets the first_cpu calculations.

Can you expand what you mean by upset? What's wrong with the first_cpu
calculations.

>
> Since dom0 gicv3 is also an emulated one, I think the size should be
> restricted to use the GICv3 GICR size(128K). I have made the below
> changes and is able to boot dom0 now.
>
> But not sure, this is the right approach to fix the issue. Please let me
> know your thoughts.
>
> Thanks,
> Shameer
>
> ---->8-------------
>
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index b2ed0f8..bf028cc 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -1783,7 +1783,8 @@ static int __init gicv3_init(void)
>      reg = readl_relaxed(GICD + GICD_TYPER);
>      intid_bits = GICD_TYPE_ID_BITS(reg);
>
> -    vgic_v3_setup_hw(dbase, gicv3.rdist_count, gicv3.rdist_regions, intid_bits);
> +    vgic_v3_setup_hw(dbase, gicv3.rdist_count, gicv3.rdist_regions,
> +                     intid_bits, gic_dist_supports_dvis());
>      gicv3_init_v2();
>
>      spin_lock_init(&gicv3.lock);
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index 4b42739..0f53d88 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -59,18 +59,21 @@ static struct {
>      unsigned int nr_rdist_regions;
>      const struct rdist_region *regions;
>      unsigned int intid_bits; /* Number of interrupt ID bits */
> +    bool dvis;
>  } vgic_v3_hw;
>
>  void vgic_v3_setup_hw(paddr_t dbase,
>                        unsigned int nr_rdist_regions,
>                        const struct rdist_region *regions,
> -                      unsigned int intid_bits)
> +                      unsigned int intid_bits,
> +                      bool dvis)
>  {
>      vgic_v3_hw.enabled = true;
>      vgic_v3_hw.dbase = dbase;
>      vgic_v3_hw.nr_rdist_regions = nr_rdist_regions;
>      vgic_v3_hw.regions = regions;
>      vgic_v3_hw.intid_bits = intid_bits;
> +    vgic_v3_hw.dvis = dvis;
>  }
>
>  static struct vcpu *vgic_v3_irouter_to_vcpu(struct domain *d, uint64_t irouter)
> @@ -1673,6 +1676,9 @@ static int vgic_v3_domain_init(struct domain *d)
>          {
>              paddr_t size = vgic_v3_hw.regions[i].size;
>
> +            if (vgic_v3_hw.dvis && (size == GICV4_GICR_SIZE))
> +                size = GICV3_GICR_SIZE;

vgic_v3_hw.regions is describing the regions in the layout that could
hold re-distributor. You can have multiple re-distributor per region.

The variable size holds the size of the region, not the size of the
re-distributor.

I am not sure to understand why you want to restrict the size of the
region here because GICV4_GICR_SIZE is a multiple of GICV3_GICR_SIZE. So
you should be able to fit 2 re-distributors per region.

It looks like to me the re-distributor regions are not reported
correctly or Dom0 thinks it is on GICv4. Can you provide a bit more
details on the function that cause the crash and some logs from Linux?

Also, which Linux version are you using?

Cheers,
Hi Julien,

Thanks for taking a look at this.

> -----Original Message-----
> From: Julien Grall [mailto:julien.grall@arm.com]
> Sent: 03 September 2018 17:13
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> xen-devel@lists.xen.org
> Cc: sstabellini@kernel.org; Linuxarm <linuxarm@huawei.com>; Andre
> Przywara <andre.przywara@arm.com>
> Subject: Re: Xen Dom0 boot failure on platform that supports ARM GICv4
>
>
>
> On 03/09/18 15:53, Shameerali Kolothum Thodi wrote:
> > Hi,
>
> Hello,
>
> > I am trying to boot xen(stable-4.11) on one of our ARM64 boards which
> > has support for GICv4.
> >
> > But dom0(kernel 4.18) boot fails with the below trap,
> >
> > XEN) ............done.
> > (XEN) Std. Loglevel: All
> > (XEN) Guest Loglevel: All
> > (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch
> > input to Xen)
> > (XEN) Freed 304kB init memory.
> > (XEN) traps.c:2007:d0v0 HSR=0x93800004 pc=0xffff00000841af04
> > gva=0xffff00000b10ffe8 gpa=0x004000aa10ffe8
>
> Which bits of Linux is trying to access the region?

I think it is the gic_iterate_rdists() as the offset just before this is ffe8, which is GICR_PIDR2

> >
> > After a bit of debugging, it looks like, the GICR size used in
> vgic_v3_domain_init()
> > is GICv4 GICR size(256K) and this upsets the first_cpu calculations.
>
> Can you expand what you mean by upset? What's wrong with the first_cpu
> calculations.

What I meant is, since this is a GICv4, the vgic_v3_hw.regions[i]->size is set to 256K and
since first_cpu is calculated like,

first_cpu += size /GICV3_GICR_SIZE;

gets wrong as what I am seeing is,

(XEN) frst_cpu 2
(XEN) first_cpu 4
(XEN) first_cpu 6
(XEN) first_cpu 8
(XEN) first_cpu 10
(XEN) first_cpu 12
(XEN) first_cpu 14
.....
(XEN) first_cpu 192

But the original number of CPUS are only 96. Hence I thought this is wrong.

> >
> > Since dom0 gicv3 is also an emulated one, I think the size should be
> > restricted to use the GICv3 GICR size(128K). I have made the below
> > changes and is able to boot dom0 now.
> >
> > But not sure, this is the right approach to fix the issue. Please let me
> > know your thoughts.
> >
> > Thanks,
> > Shameer
> >
> > ---->8-------------
> >
> > diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> > index b2ed0f8..bf028cc 100644
> > --- a/xen/arch/arm/gic-v3.c
> > +++ b/xen/arch/arm/gic-v3.c
> > @@ -1783,7 +1783,8 @@ static int __init gicv3_init(void)
> >      reg = readl_relaxed(GICD + GICD_TYPER);
> >      intid_bits = GICD_TYPE_ID_BITS(reg);
> >
> > -    vgic_v3_setup_hw(dbase, gicv3.rdist_count, gicv3.rdist_regions,
> intid_bits);
> > +    vgic_v3_setup_hw(dbase, gicv3.rdist_count, gicv3.rdist_regions,
> > +                     intid_bits, gic_dist_supports_dvis());
> >      gicv3_init_v2();
> >
> >      spin_lock_init(&gicv3.lock);
> > diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> > index 4b42739..0f53d88 100644
> > --- a/xen/arch/arm/vgic-v3.c
> > +++ b/xen/arch/arm/vgic-v3.c
> > @@ -59,18 +59,21 @@ static struct {
> >      unsigned int nr_rdist_regions;
> >      const struct rdist_region *regions;
> >      unsigned int intid_bits; /* Number of interrupt ID bits */
> > +    bool dvis;
> >  } vgic_v3_hw;
> >
> >  void vgic_v3_setup_hw(paddr_t dbase,
> >                        unsigned int nr_rdist_regions,
> >                        const struct rdist_region *regions,
> > -                      unsigned int intid_bits)
> > +                      unsigned int intid_bits,
> > +                      bool dvis)
> >  {
> >      vgic_v3_hw.enabled = true;
> >      vgic_v3_hw.dbase = dbase;
> >      vgic_v3_hw.nr_rdist_regions = nr_rdist_regions;
> >      vgic_v3_hw.regions = regions;
> >      vgic_v3_hw.intid_bits = intid_bits;
> > +    vgic_v3_hw.dvis = dvis;
> >  }
> >
> >  static struct vcpu *vgic_v3_irouter_to_vcpu(struct domain *d, uint64_t
> irouter)
> > @@ -1673,6 +1676,9 @@ static int vgic_v3_domain_init(struct domain *d)
> >          {
> >              paddr_t size = vgic_v3_hw.regions[i].size;
> >
> > +            if (vgic_v3_hw.dvis && (size == GICV4_GICR_SIZE))
> > +                size = GICV3_GICR_SIZE;
>
> vgic_v3_hw.regions is describing the regions in the layout that could
> hold re-distributor. You can have multiple re-distributor per region.
>
> The variable size holds the size of the region, not the size of the
> re-distributor.
>
> I am not sure to understand why you want to restrict the size of the
> region here because GICV4_GICR_SIZE is a multiple of GICV3_GICR_SIZE. So
> you should be able to fit 2 re-distributors per region.
>
> It looks like to me the re-distributor regions are not reported
> correctly or Dom0 thinks it is on GICv4. Can you provide a bit more
> details on the function that cause the crash and some logs from Linux?

Ok. I added few prints along the vgic mmio read path and this is what happens
before the trap.

vgic_v3_rdistr_mmio_read()
    get_vcpu_from_rdist() -->returns NULL here for 0x004000aa10ffe8 which
                             actually belongs to cpu id 48 as per the log below

(XEN) 96 CPUs enabled, 96 CPUs total
(XEN) SMP: Allowing 96 CPUs
(XEN) Generic Timer IRQ: phys=30 hyp=26 virt=27 Freq: 100000 KHz
(XEN) GICv3 initialization:
(XEN) gic_dist_addr=0x000000aa000000
(XEN) gic_maintenance_irq=25
(XEN) gic_rdist_stride=0
(XEN) gic_rdist_regions=96
(XEN) redistributor regions:
(XEN) - region 0: 0x000000aa100000 - 0x000000aa140000
(XEN) - region 1: 0x000000aa140000 - 0x000000aa180000
(XEN) - region 2: 0x000000aa180000 - 0x000000aa1c0000
(XEN) - region 3: 0x000000aa1c0000 - 0x000000aa200000
(XEN) - region 4: 0x000000aa200000 - 0x000000aa240000
(XEN) - region 5: 0x000000aa240000 - 0x000000aa280000
(XEN) - region 6: 0x000000aa280000 - 0x000000aa2c0000
(XEN) - region 7: 0x000000aa2c0000 - 0x000000aa300000
(XEN) - region 8: 0x000000aa300000 - 0x000000aa340000
(XEN) - region 9: 0x000000aa340000 - 0x000000aa380000
(XEN) - region 10: 0x000000aa380000 - 0x000000aa3c0000
(XEN) - region 11: 0x000000aa3c0000 - 0x000000aa400000
(XEN) - region 12: 0x000000aa400000 - 0x000000aa440000
(XEN) - region 13: 0x000000aa440000 - 0x000000aa480000
(XEN) - region 14: 0x000000aa480000 - 0x000000aa4c0000
(XEN) - region 15: 0x000000aa4c0000 - 0x000000aa500000
(XEN) - region 16: 0x000000aa500000 - 0x000000aa540000
(XEN) - region 17: 0x000000aa540000 - 0x000000aa580000
(XEN) - region 18: 0x000000aa580000 - 0x000000aa5c0000
(XEN) - region 19: 0x000000aa5c0000 - 0x000000aa600000
(XEN) - region 20: 0x000000aa600000 - 0x000000aa640000
(XEN) - region 21: 0x000000aa640000 - 0x000000aa680000
(XEN) - region 22: 0x000000aa680000 - 0x000000aa6c0000
(XEN) - region 23: 0x000000aa6c0000 - 0x000000aa700000
(XEN) - region 24: 0x000000ae100000 - 0x000000ae140000
(XEN) - region 25: 0x000000ae140000 - 0x000000ae180000
(XEN) - region 26: 0x000000ae180000 - 0x000000ae1c0000
(XEN) - region 27: 0x000000ae1c0000 - 0x000000ae200000
(XEN) - region 28: 0x000000ae200000 - 0x000000ae240000
(XEN) - region 29: 0x000000ae240000 - 0x000000ae280000
(XEN) - region 30: 0x000000ae280000 - 0x000000ae2c0000
(XEN) - region 31: 0x000000ae2c0000 - 0x000000ae300000
(XEN) - region 32: 0x000000ae300000 - 0x000000ae340000
(XEN) - region 33: 0x000000ae340000 - 0x000000ae380000
(XEN) - region 34: 0x000000ae380000 - 0x000000ae3c0000
(XEN) - region 35: 0x000000ae3c0000 - 0x000000ae400000
(XEN) - region 36: 0x000000ae400000 - 0x000000ae440000
(XEN) - region 37: 0x000000ae440000 - 0x000000ae480000
(XEN) - region 38: 0x000000ae480000 - 0x000000ae4c0000
(XEN) - region 39: 0x000000ae4c0000 - 0x000000ae500000
(XEN) - region 40: 0x000000ae500000 - 0x000000ae540000
(XEN) - region 41: 0x000000ae540000 - 0x000000ae580000
(XEN) - region 42: 0x000000ae580000 - 0x000000ae5c0000
(XEN) - region 43: 0x000000ae5c0000 - 0x000000ae600000
(XEN) - region 44: 0x000000ae600000 - 0x000000ae640000
(XEN) - region 45: 0x000000ae640000 - 0x000000ae680000
(XEN) - region 46: 0x000000ae680000 - 0x000000ae6c0000
(XEN) - region 47: 0x000000ae6c0000 - 0x000000ae700000
(XEN) - region 48: 0x004000aa100000 - 0x004000aa140000
(XEN) - region 49: 0x004000aa140000 - 0x004000aa180000
(XEN) - region 50: 0x004000aa180000 - 0x004000aa1c0000
(XEN) - region 51: 0x004000aa1c0000 - 0x004000aa200000
(XEN) - region 52: 0x004000aa200000 - 0x004000aa240000
(XEN) - region 53: 0x004000aa240000 - 0x004000aa280000
(XEN) - region 54: 0x004000aa280000 - 0x004000aa2c0000
(XEN) - region 55: 0x004000aa2c0000 - 0x004000aa300000
(XEN) - region 56: 0x004000aa300000 - 0x004000aa340000
(XEN) - region 57: 0x004000aa340000 - 0x004000aa380000
(XEN) - region 58: 0x004000aa380000 - 0x004000aa3c0000
(XEN) - region 59: 0x004000aa3c0000 - 0x004000aa400000
(XEN) - region 60: 0x004000aa400000 - 0x004000aa440000
(XEN) - region 61: 0x004000aa440000 - 0x004000aa480000
(XEN) - region 62: 0x004000aa480000 - 0x004000aa4c0000
(XEN) - region 63: 0x004000aa4c0000 - 0x004000aa500000
(XEN) - region 64: 0x004000aa500000 - 0x004000aa540000
(XEN) - region 65: 0x004000aa540000 - 0x004000aa580000
(XEN) - region 66: 0x004000aa580000 - 0x004000aa5c0000
(XEN) - region 67: 0x004000aa5c0000 - 0x004000aa600000
(XEN) - region 68: 0x004000aa600000 - 0x004000aa640000
(XEN) - region 69: 0x004000aa640000 - 0x004000aa680000
(XEN) - region 70: 0x004000aa680000 - 0x004000aa6c0000
(XEN) - region 71: 0x004000aa6c0000 - 0x004000aa700000
(XEN) - region 72: 0x004000ae100000 - 0x004000ae140000
(XEN) - region 73: 0x004000ae140000 - 0x004000ae180000
(XEN) - region 74: 0x004000ae180000 - 0x004000ae1c0000
(XEN) - region 75: 0x004000ae1c0000 - 0x004000ae200000
(XEN) - region 76: 0x004000ae200000 - 0x004000ae240000
(XEN) - region 77: 0x004000ae240000 - 0x004000ae280000
(XEN) - region 78: 0x004000ae280000 - 0x004000ae2c0000
(XEN) - region 79: 0x004000ae2c0000 - 0x004000ae300000
(XEN) - region 80: 0x004000ae300000 - 0x004000ae340000
(XEN) - region 81: 0x004000ae340000 - 0x004000ae380000
(XEN) - region 82: 0x004000ae380000 - 0x004000ae3c0000
(XEN) - region 83: 0x004000ae3c0000 - 0x004000ae400000
(XEN) - region 84: 0x004000ae400000 - 0x004000ae440000
(XEN) - region 85: 0x004000ae440000 - 0x004000ae480000
(XEN) - region 86: 0x004000ae480000 - 0x004000ae4c0000
(XEN) - region 87: 0x004000ae4c0000 - 0x004000ae500000
(XEN) - region 88: 0x004000ae500000 - 0x004000ae540000
(XEN) - region 89: 0x004000ae540000 - 0x004000ae580000
(XEN) - region 90: 0x004000ae580000 - 0x004000ae5c0000
(XEN) - region 91: 0x004000ae5c0000 - 0x004000ae600000
(XEN) - region 92: 0x004000ae600000 - 0x004000ae640000
(XEN) - region 93: 0x004000ae640000 - 0x004000ae680000
(XEN) - region 94: 0x004000ae680000 - 0x004000ae6c0000
(XEN) - region 95: 0x004000ae6c0000 - 0x004000ae700000
(XEN) GICv3: using at most 57344 LPIs on the host.
(XEN) GICv3: 672 lines, (IID 00030736).
(XEN) GICv3: Found ITS @0x202100000
(XEN) GICv3: CPU0: Found redistributor in region 0 @0000000040037000
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Defaulting to alternative key handling; send 'A' to switch to normal mode.
(XEN) Allocated console ring of 1024 KiB.
(XEN) Bringing up CPU1
.......

If I remember correctly there was no logs from Dom0, but I need to double
check the Dom0 cmdline option to see earlycon was set.

I could also enable/add any prints that you think will help and rerun. Please
let me know.

> Also, which Linux version are you using?

4.18-rc1.

Thanks,
Shameer

> Cheers,
>
> --
> Julien Grall
On 03/09/18 17:54, Shameerali Kolothum Thodi wrote:
> Hi Julien,
>
> Thanks for taking a look at this.
>
>> -----Original Message-----
>> From: Julien Grall [mailto:julien.grall@arm.com]
>> Sent: 03 September 2018 17:13
>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
>> xen-devel@lists.xen.org
>> Cc: sstabellini@kernel.org; Linuxarm <linuxarm@huawei.com>; Andre
>> Przywara <andre.przywara@arm.com>
>> Subject: Re: Xen Dom0 boot failure on platform that supports ARM GICv4
>>
>>
>>
>> On 03/09/18 15:53, Shameerali Kolothum Thodi wrote:
>>> Hi,
>>
>> Hello,
>>
>>> I am trying to boot xen(stable-4.11) on one of our ARM64 boards which
>>> has support for GICv4.
>>>
>>> But dom0(kernel 4.18) boot fails with the below trap,
>>>
>>> XEN) ............done.
>>> (XEN) Std. Loglevel: All
>>> (XEN) Guest Loglevel: All
>>> (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch
>>> input to Xen)
>>> (XEN) Freed 304kB init memory.
>>> (XEN) traps.c:2007:d0v0 HSR=0x93800004 pc=0xffff00000841af04
>>> gva=0xffff00000b10ffe8 gpa=0x004000aa10ffe8
>>
>> Which bits of Linux is trying to access the region?
>
> I think it is the gic_iterate_rdists() as the offset just before this is ffe8, which is GICR_PIDR2
>
>>>
>>> After a bit of debugging, it looks like, the GICR size used in
>> vgic_v3_domain_init()
>>> is GICv4 GICR size(256K) and this upsets the first_cpu calculations.
>>
>> Can you expand what you mean by upset? What's wrong with the first_cpu
>> calculations.
>
> What I meant is, since this is a GICv4, the vgic_v3_hw.regions[i]->size is set to 256K and
> since first_cpu is calculated like,
>
> first_cpu += size /GICV3_GICR_SIZE;
>
> gets wrong as what I am seeing is,
>
> (XEN) frst_cpu 2
> (XEN) first_cpu 4
> (XEN) first_cpu 6
> (XEN) first_cpu 8
> (XEN) first_cpu 10
> (XEN) first_cpu 12
> (XEN) first_cpu 14
> .....
> (XEN) first_cpu 192
>
> But the original number of CPUS are only 96. Hence I thought this is wrong.

This is perfectly fine. Until recently it was not possible to know the
number of vCPUs at domain creation. So the function is computing the
first CPU for all the regions.
With the recent change, it would be possible to only compute what is
necessary.

>>>
>>> Since dom0 gicv3 is also an emulated one, I think the size should be
>>> restricted to use the GICv3 GICR size(128K). I have made the below
>>> changes and is able to boot dom0 now.
>>>
>>> But not sure, this is the right approach to fix the issue. Please let me
>>> know your thoughts.
>>>
>>> Thanks,
>>> Shameer
>>>
>>> ---->8-------------
>>>
>>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
>>> index b2ed0f8..bf028cc 100644
>>> --- a/xen/arch/arm/gic-v3.c
>>> +++ b/xen/arch/arm/gic-v3.c
>>> @@ -1783,7 +1783,8 @@ static int __init gicv3_init(void)
>>>      reg = readl_relaxed(GICD + GICD_TYPER);
>>>      intid_bits = GICD_TYPE_ID_BITS(reg);
>>>
>>> -    vgic_v3_setup_hw(dbase, gicv3.rdist_count, gicv3.rdist_regions,
>> intid_bits);
>>> +    vgic_v3_setup_hw(dbase, gicv3.rdist_count, gicv3.rdist_regions,
>>> +                     intid_bits, gic_dist_supports_dvis());
>>>      gicv3_init_v2();
>>>
>>>      spin_lock_init(&gicv3.lock);
>>> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
>>> index 4b42739..0f53d88 100644
>>> --- a/xen/arch/arm/vgic-v3.c
>>> +++ b/xen/arch/arm/vgic-v3.c
>>> @@ -59,18 +59,21 @@ static struct {
>>>      unsigned int nr_rdist_regions;
>>>      const struct rdist_region *regions;
>>>      unsigned int intid_bits; /* Number of interrupt ID bits */
>>> +    bool dvis;
>>>  } vgic_v3_hw;
>>>
>>>  void vgic_v3_setup_hw(paddr_t dbase,
>>>                        unsigned int nr_rdist_regions,
>>>                        const struct rdist_region *regions,
>>> -                      unsigned int intid_bits)
>>> +                      unsigned int intid_bits,
>>> +                      bool dvis)
>>>  {
>>>      vgic_v3_hw.enabled = true;
>>>      vgic_v3_hw.dbase = dbase;
>>>      vgic_v3_hw.nr_rdist_regions = nr_rdist_regions;
>>>      vgic_v3_hw.regions = regions;
>>>      vgic_v3_hw.intid_bits = intid_bits;
>>> +    vgic_v3_hw.dvis = dvis;
>>>  }
>>>
>>>  static struct vcpu *vgic_v3_irouter_to_vcpu(struct domain *d, uint64_t
>> irouter)
>>> @@ -1673,6 +1676,9 @@ static int vgic_v3_domain_init(struct domain *d)
>>>          {
>>>              paddr_t size = vgic_v3_hw.regions[i].size;
>>>
>>> +            if (vgic_v3_hw.dvis && (size == GICV4_GICR_SIZE))
>>> +                size = GICV3_GICR_SIZE;
>>
>> vgic_v3_hw.regions is describing the regions in the layout that could
>> hold re-distributor. You can have multiple re-distributor per region.
>>
>> The variable size holds the size of the region, not the size of the
>> re-distributor.
>>
>> I am not sure to understand why you want to restrict the size of the
>> region here because GICV4_GICR_SIZE is a multiple of GICV3_GICR_SIZE. So
>> you should be able to fit 2 re-distributors per region.
>>
>> It looks like to me the re-distributor regions are not reported
>> correctly or Dom0 thinks it is on GICv4. Can you provide a bit more
>> details on the function that cause the crash and some logs from Linux?
>
> Ok. I added few prints along the vgic mmio read path and this is what happens
> before the trap.
>
> vgic_v3_rdistr_mmio_read()
>     get_vcpu_from_rdist() -->returns NULL here for 0x004000aa10ffe8 which
>                              actually belongs to cpu id 48 as per the log below

Do you mean region id 48? So if I get it correctly, you are trying to
access re-distributor for vCPU ID 96.

[...]

> If I remember correctly there was no logs from Dom0, but I need to double
> check the Dom0 cmdline option to see earlycon was set.
>
> I could also enable/add any prints that you think will help and rerun. Please
> let me know

I may have an idea what is happening. As we populate more regions than
necessary, it is possible that Linux is trying to access them. Would it
be possible to add some debug in the Linux function gic_iterate_rdists
to know what the kernel is trying to read?

Cheers,
> -----Original Message-----
> From: Julien Grall [mailto:julien.grall@arm.com]
> Sent: 03 September 2018 18:14
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> xen-devel@lists.xen.org
> Cc: sstabellini@kernel.org; Linuxarm <linuxarm@huawei.com>; Andre
> Przywara <andre.przywara@arm.com>
> Subject: Re: Xen Dom0 boot failure on platform that supports ARM GICv4
>
>
>
> On 03/09/18 17:54, Shameerali Kolothum Thodi wrote:
> > Hi Julien,
> >
> > Thanks for taking a look at this.
> >
> >> -----Original Message-----
> >> From: Julien Grall [mailto:julien.grall@arm.com]
> >> Sent: 03 September 2018 17:13
> >> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> >> xen-devel@lists.xen.org
> >> Cc: sstabellini@kernel.org; Linuxarm <linuxarm@huawei.com>; Andre
> >> Przywara <andre.przywara@arm.com>
> >> Subject: Re: Xen Dom0 boot failure on platform that supports ARM GICv4
> >>
> >>
> >>
> >> On 03/09/18 15:53, Shameerali Kolothum Thodi wrote:
> >>> Hi,
> >>
> >> Hello,
> >>
> >>> I am trying to boot xen(stable-4.11) on one of our ARM64 boards which
> >>> has support for GICv4.
> >>>
> >>> But dom0(kernel 4.18) boot fails with the below trap,
> >>>
> >>> XEN) ............done.
> >>> (XEN) Std. Loglevel: All
> >>> (XEN) Guest Loglevel: All
> >>> (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch
> >>> input to Xen)
> >>> (XEN) Freed 304kB init memory.
> >>> (XEN) traps.c:2007:d0v0 HSR=0x93800004 pc=0xffff00000841af04
> >>> gva=0xffff00000b10ffe8 gpa=0x004000aa10ffe8
> >>
> >> Which bits of Linux is trying to access the region?
> >
> > I think it is the gic_iterate_rdists() as the offset just before this is ffe8, which
> is GICR_PIDR2
> >
> >>>
> >>> After a bit of debugging, it looks like, the GICR size used in
> >> vgic_v3_domain_init()
> >>> is GICv4 GICR size(256K) and this upsets the first_cpu calculations.
> >>
> >> Can you expand what you mean by upset? What's wrong with the first_cpu
> >> calculations.
> >
> > What I meant is, since this is a GICv4, the vgic_v3_hw.regions[i]->size is set to
> 256K and
> > since first_cpu is calculated like,
> >
> > first_cpu += size /GICV3_GICR_SIZE;
> >
> > gets wrong as what I am seeing is,
> >
> > (XEN) frst_cpu 2
> > (XEN) first_cpu 4
> > (XEN) first_cpu 6
> > (XEN) first_cpu 8
> > (XEN) first_cpu 10
> > (XEN) first_cpu 12
> > (XEN) first_cpu 14
> > .....
> > (XEN) first_cpu 192
> >
> > But the original number of CPUS are only 96. Hence I thought this is wrong.
>
> This is perfectly fine. Until recently it was not possible to know the
> number of vCPUs at domain creation. So the function is computing the
> first CPU for all the regions.
> With the recent change, it would be possible to only compute what is
> necessary.

Ah..alright. This was not clear to me.

> >>>
> >>> Since dom0 gicv3 is also an emulated one, I think the size should be
> >>> restricted to use the GICv3 GICR size(128K). I have made the below
> >>> changes and is able to boot dom0 now.
> >>>
> >>> But not sure, this is the right approach to fix the issue. Please let me
> >>> know your thoughts.
> >>>
> >>> Thanks,
> >>> Shameer
> >>>
> >>> ---->8-------------
> >>>
> >>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> >>> index b2ed0f8..bf028cc 100644
> >>> --- a/xen/arch/arm/gic-v3.c
> >>> +++ b/xen/arch/arm/gic-v3.c
> >>> @@ -1783,7 +1783,8 @@ static int __init gicv3_init(void)
> >>>      reg = readl_relaxed(GICD + GICD_TYPER);
> >>>      intid_bits = GICD_TYPE_ID_BITS(reg);
> >>>
> >>> -    vgic_v3_setup_hw(dbase, gicv3.rdist_count, gicv3.rdist_regions,
> >> intid_bits);
> >>> +    vgic_v3_setup_hw(dbase, gicv3.rdist_count, gicv3.rdist_regions,
> >>> +                     intid_bits, gic_dist_supports_dvis());
> >>>      gicv3_init_v2();
> >>>
> >>>      spin_lock_init(&gicv3.lock);
> >>> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> >>> index 4b42739..0f53d88 100644
> >>> --- a/xen/arch/arm/vgic-v3.c
> >>> +++ b/xen/arch/arm/vgic-v3.c
> >>> @@ -59,18 +59,21 @@ static struct {
> >>>      unsigned int nr_rdist_regions;
> >>>      const struct rdist_region *regions;
> >>>      unsigned int intid_bits; /* Number of interrupt ID bits */
> >>> +    bool dvis;
> >>>  } vgic_v3_hw;
> >>>
> >>>  void vgic_v3_setup_hw(paddr_t dbase,
> >>>                        unsigned int nr_rdist_regions,
> >>>                        const struct rdist_region *regions,
> >>> -                      unsigned int intid_bits)
> >>> +                      unsigned int intid_bits,
> >>> +                      bool dvis)
> >>>  {
> >>>      vgic_v3_hw.enabled = true;
> >>>      vgic_v3_hw.dbase = dbase;
> >>>      vgic_v3_hw.nr_rdist_regions = nr_rdist_regions;
> >>>      vgic_v3_hw.regions = regions;
> >>>      vgic_v3_hw.intid_bits = intid_bits;
> >>> +    vgic_v3_hw.dvis = dvis;
> >>>  }
> >>>
> >>>  static struct vcpu *vgic_v3_irouter_to_vcpu(struct domain *d, uint64_t
> >> irouter)
> >>> @@ -1673,6 +1676,9 @@ static int vgic_v3_domain_init(struct domain *d)
> >>>          {
> >>>              paddr_t size = vgic_v3_hw.regions[i].size;
> >>>
> >>> +            if (vgic_v3_hw.dvis && (size == GICV4_GICR_SIZE))
> >>> +                size = GICV3_GICR_SIZE;
> >>
> >> vgic_v3_hw.regions is describing the regions in the layout that could
> >> hold re-distributor. You can have multiple re-distributor per region.
> >>
> >> The variable size holds the size of the region, not the size of the
> >> re-distributor.
> >>
> >> I am not sure to understand why you want to restrict the size of the
> >> region here because GICV4_GICR_SIZE is a multiple of GICV3_GICR_SIZE. So
> >> you should be able to fit 2 re-distributors per region.
> >>
> >> It looks like to me the re-distributor regions are not reported
> >> correctly or Dom0 thinks it is on GICv4. Can you provide a bit more
> >> details on the function that cause the crash and some logs from Linux?
> >
> > Ok. I added few prints along the vgic mmio read path and this is what
> happens
> > before the trap.
> >
> > vgic_v3_rdistr_mmio_read()
> >     get_vcpu_from_rdist() -->returns NULL here for 0x004000aa10ffe8
> which
> >                              actually belongs to cpu id 48 as per the log
> below
>
> Do you mean region id 48? So if I get it correctly, you are trying to
> access re-distributor for vCPU ID 96.

Hmm..I was under the impression that there is a one to one map here. And
you are right, it is indeed vcpu id 96 which is invalid.

> [...]
>
> > If I remember correctly there was no logs from Dom0, but I need to double
> > check the Dom0 cmdline option to see earlycon was set.
> >
> > I could also enable/add any prints that you think will help and rerun. Please
> > let me know
>
> I may have an idea what is happening. As we populate more regions than
> necessary, it is possible that Linux is trying to access them. Would it
> be possible to add some debug in the Linux function gic_iterate_rdists
> to know what the kernel is trying to read?

Ok, enabled earlycon for Dom0. Please find the log below,

(XEN) ............done.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 304kB init memory.
(XEN) DOM0: [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x480fd010]
(XEN) DOM0: [ 0.000000] Linux version 4.18.0-rc1-220038-gecb377e-dirty (shameer@shameer-ubuntu) (gcc version 4.9.2 20140904 (prerelease
(XEN) DOM0: ) (crosstool-NG linaro-1.13.1-4.9-2014.09 - Linaro GCC 4.9-2014.09)) #255 SMP PREEMPT Mon Sep 3 19:14:14 BST 2018
(XEN) DOM0: [ 0.000000] Xen 4.11 support found
(XEN) DOM0: [ 0.000000] efi: Getting EFI parameters from FDT:
(XEN) DOM0: [ 0.000000] efi: EFI v2.50 by Xen
(XEN) DOM0: [ 0.000000] efi: ACPI 2.0=0x239be02648
(XEN) DOM0: [ 0.000000] ACPI: Early table checksum verification disabled
(XEN) DOM0: [ 0.000000] ACPI: RSDP 0x000000239BE02648 000024 (v02 HISI )
(XEN) DOM0: [ 0.000000] ACPI: XSDT 0x000000239BE02598 0000AC (v01 HISI HIP08 00000000 01000013)
(XEN) DOM0: [ 0.000000] ACPI: FACP 0x000000239BE00000 000114 (v06 HISI HIP08 00000000 INTL 20151124)
(XEN) DOM0: [ 0.000000] ACPI: DSDT 0x00000000394D0000 006290 (v02 HISI HIP08 00000000 INTL 20180531)
(XEN) DOM0: [ 0.000000] ACPI: GTDT 0x0000000039540000 000060 (v02 HISI HIP08 00000000 INTL 20151124)
(XEN) DOM0: [ 0.000000] ACPI: DBG2 0x0000000039530000 00005A (v00 HISI HIP08 00000000 INTL 20151124)
(XEN) DOM0: [ 0.000000] ACPI: MCFG 0x0000000039520000 00003C (v01 HISI HIP08 00000000 INTL 20151124)
(XEN) DOM0: [ 0.000000] ACPI: SLIT 0x0000000039510000 00003C (v01 HISI HIP07 00000000 INTL 20151124)
(XEN) DOM0: [ 0.000000] ACPI: SPCR 0x0000000039500000 000050 (v02 HISI HIP08 00000000 INTL 20151124)
(XEN) DOM0: [ 0.000000] ACPI: SRAT 0x00000000394F0000 00074C (v03 HISI HIP08 00000000 INTL 20151124)
(XEN) DOM0: [ 0.000000] ACPI: APIC 0x000000239BE00118 002458 (v04 HISI HIP08 00000000 INTL 20151124)
(XEN) DOM0: [ 0.000000] ACPI: IORT 0x00000000394C0000 000E48 (v00 HISI HIP08 00000000 INTL 20180531)
(XEN) DOM0: [ 0.000000] ACPI: BERT 0x00000000394B0000 000030 (v01 HISI HIP08 00000000 INTL 20151124)
(XEN) DOM0: [ 0.000000] ACPI: HEST 0x00000000394A0000 00013C (v01 HISI HIP08 00000000 INTL 20151124)
(XEN) DOM0: [ 0.000000] ACPI: ERST 0x0000000039480000 000230 (v01 HISI HIP08 00000000 INTL 20151124)
(XEN) DOM0: [ 0.000000] ACPI: EINJ 0x0000000039470000 000170 (v01 HISI HIP08 00000000 INTL 20151124)
(XEN) DOM0: [ 0.000000] ACPI: PPTT 0x0000000031080000 002A30 (v01 HISI HIP08 00000000 INTL 20151124)
(XEN) DOM0: [ 0.000000] ACPI: SPMI 0x0000000031070000 000041 (v05 HISI HIP08 00000000 INTL 20151124)
(XEN) DOM0: [ 0.000000] ACPI: iBFT 0x0000000030FD0000 000800 (v01 HISI HIP08 00000000 00000000)
(XEN) DOM0: [ 0.000000] ACPI: STAO 0x000000239BE02570 000025 (v01 HISI HIP08 00000000 INTL 20151124)
(XEN) DOM0: [ 0.000000] ACPI: SPCR: console: pl011,mmio32,0x94080000,115200
(XEN) DOM0: [ 0.000000] earlycon: pl11 at MMIO32 0x0000000094080000 (options '115200')
(XEN) DOM0: [ 0.000000] bootconsole [pl11] enabled
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x0 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x1 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x2 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x3 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x4 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x5 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x6 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x7 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x8 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x9 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xa -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xb -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xc -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xd -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xe -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0xf -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x100 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x101 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x102 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x103 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x104 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x105 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x106 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x107 -> Node 0
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x108 -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x109 -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10a -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10b -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10c -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10d -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10e -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10f -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x200 -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x201 -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x202 -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x203 -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x204 -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x205 -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x206 -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x207 -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x208 -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x209 -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x20a -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x20b -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x20c -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x20d -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x20e -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x20f -> Node 1
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x300 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x301 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x302 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x303 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x304 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x305 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x306 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x307 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x308 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x309 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x30a -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x30b -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x30c -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x30d -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x30e -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x30f -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x400 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x401 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x402 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x403 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x404 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x405 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x406 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x407 -> Node 2
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x408 -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x409 -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x40a -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x40b -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x40c -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x40d -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x40e -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x40f -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x500 -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x501 -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x502 -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x503 -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x504 -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x505 -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x506 -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x507 -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x508 -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x509 -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x50a -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x50b -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x50c -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x50d -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x50e -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: NUMA: SRAT: PXM 3 -> MPIDR 0x50f -> Node 3
(XEN) DOM0: [ 0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x2080000000-0x23ffffffff]
(XEN) DOM0: [ 0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff]
(XEN) DOM0: [ 0.000000] NUMA:
NODE_DATA [mem 0x2317feaa80-0x2317febfff] (XEN) DOM0: [ 0.000000] NUMA: Initmem setup node 1 [<memory-less node>] (XEN) DOM0: [ 0.000000] NUMA: NODE_DATA [mem 0x2317fe9500-0x2317feaa7f] (XEN) DOM0: [ 0.000000] NUMA: NODE_DATA(1) on node 0 (XEN) DOM0: [ 0.000000] NUMA: Initmem setup node 2 [<memory-less node>] (XEN) DOM0: [ 0.000000] NUMA: NODE_DATA [mem 0x2317fe7f80-0x2317fe94ff] (XEN) DOM0: [ 0.000000] NUMA: NODE_DATA(2) on node 0 (XEN) DOM0: [ 0.000000] NUMA: Initmem setup node 3 [<memory-less node>] (XEN) DOM0: [ 0.000000] NUMA: NODE_DATA [mem 0x2317fe6a00-0x2317fe7f7f] (XEN) DOM0: [ 0.000000] NUMA: NODE_DATA(3) on node 0 (XEN) DOM0: [ 0.000000] Zone ranges: (XEN) DOM0: [ 0.000000] DMA32 [mem 0x0000000020000000-0x00000000ffffffff] (XEN) DOM0: [ 0.000000] Normal [mem 0x0000000100000000-0x000000239be02fff] (XEN) DOM0: [ 0.000000] Movable zone start for each node (XEN) DOM0: [ 0.000000] Early memory node ranges (XEN) DOM0: [ 0.000000] node 0: [mem 0x0000000020000000-0x0000000027ffffff] (XEN) DOM0: [ 0.000000] node 0: [mem 0x0000000030fd0000-0x0000000030fdffff] (XEN) DOM0: [ 0.000000] node 0: [mem 0x0000000031070000-0x000000003108ffff] (XEN) DOM0: [ 0.000000] node 0: [mem 0x0000000039470000-0x000000003948ffff] (XEN) DOM0: [ 0.000000] node 0: [mem 0x00000000394a0000-0x000000003955ffff] (XEN) DOM0: [ 0.000000] node 0: [mem 0x0000000039620000-0x000000003963ffff] (XEN) DOM0: [ 0.000000] node 0: [mem 0x0000002220000000-0x0000002317ffffff] (XEN) DOM0: [ 0.000000] node 0: [mem 0x000000239be00000-0x000000239be02fff] (XEN) DOM0: [ 0.000000] Initmem setup node 0 [mem 0x0000000020000000-0x000000239be02fff] (XEN) DOM0: [ 0.000000] Could not find start_pfn for node 1 (XEN) DOM0: [ 0.000000] Initmem setup node 1 [mem 0x0000000000000000-0x0000000000000000] (XEN) DOM0: [ 0.000000] Could not find start_pfn for node 2 (XEN) DOM0: [ 0.000000] Initmem setup node 2 [mem 0x0000000000000000-0x0000000000000000] (XEN) DOM0: [ 0.000000] Could not find start_pfn for node 3 (XEN) DOM0: [ 
0.000000] Initmem setup node 3 [mem 0x0000000000000000-0x0000000000000000] (XEN) DOM0: [ 0.000000] psci: probing for conduit method from ACPI. (XEN) DOM0: [ 0.000000] psci: PSCIv1.1 detected in firmware. (XEN) DOM0: [ 0.000000] psci: Using standard PSCI v0.2 function IDs (XEN) DOM0: [ 0.000000] psci: Trusted OS migration not required (XEN) DOM0: [ 0.000000] psci: SMC Calling Convention v1.1 (XEN) DOM0: [ 0.000000] random: get_random_bytes called from start_kernel+0xb0/0x420 with crng_init=0 (XEN) DOM0: [ 0.000000] percpu: Embedded 23 pages/cpu @(____ptrval____) s56448 r8192 d29568 u94208 (XEN) DOM0: [ 0.000000] Detected VIPT I-cache on CPU0 (XEN) DOM0: [ 0.000000] CPU features: detected: Kernel page table isolation (KPTI) (XEN) DOM0: [ 0.000000] CPU features: detected: Hardware dirty bit management (XEN) DOM0: [ 0.000000] Built 4 zonelists, mobility grouping on. Total pages: 1032493 (XEN) DOM0: [ 0.000000] Policy zone: Normal (XEN) DOM0: [ 0.000000] Kernel command line: rdinit=/init console=hvc0 earlycon acpi=force noinitrd root=/dev/nvme0n1p1 rw (XEN) DOM0: [ 0.000000] log_buf_len individual max cpu contribution: 4096 bytes (XEN) DOM0: [ 0.000000] log_buf_len total cpu_extra contributions: 389120 bytes (XEN) DOM0: [ 0.000000] log_buf_len min size: 131072 bytes (XEN) DOM0: [ 0.000000] log_buf_len: 524288 bytes (XEN) DOM0: [ 0.000000] early log buf free: 117564(89%) (XEN) DOM0: [ 0.000000] software IO TLB [mem 0x23e00000-0x27e00000] (64MB) mapped at [(____ptrval____)-(____ptrval____)] (XEN) DOM0: [ 0.000000] Memory: 3981840K/4195532K available (13244K kernel code, 1510K rwdata, 5836K rodata, 1216K init, 456K bss, 2136 (XEN) DOM0: 92K reserved, 0K cma-reserved) (XEN) DOM0: [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=96, Nodes=4 (XEN) DOM0: [ 0.000000] Preemptible hierarchical RCU implementation. (XEN) DOM0: [ 0.000000] RCU restricting CPUs from NR_CPUS=128 to nr_cpu_ids=96. (XEN) DOM0: [ 0.000000] Tasks RCU enabled. 
(XEN) DOM0: [ 0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=96 (XEN) DOM0: [ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0 (XEN) DOM0: [ 0.000000] GICv3: Distributor has no Range Selector support (XEN) DOM0: [ 0.000000] gic_iterate_rdists: nr_redist_regions 96 gic_data.redist_stride 0x0 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [0] gicr read ptr @ffff000009900000, phys_base @0x00000000aa100000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [1] gicr read ptr @ffff000009980000, phys_base @0x00000000aa140000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [2] gicr read ptr @ffff000009a00000, phys_base @0x00000000aa180000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [3] gicr read ptr @ffff000009a80000, phys_base @0x00000000aa1c0000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [4] gicr read ptr @ffff000009b00000, phys_base @0x00000000aa200000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [5] gicr read ptr @ffff000009b80000, phys_base @0x00000000aa240000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [6] gicr read ptr @ffff000009c00000, phys_base @0x00000000aa280000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [7] gicr read ptr @ffff000009c80000, phys_base @0x00000000aa2c0000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [8] gicr read ptr @ffff000009d00000, phys_base @0x00000000aa300000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [9] gicr read ptr @ffff000009d80000, phys_base @0x00000000aa340000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [10] gicr read ptr @ffff000009e00000, phys_base @0x00000000aa380000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [11] gicr read ptr @ffff000009e80000, phys_base @0x00000000aa3c0000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [12] gicr read ptr @ffff000009f00000, phys_base @0x00000000aa400000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [13] gicr read ptr @ffff000009f80000, phys_base @0x00000000aa440000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [14] gicr read ptr @ffff00000a000000, phys_base @0x00000000aa480000 (XEN) 
DOM0: [ 0.000000] gic_iterate_rdists: [15] gicr read ptr @ffff00000a080000, phys_base @0x00000000aa4c0000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [16] gicr read ptr @ffff00000a100000, phys_base @0x00000000aa500000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [17] gicr read ptr @ffff00000a180000, phys_base @0x00000000aa540000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [18] gicr read ptr @ffff00000a200000, phys_base @0x00000000aa580000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [19] gicr read ptr @ffff00000a280000, phys_base @0x00000000aa5c0000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [20] gicr read ptr @ffff00000a300000, phys_base @0x00000000aa600000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [21] gicr read ptr @ffff00000a380000, phys_base @0x00000000aa640000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [22] gicr read ptr @ffff00000a400000, phys_base @0x00000000aa680000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [23] gicr read ptr @ffff00000a480000, phys_base @0x00000000aa6c0000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [24] gicr read ptr @ffff00000a500000, phys_base @0x00000000ae100000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [25] gicr read ptr @ffff00000a580000, phys_base @0x00000000ae140000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [26] gicr read ptr @ffff00000a600000, phys_base @0x00000000ae180000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [27] gicr read ptr @ffff00000a680000, phys_base @0x00000000ae1c0000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [28] gicr read ptr @ffff00000a700000, phys_base @0x00000000ae200000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [29] gicr read ptr @ffff00000a780000, phys_base @0x00000000ae240000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [30] gicr read ptr @ffff00000a800000, phys_base @0x00000000ae280000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [31] gicr read ptr @ffff00000a880000, phys_base @0x00000000ae2c0000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [32] gicr read ptr @ffff00000a900000, phys_base 
@0x00000000ae300000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [33] gicr read ptr @ffff00000a980000, phys_base @0x00000000ae340000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [34] gicr read ptr @ffff00000aa00000, phys_base @0x00000000ae380000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [35] gicr read ptr @ffff00000aa80000, phys_base @0x00000000ae3c0000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [36] gicr read ptr @ffff00000ab00000, phys_base @0x00000000ae400000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [37] gicr read ptr @ffff00000ab80000, phys_base @0x00000000ae440000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [38] gicr read ptr @ffff00000ac00000, phys_base @0x00000000ae480000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [39] gicr read ptr @ffff00000ac80000, phys_base @0x00000000ae4c0000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [40] gicr read ptr @ffff00000ad00000, phys_base @0x00000000ae500000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [41] gicr read ptr @ffff00000ad80000, phys_base @0x00000000ae540000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [42] gicr read ptr @ffff00000ae00000, phys_base @0x00000000ae580000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [43] gicr read ptr @ffff00000ae80000, phys_base @0x00000000ae5c0000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [44] gicr read ptr @ffff00000af00000, phys_base @0x00000000ae600000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [45] gicr read ptr @ffff00000af80000, phys_base @0x00000000ae640000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [46] gicr read ptr @ffff00000b000000, phys_base @0x00000000ae680000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [47] gicr read ptr @ffff00000b080000, phys_base @0x00000000ae6c0000 (XEN) DOM0: [ 0.000000] gic_iterate_rdists: [48] gicr read ptr @ffff00000b100000, phys_base @0x00004000aa100000 (XEN) XEN: get_vcpu_from_rdist: vcpu_id 96 d->max_vcpus 96 region->first_cpu 96 region->base 00004000aa100000 (XEN) traps.c:2007:d0v0 HSR=0x93800004 pc=0xffff00000841af70 
gva=0xffff00000b10ffe8 gpa=0x004000aa10ffe8 (XEN) DOM0: [ 0.000000] Unhandled fault at 0xffff00000b10ffe8 (XEN) DOM0: [ 0.000000] Mem abort info: (XEN) DOM0: [ 0.000000] ESR = 0x96000000 (XEN) DOM0: [ 0.000000] Exception class = DABT (current EL), IL = 32 bits (XEN) DOM0: [ 0.000000] SET = 0, FnV = 0 (XEN) DOM0: [ 0.000000] EA = 0, S1PTW = 0 (XEN) DOM0: [ 0.000000] Data abort info: (XEN) DOM0: [ 0.000000] ISV = 0, ISS = 0x00000000 (XEN) DOM0: [ 0.000000] CM = 0, WnR = 0 (XEN) DOM0: [ 0.000000] swapper pgtable: 4k pages, 48-bit VAs, pgdp = (____ptrval____) (XEN) DOM0: [ 0.000000] [ffff00000b10ffe8] pgd=0000002317ffe003, pud=0000002317ffd003, pmd=000000230fdc2003, pte=00e84000aa10ff07 (XEN) DOM0: [ 0.000000] Internal error: ttbr address size fault: 96000000 [#1] PREEMPT SMP (XEN) DOM0: [ 0.000000] Modules linked in: (XEN) DOM0: [ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.18.0-rc1-220038-gecb377e-dirty #255 (XEN) DOM0: [ 0.000000] pstate: 60000085 (nZCv daIf -PAN -UAO) (XEN) DOM0: [ 0.000000] pc : gic_iterate_rdists+0xb4/0x150 (XEN) DOM0: [ 0.000000] lr : gic_iterate_rdists+0xac/0x150 (XEN) DOM0: [ 0.000000] sp : ffff000009463d00 (XEN) DOM0: [ 0.000000] x29: ffff000009463d00 x28: 0000000000000001 (XEN) DOM0: [ 0.000000] x27: ffff000009474af0 x26: ffff00000919d5f0 (XEN) DOM0: [ 0.000000] x25: 0000000000000018 x24: 0000000000000030 (XEN) DOM0: [ 0.000000] x23: 0000000000000480 x22: ffff00000841ae48 (XEN) DOM0: [ 0.000000] x21: ffff000009474000 x20: ffff00000b100000 (XEN) DOM0: [ 0.000000] x19: ffff000008dd33c0 x18: 00000000007fff00 (XEN) DOM0: [ 0.000000] x17: 0000000000000020 x16: 000000000000000a (XEN) DOM0: [ 0.000000] x15: ffffffffffffffff x14: 3078304020657361 (XEN) DOM0: [ 0.000000] x13: 625f73796870202c x12: 3030303030316230 (XEN) DOM0: [ 0.000000] x11: 3030303066666666 x10: 4020727470206461 (XEN) DOM0: [ 0.000000] x9 : c117cabcb346e200 x8 : 0000000000000101 (XEN) DOM0: [ 0.000000] x7 : 736964725f657461 x6 : ffff0000095e8746 (XEN) DOM0: [ 0.000000] x5 
: 0000000000000000 x4 : 0000000000000000 (XEN) DOM0: [ 0.000000] x3 : ffffffffffffffff x2 : ffff00000948a468 (XEN) DOM0: [ 0.000000] x1 : 0000000000000000 x0 : ffff00000b10ffe8 (XEN) DOM0: [ 0.000000] Process swapper/0 (pid: 0, stack limit = 0x(____ptrval____)) (XEN) DOM0: [ 0.000000] Call trace: (XEN) DOM0: [ 0.000000] gic_iterate_rdists+0xb4/0x150 (XEN) DOM0: [ 0.000000] gic_init_bases+0x180/0x328 (XEN) DOM0: [ 0.000000] gic_acpi_init+0x13c/0x27c (XEN) DOM0: [ 0.000000] acpi_match_madt+0x44/0x78 (XEN) DOM0: [ 0.000000] acpi_table_parse_entries_array+0x170/0x200 (XEN) DOM0: [ 0.000000] acpi_table_parse_entries+0x3c/0x5c (XEN) DOM0: [ 0.000000] acpi_table_parse_madt+0x24/0x2c (XEN) DOM0: [ 0.000000] __acpi_probe_device_table+0x94/0xec (XEN) DOM0: [ 0.000000] irqchip_init+0x30/0x38 (XEN) DOM0: [ 0.000000] init_IRQ+0x78/0x110 (XEN) DOM0: [ 0.000000] start_kernel+0x288/0x420 (XEN) DOM0: [ 0.000000] Code: aa1403e3 97f43400 91403e80 913fa000 (b9400000) (XEN) DOM0: [ 0.000000] ---[ end trace 118dd2a135e77f55 ]--- (XEN) DOM0: [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task! (XEN) DOM0: [ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]--- .... Thanks, Shameer > Cheers, > > -- > Julien Grall
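For readers following the log, the lookup that returns NULL just before the trap can be sketched in a few lines of C. This is a simplified stand-in rather than the Xen source: the struct `rdist_region_sketch`, the helper `rdist_gpa_to_vcpu` and the constant `GICR_FRAME_SIZE` are hypothetical names, and the 128K frame size is the GICv3 GICR size mentioned earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size of one emulated GICv3 re-distributor frame (128K). */
#define GICR_FRAME_SIZE (128u * 1024u)

/* Hypothetical, simplified view of one emulated re-distributor region. */
struct rdist_region_sketch {
    uint64_t base;          /* guest physical base of the region */
    uint64_t size;          /* size of the whole region in bytes */
    unsigned int first_cpu; /* vCPU index of the first frame in the region */
};

/*
 * Map a guest physical address inside a region to a vCPU index.
 * Returns -1 when the computed index is not a valid vCPU, which is
 * the situation where the emulation has nothing to answer with.
 */
static int rdist_gpa_to_vcpu(const struct rdist_region_sketch *region,
                             uint64_t gpa, unsigned int max_vcpus)
{
    uint64_t vcpu_id;

    /* The caller is expected to pass an address that lies in the region. */
    assert(gpa >= region->base && gpa - region->base < region->size);

    vcpu_id = region->first_cpu + (gpa - region->base) / GICR_FRAME_SIZE;

    return vcpu_id < max_vcpus ? (int)vcpu_id : -1;
}
```

With the values from the log (base 0x4000aa100000, first_cpu 96, gpa 0x4000aa10ffe8, 96 vCPUs) the offset 0xffe8 falls in the first frame of the region, so the computed index is 96 and the lookup fails.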
On 09/03/2018 07:37 PM, Shameerali Kolothum Thodi wrote: > > >> -----Original Message----- >> From: Julien Grall [mailto:julien.grall@arm.com] >> Sent: 03 September 2018 18:14 >> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>; >> xen-devel@lists.xen.org >> Cc: sstabellini@kernel.org; Linuxarm <linuxarm@huawei.com>; Andre >> Przywara <andre.przywara@arm.com> >> Subject: Re: Xen Dom0 boot failure on platform that supports ARM GICv4 >> >> >> >> On 03/09/18 17:54, Shameerali Kolothum Thodi wrote: >>> Hi Julien, >>> >>> Thanks for taking a look at this. >>> >>>> -----Original Message----- >>>> From: Julien Grall [mailto:julien.grall@arm.com] >>>> Sent: 03 September 2018 17:13 >>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>; >>>> xen-devel@lists.xen.org >>>> Cc: sstabellini@kernel.org; Linuxarm <linuxarm@huawei.com>; Andre >>>> Przywara <andre.przywara@arm.com> >>>> Subject: Re: Xen Dom0 boot failure on platform that supports ARM GICv4 >>>> >>>> >>>> >>>> On 03/09/18 15:53, Shameerali Kolothum Thodi wrote: >>>>> Hi, >>>> >>>> Hello, >>>> >>>>> I am trying to boot xen(stable-4.11) on one of our ARM64 boards which >>>>> has support for GICv4. >>>>> >>>>> But dom0(kernel 4.18) boot fails with the below trap, >>>>> >>>>> XEN) ............done. >>>>> (XEN) Std. Loglevel: All >>>>> (XEN) Guest Loglevel: All >>>>> (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch >>>>> input to Xen) >>>>> (XEN) Freed 304kB init memory. >>>>> (XEN) traps.c:2007:d0v0 HSR=0x93800004 pc=0xffff00000841af04 >>>>> gva=0xffff00000b10ffe8 gpa=0x004000aa10ffe8 >>>> >>>> Which bits of Linux is trying to access the region? >>> >>> I think it is the gic_iterate_rdists() as the offset just before this is ffe8, which >> is GICR_PIDR2 >>> >>>>> >>>>> After a bit of debugging, it looks like, the GICR size used in >>>> vgic_v3_domain_init() >>>>> is GICv4 GICR size(256K) and this upsets the first_cpu calculations. 
>>>> >>>> Can you expand what you mean by upset? What's wrong with the first_cpu >>>> calculations. >>> >>> What I meant is, since this is a GICv4, the vgic_v3_hw.regions[i]->size is set to >> 256K and >>> since first_cpu is calculated like, >>> >>> first_cpu += size /GICV3_GICR_SIZE; >>> >>> gets wrong as what I am seeing is, >>> >>> (XEN) frst_cpu 2 >>> (XEN) first_cpu 4 >>> (XEN) first_cpu 6 >>> (XEN) first_cpu 8 >>> (XEN) first_cpu 10 >>> (XEN) first_cpu 12 >>> (XEN) first_cpu 14 >>> ..... >>> (XEN) first_cpu 192 >>> >>> But the original number of CPUS are only 96. Hence I thought this is wrong. >> >> This is perfectly fine. Until recently it was not possible to know the >> number of vCPUs at domain creation. So the function is computing the >> first CPU for all the regions. > >> With the recent change, it would be possible to only compute what is >> necessary. > > Ah..alright. This was not clear to me. > >>>>> >>>>> Since dom0 gicv3 is also an emulated one, I think the size should be >>>>> restricted to use the GICv3 GICR size(128K). I have made the below >>>>> changes and is able to boot dom0 now. >>>>> >>>>> But not sure, this is the right approach to fix the issue. Please let me >>>>> know your thoughts. 
>>>>> >>>>> Thanks, >>>>> Shameer >>>>> >>>>> ---->8------------- >>>>> >>>>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c >>>>> index b2ed0f8..bf028cc 100644 >>>>> --- a/xen/arch/arm/gic-v3.c >>>>> +++ b/xen/arch/arm/gic-v3.c >>>>> @@ -1783,7 +1783,8 @@ static int __init gicv3_init(void) >>>>> reg = readl_relaxed(GICD + GICD_TYPER); >>>>> intid_bits = GICD_TYPE_ID_BITS(reg); >>>>> >>>>> - vgic_v3_setup_hw(dbase, gicv3.rdist_count, gicv3.rdist_regions, >>>> intid_bits); >>>>> + vgic_v3_setup_hw(dbase, gicv3.rdist_count, gicv3.rdist_regions, >>>>> + intid_bits, gic_dist_supports_dvis()); >>>>> gicv3_init_v2(); >>>>> >>>>> spin_lock_init(&gicv3.lock); >>>>> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c >>>>> index 4b42739..0f53d88 100644 >>>>> --- a/xen/arch/arm/vgic-v3.c >>>>> +++ b/xen/arch/arm/vgic-v3.c >>>>> @@ -59,18 +59,21 @@ static struct { >>>>> unsigned int nr_rdist_regions; >>>>> const struct rdist_region *regions; >>>>> unsigned int intid_bits; /* Number of interrupt ID bits */ >>>>> + bool dvis; >>>>> } vgic_v3_hw; >>>>> >>>>> void vgic_v3_setup_hw(paddr_t dbase, >>>>> unsigned int nr_rdist_regions, >>>>> const struct rdist_region *regions, >>>>> - unsigned int intid_bits) >>>>> + unsigned int intid_bits, >>>>> + bool dvis) >>>>> { >>>>> vgic_v3_hw.enabled = true; >>>>> vgic_v3_hw.dbase = dbase; >>>>> vgic_v3_hw.nr_rdist_regions = nr_rdist_regions; >>>>> vgic_v3_hw.regions = regions; >>>>> vgic_v3_hw.intid_bits = intid_bits; >>>>> + vgic_v3_hw.dvis = dvis; >>>>> } >>>>> >>>>> static struct vcpu *vgic_v3_irouter_to_vcpu(struct domain *d, uint64_t >>>> irouter) >>>>> @@ -1673,6 +1676,9 @@ static int vgic_v3_domain_init(struct domain *d) >>>>> { >>>>> paddr_t size = vgic_v3_hw.regions[i].size; >>>>> >>>>> + if (vgic_v3_hw.dvis && (size == GICV4_GICR_SIZE)) >>>>> + size = GICV3_GICR_SIZE; >>>> >>>> vgic_v3_hw.regions is describing the regions in the layout that could >>>> hold re-distributor. 
You can have multiple re-distributor per region.
>>>>
>>>> The variable size holds the size of the region, not the size of the
>>>> re-distributor.
>>>>
>>>> I am not sure to understand why you want to restrict the size of the
>>>> region here because GICV4_GICR_SIZE is a multiple of GICV3_GICR_SIZE. So
>>>> you should be able to fit 2 re-distributors per region.
>>>>
>>>> It looks like to me the re-distributor regions are not reported
>>>> correctly or Dom0 thinks it is on GICv4. Can you provide a bit more
>>>> details on the function that cause the crash and some logs from Linux?
>>>
>>> Ok. I added few prints along the vgic mmio read path and this is what happens
>>> before the trap.
>>>
>>> vgic_v3_rdistr_mmio_read()
>>>     get_vcpu_from_rdist() -->returns NULL here for 0x004000aa10ffe8 which
>>>                              actually belongs to cpu id 48 as per the log below
>>
>> Do you mean region id 48? So if I get it correctly, you are trying to
>> access re-distributor for vCPU ID 96.
>
> Hmm..I was under the impression that there is a one to one map here.
> And you are right, it is indeed vcpu id 96 which is invalid.

>> [...]
>>
>
>>> If I remember correctly there was no logs from Dom0, but I need to double
>>> check the Dom0 cmdline option to see earlycon was set.
>>>
>>> I could also enable/add any prints that you think will help and rerun. Please
>>> let me know
>>
>> I may have an idea what is happening. As we populate more regions than
>> necessary, it is possible that Linux is trying to access them. Would it
>> be possible to add some debug in the Linux function gic_iterate_rdists
>> to know what the kernel is trying to read?
>
> Ok, enabled earlycon for Dom0. Please find the log below,

Thank you for the log. I now have an idea what is going wrong. The function
gic_iterate_rdists can be used to go through all the re-distributors (for
instance to check whether vLPIs are available).

Because some of the regions are empty (i.e. not emulated), you end up trapping.
Your patch solves the problem by making regions not empty in the case of GICv4. But I think this can also happen when the number of vCPUs for Dom0 get restricted. Can you have a try at the patch below? I haven't tested on ACPI. If that works for you, I will add the DT case, clean it up and send it. Cheers, >From c1fe63fae976c9d4bf17551d141748c04febab37 Mon Sep 17 00:00:00 2001 From: Julien Grall <julien.grall@arm.com> Date: Tue, 4 Sep 2018 12:10:39 +0100 Subject: [PATCH] xen/arm: gic-v3: Don't create empty re-distributor regions Reported-by: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Julien Grall <julien.grall@arm.com> --- xen/arch/arm/gic-v3.c | 2 +- xen/arch/arm/vgic-v3.c | 159 ++++++++++++++++++++++++++++--------------------- 2 files changed, 92 insertions(+), 69 deletions(-) diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c index b2ed0f8b55..eef6776064 100644 --- a/xen/arch/arm/gic-v3.c +++ b/xen/arch/arm/gic-v3.c @@ -1503,7 +1503,7 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset) /* Add Generic Redistributor */ size = sizeof(struct acpi_madt_generic_redistributor); - for ( i = 0; i < gicv3.rdist_count; i++ ) + for ( i = 0; i < d->arch.vgic.nr_regions; i++ ) { gicr = (struct acpi_madt_generic_redistributor *)(base_ptr + table_len); gicr->header.type = ACPI_MADT_TYPE_GENERIC_REDISTRIBUTOR; diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c index 4b42739a52..06a9972421 100644 --- a/xen/arch/arm/vgic-v3.c +++ b/xen/arch/arm/vgic-v3.c @@ -1573,9 +1573,91 @@ static const struct mmio_handler_ops vgic_distr_mmio_handler = { .write = vgic_v3_distr_mmio_write, }; +static inline unsigned int vgic_v3_rdist_count(struct domain *d) +{ + /* + * Normally there is only one GICv3 redistributor region. + * The GICv3 DT binding provisions for multiple regions, since there are + * platforms out there which need those (multi-socket systems). 
+ * For Dom0 we have to live with the MMIO layout the hardware provides, + * so we have to copy the multiple regions - as the first region may not + * provide enough space to hold all redistributors we need. + * However DomU get a constructed memory map, so we can go with + * the architected single redistributor region. + */ + return is_hardware_domain(d) ? vgic_v3_hw.nr_rdist_regions : + GUEST_GICV3_RDIST_REGIONS; +} + +static int vgic_v3_initialize_rdists(struct domain *d) +{ + struct vgic_rdist_region *rdist_regions; + int rdist_count, i; + + /* Allocate memory for Re-distributor regions */ + rdist_count = vgic_v3_rdist_count(d); + + rdist_regions = xzalloc_array(struct vgic_rdist_region, rdist_count); + if ( !rdist_regions ) + return -ENOMEM; + + d->arch.vgic.nr_regions = rdist_count; + d->arch.vgic.rdist_regions = rdist_regions; + + if ( is_hardware_domain(d) ) + { + unsigned int first_cpu = 0; + + for ( i = 0; i < vgic_v3_hw.nr_rdist_regions; i++ ) + { + paddr_t size = vgic_v3_hw.regions[i].size; + + d->arch.vgic.rdist_regions[i].base = vgic_v3_hw.regions[i].base; + d->arch.vgic.rdist_regions[i].size = size; + + /* Set the first CPU handled by this region */ + d->arch.vgic.rdist_regions[i].first_cpu = first_cpu; + + first_cpu += size / GICV3_GICR_SIZE; + + if ( first_cpu >= d->max_vcpus ) + break; + } + + /* Update with the actual number of regions used */ + d->arch.vgic.nr_regions = i + 1; + } + else + { + /* A single Re-distributor region is mapped for the guest. */ + BUILD_BUG_ON(GUEST_GICV3_RDIST_REGIONS != 1); + + /* The first redistributor should contain enough space for all CPUs */ + BUILD_BUG_ON((GUEST_GICV3_GICR0_SIZE / GICV3_GICR_SIZE) < MAX_VIRT_CPUS); + d->arch.vgic.rdist_regions[0].base = GUEST_GICV3_GICR0_BASE; + d->arch.vgic.rdist_regions[0].size = GUEST_GICV3_GICR0_SIZE; + d->arch.vgic.rdist_regions[0].first_cpu = 0; + } + + /* + * Register mmio handler per contiguous region occupied by the + * redistributors. 
The handler will take care to choose which + * redistributor is targeted. + */ + for ( i = 0; i < d->arch.vgic.nr_regions; i++ ) + { + struct vgic_rdist_region *region = &d->arch.vgic.rdist_regions[i]; + + register_mmio_handler(d, &vgic_rdistr_mmio_handler, + region->base, region->size, region); + } + + return 0; +} + static int vgic_v3_vcpu_init(struct vcpu *v) { - int i; + int i, rc; paddr_t rdist_base; struct vgic_rdist_region *region; unsigned int last_cpu; @@ -1583,6 +1665,13 @@ static int vgic_v3_vcpu_init(struct vcpu *v) /* Convenient alias */ struct domain *d = v->domain; + if ( v->vcpu_id == 0 ) + { + rc = vgic_v3_initialize_rdists(v->domain); + if ( rc ) + return rc; + } + /* * Find the region where the re-distributor lives. For this purpose, * we look one region ahead as we have only the first CPU in hand. @@ -1625,36 +1714,9 @@ static int vgic_v3_vcpu_init(struct vcpu *v) return 0; } -static inline unsigned int vgic_v3_rdist_count(struct domain *d) -{ - /* - * Normally there is only one GICv3 redistributor region. - * The GICv3 DT binding provisions for multiple regions, since there are - * platforms out there which need those (multi-socket systems). - * For Dom0 we have to live with the MMIO layout the hardware provides, - * so we have to copy the multiple regions - as the first region may not - * provide enough space to hold all redistributors we need. - * However DomU get a constructed memory map, so we can go with - * the architected single redistributor region. - */ - return is_hardware_domain(d) ? 
vgic_v3_hw.nr_rdist_regions : - GUEST_GICV3_RDIST_REGIONS; -} - static int vgic_v3_domain_init(struct domain *d) { - struct vgic_rdist_region *rdist_regions; - int rdist_count, i, ret; - - /* Allocate memory for Re-distributor regions */ - rdist_count = vgic_v3_rdist_count(d); - - rdist_regions = xzalloc_array(struct vgic_rdist_region, rdist_count); - if ( !rdist_regions ) - return -ENOMEM; - - d->arch.vgic.nr_regions = rdist_count; - d->arch.vgic.rdist_regions = rdist_regions; + int ret; rwlock_init(&d->arch.vgic.pend_lpi_tree_lock); radix_tree_init(&d->arch.vgic.pend_lpi_tree); @@ -1665,38 +1727,12 @@ static int vgic_v3_domain_init(struct domain *d) */ if ( is_hardware_domain(d) ) { - unsigned int first_cpu = 0; - d->arch.vgic.dbase = vgic_v3_hw.dbase; - - for ( i = 0; i < vgic_v3_hw.nr_rdist_regions; i++ ) - { - paddr_t size = vgic_v3_hw.regions[i].size; - - d->arch.vgic.rdist_regions[i].base = vgic_v3_hw.regions[i].base; - d->arch.vgic.rdist_regions[i].size = size; - - /* Set the first CPU handled by this region */ - d->arch.vgic.rdist_regions[i].first_cpu = first_cpu; - - first_cpu += size / GICV3_GICR_SIZE; - } - d->arch.vgic.intid_bits = vgic_v3_hw.intid_bits; } else { d->arch.vgic.dbase = GUEST_GICV3_GICD_BASE; - - /* A single Re-distributor region is mapped for the guest. */ - BUILD_BUG_ON(GUEST_GICV3_RDIST_REGIONS != 1); - - /* The first redistributor should contain enough space for all CPUs */ - BUILD_BUG_ON((GUEST_GICV3_GICR0_SIZE / GICV3_GICR_SIZE) < MAX_VIRT_CPUS); - d->arch.vgic.rdist_regions[0].base = GUEST_GICV3_GICR0_BASE; - d->arch.vgic.rdist_regions[0].size = GUEST_GICV3_GICR0_SIZE; - d->arch.vgic.rdist_regions[0].first_cpu = 0; - /* * TODO: only SPIs for now, adjust this when guests need LPIs. 
* Please note that this value just describes the bits required @@ -1715,19 +1751,6 @@ static int vgic_v3_domain_init(struct domain *d) register_mmio_handler(d, &vgic_distr_mmio_handler, d->arch.vgic.dbase, SZ_64K, NULL); - /* - * Register mmio handler per contiguous region occupied by the - * redistributors. The handler will take care to choose which - * redistributor is targeted. - */ - for ( i = 0; i < d->arch.vgic.nr_regions; i++ ) - { - struct vgic_rdist_region *region = &d->arch.vgic.rdist_regions[i]; - - register_mmio_handler(d, &vgic_rdistr_mmio_handler, - region->base, region->size, region); - } - d->arch.vgic.ctlr = VGICD_CTLR_DEFAULT; return 0;
> -----Original Message-----
> From: Julien Grall [mailto:julien.grall@arm.com]
> Sent: 04 September 2018 12:22
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> xen-devel@lists.xen.org
> Cc: sstabellini@kernel.org; Linuxarm <linuxarm@huawei.com>; Andre
> Przywara <andre.przywara@arm.com>
> Subject: Re: Xen Dom0 boot failure on platform that supports ARM GICv4

[...]

> >> I may have an idea what is happening. As we populate more regions than
> >> necessary, it is possible that Linux is trying to access them. Would it
> >> be possible to add some debug in the Linux function gic_iterate_rdists
> >> to know what the kernel is trying to read?
> >
> > Ok, enabled earlycon for Dom0. Please find the log below,
>
> Thank you for the log. I now have an idea what is going wrong. The function
> gic_iterate_rdists can be used to go through all the re-distributors (for
> instance to check whether vLPIs are available).
>
> Because some of the regions are empty (i.e. not emulated), you end up
> trapping. Your patch solves the problem by making the regions not empty in
> the case of GICv4. But I think this can also happen when the number of vCPUs
> for Dom0 gets restricted.

Yes, that’s right. I didn’t consider that.

> Can you have a try at the patch below? I haven't tested on ACPI.
>
> If that works for you, I will add the DT case, clean it up and send it.

Thanks for the patch. It works. Please CC me when you send the revised one;
I can retest and provide a Tested-by.

Cheers,
Shameer
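To make the "empty region" point concrete, here is a rough stand-alone model in C, not Xen code. It assumes that Dom0 sees a redistributor region as a run of GICv3-sized (128K) frames and that only the first nr_vcpus frames are emulated. The helper name unbacked_frames() is hypothetical; only the size macros are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_64K          0x10000ULL
#define GICV3_GICR_SIZE (2 * SZ_64K)  /* RD_base + SGI_base frames */
#define GICV4_GICR_SIZE (4 * SZ_64K)  /* adds the VLPI_base and reserved frames */

/*
 * Hypothetical helper (not part of Xen): how many GICv3-sized frames in a
 * region are left without a vCPU behind them. Under the assumption above,
 * a guest walking the whole region would touch these unbacked frames.
 */
static unsigned int unbacked_frames(uint64_t region_size, unsigned int nr_vcpus)
{
    uint64_t frames;

    /* The model only makes sense for regions that are whole frames. */
    assert(region_size % GICV3_GICR_SIZE == 0);

    frames = region_size / GICV3_GICR_SIZE;

    return frames > nr_vcpus ? (unsigned int)(frames - nr_vcpus) : 0;
}
```

In this model a region sized for 8 GICv4 redistributors (8 * GICV4_GICR_SIZE) with 8 vCPUs leaves 8 frames unbacked, and so does a plain GICv3 region sized for 16 CPUs when Dom0 is restricted to 8 vCPUs, which is the second case raised above.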
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index b2ed0f8..bf028cc 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1783,7 +1783,8 @@ static int __init gicv3_init(void)
     reg = readl_relaxed(GICD + GICD_TYPER);
     intid_bits = GICD_TYPE_ID_BITS(reg);
 
-    vgic_v3_setup_hw(dbase, gicv3.rdist_count, gicv3.rdist_regions, intid_bits);
+    vgic_v3_setup_hw(dbase, gicv3.rdist_count, gicv3.rdist_regions,
+                     intid_bits, gic_dist_supports_dvis());
     gicv3_init_v2();
 
     spin_lock_init(&gicv3.lock);
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 4b42739..0f53d88 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -59,18 +59,21 @@ static struct {
     unsigned int nr_rdist_regions;
     const struct rdist_region *regions;
     unsigned int intid_bits; /* Number of interrupt ID bits */
+    bool dvis;
 } vgic_v3_hw;
 
 void vgic_v3_setup_hw(paddr_t dbase,
                       unsigned int nr_rdist_regions,
                       const struct rdist_region *regions,
-                      unsigned int intid_bits)
+                      unsigned int intid_bits,
+                      bool dvis)
 {
     vgic_v3_hw.enabled = true;
     vgic_v3_hw.dbase = dbase;
     vgic_v3_hw.nr_rdist_regions = nr_rdist_regions;
     vgic_v3_hw.regions = regions;
     vgic_v3_hw.intid_bits = intid_bits;
+    vgic_v3_hw.dvis = dvis;
 }
 
 static struct vcpu *vgic_v3_irouter_to_vcpu(struct domain *d, uint64_t irouter)
@@ -1673,6 +1676,9 @@ static int vgic_v3_domain_init(struct domain *d)
         {
             paddr_t size = vgic_v3_hw.regions[i].size;
 
+            if (vgic_v3_hw.dvis && (size == GICV4_GICR_SIZE))
+                size = GICV3_GICR_SIZE;
+
             d->arch.vgic.rdist_regions[i].base = vgic_v3_hw.regions[i].base;
             d->arch.vgic.rdist_regions[i].size = size;
 
@@ -1680,6 +1686,7 @@ static int vgic_v3_domain_init(struct domain *d)
             d->arch.vgic.rdist_regions[i].first_cpu = first_cpu;
 
             first_cpu += size / GICV3_GICR_SIZE;
+
         }
 
         d->arch.vgic.intid_bits = vgic_v3_hw.intid_bits;
diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c
index a35449b..dabd5f6 100644
--- a/xen/arch/arm/vgic/vgic.c
+++ b/xen/arch/arm/vgic/vgic.c
@@ -979,7 +979,8 @@ unsigned int vgic_max_vcpus(const struct domain *d)
 void vgic_v3_setup_hw(paddr_t dbase,
                       unsigned int nr_rdist_regions,
                       const struct rdist_region *regions,
-                      unsigned int intid_bits)
+                      unsigned int intid_bits,
+                      bool dvis)
 {
     panic("New VGIC implementation does not yet support GICv3.");
 }
diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
index 10a2aee..de1facf 100644
--- a/xen/include/asm-arm/gic_v3_defs.h
+++ b/xen/include/asm-arm/gic_v3_defs.h
@@ -73,6 +73,8 @@
 /* Two pages for the RD_base and SGI_base register frame. */
 #define GICV3_GICR_SIZE (2 * SZ_64K)
 
+#define GICV4_GICR_SIZE (4 * SZ_64K)
+
 #define GICR_CTLR (0x0000)
 #define GICR_IIDR (0x0004)
 #define GICR_TYPER (0x0008)
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 2a58ea3..3890ad8 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -364,7 +364,8 @@ struct rdist_region;
 void vgic_v3_setup_hw(paddr_t dbase,
                       unsigned int nr_rdist_regions,
                       const struct rdist_region *regions,
-                      unsigned int intid_bits);
+                      unsigned int intid_bits,
+                      bool dvis);
 #endif
 
 #endif /* __ASM_ARM_VGIC_H__ */
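For readers following the first_cpu discussion, the loop touched by this patch can be modelled outside Xen roughly as below. This is a simplified sketch: assign_first_cpu() and its parameters are hypothetical names, and the example assumes each hardware region holds exactly one GICv4 redistributor (256K), which is the only case in which the size check in the patch has any effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_64K          0x10000ULL
#define GICV3_GICR_SIZE (2 * SZ_64K)
#define GICV4_GICR_SIZE (4 * SZ_64K)

/*
 * Hypothetical stand-alone model of the first_cpu loop in
 * vgic_v3_domain_init(). With clamp_gicv4 set it mirrors the size check
 * added by the patch; without it, it mirrors the original behaviour.
 */
static void assign_first_cpu(const uint64_t *region_sizes, size_t nr_regions,
                             bool clamp_gicv4, unsigned int *first_cpu_out)
{
    unsigned int first_cpu = 0;

    assert(region_sizes != NULL && first_cpu_out != NULL);

    for ( size_t i = 0; i < nr_regions; i++ )
    {
        uint64_t size = region_sizes[i];

        /* Same condition as the patch: only an exact 256K region is clamped. */
        if ( clamp_gicv4 && size == GICV4_GICR_SIZE )
            size = GICV3_GICR_SIZE;

        /* First CPU handled by this region. */
        first_cpu_out[i] = first_cpu;

        first_cpu += (unsigned int)(size / GICV3_GICR_SIZE);
    }
}
```

With three 256K regions, the unclamped loop assigns first_cpu values 0, 2, 4, while the clamped one assigns 0, 1, 2. A region that holds several redistributors is never equal to GICV4_GICR_SIZE, so the check does not apply to it, which matches the review comment that size describes the region rather than a single redistributor.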