KVM: arm/arm64: vgic: add init entry to VGIC KVM device

Message ID 1417541251-20761-1-git-send-email-eric.auger@linaro.org
State New
Headers show

Commit Message

Auger Eric Dec. 2, 2014, 5:27 p.m.
Since the advent of dynamic initialization of VGIC, this latter is
initialized very late, on the first vcpu run. This initialization
could be initiated much earlier by the user, as soon as it has
provided the requested dimensioning parameters:
- number of IRQs and number of vCPUs,
- DIST and CPU interface base address.

One motivation behind being able to initialize the VGIC sooner is
related to the setup of IRQ injection in VFIO use case. The VFIO
signaling, especially when used along with irqfd must be set *after*
vgic initialization to prevent any virtual IRQ injection before
VGIC initialization. If virtual IRQ injection occurs before the VGIC
init, the IRQ cannot be injected and subsequent injection is blocked
due to VFIO completion mechanism (unmask/mask or forward/unforward).

This patch adds a new entry to the VGIC KVM device that allows
the user to manually request the VGIC init:
- a new KVM_DEV_ARM_VGIC_GRP_CTRL group is introduced.
- Its first attribute is KVM_DEV_ARM_VGIC_CTRL_INIT

The rationale behind introducing a group is to be able to add other
controls later on, if needed.

Obviously, as soon as the init is done, the dimensioning parameters
cannot be changed.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
---
 Documentation/virtual/kvm/devices/arm-vgic.txt | 11 +++++++++++
 arch/arm/include/uapi/asm/kvm.h                |  2 ++
 arch/arm64/include/uapi/asm/kvm.h              |  2 ++
 virt/kvm/arm/vgic.c                            | 14 +++++++++++++-
 4 files changed, 28 insertions(+), 1 deletion(-)

Comments

Peter Maydell Dec. 2, 2014, 5:50 p.m. | #1
On 2 December 2014 at 17:27, Eric Auger <eric.auger@linaro.org> wrote:
> Since the advent of dynamic initialization of VGIC, this latter is
> initialized very late, on the first vcpu run. This initialization
> could be initiated much earlier by the user, as soon as it has
> provided the requested dimensioning parameters:
> - number of IRQs and number of vCPUs,
> - DIST and CPU interface base address.
>
> One motivation behind being able to initialize the VGIC sooner is
> related to the setup of IRQ injection in VFIO use case. The VFIO
> signaling, especially when used along with irqfd must be set *after*
> vgic initialization to prevent any virtual IRQ injection before
> VGIC initialization. If virtual IRQ injection occurs before the VGIC
> init, the IRQ cannot be injected and subsequent injection is blocked
> due to VFIO completion mechanism (unmask/mask or forward/unforward).

This implies that you're potentially injecting virtual IRQs
(and changing the state of the VGIC) before we actually
start running the VM (ie before userspace calls KVM_RUN).
Is that right? It seems odd, but maybe vfio works that way?

-- PMM
Auger Eric Dec. 2, 2014, 5:54 p.m. | #2
On 12/02/2014 06:50 PM, Peter Maydell wrote:
> On 2 December 2014 at 17:27, Eric Auger <eric.auger@linaro.org> wrote:
>> Since the advent of dynamic initialization of VGIC, this latter is
>> initialized very late, on the first vcpu run. This initialization
>> could be initiated much earlier by the user, as soon as it has
>> provided the requested dimensioning parameters:
>> - number of IRQs and number of vCPUs,
>> - DIST and CPU interface base address.
>>
>> One motivation behind being able to initialize the VGIC sooner is
>> related to the setup of IRQ injection in VFIO use case. The VFIO
>> signaling, especially when used along with irqfd must be set *after*
>> vgic initialization to prevent any virtual IRQ injection before
>> VGIC initialization. If virtual IRQ injection occurs before the VGIC
>> init, the IRQ cannot be injected and subsequent injection is blocked
>> due to VFIO completion mechanism (unmask/mask or forward/unforward).
> 
> This implies that you're potentially injecting virtual IRQs
> (and changing the state of the VGIC) before we actually
> start running the VM (ie before userspace calls KVM_RUN).
> Is that right? It seems odd, but maybe vfio works that way?

Hi Peter,

as soon as VFIO signaling is set up (the device IRQ index is linked to
an eventfd, the physical IRQ VFIO handler is installed and the physical
IRQ is enabled at interrupt controller level), virtual IRQs are likely
to be injected. With current QEMU code, we setup this VFIO signaling
*before* the vgic readiness (either on machine init done or reset
notifier) and we face that issue of early injection. QEMU related
patches to follow ...

Best Regards

Eric
> 
> -- PMM
>
Christoffer Dall Dec. 3, 2014, 10:33 a.m. | #3
On Tue, Dec 02, 2014 at 05:50:00PM +0000, Peter Maydell wrote:
> On 2 December 2014 at 17:27, Eric Auger <eric.auger@linaro.org> wrote:
> > Since the advent of dynamic initialization of VGIC, this latter is
> > initialized very late, on the first vcpu run. This initialization
> > could be initiated much earlier by the user, as soon as it has
> > provided the requested dimensioning parameters:
> > - number of IRQs and number of vCPUs,
> > - DIST and CPU interface base address.
> >
> > One motivation behind being able to initialize the VGIC sooner is
> > related to the setup of IRQ injection in VFIO use case. The VFIO
> > signaling, especially when used along with irqfd must be set *after*
> > vgic initialization to prevent any virtual IRQ injection before
> > VGIC initialization. If virtual IRQ injection occurs before the VGIC
> > init, the IRQ cannot be injected and subsequent injection is blocked
> > due to VFIO completion mechanism (unmask/mask or forward/unforward).
> 
> This implies that you're potentially injecting virtual IRQs
> (and changing the state of the VGIC) before we actually
> start running the VM (ie before userspace calls KVM_RUN).
> Is that right? It seems odd, but maybe vfio works that way?
> 
Yeah, I can't think of a cleaner way to do this.  VFIO doesn't know
anything about KVM or whether a machine is running or not.  QEMU has to
configure all this before starting a VM (wiring up IRQs after the VM is
running is even more weird imho, when would you even do that?) so
interrupts from the real hardware are bound to hit VFIO just
before/during/after VCPUs are started, and VFIO doesn't have any caching
mechanism for this state, it really has to go to the consumer of the
interrupt, which is KVM in the case of forwarded interrupts.

Did I miss something obvious here?

-Christoffer
Christoffer Dall Dec. 3, 2014, 10:45 a.m. | #4
On Tue, Dec 02, 2014 at 06:27:31PM +0100, Eric Auger wrote:
> Since the advent of dynamic initialization of VGIC, this latter is
> initialized very late, on the first vcpu run. This initialization
> could be initiated much earlier by the user, as soon as it has
> provided the requested dimensioning parameters:
> - number of IRQs and number of vCPUs,
> - DIST and CPU interface base address.
> 
> One motivation behind being able to initialize the VGIC sooner is
> related to the setup of IRQ injection in VFIO use case. The VFIO
> signaling, especially when used along with irqfd must be set *after*
> vgic initialization to prevent any virtual IRQ injection before
> VGIC initialization. If virtual IRQ injection occurs before the VGIC
> init, the IRQ cannot be injected and subsequent injection is blocked
> due to VFIO completion mechanism (unmask/mask or forward/unforward).
> 
> This patch adds a new entry to the VGIC KVM device that allows
> the user to manually request the VGIC init:
> - a new KVM_DEV_ARM_VGIC_GRP_CTRL group is introduced.
> - Its first attribute is KVM_DEV_ARM_VGIC_CTRL_INIT
> 
> The rationale behind introducing a group is to be able to add other
> controls later on, if needed.
> 
> Obviously, as soon as the init is done, the dimensioning parameters
> cannot be changed.

you would need to add a check in the vcpu_create path, which I don't
believe we currently have.  That may conflict with Andre's series so we
need to coordinate.

We're also seeing this potentially being useful for migration, so my
feeling is that the GICv3 patches should be rebased on this patch and
this patch should include a check in the vcpu create path.

> 
> Signed-off-by: Eric Auger <eric.auger@linaro.org>
> ---
>  Documentation/virtual/kvm/devices/arm-vgic.txt | 11 +++++++++++
>  arch/arm/include/uapi/asm/kvm.h                |  2 ++
>  arch/arm64/include/uapi/asm/kvm.h              |  2 ++
>  virt/kvm/arm/vgic.c                            | 14 +++++++++++++-
>  4 files changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/virtual/kvm/devices/arm-vgic.txt b/Documentation/virtual/kvm/devices/arm-vgic.txt
> index df8b0c7..80db43f 100644
> --- a/Documentation/virtual/kvm/devices/arm-vgic.txt
> +++ b/Documentation/virtual/kvm/devices/arm-vgic.txt
> @@ -81,3 +81,14 @@ Groups:
>      -EINVAL: Value set is out of the expected range
>      -EBUSY: Value has already be set, or GIC has already been initialized
>              with default values.
> +
> +  KVM_DEV_ARM_VGIC_GRP_CTRL
> +  Attributes:
> +    KVM_DEV_ARM_VGIC_CTRL_INIT
> +      request the initialization of the VGIC, no additional parameter in
> +      kvm_device_attr.addr.
> +  Errors:
> +    -ENXIO: distributor or CPU interface base address were not set prior
> +            to that call

this should be more generic to also apply to GICv3, I would suggest:

"VGIC not properly configured as required prior to calling this
attribute."

alternatively, the attribute should be KVM_DEV_ARM_VGIC_V2_CTRL_INIT.

> +    -EINVAL: number of vcpus is not known

can we have a different error code for this case?  ENODEV for example?

> +    -ENOMEM: memory shortage when allocating vgic internal data

> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index 77547bb..2499867 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -175,6 +175,8 @@ struct kvm_arch_memory_slot {
>  #define   KVM_DEV_ARM_VGIC_OFFSET_SHIFT	0
>  #define   KVM_DEV_ARM_VGIC_OFFSET_MASK	(0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
>  #define KVM_DEV_ARM_VGIC_GRP_NR_IRQS	3
> +#define KVM_DEV_ARM_VGIC_GRP_CTRL       4
> +#define   KVM_DEV_ARM_VGIC_CTRL_INIT    0
>  
>  /* KVM_IRQ_LINE irq field index values */
>  #define KVM_ARM_IRQ_TYPE_SHIFT		24
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 1ed4417..b35c95a 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -161,6 +161,8 @@ struct kvm_arch_memory_slot {
>  #define   KVM_DEV_ARM_VGIC_OFFSET_SHIFT	0
>  #define   KVM_DEV_ARM_VGIC_OFFSET_MASK	(0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
>  #define KVM_DEV_ARM_VGIC_GRP_NR_IRQS	3
> +#define KVM_DEV_ARM_VGIC_GRP_CTRL	4
> +#define   KVM_DEV_ARM_VGIC_CTRL_INIT	0
>  
>  /* KVM_IRQ_LINE irq field index values */
>  #define KVM_ARM_IRQ_TYPE_SHIFT		24
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index b76c38c..2fe5bdb 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -2474,7 +2474,14 @@ static int vgic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
>  
>  		return ret;
>  	}
> -
> +	case KVM_DEV_ARM_VGIC_GRP_CTRL: {
> +		switch (attr->attr) {
> +		case KVM_DEV_ARM_VGIC_CTRL_INIT:
> +			r = kvm_vgic_init(dev->kvm);
> +			return r;
> +		}
> +		break;
> +	}
>  	}
>  
>  	return -ENXIO;
> @@ -2553,6 +2560,11 @@ static int vgic_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
>  		return vgic_has_attr_regs(vgic_cpu_ranges, offset);
>  	case KVM_DEV_ARM_VGIC_GRP_NR_IRQS:
>  		return 0;
> +	case KVM_DEV_ARM_VGIC_GRP_CTRL:
> +		switch (attr->attr) {
> +		case KVM_DEV_ARM_VGIC_CTRL_INIT:
> +			return 0;
> +		}
>  	}
>  	return -ENXIO;
>  }
> -- 
> 1.9.1
>
Peter Maydell Dec. 4, 2014, 10:02 a.m. | #5
On 2 December 2014 at 17:54, Eric Auger <eric.auger@linaro.org> wrote:
> as soon as VFIO signaling is set up (the device IRQ index is linked to
> an eventfd, the physical IRQ VFIO handler is installed and the physical
> IRQ is enabled at interrupt controller level), virtual IRQs are likely
> to be injected. With current QEMU code, we setup this VFIO signaling
> *before* the vgic readiness (either on machine init done or reset
> notifier) and we face that issue of early injection. QEMU related
> patches to follow ...

So can you describe in QEMU terms how the lifecycle of these
things works? How do we ensure that we don't start trying
to inject VFIO IRQs before we've even created the vgic, for
instance?

thanks
-- PMM
Auger Eric Dec. 4, 2014, 10:04 a.m. | #6
Hi Christoffer
On 12/03/2014 11:45 AM, Christoffer Dall wrote:
> On Tue, Dec 02, 2014 at 06:27:31PM +0100, Eric Auger wrote:
>> Since the advent of dynamic initialization of VGIC, this latter is
>> initialized very late, on the first vcpu run. This initialization
>> could be initiated much earlier by the user, as soon as it has
>> provided the requested dimensioning parameters:
>> - number of IRQs and number of vCPUs,
>> - DIST and CPU interface base address.
>>
>> One motivation behind being able to initialize the VGIC sooner is
>> related to the setup of IRQ injection in VFIO use case. The VFIO
>> signaling, especially when used along with irqfd must be set *after*
>> vgic initialization to prevent any virtual IRQ injection before
>> VGIC initialization. If virtual IRQ injection occurs before the VGIC
>> init, the IRQ cannot be injected and subsequent injection is blocked
>> due to VFIO completion mechanism (unmask/mask or forward/unforward).
>>
>> This patch adds a new entry to the VGIC KVM device that allows
>> the user to manually request the VGIC init:
>> - a new KVM_DEV_ARM_VGIC_GRP_CTRL group is introduced.
>> - Its first attribute is KVM_DEV_ARM_VGIC_CTRL_INIT
>>
>> The rationale behind introducing a group is to be able to add other
>> controls later on, if needed.
>>
>> Obviously, as soon as the init is done, the dimensioning parameters
>> cannot be changed.
> 
> you would need to add a check in the vcpu_create path,
check added in v2
 which I don't
> believe we currently have.  That may conflict with Andre's series so we
> need to coordinate.
> 
> We're also seeing this potentially being useful for migration, so my
> feeling is that the GICv3 patches should be rebased on this patch and
> this patch should include a check in the vcpu create path.
> 
>>
>> Signed-off-by: Eric Auger <eric.auger@linaro.org>
>> ---
>>  Documentation/virtual/kvm/devices/arm-vgic.txt | 11 +++++++++++
>>  arch/arm/include/uapi/asm/kvm.h                |  2 ++
>>  arch/arm64/include/uapi/asm/kvm.h              |  2 ++
>>  virt/kvm/arm/vgic.c                            | 14 +++++++++++++-
>>  4 files changed, 28 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/virtual/kvm/devices/arm-vgic.txt b/Documentation/virtual/kvm/devices/arm-vgic.txt
>> index df8b0c7..80db43f 100644
>> --- a/Documentation/virtual/kvm/devices/arm-vgic.txt
>> +++ b/Documentation/virtual/kvm/devices/arm-vgic.txt
>> @@ -81,3 +81,14 @@ Groups:
>>      -EINVAL: Value set is out of the expected range
>>      -EBUSY: Value has already be set, or GIC has already been initialized
>>              with default values.
>> +
>> +  KVM_DEV_ARM_VGIC_GRP_CTRL
>> +  Attributes:
>> +    KVM_DEV_ARM_VGIC_CTRL_INIT
>> +      request the initialization of the VGIC, no additional parameter in
>> +      kvm_device_attr.addr.
>> +  Errors:
>> +    -ENXIO: distributor or CPU interface base address were not set prior
>> +            to that call
> 
> this should be more generic to also apply to GICv3, I would suggest:
> 
> "VGIC not properly configured as required prior to calling this
> attribute."
corrected
> 
> alternatively, the attribute should be KVM_DEV_ARM_VGIC_V2_CTRL_INIT.
> 
>> +    -EINVAL: number of vcpus is not known
> 
> can we have a different error code for this case?  ENODEV for example?
corrected

Thanks
Eric
> 
>> +    -ENOMEM: memory shortage when allocating vgic internal data
> 
>> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
>> index 77547bb..2499867 100644
>> --- a/arch/arm/include/uapi/asm/kvm.h
>> +++ b/arch/arm/include/uapi/asm/kvm.h
>> @@ -175,6 +175,8 @@ struct kvm_arch_memory_slot {
>>  #define   KVM_DEV_ARM_VGIC_OFFSET_SHIFT	0
>>  #define   KVM_DEV_ARM_VGIC_OFFSET_MASK	(0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
>>  #define KVM_DEV_ARM_VGIC_GRP_NR_IRQS	3
>> +#define KVM_DEV_ARM_VGIC_GRP_CTRL       4
>> +#define   KVM_DEV_ARM_VGIC_CTRL_INIT    0
>>  
>>  /* KVM_IRQ_LINE irq field index values */
>>  #define KVM_ARM_IRQ_TYPE_SHIFT		24
>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
>> index 1ed4417..b35c95a 100644
>> --- a/arch/arm64/include/uapi/asm/kvm.h
>> +++ b/arch/arm64/include/uapi/asm/kvm.h
>> @@ -161,6 +161,8 @@ struct kvm_arch_memory_slot {
>>  #define   KVM_DEV_ARM_VGIC_OFFSET_SHIFT	0
>>  #define   KVM_DEV_ARM_VGIC_OFFSET_MASK	(0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
>>  #define KVM_DEV_ARM_VGIC_GRP_NR_IRQS	3
>> +#define KVM_DEV_ARM_VGIC_GRP_CTRL	4
>> +#define   KVM_DEV_ARM_VGIC_CTRL_INIT	0
>>  
>>  /* KVM_IRQ_LINE irq field index values */
>>  #define KVM_ARM_IRQ_TYPE_SHIFT		24
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index b76c38c..2fe5bdb 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -2474,7 +2474,14 @@ static int vgic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
>>  
>>  		return ret;
>>  	}
>> -
>> +	case KVM_DEV_ARM_VGIC_GRP_CTRL: {
>> +		switch (attr->attr) {
>> +		case KVM_DEV_ARM_VGIC_CTRL_INIT:
>> +			r = kvm_vgic_init(dev->kvm);
>> +			return r;
>> +		}
>> +		break;
>> +	}
>>  	}
>>  
>>  	return -ENXIO;
>> @@ -2553,6 +2560,11 @@ static int vgic_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
>>  		return vgic_has_attr_regs(vgic_cpu_ranges, offset);
>>  	case KVM_DEV_ARM_VGIC_GRP_NR_IRQS:
>>  		return 0;
>> +	case KVM_DEV_ARM_VGIC_GRP_CTRL:
>> +		switch (attr->attr) {
>> +		case KVM_DEV_ARM_VGIC_CTRL_INIT:
>> +			return 0;
>> +		}
>>  	}
>>  	return -ENXIO;
>>  }
>> -- 
>> 1.9.1
>>
Auger Eric Dec. 4, 2014, 12:01 p.m. | #7
On 12/04/2014 11:02 AM, Peter Maydell wrote:
> On 2 December 2014 at 17:54, Eric Auger <eric.auger@linaro.org> wrote:
>> as soon as VFIO signaling is set up (the device IRQ index is linked to
>> an eventfd, the physical IRQ VFIO handler is installed and the physical
>> IRQ is enabled at interrupt controller level), virtual IRQs are likely
>> to be injected. With current QEMU code, we setup this VFIO signaling
>> *before* the vgic readiness (either on machine init done or reset
>> notifier) and we face that issue of early injection. QEMU related
>> patches to follow ...
> 
> So can you describe in QEMU terms how the lifecycle of these
> things works? How do we ensure that we don't start trying
> to inject VFIO IRQs before we've even created the vgic, for
> instance?

Hi Peter,

Here is the sequence:
1) The VGIC early initialization is initiated in a machine init done
notifier. This notifier is registered in kvm_arm_gic_realize
(http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg00220.html). It
executes after vcpu instantiations + dist/cpu interface base address
setting + nb irq setting.
2) the VFIO signaling and irqfd setup is done in a reset notifier
http://lists.gnu.org/archive/html/qemu-devel/2014-11/msg04365.html

Besides https://lkml.org/lkml/2014/12/3/601 now prevents the irqfd setup
if the vgic is not initialized.

QEMU tear down:
in kvm_vm_release, kvm_irqfd_release is called before kvm_vgic_destroy.
This means the irqfd injection is stopped before vgic initialization.
VFIO driver will also will be released by QEMU process, independently on
KVM life cycle. If it still exist while KVM has been released, VFIO
signaling may still be up, meaning eventfd can be signaled but there is
no registered handler anymore, hence no risk of virtual IRQ injection.

Best Regards

Eric


> 
> thanks
> -- PMM
>
Peter Maydell Dec. 4, 2014, 12:07 p.m. | #8
On 4 December 2014 at 12:01, Eric Auger <eric.auger@linaro.org> wrote:
> Here is the sequence:
> 1) The VGIC early initialization is initiated in a machine init done
> notifier. This notifier is registered in kvm_arm_gic_realize
> (http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg00220.html). It
> executes after vcpu instantiations + dist/cpu interface base address
> setting + nb irq setting.
> 2) the VFIO signaling and irqfd setup is done in a reset notifier
> http://lists.gnu.org/archive/html/qemu-devel/2014-11/msg04365.html

OK. And on x86 VFIO how does this work? Obviously x86's GIC just
initializes as soon as it's created, but do we do the irqfd setup
in a reset notifier there too?

thanks
-- PMM
Auger Eric Dec. 4, 2014, 12:26 p.m. | #9
On 12/04/2014 01:07 PM, Peter Maydell wrote:
> On 4 December 2014 at 12:01, Eric Auger <eric.auger@linaro.org> wrote:
>> Here is the sequence:
>> 1) The VGIC early initialization is initiated in a machine init done
>> notifier. This notifier is registered in kvm_arm_gic_realize
>> (http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg00220.html). It
>> executes after vcpu instantiations + dist/cpu interface base address
>> setting + nb irq setting.
>> 2) the VFIO signaling and irqfd setup is done in a reset notifier
>> http://lists.gnu.org/archive/html/qemu-devel/2014-11/msg04365.html
> 
> OK. And on x86 VFIO how does this work? Obviously x86's GIC just
> initializes as soon as it's created, but do we do the irqfd setup
> in a reset notifier there too?

This is what I understand from PCI intx init sequence: the vfio
signaling and irqfd setup happens in the vfio_initfn function instead.
first vfio signaling with user-side eventfd handlers is setup
(vfio_enable_intx). if KVM is enabled, vfio_enable_intx then tears the
user-side handlers down and replace them by irqfd kernel-side handlers
(IRQFD setup). the virtual IRQ to be injected to the guest is set to
vdev->intx.route.irq. This obviously requires this routing to be set
before the QEMU PCI VFIO device instantiation.

On ARM case, VFIO device are bound to be dynamically instantiated from
command line using dynamic sysbus device instantiation. Binding of IRQ
is done by a platform bus machine init done notifier. Only at that stage
we know the gsi. hence doing the irqfd init in a reset notifier.

Eric


> 
> thanks
> -- PMM
>

Patch

diff --git a/Documentation/virtual/kvm/devices/arm-vgic.txt b/Documentation/virtual/kvm/devices/arm-vgic.txt
index df8b0c7..80db43f 100644
--- a/Documentation/virtual/kvm/devices/arm-vgic.txt
+++ b/Documentation/virtual/kvm/devices/arm-vgic.txt
@@ -81,3 +81,14 @@  Groups:
     -EINVAL: Value set is out of the expected range
     -EBUSY: Value has already be set, or GIC has already been initialized
             with default values.
+
+  KVM_DEV_ARM_VGIC_GRP_CTRL
+  Attributes:
+    KVM_DEV_ARM_VGIC_CTRL_INIT
+      request the initialization of the VGIC, no additional parameter in
+      kvm_device_attr.addr.
+  Errors:
+    -ENXIO: distributor or CPU interface base address were not set prior
+            to that call
+    -EINVAL: number of vcpus is not known
+    -ENOMEM: memory shortage when allocating vgic internal data
diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index 77547bb..2499867 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -175,6 +175,8 @@  struct kvm_arch_memory_slot {
 #define   KVM_DEV_ARM_VGIC_OFFSET_SHIFT	0
 #define   KVM_DEV_ARM_VGIC_OFFSET_MASK	(0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
 #define KVM_DEV_ARM_VGIC_GRP_NR_IRQS	3
+#define KVM_DEV_ARM_VGIC_GRP_CTRL       4
+#define   KVM_DEV_ARM_VGIC_CTRL_INIT    0
 
 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT		24
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 1ed4417..b35c95a 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -161,6 +161,8 @@  struct kvm_arch_memory_slot {
 #define   KVM_DEV_ARM_VGIC_OFFSET_SHIFT	0
 #define   KVM_DEV_ARM_VGIC_OFFSET_MASK	(0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
 #define KVM_DEV_ARM_VGIC_GRP_NR_IRQS	3
+#define KVM_DEV_ARM_VGIC_GRP_CTRL	4
+#define   KVM_DEV_ARM_VGIC_CTRL_INIT	0
 
 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT		24
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index b76c38c..2fe5bdb 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -2474,7 +2474,14 @@  static int vgic_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 
 		return ret;
 	}
-
+	case KVM_DEV_ARM_VGIC_GRP_CTRL: {
+		switch (attr->attr) {
+		case KVM_DEV_ARM_VGIC_CTRL_INIT:
+			r = kvm_vgic_init(dev->kvm);
+			return r;
+		}
+		break;
+	}
 	}
 
 	return -ENXIO;
@@ -2553,6 +2560,11 @@  static int vgic_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr)
 		return vgic_has_attr_regs(vgic_cpu_ranges, offset);
 	case KVM_DEV_ARM_VGIC_GRP_NR_IRQS:
 		return 0;
+	case KVM_DEV_ARM_VGIC_GRP_CTRL:
+		switch (attr->attr) {
+		case KVM_DEV_ARM_VGIC_CTRL_INIT:
+			return 0;
+		}
 	}
 	return -ENXIO;
 }