Message ID | 1344272705-17825-1-git-send-email-peter.maydell@linaro.org |
---|---|
State | Superseded |
Headers | show |
Ping? I don't think this one quite made it into Avi's pullreq... thanks -- PMM On 6 August 2012 18:05, Peter Maydell <peter.maydell@linaro.org> wrote: > Move the init of the irqchip_inject_ioctl field of KVMState out of > kvm_irqchip_create() and into kvm_init(), so that kvm_set_irq() > can be used even when no irqchip is created (for architectures > that support async interrupt notification even without an in > kernel irqchip). > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > --- > We can't just set irqchip_inject_ioctl in target-*/kvm.c because > the KVMState struct layout is private to kvm-all.c. Moving the > default initialisation of this field seemed the simplest approach. > It's safe because the use in kvm_set_irq() is guarded by a check of > kvm_async_interrupts_enabled(). > > The other approach would be to have a helper function for setting > the field, but that seems overkill when KVM_IRQ_LINE is the standard > default value. (KVM_IRQ_LINE_STATUS seems to be undocumented, > incidentally. I am going to assume it's another x86ism until somebody > does document it :-)) > > kvm-all.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/kvm-all.c b/kvm-all.c > index 6def6c9..9a1f001 100644 > --- a/kvm-all.c > +++ b/kvm-all.c > @@ -1200,7 +1200,6 @@ static int kvm_irqchip_create(KVMState *s) > return ret; > } > > - s->irqchip_inject_ioctl = KVM_IRQ_LINE; > if (kvm_check_extension(s, KVM_CAP_IRQ_INJECT_STATUS)) { > s->irqchip_inject_ioctl = KVM_IRQ_LINE_STATUS; > } > @@ -1321,6 +1320,8 @@ int kvm_init(void) > s->direct_msi = (kvm_check_extension(s, KVM_CAP_SIGNAL_MSI) > 0); > #endif > > + s->irqchip_inject_ioctl = KVM_IRQ_LINE; > + > ret = kvm_arch_init(s); > if (ret < 0) { > goto err; > -- > 1.7.9.5
On Sun, Aug 12, 2012 at 02:13:52PM +0100, Peter Maydell wrote: > Ping? I don't think this one quite made it into Avi's pullreq... Please post this with the rest of the code to support the new s->irqchip_inject_ioctl value. > thanks > -- PMM > > On 6 August 2012 18:05, Peter Maydell <peter.maydell@linaro.org> wrote: > > Move the init of the irqchip_inject_ioctl field of KVMState out of > > kvm_irqchip_create() and into kvm_init(), so that kvm_set_irq() > > can be used even when no irqchip is created (for architectures > > that support async interrupt notification even without an in > > kernel irqchip). > > > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > > --- > > We can't just set irqchip_inject_ioctl in target-*/kvm.c because > > the KVMState struct layout is private to kvm-all.c. Moving the > > default initialisation of this field seemed the simplest approach. > > It's safe because the use in kvm_set_irq() is guarded by a check of > > kvm_async_interrupts_enabled(). > > > > The other approach would be to have a helper function for setting > > the field, but that seems overkill when KVM_IRQ_LINE is the standard > > default value. (KVM_IRQ_LINE_STATUS seems to be undocumented, > > incidentally. I am going to assume it's another x86ism until somebody > > does document it :-)) > > > > kvm-all.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/kvm-all.c b/kvm-all.c > > index 6def6c9..9a1f001 100644 > > --- a/kvm-all.c > > +++ b/kvm-all.c > > @@ -1200,7 +1200,6 @@ static int kvm_irqchip_create(KVMState *s) > > return ret; > > } > > > > - s->irqchip_inject_ioctl = KVM_IRQ_LINE; > > if (kvm_check_extension(s, KVM_CAP_IRQ_INJECT_STATUS)) { > > s->irqchip_inject_ioctl = KVM_IRQ_LINE_STATUS; > > } > > @@ -1321,6 +1320,8 @@ int kvm_init(void) > > s->direct_msi = (kvm_check_extension(s, KVM_CAP_SIGNAL_MSI) > 0); > > #endif > > > > + s->irqchip_inject_ioctl = KVM_IRQ_LINE; > > + > > ret = kvm_arch_init(s); > > if (ret < 0) { > > goto err; > > -- > > 1.7.9.5 > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
On 13 August 2012 21:45, Marcelo Tosatti <mtosatti@redhat.com> wrote: > On Sun, Aug 12, 2012 at 02:13:52PM +0100, Peter Maydell wrote: >> Ping? I don't think this one quite made it into Avi's pullreq... > > Please post this with the rest of the code to support the new > s->irqchip_inject_ioctl value. Er, which new irqchip_inject_ioctl value? There are still only two possibilities, KVM_IRQ_LINE and KVM_IRQ_LINE_STATUS. The patch just allows you to use kvm_set_irq() without having called kvm_irqchip_create() first [which makes sense to me given the previous patches which remove the assumption that you need an in-kernel irqchip to be able to do KVM_IRQ_LINE style interrupt delivery]. -- PMM
On 2012-08-06 19:05, Peter Maydell wrote: > Move the init of the irqchip_inject_ioctl field of KVMState out of > kvm_irqchip_create() and into kvm_init(), so that kvm_set_irq() > can be used even when no irqchip is created (for architectures > that support async interrupt notification even without an in > kernel irqchip). > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > --- > We can't just set irqchip_inject_ioctl in target-*/kvm.c because > the KVMState struct layout is private to kvm-all.c. Moving the > default initialisation of this field seemed the simplest approach. > It's safe because the use in kvm_set_irq() is guarded by a check of > kvm_async_interrupts_enabled(). > > The other approach would be to have a helper function for setting > the field, but that seems overkill when KVM_IRQ_LINE is the standard > default value. (KVM_IRQ_LINE_STATUS seems to be undocumented, > incidentally. I am going to assume it's another x86ism until somebody > does document it :-)) > > kvm-all.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/kvm-all.c b/kvm-all.c > index 6def6c9..9a1f001 100644 > --- a/kvm-all.c > +++ b/kvm-all.c > @@ -1200,7 +1200,6 @@ static int kvm_irqchip_create(KVMState *s) > return ret; > } > > - s->irqchip_inject_ioctl = KVM_IRQ_LINE; > if (kvm_check_extension(s, KVM_CAP_IRQ_INJECT_STATUS)) { > s->irqchip_inject_ioctl = KVM_IRQ_LINE_STATUS; > } Either you move both or none. KVM_IRQ_LINE is old-style, deprecated, KVM_IRQ_LINE_STATUS (i.e injection with feedback to allow lost-tick compensation) is the current standard that other archs should pick up. Jan
On 2012-08-06 19:05, Peter Maydell wrote: > Move the init of the irqchip_inject_ioctl field of KVMState out of > kvm_irqchip_create() and into kvm_init(), so that kvm_set_irq() > can be used even when no irqchip is created (for architectures > that support async interrupt notification even without an in > kernel irqchip). > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > --- > We can't just set irqchip_inject_ioctl in target-*/kvm.c because > the KVMState struct layout is private to kvm-all.c. Moving the > default initialisation of this field seemed the simplest approach. > It's safe because the use in kvm_set_irq() is guarded by a check of > kvm_async_interrupts_enabled(). > > The other approach would be to have a helper function for setting > the field, but that seems overkill when KVM_IRQ_LINE is the standard > default value. (KVM_IRQ_LINE_STATUS seems to be undocumented, > incidentally. I am going to assume it's another x86ism until somebody > does document it :-)) Oh, and when implementing KVM_IRQ_LINE_STATUS for ARM, please also feel free to document it in a way that it applies both to its x86 history and new architectures. Jan
On 14 August 2012 08:33, Jan Kiszka <jan.kiszka@web.de> wrote: > Either you move both or none. OK. > KVM_IRQ_LINE is old-style, deprecated, KVM_IRQ_LINE_STATUS (i.e > injection with feedback to allow lost-tick compensation) is the current > standard that other archs should pick up. Can it be documented in the kernel api.txt then, please? Nobody is going to use it otherwise... (If I'd been paying attention at the time I'd have nak'd the qemu patches that added it on the grounds they were using an undocumented kernel API :-)) -- PMM
On 2012-08-14 09:40, Peter Maydell wrote: > On 14 August 2012 08:33, Jan Kiszka <jan.kiszka@web.de> wrote: >> Either you move both or none. > > OK. > >> KVM_IRQ_LINE is old-style, deprecated, KVM_IRQ_LINE_STATUS (i.e >> injection with feedback to allow lost-tick compensation) is the current >> standard that other archs should pick up. > > Can it be documented in the kernel api.txt then, please? Nobody > is going to use it otherwise... (If I'd been paying attention at the > time I'd have nak'd the qemu patches that added it on the grounds > they were using an undocumented kernel API :-)) The kernel API's documentation has in fact a much younger history than KVM support in QEMU. I think we still need to add quite a few standard IOCTLs to make it complete. Patches always welcome. Jan
On 14 August 2012 08:42, Jan Kiszka <jan.kiszka@web.de> wrote: > On 2012-08-14 09:40, Peter Maydell wrote: >> On 14 August 2012 08:33, Jan Kiszka <jan.kiszka@web.de> wrote: >>> KVM_IRQ_LINE is old-style, deprecated, KVM_IRQ_LINE_STATUS (i.e >>> injection with feedback to allow lost-tick compensation) is the current >>> standard that other archs should pick up. >> >> Can it be documented in the kernel api.txt then, please? Nobody >> is going to use it otherwise... (If I'd been paying attention at the >> time I'd have nak'd the qemu patches that added it on the grounds >> they were using an undocumented kernel API :-)) > > The kernel API's documentation has in fact a much younger history than > KVM support in QEMU. I think we still need to add quite a few standard > IOCTLs to make it complete. Patches always welcome. Well, you appear to know what this variant ioctl does and why it's better than KVM_IRQ_LINE, whereas I don't. I just want to deliver an interrupt, KVM_IRQ_LINE lets me deliver an interrupt, why do I need anything more? (What would I do with the status return, for instance? I have to assert the incoming irq line, there's nothing for me to do if the kernel says "sorry, can't do that" except abort qemu.) -- PMM
On 2012-08-14 09:52, Peter Maydell wrote: > On 14 August 2012 08:42, Jan Kiszka <jan.kiszka@web.de> wrote: >> On 2012-08-14 09:40, Peter Maydell wrote: >>> On 14 August 2012 08:33, Jan Kiszka <jan.kiszka@web.de> wrote: >>>> KVM_IRQ_LINE is old-style, deprecated, KVM_IRQ_LINE_STATUS (i.e >>>> injection with feedback to allow lost-tick compensation) is the current >>>> standard that other archs should pick up. >>> >>> Can it be documented in the kernel api.txt then, please? Nobody >>> is going to use it otherwise... (If I'd been paying attention at the >>> time I'd have nak'd the qemu patches that added it on the grounds >>> they were using an undocumented kernel API :-)) >> >> The kernel API's documentation has in fact a much younger history than >> KVM support in QEMU. I think we still need to add quite a few standard >> IOCTLs to make it complete. Patches always welcome. > > Well, you appear to know what this variant ioctl does and why it's > better than KVM_IRQ_LINE, whereas I don't. I just want to deliver > an interrupt, KVM_IRQ_LINE lets me deliver an interrupt, why > do I need anything more? (What would I do with the status return, for > instance? I have to assert the incoming irq line, there's nothing for > me to do if the kernel says "sorry, can't do that" except abort qemu.) Not sure how timekeeping of all your guests will work, but a classic scenario on x86 is that some timer is programmed to deliver periodic ticks (or one-shot ticks that also generates a virtual periodic timer) and that those ticks will then be used to derive the system time of the guest. Now, if the guest was unable to process the past tick completely (due to host load) and we inject already another tick event, that one will get lost. Some guests (older Linuxes but also many proprietary OSes) are not prepared for such tick loss and will suffer from drifting wall clocks. For that reason, we allow userspace to find out if a (potentially) tick driving IRQ was actually received by the guest or if it coalesced with an ongoing event. In the latter case, userspace can reinject those events once the guest is able to receive them again. All we need from the kernel API is that feedback KVM_IRQ_LINE_STATUS provides. The return values are "nicely" hidden in kvm_set_irq: /* * Return value: * < 0 Interrupt was ignored (masked or not delivered for other reasons) * = 0 Interrupt was coalesced (previous irq is still pending) * > 0 Number of CPUs interrupt was delivered to */ QEMU doesn't make use of that number of receiving CPUs and I'm mot sure why we even report it. Maybe the kernel API should just state that >0 means delivered. Jan
On 08/14/2012 10:33 AM, Jan Kiszka wrote: > > KVM_IRQ_LINE is old-style, deprecated, KVM_IRQ_LINE_STATUS (i.e > injection with feedback to allow lost-tick compensation) is the current > standard that other archs should pick up. KVM_IRQ_LINE_STATUS may not make sense on all architectures. I don't think we're really deprecating KVM_IRQ_LINE or discouraging its use. It's not like the kernel-allocated memory slot ioctls.
On 2012-08-14 13:01, Avi Kivity wrote: > On 08/14/2012 10:33 AM, Jan Kiszka wrote: >> >> KVM_IRQ_LINE is old-style, deprecated, KVM_IRQ_LINE_STATUS (i.e >> injection with feedback to allow lost-tick compensation) is the current >> standard that other archs should pick up. > > KVM_IRQ_LINE_STATUS may not make sense on all architectures. > > I don't think we're really deprecating KVM_IRQ_LINE or discouraging its > use. It's not like the kernel-allocated memory slot ioctls. I do not think it makes sense to provide both interfaces long term (provided we ever do a cut). Also, it's almost trivial to provide the add-on feature of KVM_IRQ_LINE_STATUS, and it keeps the door open for IRQ decoalescing. If there is no way for an arch to detect coalescing, it can still return >0 unconditionally. Jan
On 14 August 2012 09:09, Jan Kiszka <jan.kiszka@web.de> wrote: > On 2012-08-14 09:52, Peter Maydell wrote: >> Well, you appear to know what this variant ioctl does and why it's >> better than KVM_IRQ_LINE, whereas I don't. I just want to deliver >> an interrupt, KVM_IRQ_LINE lets me deliver an interrupt, why >> do I need anything more? (What would I do with the status return, for >> instance? I have to assert the incoming irq line, there's nothing for >> me to do if the kernel says "sorry, can't do that" except abort qemu.) > > Not sure how timekeeping of all your guests will work, but a classic > scenario on x86 is that some timer is programmed to deliver periodic > ticks (or one-shot ticks that also generates a virtual periodic timer) > and that those ticks will then be used to derive the system time of the > guest. Now, if the guest was unable to process the past tick completely > (due to host load) and we inject already another tick event, that one > will get lost. Some guests (older Linuxes but also many proprietary > OSes) are not prepared for such tick loss and will suffer from drifting > wall clocks. So, for ARM the overwhelmingly common case will be that we use the in kernel architectural timer for doing periodic / one-shot ticks. That's all in the kernel anyway (both timer and VGIC interrupt controller). The less plausible case uses the in-kernel interrupt controller but an out of kernel timer device. The only link between the two is a qemu_irq line, so we have (a) no way to tell if this interrupt line into the GIC is a timer or not and (b) no back-channel to the timer device to report things anyway. The really unlikely case (in that there is theoretically code to handle it but in practice QEMU will refuse to configure itself this way) has an out of kernel GIC. In this case the interrupts we deliver to the kernel are pre-coalesced into a single IRQ line, and there's definitely not a lot we could do with the status report. Marc Z tells me we could make the kernel return the 'did this coalesce?' status without too much difficulty, but unless it's actually possible to do something useful with it in userspace I'm not sure I see the point. -- PMM
On 08/14/2012 02:05 PM, Jan Kiszka wrote: > On 2012-08-14 13:01, Avi Kivity wrote: >> On 08/14/2012 10:33 AM, Jan Kiszka wrote: >>> >>> KVM_IRQ_LINE is old-style, deprecated, KVM_IRQ_LINE_STATUS (i.e >>> injection with feedback to allow lost-tick compensation) is the current >>> standard that other archs should pick up. >> >> KVM_IRQ_LINE_STATUS may not make sense on all architectures. >> >> I don't think we're really deprecating KVM_IRQ_LINE or discouraging its >> use. It's not like the kernel-allocated memory slot ioctls. > > I do not think it makes sense to provide both interfaces long term > (provided we ever do a cut). Also, it's almost trivial to provide the > add-on feature of KVM_IRQ_LINE_STATUS, and it keeps the door open for > IRQ decoalescing. If there is no way for an arch to detect coalescing, > it can still return >0 unconditionally. That's lying. I don't see how anything bad can come out of it, but we can always be surprised. If we can't support something, let's not claim we do.
On 2012-08-14 15:10, Peter Maydell wrote: > On 14 August 2012 09:09, Jan Kiszka <jan.kiszka@web.de> wrote: >> On 2012-08-14 09:52, Peter Maydell wrote: >>> Well, you appear to know what this variant ioctl does and why it's >>> better than KVM_IRQ_LINE, whereas I don't. I just want to deliver >>> an interrupt, KVM_IRQ_LINE lets me deliver an interrupt, why >>> do I need anything more? (What would I do with the status return, for >>> instance? I have to assert the incoming irq line, there's nothing for >>> me to do if the kernel says "sorry, can't do that" except abort qemu.) >> >> Not sure how timekeeping of all your guests will work, but a classic >> scenario on x86 is that some timer is programmed to deliver periodic >> ticks (or one-shot ticks that also generates a virtual periodic timer) >> and that those ticks will then be used to derive the system time of the >> guest. Now, if the guest was unable to process the past tick completely >> (due to host load) and we inject already another tick event, that one >> will get lost. Some guests (older Linuxes but also many proprietary >> OSes) are not prepared for such tick loss and will suffer from drifting >> wall clocks. > > So, for ARM the overwhelmingly common case will be that we use the in > kernel architectural timer for doing periodic / one-shot ticks. That's > all in the kernel anyway (both timer and VGIC interrupt controller). > > The less plausible case uses the in-kernel interrupt controller but an > out of kernel timer device. The only link between the two is a qemu_irq line, > so we have (a) no way to tell if this interrupt line into the GIC is a timer > or not and (b) no back-channel to the timer device to report things anyway. (a) is userspace knowledge (if it implements some timer that guests may use to drive a clock), so nothing the kernel needs to know. (b) will likely come for x86 to clean up the current mess we have for decoalescing RTC periodic timers and possibly adding HPET support. > > The really unlikely case (in that there is theoretically code to handle it > but in practice QEMU will refuse to configure itself this way) has an out > of kernel GIC. In this case the interrupts we deliver to the kernel are > pre-coalesced into a single IRQ line, and there's definitely not a lot > we could do with the status report. Agreed. In this case, the userspace GIC would have to provide the coalescing information. > > Marc Z tells me we could make the kernel return the 'did this coalesce?' > status without too much difficulty, but unless it's actually possible > to do something useful with it in userspace I'm not sure I see the point. From my point of view (but Avi seems to disagree), it would even make sense to always return 1 if you cannot actually tell, just to consolidate over a single IOCTL on the long run. But given that you can, I would avoid over-simplifying, specifically as the embedded space is known to add all sorts of special devices for special use cases, and you cannot know in advance if the architectural (in-kernel) timer will always be sufficient. Jan
On 2012-08-14 15:14, Avi Kivity wrote: > On 08/14/2012 02:05 PM, Jan Kiszka wrote: >> On 2012-08-14 13:01, Avi Kivity wrote: >>> On 08/14/2012 10:33 AM, Jan Kiszka wrote: >>>> >>>> KVM_IRQ_LINE is old-style, deprecated, KVM_IRQ_LINE_STATUS (i.e >>>> injection with feedback to allow lost-tick compensation) is the current >>>> standard that other archs should pick up. >>> >>> KVM_IRQ_LINE_STATUS may not make sense on all architectures. >>> >>> I don't think we're really deprecating KVM_IRQ_LINE or discouraging its >>> use. It's not like the kernel-allocated memory slot ioctls. >> >> I do not think it makes sense to provide both interfaces long term >> (provided we ever do a cut). Also, it's almost trivial to provide the >> add-on feature of KVM_IRQ_LINE_STATUS, and it keeps the door open for >> IRQ decoalescing. If there is no way for an arch to detect coalescing, >> it can still return >0 unconditionally. > > That's lying. I don't see how anything bad can come out of it, but we > can always be surprised. If we can't support something, let's not claim > we do. Fortunately, we are not in this position on ARM. It has in in-kernel irqchip and can tell if some line is still being processed. Jan
diff --git a/kvm-all.c b/kvm-all.c index 6def6c9..9a1f001 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -1200,7 +1200,6 @@ static int kvm_irqchip_create(KVMState *s) return ret; } - s->irqchip_inject_ioctl = KVM_IRQ_LINE; if (kvm_check_extension(s, KVM_CAP_IRQ_INJECT_STATUS)) { s->irqchip_inject_ioctl = KVM_IRQ_LINE_STATUS; } @@ -1321,6 +1320,8 @@ int kvm_init(void) s->direct_msi = (kvm_check_extension(s, KVM_CAP_SIGNAL_MSI) > 0); #endif + s->irqchip_inject_ioctl = KVM_IRQ_LINE; + ret = kvm_arch_init(s); if (ret < 0) { goto err;
Move the init of the irqchip_inject_ioctl field of KVMState out of kvm_irqchip_create() and into kvm_init(), so that kvm_set_irq() can be used even when no irqchip is created (for architectures that support async interrupt notification even without an in kernel irqchip). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> --- We can't just set irqchip_inject_ioctl in target-*/kvm.c because the KVMState struct layout is private to kvm-all.c. Moving the default initialisation of this field seemed the simplest approach. It's safe because the use in kvm_set_irq() is guarded by a check of kvm_async_interrupts_enabled(). The other approach would be to have a helper function for setting the field, but that seems overkill when KVM_IRQ_LINE is the standard default value. (KVM_IRQ_LINE_STATUS seems to be undocumented, incidentally. I am going to assume it's another x86ism until somebody does document it :-)) kvm-all.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)