Message ID | 1543440731-21947-2-git-send-email-andrii.anisov@gmail.com |
---|---|
State | Superseded |
Headers | show |
Series | [Xen-devel,RFC,01/16] xen/arm: Re-enable interrupt later in the trap path | expand |
On 28/11/2018 21:31, Andrii Anisov wrote: > From: Julien Grall <julien.grall@arm.com> > > This makes function enter_hypervisor_head() being executed with > irqs locked. This also give a fine side effect - it assures that > LRs are cleared prior to any IRQs processing, which leads to a > better (faster) IRQs processing. Again, this need some rationale why is it faster. And how this is going to impact the other vGIC... Cheers, -- Julien Grall
Hello Julien, It looks I missed answering this email. So do this now: On 29.11.18 00:00, Julien Grall wrote: > On 28/11/2018 21:31, Andrii Anisov wrote: >> From: Julien Grall <julien.grall@arm.com> >> >> This makes function enter_hypervisor_head() being executed with >> irqs locked. This also give a fine side effect - it assures that >> LRs are cleared prior to any IRQs processing, which leads to a >> better (faster) IRQs processing. > > Again, this need some rationale why is it faster. And how this is going > to impact the other vGIC... With the current code, there is a chance that hypervisor performs do_IRQ() before fetching LRs with the information about IRQs being processed by a guest. This could happen in guest trap paths where interrupts are enabled before `enter_hypervisor_head()` call. Performing `do_IRQ()` prior to `vgic_sync_from_lrs()` is assumed as less efficient than doing it vice versa for high IRQ rate use-cases with the number of different interrupts (specific for multimedia scenarios). - For the old vgic implementation, `vgic_sync_from_lrs()` release LRs from processed interrupts also shortens `inflight_irqs` sorted list. So `do_IRQ()` has better chances to write IRQ directly to LR instead of saving it into `lr_pending` list and possibly faster insert the new IRQ into `inflight_irqs` list. - For the new vgic implementation, `vgic_sync_from_lrs()` fetches about processed interrupts from LRs and shortens `ap_list`. So `do_IRQ()` could potentially faster insert the new interrupt into `ap_list`.
(+ Andre) On 7/30/19 6:34 PM, Andrii Anisov wrote: > Hello Julien, > > It looks I missed answering this email. So do this now: > > On 29.11.18 00:00, Julien Grall wrote: >> On 28/11/2018 21:31, Andrii Anisov wrote: >>> From: Julien Grall <julien.grall@arm.com> >>> >>> This makes function enter_hypervisor_head() being executed with >>> irqs locked. This also give a fine side effect - it assures that >>> LRs are cleared prior to any IRQs processing, which leads to a >>> better (faster) IRQs processing. >> >> Again, this need some rationale why is it faster. And how this is going >> to impact the other vGIC... > > With the current code, there is a chance that hypervisor performs > do_IRQ() before fetching LRs with the information about IRQs being > processed by a guest. This could happen in guest trap paths where > interrupts are enabled before `enter_hypervisor_head()` call. > > Performing `do_IRQ()` prior to `vgic_sync_from_lrs()` is assumed as less > efficient than doing it vice versa for high IRQ rate use-cases with the > number of different interrupts (specific for multimedia scenarios). What you suggest here is really specific to your use case. However, think about the Real-Time case where you may want to receive your interrupt as soon as possible. In general, keeping the interrupt disabled for a long time is a pretty bad idea because you increase latency. Interrupts should only be disabled when it is strictly necessary. At the moment I don't view this change as necessary. But I am happy to be proven otherwise. > - For the old vgic implementation, `vgic_sync_from_lrs()` release LRs > from processed interrupts also shortens `inflight_irqs` sorted list. So > `do_IRQ()` has better chances to write IRQ directly to LR instead of > saving it into `lr_pending` list and possibly faster insert the new IRQ > into `inflight_irqs` list. Saving to lr_pending means that you have all the LRs full. I vaguely recall we spoke about it before but I can't find anything in my inbox. So can you explain again your use-case, the number of LRs...? Note this is were writing down everything in the commit message (or after ---) helps people to understand where you are coming from. Cheers,
diff --git a/xen/arch/arm/arm64/entry.S b/xen/arch/arm/arm64/entry.S index 97b05f5..8f28789 100644 --- a/xen/arch/arm/arm64/entry.S +++ b/xen/arch/arm/arm64/entry.S @@ -195,7 +195,6 @@ hyp_error_invalid: hyp_error: entry hyp=1 - msr daifclr, #2 mov x0, sp bl do_trap_hyp_serror exit hyp=1 @@ -203,7 +202,7 @@ hyp_error: /* Traps taken in Current EL with SP_ELx */ hyp_sync: entry hyp=1 - msr daifclr, #6 + msr daifclr, #4 mov x0, sp bl do_trap_hyp_sync exit hyp=1 @@ -304,7 +303,7 @@ guest_sync_slowpath: ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f", "nop; nop", SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT) - msr daifclr, #6 + msr daifclr, #4 mov x0, sp bl do_trap_guest_sync 1: @@ -332,7 +331,7 @@ guest_fiq_invalid: guest_error: entry hyp=0, compat=0 - msr daifclr, #6 + msr daifclr, #4 mov x0, sp bl do_trap_guest_serror exit hyp=0, compat=0 @@ -347,7 +346,7 @@ guest_sync_compat: ALTERNATIVE("bl check_pending_vserror; cbnz x0, 1f", "nop; nop", SKIP_SYNCHRONIZE_SERROR_ENTRY_EXIT) - msr daifclr, #6 + msr daifclr, #4 mov x0, sp bl do_trap_guest_sync 1: @@ -375,7 +374,7 @@ guest_fiq_invalid_compat: guest_error_compat: entry hyp=0, compat=1 - msr daifclr, #6 + msr daifclr, #4 mov x0, sp bl do_trap_guest_serror exit hyp=0, compat=1 diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c index 88ffeeb..18355e9 100644 --- a/xen/arch/arm/traps.c +++ b/xen/arch/arm/traps.c @@ -2038,6 +2038,8 @@ static void enter_hypervisor_head(struct cpu_user_regs *regs) { struct vcpu *v = current; + ASSERT(!local_irq_is_enabled()); + /* If the guest has disabled the workaround, bring it back on. */ if ( needs_ssbd_flip(v) ) arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 1, NULL); @@ -2072,6 +2074,7 @@ void do_trap_guest_sync(struct cpu_user_regs *regs) const union hsr hsr = { .bits = regs->hsr }; enter_hypervisor_head(regs); + local_irq_enable(); switch ( hsr.ec ) { @@ -2207,6 +2210,7 @@ void do_trap_hyp_sync(struct cpu_user_regs *regs) const union hsr hsr = { .bits = regs->hsr }; enter_hypervisor_head(regs); + local_irq_enable(); switch ( hsr.ec ) { @@ -2245,6 +2249,7 @@ void do_trap_hyp_sync(struct cpu_user_regs *regs) void do_trap_hyp_serror(struct cpu_user_regs *regs) { enter_hypervisor_head(regs); + local_irq_enable(); __do_trap_serror(regs, VABORT_GEN_BY_GUEST(regs)); } @@ -2252,6 +2257,7 @@ void do_trap_hyp_serror(struct cpu_user_regs *regs) void do_trap_guest_serror(struct cpu_user_regs *regs) { enter_hypervisor_head(regs); + local_irq_enable(); __do_trap_serror(regs, true); }