intel_idle: add CPUIDLE_FLAG_IRQ_ENABLE to SPR C1 and C1E

Message ID 20220630194309.40465-1-jon@nutanix.com
State New
Series intel_idle: add CPUIDLE_FLAG_IRQ_ENABLE to SPR C1 and C1E

Commit Message

Jon Kohler June 30, 2022, 7:43 p.m. UTC
Add CPUIDLE_FLAG_IRQ_ENABLE to spr_cstates C1 and C1E, which will
allow local IRQs to be enabled during fast idle transitions on SPR.

Note: Enabling this for both C1 and C1E is slightly different from
the approach for SKX/ICX, where CPUIDLE_FLAG_IRQ_ENABLE is only
enabled on C1; however, given that SPR exit latency/target residency
is 1/1 for C1 and 2/4 for C1E, respectively, which is slower than C1
for SKX, it seems prudent to now enable it on both states.

This is also important as on SPR it is possible for only C1 or
only C1E to be enabled (i.e. one of them would be disabled), so
only enabling C1 would shortchange C1E-only configurations.

Fixes: 9edf3c0ffef0 ("intel_idle: add SPR support")
Signed-off-by: Jon Kohler <jon@nutanix.com>
---
 drivers/idle/intel_idle.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Artem Bityutskiy July 1, 2022, 1:30 p.m. UTC | #1
Hi Jon,

On Thu, 2022-06-30 at 15:43 -0400, Jon Kohler wrote:
> Add CPUIDLE_FLAG_IRQ_ENABLE to spr_cstates C1 and C1E, which will
> allow local IRQs to be enabled during fast idle transitions on SPR.

Did you have a chance to measure this? When I was doing this for ICX and CLX, I
was using cyclictest and wult for measuring IRQ latency.

I was planning to do this for SPR as well.

> Note: Enabling this for both C1 and C1E is slightly different from
> the approach for SKX/ICX, where CPUIDLE_FLAG_IRQ_ENABLE is only
> enabled on C1; however, given that SPR exit latency/target residency
> is 1/1 for C1 and 2/4 for C1E, respectively, which is slower than C1
> for SKX, it seems prudent to now enable it on both states.

I was also going to measure this for C1E.

Could we please hold on this a bit - I'd like to measure this before we merge
it.

Artem.
Jon Kohler July 1, 2022, 2:06 p.m. UTC | #2
> On Jul 1, 2022, at 9:30 AM, Artem Bityutskiy <artem.bityutskiy@linux.intel.com> wrote:
> 
> Hi Jon,
> 
> On Thu, 2022-06-30 at 15:43 -0400, Jon Kohler wrote:
>> Add CPUIDLE_FLAG_IRQ_ENABLE to spr_cstates C1 and C1E, which will
>> allow local IRQs to be enabled during fast idle transitions on SPR.
> 
> Did you have a chance to measure this? When I was doing this for ICX and CLX, I
> was using cyclictest and wult for measuring IRQ latency.
> 
> I was planning to do this for SPR as well.

We have the ‘before’ baseline from wult, and realized after doing it that
IRQ_ENABLE config wasn’t set. I’ve provided this patch to our internal
team working on SPR enablement to get another wult run in next week.

That said, if you’ve got access to an SPR system set up as well, we’d
certainly appreciate a second set of eyes. This is the first generation
of enablement for a new platform that we’ve done where wult has been
on the ‘checklist’ so to speak, so we don’t have as much ‘stick time’
on it as someone like yourself would :)

> 
>> Note: Enabling this for both C1 and C1E is slightly different from
>> the approach for SKX/ICX, where CPUIDLE_FLAG_IRQ_ENABLE is only
>> enabled on C1; however, given that SPR exit latency/target residency
>> is 1/1 for C1 and 2/4 for C1E, respectively, which is slower than C1
>> for SKX, it seems prudent to now enable it on both states.
> 
> I was also going to measure this for C1E.
> 
> Could we please hold on this a bit - I'd like to measure this before we merge
> it.

Yea no problem, happy to get help and a second set of eyes on this.

Thanks - Jon

> 
> Artem.
>
Jon Kohler Nov. 10, 2023, 8 p.m. UTC | #3
> On Jul 1, 2022, at 10:06 AM, Jon Kohler <jon@nutanix.com> wrote:
> 
> 
> 
>> On Jul 1, 2022, at 9:30 AM, Artem Bityutskiy <artem.bityutskiy@linux.intel.com> wrote:
>> 
>> Hi Jon,
>> 
>> On Thu, 2022-06-30 at 15:43 -0400, Jon Kohler wrote:
>>> Add CPUIDLE_FLAG_IRQ_ENABLE to spr_cstates C1 and C1E, which will
>>> allow local IRQs to be enabled during fast idle transitions on SPR.
>> 
>> Did you have a chance to measure this? When I was doing this for ICX and CLX, I
>> was using cyclictest and wult for measuring IRQ latency.
>> 
>> I was planning to do this for SPR as well.
> 
> We have the ‘before’ baseline from wult, and realized after doing it that
> IRQ_ENABLE config wasn’t set. I’ve provided this patch to our internal
> team working on SPR enablement to get another wult run in next week.
> 
> That said, if you’ve got access to an SPR system set up as well, we’d
> certainly appreciate a second set of eyes. This is the first generation
> of enablement for a new platform that we’ve done where wult has been
> on the ‘checklist’ so to speak, so we don’t have as much ‘stick time’
> on it as someone like yourself would :)
> 
>> 
>>> Note: Enabling this for both C1 and C1E is slightly different from
>>> the approach for SKX/ICX, where CPUIDLE_FLAG_IRQ_ENABLE is only
>>> enabled on C1; however, given that SPR exit latency/target residency
>>> is 1/1 for C1 and 2/4 for C1E, respectively, which is slower than C1
>>> for SKX, it seems prudent to now enable it on both states.
>> 
>> I was also going to measure this for C1E.
>> 
>> Could we please hold on this a bit - I'd like to measure this before we merge
>> it.
> 
> Yea no problem, happy to get help and a second set of eyes on this.
> 
> Thanks - Jon

Hey Artem,
Coming back around on this, I realized this fell through the cracks. I was wondering
if you happened to run through this testing on your side already as part of other
efforts? 

If not, I’ll see if we can get it spun back up on our side. 

Thanks,
Jon

> 
>> 
>> Artem.
Artem Bityutskiy Nov. 12, 2023, 11:52 a.m. UTC | #4
On Fri, 2023-11-10 at 20:00 +0000, Jon Kohler wrote:
> CPUIDLE_FLAG_IRQ_ENABLE

Hi, yes, I did run several experiments, and found that this change made
some microbenchmarks score worse. I did a lot of repetitions, assuming I
was mixing noise with signal, but every time confirmed that enabling interrupts
in C1 made the score worse (by about 1%).

I do not have an explanation for this phenomenon, but decided not to pursue this
idea.

Artem.
Jon Kohler Nov. 13, 2023, 3:20 p.m. UTC | #5
> On Nov 12, 2023, at 6:52 AM, Artem Bityutskiy <artem.bityutskiy@linux.intel.com> wrote:
> 
> On Fri, 2023-11-10 at 20:00 +0000, Jon Kohler wrote:
>> CPUIDLE_FLAG_IRQ_ENABLE
> 
> Hi, yes, I did run several experiments, and found that this change made
> some microbenchmarks score worse. I did a lot of repetitions, assuming I
> was mixing noise with signal, but every time confirmed that enabling interrupts
> in C1 made the score worse (by about 1%).
> 
> I do not have an explanation for this phenomenon, but decided not to pursue this
> idea.
> 
> Artem. 

Thanks for the follow-up. Indeed, it is strange that it would make things worse, but it
is good to know that the shipping configuration is going to work out best.

Thanks,
Jon

Patch

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 424ef470223d..f51857cddf2b 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -893,7 +893,7 @@  static struct cpuidle_state spr_cstates[] __initdata = {
 	{
 		.name = "C1",
 		.desc = "MWAIT 0x00",
-		.flags = MWAIT2flg(0x00),
+		.flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE,
 		.exit_latency = 1,
 		.target_residency = 1,
 		.enter = &intel_idle,
@@ -902,7 +902,8 @@  static struct cpuidle_state spr_cstates[] __initdata = {
 		.name = "C1E",
 		.desc = "MWAIT 0x01",
 		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE |
-					   CPUIDLE_FLAG_UNUSABLE,
+					   CPUIDLE_FLAG_UNUSABLE  |
+					   CPUIDLE_FLAG_IRQ_ENABLE,
 		.exit_latency = 2,
 		.target_residency = 4,
 		.enter = &intel_idle,