@@ -296,17 +296,6 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
 		idx = i; /* first enabled state */
 		if (s->target_residency_ns > predicted_ns) {
-			/*
-			 * Use a physical idle state, not busy polling, unless
-			 * a timer is going to trigger soon enough.
-			 */
-			if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) &&
-			    s->exit_latency_ns <= latency_req &&
-			    s->target_residency_ns <= data->next_timer_ns) {
-				predicted_ns = s->target_residency_ns;
-				idx = i;
-				break;
-			}
 			if (predicted_ns < TICK_NSEC)
 				break;
Avoid selecting a deep idle state when the predicted idle duration is
shorter than its target residency, as this leads to unnecessary state
transitions without energy savings.

On virtualized PowerPC (pseries) systems, where only one polling state
(Snooze) and one deep state (CEDE) are available, selecting CEDE when
its target residency exceeds the predicted idle duration hurts
performance.

For example, if the predicted idle duration is 15 us and the first
non-polling state has a target residency of 120 us, selecting it
would be suboptimal.

Remove the condition introduced in commit 69d25870f20c ("cpuidle: fix
the menu governor to boost IO performance") that prioritized non-polling
states even when their target residency exceeded the predicted idle
duration and allow polling states to be selected when appropriate.

Performance improvement observed with pgbench on a PowerPC (pseries)
system:

+---------------------------+------------+------------+------------+
| Metric                    | Baseline   | Patched    | Change (%) |
+---------------------------+------------+------------+------------+
| Transactions/sec (TPS)    | 494,834    | 538,707    | +8.85%     |
| Avg latency (ms)          | 0.162      | 0.149      | -8.02%     |
+---------------------------+------------+------------+------------+

CPUIdle state usage:

+--------------+--------------+-------------+
| Metric       | Baseline     | Patched     |
+--------------+--------------+-------------+
| Total usage  | 12,703,630   | 13,941,966  |
| Above usage  | 11,388,990   | 1,620,474   |
| Below usage  | 19,973       | 684,708     |
+--------------+--------------+-------------+

Above/Total and Below/Total usage percentages:

+------------------------+-----------+---------+
| Metric                 | Baseline  | Patched |
+------------------------+-----------+---------+
| Above % (Above/Total)  | 89.67%    | 11.63%  |
| Below % (Below/Total)  | 0.16%     | 4.91%   |
| Total cpuidle miss (%) | 89.83%    | 16.54%  |
+------------------------+-----------+---------+

Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
---

v1: https://lore.kernel.org/all/20240809073120.250974-1-aboorvad@linux.ibm.com/

v1 -> v2:

- Drop cover letter and improve commit message.
---
 drivers/cpuidle/governors/menu.c | 11 -----------
 1 file changed, 11 deletions(-)
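
For illustration only, the effect of the removed check on the 15 us / 120 us
example can be modelled in user space. This is a rough sketch, not the
governor code: struct sketch_state, sketch_select(), the sketch_pseries table
and the 10 us CEDE exit latency are all hypothetical, and tick handling and
disabled states are deliberately ignored. The prefer_physical argument mimics
the condition dropped by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct cpuidle_state. */
struct sketch_state {
	const char *name;
	bool polling;			/* models CPUIDLE_FLAG_POLLING */
	uint64_t exit_latency_ns;
	uint64_t target_residency_ns;
};

/*
 * Assumed two-state table resembling pseries. Only the 120 us target
 * residency comes from the commit message; the other values are made up.
 */
static const struct sketch_state sketch_pseries[] = {
	{ "snooze", true,  0,     0      },
	{ "CEDE",   false, 10000, 120000 },
};

/*
 * Return the index of the deepest state whose target residency fits the
 * predicted idle duration and whose exit latency meets the requirement.
 * With prefer_physical set, a polling candidate is replaced by the next
 * state even though its target residency exceeds the prediction, as long
 * as it fits before the next timer (the behaviour this patch removes).
 */
static size_t sketch_select(const struct sketch_state *states, size_t n,
			    uint64_t predicted_ns, uint64_t next_timer_ns,
			    uint64_t latency_req_ns, bool prefer_physical)
{
	size_t idx = 0;	/* state 0 is assumed to be always enabled */

	assert(states != NULL && n > 0);

	for (size_t i = 1; i < n; i++) {
		const struct sketch_state *s = &states[i];

		if (s->target_residency_ns > predicted_ns) {
			if (prefer_physical && states[idx].polling &&
			    s->exit_latency_ns <= latency_req_ns &&
			    s->target_residency_ns <= next_timer_ns)
				idx = i;
			break;
		}
		if (s->exit_latency_ns > latency_req_ns)
			break;
		idx = i;
	}
	return idx;
}
```

Under these assumptions, calling sketch_select(sketch_pseries, 2, 15000,
1000000, UINT64_MAX, true) picks CEDE (index 1), while the same call with
prefer_physical set to false stays in snooze (index 0). That matches the
drop in "above" misses reported in the tables.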