[2/2] sched/fair: Improve the for loop in select_idle_core()

Message ID 6b165676325a47d67e667582a7b78da85c5c118a.1549536337.git.viresh.kumar@linaro.org
State New
Series
  • [1/2] sched/fair: Don't pass sd to select_idle_smt()

Commit Message

Viresh Kumar Feb. 7, 2019, 10:46 a.m.
Once a non-idle thread is found for a core, there is no point in
traversing the rest of the threads of that core. Currently we continue
the traversal only to clear those threads from the "cpus" mask. Clear
all the threads with a single call to cpumask_andnot() instead, which
also lets us exit the loop earlier.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

---
 kernel/sched/fair.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

-- 
2.20.1.321.g9e740568ce00

Comments

Peter Zijlstra Feb. 11, 2019, 9:30 a.m. | #1
On Thu, Feb 07, 2019 at 04:16:06PM +0530, Viresh Kumar wrote:
> @@ -6081,10 +6082,14 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
>  	for_each_cpu_wrap(core, cpus, target) {
>  		bool idle = true;
>  
> -		for_each_cpu(cpu, cpu_smt_mask(core)) {
> -			cpumask_clear_cpu(cpu, cpus);
> -			if (!available_idle_cpu(cpu))
> +		smt = cpu_smt_mask(core);
> +		cpumask_andnot(cpus, cpus, smt);

So where the previous code was like 1-2 stores, you just added 16.

(assuming 64bit and NR_CPUS=1024)

And we still do the iteration anyway:

> +		for_each_cpu(cpu, smt) {
> +			if (!available_idle_cpu(cpu)) {
>  				idle = false;
> +				break;
> +			}
>  		}

An actual improvement would've been:

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 38d4669aa2ef..2d352d6d15c7 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6082,7 +6082,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
 		bool idle = true;
 
 		for_each_cpu(cpu, cpu_smt_mask(core)) {
-			cpumask_clear_cpu(cpu, cpus);
+			__cpumask_clear_cpu(cpu, cpus);
 			if (!available_idle_cpu(cpu))
 				idle = false;
 		}
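
The one-line change above swaps the atomic bit clear for its non-atomic counterpart: in the kernel, cpumask_clear_cpu() is built on clear_bit(), an atomic read-modify-write, while __cpumask_clear_cpu() is built on __clear_bit(), a plain one. A rough userspace sketch of that difference follows. It assumes C11 atomics as a stand-in for the kernel primitives, and the sketch_* functions and SKETCH_* constants are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical sizes for illustration only; not taken from the kernel. */
#define SKETCH_NR_BITS 128
#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_NR_WORDS \
	((SKETCH_NR_BITS + SKETCH_BITS_PER_WORD - 1) / SKETCH_BITS_PER_WORD)

/* Non-atomic clear: an ordinary load, mask and store on one word. */
static void sketch_clear_bit_nonatomic(unsigned int nr, unsigned long *map)
{
	assert(nr < SKETCH_NR_BITS);
	map[nr / SKETCH_BITS_PER_WORD] &= ~(1UL << (nr % SKETCH_BITS_PER_WORD));
}

/* Atomic clear: the same update done as one atomic read-modify-write. */
static void sketch_clear_bit_atomic(unsigned int nr, _Atomic unsigned long *map)
{
	assert(nr < SKETCH_NR_BITS);
	atomic_fetch_and(&map[nr / SKETCH_BITS_PER_WORD],
			 ~(1UL << (nr % SKETCH_BITS_PER_WORD)));
}
```

Both helpers leave the bitmap in the same state; they differ only in whether the update is safe against concurrent writers. Since "cpus" in select_idle_core() comes from this_cpu_cpumask_var_ptr(select_idle_mask), it is presumably written only by the local CPU, which is what would make the non-atomic variant sufficient there.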
Viresh Kumar Feb. 11, 2019, 10:26 a.m. | #2
On 11-02-19, 10:30, Peter Zijlstra wrote:
> On Thu, Feb 07, 2019 at 04:16:06PM +0530, Viresh Kumar wrote:
> > @@ -6081,10 +6082,14 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
> >  	for_each_cpu_wrap(core, cpus, target) {
> >  		bool idle = true;
> >  
> > -		for_each_cpu(cpu, cpu_smt_mask(core)) {
> > -			cpumask_clear_cpu(cpu, cpus);
> > -			if (!available_idle_cpu(cpu))
> > +		smt = cpu_smt_mask(core);
> > +		cpumask_andnot(cpus, cpus, smt);
> 
> So where the previous code was like 1-2 stores, you just added 16.

Is the max number of possible threads per core just 2? That's what I
read just now and I wasn't aware of that earlier. This commit doesn't
improve anything then. Sorry for the noise.

-- 
viresh
Peter Zijlstra Feb. 11, 2019, 10:44 a.m. | #3
On Mon, Feb 11, 2019 at 03:56:59PM +0530, Viresh Kumar wrote:
> On 11-02-19, 10:30, Peter Zijlstra wrote:
> > On Thu, Feb 07, 2019 at 04:16:06PM +0530, Viresh Kumar wrote:
> > > @@ -6081,10 +6082,14 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
> > >  	for_each_cpu_wrap(core, cpus, target) {
> > >  		bool idle = true;
> > >  
> > > -		for_each_cpu(cpu, cpu_smt_mask(core)) {
> > > -			cpumask_clear_cpu(cpu, cpus);
> > > -			if (!available_idle_cpu(cpu))
> > > +		smt = cpu_smt_mask(core);
> > > +		cpumask_andnot(cpus, cpus, smt);
> > 
> > So where the previous code was like 1-2 stores, you just added 16.
> 
> Is the max number of possible threads per core just 2? That's what I
> read just now and I wasn't aware of that earlier. This commit doesn't
> improve anything then. Sorry for the noise.

We've got up to SMT8 in the tree (Sparc64, Power8 and some MIPS IIRC),
but that's still less than having to touch the entire bitmap.

Also, Power9 went back to SMT4 and I think the majority of SMT deployments
is that or less.
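
The store counts discussed in this thread can be reproduced with a small userspace model. The sketch below assumes NR_CPUS=1024 and 64-bit words, as in the review comment, and counts one store per word written. The sketch_* functions and SKETCH_* constants are hypothetical, and the per-bit scan is a simplification of what for_each_cpu() does in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes, matching the "64bit and NR_CPUS=1024" case in the review. */
#define SKETCH_NR_CPUS 1024u
#define SKETCH_BITS_PER_WORD 64u
#define SKETCH_NR_WORDS (SKETCH_NR_CPUS / SKETCH_BITS_PER_WORD)

/* dst &= ~src over the whole mask; returns the number of word stores. */
static unsigned int sketch_andnot(uint64_t *dst, const uint64_t *src)
{
	unsigned int stores = 0;

	assert(dst && src);
	for (unsigned int i = 0; i < SKETCH_NR_WORDS; i++) {
		dst[i] &= ~src[i];
		stores++;
	}
	return stores;
}

/* Clear each bit that is set in src individually; one store per set bit. */
static unsigned int sketch_clear_each(uint64_t *dst, const uint64_t *src)
{
	unsigned int stores = 0;

	assert(dst && src);
	for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		unsigned int word = cpu / SKETCH_BITS_PER_WORD;
		uint64_t bit = UINT64_C(1) << (cpu % SKETCH_BITS_PER_WORD);

		if (src[word] & bit) {
			dst[word] &= ~bit;
			stores++;
		}
	}
	return stores;
}
```

With an SMT2 sibling mask (two bits set), sketch_clear_each() reports 2 stores and sketch_andnot() reports 16, and both leave the destination mask identical. An SMT8 mask gives 8 against 16, so under these assumptions the per-thread clear still writes fewer words than the whole-bitmap operation.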

Patch

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8d5c82342a36..ccd0ae9878a2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6068,6 +6068,7 @@  void __update_idle_core(struct rq *rq)
 static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int target)
 {
 	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
+	const struct cpumask *smt;
 	int core, cpu;
 
 	if (!static_branch_likely(&sched_smt_present))
@@ -6081,10 +6082,14 @@  static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
 	for_each_cpu_wrap(core, cpus, target) {
 		bool idle = true;
 
-		for_each_cpu(cpu, cpu_smt_mask(core)) {
-			cpumask_clear_cpu(cpu, cpus);
-			if (!available_idle_cpu(cpu))
+		smt = cpu_smt_mask(core);
+		cpumask_andnot(cpus, cpus, smt);
+
+		for_each_cpu(cpu, smt) {
+			if (!available_idle_cpu(cpu)) {
 				idle = false;
+				break;
+			}
 		}
 
 		if (idle)