| Message ID | 1409051215-16788-4-git-send-email-vincent.guittot@linaro.org |
|---|---|
| State | Accepted |
| Commit | 65fdac08c264506ff95ee1e34ae066e308c9e6e3 |
Hi Vincent,

On 08/26/2014 04:36 PM, Vincent Guittot wrote:
> The computation of avg_load and avg_load_per_task should only take into
> account the number of CFS tasks. Non-CFS tasks are already taken into
> account by decreasing the cpu's capacity, and they will be tracked in
> the CPU's utilization (group_utilization) added by the next patches.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  kernel/sched/fair.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 87b9dc7..b85e9f7 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4092,7 +4092,7 @@ static unsigned long capacity_of(int cpu)
>  static unsigned long cpu_avg_load_per_task(int cpu)
>  {
>  	struct rq *rq = cpu_rq(cpu);
> -	unsigned long nr_running = ACCESS_ONCE(rq->nr_running);
> +	unsigned long nr_running = ACCESS_ONCE(rq->cfs.h_nr_running);
>  	unsigned long load_avg = rq->cfs.runnable_load_avg;
>
>  	if (nr_running)
> @@ -5985,7 +5985,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>  		load = source_load(i, load_idx);
>
>  		sgs->group_load += load;
> -		sgs->sum_nr_running += rq->nr_running;
> +		sgs->sum_nr_running += rq->cfs.h_nr_running;
>
>  		if (rq->nr_running > 1)
>  			*overload = true;
>

Why do we probe rq->nr_running while we do load balancing? Should we not
be probing cfs_rq->nr_running instead? We are interested, after all, in
load balancing fair tasks, right? The reason I ask is that I was
wondering whether we need to make a change similar to the above in more
places in the load-balancing code.

To cite an example: the above check says a cpu is overloaded when
rq->nr_running > 1. However, if these tasks happen to be rt tasks, we
would not be able to load balance them anyway. So while I was looking
through this patch, I noticed this and wanted to cross-verify whether we
are checking rq->nr_running on purpose in some places in load balancing;
another example being nohz_kick_needed().

Regards
Preeti U Murthy
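For context, a minimal sketch of the two counters this question turns on,
with fields abridged from the v3.17-era kernel/sched/sched.h (the comments
are editorial, not the kernel's own):

/* Abridged sketch; only the fields relevant to this thread are shown. */
struct cfs_rq {
	unsigned int nr_running;   /* entities enqueued directly on this
	                            * cfs_rq, including group entities */
	unsigned int h_nr_running; /* CFS tasks in this hierarchy, counted
	                            * through nested task groups */
};

struct rq {
	unsigned int nr_running;   /* ALL runnable tasks on this cpu:
	                            * CFS + rt + deadline */
	struct cfs_rq cfs;
};

/*
 * Example: a cpu running one CFS task and two rt tasks has
 * rq->nr_running == 3 but rq->cfs.h_nr_running == 1, so dividing the
 * CFS load by rq->nr_running understates the load per fair task by a
 * factor of three.
 */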
On 30 August 2014 14:00, Preeti U Murthy <preeti@linux.vnet.ibm.com> wrote:
> Hi Vincent,
>
> On 08/26/2014 04:36 PM, Vincent Guittot wrote:
>> The computation of avg_load and avg_load_per_task should only take into
>> account the number of CFS tasks.
[...]
>> 		sgs->group_load += load;
>> -		sgs->sum_nr_running += rq->nr_running;
>> +		sgs->sum_nr_running += rq->cfs.h_nr_running;
>>
>> 		if (rq->nr_running > 1)
>> 			*overload = true;
>>
>
> Why do we probe rq->nr_running while we do load balancing? Should we not
> be probing cfs_rq->nr_running instead? We are interested, after all, in
> load balancing fair tasks, right? The reason I ask is that I was
> wondering whether we need to make a change similar to the above in more
> places in the load-balancing code.

Hi Preeti,

Yes, we should probably test rq->cfs.h_nr_running > 0 before setting
overload.

Sorry for the late answer; the email was lost in my messy inbox.

Vincent

> To cite an example: the above check says a cpu is overloaded when
> rq->nr_running > 1. However, if these tasks happen to be rt tasks, we
> would not be able to load balance them anyway.
[...]
On Wed, 2014-09-03 at 13:09 +0200, Vincent Guittot wrote:
> On 30 August 2014 14:00, Preeti U Murthy <preeti@linux.vnet.ibm.com> wrote:
>> Why do we probe rq->nr_running while we do load balancing? Should we not
>> be probing cfs_rq->nr_running instead?
[...]
>
> Yes, we should probably test rq->cfs.h_nr_running > 0 before setting
> overload.

The overload indicator is used to know when we can totally avoid load
balancing to a cpu that is about to go idle. We can avoid load balancing
when no cpu has more than one task. So if you have, say, just one fair
task and multiple deadline tasks on a cpu, and another cpu about to go
idle, you should turn on normal load balancing in the idle path by
setting overload to true.

So overload should be set based on rq->nr_running and not on
rq->cfs.h_nr_running.

Thanks.

Tim
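For reference, the consumer of the overload flag Tim describes is the
newly-idle balance path; a simplified sketch, assuming the v3.17-era
idle_balance() in kernel/sched/fair.c (details and statistics omitted):

static int idle_balance(struct rq *this_rq)
{
	/* ... */

	/*
	 * Bail out early: skip balancing entirely if this cpu was idle
	 * for only a very short time, or if no runqueue in the root
	 * domain is flagged as having more than one runnable task.
	 */
	if (this_rq->avg_idle < sysctl_sched_migration_cost ||
	    !this_rq->rd->overload)
		goto out;

	/* ... otherwise run the normal newly-idle load balance ... */
out:
	return 0;
}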
On 4 September 2014 01:43, Tim Chen <tim.c.chen@linux.intel.com> wrote:
> On Wed, 2014-09-03 at 13:09 +0200, Vincent Guittot wrote:
>> Yes, we should probably test rq->cfs.h_nr_running > 0 before setting
>> overload.
[...]
>
> The overload indicator is used to know when we can totally avoid load
> balancing to a cpu that is about to go idle. We can avoid load balancing
> when no cpu has more than one task. So if you have, say, just one fair
> task and multiple deadline tasks on a cpu, and another cpu about to go
> idle, you should turn on normal load balancing in the idle path by
> setting overload to true.

The newly idle load balancing can only affect CFS tasks, so triggering
a load_balance because a cpu is overloaded by rt tasks only will not
change anything.

> So overload should be set based on rq->nr_running and not on
> rq->cfs.h_nr_running.

We should probably use both values, like below:

	if ((rq->nr_running > 1) && (rq->cfs.h_nr_running > 0))

Regards,
Vincent
On Thu, 2014-09-04 at 09:17 +0200, Vincent Guittot wrote:
> On 4 September 2014 01:43, Tim Chen <tim.c.chen@linux.intel.com> wrote:
>> The overload indicator is used to know when we can totally avoid load
>> balancing to a cpu that is about to go idle.
[...]
>
> The newly idle load balancing can only affect CFS tasks, so triggering
> a load_balance because a cpu is overloaded by rt tasks only will not
> change anything.
>
>> So overload should be set based on rq->nr_running and not on
>> rq->cfs.h_nr_running.
>
> We should probably use both values, like below:
>
> 	if ((rq->nr_running > 1) && (rq->cfs.h_nr_running > 0))

Yes, this modification is the correct one; it takes care of the
condition I objected to previously.

Thanks.

Tim
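Put together, the hunk in update_sg_lb_stats() would read as below. This
is a sketch of the condition agreed on in this subthread, not part of the
patch as merged:

		sgs->group_load += load;
		sgs->sum_nr_running += rq->cfs.h_nr_running;

		/*
		 * More than one runnable task, at least one of which is
		 * a CFS task that a newly idle cpu could actually pull.
		 */
		if (rq->nr_running > 1 && rq->cfs.h_nr_running > 0)
			*overload = true;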
On 08/26/2014 04:36 PM, Vincent Guittot wrote:
> The computation of avg_load and avg_load_per_task should only take into
> account the number of CFS tasks.
[...]
> @@ -5985,7 +5985,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>  		load = source_load(i, load_idx);
>
>  		sgs->group_load += load;
> -		sgs->sum_nr_running += rq->nr_running;
> +		sgs->sum_nr_running += rq->cfs.h_nr_running;

Yes, this was one of the concerns I had around the usage of
rq->nr_running. Looks good to me.

>
>  		if (rq->nr_running > 1)
>  			*overload = true;
>

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 87b9dc7..b85e9f7 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4092,7 +4092,7 @@ static unsigned long capacity_of(int cpu)
 static unsigned long cpu_avg_load_per_task(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
-	unsigned long nr_running = ACCESS_ONCE(rq->nr_running);
+	unsigned long nr_running = ACCESS_ONCE(rq->cfs.h_nr_running);
 	unsigned long load_avg = rq->cfs.runnable_load_avg;
 
 	if (nr_running)
@@ -5985,7 +5985,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 		load = source_load(i, load_idx);
 
 		sgs->group_load += load;
-		sgs->sum_nr_running += rq->nr_running;
+		sgs->sum_nr_running += rq->cfs.h_nr_running;
 
 		if (rq->nr_running > 1)
 			*overload = true;
The computation of avg_load and avg_load_per_task should only take into
account the number of CFS tasks. Non-CFS tasks are already taken into
account by decreasing the cpu's capacity, and they will be tracked in
the CPU's utilization (group_utilization) added by the next patches.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
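For reference, cpu_avg_load_per_task() as it reads with this patch
applied; the tail of the function is reconstructed from the v3.17-era
fair.c around the diff's context lines, so treat it as a sketch rather
than an exact excerpt:

static unsigned long cpu_avg_load_per_task(int cpu)
{
	struct rq *rq = cpu_rq(cpu);
	/*
	 * Count only CFS tasks; rt and deadline tasks are accounted
	 * for via the cpu's reduced capacity instead.
	 */
	unsigned long nr_running = ACCESS_ONCE(rq->cfs.h_nr_running);
	unsigned long load_avg = rq->cfs.runnable_load_avg;

	if (nr_running)
		return load_avg / nr_running;

	return 0;
}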