[LKP,lkp-developer,sched/fair] 4e5160766f: +149% ftq.noise.50% regression

Message ID 20161222151215.GA23448@linaro.org
State New
Headers show

Commit Message

Vincent Guittot Dec. 22, 2016, 3:12 p.m.
Le Tuesday 13 Dec 2016 à 09:47:30 (+0800), Huang, Ying a écrit :
> Hi, Vincent,

> 

> Vincent Guittot <vincent.guittot@linaro.org> writes:

> 

> > Hi Ying,

> >

> > On 12 December 2016 at 06:43, kernel test robot

> > <ying.huang@linux.intel.com> wrote:

> >> Greeting,

> >>

> >> FYI, we noticed a 149% regression of ftq.noise.50% due to commit:

> >>

> >>

> >> commit: 4e5160766fcc9f41bbd38bac11f92dce993644aa ("sched/fair: Propagate asynchrous detach")

> >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master

> >>

> >> in testcase: ftq

> >> on test machine: 8 threads Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz with 8G memory

> >> with following parameters:

> >>

> >>         nr_task: 100%

> >>         samples: 6000ss

> >>         test: cache

> >>         freq: 20

> >>         cpufreq_governor: powersave

> >

> > Why using powersave ? Are you testing  every governors ?

> 

> We will test performance and powersave governor for FTQ.


Ok thanks

> 

> >>

> >> test-description: The FTQ benchmarks measure hardware and software interference or 'noise' on a node from the applications perspective.

> >> test-url: https://github.com/rminnich/ftq

> >

> > It's a bit difficult to understand exactly what is measured and what

> > is ftq.noise.50% because this result is not part of the bench which

> > seems to only record a log of data in a file and ftq.noise.50% seems

> > to be lkp specific

> 

> Yes. FTQ itself has no noise statistics builtin, although it is an OS

> noise benchmark.  ftq.noise.50% is calculated as below:

> 

> There is a score for every sample of ftq.  The lower the score, the

> higher the noises.  ftq.noise.50% is the number (per 1000000 samples) of

> samples whose score is less than 50% of the mean score.

> 


ok so IIUC we have moved from 0.03% to 0.11% for ftq.noise.50%

I have not been able to reproduce the regression on the different system that I have access to so I can only guess the root cause of the regression.

Could it be possible to test if the patch below fix the regression ?


---
 kernel/sched/fair.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

-- 
2.7.4

Thanks


> Best Regards,

> Huang, Ying

> 

> > I have tried to reproduce the lkp test on a debian jessie then a

> > ubuntu server 16.10 but lkp doesn't seems to install cleanly as there

> > are some errors:

> >

> > sudo bin/lkp run     job.yaml

> > IPMI BMC is not supported on this machine, skip bmc-watchdog setup!

> > 2016-12-12 13:58:39 ./ftq_cache -f 20 -n 6000 -t 8 -a 524288

> > Start 5088418680237 end 5438443372098 elapsed 350024691861

> > cyclestart 14236344834332 cycleend 15214154208877 elapsed 977809374545

> > Avg Cycles(ticks) per ns. is 2.793544; nspercycle is 0.357968

> > Pre-computed ticks per ns: 2.793541

> > Sample frequency is 20.000000

> > ticks per ns 2.79354

> > chown: utilisateur incorrect: «lkp.lkp»

> > chown: utilisateur incorrect: «lkp.lkp»

> > wait for background monitors: 9405 9407 oom-killer nfs-hang

> > curl: (6) Could not resolve host: ftq.time

> >

> >

> >>

> >> In addition to that, the commit also has significant impact on the following tests:

> >>

> >> +------------------+--------------------------------------------------------------------------------+

> >> | testcase: change | unixbench: unixbench.score 2.7% improvement                                    |

> >> | test machine     | 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory               |

> >> | test parameters  | cpufreq_governor=performance                                                   |

> >> |                  | nr_task=100%                                                                   |

> >> |                  | runtime=300s                                                                   |

> >> |                  | test=execl                                                                     |

> >> +------------------+--------------------------------------------------------------------------------+

> >>

> >>

> >> Details are as below:

> >> -------------------------------------------------------------------------------------------------->

> >>

> >>

> >> To reproduce:

> >>

> >>         git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git

> >>         cd lkp-tests

> >>         bin/lkp install job.yaml  # job file is attached in this email

> >>         bin/lkp run     job.yaml

> >>

> >> testcase/path_params/tbox_group/run: ftq/100%-6000ss-cache-20-powersave/lkp-hsw-d01

> >>

> >> 09a43ace1f986b00  4e5160766fcc9f41bbd38bac11

> >> ----------------  --------------------------

> >>          %stddev      change         %stddev

> >>              \          |                \

> >>        305 ± 30%       260%       1100 ± 14%  ftq.noise.75%

> >>       1386 ± 19%       149%       3457 ±  7%  ftq.noise.50%

> >>       2148 ± 11%        98%       4257 ±  4%  ftq.noise.25%

> >>    3963589                     3898578        ftq.time.involuntary_context_switches

> >>

> >>

> >>

> >>                                    ftq.noise.50_

> >>

> >>   4000 ++------------O------------------------------------------------------+

> >>        |                                                           O      O |

> >>   3500 ++     O             O                        O    O O O             O

> >>        | O  O      O   O      O O  O O O    O O    O   O         O          |

> >>        O        O                         O                          O O    |

> >>   3000 ++                                       O                           |

> >>        |                 O                                                  |

> >>   2500 ++                                                                   |

> >>        |                                                                    |

> >>   2000 ++                                                                   |

> >>        |    *                  .*                                           |

> >>        |   + :     *   *      *  +                                          |

> >>   1500 ++ +  :    + + + +    :    + .*                                      |

> >>        |.*    *. +   *   *.. :     *  +                                     |

> >>   1000 *+-------*-----------*----------*------------------------------------+

> >>

> >>         [*] bisect-good sample

> >>         [O] bisect-bad  sample

> >>

> >>

> >> Disclaimer:

> >> Results have been estimated based on internal Intel analysis and are provided

> >> for informational purposes only. Any difference in system hardware or software

> >> design or configuration may affect actual performance.

> >>

> >>

> >> Thanks,

> >> Ying Huang

> > _______________________________________________

> > LKP mailing list

> > LKP@lists.01.org

> > https://lists.01.org/mailman/listinfo/lkp

Comments

Huang\, Ying Dec. 28, 2016, 8:17 a.m. | #1
Vincent Guittot <vincent.guittot@linaro.org> writes:

> Le Tuesday 13 Dec 2016 . 09:47:30 (+0800), Huang, Ying a .crit :

>> Hi, Vincent,

>> 

>> Vincent Guittot <vincent.guittot@linaro.org> writes:

>> 

>> > Hi Ying,

>> >

>> > On 12 December 2016 at 06:43, kernel test robot

>> > <ying.huang@linux.intel.com> wrote:

>> >> Greeting,

>> >>

>> >> FYI, we noticed a 149% regression of ftq.noise.50% due to commit:

>> >>

>> >>

>> >> commit: 4e5160766fcc9f41bbd38bac11f92dce993644aa ("sched/fair: Propagate asynchrous detach")

>> >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master

>> >>

>> >> in testcase: ftq

>> >> on test machine: 8 threads Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz with 8G memory

>> >> with following parameters:

>> >>

>> >>         nr_task: 100%

>> >>         samples: 6000ss

>> >>         test: cache

>> >>         freq: 20

>> >>         cpufreq_governor: powersave

>> >

>> > Why using powersave ? Are you testing  every governors ?

>> 

>> We will test performance and powersave governor for FTQ.

>

> Ok thanks

>

>> 

>> >>

>> >> test-description: The FTQ benchmarks measure hardware and software interference or 'noise' on a node from the applications perspective.

>> >> test-url: https://github.com/rminnich/ftq

>> >

>> > It's a bit difficult to understand exactly what is measured and what

>> > is ftq.noise.50% because this result is not part of the bench which

>> > seems to only record a log of data in a file and ftq.noise.50% seems

>> > to be lkp specific

>> 

>> Yes. FTQ itself has no noise statistics builtin, although it is an OS

>> noise benchmark.  ftq.noise.50% is calculated as below:

>> 

>> There is a score for every sample of ftq.  The lower the score, the

>> higher the noises.  ftq.noise.50% is the number (per 1000000 samples) of

>> samples whose score is less than 50% of the mean score.

>> 

>

> ok so IIUC we have moved from 0.03% to 0.11% for ftq.noise.50%

>

> I have not been able to reproduce the regression on the different system that I have access to so I can only guess the root cause of the regression.

>

> Could it be possible to test if the patch below fix the regression ?

>

>

> ---

>  kernel/sched/fair.c | 29 ++++++++++++++++++++++++++++-

>  1 file changed, 28 insertions(+), 1 deletion(-)

>

> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c

> index 090a9bb..8efa113 100644

> --- a/kernel/sched/fair.c

> +++ b/kernel/sched/fair.c

> @@ -3138,6 +3138,31 @@ static inline int propagate_entity_load_avg(struct sched_entity *se)

>  	return 1;

>  }

>  

> +/* Check if we need to update the load and the utilization of a group_entity */

> +static inline bool skip_blocked_update(struct sched_entity *se)

> +{

> +	struct cfs_rq *gcfs_rq = group_cfs_rq(se);

> +

> +	/*

> +	 * If sched_entity still have not null load or utilization, we have to

> +	 * decay it.

> +	 */

> +	if (se->avg.load_avg || se->avg.util_avg)

> +		return false;

> +

> +	/*

> +	 * If there is a pending propagation, we have to update the load and

> +	 * the utilizaion of the sched_entity

> +	 */

> +	if (gcfs_rq->propagate_avg)

> +		return false;

> +

> +	/*

> +	 * Other wise, the load and the utilizaiton of the sched_entity is

> +	 * already null so it will be a waste of time to try to decay it

> +	 */

> +	return true;

> +}

>  #else /* CONFIG_FAIR_GROUP_SCHED */

>  

>  static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force) {}

> @@ -6858,6 +6883,7 @@ static void update_blocked_averages(int cpu)

>  {

>  	struct rq *rq = cpu_rq(cpu);

>  	struct cfs_rq *cfs_rq;

> +	struct sched_entity *se;

>  	unsigned long flags;

>  

>  	raw_spin_lock_irqsave(&rq->lock, flags);

> @@ -6876,7 +6902,8 @@ static void update_blocked_averages(int cpu)

>  			update_tg_load_avg(cfs_rq, 0);

>  

>  		/* Propagate pending load changes to the parent */

> -		if (cfs_rq->tg->se[cpu])

> +		se = cfs_rq->tg->se[cpu];

> +		if (se && !skip_blocked_update(se))

>  			update_load_avg(cfs_rq->tg->se[cpu], 0);

>  	}

>  	raw_spin_unlock_irqrestore(&rq->lock, flags);


The test result is as follow,

=========================================================================================
compiler/cpufreq_governor/freq/kconfig/nr_task/rootfs/samples/tbox_group/test/testcase:
  gcc-6/powersave/20/x86_64-rhel-7.2/100%/debian-x86_64-2016-08-31.cgz/6000ss/lkp-hsw-d01/cache/ftq

commit: 
  4e5160766fcc9f41bbd38bac11f92dce993644aa: first bad commit
  09a43ace1f986b003c118fdf6ddf1fd685692d49: parent of first bad commit
  0613870ea53a7a279d8d37f2a3ce40aafc155fc8: debug commit with above patch

4e5160766fcc9f41 09a43ace1f986b003c118fdf6d 0613870ea53a7a279d8d37f2a3 
---------------- -------------------------- -------------------------- 
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \  
     61670 ±228%     -96.5%       2148 ± 11%     -94.7%       3281 ± 58%  ftq.noise.25%
      3463 ± 10%     -60.0%       1386 ± 19%     -26.3%       2552 ± 58%  ftq.noise.50%
      1116 ± 23%     -72.6%     305.99 ± 30%     -35.8%     716.15 ± 64%  ftq.noise.75%
   3843815 ±  3%      +3.1%    3963589 ±  1%     -49.6%    1938221 ±100%  ftq.time.involuntary_context_switches
      5.33 ± 30%     +21.4%       6.46 ± 14%     -71.7%       1.50 ±108%  time.system_time


It appears that the system_time and involuntary_context_switches reduced
much after applied the debug patch, which is good from noise point of
view.  ftq.noise.50% reduced compared with the first bad commit, but
have not restored to that of the parent of the first bad commit.

Best Regards,
Huang, Ying
Vincent Guittot Jan. 2, 2017, 3:42 p.m. | #2
Hi Ying,

On 28 December 2016 at 09:17, Huang, Ying <ying.huang@intel.com> wrote:
> Vincent Guittot <vincent.guittot@linaro.org> writes:

>

>> Le Tuesday 13 Dec 2016 . 09:47:30 (+0800), Huang, Ying a .crit :

>>> Hi, Vincent,

>>>

>>> Vincent Guittot <vincent.guittot@linaro.org> writes:

>>>

>>> > Hi Ying,

>>> >

>>> > On 12 December 2016 at 06:43, kernel test robot

>>> > <ying.huang@linux.intel.com> wrote:

>>> >> Greeting,

>>> >>

>>> >> FYI, we noticed a 149% regression of ftq.noise.50% due to commit:

>>> >>

>>> >>

>>> >> commit: 4e5160766fcc9f41bbd38bac11f92dce993644aa ("sched/fair: Propagate asynchrous detach")

>>> >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master

>>> >>

>>> >> in testcase: ftq

>>> >> on test machine: 8 threads Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz with 8G memory

>>> >> with following parameters:

>>> >>

>>> >>         nr_task: 100%

>>> >>         samples: 6000ss

>>> >>         test: cache

>>> >>         freq: 20

>>> >>         cpufreq_governor: powersave

>>> >

>>> > Why using powersave ? Are you testing  every governors ?

>>>

>>> We will test performance and powersave governor for FTQ.

>>

>> Ok thanks

>>

>>>

>>> >>

>>> >> test-description: The FTQ benchmarks measure hardware and software interference or 'noise' on a node from the applications perspective.

>>> >> test-url: https://github.com/rminnich/ftq

>>> >

>>> > It's a bit difficult to understand exactly what is measured and what

>>> > is ftq.noise.50% because this result is not part of the bench which

>>> > seems to only record a log of data in a file and ftq.noise.50% seems

>>> > to be lkp specific

>>>

>>> Yes. FTQ itself has no noise statistics builtin, although it is an OS

>>> noise benchmark.  ftq.noise.50% is calculated as below:

>>>

>>> There is a score for every sample of ftq.  The lower the score, the

>>> higher the noises.  ftq.noise.50% is the number (per 1000000 samples) of

>>> samples whose score is less than 50% of the mean score.

>>>

>>

>> ok so IIUC we have moved from 0.03% to 0.11% for ftq.noise.50%

>>

>> I have not been able to reproduce the regression on the different system that I have access to so I can only guess the root cause of the regression.

>>

>> Could it be possible to test if the patch below fix the regression ?

>>

>>

>> ---

>>  kernel/sched/fair.c | 29 ++++++++++++++++++++++++++++-

>>  1 file changed, 28 insertions(+), 1 deletion(-)

>>

>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c

>> index 090a9bb..8efa113 100644

>> --- a/kernel/sched/fair.c

>> +++ b/kernel/sched/fair.c

>> @@ -3138,6 +3138,31 @@ static inline int propagate_entity_load_avg(struct sched_entity *se)

>>       return 1;

>>  }

>>

>> +/* Check if we need to update the load and the utilization of a group_entity */

>> +static inline bool skip_blocked_update(struct sched_entity *se)

>> +{

>> +     struct cfs_rq *gcfs_rq = group_cfs_rq(se);

>> +

>> +     /*

>> +      * If sched_entity still have not null load or utilization, we have to

>> +      * decay it.

>> +      */

>> +     if (se->avg.load_avg || se->avg.util_avg)

>> +             return false;

>> +

>> +     /*

>> +      * If there is a pending propagation, we have to update the load and

>> +      * the utilizaion of the sched_entity

>> +      */

>> +     if (gcfs_rq->propagate_avg)

>> +             return false;

>> +

>> +     /*

>> +      * Other wise, the load and the utilizaiton of the sched_entity is

>> +      * already null so it will be a waste of time to try to decay it

>> +      */

>> +     return true;

>> +}

>>  #else /* CONFIG_FAIR_GROUP_SCHED */

>>

>>  static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force) {}

>> @@ -6858,6 +6883,7 @@ static void update_blocked_averages(int cpu)

>>  {

>>       struct rq *rq = cpu_rq(cpu);

>>       struct cfs_rq *cfs_rq;

>> +     struct sched_entity *se;

>>       unsigned long flags;

>>

>>       raw_spin_lock_irqsave(&rq->lock, flags);

>> @@ -6876,7 +6902,8 @@ static void update_blocked_averages(int cpu)

>>                       update_tg_load_avg(cfs_rq, 0);

>>

>>               /* Propagate pending load changes to the parent */

>> -             if (cfs_rq->tg->se[cpu])

>> +             se = cfs_rq->tg->se[cpu];

>> +             if (se && !skip_blocked_update(se))

>>                       update_load_avg(cfs_rq->tg->se[cpu], 0);

>>       }

>>       raw_spin_unlock_irqrestore(&rq->lock, flags);

>

> The test result is as follow,

>

> =========================================================================================

> compiler/cpufreq_governor/freq/kconfig/nr_task/rootfs/samples/tbox_group/test/testcase:

>   gcc-6/powersave/20/x86_64-rhel-7.2/100%/debian-x86_64-2016-08-31.cgz/6000ss/lkp-hsw-d01/cache/ftq

>

> commit:

>   4e5160766fcc9f41bbd38bac11f92dce993644aa: first bad commit

>   09a43ace1f986b003c118fdf6ddf1fd685692d49: parent of first bad commit

>   0613870ea53a7a279d8d37f2a3ce40aafc155fc8: debug commit with above patch

>

> 4e5160766fcc9f41 09a43ace1f986b003c118fdf6d 0613870ea53a7a279d8d37f2a3

> ---------------- -------------------------- --------------------------

>          %stddev     %change         %stddev     %change         %stddev

>              \          |                \          |                \

>      61670 ±228%     -96.5%       2148 ± 11%     -94.7%       3281 ± 58%  ftq.noise.25%

>       3463 ± 10%     -60.0%       1386 ± 19%     -26.3%       2552 ± 58%  ftq.noise.50%

>       1116 ± 23%     -72.6%     305.99 ± 30%     -35.8%     716.15 ± 64%  ftq.noise.75%

>    3843815 ±  3%      +3.1%    3963589 ±  1%     -49.6%    1938221 ±100%  ftq.time.involuntary_context_switches

>       5.33 ± 30%     +21.4%       6.46 ± 14%     -71.7%       1.50 ±108%  time.system_time

>

>

> It appears that the system_time and involuntary_context_switches reduced

> much after applied the debug patch, which is good from noise point of

> view.  ftq.noise.50% reduced compared with the first bad commit, but

> have not restored to that of the parent of the first bad commit.


Thanks for testing. I will try to improve it a bit but not sure that I
can reduce more.

Regards,
Vincent

>

> Best Regards,

> Huang, Ying
Dietmar Eggemann Jan. 3, 2017, 10:38 a.m. | #3
Hi Vincent and Ying,

On 01/02/2017 04:42 PM, Vincent Guittot wrote:
> Hi Ying,

>

> On 28 December 2016 at 09:17, Huang, Ying <ying.huang@intel.com> wrote:

>> Vincent Guittot <vincent.guittot@linaro.org> writes:

>>

>>> Le Tuesday 13 Dec 2016 . 09:47:30 (+0800), Huang, Ying a .crit :

>>>> Hi, Vincent,

>>>>

>>>> Vincent Guittot <vincent.guittot@linaro.org> writes:


[...]

>>> ---

>>>  kernel/sched/fair.c | 29 ++++++++++++++++++++++++++++-

>>>  1 file changed, 28 insertions(+), 1 deletion(-)

>>>

>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c

>>> index 090a9bb..8efa113 100644

>>> --- a/kernel/sched/fair.c

>>> +++ b/kernel/sched/fair.c

>>> @@ -3138,6 +3138,31 @@ static inline int propagate_entity_load_avg(struct sched_entity *se)

>>>       return 1;

>>>  }

>>>

>>> +/* Check if we need to update the load and the utilization of a group_entity */

>>> +static inline bool skip_blocked_update(struct sched_entity *se)

>>> +{

>>> +     struct cfs_rq *gcfs_rq = group_cfs_rq(se);

>>> +

>>> +     /*

>>> +      * If sched_entity still have not null load or utilization, we have to

>>> +      * decay it.

>>> +      */

>>> +     if (se->avg.load_avg || se->avg.util_avg)

>>> +             return false;

>>> +

>>> +     /*

>>> +      * If there is a pending propagation, we have to update the load and

>>> +      * the utilizaion of the sched_entity

>>> +      */

>>> +     if (gcfs_rq->propagate_avg)

>>> +             return false;

>>> +

>>> +     /*

>>> +      * Other wise, the load and the utilizaiton of the sched_entity is

>>> +      * already null so it will be a waste of time to try to decay it

>>> +      */

>>> +     return true;

>>> +}

>>>  #else /* CONFIG_FAIR_GROUP_SCHED */

>>>

>>>  static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force) {}

>>> @@ -6858,6 +6883,7 @@ static void update_blocked_averages(int cpu)

>>>  {

>>>       struct rq *rq = cpu_rq(cpu);

>>>       struct cfs_rq *cfs_rq;

>>> +     struct sched_entity *se;

>>>       unsigned long flags;

>>>

>>>       raw_spin_lock_irqsave(&rq->lock, flags);

>>> @@ -6876,7 +6902,8 @@ static void update_blocked_averages(int cpu)

>>>                       update_tg_load_avg(cfs_rq, 0);

>>>

>>>               /* Propagate pending load changes to the parent */

>>> -             if (cfs_rq->tg->se[cpu])

>>> +             se = cfs_rq->tg->se[cpu];

>>> +             if (se && !skip_blocked_update(se))

>>>                       update_load_avg(cfs_rq->tg->se[cpu], 0);

>>>       }

>>>       raw_spin_unlock_irqrestore(&rq->lock, flags);

>>

>> The test result is as follow,

>>

>> =========================================================================================

>> compiler/cpufreq_governor/freq/kconfig/nr_task/rootfs/samples/tbox_group/test/testcase:

>>   gcc-6/powersave/20/x86_64-rhel-7.2/100%/debian-x86_64-2016-08-31.cgz/6000ss/lkp-hsw-d01/cache/ftq

>>

>> commit:

>>   4e5160766fcc9f41bbd38bac11f92dce993644aa: first bad commit

>>   09a43ace1f986b003c118fdf6ddf1fd685692d49: parent of first bad commit

>>   0613870ea53a7a279d8d37f2a3ce40aafc155fc8: debug commit with above patch

>>

>> 4e5160766fcc9f41 09a43ace1f986b003c118fdf6d 0613870ea53a7a279d8d37f2a3

>> ---------------- -------------------------- --------------------------

>>          %stddev     %change         %stddev     %change         %stddev

>>              \          |                \          |                \

>>      61670 ±228%     -96.5%       2148 ± 11%     -94.7%       3281 ± 58%  ftq.noise.25%

>>       3463 ± 10%     -60.0%       1386 ± 19%     -26.3%       2552 ± 58%  ftq.noise.50%

>>       1116 ± 23%     -72.6%     305.99 ± 30%     -35.8%     716.15 ± 64%  ftq.noise.75%

>>    3843815 ±  3%      +3.1%    3963589 ±  1%     -49.6%    1938221 ±100%  ftq.time.involuntary_context_switches

>>       5.33 ± 30%     +21.4%       6.46 ± 14%     -71.7%       1.50 ±108%  time.system_time

>>

>>

>> It appears that the system_time and involuntary_context_switches reduced

>> much after applied the debug patch, which is good from noise point of

>> view.  ftq.noise.50% reduced compared with the first bad commit, but

>> have not restored to that of the parent of the first bad commit.

>

> Thanks for testing. I will try to improve it a bit but not sure that I

> can reduce more.


Is this a desktop system where this regression comes from autogroups (1 
level taskgroups) or a server system with systemd (2 level taskgroups)?

Since the PELT rewrite (v4.2) I have ~60 autogroups per cpu 
(&rq->leaf_cfs_rq_list) on my Ubuntu desktop system permanently (Intel 
i7-4750HQ) whereas in v4.1 there were 0 - 10.

$ for i in `seq 0 7`; do cat /proc/sched_debug | grep
"cfs_rq\[$i\]:/autogroup-" | wc -l; done
58
61
63
65
60
59
62
56

Couldn't we still remove these autogroups by if (!cfs_rq->nr_running && 
!se->avg.load_avg && !se->avg.util_avg) in update_blocked_averages()?

Vincent, like we discussed in September last year, the proper fix would 
probably be a cfs-rq->nr_attached which IMHO is not doable w/o being an 
atomic because of migrate_task_rq_fair()->remove_entity_load_avg() not 
holding the rq lock.

Patch

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 090a9bb..8efa113 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3138,6 +3138,31 @@  static inline int propagate_entity_load_avg(struct sched_entity *se)
 	return 1;
 }
 
+/* Check if we need to update the load and the utilization of a group_entity */
+static inline bool skip_blocked_update(struct sched_entity *se)
+{
+	struct cfs_rq *gcfs_rq = group_cfs_rq(se);
+
+	/*
+	 * If sched_entity still have not null load or utilization, we have to
+	 * decay it.
+	 */
+	if (se->avg.load_avg || se->avg.util_avg)
+		return false;
+
+	/*
+	 * If there is a pending propagation, we have to update the load and
+	 * the utilizaion of the sched_entity
+	 */
+	if (gcfs_rq->propagate_avg)
+		return false;
+
+	/*
+	 * Other wise, the load and the utilizaiton of the sched_entity is
+	 * already null so it will be a waste of time to try to decay it
+	 */
+	return true;
+}
 #else /* CONFIG_FAIR_GROUP_SCHED */
 
 static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force) {}
@@ -6858,6 +6883,7 @@  static void update_blocked_averages(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 	struct cfs_rq *cfs_rq;
+	struct sched_entity *se;
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&rq->lock, flags);
@@ -6876,7 +6902,8 @@  static void update_blocked_averages(int cpu)
 			update_tg_load_avg(cfs_rq, 0);
 
 		/* Propagate pending load changes to the parent */
-		if (cfs_rq->tg->se[cpu])
+		se = cfs_rq->tg->se[cpu];
+		if (se && !skip_blocked_update(se))
 			update_load_avg(cfs_rq->tg->se[cpu], 0);
 	}
 	raw_spin_unlock_irqrestore(&rq->lock, flags);