Message ID | 1510711256-1931-1-git-send-email-luis.machado@linaro.org |
---|---|
State | New |
Headers | show |
Series | [AArch64] Adjust tuning parameters for Falkor | expand |
On Tue, Nov 14, 2017 at 6:00 PM, Luis Machado <luis.machado@linaro.org> wrote: > Disabling software prefetching and switching the autoprefetcher to weak improves > CPU2017 rate and speed benchmarks for both int and fp sets on Falkor. > > SPECrate 2017 fp is up 0.38% > SPECspeed 2017 fp is up 0.54% > SPECrate 2017 int is up 3.02% > SPECspeed 2017 int is up 3.16% > > There are only a couple individual regressions. The biggest one being about 4% > in parest. > > For SPEC2006, we've noticed the following: > > SPECint is up 0.91% > SPECfp is stable > > In the case of SPEC2006 we noticed both a big regression in mcf (about 20%) > and a big improvement for hmmer (about 40%). > > Since the overall result is positive, we would like to make these new tuning > settings the default for Falkor. > > We may revisit the software prefetcher setting in the future, in case we > can adjust it enough so it provides us a good balance between improvements and > regressions (mcf). But for now it is best if it stays off. > > I understand the freeze is happening soon, so it would be great to have this > in before then. > > OK? > > Thanks, > Luis > > 2017-11-14 Luis Machado <luis.machado@linaro.org> > > * config/aarch64/aarch64.c (qdf24xx_prefetch_tune): Remove. > (qdf24xx_tunings): Replace qdf24xx_prefetch_tune with > generic_prefetch_tune and tune_params::AUTOPREFETCHER_STRONG with > tune_params::AUTOPREFETCHER_WEAK. > --- > gcc/ChangeLog | 7 +++++++ > gcc/config/aarch64/aarch64.c | 13 ++----------- > 2 files changed, 9 insertions(+), 11 deletions(-) > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index b80a421..4dbfda0 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,10 @@ > +2017-11-14 Luis Machado <luis.machado@linaro.org> > + > + * config/aarch64/aarch64.c (qdf24xx_prefetch_tune): Remove. > + (qdf24xx_tunings): Replace qdf24xx_prefetch_tune with > + generic_prefetch_tune and tune_params::AUTOPREFETCHER_STRONG with > + tune_params::AUTOPREFETCHER_WEAK. > + > 2017-11-14 Carl Love <cel@us.ibm.com> > > * config/rs6000/rs6000.c (swap_endian_selector_for_mode): Remove > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index 0c67e2b..171a230 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -502,15 +502,6 @@ static const cpu_prefetch_tune exynosm1_prefetch_tune = > -1 /* default_opt_level */ > }; > > -static const cpu_prefetch_tune qdf24xx_prefetch_tune = > -{ > - 4, /* num_slots */ > - 32, /* l1_cache_size */ > - 64, /* l1_cache_line_size */ > - 1024, /* l2_cache_size */ > - 3 /* default_opt_level */ I think the best thing is to leave this tuning structure in place and just change default_opt_level to -1 to disable it at -O3. Thanks, Andrew > -}; > - > static const cpu_prefetch_tune thunderxt88_prefetch_tune = > { > 8, /* num_slots */ > @@ -817,9 +808,9 @@ static const struct tune_params qdf24xx_tunings = > 2, /* min_div_recip_mul_sf. */ > 2, /* min_div_recip_mul_df. */ > 0, /* max_case_values. */ > - tune_params::AUTOPREFETCHER_STRONG, /* autoprefetcher_model. */ > + tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model. */ > (AARCH64_EXTRA_TUNE_NONE), /* tune_flags. */ > - &qdf24xx_prefetch_tune > + &generic_prefetch_tune > }; > > /* Tuning structure for the Qualcomm Saphira core. Default to falkor values > -- > 2.7.4 ><div id="DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2"><br /> <table style="border-top: 1px solid #D3D4DE;"> <tr> <td style="width: 55px; padding-top: 13px;"><a href="http://www.avg.com/email-signature?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail" target="_blank"><img src="https://ipmcdn.avast.com/images/icons/icon-envelope-tick-green-avg-v1.png" alt="" width="46" height="29" style="width: 46px; height: 29px;" /></a></td> <td style="width: 470px; padding-top: 12px; color: #41424e; font-size: 13px; font-family: Arial, Helvetica, sans-serif; line-height: 18px;">Virus-free. <a href="http://www.avg.com/email-signature?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail" target="_blank" style="color: #4453ea;">www.avg.com</a> </td> </tr> </table><a href="#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2" width="1" height="1"></a></div>
> I think the best thing is to leave this tuning structure in place and > just change default_opt_level to -1 to disable it at -O3. > > Thanks, > Andrew > Indeed that seems to be more appropriate if re-enabling prefetches in the future is a possibility. How about the following? Thanks, Luis 2017-11-15 Luis Machado <luis.machado@linaro.org> gcc/ * config/aarch64/aarch64.c (qdf24xx_prefetch_tune) <default_opt_level>: Set to -1. (qdf24xx_tunings) <autoprefetcher_model>: Set to tune_params::AUTOPREFETCHER_WEAK. --- gcc/ChangeLog | 7 +++++++ gcc/config/aarch64/aarch64.c | 4 ++-- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index b80a421..0e05f9e 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,10 @@ +2017-11-15 Luis Machado <luis.machado@linaro.org> + + * config/aarch64/aarch64.c + (qdf24xx_prefetch_tune) <default_opt_level>: Set to -1. + (qdf24xx_tunings) <autoprefetcher_model>: Set to + tune_params::AUTOPREFETCHER_WEAK. + 2017-11-14 Carl Love <cel@us.ibm.com> * config/rs6000/rs6000.c (swap_endian_selector_for_mode): Remove diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 0c67e2b..8779cad 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -508,7 +508,7 @@ static const cpu_prefetch_tune qdf24xx_prefetch_tune = 32, /* l1_cache_size */ 64, /* l1_cache_line_size */ 1024, /* l2_cache_size */ - 3 /* default_opt_level */ + -1 /* default_opt_level */ }; static const cpu_prefetch_tune thunderxt88_prefetch_tune = @@ -817,7 +817,7 @@ static const struct tune_params qdf24xx_tunings = 2, /* min_div_recip_mul_sf. */ 2, /* min_div_recip_mul_df. */ 0, /* max_case_values. */ - tune_params::AUTOPREFETCHER_STRONG, /* autoprefetcher_model. */ + tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model. */ (AARCH64_EXTRA_TUNE_NONE), /* tune_flags. */ &qdf24xx_prefetch_tune }; -- 2.7.4
On 15 November 2017 at 01:00, Luis Machado <luis.machado@linaro.org> wrote: > > I think the best thing is to leave this tuning structure in place and > > just change default_opt_level to -1 to disable it at -O3. > > > > Thanks, > > Andrew > > > > Indeed that seems to be more appropriate if re-enabling prefetches in the > future is a possibility. > > How about the following? > > Thanks, > Luis > Ping?
Hi Luis, [cc'ing aarch64 maintainers, it's quicker to get review that way] On 15/11/17 03:00, Luis Machado wrote: > > I think the best thing is to leave this tuning structure in place and > > just change default_opt_level to -1 to disable it at -O3. > > > > Thanks, > > Andrew > > > > Indeed that seems to be more appropriate if re-enabling prefetches in the > future is a possibility. > > How about the following? > This looks correct to me to achieve what you want to achieve, but I can't approve it myself so you'll need an ok from an aarch64 maintainer. Thanks, Kyrill > Thanks, > Luis > > 2017-11-15 Luis Machado <luis.machado@linaro.org> > > gcc/ > * config/aarch64/aarch64.c > (qdf24xx_prefetch_tune) <default_opt_level>: Set to -1. > (qdf24xx_tunings) <autoprefetcher_model>: Set to > tune_params::AUTOPREFETCHER_WEAK. > --- > gcc/ChangeLog | 7 +++++++ > gcc/config/aarch64/aarch64.c | 4 ++-- > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index b80a421..0e05f9e 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,10 @@ > +2017-11-15 Luis Machado <luis.machado@linaro.org> > + > + * config/aarch64/aarch64.c > + (qdf24xx_prefetch_tune) <default_opt_level>: Set to -1. > + (qdf24xx_tunings) <autoprefetcher_model>: Set to > + tune_params::AUTOPREFETCHER_WEAK. > + > 2017-11-14 Carl Love <cel@us.ibm.com> > > * config/rs6000/rs6000.c (swap_endian_selector_for_mode): Remove > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index 0c67e2b..8779cad 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -508,7 +508,7 @@ static const cpu_prefetch_tune qdf24xx_prefetch_tune = > 32, /* l1_cache_size */ > 64, /* l1_cache_line_size */ > 1024, /* l2_cache_size */ > - 3 /* default_opt_level */ > + -1 /* default_opt_level */ > }; > > static const cpu_prefetch_tune thunderxt88_prefetch_tune = > @@ -817,7 +817,7 @@ static const struct tune_params qdf24xx_tunings = > 2, /* min_div_recip_mul_sf. */ > 2, /* min_div_recip_mul_df. */ > 0, /* max_case_values. */ > - tune_params::AUTOPREFETCHER_STRONG, /* autoprefetcher_model. */ > + tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model. */ > (AARCH64_EXTRA_TUNE_NONE), /* tune_flags. */ > &qdf24xx_prefetch_tune > }; > -- > 2.7.4 >
On 11/17/2017 07:25 AM, Kyrill Tkachov wrote: > Hi Luis, > > [cc'ing aarch64 maintainers, it's quicker to get review that way] > > On 15/11/17 03:00, Luis Machado wrote: >> > I think the best thing is to leave this tuning structure in place and >> > just change default_opt_level to -1 to disable it at -O3. >> > >> > Thanks, >> > Andrew >> > >> >> Indeed that seems to be more appropriate if re-enabling prefetches in the >> future is a possibility. >> >> How about the following? >> > > This looks correct to me to achieve what you want to achieve, > but I can't approve it myself so you'll need an ok from an aarch64 > maintainer. > > Thanks, > Kyrill > Thanks for the feedback. I'll wait for one of the maintainers to OK it. Luis
On Wed, Nov 15, 2017 at 03:00:53AM +0000, Luis Machado wrote: > > I think the best thing is to leave this tuning structure in place and > > just change default_opt_level to -1 to disable it at -O3. > > > > Thanks, > > Andrew > > > > Indeed that seems to be more appropriate if re-enabling prefetches in the > future is a possibility. > > How about the following? This is OK. Thanks, James > 2017-11-15 Luis Machado <luis.machado@linaro.org> > > gcc/ > * config/aarch64/aarch64.c > (qdf24xx_prefetch_tune) <default_opt_level>: Set to -1. > (qdf24xx_tunings) <autoprefetcher_model>: Set to > tune_params::AUTOPREFETCHER_WEAK. > --- > gcc/ChangeLog | 7 +++++++ > gcc/config/aarch64/aarch64.c | 4 ++-- > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index b80a421..0e05f9e 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,10 @@ > +2017-11-15 Luis Machado <luis.machado@linaro.org> > + > + * config/aarch64/aarch64.c > + (qdf24xx_prefetch_tune) <default_opt_level>: Set to -1. > + (qdf24xx_tunings) <autoprefetcher_model>: Set to > + tune_params::AUTOPREFETCHER_WEAK. > + > 2017-11-14 Carl Love <cel@us.ibm.com> > > * config/rs6000/rs6000.c (swap_endian_selector_for_mode): Remove > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index 0c67e2b..8779cad 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -508,7 +508,7 @@ static const cpu_prefetch_tune qdf24xx_prefetch_tune = > 32, /* l1_cache_size */ > 64, /* l1_cache_line_size */ > 1024, /* l2_cache_size */ > - 3 /* default_opt_level */ > + -1 /* default_opt_level */ > }; > > static const cpu_prefetch_tune thunderxt88_prefetch_tune = > @@ -817,7 +817,7 @@ static const struct tune_params qdf24xx_tunings = > 2, /* min_div_recip_mul_sf. */ > 2, /* min_div_recip_mul_df. */ > 0, /* max_case_values. */ > - tune_params::AUTOPREFETCHER_STRONG, /* autoprefetcher_model. */ > + tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model. */ > (AARCH64_EXTRA_TUNE_NONE), /* tune_flags. */ > &qdf24xx_prefetch_tune > }; > -- > 2.7.4 >
On 11/17/2017 01:48 PM, James Greenhalgh wrote: > On Wed, Nov 15, 2017 at 03:00:53AM +0000, Luis Machado wrote: >>> I think the best thing is to leave this tuning structure in place and >>> just change default_opt_level to -1 to disable it at -O3. >>> >>> Thanks, >>> Andrew >>> >> >> Indeed that seems to be more appropriate if re-enabling prefetches in the >> future is a possibility. >> >> How about the following? > > This is OK. > > Thanks, > James Thanks James. I've pushed this now. Luis
diff --git a/gcc/ChangeLog b/gcc/ChangeLog index b80a421..4dbfda0 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,10 @@ +2017-11-14 Luis Machado <luis.machado@linaro.org> + + * config/aarch64/aarch64.c (qdf24xx_prefetch_tune): Remove. + (qdf24xx_tunings): Replace qdf24xx_prefetch_tune with + generic_prefetch_tune and tune_params::AUTOPREFETCHER_STRONG with + tune_params::AUTOPREFETCHER_WEAK. + 2017-11-14 Carl Love <cel@us.ibm.com> * config/rs6000/rs6000.c (swap_endian_selector_for_mode): Remove diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 0c67e2b..171a230 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -502,15 +502,6 @@ static const cpu_prefetch_tune exynosm1_prefetch_tune = -1 /* default_opt_level */ }; -static const cpu_prefetch_tune qdf24xx_prefetch_tune = -{ - 4, /* num_slots */ - 32, /* l1_cache_size */ - 64, /* l1_cache_line_size */ - 1024, /* l2_cache_size */ - 3 /* default_opt_level */ -}; - static const cpu_prefetch_tune thunderxt88_prefetch_tune = { 8, /* num_slots */ @@ -817,9 +808,9 @@ static const struct tune_params qdf24xx_tunings = 2, /* min_div_recip_mul_sf. */ 2, /* min_div_recip_mul_df. */ 0, /* max_case_values. */ - tune_params::AUTOPREFETCHER_STRONG, /* autoprefetcher_model. */ + tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model. */ (AARCH64_EXTRA_TUNE_NONE), /* tune_flags. */ - &qdf24xx_prefetch_tune + &generic_prefetch_tune }; /* Tuning structure for the Qualcomm Saphira core. Default to falkor values