mbox series

[0/5] tcg: Dynamically allocate temporaries

Message ID 20210119183428.556706-1-richard.henderson@linaro.org
Headers show
Series tcg: Dynamically allocate temporaries | expand

Message

Richard Henderson Jan. 19, 2021, 6:34 p.m. UTC
My recent change for caching tcg constants has, in a number of cases,
overflowed the statically allocated array of temporaries.  Change to
dynamic allocation.

I'll note that nothing in check-acceptance triggers this overflow.
Anyone care to add some more test cases there?

Also, there's some outstanding weirdness in gitlab testing that I
cannot reproduce locally.


r~


Richard Henderson (5):
  tcg: Add an index to TCGTemp
  tcg: Introduce and use tcg_temp
  tcg: Make TCGTempSet expandable
  tcg: Adjust tcgv_*_temp/temp_tcgv_*
  tcg: Dynamically allocate temporaries

 include/tcg/tcg.h |  79 ++++++++++++++-----
 tcg/optimize.c    |  23 +++---
 tcg/tcg.c         | 196 +++++++++++++++++++++++++++++++---------------
 3 files changed, 205 insertions(+), 93 deletions(-)

-- 
2.25.1

Comments

BALATON Zoltan Jan. 19, 2021, 11:06 p.m. UTC | #1
On Tue, 19 Jan 2021, Richard Henderson wrote:
> My recent change for caching tcg constants has, in a number of cases,

> overflowed the statically allocated array of temporaries.  Change to

> dynamic allocation.


This seems to work for me so

Tested-by: BALATON Zoltan <balaton@eik.bme.hu>


but have you done any performance tests to check that this actually 
improves emulation speed? To mee it seems slower. Booting AmigaOS on 
sam460ex with c0dd6654f207 (just before your TCG series) takes:

real	0m33.829s
user	0m34.432s
sys	0m0.296s

but on HEAD with this series:

real	0m44.381s
user	0m46.058s
sys	0m0.532s

This is noticable decrease in speed also without measuring it. With just 
increasing the TCG_MAX_TEMPS to 2048 on 7c79721606be without this series I 
get:

real	0m42.681s
user	0m44.208s
sys	0m0.435s

So the performance regression is somewhere in the original series not in 
this fix up series.

> I'll note that nothing in check-acceptance triggers this overflow.

> Anyone care to add some more test cases there?


The proposed test for the upcoming pegasos2 machine may also catch this 
(when that will be merged, its dependencies are still under review) or the 
sam460ex test that currently only checks the firmware could be enhanced to 
try to boot AROS if somebody wants to do that. The drawback is that it 
needs an external iso whereas the current test doesn't need any additional 
images but it did not catch problems with IRQ and neither this problem 
with TCG temps. This problem was also found with riscv and mips I think 
but don't know if those would be easier to test.

Regards,
BALATON Zoltan
Philippe Mathieu-Daudé Jan. 19, 2021, 11:33 p.m. UTC | #2
On 1/20/21 12:06 AM, BALATON Zoltan wrote:
> On Tue, 19 Jan 2021, Richard Henderson wrote:

>> My recent change for caching tcg constants has, in a number of cases,

>> overflowed the statically allocated array of temporaries.  Change to

>> dynamic allocation.

> 

> This seems to work for me so

> 

> Tested-by: BALATON Zoltan <balaton@eik.bme.hu>

> 

> but have you done any performance tests to check that this actually

> improves emulation speed? To mee it seems slower. Booting AmigaOS on

> sam460ex with c0dd6654f207 (just before your TCG series) takes:

> 

> real    0m33.829s

> user    0m34.432s

> sys    0m0.296s

> 

> but on HEAD with this series:

> 

> real    0m44.381s

> user    0m46.058s

> sys    0m0.532s

> 

> This is noticable decrease in speed also without measuring it. With just

> increasing the TCG_MAX_TEMPS to 2048 on 7c79721606be without this series

> I get:

> 

> real    0m42.681s

> user    0m44.208s

> sys    0m0.435s

> 

> So the performance regression is somewhere in the original series not in

> this fix up series.


Cc'ing Lukas for the performance part, as he is investigating
how to catch such regressions.

>> I'll note that nothing in check-acceptance triggers this overflow.

>> Anyone care to add some more test cases there?

> 

> The proposed test for the upcoming pegasos2 machine may also catch this

> (when that will be merged, its dependencies are still under review)


What are your running on pegasos2?

> or

> the sam460ex test that currently only checks the firmware could be

> enhanced to try to boot AROS if somebody wants to do that. The drawback

> is that it needs an external iso whereas the current test doesn't need

> any additional images but it did not catch problems with IRQ and neither

> this problem with TCG temps.


So this other option is not very useful, right?
BALATON Zoltan Jan. 20, 2021, 9:03 a.m. UTC | #3
On Wed, 20 Jan 2021, Philippe Mathieu-Daudé wrote:
> On 1/20/21 12:06 AM, BALATON Zoltan wrote:

>> On Tue, 19 Jan 2021, Richard Henderson wrote:

>>> My recent change for caching tcg constants has, in a number of cases,

>>> overflowed the statically allocated array of temporaries.  Change to

>>> dynamic allocation.

>>

>> This seems to work for me so

>>

>> Tested-by: BALATON Zoltan <balaton@eik.bme.hu>

>>

>> but have you done any performance tests to check that this actually

>> improves emulation speed? To mee it seems slower. Booting AmigaOS on

>> sam460ex with c0dd6654f207 (just before your TCG series) takes:

>>

>> real    0m33.829s

>> user    0m34.432s

>> sys    0m0.296s

>>

>> but on HEAD with this series:

>>

>> real    0m44.381s

>> user    0m46.058s

>> sys    0m0.532s

>>

>> This is noticable decrease in speed also without measuring it. With just

>> increasing the TCG_MAX_TEMPS to 2048 on 7c79721606be without this series

>> I get:

>>

>> real    0m42.681s

>> user    0m44.208s

>> sys    0m0.435s

>>

>> So the performance regression is somewhere in the original series not in

>> this fix up series.

>

> Cc'ing Lukas for the performance part, as he is investigating

> how to catch such regressions.


I think there was a GSoC last year that resulted in some scripts to do 
performance testing and even bisecting regressions. I've seen a few 
reports posted about those but maybe the project should also use the 
results and run these on some dedicated test machines regularly to be 
useful. The GSoC has ended, so the student has left and I think the mentor 
was Aleksandar M. who may also be doing something else now so these 
benchmarking scripts seem to be abandoned. But maybe could be picked up as 
a starting point or inspiration for any similar activity to build on those 
results.

>>> I'll note that nothing in check-acceptance triggers this overflow.

>>> Anyone care to add some more test cases there?

>>

>> The proposed test for the upcoming pegasos2 machine may also catch this

>> (when that will be merged, its dependencies are still under review)

>

> What are your running on pegasos2?


I've sent you before what test I think we could do for pegasos2, see:

https://lists.nongnu.org/archive/html/qemu-ppc/2021-01/msg00112.html

but I could not write the script for that and have no way to test it so 
some help would be needed with that. By the way, before that there are 
also the vt82c686 patches still waiting for review. I hope you haven't 
forgot and will eventually come back to them.

>> or

>> the sam460ex test that currently only checks the firmware could be

>> enhanced to try to boot AROS if somebody wants to do that. The drawback

>> is that it needs an external iso whereas the current test doesn't need

>> any additional images but it did not catch problems with IRQ and neither

>> this problem with TCG temps.

>

> So this other option is not very useful, right?


It's still useful to test if the machine is working at all but the 
firmware does not seem to use interrupts and if you don't boot anything it 
won't access disks so some parts will not be tested by only firmware level 
testing. Basically only CPU, RAM, ROM, serial would be tested and that the 
machine could be created which would still catch some bugs but IRQs and 
IDE probably would only be tested by trying to boot an OS. I think Guenter 
runs Linux kernel boot tests but with -kernel option so disks would still 
not be tested by that therefore it may be more useful to run some other OS 
booted from CD just to increase coverage. Since I plan to use MorphOS for 
pegasos2 as test, sam460ex could use AROS (AmigaOS is not freely available 
so that cannot be used but I also test with that). The AROS isos are free 
but may be somewhat unstable so instead of using the latest, one should 
use a known working version and only update after manual testing, 
otherwise the test may break due to change in AROS.

Regards,
BALATON Zoltan
BALATON Zoltan Jan. 21, 2021, 8:09 p.m. UTC | #4
On Wed, 20 Jan 2021, BALATON Zoltan wrote:
> On Tue, 19 Jan 2021, Richard Henderson wrote:

>> My recent change for caching tcg constants has, in a number of cases,

>> overflowed the statically allocated array of temporaries.  Change to

>> dynamic allocation.

>

> This seems to work for me so

>

> Tested-by: BALATON Zoltan <balaton@eik.bme.hu>

>

> but have you done any performance tests to check that this actually improves 

> emulation speed? To mee it seems slower. Booting AmigaOS on sam460ex with 

> c0dd6654f207 (just before your TCG series) takes:

>

> real	0m33.829s

> user	0m34.432s

> sys	0m0.296s

>

> but on HEAD with this series:

>

> real	0m44.381s

> user	0m46.058s

> sys	0m0.532s

>

> This is noticable decrease in speed also without measuring it. With just 

> increasing the TCG_MAX_TEMPS to 2048 on 7c79721606be without this series I 

> get:

>

> real	0m42.681s

> user	0m44.208s

> sys	0m0.435s

>

> So the performance regression is somewhere in the original series not in this 

> fix up series.


I've tried to do more measurements to identify where it got slower but I 
could not reproduce it today. I'm now getting around 42 seconds both 
before and after the series so not sure what made it faster before but 
it's probably not because of a code change then.

Regards,
BALATON Zoltan
Richard Henderson Jan. 21, 2021, 8:17 p.m. UTC | #5
On 1/21/21 10:09 AM, BALATON Zoltan wrote:
> On Wed, 20 Jan 2021, BALATON Zoltan wrote:

>> On Tue, 19 Jan 2021, Richard Henderson wrote:

>>> My recent change for caching tcg constants has, in a number of cases,

>>> overflowed the statically allocated array of temporaries.  Change to

>>> dynamic allocation.

>>

>> This seems to work for me so

>>

>> Tested-by: BALATON Zoltan <balaton@eik.bme.hu>

>>

>> but have you done any performance tests to check that this actually improves

>> emulation speed? To mee it seems slower. Booting AmigaOS on sam460ex with

>> c0dd6654f207 (just before your TCG series) takes:

>>

>> real    0m33.829s

>> user    0m34.432s

>> sys    0m0.296s

>>

>> but on HEAD with this series:

>>

>> real    0m44.381s

>> user    0m46.058s

>> sys    0m0.532s

>>

>> This is noticable decrease in speed also without measuring it. With just

>> increasing the TCG_MAX_TEMPS to 2048 on 7c79721606be without this series I get:

>>

>> real    0m42.681s

>> user    0m44.208s

>> sys    0m0.435s

>>

>> So the performance regression is somewhere in the original series not in this

>> fix up series.

> 

> I've tried to do more measurements to identify where it got slower but I could

> not reproduce it today. I'm now getting around 42 seconds both before and after

> the series so not sure what made it faster before but it's probably not because

> of a code change then.


That's reassuring.  I hadn't been able to measure a performance regression myself.

(The kind of TB that caused this SEGV thread and creates oodles of temps seems
to be an outlier.  Otherwise there should be very little change to non-vector
code.)


r~