selftests: memory-hotplug: avoid spamming logs with dump_page(), ratio limit hot-remove error test

Message ID 20210630145740.54614-1-paolo.pisati@canonical.com
State Accepted
Commit 0c0f6299ba71faf610e311605e09e96331c45f28
Series selftests: memory-hotplug: avoid spamming logs with dump_page(), ratio limit hot-remove error test

Commit Message

Paolo Pisati June 30, 2021, 2:57 p.m. UTC
While the offline memory test obeys the ratio limit, the same test with error
injection does not and tries to offline all the hotpluggable memory, spamming
system logs with hundreds of thousands of dump_page() entries, slowing the system
down (to the point that the test itself times out and gets terminated) and
consuming excessive filesystem space:

...
[ 9784.393354] page:c00c0000007d1b40 refcount:3 mapcount:0 mapping:c0000001fc03e950 index:0xe7b
[ 9784.393355] def_blk_aops
[ 9784.393356] flags: 0x3ffff800002062(referenced|active|workingset|private)
[ 9784.393358] raw: 003ffff800002062 c0000001b9343a68 c0000001b9343a68 c0000001fc03e950
[ 9784.393359] raw: 0000000000000e7b c000000006607b18 00000003ffffffff c00000000490d000
[ 9784.393359] page dumped because: migration failure
[ 9784.393360] page->mem_cgroup:c00000000490d000
[ 9784.393416] migrating pfn 1f46d failed ret:1
...

$ grep "page dumped because: migration failure" /var/log/kern.log | wc -l
2405558

$ ls -la /var/log/kern.log
-rw-r----- 1 syslog adm 2256109539 Jun 30 14:19 /var/log/kern.log

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
---
 tools/testing/selftests/memory-hotplug/mem-on-off-test.sh | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Krzysztof Kozlowski July 2, 2021, 2:48 p.m. UTC | #1
On 30/06/2021 16:57, Paolo Pisati wrote:
> While the offline memory test obeys the ratio limit, the same test with error
> injection does not and tries to offline all the hotpluggable memory, spamming
> system logs with hundreds of thousands of dump_page() entries, slowing the system
> down (to the point that the test itself times out and gets terminated) and
> consuming excessive filesystem space:
>
> ...
> [ 9784.393354] page:c00c0000007d1b40 refcount:3 mapcount:0 mapping:c0000001fc03e950 index:0xe7b
> [ 9784.393355] def_blk_aops
> [ 9784.393356] flags: 0x3ffff800002062(referenced|active|workingset|private)
> [ 9784.393358] raw: 003ffff800002062 c0000001b9343a68 c0000001b9343a68 c0000001fc03e950
> [ 9784.393359] raw: 0000000000000e7b c000000006607b18 00000003ffffffff c00000000490d000
> [ 9784.393359] page dumped because: migration failure
> [ 9784.393360] page->mem_cgroup:c00000000490d000
> [ 9784.393416] migrating pfn 1f46d failed ret:1
> ...
>
> $ grep "page dumped because: migration failure" /var/log/kern.log | wc -l
> 2405558
>
> $ ls -la /var/log/kern.log
> -rw-r----- 1 syslog adm 2256109539 Jun 30 14:19 /var/log/kern.log

Makes sense to me and looks like a better choice than disabling the test
completely (the other option...).

Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>

>
> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
> ---
>  tools/testing/selftests/memory-hotplug/mem-on-off-test.sh | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>

Best regards,
Krzysztof
Paolo Pisati July 9, 2021, 1 p.m. UTC | #2
On Wed, Jun 30, 2021 at 04:57:40PM +0200, Paolo Pisati wrote:
> While the offline memory test obeys the ratio limit, the same test with error
> injection does not and tries to offline all the hotpluggable memory, spamming
> system logs with hundreds of thousands of dump_page() entries, slowing the system
> down (to the point that the test itself times out and gets terminated) and
> consuming excessive filesystem space:
>
> ...

Could anyone with spare cycles review this? It has one ack already.
-- 
bye,
p.
Shuah Khan July 9, 2021, 5:03 p.m. UTC | #3
On 7/9/21 7:00 AM, Paolo Pisati wrote:
> On Wed, Jun 30, 2021 at 04:57:40PM +0200, Paolo Pisati wrote:
>> While the offline memory test obeys the ratio limit, the same test with error
>> injection does not and tries to offline all the hotpluggable memory, spamming
>> system logs with hundreds of thousands of dump_page() entries, slowing the system
>> down (to the point that the test itself times out and gets terminated) and
>> consuming excessive filesystem space:
>>
>> ...
>
> Could anyone with spare cycles review this? It has one ack already.
>

Thanks for finding and fixing this.

Looks good to me. I will pull this in as soon as the merge window
ends and 5.14-rc1 comes out.

thanks,
-- Shuah

Patch

diff --git a/tools/testing/selftests/memory-hotplug/mem-on-off-test.sh b/tools/testing/selftests/memory-hotplug/mem-on-off-test.sh
index b37585e6aa38..46a97f318f58 100755
--- a/tools/testing/selftests/memory-hotplug/mem-on-off-test.sh
+++ b/tools/testing/selftests/memory-hotplug/mem-on-off-test.sh
@@ -282,7 +282,9 @@  done
 #
 echo $error > $NOTIFIER_ERR_INJECT_DIR/actions/MEM_GOING_OFFLINE/error
 for memory in `hotpluggable_online_memory`; do
-	offline_memory_expect_fail $memory
+	if [ $((RANDOM % 100)) -lt $ratio ]; then
+		offline_memory_expect_fail $memory
+	fi
 done
 
 echo 0 > $NOTIFIER_ERR_INJECT_DIR/actions/MEM_GOING_OFFLINE/error
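
The ratio check added by the patch can be tried in isolation. The following is a minimal sketch, not part of mem-on-off-test.sh: `ratio_hit` and `count_selected` are hypothetical helper names, and the loop only counts simulated memory blocks instead of offlining anything. It assumes bash, since `$RANDOM` is a bash feature.

```shell
#!/bin/bash
# Hypothetical helper (not in the selftest): succeeds for roughly
# <ratio> percent of calls, mirroring the condition used in the patch.
ratio_hit() {
	local ratio=$1
	[ $((RANDOM % 100)) -lt "$ratio" ]
}

# Hypothetical helper: count how many of <n> simulated memory blocks
# would be picked for the offline attempt at the given ratio.
count_selected() {
	local ratio=$1
	local n=$2
	local hits=0
	local i=0
	while [ "$i" -lt "$n" ]; do
		if ratio_hit "$ratio"; then
			hits=$((hits + 1))
		fi
		i=$((i + 1))
	done
	echo "$hits"
}

# With a ratio of 10, only about a tenth of the blocks are selected,
# so the number of failed offline attempts (and dump_page() reports)
# shrinks accordingly. The exact count varies from run to run.
echo "selected: $(count_selected 10 1000) of 1000"
```

A ratio of 0 never selects a block and a ratio of 100 always does, because `RANDOM % 100` is always in the range 0 to 99.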