[v2] locktorture: Fix NULL pointer when torture_type is invalid

Message ID 1453955159-23216-1-git-send-email-wangkefeng.wang@huawei.com
State New
Headers show

Commit Message

Kefeng Wang Jan. 28, 2016, 4:25 a.m.
Insmod locktorture with torture_type=mutex will lead to crash,

Unable to handle kernel NULL pointer dereference at virtual address 00000008
pgd = ffffffc0f6c10000
[00000008] *pgd=000000013b221003, *pud=000000013b221003, *pmd=0000000000000000a
Internal error: Oops: 94000006 [#1] PREEMPT SMP
Modules linked in: locktorture(+) torture
CPU: 2 PID: 1462 Comm: insmod Not tainted 4.4.0+ #19
Hardware name: linux,dummy-virt (DT)
task: ffffffc0fb2b3700 ti: ffffffc0fa938000 task.ti: ffffffc0fa938000
PC is at __torture_print_stats+0x18/0x180 [locktorture]
LR is at lock_torture_stats_print+0x68/0x110 [locktorture]
pc : [<ffffffbffc017028>] lr : [<ffffffbffc017500>] pstate: 60000145
sp : ffffffc0fa93bb20
[snip...]
Call trace:
[<ffffffbffc017028>] __torture_print_stats+0x18/0x180 [locktorture]
[<ffffffbffc017500>] lock_torture_stats_print+0x68/0x110 [locktorture]
[<ffffffbffc0180fc>] lock_torture_cleanup+0xc4/0x278 [locktorture]
[<ffffffbffc01d144>] lock_torture_init+0x144/0x5b0 [locktorture]
[<ffffffc000082940>] do_one_initcall+0x94/0x1a0
[<ffffffc000141888>] do_init_module+0x60/0x1c8
[<ffffffc00011c628>] load_module+0x1880/0x1c9c
[<ffffffc00011cc00>] SyS_finit_module+0x7c/0x88
[<ffffffc000085cb0>] el0_svc_naked+0x24/0x28

Fix it by check statp in __torture_print_stats(), if it is NULL, just
create a lock-torture-statistics message with zero statistic argument.

Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

---
 kernel/locking/locktorture.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

-- 
2.6.0.GIT

Comments

Kefeng Wang Jan. 30, 2016, 2:46 a.m. | #1
Hi Paul,

On 2016/1/28 12:25, Kefeng Wang wrote:
> Insmod locktorture with torture_type=mutex will lead to crash,

> 

> Unable to handle kernel NULL pointer dereference at virtual address 00000008

> pgd = ffffffc0f6c10000

> [00000008] *pgd=000000013b221003, *pud=000000013b221003, *pmd=0000000000000000a

> Internal error: Oops: 94000006 [#1] PREEMPT SMP

> Modules linked in: locktorture(+) torture

> CPU: 2 PID: 1462 Comm: insmod Not tainted 4.4.0+ #19

> Hardware name: linux,dummy-virt (DT)

> task: ffffffc0fb2b3700 ti: ffffffc0fa938000 task.ti: ffffffc0fa938000

> PC is at __torture_print_stats+0x18/0x180 [locktorture]

> LR is at lock_torture_stats_print+0x68/0x110 [locktorture]

> pc : [<ffffffbffc017028>] lr : [<ffffffbffc017500>] pstate: 60000145

> sp : ffffffc0fa93bb20

> [snip...]

> Call trace:

> [<ffffffbffc017028>] __torture_print_stats+0x18/0x180 [locktorture]

> [<ffffffbffc017500>] lock_torture_stats_print+0x68/0x110 [locktorture]

> [<ffffffbffc0180fc>] lock_torture_cleanup+0xc4/0x278 [locktorture]

> [<ffffffbffc01d144>] lock_torture_init+0x144/0x5b0 [locktorture]

> [<ffffffc000082940>] do_one_initcall+0x94/0x1a0

> [<ffffffc000141888>] do_init_module+0x60/0x1c8

> [<ffffffc00011c628>] load_module+0x1880/0x1c9c

> [<ffffffc00011cc00>] SyS_finit_module+0x7c/0x88

> [<ffffffc000085cb0>] el0_svc_naked+0x24/0x28

> 

> Fix it by check statp in __torture_print_stats(), if it is NULL, just

> create a lock-torture-statistics message with zero statistic argument.


It is keep to get the statistics printout at the end if with bad argument,
what's your opinion about this version?

Thanks,
Kefeng

> 

> Cc: Josh Triplett <josh@joshtriplett.org>

> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

> ---

>  kernel/locking/locktorture.c | 24 ++++++++++++++----------

>  1 file changed, 14 insertions(+), 10 deletions(-)

> 

> diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c

> index 8ef1919..7f0cf9c 100644

> --- a/kernel/locking/locktorture.c

> +++ b/kernel/locking/locktorture.c

> @@ -647,19 +647,23 @@ static void __torture_print_stats(char *page,

>  	bool fail = 0;

>  	int i, n_stress;

>  	long max = 0;

> -	long min = statp[0].n_lock_acquired;

> +	long min = 0;

>  	long long sum = 0;

>  

> -	n_stress = write ? cxt.nrealwriters_stress : cxt.nrealreaders_stress;

> -	for (i = 0; i < n_stress; i++) {

> -		if (statp[i].n_lock_fail)

> -			fail = true;

> -		sum += statp[i].n_lock_acquired;

> -		if (max < statp[i].n_lock_fail)

> -			max = statp[i].n_lock_fail;

> -		if (min > statp[i].n_lock_fail)

> -			min = statp[i].n_lock_fail;

> +	if (statp) {

> +		min = statp[0].n_lock_acquired;

> +		n_stress = write ? cxt.nrealwriters_stress : cxt.nrealreaders_stress;

> +		for (i = 0; i < n_stress; i++) {

> +			if (statp[i].n_lock_fail)

> +				fail = true;

> +			sum += statp[i].n_lock_acquired;

> +			if (max < statp[i].n_lock_fail)

> +				max = statp[i].n_lock_fail;

> +			if (min > statp[i].n_lock_fail)

> +				min = statp[i].n_lock_fail;

> +		}

>  	}

> +

>  	page += sprintf(page,

>  			"%s:  Total: %lld  Max/Min: %ld/%ld %s  Fail: %d %s\n",

>  			write ? "Writes" : "Reads ",

>
Kefeng Wang Feb. 1, 2016, 2:25 a.m. | #2
Hi Davidlohr,

On 2016/2/1 6:17, Davidlohr Bueso wrote:
> On Sat, 30 Jan 2016, Paul E. McKenney wrote:

> 

>>> On 2016/1/28 12:25, Kefeng Wang wrote:

>>> > Insmod locktorture with torture_type=mutex will lead to crash,

> 

> You actually want mutex_lock here, we always use the _lock suffix, mainly

> because it all started out with spin_lock. And you just showed how fragile

> this is -- I'd say most of use use this module in a scripted setup, which

> is why it was not seen before.

> 

[snip...]
> 

> So we shouldn't be doing anything with statistics here in the first place, as

> it was a bogus argument. Instead, we should just exit with EINVAL. Something

> like the below does the trick for me.

> 


Yes, it works, but what you are doing is to revert commit a36a99618b1adb2d6ca0b7e08e3a656a04e477fe
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Sun Aug 30 20:01:48 2015 -0700

    locktorture: Fix module unwind when bad torture_type specified

    The locktorture module has a list of torture types, and specifying
    a type not on this list is supposed to cleanly fail the module load.
    Unfortunately, the "fail" happens without the "cleanly".  This commit
    therefore adds the needed clean-up after an incorrect torture_type.


If we revert it, the cleanup is not completely after insmod lockture with bad torture_type,
we can't insmod lockture with correct argument anymore, we will get following print,

-bash-4.3# insmod locktorture.ko torture_type=mutex
[  340.872178] lock-torture: invalid torture type: "mutex"
[  340.873185] lock-torture types:
[  340.873665]  lock_busted spin_lock
[  340.874182]  spin_lock_irq rw_lock
[  340.883853]  rw_lock_irq mutex_lock
[  340.884238]  rtmutex_lock rwsem_lock
[  340.884640]  percpu_rwsem_lock[  340.884941]
insmod: ERROR: could not insert module locktorture.ko: Invalid parameters
-bash-4.3# lsmod
Module                  Size  Used by
torture                17672  0
-bash-4.3# insmod locktorture.ko torture_type=mutex_lock
[  356.796114] torture_init_begin: refusing mutex_lock init: mutex running
insmod: ERROR: could not insert module locktorture.ko: Device or resource busy

And what Paul wanted is something that would print the full statistics
at the end regardless of the periodic statistics.

I prefer my version 2, here is some log with my patch v2, it is keep consistent
with rcutorture.
-------------------------------------------------------
-bash-4.3# insmod locktorture.ko torture_type=mutex
[  190.845067] lock-torture: invalid torture type: "mutex"
[  190.845748] lock-torture types:
[  190.846099]  lock_busted spin_lock
[  190.863211]  spin_lock_irq rw_lock
[  190.863668]  rw_lock_irq mutex_lock
[  190.864049]  rtmutex_lock rwsem_lock
[  190.864390]  percpu_rwsem_lock[  190.864686]
[  190.865662] Writes:  Total: 0  Max/Min: 0/0   Fail: 0
[  190.866218] Reads :  Total: 0  Max/Min: 0/0   Fail: 0
[  190.875071] mutex-torture:--- End of test: SUCCESS: nwriters_stress=0 nreaders_stress=0 stat_interval=60 verbose=1 shuffle_interval=3 stutter=5 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0
insmod: ERROR: could not insert module locktorture.ko: Invalid parameters

-bash-4.3# insmod locktorture.ko torture_type=mutex_lock
[  201.440136] mutex_lock-torture:--- Start of test: nwriters_stress=2 nreaders_stress=0 stat_interval=60 verbose=1 shuffle_interval=3 stutter=5 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0
[  201.441666] mutex_lock-torture: Creating torture_shuffle task
[  201.447173] mutex_lock-torture: Creating torture_stutter task
[  201.448124] mutex_lock-torture: torture_shuffle task started
[  201.452483] mutex_lock-torture: Creating lock_torture_writer task
[  201.453670] mutex_lock-torture: torture_stutter task started
[  201.459217] mutex_lock-torture: Creating lock_torture_writer task
[  201.460204] mutex_lock-torture: lock_torture_writer task started
[  201.460976] mutex_lock-torture: Creating lock_torture_stats task
[  201.461668] mutex_lock-torture: lock_torture_writer task started
[  201.471916] mutex_lock-torture: lock_torture_stats task started
-bash-4.3# rmmod locktorture.ko
[  209.294964] mutex_lock-torture: Stopping torture_shuffle task
[  209.295874] mutex_lock-torture: Stopping torture_shuffle
[  209.296847] mutex_lock-torture: Stopping torture_stutter task
[  209.297458] mutex_lock-torture: Stopping torture_stutter
[  209.301694] mutex_lock-torture: Stopping lock_torture_writer task
[  209.318665] mutex_lock-torture: Stopping lock_torture_writer
[  209.319522] mutex_lock-torture: Stopping lock_torture_writer task
[  209.341051] mutex_lock-torture: Stopping lock_torture_writer
[  209.341864] mutex_lock-torture: Stopping lock_torture_stats task
[  209.344949] Writes:  Total: 378  Max/Min: 0/0   Fail: 0
[  209.345617] mutex_lock-torture: Stopping lock_torture_stats
[  209.346817] Writes:  Total: 378  Max/Min: 0/0   Fail: 0
[  209.347389] mutex_lock-torture:--- End of test: SUCCESS: nwriters_stress=2 nreaders_stress=0 stat_interval=60 verbose=1 shuffle_interval=3 stutter=5 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0
-------------------------------------------------------

Thanks,
Kefeng

> Thanks,

> Davidlohr

> 

> ----8<--------------------------------------------------------------------------

> From: Davidlohr Bueso <dave@stgolabs.net>

> Subject: [PATCH] locktorture: Do not deal with statistics upon bogus input

> 

> Kefeng reported the following nil pointer dereference when

> passing an invalid lock type to torture.

> 

> [  114.887250] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008

> [  114.887254] IP: [<ffffffffa077808d>] __torture_print_stats+0x1d/0x100 [locktorture]

> ...

> [  114.887315] CPU: 6 PID: 7080 Comm: modprobe Tainted: G          I E   4.5.0-rc1-52.39-default+ #2

> [  114.887316] Hardware name: Dell Inc. PowerEdge R510/0W844P, BIOS 1.3.6 05/25/2010

> [  114.887318] task: ffff880196610040 ti: ffff8801a8ca4000 task.ti: ffff8801a8ca4000

> [  114.887322] RIP: 0010:[<ffffffffa077808d>]  [<ffffffffa077808d>] __torture_print_stats+0x1d/0x100 [locktorture]

> [  114.887324] RSP: 0018:ffff8801a8ca7c38  EFLAGS: 00010202

> [  114.887326] RAX: 0000000000000000 RBX: 0000000000004000 RCX: ffffea0006a12b20

> [  114.887327] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8801aa7bc000

> [  114.887328] RBP: ffff8801a8ca7c58 R08: 0000000000000004 R09: ffff8801aa7bc000

> [  114.887329] R10: 000000000001c1a8 R11: ffff8801afc77d10 R12: 0000000000004000

> [  114.887330] R13: ffff8801aa7bc000 R14: ffff8801aaa312b0 R15: ffff8801a8ca7ec8

> [  114.887332] FS:  00007f13182bc700(0000) GS:ffff8801afc60000(0000) knlGS:0000000000000000

> [  114.887333] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b

> [  114.887334] CR2: 0000000000000008 CR3: 0000000196eb1000 CR4: 00000000000006e0

> [  114.887335] Stack:

> [  114.887337]  0000000000004000 ffffffffa077a000 ffff8801aaa312b0 0000000000004000

> [  114.887339]  ffff8801a8ca7c80 ffffffffa0778ba1 0000000000000048 00000000ffffffff

> [  114.887341]  ffffffffa077a000 ffff8801a8ca7cd0 ffffffffa0778d6d ffff8801a8ca7cb0

> [  114.887341] Call Trace:

> [  114.887347]  [<ffffffffa0778ba1>] lock_torture_stats_print+0x71/0x100 [locktorture]

> [  114.887351]  [<ffffffffa0778d6d>] lock_torture_cleanup+0xcd/0x28b [locktorture]

> [  114.887358]  [<ffffffff81084f6c>] ? blocking_notifier_chain_register+0x5c/0xa0

> [  114.887361]  [<ffffffff81085f88>] ? register_reboot_notifier+0x18/0x20

> [  114.887365]  [<ffffffffa077e0fa>] lock_torture_init+0xfa/0x1000 [locktorture]

> [  114.887367]  [<ffffffffa077e000>] ? 0xffffffffa077e000

> [  114.887372]  [<ffffffff810003dd>] do_one_initcall+0xad/0x1e0

> 

> There is no reason for us to be calling into statistics logic

> as we should just return to the user:

> 

> lock-torture: invalid torture type: "mutex"

> 

> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>

> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>

> ---

>  kernel/locking/locktorture.c | 5 +++--

>  1 file changed, 3 insertions(+), 2 deletions(-)

> 

> diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c

> index 8ef1919..f613fff 100644

> --- a/kernel/locking/locktorture.c

> +++ b/kernel/locking/locktorture.c

> @@ -811,8 +811,9 @@ static int __init lock_torture_init(void)

>          for (i = 0; i < ARRAY_SIZE(torture_ops); i++)

>              pr_alert(" %s", torture_ops[i]->name);

>          pr_alert("\n");

> -        firsterr = -EINVAL;

> -        goto unwind;

> +

> +        torture_init_end();

> +        return -EINVAL;

>      }

>      if (cxt.cur_ops->init)

>          cxt.cur_ops->init();
Kefeng Wang Feb. 1, 2016, 3:28 a.m. | #3
On 2016/2/1 11:02, Davidlohr Bueso wrote:
> On Mon, 01 Feb 2016, Kefeng Wang wrote:

> 

>> Hi Davidlohr,

>>

[...]
>>

>> Yes, it works, but what you are doing is to revert commit a36a99618b1adb2d6ca0b7e08e3a656a04e477fe

> 

> Oh, I see. I was definitely not aware of that one.

> 

> [...]

> 

>> And what Paul wanted is something that would print the full statistics

>> at the end regardless of the periodic statistics.

>>

>> I prefer my version 2, here is some log with my patch v2, it is keep consistent

>> with rcutorture.

>> -------------------------------------------------------

>> -bash-4.3# insmod locktorture.ko torture_type=mutex

>> [  190.845067] lock-torture: invalid torture type: "mutex"

>> [  190.845748] lock-torture types:

>> [  190.846099]  lock_busted spin_lock

>> [  190.863211]  spin_lock_irq rw_lock

>> [  190.863668]  rw_lock_irq mutex_lock

>> [  190.864049]  rtmutex_lock rwsem_lock

>> [  190.864390]  percpu_rwsem_lock[  190.864686]

>> [  190.865662] Writes:  Total: 0  Max/Min: 0/0   Fail: 0

>> [  190.866218] Reads :  Total: 0  Max/Min: 0/0   Fail: 0

>> [  190.875071] mutex-torture:--- End of test: SUCCESS: nwriters_stress=0 nreaders_stress=0 stat_interval=60 verbose=1 shuffle_interval=3 stutter=5 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0

> 

> How can the above be a successful run (SUCCESS string) if we didn't pass a

> valid torture_type? iow, there is no test without it. Just think of passing

> the wrong param in a userland application, 99.999% of the tools simply error

> out complaining about the bogus input.

> 

> I think the right approach would be to decouple the statistics from the cleanup,

> that way we can still do the required cleanups and avoid any stats.


Just like I mentioned before, keep consistent with rcutorture,I am happy to any
way to solve this issue,

To Paul, what's your advice?

Thanks,
Kefeng

Attached rcutorture log,
----------------------------------------------------------
-bash-4.3# insmod rcutorture.ko torture_type=xxx
[ 3195.103753] rcu-torture: invalid torture type: "xxx"
[ 3195.104328] rcu-torture types:
[ 3195.104610]  rcu rcu_bh
[ 3195.104889]  rcu_busted srcu
[ 3195.105188]  srcud sched
[ 3195.105464]  tasks[ 3195.105680]
[ 3196.121026] xxx-torture: rtc:           (null) ver: 0 tfle: 1 rta: 0 rtaf: 0 rtf: 0 rtmbe: 0 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 0 onoff: 0/0:0/0 -1,0:-1,0 0:0 (HZ=250) barrier: 0/0:0 cbflood: 0
[ 3196.125066] xxx-torture: Reader Pipe:  0 0 0 0 0 0 0 0 0 0 0
[ 3196.126176] xxx-torture: Reader Batch:  0 0 0 0 0 0 0 0 0 0 0
[ 3196.128884] xxx-torture: Free-Block Circulation:  0 0 0 0 0 0 0 0 0 0 0
[ 3196.129945] xxx-torture:--- End of test: SUCCESS: nreaders=0 nfakewriters=4 stat_interval=60 verbose=1 test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4 shutdown_secs=0 stall_cpu=0 stall_cpu_holdoff=10 n_barrier_cbs=0 onoff_interval=0 onoff_holdoff=0
insmod: ERROR: could not insert module rcutorture.ko: Invalid parameters


-bash-4.3# insmod rcutorture.ko torture_type=rcu
[ 3205.456128] rcu-torture:--- Start of test: nreaders=1 nfakewriters=4 stat_interval=60 verbose=1 test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4 shutdown_secs=0 stall_cpu=0 stall_cpu_holdoff=10 n_barrier_cbs=0 onoff_interval=0 onoff_holdoff=0
[ 3205.474983] rcu-torture: Creating rcu_torture_writer task
[ 3205.478951] rcu-torture: Creating rcu_torture_fakewriter task
[ 3205.479966] rcu-torture: rcu_torture_writer task started
[ 3205.480410] rcu-torture: Grace periods expedited from boot/sysfs for rcu,
[ 3205.480942] rcu-torture: Testing of dynamic grace-period expediting diabled.
[ 3205.482951] rcu-torture: Creating rcu_torture_fakewriter task
[ 3205.483854] rcu-torture: rcu_torture_fakewriter task started
[ 3205.491931] rcu-torture: Creating rcu_torture_fakewriter task
[ 3205.492858] rcu-torture: rcu_torture_fakewriter task started
[ 3205.493613] rcu-torture: Creating rcu_torture_fakewriter task
[ 3205.495176] rcu-torture: rcu_torture_fakewriter task started
[ 3205.499013] rcu-torture: Creating rcu_torture_reader task
[ 3205.499919] rcu-torture: rcu_torture_fakewriter task started
[ 3205.500578] rcu-torture: Creating rcu_torture_stats task
[ 3205.501297] rcu-torture: rcu_torture_reader task started
[ 3205.502903] rcu-torture: Creating torture_shuffle task
[ 3205.503797] rcu-torture: rcu_torture_stats task started
[ 3205.514980] rcu-torture: Creating torture_stutter task
[ 3205.515871] rcu-torture: torture_shuffle task started
[ 3205.520461] rcu-torture: Creating rcu_torture_cbflood task
[ 3205.521379] rcu-torture: torture_stutter task started
[ 3205.529569] rcu-torture: rcu_torture_cbflood task started
-bash-4.3# rmmod rcutorture.ko
[ 3209.438170] rcu-torture: Stopping torture_shuffle task
[ 3209.439312] rcu-torture: Stopping torture_shuffle
[ 3209.440150] rcu-torture: Stopping torture_stutter task
[ 3209.446624] rcu-torture: Stopping torture_stutter
[ 3209.447376] rcu-torture: Stopping rcu_torture_fakewriter
[ 3209.448028] rcu-torture: Stopping rcu_torture_writer task
[ 3209.448594] rcu-torture: Stopping rcu_torture_fakewriter
[ 3209.449088] rcu-torture: Stopping rcu_torture_reader
[ 3209.453865] rcu-torture: Stopping rcu_torture_fakewriter
[ 3209.454995] rcu-torture: Stopping rcu_torture_fakewriter
[ 3209.455551] rcu-torture: Stopping rcu_torture_writer
[ 3209.456288] rcu-torture: Stopping rcu_torture_reader task
[ 3209.457060] rcu-torture: Stopping rcu_torture_fakewriter task
[ 3209.457836] rcu-torture: Stopping rcu_torture_fakewriter task
[ 3209.462206] rcu-torture: Stopping rcu_torture_fakewriter task
[ 3209.463673] rcu-torture: Stopping rcu_torture_fakewriter task
[ 3209.464398] rcu-torture: Stopping rcu_torture_stats task
[ 3209.465139] rcu-torture: rtc:           (null) ver: 217 tfle: 0 rta: 217 rtaf: 0 rtf: 206 rtmbe: 0 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 740 onoff: 0/0:0/0 -1,0:-1,0 0:0 (HZ=250) barrier: 0/0:0 cbflood: 3
[ 3209.469436] rcu-torture: Reader Pipe:  776988 96 0 0 0 0 0 0 0 0 0
[ 3209.471702] rcu-torture: Reader Batch:  620212 156872 0 0 0 0 0 0 0 0 0
[ 3209.472626] rcu-torture: Free-Block Circulation:  216 215 214 213 211 210 209 208 207 206 0
[ 3209.473730] rcu-torture: Stopping rcu_torture_stats
[ 3209.474885] rcu-torture: Stopping rcu_torture_cbflood task
[ 3209.539215] rcu-torture: Stopping rcu_torture_cbflood
[ 3209.540049] rcu-torture: rtc:           (null) ver: 217 tfle: 0 rta: 217 rtaf: 0 rtf: 206 rtmbe: 0 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 740 onoff: 0/0:0/0 -1,0:-1,0 0:0 (HZ=250) barrier: 0/0:0 cbflood: 4
[ 3209.541674] rcu-torture: Reader Pipe:  776988 96 0 0 0 0 0 0 0 0 0
[ 3209.544527] rcu-torture: Reader Batch:  620212 156872 0 0 0 0 0 0 0 0 0
[ 3209.545288] rcu-torture: Free-Block Circulation:  216 215 214 213 211 210 209 208 207 206 0
[ 3209.548506] rcu-torture:--- End of test: SUCCESS: nreaders=1 nfakewriters=4 stat_interval=60 verbose=1 test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4 shutdown_secs=0 stall_cpu=0 stall_cpu_holdoff=10 n_barrier_cbs=0 onoff_interval=0 onoff_holdoff=0

> 

> Thanks,

> Davidlohr

> 

> .

>
Kefeng Wang March 3, 2016, 1:37 a.m. | #4
On 2016/3/3 5:12, Paul E. McKenney wrote:
> On Wed, Mar 02, 2016 at 11:55:43AM -0800, Davidlohr Bueso wrote:

>> On Tue, 02 Feb 2016, Davidlohr Bueso wrote:

>>

>> I've just hit this issue myself and remembered this thread :)

>>

>> Paul, folks, does the below patch look reasonable to you? If so

>> I can properly resend. thanks.

> 

> If it works for Kefeng Wang, I would be happy to take it.


Yes, it works for me, tested on my board.


> 

> 							Thanx, Paul

>
Kefeng Wang March 3, 2016, 4:31 a.m. | #5
Hi Davidlohr and Paul,

On 2016/3/3 9:37, Kefeng Wang wrote:
> 

> 

> On 2016/3/3 5:12, Paul E. McKenney wrote:

>> On Wed, Mar 02, 2016 at 11:55:43AM -0800, Davidlohr Bueso wrote:

>>> On Tue, 02 Feb 2016, Davidlohr Bueso wrote:

>>>

>>> I've just hit this issue myself and remembered this thread :)

>>>

>>> Paul, folks, does the below patch look reasonable to you? If so

>>> I can properly resend. thanks.

>>

>> If it works for Kefeng Wang, I would be happy to take it.

> 

> Yes, it works for me, tested on my board.

> 


Even if we merge Davidlohr's patch, I think we still need my v2 patch,
here is a scene,
----------
cxt.lwsa = kmalloc(sizeof(*cxt.lwsa) * cxt.nrealwriters_stress, GFP_KERNEL);
if (cxt.lwsa == NULL) {
	goto unwind;
}

or

cxt.lrsa = kmalloc(sizeof(*cxt.lrsa) * cxt.nrealreaders_stress, GFP_KERNEL);
if (cxt.lrsa == NULL) {
        VERBOSE_TOROUT_STRING("cxt.lrsa: Out of memory");
        firsterr = -ENOMEM;
        kfree(cxt.lwsa);
        goto unwind;
}
----------
we will get cxt.lwsa = NULL, and go to cleanup, then in

static void __torture_print_stats(char *page,
                                  struct lock_stress_stats *statp, bool write)
{
        bool fail = 0;
        int i, n_stress;
        long max = 0;
        long min = statp[0].n_lock_acquired;   // here, *we will meet NULL pointer dereference*

}

and my patch v2 solve this issue too, so it is still needed.

Thanks,
Kefeng


> 

>>

>> 							Thanx, Paul

>>

>
Kefeng Wang March 7, 2016, 2 a.m. | #6
On 2016/3/3 16:36, Davidlohr Bueso wrote:
> On Thu, 03 Mar 2016, Kefeng Wang wrote:

> 


> 

> The below should take care of both issues, what do you think?

> 

> Thanks,

> Davidlohr

> 

> <8-------------------------------------------------------------------------

> Subject: [PATCH] locktorture: Fix nil pointer dereferencing for cleanup paths

> 

> It has been found that paths that invoke cleanups through

> lock_torture_cleanup() can incur in nil pointer dereferencing

> bugs during the statistics printing phase. This is mainly

> because we should not be calling into statistics before we are

> sure things have been setup correctly.

> 

> Specifically, early checks (and the need for handling this in

> the cleanup call) only include parameter checks and basic

> statistics allocation. Once we start write/read kthreads

> we then consider the test as started. As such, update the func

> in question to check for cxt.lwsa writer stats, if not set,

> we either have a bogus parameter or ENOMEM situation and

> therefore only need to deal with general torture calls.

> 

> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>

> ---

> XXX: while looking at the code, do we need at least a stat_interval > 0

> check before stopping the lock_torture_stats kthread?

> 

>  kernel/locking/locktorture.c | 11 +++++++++++

>  1 file changed, 11 insertions(+)

> 

> diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c

> index 8ef1919..1942848 100644

> --- a/kernel/locking/locktorture.c

> +++ b/kernel/locking/locktorture.c

> @@ -748,6 +748,15 @@ static void lock_torture_cleanup(void)

>      if (torture_cleanup_begin())

>          return;

>  

> +    /*

> +     * Indicates early cleanup, meaning that the test has not run,

> +     * such as when passing bogus args when loading the module. As

> +     * such, only perform the underlying torture-specific cleanups,

> +     * and avoid anything related to locktorture.

> +     */

> +    if (!cxt.lwsa)

> +        goto end;


Sorry for the late response, the cxt.lrsa should be taken into account too.

> +

>      if (writer_tasks) {

>          for (i = 0; i < cxt.nrealwriters_stress; i++)

>              torture_stop_kthread(lock_torture_writer,

> @@ -776,6 +785,7 @@ static void lock_torture_cleanup(void)

>      else

>          lock_torture_print_module_parms(cxt.cur_ops,

>                          "End of test: SUCCESS");

> +end:

>      torture_cleanup_end();

>  }

>  

> @@ -878,6 +888,7 @@ static int __init lock_torture_init(void)

>              cxt.lrsa[i].n_lock_acquired = 0;

>          }

>      }

> +

>      lock_torture_print_module_parms(cxt.cur_ops, "Start of test");

>  

>      /* Prepare torture context. */
Kefeng Wang March 7, 2016, 7:02 a.m. | #7
On 2016/3/7 13:40, Davidlohr Bueso wrote:
> On Mon, 07 Mar 2016, Kefeng Wang wrote:

>> On 2016/3/3 16:36, Davidlohr Bueso wrote:

> 

>>> +    /*

>>> +     * Indicates early cleanup, meaning that the test has not run,

>>> +     * such as when passing bogus args when loading the module. As

>>> +     * such, only perform the underlying torture-specific cleanups,

>>> +     * and avoid anything related to locktorture.

>>> +     */

>>> +    if (!cxt.lwsa)

>>> +        goto end;

>>

>> Sorry for the late response, the cxt.lrsa should be taken into account too.

> 

> I am taking it into account, note that we kfree lwsa if lrsa fails memory

> allocation. Of course we should be defensive, so go ahead and explicitly set

> it to nil. v2 below, otherwise same patch.


This one looks good, and tested on my board.


> 

> -----8<--------------------------

> Subject: [PATCH v2] locktorture: Fix nil pointer dereferencing for cleanup paths

> 

> It has been found that paths that invoke cleanups through

> lock_torture_cleanup() can incur in nil pointer dereferencing

> bugs during the statistics printing phase. This is mainly

> because we should not be calling into statistics before we are

> sure things have been setup correctly.

> 

> Specifically, early checks (and the need for handling this in

> the cleanup call) only include parameter checks and basic

> statistics allocation. Once we start write/read kthreads

> we then consider the test as started. As such, update the func

> in question to check for cxt.lwsa writer stats, if not set,

> we either have a bogus parameter or ENOMEM situation and

> therefore only need to deal with general torture calls

> 

> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>

> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>

> ---

>  kernel/locking/locktorture.c | 12 ++++++++++++

>  1 file changed, 12 insertions(+)

> 

> diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c

> index 8ef1919..b5bc243 100644

> --- a/kernel/locking/locktorture.c

> +++ b/kernel/locking/locktorture.c

> @@ -748,6 +748,15 @@ static void lock_torture_cleanup(void)

>      if (torture_cleanup_begin())

>          return;

>  

> +    /*

> +     * Indicates early cleanup, meaning that the test has not run,

> +     * such as when passing bogus args when loading the module. As

> +     * such, only perform the underlying torture-specific cleanups,

> +     * and avoid anything related to locktorture.

> +     */

> +    if (!cxt.lwsa)

> +        goto end;

> +

>      if (writer_tasks) {

>          for (i = 0; i < cxt.nrealwriters_stress; i++)

>              torture_stop_kthread(lock_torture_writer,

> @@ -776,6 +785,7 @@ static void lock_torture_cleanup(void)

>      else

>          lock_torture_print_module_parms(cxt.cur_ops,

>                          "End of test: SUCCESS");

> +end:

>      torture_cleanup_end();

>  }

>  

> @@ -870,6 +880,7 @@ static int __init lock_torture_init(void)

>              VERBOSE_TOROUT_STRING("cxt.lrsa: Out of memory");

>              firsterr = -ENOMEM;

>              kfree(cxt.lwsa);

> +            cxt.lwsa = NULL;

>              goto unwind;

>          }

>  

> @@ -878,6 +889,7 @@ static int __init lock_torture_init(void)

>              cxt.lrsa[i].n_lock_acquired = 0;

>          }

>      }

> +

>      lock_torture_print_module_parms(cxt.cur_ops, "Start of test");

>  

>      /* Prepare torture context. */
Kefeng Wang March 8, 2016, 2:10 a.m. | #8
On 2016/3/7 21:37, Paul E. McKenney wrote:
> On Mon, Mar 07, 2016 at 03:02:05PM +0800, Kefeng Wang wrote:

>>

>>

>> On 2016/3/7 13:40, Davidlohr Bueso wrote:

>>> On Mon, 07 Mar 2016, Kefeng Wang wrote:

>>>> On 2016/3/3 16:36, Davidlohr Bueso wrote:

>>>

>>>>> +    /*

>>>>> +     * Indicates early cleanup, meaning that the test has not run,

>>>>> +     * such as when passing bogus args when loading the module. As

>>>>> +     * such, only perform the underlying torture-specific cleanups,

>>>>> +     * and avoid anything related to locktorture.

>>>>> +     */

>>>>> +    if (!cxt.lwsa)

>>>>> +        goto end;

>>>>

>>>> Sorry for the late response, the cxt.lrsa should be taken into account too.

>>>

>>> I am taking it into account, note that we kfree lwsa if lrsa fails memory

>>> allocation. Of course we should be defensive, so go ahead and explicitly set

>>> it to nil. v2 below, otherwise same patch.

>>

>> This one looks good, and tested on my board.

> 

> Very good!  May we add your Tested-by?


Sure, please.

> 

> 							Thanx, Paul

Patch

diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index 8ef1919..7f0cf9c 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -647,19 +647,23 @@  static void __torture_print_stats(char *page,
 	bool fail = 0;
 	int i, n_stress;
 	long max = 0;
-	long min = statp[0].n_lock_acquired;
+	long min = 0;
 	long long sum = 0;
 
-	n_stress = write ? cxt.nrealwriters_stress : cxt.nrealreaders_stress;
-	for (i = 0; i < n_stress; i++) {
-		if (statp[i].n_lock_fail)
-			fail = true;
-		sum += statp[i].n_lock_acquired;
-		if (max < statp[i].n_lock_fail)
-			max = statp[i].n_lock_fail;
-		if (min > statp[i].n_lock_fail)
-			min = statp[i].n_lock_fail;
+	if (statp) {
+		min = statp[0].n_lock_acquired;
+		n_stress = write ? cxt.nrealwriters_stress : cxt.nrealreaders_stress;
+		for (i = 0; i < n_stress; i++) {
+			if (statp[i].n_lock_fail)
+				fail = true;
+			sum += statp[i].n_lock_acquired;
+			if (max < statp[i].n_lock_fail)
+				max = statp[i].n_lock_fail;
+			if (min > statp[i].n_lock_fail)
+				min = statp[i].n_lock_fail;
+		}
 	}
+
 	page += sprintf(page,
 			"%s:  Total: %lld  Max/Min: %ld/%ld %s  Fail: %d %s\n",
 			write ? "Writes" : "Reads ",