Message ID | 1453955159-23216-1-git-send-email-wangkefeng.wang@huawei.com |
---|---|
State | New |
Headers | show |
Hi Paul, On 2016/1/28 12:25, Kefeng Wang wrote: > Insmod locktorture with torture_type=mutex will lead to crash, > > Unable to handle kernel NULL pointer dereference at virtual address 00000008 > pgd = ffffffc0f6c10000 > [00000008] *pgd=000000013b221003, *pud=000000013b221003, *pmd=0000000000000000a > Internal error: Oops: 94000006 [#1] PREEMPT SMP > Modules linked in: locktorture(+) torture > CPU: 2 PID: 1462 Comm: insmod Not tainted 4.4.0+ #19 > Hardware name: linux,dummy-virt (DT) > task: ffffffc0fb2b3700 ti: ffffffc0fa938000 task.ti: ffffffc0fa938000 > PC is at __torture_print_stats+0x18/0x180 [locktorture] > LR is at lock_torture_stats_print+0x68/0x110 [locktorture] > pc : [<ffffffbffc017028>] lr : [<ffffffbffc017500>] pstate: 60000145 > sp : ffffffc0fa93bb20 > [snip...] > Call trace: > [<ffffffbffc017028>] __torture_print_stats+0x18/0x180 [locktorture] > [<ffffffbffc017500>] lock_torture_stats_print+0x68/0x110 [locktorture] > [<ffffffbffc0180fc>] lock_torture_cleanup+0xc4/0x278 [locktorture] > [<ffffffbffc01d144>] lock_torture_init+0x144/0x5b0 [locktorture] > [<ffffffc000082940>] do_one_initcall+0x94/0x1a0 > [<ffffffc000141888>] do_init_module+0x60/0x1c8 > [<ffffffc00011c628>] load_module+0x1880/0x1c9c > [<ffffffc00011cc00>] SyS_finit_module+0x7c/0x88 > [<ffffffc000085cb0>] el0_svc_naked+0x24/0x28 > > Fix it by check statp in __torture_print_stats(), if it is NULL, just > create a lock-torture-statistics message with zero statistic argument. It is keep to get the statistics printout at the end if with bad argument, what's your opinion about this version? Thanks, Kefeng > > Cc: Josh Triplett <josh@joshtriplett.org> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> > --- > kernel/locking/locktorture.c | 24 ++++++++++++++---------- > 1 file changed, 14 insertions(+), 10 deletions(-) > > diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c > index 8ef1919..7f0cf9c 100644 > --- a/kernel/locking/locktorture.c > +++ b/kernel/locking/locktorture.c > @@ -647,19 +647,23 @@ static void __torture_print_stats(char *page, > bool fail = 0; > int i, n_stress; > long max = 0; > - long min = statp[0].n_lock_acquired; > + long min = 0; > long long sum = 0; > > - n_stress = write ? cxt.nrealwriters_stress : cxt.nrealreaders_stress; > - for (i = 0; i < n_stress; i++) { > - if (statp[i].n_lock_fail) > - fail = true; > - sum += statp[i].n_lock_acquired; > - if (max < statp[i].n_lock_fail) > - max = statp[i].n_lock_fail; > - if (min > statp[i].n_lock_fail) > - min = statp[i].n_lock_fail; > + if (statp) { > + min = statp[0].n_lock_acquired; > + n_stress = write ? cxt.nrealwriters_stress : cxt.nrealreaders_stress; > + for (i = 0; i < n_stress; i++) { > + if (statp[i].n_lock_fail) > + fail = true; > + sum += statp[i].n_lock_acquired; > + if (max < statp[i].n_lock_fail) > + max = statp[i].n_lock_fail; > + if (min > statp[i].n_lock_fail) > + min = statp[i].n_lock_fail; > + } > } > + > page += sprintf(page, > "%s: Total: %lld Max/Min: %ld/%ld %s Fail: %d %s\n", > write ? "Writes" : "Reads ", >
Hi Davidlohr, On 2016/2/1 6:17, Davidlohr Bueso wrote: > On Sat, 30 Jan 2016, Paul E. McKenney wrote: > >>> On 2016/1/28 12:25, Kefeng Wang wrote: >>> > Insmod locktorture with torture_type=mutex will lead to crash, > > You actually want mutex_lock here, we always use the _lock suffix, mainly > because it all started out with spin_lock. And you just showed how fragile > this is -- I'd say most of use use this module in a scripted setup, which > is why it was not seen before. > [snip...] > > So we shouldn't be doing anything with statistics here in the first place, as > it was a bogus argument. Instead, we should just exit with EINVAL. Something > like the below does the trick for me. > Yes, it works, but what you are doing is to revert commit a36a99618b1adb2d6ca0b7e08e3a656a04e477fe Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Date: Sun Aug 30 20:01:48 2015 -0700 locktorture: Fix module unwind when bad torture_type specified The locktorture module has a list of torture types, and specifying a type not on this list is supposed to cleanly fail the module load. Unfortunately, the "fail" happens without the "cleanly". This commit therefore adds the needed clean-up after an incorrect torture_type. If we revert it, the cleanup is not completely after insmod lockture with bad torture_type, we can't insmod lockture with correct argument anymore, we will get following print, -bash-4.3# insmod locktorture.ko torture_type=mutex [ 340.872178] lock-torture: invalid torture type: "mutex" [ 340.873185] lock-torture types: [ 340.873665] lock_busted spin_lock [ 340.874182] spin_lock_irq rw_lock [ 340.883853] rw_lock_irq mutex_lock [ 340.884238] rtmutex_lock rwsem_lock [ 340.884640] percpu_rwsem_lock[ 340.884941] insmod: ERROR: could not insert module locktorture.ko: Invalid parameters -bash-4.3# lsmod Module Size Used by torture 17672 0 -bash-4.3# insmod locktorture.ko torture_type=mutex_lock [ 356.796114] torture_init_begin: refusing mutex_lock init: mutex running insmod: ERROR: could not insert module locktorture.ko: Device or resource busy And what Paul wanted is something that would print the full statistics at the end regardless of the periodic statistics. I prefer my version 2, here is some log with my patch v2, it is keep consistent with rcutorture. ------------------------------------------------------- -bash-4.3# insmod locktorture.ko torture_type=mutex [ 190.845067] lock-torture: invalid torture type: "mutex" [ 190.845748] lock-torture types: [ 190.846099] lock_busted spin_lock [ 190.863211] spin_lock_irq rw_lock [ 190.863668] rw_lock_irq mutex_lock [ 190.864049] rtmutex_lock rwsem_lock [ 190.864390] percpu_rwsem_lock[ 190.864686] [ 190.865662] Writes: Total: 0 Max/Min: 0/0 Fail: 0 [ 190.866218] Reads : Total: 0 Max/Min: 0/0 Fail: 0 [ 190.875071] mutex-torture:--- End of test: SUCCESS: nwriters_stress=0 nreaders_stress=0 stat_interval=60 verbose=1 shuffle_interval=3 stutter=5 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0 insmod: ERROR: could not insert module locktorture.ko: Invalid parameters -bash-4.3# insmod locktorture.ko torture_type=mutex_lock [ 201.440136] mutex_lock-torture:--- Start of test: nwriters_stress=2 nreaders_stress=0 stat_interval=60 verbose=1 shuffle_interval=3 stutter=5 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0 [ 201.441666] mutex_lock-torture: Creating torture_shuffle task [ 201.447173] mutex_lock-torture: Creating torture_stutter task [ 201.448124] mutex_lock-torture: torture_shuffle task started [ 201.452483] mutex_lock-torture: Creating lock_torture_writer task [ 201.453670] mutex_lock-torture: torture_stutter task started [ 201.459217] mutex_lock-torture: Creating lock_torture_writer task [ 201.460204] mutex_lock-torture: lock_torture_writer task started [ 201.460976] mutex_lock-torture: Creating lock_torture_stats task [ 201.461668] mutex_lock-torture: lock_torture_writer task started [ 201.471916] mutex_lock-torture: lock_torture_stats task started -bash-4.3# rmmod locktorture.ko [ 209.294964] mutex_lock-torture: Stopping torture_shuffle task [ 209.295874] mutex_lock-torture: Stopping torture_shuffle [ 209.296847] mutex_lock-torture: Stopping torture_stutter task [ 209.297458] mutex_lock-torture: Stopping torture_stutter [ 209.301694] mutex_lock-torture: Stopping lock_torture_writer task [ 209.318665] mutex_lock-torture: Stopping lock_torture_writer [ 209.319522] mutex_lock-torture: Stopping lock_torture_writer task [ 209.341051] mutex_lock-torture: Stopping lock_torture_writer [ 209.341864] mutex_lock-torture: Stopping lock_torture_stats task [ 209.344949] Writes: Total: 378 Max/Min: 0/0 Fail: 0 [ 209.345617] mutex_lock-torture: Stopping lock_torture_stats [ 209.346817] Writes: Total: 378 Max/Min: 0/0 Fail: 0 [ 209.347389] mutex_lock-torture:--- End of test: SUCCESS: nwriters_stress=2 nreaders_stress=0 stat_interval=60 verbose=1 shuffle_interval=3 stutter=5 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0 ------------------------------------------------------- Thanks, Kefeng > Thanks, > Davidlohr > > ----8<-------------------------------------------------------------------------- > From: Davidlohr Bueso <dave@stgolabs.net> > Subject: [PATCH] locktorture: Do not deal with statistics upon bogus input > > Kefeng reported the following nil pointer dereference when > passing an invalid lock type to torture. > > [ 114.887250] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 > [ 114.887254] IP: [<ffffffffa077808d>] __torture_print_stats+0x1d/0x100 [locktorture] > ... > [ 114.887315] CPU: 6 PID: 7080 Comm: modprobe Tainted: G I E 4.5.0-rc1-52.39-default+ #2 > [ 114.887316] Hardware name: Dell Inc. PowerEdge R510/0W844P, BIOS 1.3.6 05/25/2010 > [ 114.887318] task: ffff880196610040 ti: ffff8801a8ca4000 task.ti: ffff8801a8ca4000 > [ 114.887322] RIP: 0010:[<ffffffffa077808d>] [<ffffffffa077808d>] __torture_print_stats+0x1d/0x100 [locktorture] > [ 114.887324] RSP: 0018:ffff8801a8ca7c38 EFLAGS: 00010202 > [ 114.887326] RAX: 0000000000000000 RBX: 0000000000004000 RCX: ffffea0006a12b20 > [ 114.887327] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8801aa7bc000 > [ 114.887328] RBP: ffff8801a8ca7c58 R08: 0000000000000004 R09: ffff8801aa7bc000 > [ 114.887329] R10: 000000000001c1a8 R11: ffff8801afc77d10 R12: 0000000000004000 > [ 114.887330] R13: ffff8801aa7bc000 R14: ffff8801aaa312b0 R15: ffff8801a8ca7ec8 > [ 114.887332] FS: 00007f13182bc700(0000) GS:ffff8801afc60000(0000) knlGS:0000000000000000 > [ 114.887333] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 114.887334] CR2: 0000000000000008 CR3: 0000000196eb1000 CR4: 00000000000006e0 > [ 114.887335] Stack: > [ 114.887337] 0000000000004000 ffffffffa077a000 ffff8801aaa312b0 0000000000004000 > [ 114.887339] ffff8801a8ca7c80 ffffffffa0778ba1 0000000000000048 00000000ffffffff > [ 114.887341] ffffffffa077a000 ffff8801a8ca7cd0 ffffffffa0778d6d ffff8801a8ca7cb0 > [ 114.887341] Call Trace: > [ 114.887347] [<ffffffffa0778ba1>] lock_torture_stats_print+0x71/0x100 [locktorture] > [ 114.887351] [<ffffffffa0778d6d>] lock_torture_cleanup+0xcd/0x28b [locktorture] > [ 114.887358] [<ffffffff81084f6c>] ? blocking_notifier_chain_register+0x5c/0xa0 > [ 114.887361] [<ffffffff81085f88>] ? register_reboot_notifier+0x18/0x20 > [ 114.887365] [<ffffffffa077e0fa>] lock_torture_init+0xfa/0x1000 [locktorture] > [ 114.887367] [<ffffffffa077e000>] ? 0xffffffffa077e000 > [ 114.887372] [<ffffffff810003dd>] do_one_initcall+0xad/0x1e0 > > There is no reason for us to be calling into statistics logic > as we should just return to the user: > > lock-torture: invalid torture type: "mutex" > > Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com> > Signed-off-by: Davidlohr Bueso <dbueso@suse.de> > --- > kernel/locking/locktorture.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c > index 8ef1919..f613fff 100644 > --- a/kernel/locking/locktorture.c > +++ b/kernel/locking/locktorture.c > @@ -811,8 +811,9 @@ static int __init lock_torture_init(void) > for (i = 0; i < ARRAY_SIZE(torture_ops); i++) > pr_alert(" %s", torture_ops[i]->name); > pr_alert("\n"); > - firsterr = -EINVAL; > - goto unwind; > + > + torture_init_end(); > + return -EINVAL; > } > if (cxt.cur_ops->init) > cxt.cur_ops->init();
On 2016/2/1 11:02, Davidlohr Bueso wrote: > On Mon, 01 Feb 2016, Kefeng Wang wrote: > >> Hi Davidlohr, >> [...] >> >> Yes, it works, but what you are doing is to revert commit a36a99618b1adb2d6ca0b7e08e3a656a04e477fe > > Oh, I see. I was definitely not aware of that one. > > [...] > >> And what Paul wanted is something that would print the full statistics >> at the end regardless of the periodic statistics. >> >> I prefer my version 2, here is some log with my patch v2, it is keep consistent >> with rcutorture. >> ------------------------------------------------------- >> -bash-4.3# insmod locktorture.ko torture_type=mutex >> [ 190.845067] lock-torture: invalid torture type: "mutex" >> [ 190.845748] lock-torture types: >> [ 190.846099] lock_busted spin_lock >> [ 190.863211] spin_lock_irq rw_lock >> [ 190.863668] rw_lock_irq mutex_lock >> [ 190.864049] rtmutex_lock rwsem_lock >> [ 190.864390] percpu_rwsem_lock[ 190.864686] >> [ 190.865662] Writes: Total: 0 Max/Min: 0/0 Fail: 0 >> [ 190.866218] Reads : Total: 0 Max/Min: 0/0 Fail: 0 >> [ 190.875071] mutex-torture:--- End of test: SUCCESS: nwriters_stress=0 nreaders_stress=0 stat_interval=60 verbose=1 shuffle_interval=3 stutter=5 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0 > > How can the above be a successful run (SUCCESS string) if we didn't pass a > valid torture_type? iow, there is no test without it. Just think of passing > the wrong param in a userland application, 99.999% of the tools simply error > out complaining about the bogus input. > > I think the right approach would be to decouple the statistics from the cleanup, > that way we can still do the required cleanups and avoid any stats. Just like I mentioned before, keep consistent with rcutorture,I am happy to any way to solve this issue, To Paul, what's your advice? Thanks, Kefeng Attached rcutorture log, ---------------------------------------------------------- -bash-4.3# insmod rcutorture.ko torture_type=xxx [ 3195.103753] rcu-torture: invalid torture type: "xxx" [ 3195.104328] rcu-torture types: [ 3195.104610] rcu rcu_bh [ 3195.104889] rcu_busted srcu [ 3195.105188] srcud sched [ 3195.105464] tasks[ 3195.105680] [ 3196.121026] xxx-torture: rtc: (null) ver: 0 tfle: 1 rta: 0 rtaf: 0 rtf: 0 rtmbe: 0 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 0 onoff: 0/0:0/0 -1,0:-1,0 0:0 (HZ=250) barrier: 0/0:0 cbflood: 0 [ 3196.125066] xxx-torture: Reader Pipe: 0 0 0 0 0 0 0 0 0 0 0 [ 3196.126176] xxx-torture: Reader Batch: 0 0 0 0 0 0 0 0 0 0 0 [ 3196.128884] xxx-torture: Free-Block Circulation: 0 0 0 0 0 0 0 0 0 0 0 [ 3196.129945] xxx-torture:--- End of test: SUCCESS: nreaders=0 nfakewriters=4 stat_interval=60 verbose=1 test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4 shutdown_secs=0 stall_cpu=0 stall_cpu_holdoff=10 n_barrier_cbs=0 onoff_interval=0 onoff_holdoff=0 insmod: ERROR: could not insert module rcutorture.ko: Invalid parameters -bash-4.3# insmod rcutorture.ko torture_type=rcu [ 3205.456128] rcu-torture:--- Start of test: nreaders=1 nfakewriters=4 stat_interval=60 verbose=1 test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4 shutdown_secs=0 stall_cpu=0 stall_cpu_holdoff=10 n_barrier_cbs=0 onoff_interval=0 onoff_holdoff=0 [ 3205.474983] rcu-torture: Creating rcu_torture_writer task [ 3205.478951] rcu-torture: Creating rcu_torture_fakewriter task [ 3205.479966] rcu-torture: rcu_torture_writer task started [ 3205.480410] rcu-torture: Grace periods expedited from boot/sysfs for rcu, [ 3205.480942] rcu-torture: Testing of dynamic grace-period expediting diabled. [ 3205.482951] rcu-torture: Creating rcu_torture_fakewriter task [ 3205.483854] rcu-torture: rcu_torture_fakewriter task started [ 3205.491931] rcu-torture: Creating rcu_torture_fakewriter task [ 3205.492858] rcu-torture: rcu_torture_fakewriter task started [ 3205.493613] rcu-torture: Creating rcu_torture_fakewriter task [ 3205.495176] rcu-torture: rcu_torture_fakewriter task started [ 3205.499013] rcu-torture: Creating rcu_torture_reader task [ 3205.499919] rcu-torture: rcu_torture_fakewriter task started [ 3205.500578] rcu-torture: Creating rcu_torture_stats task [ 3205.501297] rcu-torture: rcu_torture_reader task started [ 3205.502903] rcu-torture: Creating torture_shuffle task [ 3205.503797] rcu-torture: rcu_torture_stats task started [ 3205.514980] rcu-torture: Creating torture_stutter task [ 3205.515871] rcu-torture: torture_shuffle task started [ 3205.520461] rcu-torture: Creating rcu_torture_cbflood task [ 3205.521379] rcu-torture: torture_stutter task started [ 3205.529569] rcu-torture: rcu_torture_cbflood task started -bash-4.3# rmmod rcutorture.ko [ 3209.438170] rcu-torture: Stopping torture_shuffle task [ 3209.439312] rcu-torture: Stopping torture_shuffle [ 3209.440150] rcu-torture: Stopping torture_stutter task [ 3209.446624] rcu-torture: Stopping torture_stutter [ 3209.447376] rcu-torture: Stopping rcu_torture_fakewriter [ 3209.448028] rcu-torture: Stopping rcu_torture_writer task [ 3209.448594] rcu-torture: Stopping rcu_torture_fakewriter [ 3209.449088] rcu-torture: Stopping rcu_torture_reader [ 3209.453865] rcu-torture: Stopping rcu_torture_fakewriter [ 3209.454995] rcu-torture: Stopping rcu_torture_fakewriter [ 3209.455551] rcu-torture: Stopping rcu_torture_writer [ 3209.456288] rcu-torture: Stopping rcu_torture_reader task [ 3209.457060] rcu-torture: Stopping rcu_torture_fakewriter task [ 3209.457836] rcu-torture: Stopping rcu_torture_fakewriter task [ 3209.462206] rcu-torture: Stopping rcu_torture_fakewriter task [ 3209.463673] rcu-torture: Stopping rcu_torture_fakewriter task [ 3209.464398] rcu-torture: Stopping rcu_torture_stats task [ 3209.465139] rcu-torture: rtc: (null) ver: 217 tfle: 0 rta: 217 rtaf: 0 rtf: 206 rtmbe: 0 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 740 onoff: 0/0:0/0 -1,0:-1,0 0:0 (HZ=250) barrier: 0/0:0 cbflood: 3 [ 3209.469436] rcu-torture: Reader Pipe: 776988 96 0 0 0 0 0 0 0 0 0 [ 3209.471702] rcu-torture: Reader Batch: 620212 156872 0 0 0 0 0 0 0 0 0 [ 3209.472626] rcu-torture: Free-Block Circulation: 216 215 214 213 211 210 209 208 207 206 0 [ 3209.473730] rcu-torture: Stopping rcu_torture_stats [ 3209.474885] rcu-torture: Stopping rcu_torture_cbflood task [ 3209.539215] rcu-torture: Stopping rcu_torture_cbflood [ 3209.540049] rcu-torture: rtc: (null) ver: 217 tfle: 0 rta: 217 rtaf: 0 rtf: 206 rtmbe: 0 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 740 onoff: 0/0:0/0 -1,0:-1,0 0:0 (HZ=250) barrier: 0/0:0 cbflood: 4 [ 3209.541674] rcu-torture: Reader Pipe: 776988 96 0 0 0 0 0 0 0 0 0 [ 3209.544527] rcu-torture: Reader Batch: 620212 156872 0 0 0 0 0 0 0 0 0 [ 3209.545288] rcu-torture: Free-Block Circulation: 216 215 214 213 211 210 209 208 207 206 0 [ 3209.548506] rcu-torture:--- End of test: SUCCESS: nreaders=1 nfakewriters=4 stat_interval=60 verbose=1 test_no_idle_hz=1 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4 shutdown_secs=0 stall_cpu=0 stall_cpu_holdoff=10 n_barrier_cbs=0 onoff_interval=0 onoff_holdoff=0 > > Thanks, > Davidlohr > > . >
On 2016/3/3 5:12, Paul E. McKenney wrote: > On Wed, Mar 02, 2016 at 11:55:43AM -0800, Davidlohr Bueso wrote: >> On Tue, 02 Feb 2016, Davidlohr Bueso wrote: >> >> I've just hit this issue myself and remembered this thread :) >> >> Paul, folks, does the below patch look reasonable to you? If so >> I can properly resend. thanks. > > If it works for Kefeng Wang, I would be happy to take it. Yes, it works for me, tested on my board. > > Thanx, Paul >
Hi Davidlohr and Paul, On 2016/3/3 9:37, Kefeng Wang wrote: > > > On 2016/3/3 5:12, Paul E. McKenney wrote: >> On Wed, Mar 02, 2016 at 11:55:43AM -0800, Davidlohr Bueso wrote: >>> On Tue, 02 Feb 2016, Davidlohr Bueso wrote: >>> >>> I've just hit this issue myself and remembered this thread :) >>> >>> Paul, folks, does the below patch look reasonable to you? If so >>> I can properly resend. thanks. >> >> If it works for Kefeng Wang, I would be happy to take it. > > Yes, it works for me, tested on my board. > Even if we merge Davidlohr's patch, I think we still need my v2 patch, here is a scene, ---------- cxt.lwsa = kmalloc(sizeof(*cxt.lwsa) * cxt.nrealwriters_stress, GFP_KERNEL); if (cxt.lwsa == NULL) { goto unwind; } or cxt.lrsa = kmalloc(sizeof(*cxt.lrsa) * cxt.nrealreaders_stress, GFP_KERNEL); if (cxt.lrsa == NULL) { VERBOSE_TOROUT_STRING("cxt.lrsa: Out of memory"); firsterr = -ENOMEM; kfree(cxt.lwsa); goto unwind; } ---------- we will get cxt.lwsa = NULL, and go to cleanup, then in static void __torture_print_stats(char *page, struct lock_stress_stats *statp, bool write) { bool fail = 0; int i, n_stress; long max = 0; long min = statp[0].n_lock_acquired; // here, *we will meet NULL pointer dereference* } and my patch v2 solve this issue too, so it is still needed. Thanks, Kefeng > >> >> Thanx, Paul >> >
On 2016/3/3 16:36, Davidlohr Bueso wrote: > On Thu, 03 Mar 2016, Kefeng Wang wrote: > > > The below should take care of both issues, what do you think? > > Thanks, > Davidlohr > > <8------------------------------------------------------------------------- > Subject: [PATCH] locktorture: Fix nil pointer dereferencing for cleanup paths > > It has been found that paths that invoke cleanups through > lock_torture_cleanup() can incur in nil pointer dereferencing > bugs during the statistics printing phase. This is mainly > because we should not be calling into statistics before we are > sure things have been setup correctly. > > Specifically, early checks (and the need for handling this in > the cleanup call) only include parameter checks and basic > statistics allocation. Once we start write/read kthreads > we then consider the test as started. As such, update the func > in question to check for cxt.lwsa writer stats, if not set, > we either have a bogus parameter or ENOMEM situation and > therefore only need to deal with general torture calls. > > Signed-off-by: Davidlohr Bueso <dbueso@suse.de> > --- > XXX: while looking at the code, do we need at least a stat_interval > 0 > check before stopping the lock_torture_stats kthread? > > kernel/locking/locktorture.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c > index 8ef1919..1942848 100644 > --- a/kernel/locking/locktorture.c > +++ b/kernel/locking/locktorture.c > @@ -748,6 +748,15 @@ static void lock_torture_cleanup(void) > if (torture_cleanup_begin()) > return; > > + /* > + * Indicates early cleanup, meaning that the test has not run, > + * such as when passing bogus args when loading the module. As > + * such, only perform the underlying torture-specific cleanups, > + * and avoid anything related to locktorture. > + */ > + if (!cxt.lwsa) > + goto end; Sorry for the late response, the cxt.lrsa should be taken into account too. > + > if (writer_tasks) { > for (i = 0; i < cxt.nrealwriters_stress; i++) > torture_stop_kthread(lock_torture_writer, > @@ -776,6 +785,7 @@ static void lock_torture_cleanup(void) > else > lock_torture_print_module_parms(cxt.cur_ops, > "End of test: SUCCESS"); > +end: > torture_cleanup_end(); > } > > @@ -878,6 +888,7 @@ static int __init lock_torture_init(void) > cxt.lrsa[i].n_lock_acquired = 0; > } > } > + > lock_torture_print_module_parms(cxt.cur_ops, "Start of test"); > > /* Prepare torture context. */
On 2016/3/7 13:40, Davidlohr Bueso wrote: > On Mon, 07 Mar 2016, Kefeng Wang wrote: >> On 2016/3/3 16:36, Davidlohr Bueso wrote: > >>> + /* >>> + * Indicates early cleanup, meaning that the test has not run, >>> + * such as when passing bogus args when loading the module. As >>> + * such, only perform the underlying torture-specific cleanups, >>> + * and avoid anything related to locktorture. >>> + */ >>> + if (!cxt.lwsa) >>> + goto end; >> >> Sorry for the late response, the cxt.lrsa should be taken into account too. > > I am taking it into account, note that we kfree lwsa if lrsa fails memory > allocation. Of course we should be defensive, so go ahead and explicitly set > it to nil. v2 below, otherwise same patch. This one looks good, and tested on my board. > > -----8<-------------------------- > Subject: [PATCH v2] locktorture: Fix nil pointer dereferencing for cleanup paths > > It has been found that paths that invoke cleanups through > lock_torture_cleanup() can incur in nil pointer dereferencing > bugs during the statistics printing phase. This is mainly > because we should not be calling into statistics before we are > sure things have been setup correctly. > > Specifically, early checks (and the need for handling this in > the cleanup call) only include parameter checks and basic > statistics allocation. Once we start write/read kthreads > we then consider the test as started. As such, update the func > in question to check for cxt.lwsa writer stats, if not set, > we either have a bogus parameter or ENOMEM situation and > therefore only need to deal with general torture calls > > Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com> > Signed-off-by: Davidlohr Bueso <dbueso@suse.de> > --- > kernel/locking/locktorture.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c > index 8ef1919..b5bc243 100644 > --- a/kernel/locking/locktorture.c > +++ b/kernel/locking/locktorture.c > @@ -748,6 +748,15 @@ static void lock_torture_cleanup(void) > if (torture_cleanup_begin()) > return; > > + /* > + * Indicates early cleanup, meaning that the test has not run, > + * such as when passing bogus args when loading the module. As > + * such, only perform the underlying torture-specific cleanups, > + * and avoid anything related to locktorture. > + */ > + if (!cxt.lwsa) > + goto end; > + > if (writer_tasks) { > for (i = 0; i < cxt.nrealwriters_stress; i++) > torture_stop_kthread(lock_torture_writer, > @@ -776,6 +785,7 @@ static void lock_torture_cleanup(void) > else > lock_torture_print_module_parms(cxt.cur_ops, > "End of test: SUCCESS"); > +end: > torture_cleanup_end(); > } > > @@ -870,6 +880,7 @@ static int __init lock_torture_init(void) > VERBOSE_TOROUT_STRING("cxt.lrsa: Out of memory"); > firsterr = -ENOMEM; > kfree(cxt.lwsa); > + cxt.lwsa = NULL; > goto unwind; > } > > @@ -878,6 +889,7 @@ static int __init lock_torture_init(void) > cxt.lrsa[i].n_lock_acquired = 0; > } > } > + > lock_torture_print_module_parms(cxt.cur_ops, "Start of test"); > > /* Prepare torture context. */
On 2016/3/7 21:37, Paul E. McKenney wrote: > On Mon, Mar 07, 2016 at 03:02:05PM +0800, Kefeng Wang wrote: >> >> >> On 2016/3/7 13:40, Davidlohr Bueso wrote: >>> On Mon, 07 Mar 2016, Kefeng Wang wrote: >>>> On 2016/3/3 16:36, Davidlohr Bueso wrote: >>> >>>>> + /* >>>>> + * Indicates early cleanup, meaning that the test has not run, >>>>> + * such as when passing bogus args when loading the module. As >>>>> + * such, only perform the underlying torture-specific cleanups, >>>>> + * and avoid anything related to locktorture. >>>>> + */ >>>>> + if (!cxt.lwsa) >>>>> + goto end; >>>> >>>> Sorry for the late response, the cxt.lrsa should be taken into account too. >>> >>> I am taking it into account, note that we kfree lwsa if lrsa fails memory >>> allocation. Of course we should be defensive, so go ahead and explicitly set >>> it to nil. v2 below, otherwise same patch. >> >> This one looks good, and tested on my board. > > Very good! May we add your Tested-by? Sure, please. > > Thanx, Paul
diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c index 8ef1919..7f0cf9c 100644 --- a/kernel/locking/locktorture.c +++ b/kernel/locking/locktorture.c @@ -647,19 +647,23 @@ static void __torture_print_stats(char *page, bool fail = 0; int i, n_stress; long max = 0; - long min = statp[0].n_lock_acquired; + long min = 0; long long sum = 0; - n_stress = write ? cxt.nrealwriters_stress : cxt.nrealreaders_stress; - for (i = 0; i < n_stress; i++) { - if (statp[i].n_lock_fail) - fail = true; - sum += statp[i].n_lock_acquired; - if (max < statp[i].n_lock_fail) - max = statp[i].n_lock_fail; - if (min > statp[i].n_lock_fail) - min = statp[i].n_lock_fail; + if (statp) { + min = statp[0].n_lock_acquired; + n_stress = write ? cxt.nrealwriters_stress : cxt.nrealreaders_stress; + for (i = 0; i < n_stress; i++) { + if (statp[i].n_lock_fail) + fail = true; + sum += statp[i].n_lock_acquired; + if (max < statp[i].n_lock_fail) + max = statp[i].n_lock_fail; + if (min > statp[i].n_lock_fail) + min = statp[i].n_lock_fail; + } } + page += sprintf(page, "%s: Total: %lld Max/Min: %ld/%ld %s Fail: %d %s\n", write ? "Writes" : "Reads ",
Insmod locktorture with torture_type=mutex will lead to crash, Unable to handle kernel NULL pointer dereference at virtual address 00000008 pgd = ffffffc0f6c10000 [00000008] *pgd=000000013b221003, *pud=000000013b221003, *pmd=0000000000000000a Internal error: Oops: 94000006 [#1] PREEMPT SMP Modules linked in: locktorture(+) torture CPU: 2 PID: 1462 Comm: insmod Not tainted 4.4.0+ #19 Hardware name: linux,dummy-virt (DT) task: ffffffc0fb2b3700 ti: ffffffc0fa938000 task.ti: ffffffc0fa938000 PC is at __torture_print_stats+0x18/0x180 [locktorture] LR is at lock_torture_stats_print+0x68/0x110 [locktorture] pc : [<ffffffbffc017028>] lr : [<ffffffbffc017500>] pstate: 60000145 sp : ffffffc0fa93bb20 [snip...] Call trace: [<ffffffbffc017028>] __torture_print_stats+0x18/0x180 [locktorture] [<ffffffbffc017500>] lock_torture_stats_print+0x68/0x110 [locktorture] [<ffffffbffc0180fc>] lock_torture_cleanup+0xc4/0x278 [locktorture] [<ffffffbffc01d144>] lock_torture_init+0x144/0x5b0 [locktorture] [<ffffffc000082940>] do_one_initcall+0x94/0x1a0 [<ffffffc000141888>] do_init_module+0x60/0x1c8 [<ffffffc00011c628>] load_module+0x1880/0x1c9c [<ffffffc00011cc00>] SyS_finit_module+0x7c/0x88 [<ffffffc000085cb0>] el0_svc_naked+0x24/0x28 Fix it by check statp in __torture_print_stats(), if it is NULL, just create a lock-torture-statistics message with zero statistic argument. Cc: Josh Triplett <josh@joshtriplett.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> --- kernel/locking/locktorture.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) -- 2.6.0.GIT