diff mbox series

scsi: target: Avoid smp_processor_id() in preemptible code

Message ID 20210513044944.1572404-1-shinichiro.kawasaki@wdc.com
State New
Headers show
Series scsi: target: Avoid smp_processor_id() in preemptible code | expand

Commit Message

Shin'ichiro Kawasaki May 13, 2021, 4:49 a.m. UTC
The BUG message "BUG: using smp_processor_id() in preemptible [00000000]
code" was observed for TCMU devices with kernel config DEBUG_PREEMPT.

The message was observed when blktests block/005 was run on TCMU devices
with fileio backend or user:zbc backend [1]. The commit 1130b499b4a7
("scsi: target: tcm_loop: Use LIO wq cmd submission helper") triggered
the symptom. The commit modified work queue to handle commands and
changed 'current->nr_cpu_allowed' at smp_processor_id() call.

The message was also observed at system shutdown when TCMU devices were
not cleaned up [2]. The function smp_processor_id() was called in SCSI
host work queue for abort handling, and triggered the BUG message. This
symptom was observed regardless of the commit 1130b499b4a7 ("scsi:
target: tcm_loop: Use LIO wq cmd submission helper").

To avoid the BUG message at smp_processor_id(), use get_cpu() and
put_cpu() instead which disable preemption.

[1]

[   56.468103] run blktests block/005 at 2021-05-12 14:16:38
[   57.369473] check_preemption_disabled: 85 callbacks suppressed
[   57.369480] BUG: using smp_processor_id() in preemptible [00000000] code: fio/1511
[   57.369506] BUG: using smp_processor_id() in preemptible [00000000] code: fio/1510
[   57.369512] BUG: using smp_processor_id() in preemptible [00000000] code: fio/1506
[   57.369552] caller is __target_init_cmd+0x157/0x170 [target_core_mod]
[   57.369606] CPU: 4 PID: 1506 Comm: fio Not tainted 5.13.0-rc1+ #34
[   57.369613] Hardware name: System manufacturer System Product Name/PRIME Z270-A, BIOS 1302 03/15/2018
[   57.369617] Call Trace:
[   57.369621] BUG: using smp_processor_id() in preemptible [00000000] code: fio/1507
[   57.369628]  dump_stack+0x6d/0x89
[   57.369642]  check_preemption_disabled+0xc8/0xd0
[   57.369628] caller is __target_init_cmd+0x157/0x170 [target_core_mod]
[   57.369655]  __target_init_cmd+0x157/0x170 [target_core_mod]
[   57.369695]  target_init_cmd+0x76/0x90 [target_core_mod]
[   57.369732]  tcm_loop_queuecommand+0x109/0x210 [tcm_loop]
[   57.369744]  scsi_queue_rq+0x38e/0xc40
[   57.369761]  __blk_mq_try_issue_directly+0x109/0x1c0
[   57.369779]  blk_mq_try_issue_directly+0x43/0x90
[   57.369790]  blk_mq_submit_bio+0x4e5/0x5d0
[   57.369812]  submit_bio_noacct+0x46e/0x4e0
[   57.369830]  __blkdev_direct_IO_simple+0x1a3/0x2d0
[   57.369859]  ? set_init_blocksize.isra.0+0x60/0x60
[   57.369880]  generic_file_read_iter+0x89/0x160
[   57.369898]  blkdev_read_iter+0x44/0x60
[   57.369906]  new_sync_read+0x102/0x170
[   57.369929]  vfs_read+0xd4/0x160
[   57.369941]  __x64_sys_pread64+0x6e/0xa0
[   57.369946]  ? lockdep_hardirqs_on+0x79/0x100
[   57.369958]  do_syscall_64+0x3a/0x70
[   57.369965]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   57.369973] RIP: 0033:0x7f7ed4c1399f
[   57.369979] Code: 08 89 3c 24 48 89 4c 24 18 e8 7d f3 ff ff 4c 8b 54 24 18 48 8b 54 24 10 41 89 c0 48 8b 74 24 08 8b 3c 24 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 04 24 e8 cd f3 ff ff 48 8b
[   57.369983] RSP: 002b:00007ffd7918c580 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
[   57.369990] RAX: ffffffffffffffda RBX: 00000000015b4540 RCX: 00007f7ed4c1399f
[   57.369993] RDX: 0000000000001000 RSI: 00000000015de000 RDI: 0000000000000009
[   57.369996] RBP: 00000000015b4540 R08: 0000000000000000 R09: 0000000000000001
[   57.369999] R10: 0000000000e5c000 R11: 0000000000000293 R12: 00007f7eb5269a70
[   57.370002] R13: 0000000000000000 R14: 0000000000001000 R15: 00000000015b4568
[   57.370031] CPU: 7 PID: 1507 Comm: fio Not tainted 5.13.0-rc1+ #34
[   57.370036] Hardware name: System manufacturer System Product Name/PRIME Z270-A, BIOS 1302 03/15/2018
[   57.370039] Call Trace:
[   57.370045]  dump_stack+0x6d/0x89
[   57.370056]  check_preemption_disabled+0xc8/0xd0
[   57.370068]  __target_init_cmd+0x157/0x170 [target_core_mod]
[   57.370121]  target_init_cmd+0x76/0x90 [target_core_mod]
[   57.370178]  tcm_loop_queuecommand+0x109/0x210 [tcm_loop]
[   57.370197]  scsi_queue_rq+0x38e/0xc40
[   57.370224]  __blk_mq_try_issue_directly+0x109/0x1c0
...

[2]

[  117.458597] BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u16:8
[  117.467279] caller is __target_init_cmd+0x157/0x170 [target_core_mod]
[  117.473893] CPU: 1 PID: 418 Comm: kworker/u16:6 Not tainted 5.13.0-rc1+ #34
[  117.481150] Hardware name: System manufacturer System Product Name/PRIME Z270-A, BIOS 8
[  117.481153] Workqueue: scsi_tmf_7 scmd_eh_abort_handler
[  117.481156] Call Trace:
[  117.481158]  dump_stack+0x6d/0x89
[  117.481162]  check_preemption_disabled+0xc8/0xd0
[  117.512575]  target_submit_tmr+0x41/0x150 [target_core_mod]
[  117.519705]  tcm_loop_issue_tmr+0xa7/0x100 [tcm_loop]
[  117.524913]  tcm_loop_abort_task+0x43/0x60 [tcm_loop]
[  117.530137]  scmd_eh_abort_handler+0x7b/0x230
[  117.534681]  process_one_work+0x268/0x580
[  117.538862]  worker_thread+0x55/0x3b0
[  117.542652]  ? process_one_work+0x580/0x580
[  117.548351]  kthread+0x143/0x160
[  117.551675]  ? kthread_create_worker_on_cpu+0x40/0x40
[  117.556873]  ret_from_fork+0x1f/0x30

Fixes: 1526d9f10c61 ("scsi: target: Make state_list per CPU")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: stable@vger.kernel.org # v5.11+
---
 drivers/target/target_core_transport.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Mike Christie May 13, 2021, 3:36 p.m. UTC | #1
On 5/12/21 11:49 PM, Shin'ichiro Kawasaki wrote:
> The BUG message "BUG: using smp_processor_id() in preemptible [00000000]
> code" was observed for TCMU devices with kernel config DEBUG_PREEMPT.
> 
> The message was observed when blktests block/005 was run on TCMU devices
> with fileio backend or user:zbc backend [1]. The commit 1130b499b4a7
> ("scsi: target: tcm_loop: Use LIO wq cmd submission helper") triggered
> the symptom. The commit modified work queue to handle commands and
> changed 'current->nr_cpu_allowed' at smp_processor_id() call.
> 
> The message was also observed at system shutdown when TCMU devices were
> not cleaned up [2]. The function smp_processor_id() was called in SCSI
> host work queue for abort handling, and triggered the BUG message. This
> symptom was observed regardless of the commit 1130b499b4a7 ("scsi:
> target: tcm_loop: Use LIO wq cmd submission helper").
> 
> To avoid the BUG message at smp_processor_id(), use get_cpu() and
> put_cpu() instead which disable preemption.
> 
> [1]
> 
> [   56.468103] run blktests block/005 at 2021-05-12 14:16:38
> [   57.369473] check_preemption_disabled: 85 callbacks suppressed
> [   57.369480] BUG: using smp_processor_id() in preemptible [00000000] code: fio/1511
> [   57.369506] BUG: using smp_processor_id() in preemptible [00000000] code: fio/1510
> [   57.369512] BUG: using smp_processor_id() in preemptible [00000000] code: fio/1506
> [   57.369552] caller is __target_init_cmd+0x157/0x170 [target_core_mod]
> [   57.369606] CPU: 4 PID: 1506 Comm: fio Not tainted 5.13.0-rc1+ #34
> [   57.369613] Hardware name: System manufacturer System Product Name/PRIME Z270-A, BIOS 1302 03/15/2018
> [   57.369617] Call Trace:
> [   57.369621] BUG: using smp_processor_id() in preemptible [00000000] code: fio/1507
> [   57.369628]  dump_stack+0x6d/0x89
> [   57.369642]  check_preemption_disabled+0xc8/0xd0
> [   57.369628] caller is __target_init_cmd+0x157/0x170 [target_core_mod]
> [   57.369655]  __target_init_cmd+0x157/0x170 [target_core_mod]
> [   57.369695]  target_init_cmd+0x76/0x90 [target_core_mod]
> [   57.369732]  tcm_loop_queuecommand+0x109/0x210 [tcm_loop]
> [   57.369744]  scsi_queue_rq+0x38e/0xc40
> [   57.369761]  __blk_mq_try_issue_directly+0x109/0x1c0
> [   57.369779]  blk_mq_try_issue_directly+0x43/0x90
> [   57.369790]  blk_mq_submit_bio+0x4e5/0x5d0
> [   57.369812]  submit_bio_noacct+0x46e/0x4e0
> [   57.369830]  __blkdev_direct_IO_simple+0x1a3/0x2d0
> [   57.369859]  ? set_init_blocksize.isra.0+0x60/0x60
> [   57.369880]  generic_file_read_iter+0x89/0x160
> [   57.369898]  blkdev_read_iter+0x44/0x60
> [   57.369906]  new_sync_read+0x102/0x170
> [   57.369929]  vfs_read+0xd4/0x160
> [   57.369941]  __x64_sys_pread64+0x6e/0xa0
> [   57.369946]  ? lockdep_hardirqs_on+0x79/0x100
> [   57.369958]  do_syscall_64+0x3a/0x70
> [   57.369965]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [   57.369973] RIP: 0033:0x7f7ed4c1399f
> [   57.369979] Code: 08 89 3c 24 48 89 4c 24 18 e8 7d f3 ff ff 4c 8b 54 24 18 48 8b 54 24 10 41 89 c0 48 8b 74 24 08 8b 3c 24 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 04 24 e8 cd f3 ff ff 48 8b
> [   57.369983] RSP: 002b:00007ffd7918c580 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
> [   57.369990] RAX: ffffffffffffffda RBX: 00000000015b4540 RCX: 00007f7ed4c1399f
> [   57.369993] RDX: 0000000000001000 RSI: 00000000015de000 RDI: 0000000000000009
> [   57.369996] RBP: 00000000015b4540 R08: 0000000000000000 R09: 0000000000000001
> [   57.369999] R10: 0000000000e5c000 R11: 0000000000000293 R12: 00007f7eb5269a70
> [   57.370002] R13: 0000000000000000 R14: 0000000000001000 R15: 00000000015b4568
> [   57.370031] CPU: 7 PID: 1507 Comm: fio Not tainted 5.13.0-rc1+ #34
> [   57.370036] Hardware name: System manufacturer System Product Name/PRIME Z270-A, BIOS 1302 03/15/2018
> [   57.370039] Call Trace:
> [   57.370045]  dump_stack+0x6d/0x89
> [   57.370056]  check_preemption_disabled+0xc8/0xd0
> [   57.370068]  __target_init_cmd+0x157/0x170 [target_core_mod]
> [   57.370121]  target_init_cmd+0x76/0x90 [target_core_mod]
> [   57.370178]  tcm_loop_queuecommand+0x109/0x210 [tcm_loop]
> [   57.370197]  scsi_queue_rq+0x38e/0xc40
> [   57.370224]  __blk_mq_try_issue_directly+0x109/0x1c0
> ...
> 
> [2]
> 
> [  117.458597] BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u16:8
> [  117.467279] caller is __target_init_cmd+0x157/0x170 [target_core_mod]
> [  117.473893] CPU: 1 PID: 418 Comm: kworker/u16:6 Not tainted 5.13.0-rc1+ #34
> [  117.481150] Hardware name: System manufacturer System Product Name/PRIME Z270-A, BIOS 8
> [  117.481153] Workqueue: scsi_tmf_7 scmd_eh_abort_handler
> [  117.481156] Call Trace:
> [  117.481158]  dump_stack+0x6d/0x89
> [  117.481162]  check_preemption_disabled+0xc8/0xd0
> [  117.512575]  target_submit_tmr+0x41/0x150 [target_core_mod]
> [  117.519705]  tcm_loop_issue_tmr+0xa7/0x100 [tcm_loop]
> [  117.524913]  tcm_loop_abort_task+0x43/0x60 [tcm_loop]
> [  117.530137]  scmd_eh_abort_handler+0x7b/0x230
> [  117.534681]  process_one_work+0x268/0x580
> [  117.538862]  worker_thread+0x55/0x3b0
> [  117.542652]  ? process_one_work+0x580/0x580
> [  117.548351]  kthread+0x143/0x160
> [  117.551675]  ? kthread_create_worker_on_cpu+0x40/0x40
> [  117.556873]  ret_from_fork+0x1f/0x30
> 
> Fixes: 1526d9f10c61 ("scsi: target: Make state_list per CPU")
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> Cc: stable@vger.kernel.org # v5.11+
> ---
>  drivers/target/target_core_transport.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index 8fbfe75c5744..9ed5d234acd8 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -1415,8 +1415,10 @@ void __target_init_cmd(
>  	cmd->sense_buffer = sense_buffer;
>  	cmd->orig_fe_lun = unpacked_lun;
>  
> -	if (!(cmd->se_cmd_flags & SCF_USE_CPUID))
> -		cmd->cpuid = smp_processor_id();
> +	if (!(cmd->se_cmd_flags & SCF_USE_CPUID)) {
> +		cmd->cpuid = get_cpu();
> +		put_cpu();
> +	}

get/put_cpu is ok. I just had a question if something else is normally
used for this use case.

Does the get/put_cpu calls handle the case where we are running on CPU
N, then because of a preempt the thread we are calling this from gets
rescheduled to a different CPU? So it prevents the issue where after
the preempt we think we are running on CPU1, but we ended up on CPU0?

For this case/code above, either the caller of this function has pinned
the calling context to a specific CPU or the caller does not care. So
for:

1. the pinned case if we got preempted the thread would get rescheduled
on the original CPU and we be ok right?

2. the don't-care case we could end up on any old CPU. For this I think
we will be ok without the get/put_cpu preempt disablement because the caller
does not care and we just need any old CPU value. Basically after we set
cmd->cpuid above, later accesses will use that cmd->cpuid value for per CPU
structs. We just need the cpuid value for this "don't care" case because
cases 1 and 2 use common set/lookup code.

Is there something else that's used normally? raw_smp_processor_id?
Shin'ichiro Kawasaki May 14, 2021, 11:55 a.m. UTC | #2
On May 13, 2021 / 12:37, Mike Christie wrote:
> On 5/13/21 11:03 AM, Bart Van Assche wrote:

> > On 5/13/21 8:36 AM, Mike Christie wrote:

> >> Is there something else that's used normally? raw_smp_processor_id?

> > 

> > +1 for raw_smp_processor_id() since my understanding is that

> > SCF_USE_CPUID is a performance optimization only and correctness of the

> > target code is not affected if the running thread is moved to another

> > CPU core by the scheduler.

> > 

> 

> Ok.

> 

> Shin'ichiro if we all are in agreement then I made the same mistake in

> iblock_plug_device. I can send a patch for that, or if you want to get

> your patch commit count up, feel free to when you resend this patch.


Thank you for the discussion. I also agree to use raw_smp_processor_id()
instead of get_cpu()/put_cpu(). Will post v2.

Regarding iblock_plug_device, may I ask Mike to send a patch? I'm a bit
caught up in other tasks.

-- 
Best Regards,
Shin'ichiro Kawasaki
diff mbox series

Patch

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 8fbfe75c5744..9ed5d234acd8 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -1415,8 +1415,10 @@  void __target_init_cmd(
 	cmd->sense_buffer = sense_buffer;
 	cmd->orig_fe_lun = unpacked_lun;
 
-	if (!(cmd->se_cmd_flags & SCF_USE_CPUID))
-		cmd->cpuid = smp_processor_id();
+	if (!(cmd->se_cmd_flags & SCF_USE_CPUID)) {
+		cmd->cpuid = get_cpu();
+		put_cpu();
+	}
 
 	cmd->state_active = false;
 }