diff mbox

[V2,00/25] tty: serial: drop uart_port->lock before calling

Message ID CAKohpomE+6fxwF4JsDUDCzf+LsWZ_deBsRzF_XYjQHZTPvp_jw@mail.gmail.com
State New
Headers show

Commit Message

Viresh Kumar Aug. 20, 2013, 10:52 a.m. UTC
Hi Peter,

On 20 August 2013 08:05, Peter Hurley <peter@hurleysoftware.com> wrote:
> The key to why the flush_to_disc() worker thread cannot acquire the
> uart_port spinlock is in the first stack trace:
>
> s3c24xx_serial_rx_chars
>   raw_spin_unlock_irqrestore
>     do_raw_spin_unlock
>       dump_stack
>
> So the real question here is why do_raw_spin_unlock() is doing
> a dump_stack()?  Is one of the SPIN_BUG_ON() asserts in debug_spin_unlock()
> triggering?

Ahh, you are right and SPIN_BUG_ON() is indeed getting asserted.

I am embarrassed to say this but it happened due to my stupid mistake :(

And yes, non-rt kernels doesn't see this crash as we simply schedule
work from our interrupt handlers instead of calling flush_to_ldisc()

This is how I got the wrong crash dump:
- Applied my patch over latest linux-next
- Some more debug messages

Tested: No crashes found..

Thought maybe I should try without my patch first, if it really crashes
or not.

- Reverted my patch and got some conflicts due the "debug patch"..
and ended up resolving them in the wrong way:

-       spin_unlock_irqrestore(&port->lock, flags);
        pr_info("%s\n", __func__);
        tty_flip_buffer_push(&port->state->port);
+       spin_unlock_irqrestore(&port->lock, flags);

  out:
+       spin_unlock_irqrestore(&port->lock, flags);

Got two unlocks and so the dump..
Sorry for the incorrect information :(

But yes there are some issues with the RT kernel, which are going to
stay without this patchset.. Here are the logs with 3.10.6 + rt kernel
(attached as well, in case lines are wrapped by gmail :( ):

root@localhost:~# [  128.771800] ------------[ cut here ]------------
[  128.771815] kernel BUG at
/home/arm/work/kernel/linaro/lng/lng.git/kernel/rtmutex.c:738!
[  128.771829] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP THUMB2
[  128.771844] Modules linked in:
[  128.771868] CPU: 0 PID: 1895 Comm: irq/85-12c20000 Not tainted 3.10.6-rt3+ #7
[  128.771885] task: ef2d9cc0 ti: ee2b0000 task.ti: ee2b0000
[  128.771915] PC is at rt_spin_lock_slowlock+0x18c/0x1a0
[  128.771933] LR is at rt_spin_lock_slowlock+0x3f/0x1a0
[  128.771956] pc : [<80406b9c>]    lr : [<80406a4f>]    psr: 60070033
[  128.771956] sp : ee2b1d68  ip : 00000000  fp : 00000000
[  128.771968] r10: ee6e641e  r9 : ef2d9cc0  r8 : 00000000
[  128.771982] r7 : ee6e7800  r6 : 00000000  r5 : ee2b0000  r4 : 806a9ccc
[  128.771995] r3 : ef2d9cc0  r2 : 00000000  r1 : ef2d9cc0  r0 : 00000000
[  128.772011] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb
Segment kernel
[  128.772026] Control: 50c5387d  Table: aeac406a  DAC: 00000015
[  128.772041] Process irq/85-12c20000 (pid: 1895, stack limit = 0xee2b0238)
[  128.772054] Stack: (0xee2b1d68 to 0xee2b2000)
[  128.772080] 1d60:                   00000000 00000000 00000000
00000000 00000400 80654380
[  128.772105] 1d80: ef07d800 00000002 ef07d800 81767800 00000002
00000000 00000001 80043101
[  128.772128] 1da0: 00000002 806a9ccc eea47400 0000002e ee6e7800
00000000 ee6e7800 ee6e641e
[  128.772150] 1dc0: 00000000 80288639 8028864d eea47400 ee6e651d
80278af5 81767380 800451c9
[  128.772173] 1de0: 00000064 ef2d9e18 81767380 80654380 ef2d9cc0
80406a4f ef2d9e18 80037a0d
[  128.772195] 1e00: eea11250 ee2b0000 eea11240 eea11250 eea11240
80406a4f 00000000 ffffffff
[  128.772216] 1e20: ef33a000 80037be5 00000000 80406a4f 00000100
00000001 eea11250 ee2b0000
[  128.772238] 1e40: eea11250 806582c0 ee75a500 80406a4f eea11250
ee2b0000 00000000 806582c0
[  128.772260] 1e60: 0000003f 80406a4f eea11240 eea11250 00000000
8027bcdd 00000001 00000000
[  128.772282] 1e80: ee2b1ebc 271ae71a eea11240 eea11240 eea47400
eea11250 80278a59 ee75a500
[  128.772304] 1ea0: eea11308 00000001 ee6e651c 8027bbe5 00000000
806a9cac 806a9ccc 00000000
[  128.772325] 1ec0: 806582c0 0000003f 00000000 ee75a300 00000000
8028d4db 00000000 8003ca2b
[  128.772347] 1ee0: 00000001 ee2b0000 806a9cac 00000001 00000055
00000000 ee75a320 80067da1
[  128.772368] 1f00: ee75a300 8028dbdd ee75a300 8065c800 ee2b0000
80067db9 00000001 8067b750
[  128.772390] 1f20: ee2b0000 80067e91 ee75a300 8065c800 ee75a300
00000000 00000000 80067cfd
[  128.772411] 1f40: ee2b1f58 ee6bdcd0 ee2b1f88 ee75a300 80067dd9
00000000 00000000 00000000
[  128.772432] 1f60: 00000000 80033561 00000000 00000000 000000f8
ee75a300 00000000 00000000
[  128.772453] 1f80: ee2b1f80 ee2b1f80 00000000 00000000 ee2b1f90
ee2b1f90 00000000 ee6bdcd0
[  128.772473] 1fa0: 800334f5 00000000 00000000 8000cf3d 00000000
00000000 00000000 00000000
[  128.772492] 1fc0: 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000
[  128.772512] 1fe0: 00000000 00000000 00000000 00000000 00000013
00000000 00000000 00000000
[  128.772564] [<80406b9c>] (rt_spin_lock_slowlock+0x18c/0x1a0) from
[<80288639>] (uart_start+0x1d/0x30)
[  128.772608] [<80288639>] (uart_start+0x1d/0x30) from [<80278af5>]
(n_tty_receive_buf+0x9d/0xbf0)
[  128.772650] [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0) from
[<8027bbe5>] (flush_to_ldisc+0xb5/0xe8)
[  128.772697] [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8) from
[<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c)
[  128.772742] [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c) from
[<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38)
[  128.772793] [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38) from
[<80067db9>] (irq_forced_thread_fn+0x19/0x38)
[  128.772837] [<80067db9>] (irq_forced_thread_fn+0x19/0x38) from
[<80067e91>] (irq_thread+0xb9/0x118)
[  128.772884] [<80067e91>] (irq_thread+0xb9/0x118) from [<80033561>]
(kthread+0x6d/0x74)
[  128.772932] [<80033561>] (kthread+0x6d/0x74) from [<8000cf3d>]
(ret_from_fork+0x11/0x20)
[  128.772962] Code: f7ff fae4 e7ac de02 (de02) 1d23
[  129.122974] ---[ end trace 0000000000000002 ]---
[  129.122994] note: irq/85-12c20000[1895] exited with preempt_count 1
[  129.123041] Unable to handle kernel paging request at virtual
address ffffffec
[  129.123049] pgd = 80004000
[  129.123077] [ffffffec] *pgd=af7fd821, *pte=00000000, *ppte=00000000
[  129.123095] Internal error: Oops: 17 [#2] PREEMPT SMP THUMB2
[  129.123108] Modules linked in:
[  129.123131] CPU: 0 PID: 1895 Comm: irq/85-12c20000 Tainted: G
D      3.10.6-rt3+ #7
[  129.123144] task: ef2d9cc0 ti: ee2b0000 task.ti: ee2b0000
[  129.123173] PC is at kthread_data+0xa/0x10
[  129.123196] LR is at irq_thread_dtor+0x21/0x80
[  129.123218] pc : [<80033886>]    lr : [<80067d1d>]    psr: 800700b3
[  129.123218] sp : ee2b1bd8  ip : 00000031  fp : 20070093
[  129.123231] r10: ef2d9cc0  r9 : 80680008  r8 : ee2b0000
[  129.123243] r7 : ef2d9cc0  r6 : 806d8c88  r5 : 00000000  r4 : ef2d9cc0
[  129.123254] r3 : 00000000  r2 : 80000000  r1 : ee2b1f38  r0 : ef2d9cc0
[  129.123269] Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA Thumb
Segment user
[  129.123283] Control: 50c5387d  Table: aeac406a  DAC: 00000015
[  129.123297] Process irq/85-12c20000 (pid: 1895, stack limit = 0xee2b0238)
[  129.123307] Stack: (0xee2b1bd8 to 0xee2b2000)
[  129.123324] 1bc0:
    80067cfd ef2da124
[  129.123348] 1be0: 00000000 8003136f ee2b0000 002f002f 00000000
0000000b 00000000 00000000
[  129.123370] 1c00: ee2b1d20 8001f619 20070093 00000001 8053407c
8001dccf 002f002f 806855ec
[  129.123392] 1c20: ee2b0000 806c5f84 0000000b ee2b0000 ee2b1d20
8052fd08 80680008 00000000
[  129.123413] 1c40: 20070093 8000fcf1 ee2b0238 0000000b 8176f34c
00000000 00030001 00000004
[  129.123434] 1c60: 0000de02 ee2b1d20 80406b9c ee6e641e 00000000
80008235 00000006 00000002
[  129.123456] 1c80: 00000004 00000000 00030001 80406b9c 00000193
80651d84 0111c000 80021859
[  129.123477] 1ca0: 00000001 80021c2b 8176dd84 8006dd4f 00000000
81770380 00000001 00000001
[  129.123498] 1cc0: 00000001 00000000 00000400 80042a73 0000041d
000003e3 00000400 00000000
[  129.123519] 1ce0: 80654380 ef07e950 80654380 ef07e8c0 00000000
80654380 ffffffff 00000000
[  129.123540] 1d00: ee2b1d0c 80406b9e 60070033 ffffffff ee2b1d54
00000000 ef2d9cc0 8000cac1
[  129.123561] 1d20: 00000000 ef2d9cc0 00000000 ef2d9cc0 806a9ccc
ee2b0000 00000000 ee6e7800
[  129.123581] 1d40: 00000000 ef2d9cc0 ee6e641e 00000000 00000000
ee2b1d68 80406a4f 80406b9c
[  129.123601] 1d60: 60070033 ffffffff 00000000 00000000 00000000
00000000 00000400 80654380
[  129.123622] 1d80: ef07d800 00000002 ef07d800 81767800 00000002
00000000 00000001 80043101
[  129.123643] 1da0: 00000002 806a9ccc eea47400 0000002e ee6e7800
00000000 ee6e7800 ee6e641e
[  129.123664] 1dc0: 00000000 80288639 8028864d eea47400 ee6e651d
80278af5 81767380 800451c9
[  129.123686] 1de0: 00000064 ef2d9e18 81767380 80654380 ef2d9cc0
80406a4f ef2d9e18 80037a0d
[  129.123708] 1e00: eea11250 ee2b0000 eea11240 eea11250 eea11240
80406a4f 00000000 ffffffff
[  129.123729] 1e20: ef33a000 80037be5 00000000 80406a4f 00000100
00000001 eea11250 ee2b0000
[  129.123750] 1e40: eea11250 806582c0 ee75a500 80406a4f eea11250
ee2b0000 00000000 806582c0
[  129.123770] 1e60: 0000003f 80406a4f eea11240 eea11250 00000000
8027bcdd 00000001 00000000
[  129.123792] 1e80: ee2b1ebc 271ae71a eea11240 eea11240 eea47400
eea11250 80278a59 ee75a500
[  129.123813] 1ea0: eea11308 00000001 ee6e651c 8027bbe5 00000000
806a9cac 806a9ccc 00000000
[  129.123834] 1ec0: 806582c0 0000003f 00000000 ee75a300 00000000
8028d4db 00000000 8003ca2b
[  129.123855] 1ee0: 00000001 ee2b0000 806a9cac 00000001 00000055
00000000 ee75a320 80067da1
[  129.123876] 1f00: ee75a300 8028dbdd ee75a300 8065c800 ee2b0000
80067db9 00000001 8067b750
[  129.123897] 1f20: ee2b0000 80067e91 ee75a300 8065c800 ee75a300
00000000 00000000 80067cfd
[  129.123918] 1f40: ee2b1f58 ee6bdcd0 ee2b1f88 ee75a300 80067dd9
00000000 00000000 00000000
[  129.123938] 1f60: 00000000 80033561 00000000 00000000 000000f8
ee75a300 00000000 00000000
[  129.123959] 1f80: ee2b1f80 ee2b1f80 00000001 00010001 ee2b1f90
ee2b1f90 00000000 ee6bdcd0
[  129.123979] 1fa0: 800334f5 00000000 00000000 8000cf3d 00000000
00000000 00000000 00000000
[  129.123998] 1fc0: 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000
[  129.124017] 1fe0: 00000000 00000000 00000000 00000000 00000013
00000000 00000000 00000000
[  129.124070] [<80033886>] (kthread_data+0xa/0x10) from [<80067d1d>]
(irq_thread_dtor+0x21/0x80)
[  129.124116] [<80067d1d>] (irq_thread_dtor+0x21/0x80) from
[<8003136f>] (task_work_run+0x73/0xac)
[  129.124156] [<8003136f>] (task_work_run+0x73/0xac) from
[<8001f619>] (do_exit+0x215/0x6fc)
[  129.124189] [<8001f619>] (do_exit+0x215/0x6fc) from [<8000fcf1>]
(die+0x155/0x1e8)
[  129.124221] [<8000fcf1>] (die+0x155/0x1e8) from [<80008235>]
(do_undefinstr+0x69/0x14c)
[  129.124261] [<80008235>] (do_undefinstr+0x69/0x14c) from
[<8000cac1>] (__und_svc_finish+0x1/0x40)
[  129.124272] Exception stack(0xee2b1d20 to 0xee2b1d68)
[  129.124296] 1d20: 00000000 ef2d9cc0 00000000 ef2d9cc0 806a9ccc
ee2b0000 00000000 ee6e7800
[  129.124318] 1d40: 00000000 ef2d9cc0 ee6e641e 00000000 00000000
ee2b1d68 80406a4f 80406b9c
[  129.124329] 1d60: 60070033 ffffffff
[  129.124372] [<8000cac1>] (__und_svc_finish+0x1/0x40) from
[<80406b9c>] (rt_spin_lock_slowlock+0x18c/0x1a0)
[  129.124408] [<80406b9c>] (rt_spin_lock_slowlock+0x18c/0x1a0) from
[<80288639>] (uart_start+0x1d/0x30)
[  129.124445] [<80288639>] (uart_start+0x1d/0x30) from [<80278af5>]
(n_tty_receive_buf+0x9d/0xbf0)
[  129.124484] [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0) from
[<8027bbe5>] (flush_to_ldisc+0xb5/0xe8)
[  129.124527] [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8) from
[<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c)
[  129.124569] [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c) from
[<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38)
[  129.124615] [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38) from
[<80067db9>] (irq_forced_thread_fn+0x19/0x38)
[  129.124657] [<80067db9>] (irq_forced_thread_fn+0x19/0x38) from
[<80067e91>] (irq_thread+0xb9/0x118)
[  129.124700] [<80067e91>] (irq_thread+0xb9/0x118) from [<80033561>]
(kthread+0x6d/0x74)
[  129.124745] [<80033561>] (kthread+0x6d/0x74) from [<8000cf3d>]
(ret_from_fork+0x11/0x20)
[  129.124775] Code: f85d eb04 f8d0 3264 (f853) 0c14

---------x-----------------x------------

Here we haven't scheduled work from tty_flip_buffer_push() because of
following change in RT kernel:

commit 714cc6dc551136b8f88b53361b504a5f624308b9
Author: Ingo Molnar <mingo@elte.hu>
Date:   Fri Jul 3 08:30:01 2009 -0500

    serial: 8250: Call flush_to_ldisc when the irq is threaded

    Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 drivers/tty/tty_buffer.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

---------x------------------x-------------------

How do you guys suggest me to fix this problem now? The fix I suggested?
Or something specific to the RT kernel only? Or maybe send these patches
for RT kernel only.

One more thing: We are calling tty_flip_buffer_push() with (from interrupt
context) and without (from process context) port->lock held.. Shouldn't we
make this flat, like don't acquire port->lock while calling
tty_flip_buffer_push()
as that may cause recursive locking?

Thanks for your help and patience.

--
viresh
root@localhost:~# [  128.771800] ------------[ cut here ]------------
[  128.771815] kernel BUG at /home/arm/work/kernel/linaro/lng/lng.git/kernel/rtmutex.c:738!
[  128.771829] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP THUMB2
[  128.771844] Modules linked in:
[  128.771868] CPU: 0 PID: 1895 Comm: irq/85-12c20000 Not tainted 3.10.6-rt3+ #7
[  128.771885] task: ef2d9cc0 ti: ee2b0000 task.ti: ee2b0000
[  128.771915] PC is at rt_spin_lock_slowlock+0x18c/0x1a0
[  128.771933] LR is at rt_spin_lock_slowlock+0x3f/0x1a0
[  128.771956] pc : [<80406b9c>]    lr : [<80406a4f>]    psr: 60070033
[  128.771956] sp : ee2b1d68  ip : 00000000  fp : 00000000
[  128.771968] r10: ee6e641e  r9 : ef2d9cc0  r8 : 00000000
[  128.771982] r7 : ee6e7800  r6 : 00000000  r5 : ee2b0000  r4 : 806a9ccc
[  128.771995] r3 : ef2d9cc0  r2 : 00000000  r1 : ef2d9cc0  r0 : 00000000
[  128.772011] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb  Segment kernel
[  128.772026] Control: 50c5387d  Table: aeac406a  DAC: 00000015
[  128.772041] Process irq/85-12c20000 (pid: 1895, stack limit = 0xee2b0238)
[  128.772054] Stack: (0xee2b1d68 to 0xee2b2000)
[  128.772080] 1d60:                   00000000 00000000 00000000 00000000 00000400 80654380
[  128.772105] 1d80: ef07d800 00000002 ef07d800 81767800 00000002 00000000 00000001 80043101
[  128.772128] 1da0: 00000002 806a9ccc eea47400 0000002e ee6e7800 00000000 ee6e7800 ee6e641e
[  128.772150] 1dc0: 00000000 80288639 8028864d eea47400 ee6e651d 80278af5 81767380 800451c9
[  128.772173] 1de0: 00000064 ef2d9e18 81767380 80654380 ef2d9cc0 80406a4f ef2d9e18 80037a0d
[  128.772195] 1e00: eea11250 ee2b0000 eea11240 eea11250 eea11240 80406a4f 00000000 ffffffff
[  128.772216] 1e20: ef33a000 80037be5 00000000 80406a4f 00000100 00000001 eea11250 ee2b0000
[  128.772238] 1e40: eea11250 806582c0 ee75a500 80406a4f eea11250 ee2b0000 00000000 806582c0
[  128.772260] 1e60: 0000003f 80406a4f eea11240 eea11250 00000000 8027bcdd 00000001 00000000
[  128.772282] 1e80: ee2b1ebc 271ae71a eea11240 eea11240 eea47400 eea11250 80278a59 ee75a500
[  128.772304] 1ea0: eea11308 00000001 ee6e651c 8027bbe5 00000000 806a9cac 806a9ccc 00000000
[  128.772325] 1ec0: 806582c0 0000003f 00000000 ee75a300 00000000 8028d4db 00000000 8003ca2b
[  128.772347] 1ee0: 00000001 ee2b0000 806a9cac 00000001 00000055 00000000 ee75a320 80067da1
[  128.772368] 1f00: ee75a300 8028dbdd ee75a300 8065c800 ee2b0000 80067db9 00000001 8067b750
[  128.772390] 1f20: ee2b0000 80067e91 ee75a300 8065c800 ee75a300 00000000 00000000 80067cfd
[  128.772411] 1f40: ee2b1f58 ee6bdcd0 ee2b1f88 ee75a300 80067dd9 00000000 00000000 00000000
[  128.772432] 1f60: 00000000 80033561 00000000 00000000 000000f8 ee75a300 00000000 00000000
[  128.772453] 1f80: ee2b1f80 ee2b1f80 00000000 00000000 ee2b1f90 ee2b1f90 00000000 ee6bdcd0
[  128.772473] 1fa0: 800334f5 00000000 00000000 8000cf3d 00000000 00000000 00000000 00000000
[  128.772492] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  128.772512] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[  128.772564] [<80406b9c>] (rt_spin_lock_slowlock+0x18c/0x1a0) from [<80288639>] (uart_start+0x1d/0x30)
[  128.772608] [<80288639>] (uart_start+0x1d/0x30) from [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0)
[  128.772650] [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0) from [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8)
[  128.772697] [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8) from [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c)
[  128.772742] [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c) from [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38)
[  128.772793] [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38) from [<80067db9>] (irq_forced_thread_fn+0x19/0x38)
[  128.772837] [<80067db9>] (irq_forced_thread_fn+0x19/0x38) from [<80067e91>] (irq_thread+0xb9/0x118)
[  128.772884] [<80067e91>] (irq_thread+0xb9/0x118) from [<80033561>] (kthread+0x6d/0x74)
[  128.772932] [<80033561>] (kthread+0x6d/0x74) from [<8000cf3d>] (ret_from_fork+0x11/0x20)
[  128.772962] Code: f7ff fae4 e7ac de02 (de02) 1d23 
[  129.122974] ---[ end trace 0000000000000002 ]---
[  129.122994] note: irq/85-12c20000[1895] exited with preempt_count 1
[  129.123041] Unable to handle kernel paging request at virtual address ffffffec
[  129.123049] pgd = 80004000
[  129.123077] [ffffffec] *pgd=af7fd821, *pte=00000000, *ppte=00000000
[  129.123095] Internal error: Oops: 17 [#2] PREEMPT SMP THUMB2
[  129.123108] Modules linked in:
[  129.123131] CPU: 0 PID: 1895 Comm: irq/85-12c20000 Tainted: G      D      3.10.6-rt3+ #7
[  129.123144] task: ef2d9cc0 ti: ee2b0000 task.ti: ee2b0000
[  129.123173] PC is at kthread_data+0xa/0x10
[  129.123196] LR is at irq_thread_dtor+0x21/0x80
[  129.123218] pc : [<80033886>]    lr : [<80067d1d>]    psr: 800700b3
[  129.123218] sp : ee2b1bd8  ip : 00000031  fp : 20070093
[  129.123231] r10: ef2d9cc0  r9 : 80680008  r8 : ee2b0000
[  129.123243] r7 : ef2d9cc0  r6 : 806d8c88  r5 : 00000000  r4 : ef2d9cc0
[  129.123254] r3 : 00000000  r2 : 80000000  r1 : ee2b1f38  r0 : ef2d9cc0
[  129.123269] Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA Thumb  Segment user
[  129.123283] Control: 50c5387d  Table: aeac406a  DAC: 00000015
[  129.123297] Process irq/85-12c20000 (pid: 1895, stack limit = 0xee2b0238)
[  129.123307] Stack: (0xee2b1bd8 to 0xee2b2000)
[  129.123324] 1bc0:                                                       80067cfd ef2da124
[  129.123348] 1be0: 00000000 8003136f ee2b0000 002f002f 00000000 0000000b 00000000 00000000
[  129.123370] 1c00: ee2b1d20 8001f619 20070093 00000001 8053407c 8001dccf 002f002f 806855ec
[  129.123392] 1c20: ee2b0000 806c5f84 0000000b ee2b0000 ee2b1d20 8052fd08 80680008 00000000
[  129.123413] 1c40: 20070093 8000fcf1 ee2b0238 0000000b 8176f34c 00000000 00030001 00000004
[  129.123434] 1c60: 0000de02 ee2b1d20 80406b9c ee6e641e 00000000 80008235 00000006 00000002
[  129.123456] 1c80: 00000004 00000000 00030001 80406b9c 00000193 80651d84 0111c000 80021859
[  129.123477] 1ca0: 00000001 80021c2b 8176dd84 8006dd4f 00000000 81770380 00000001 00000001
[  129.123498] 1cc0: 00000001 00000000 00000400 80042a73 0000041d 000003e3 00000400 00000000
[  129.123519] 1ce0: 80654380 ef07e950 80654380 ef07e8c0 00000000 80654380 ffffffff 00000000
[  129.123540] 1d00: ee2b1d0c 80406b9e 60070033 ffffffff ee2b1d54 00000000 ef2d9cc0 8000cac1
[  129.123561] 1d20: 00000000 ef2d9cc0 00000000 ef2d9cc0 806a9ccc ee2b0000 00000000 ee6e7800
[  129.123581] 1d40: 00000000 ef2d9cc0 ee6e641e 00000000 00000000 ee2b1d68 80406a4f 80406b9c
[  129.123601] 1d60: 60070033 ffffffff 00000000 00000000 00000000 00000000 00000400 80654380
[  129.123622] 1d80: ef07d800 00000002 ef07d800 81767800 00000002 00000000 00000001 80043101
[  129.123643] 1da0: 00000002 806a9ccc eea47400 0000002e ee6e7800 00000000 ee6e7800 ee6e641e
[  129.123664] 1dc0: 00000000 80288639 8028864d eea47400 ee6e651d 80278af5 81767380 800451c9
[  129.123686] 1de0: 00000064 ef2d9e18 81767380 80654380 ef2d9cc0 80406a4f ef2d9e18 80037a0d
[  129.123708] 1e00: eea11250 ee2b0000 eea11240 eea11250 eea11240 80406a4f 00000000 ffffffff
[  129.123729] 1e20: ef33a000 80037be5 00000000 80406a4f 00000100 00000001 eea11250 ee2b0000
[  129.123750] 1e40: eea11250 806582c0 ee75a500 80406a4f eea11250 ee2b0000 00000000 806582c0
[  129.123770] 1e60: 0000003f 80406a4f eea11240 eea11250 00000000 8027bcdd 00000001 00000000
[  129.123792] 1e80: ee2b1ebc 271ae71a eea11240 eea11240 eea47400 eea11250 80278a59 ee75a500
[  129.123813] 1ea0: eea11308 00000001 ee6e651c 8027bbe5 00000000 806a9cac 806a9ccc 00000000
[  129.123834] 1ec0: 806582c0 0000003f 00000000 ee75a300 00000000 8028d4db 00000000 8003ca2b
[  129.123855] 1ee0: 00000001 ee2b0000 806a9cac 00000001 00000055 00000000 ee75a320 80067da1
[  129.123876] 1f00: ee75a300 8028dbdd ee75a300 8065c800 ee2b0000 80067db9 00000001 8067b750
[  129.123897] 1f20: ee2b0000 80067e91 ee75a300 8065c800 ee75a300 00000000 00000000 80067cfd
[  129.123918] 1f40: ee2b1f58 ee6bdcd0 ee2b1f88 ee75a300 80067dd9 00000000 00000000 00000000
[  129.123938] 1f60: 00000000 80033561 00000000 00000000 000000f8 ee75a300 00000000 00000000
[  129.123959] 1f80: ee2b1f80 ee2b1f80 00000001 00010001 ee2b1f90 ee2b1f90 00000000 ee6bdcd0
[  129.123979] 1fa0: 800334f5 00000000 00000000 8000cf3d 00000000 00000000 00000000 00000000
[  129.123998] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  129.124017] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[  129.124070] [<80033886>] (kthread_data+0xa/0x10) from [<80067d1d>] (irq_thread_dtor+0x21/0x80)
[  129.124116] [<80067d1d>] (irq_thread_dtor+0x21/0x80) from [<8003136f>] (task_work_run+0x73/0xac)
[  129.124156] [<8003136f>] (task_work_run+0x73/0xac) from [<8001f619>] (do_exit+0x215/0x6fc)
[  129.124189] [<8001f619>] (do_exit+0x215/0x6fc) from [<8000fcf1>] (die+0x155/0x1e8)
[  129.124221] [<8000fcf1>] (die+0x155/0x1e8) from [<80008235>] (do_undefinstr+0x69/0x14c)
[  129.124261] [<80008235>] (do_undefinstr+0x69/0x14c) from [<8000cac1>] (__und_svc_finish+0x1/0x40)
[  129.124272] Exception stack(0xee2b1d20 to 0xee2b1d68)
[  129.124296] 1d20: 00000000 ef2d9cc0 00000000 ef2d9cc0 806a9ccc ee2b0000 00000000 ee6e7800
[  129.124318] 1d40: 00000000 ef2d9cc0 ee6e641e 00000000 00000000 ee2b1d68 80406a4f 80406b9c
[  129.124329] 1d60: 60070033 ffffffff
[  129.124372] [<8000cac1>] (__und_svc_finish+0x1/0x40) from [<80406b9c>] (rt_spin_lock_slowlock+0x18c/0x1a0)
[  129.124408] [<80406b9c>] (rt_spin_lock_slowlock+0x18c/0x1a0) from [<80288639>] (uart_start+0x1d/0x30)
[  129.124445] [<80288639>] (uart_start+0x1d/0x30) from [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0)
[  129.124484] [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0) from [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8)
[  129.124527] [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8) from [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c)
[  129.124569] [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c) from [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38)
[  129.124615] [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38) from [<80067db9>] (irq_forced_thread_fn+0x19/0x38)
[  129.124657] [<80067db9>] (irq_forced_thread_fn+0x19/0x38) from [<80067e91>] (irq_thread+0xb9/0x118)
[  129.124700] [<80067e91>] (irq_thread+0xb9/0x118) from [<80033561>] (kthread+0x6d/0x74)
[  129.124745] [<80033561>] (kthread+0x6d/0x74) from [<8000cf3d>] (ret_from_fork+0x11/0x20)
[  129.124775] Code: f85d eb04 f8d0 3264 (f853) 0c14

Comments

Peter Hurley Aug. 20, 2013, 7:38 p.m. UTC | #1
On 08/20/2013 06:52 AM, Viresh Kumar wrote:
> Hi Peter,
>
> On 20 August 2013 08:05, Peter Hurley <peter@hurleysoftware.com> wrote:
>> The key to why the flush_to_disc() worker thread cannot acquire the
>> uart_port spinlock is in the first stack trace:
>>
>> s3c24xx_serial_rx_chars
>>    raw_spin_unlock_irqrestore
>>      do_raw_spin_unlock
>>        dump_stack
>>
>> So the real question here is why do_raw_spin_unlock() is doing
>> a dump_stack()?  Is one of the SPIN_BUG_ON() asserts in debug_spin_unlock()
>> triggering?
>
> Ahh, you are right and SPIN_BUG_ON() is indeed getting asserted.
>
> I am embarrassed to say this but it happened due to my stupid mistake :(
>
> And yes, non-rt kernels doesn't see this crash as we simply schedule
> work from our interrupt handlers instead of calling flush_to_ldisc()
>
> This is how I got the wrong crash dump:
> - Applied my patch over latest linux-next
> - Some more debug messages
>
> Tested: No crashes found..
>
> Thought maybe I should try without my patch first, if it really crashes
> or not.
>
> - Reverted my patch and got some conflicts due the "debug patch"..
> and ended up resolving them in the wrong way:
>
> -       spin_unlock_irqrestore(&port->lock, flags);
>          pr_info("%s\n", __func__);
>          tty_flip_buffer_push(&port->state->port);
> +       spin_unlock_irqrestore(&port->lock, flags);
>
>    out:
> +       spin_unlock_irqrestore(&port->lock, flags);
>
> Got two unlocks and so the dump..
> Sorry for the incorrect information :(

No need to apologize. At least this part of the mystery is solved.

> But yes there are some issues with the RT kernel, which are going to
> stay without this patchset.. Here are the logs with 3.10.6 + rt kernel
> (attached as well, in case lines are wrapped by gmail :( ):
>
> root@localhost:~# [  128.771800] ------------[ cut here ]------------
> [  128.771815] kernel BUG at /home/arm/work/kernel/linaro/lng/lng.git/kernel/rtmutex.c:738!
> [  128.771829] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP THUMB2
> [  128.771844] Modules linked in:
> [  128.771868] CPU: 0 PID: 1895 Comm: irq/85-12c20000 Not tainted 3.10.6-rt3+ #7
> [  128.771885] task: ef2d9cc0 ti: ee2b0000 task.ti: ee2b0000
> [  128.771915] PC is at rt_spin_lock_slowlock+0x18c/0x1a0
> [  128.771933] LR is at rt_spin_lock_slowlock+0x3f/0x1a0
> [  128.771956] pc : [<80406b9c>]    lr : [<80406a4f>]    psr: 60070033
> [  128.771956] sp : ee2b1d68  ip : 00000000  fp : 00000000
> [  128.771968] r10: ee6e641e  r9 : ef2d9cc0  r8 : 00000000
> [  128.771982] r7 : ee6e7800  r6 : 00000000  r5 : ee2b0000  r4 : 806a9ccc
> [  128.771995] r3 : ef2d9cc0  r2 : 00000000  r1 : ef2d9cc0  r0 : 00000000
> [  128.772011] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb Segment kernel
> [  128.772026] Control: 50c5387d  Table: aeac406a  DAC: 00000015
> [  128.772041] Process irq/85-12c20000 (pid: 1895, stack limit = 0xee2b0238)
> [  128.772054] Stack: (0xee2b1d68 to 0xee2b2000)
> [  128.772080] 1d60:                   00000000 00000000 00000000 00000000 00000400 80654380
> [  128.772105] 1d80: ef07d800 00000002 ef07d800 81767800 00000002 00000000 00000001 80043101
> [  128.772128] 1da0: 00000002 806a9ccc eea47400 0000002e ee6e7800 00000000 ee6e7800 ee6e641e
> [  128.772150] 1dc0: 00000000 80288639 8028864d eea47400 ee6e651d 80278af5 81767380 800451c9
> [  128.772173] 1de0: 00000064 ef2d9e18 81767380 80654380 ef2d9cc0 80406a4f ef2d9e18 80037a0d
> [  128.772195] 1e00: eea11250 ee2b0000 eea11240 eea11250 eea11240 80406a4f 00000000 ffffffff
> [  128.772216] 1e20: ef33a000 80037be5 00000000 80406a4f 00000100 00000001 eea11250 ee2b0000
> [  128.772238] 1e40: eea11250 806582c0 ee75a500 80406a4f eea11250 ee2b0000 00000000 806582c0
> [  128.772260] 1e60: 0000003f 80406a4f eea11240 eea11250 00000000 8027bcdd 00000001 00000000
> [  128.772282] 1e80: ee2b1ebc 271ae71a eea11240 eea11240 eea47400 eea11250 80278a59 ee75a500
> [  128.772304] 1ea0: eea11308 00000001 ee6e651c 8027bbe5 00000000 806a9cac 806a9ccc 00000000
> [  128.772325] 1ec0: 806582c0 0000003f 00000000 ee75a300 00000000 8028d4db 00000000 8003ca2b
> [  128.772347] 1ee0: 00000001 ee2b0000 806a9cac 00000001 00000055 00000000 ee75a320 80067da1
> [  128.772368] 1f00: ee75a300 8028dbdd ee75a300 8065c800 ee2b0000 80067db9 00000001 8067b750
> [  128.772390] 1f20: ee2b0000 80067e91 ee75a300 8065c800 ee75a300 00000000 00000000 80067cfd
> [  128.772411] 1f40: ee2b1f58 ee6bdcd0 ee2b1f88 ee75a300 80067dd9 00000000 00000000 00000000
> [  128.772432] 1f60: 00000000 80033561 00000000 00000000 000000f8 ee75a300 00000000 00000000
> [  128.772453] 1f80: ee2b1f80 ee2b1f80 00000000 00000000 ee2b1f90 ee2b1f90 00000000 ee6bdcd0
> [  128.772473] 1fa0: 800334f5 00000000 00000000 8000cf3d 00000000 00000000 00000000 00000000
> [  128.772492] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [  128.772512] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> [  128.772564] [<80406b9c>] (rt_spin_lock_slowlock+0x18c/0x1a0) from [<80288639>] (uart_start+0x1d/0x30)
> [  128.772608] [<80288639>] (uart_start+0x1d/0x30) from [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0)
> [  128.772650] [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0) from [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8)
> [  128.772697] [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8) from [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c)
> [  128.772742] [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c) from [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38)
> [  128.772793] [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38) from  [<80067db9>] (irq_forced_thread_fn+0x19/0x38)
> [  128.772837] [<80067db9>] (irq_forced_thread_fn+0x19/0x38) from [<80067e91>] (irq_thread+0xb9/0x118)
> [  128.772884] [<80067e91>] (irq_thread+0xb9/0x118) from [<80033561>] (kthread+0x6d/0x74)
> [  128.772932] [<80033561>] (kthread+0x6d/0x74) from [<8000cf3d>] (ret_from_fork+0x11/0x20)
> [  128.772962] Code: f7ff fae4 e7ac de02 (de02) 1d23
> [  129.122974] ---[ end trace 0000000000000002 ]---
> [  129.122994] note: irq/85-12c20000[1895] exited with preempt_count 1
> [  129.123041] Unable to handle kernel paging request at virtual address ffffffec
> [  129.123049] pgd = 80004000
> [  129.123077] [ffffffec] *pgd=af7fd821, *pte=00000000, *ppte=00000000
> [  129.123095] Internal error: Oops: 17 [#2] PREEMPT SMP THUMB2
> [  129.123108] Modules linked in:
> [  129.123131] CPU: 0 PID: 1895 Comm: irq/85-12c20000 Tainted: G D      3.10.6-rt3+ #7
> [  129.123144] task: ef2d9cc0 ti: ee2b0000 task.ti: ee2b0000
> [  129.123173] PC is at kthread_data+0xa/0x10
> [  129.123196] LR is at irq_thread_dtor+0x21/0x80
> [  129.123218] pc : [<80033886>]    lr : [<80067d1d>]    psr: 800700b3
> [  129.123218] sp : ee2b1bd8  ip : 00000031  fp : 20070093
> [  129.123231] r10: ef2d9cc0  r9 : 80680008  r8 : ee2b0000
> [  129.123243] r7 : ef2d9cc0  r6 : 806d8c88  r5 : 00000000  r4 : ef2d9cc0
> [  129.123254] r3 : 00000000  r2 : 80000000  r1 : ee2b1f38  r0 : ef2d9cc0
> [  129.123269] Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA Thumb Segment user
> [  129.123283] Control: 50c5387d  Table: aeac406a  DAC: 00000015
> [  129.123297] Process irq/85-12c20000 (pid: 1895, stack limit = 0xee2b0238)
> [  129.123307] Stack: (0xee2b1bd8 to 0xee2b2000)
> [  129.123324] 1bc0:     80067cfd ef2da124
> [  129.123348] 1be0: 00000000 8003136f ee2b0000 002f002f 00000000 0000000b 00000000 00000000
> [  129.123370] 1c00: ee2b1d20 8001f619 20070093 00000001 8053407c 8001dccf 002f002f 806855ec
> [  129.123392] 1c20: ee2b0000 806c5f84 0000000b ee2b0000 ee2b1d20 8052fd08 80680008 00000000
> [  129.123413] 1c40: 20070093 8000fcf1 ee2b0238 0000000b 8176f34c 00000000 00030001 00000004
> [  129.123434] 1c60: 0000de02 ee2b1d20 80406b9c ee6e641e 00000000 80008235 00000006 00000002
> [  129.123456] 1c80: 00000004 00000000 00030001 80406b9c 00000193 80651d84 0111c000 80021859
> [  129.123477] 1ca0: 00000001 80021c2b 8176dd84 8006dd4f 00000000 81770380 00000001 00000001
> [  129.123498] 1cc0: 00000001 00000000 00000400 80042a73 0000041d 000003e3 00000400 00000000
> [  129.123519] 1ce0: 80654380 ef07e950 80654380 ef07e8c0 00000000 80654380 ffffffff 00000000
> [  129.123540] 1d00: ee2b1d0c 80406b9e 60070033 ffffffff ee2b1d54 00000000 ef2d9cc0 8000cac1
> [  129.123561] 1d20: 00000000 ef2d9cc0 00000000 ef2d9cc0 806a9ccc ee2b0000 00000000 ee6e7800
> [  129.123581] 1d40: 00000000 ef2d9cc0 ee6e641e 00000000 00000000 ee2b1d68 80406a4f 80406b9c
> [  129.123601] 1d60: 60070033 ffffffff 00000000 00000000 00000000 00000000 00000400 80654380
> [  129.123622] 1d80: ef07d800 00000002 ef07d800 81767800 00000002 00000000 00000001 80043101
> [  129.123643] 1da0: 00000002 806a9ccc eea47400 0000002e ee6e7800 00000000 ee6e7800 ee6e641e
> [  129.123664] 1dc0: 00000000 80288639 8028864d eea47400 ee6e651d 80278af5 81767380 800451c9
> [  129.123686] 1de0: 00000064 ef2d9e18 81767380 80654380 ef2d9cc0 80406a4f ef2d9e18 80037a0d
> [  129.123708] 1e00: eea11250 ee2b0000 eea11240 eea11250 eea11240 80406a4f 00000000 ffffffff
> [  129.123729] 1e20: ef33a000 80037be5 00000000 80406a4f 00000100 00000001 eea11250 ee2b0000
> [  129.123750] 1e40: eea11250 806582c0 ee75a500 80406a4f eea11250 ee2b0000 00000000 806582c0
> [  129.123770] 1e60: 0000003f 80406a4f eea11240 eea11250 00000000 8027bcdd 00000001 00000000
> [  129.123792] 1e80: ee2b1ebc 271ae71a eea11240 eea11240 eea47400 eea11250 80278a59 ee75a500
> [  129.123813] 1ea0: eea11308 00000001 ee6e651c 8027bbe5 00000000 806a9cac 806a9ccc 00000000
> [  129.123834] 1ec0: 806582c0 0000003f 00000000 ee75a300 00000000 8028d4db 00000000 8003ca2b
> [  129.123855] 1ee0: 00000001 ee2b0000 806a9cac 00000001 00000055 00000000 ee75a320 80067da1
> [  129.123876] 1f00: ee75a300 8028dbdd ee75a300 8065c800 ee2b0000 80067db9 00000001 8067b750
> [  129.123897] 1f20: ee2b0000 80067e91 ee75a300 8065c800 ee75a300 00000000 00000000 80067cfd
> [  129.123918] 1f40: ee2b1f58 ee6bdcd0 ee2b1f88 ee75a300 80067dd9 00000000 00000000 00000000
> [  129.123938] 1f60: 00000000 80033561 00000000 00000000 000000f8 ee75a300 00000000 00000000
> [  129.123959] 1f80: ee2b1f80 ee2b1f80 00000001 00010001 ee2b1f90 ee2b1f90 00000000 ee6bdcd0
> [  129.123979] 1fa0: 800334f5 00000000 00000000 8000cf3d 00000000 00000000 00000000 00000000
> [  129.123998] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [  129.124017] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> [  129.124070] [<80033886>] (kthread_data+0xa/0x10) from [<80067d1d>] (irq_thread_dtor+0x21/0x80)
> [  129.124116] [<80067d1d>] (irq_thread_dtor+0x21/0x80) from [<8003136f>] (task_work_run+0x73/0xac)
> [  129.124156] [<8003136f>] (task_work_run+0x73/0xac) from [<8001f619>] (do_exit+0x215/0x6fc)
> [  129.124189] [<8001f619>] (do_exit+0x215/0x6fc) from [<8000fcf1>] (die+0x155/0x1e8)
> [  129.124221] [<8000fcf1>] (die+0x155/0x1e8) from [<80008235>] (do_undefinstr+0x69/0x14c)
> [  129.124261] [<80008235>] (do_undefinstr+0x69/0x14c) from [<8000cac1>] (__und_svc_finish+0x1/0x40)
> [  129.124272] Exception stack(0xee2b1d20 to 0xee2b1d68)
> [  129.124296] 1d20: 00000000 ef2d9cc0 00000000 ef2d9cc0 806a9ccc ee2b0000 00000000 ee6e7800
> [  129.124318] 1d40: 00000000 ef2d9cc0 ee6e641e 00000000 00000000 ee2b1d68 80406a4f 80406b9c
> [  129.124329] 1d60: 60070033 ffffffff
> [  129.124372] [<8000cac1>] (__und_svc_finish+0x1/0x40) from [<80406b9c>] (rt_spin_lock_slowlock+0x18c/0x1a0)
> [  129.124408] [<80406b9c>] (rt_spin_lock_slowlock+0x18c/0x1a0) from [<80288639>] (uart_start+0x1d/0x30)
> [  129.124445] [<80288639>] (uart_start+0x1d/0x30) from [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0)
> [  129.124484] [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0) from [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8)
> [  129.124527] [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8) from [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c)
> [  129.124569] [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c) from [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38)
> [  129.124615] [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38) from [<80067db9>] (irq_forced_thread_fn+0x19/0x38)
> [  129.124657] [<80067db9>] (irq_forced_thread_fn+0x19/0x38) from [<80067e91>] (irq_thread+0xb9/0x118)
> [  129.124700] [<80067e91>] (irq_thread+0xb9/0x118) from [<80033561>] (kthread+0x6d/0x74)
> [  129.124745] [<80033561>] (kthread+0x6d/0x74) from [<8000cf3d>] (ret_from_fork+0x11/0x20)
> [  129.124775] Code: f85d eb04 f8d0 3264 (f853) 0c14

In light of the RT patch below, completely understandable why it crashes.

PS - Not automagic formatting, unless you consider my hand-editing automagic.

> ---------x-----------------x------------
>
> Here we haven't scheduled work from tty_flip_buffer_push() because of
> following change in RT kernel:
>
> commit 714cc6dc551136b8f88b53361b504a5f624308b9
> Author: Ingo Molnar <mingo@elte.hu>
> Date:   Fri Jul 3 08:30:01 2009 -0500
>
>      serial: 8250: Call flush_to_ldisc when the irq is threaded
>
>      Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
>   drivers/tty/tty_buffer.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
> index 9121c1f..94d3215 100644
> --- a/drivers/tty/tty_buffer.c
> +++ b/drivers/tty/tty_buffer.c
> @@ -517,10 +517,15 @@ void tty_flip_buffer_push(struct tty_port *port)
>                  buf->tail->commit = buf->tail->used;
>          spin_unlock_irqrestore(&buf->lock, flags);
>
> +#ifndef CONFIG_PREEMPT_RT_FULL
>          if (port->low_latency)
>                  flush_to_ldisc(&buf->work);
>          else
>                  schedule_work(&buf->work);
> +#else
> +       flush_to_ldisc(&buf->work);
> +#endif

As you've discovered, not correct for 25+ drivers, including the samsung driver.


> +
>   }
>   EXPORT_SYMBOL(tty_flip_buffer_push);
>
> @@ -545,4 +550,3 @@ void tty_buffer_init(struct tty_port *port)
>          buf->memory_used = 0;
>          INIT_WORK(&buf->work, flush_to_ldisc);
>   }
>
> ---------x------------------x-------------------
>
> How do you guys suggest me to fix this problem now? The fix I suggested?
> Or something specific to the RT kernel only? Or maybe send these patches
> for RT kernel only.

That's really Greg's call whether he wants to take up RT-specific patches
into the regular tree.


> One more thing: We are calling tty_flip_buffer_push() with (from interrupt
> context) and without (from process context) port->lock held.  Shouldn't we
> make this flat, like don't acquire port->lock while calling
> tty_flip_buffer_push() as that may cause recursive locking?

The tty buffer interface is designed for a single producer so
provides no locking of its own. Provided a uart port mini-driver only uses
the tty buffer interface serially, then no locking is required (to protect
the internals of the tty buffer).

When you note that tty_flip_buffer_push() is called from 'process context',
I assume you mean kworker threads handling rx DMA. Besides the pty driver
and the fwserial staging driver that has loopback mode, I'm not aware of
any other drivers that do rx from process context.

Regards,
Peter Hurley
Viresh Kumar Aug. 21, 2013, 4:35 a.m. UTC | #2
On 21 August 2013 01:08, Peter Hurley <peter@hurleysoftware.com> wrote:
> On 08/20/2013 06:52 AM, Viresh Kumar wrote:

> That's really Greg's call whether he wants to take up RT-specific patches
> into the regular tree.

Yeah.. Lets wait for Greg..

>> One more thing: We are calling tty_flip_buffer_push() with (from interrupt
>> context) and without (from process context) port->lock held.  Shouldn't we
>>
>> make this flat, like don't acquire port->lock while calling
>> tty_flip_buffer_push() as that may cause recursive locking?
>
> When you note that tty_flip_buffer_push() is called from 'process context',
> I assume you mean kworker threads handling rx DMA. Besides the pty driver
> and the fwserial staging driver that has loopback mode, I'm not aware of
> any other drivers that do rx from process context.

Okay.. And so most of the people call it from interrupt context (25+ of them,
I am trying to fix for RT and others are already doing the right thing) and all
of these must have set port->low_latency to false.

Now my questions is, what exactly we want to protect in this function while
calling it from interrupt or process context? For which we need a port->lock
to be held by caller?

There are drivers which are sustaining without a lock being held by caller and
others shouldn't have any problem doing that... That might be a clean solution
for Mainline and that will fix the RT issues we are facing :)

--
viresh
Viresh Kumar Aug. 23, 2013, 8:26 a.m. UTC | #3
On 21 August 2013 10:05, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 21 August 2013 01:08, Peter Hurley <peter@hurleysoftware.com> wrote:
>> On 08/20/2013 06:52 AM, Viresh Kumar wrote:
>
>> That's really Greg's call whether he wants to take up RT-specific patches
>> into the regular tree.
>
> Yeah.. Lets wait for Greg..

Hi Greg,

Should I resend my patches with better commit logs if you are going to
take these patches?

--
viresh
Greg Kroah-Hartman Aug. 27, 2013, 11:19 p.m. UTC | #4
On Fri, Aug 23, 2013 at 01:56:08PM +0530, Viresh Kumar wrote:
> On 21 August 2013 10:05, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > On 21 August 2013 01:08, Peter Hurley <peter@hurleysoftware.com> wrote:
> >> On 08/20/2013 06:52 AM, Viresh Kumar wrote:
> >
> >> That's really Greg's call whether he wants to take up RT-specific patches
> >> into the regular tree.
> >
> > Yeah.. Lets wait for Greg..
> 
> Hi Greg,
> 
> Should I resend my patches with better commit logs if you are going to
> take these patches?

No need, I took them as-is, thanks for doing this work.

greg k-h
Viresh Kumar Aug. 28, 2013, 3:31 a.m. UTC | #5
On 28 August 2013 04:49, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> No need, I took them as-is, thanks for doing this work.

Thanks Greg..

But I think the current commit log is poor and doesn't communicate the problem
well.. I have prepared another patchset with the same diff and updated
log (below),
I can resend them if you are willing to drop current version and take next one..

----------x--------------x---------------------

    The current driver triggers a lockdep warning for RT kernel if
    tty_flip_buffer_push() is called with uart_port->lock locked. This
never shows
    up on UP kernels and comes up only on SMP kernels on RT kernels.

    This happened due to following commit in rt kernels:

        commit 714cc6dc551136b8f88b53361b504a5f624308b9
        Author: Ingo Molnar <mingo@elte.hu>
        Date:   Fri Jul 3 08:30:01 2009 -0500

            serial: 8250: Call flush_to_ldisc when the irq is threaded

            Signed-off-by: Ingo Molnar <mingo@elte.hu>
        ---
         drivers/tty/tty_buffer.c | 6 +++++-
         1 file changed, 5 insertions(+), 1 deletion(-)

        diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
        index 9121c1f..94d3215 100644
        --- a/drivers/tty/tty_buffer.c
        +++ b/drivers/tty/tty_buffer.c
        @@ -517,10 +517,15 @@ void tty_flip_buffer_push(struct tty_port *port)
                        buf->tail->commit = buf->tail->used;
                spin_unlock_irqrestore(&buf->lock, flags);

        +#ifndef CONFIG_PREEMPT_RT_FULL
                if (port->low_latency)
                        flush_to_ldisc(&buf->work);
                else
                        schedule_work(&buf->work);
        +#else
        +       flush_to_ldisc(&buf->work);
        +#endif
        +
         }
         EXPORT_SYMBOL(tty_flip_buffer_push);

    And the crash looks like this (produced with samsung.c driver):

    -----
        kernel BUG at
/home/arm/work/kernel/linaro/lng/lng.git/kernel/rtmutex.c:738!
        Internal error: Oops - BUG: 0 [#1] PREEMPT SMP THUMB2
        Modules linked in:
        CPU: 0 PID: 1895 Comm: irq/85-12c20000 Not tainted 3.10.6-rt3+ #7
        task: ef2d9cc0 ti: ee2b0000 task.ti: ee2b0000
        PC is at rt_spin_lock_slowlock+0x18c/0x1a0
        LR is at rt_spin_lock_slowlock+0x3f/0x1a0
        pc : [<80406b9c>]    lr : [<80406a4f>]    psr: 60070033
        sp : ee2b1d68  ip : 00000000  fp : 00000000
        r10: ee6e641e  r9 : ef2d9cc0  r8 : 00000000
        r7 : ee6e7800  r6 : 00000000  r5 : ee2b0000  r4 : 806a9ccc
        r3 : ef2d9cc0  r2 : 00000000  r1 : ef2d9cc0  r0 : 00000000
        Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb Segment kernel
        Control: 50c5387d  Table: aeac406a  DAC: 00000015
        Process irq/85-12c20000 (pid: 1895, stack limit = 0xee2b0238)
        Stack: (0xee2b1d68 to 0xee2b2000)

        [snip]

        [<80406b9c>] (rt_spin_lock_slowlock+0x18c/0x1a0) from
[<80288639>] (uart_start+0x1d/0x30)
        [<80288639>] (uart_start+0x1d/0x30) from [<80278af5>]
(n_tty_receive_buf+0x9d/0xbf0)
        [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0) from [<8027bbe5>]
(flush_to_ldisc+0xb5/0xe8)
        [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8) from [<8028d4db>]
(s3c24xx_serial_rx_chars+0xbf/0x19c)
        [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c) from
[<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38)
        [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38) from
[<80067db9>] (irq_forced_thread_fn+0x19/0x38)
        [<80067db9>] (irq_forced_thread_fn+0x19/0x38) from
[<80067e91>] (irq_thread+0xb9/0x118)
        [<80067e91>] (irq_thread+0xb9/0x118) from [<80033561>]
(kthread+0x6d/0x74)
        [<80033561>] (kthread+0x6d/0x74) from [<8000cf3d>]
(ret_from_fork+0x11/0x20)
        Code: f7ff fae4 e7ac de02 (de02) 1d23
        ---[ end trace 0000000000000002 ]---
        note: irq/85-12c20000[1895] exited with preempt_count 1
        Unable to handle kernel paging request at virtual address ffffffec
        pgd = 80004000
        [ffffffec] *pgd=af7fd821, *pte=00000000, *ppte=00000000
        Internal error: Oops: 17 [#2] PREEMPT SMP THUMB2
        Modules linked in:
        CPU: 0 PID: 1895 Comm: irq/85-12c20000 Tainted: G D      3.10.6-rt3+ #7
        task: ef2d9cc0 ti: ee2b0000 task.ti: ee2b0000
        PC is at kthread_data+0xa/0x10
        LR is at irq_thread_dtor+0x21/0x80

        [snip]

        [<80033886>] (kthread_data+0xa/0x10) from [<80067d1d>]
(irq_thread_dtor+0x21/0x80)
        [<80067d1d>] (irq_thread_dtor+0x21/0x80) from [<8003136f>]
(task_work_run+0x73/0xac)
        [<8003136f>] (task_work_run+0x73/0xac) from [<8001f619>]
(do_exit+0x215/0x6fc)
        [<8001f619>] (do_exit+0x215/0x6fc) from [<8000fcf1>] (die+0x155/0x1e8)
        [<8000fcf1>] (die+0x155/0x1e8) from [<80008235>]
(do_undefinstr+0x69/0x14c)
        [<80008235>] (do_undefinstr+0x69/0x14c) from [<8000cac1>]
(__und_svc_finish+0x1/0x40)
        Exception stack(0xee2b1d20 to 0xee2b1d68)
        1d20: 00000000 ef2d9cc0 00000000 ef2d9cc0 806a9ccc ee2b0000
00000000 ee6e7800
        1d40: 00000000 ef2d9cc0 ee6e641e 00000000 00000000 ee2b1d68
80406a4f 80406b9c
        1d60: 60070033 ffffffff
        [<8000cac1>] (__und_svc_finish+0x1/0x40) from [<80406b9c>]
(rt_spin_lock_slowlock+0x18c/0x1a0)
        [<80406b9c>] (rt_spin_lock_slowlock+0x18c/0x1a0) from
[<80288639>] (uart_start+0x1d/0x30)
        [<80288639>] (uart_start+0x1d/0x30) from [<80278af5>]
(n_tty_receive_buf+0x9d/0xbf0)
        [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0) from [<8027bbe5>]
(flush_to_ldisc+0xb5/0xe8)
        [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8) from [<8028d4db>]
(s3c24xx_serial_rx_chars+0xbf/0x19c)
        [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c) from
[<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38)
        [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38) from
[<80067db9>] (irq_forced_thread_fn+0x19/0x38)
        [<80067db9>] (irq_forced_thread_fn+0x19/0x38) from
[<80067e91>] (irq_thread+0xb9/0x118)
        [<80067e91>] (irq_thread+0xb9/0x118) from [<80033561>]
(kthread+0x6d/0x74)
        [<80033561>] (kthread+0x6d/0x74) from [<8000cf3d>]
(ret_from_fork+0x11/0x20)

    Moreoever port->lock isn't really required before calling
tty_flip_buffer_push()
    and hence its better to make changes in mainline kernel instead of
RT kernel.

    Release the port lock before calling tty_flip_buffer_push() and reacquire it
    after the call.

    Similar stuff was already done for few other drivers in the past, like:

        commit 2389b272168ceec056ca1d8a870a97fa9c26e11a
        Author: Thomas Gleixner <tglx@linutronix.de>
        Date:   Tue May 29 21:53:50 2007 +0100

            [ARM] 4417/1: Serial: Fix AMBA drivers locking

        Cc: Tobias Klauser <tklauser@distanz.ch>
Greg Kroah-Hartman Aug. 28, 2013, 3:43 a.m. UTC | #6
On Wed, Aug 28, 2013 at 09:01:25AM +0530, Viresh Kumar wrote:
> On 28 August 2013 04:49, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> > No need, I took them as-is, thanks for doing this work.
> 
> Thanks Greg..
> 
> But I think the current commit log is poor and doesn't communicate the problem
> well.. I have prepared another patchset with the same diff and updated
> log (below),
> I can resend them if you are willing to drop current version and take next one..

I can't drop them, I could revert them, and then apply the new ones, but
that's just doing 50 patches of "churn" for no good reason.  I assumed
that if you really wanted the changelog changed, you would have sent me
updated versions by now, so I went with what you had.  :)

It's fine, don't worry.

greg k-h
Viresh Kumar Aug. 28, 2013, 3:45 a.m. UTC | #7
On 28 August 2013 09:13, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> I can't drop them, I could revert them, and then apply the new ones, but
> that's just doing 50 patches of "churn" for no good reason.

Agreed..

> I assumed
> that if you really wanted the changelog changed, you would have sent me
> updated versions by now, so I went with what you had.  :)

I didn't wanted to spam LKML with another 25 patches in case you weren't
going to apply them.. And that's why asked you specifically this:

"Should I resend my patches with better commit logs if you are going to
take these patches?"

:)

> It's fine, don't worry.

Okay.. Thanks.
Viresh Kumar Aug. 28, 2013, 5:58 a.m. UTC | #8
On 28 August 2013 04:49, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> No need, I took them as-is, thanks for doing this work.

I believe this must go to few stable releases as well.. ??
Greg Kroah-Hartman Aug. 28, 2013, 6:05 a.m. UTC | #9
On Wed, Aug 28, 2013 at 11:28:48AM +0530, Viresh Kumar wrote:
> On 28 August 2013 04:49, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> > No need, I took them as-is, thanks for doing this work.
> 
> I believe this must go to few stable releases as well.. ??

Why, as it doesn't cause problems unless you are running rt kernels at
the moment, right?  Is there some other way to trigger the problem that
you know of?

Remember, stable is for "real" problems :)

thanks,

greg k-h
Viresh Kumar Aug. 28, 2013, 6:07 a.m. UTC | #10
On 28 August 2013 11:35, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> Why, as it doesn't cause problems unless you are running rt kernels at
> the moment, right?  Is there some other way to trigger the problem that
> you know of?

No, its only coming for RT kernels...

> Remember, stable is for "real" problems :)

I faced this issue while working on 3.10 rt kernel and so asked :)
Greg Kroah-Hartman Aug. 28, 2013, 6:15 a.m. UTC | #11
On Wed, Aug 28, 2013 at 11:37:21AM +0530, Viresh Kumar wrote:
> On 28 August 2013 11:35, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> > Why, as it doesn't cause problems unless you are running rt kernels at
> > the moment, right?  Is there some other way to trigger the problem that
> > you know of?
> 
> No, its only coming for RT kernels...
> 
> > Remember, stable is for "real" problems :)
> 
> I faced this issue while working on 3.10 rt kernel and so asked :)

Ask the -rt developers to add it to their patchset if it affects those
kernels.
diff mbox

Patch

diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index 9121c1f..94d3215 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -517,10 +517,15 @@  void tty_flip_buffer_push(struct tty_port *port)
                buf->tail->commit = buf->tail->used;
        spin_unlock_irqrestore(&buf->lock, flags);

+#ifndef CONFIG_PREEMPT_RT_FULL
        if (port->low_latency)
                flush_to_ldisc(&buf->work);
        else
                schedule_work(&buf->work);
+#else
+       flush_to_ldisc(&buf->work);
+#endif
+
 }
 EXPORT_SYMBOL(tty_flip_buffer_push);

@@ -545,4 +550,3 @@  void tty_buffer_init(struct tty_port *port)
        buf->memory_used = 0;
        INIT_WORK(&buf->work, flush_to_ldisc);
 }