diff mbox

xen/events: events_fifo: Don't use {get,put}_cpu() in xen_evtchn_fifo_init()

Message ID 20170817161453.19318-1-julien.grall@arm.com
State Superseded
Headers show

Commit Message

Julien Grall Aug. 17, 2017, 4:14 p.m. UTC
When booting Linux as Xen guest with CONFIG_DEBUG_ATOMIC, the following
splat appears:

[    0.002323] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)
[    0.019717] ASID allocator initialised with 65536 entries
[    0.020019] xen:grant_table: Grant tables using version 1 layout
[    0.020051] Grant table initialized
[    0.020069] BUG: sleeping function called from invalid context at /data/src/linux/mm/page_alloc.c:4046
[    0.020100] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0
[    0.020123] no locks held by swapper/0/1.
[    0.020143] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc5 #598
[    0.020166] Hardware name: FVP Base (DT)
[    0.020182] Call trace:
[    0.020199] [<ffff00000808a5c0>] dump_backtrace+0x0/0x270
[    0.020222] [<ffff00000808a95c>] show_stack+0x24/0x30
[    0.020244] [<ffff000008c1ef20>] dump_stack+0xb8/0xf0
[    0.020267] [<ffff0000081128c0>] ___might_sleep+0x1c8/0x1f8
[    0.020291] [<ffff000008112948>] __might_sleep+0x58/0x90
[    0.020313] [<ffff0000082171b8>] __alloc_pages_nodemask+0x1c0/0x12e8
[    0.020338] [<ffff00000827a110>] alloc_page_interleave+0x38/0x88
[    0.020363] [<ffff00000827a904>] alloc_pages_current+0xdc/0xf0
[    0.020387] [<ffff000008211f38>] __get_free_pages+0x28/0x50
[    0.020411] [<ffff0000086566a4>] evtchn_fifo_alloc_control_block+0x2c/0xa0
[    0.020437] [<ffff0000091747b0>] xen_evtchn_fifo_init+0x38/0xb4
[    0.020461] [<ffff0000091746c0>] xen_init_IRQ+0x44/0xc8
[    0.020484] [<ffff000009128adc>] xen_guest_init+0x250/0x300
[    0.020507] [<ffff000008083974>] do_one_initcall+0x44/0x130
[    0.020531] [<ffff000009120df8>] kernel_init_freeable+0x120/0x288
[    0.020556] [<ffff000008c31ca8>] kernel_init+0x18/0x110
[    0.020578] [<ffff000008083710>] ret_from_fork+0x10/0x40
[    0.020606] xen:events: Using FIFO-based ABI
[    0.020658] Xen: initializing cpu0
[    0.027727] Hierarchical SRCU implementation.
[    0.036235] EFI services will not be available.
[    0.043810] smp: Bringing up secondary CPUs ...

This is because get_cpu() in xen_evtchn_fifo_init() will disable
preemption, but __get_free_page() might sleep (GFP_ATOMIC is not set).

xen_evtchn_fifo_init() will always be called before SMP is initialized,
so {get,put}_cpu() could be replaced by a simple smp_processor_id().

This also avoid to modify evtchn_fifo_alloc_control_block that will be
called in other context.

Signed-off-by: Julien Grall <julien.grall@arm.com>

Reported-by: Andre Przywara <andre.przywara@arm.com>
Fixes: 1fe565517b57 ("xen/events: use the FIFO-based ABI if available")
---
 drivers/xen/events/events_fifo.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

-- 
2.11.0

Comments

Boris Ostrovsky Aug. 17, 2017, 5:36 p.m. UTC | #1
On 08/17/2017 12:14 PM, Julien Grall wrote:
> When booting Linux as Xen guest with CONFIG_DEBUG_ATOMIC, the following

> splat appears:

>

> [    0.002323] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)

> [    0.019717] ASID allocator initialised with 65536 entries

> [    0.020019] xen:grant_table: Grant tables using version 1 layout

> [    0.020051] Grant table initialized

> [    0.020069] BUG: sleeping function called from invalid context at /data/src/linux/mm/page_alloc.c:4046

> [    0.020100] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0

> [    0.020123] no locks held by swapper/0/1.

> [    0.020143] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc5 #598

> [    0.020166] Hardware name: FVP Base (DT)

> [    0.020182] Call trace:

> [    0.020199] [<ffff00000808a5c0>] dump_backtrace+0x0/0x270

> [    0.020222] [<ffff00000808a95c>] show_stack+0x24/0x30

> [    0.020244] [<ffff000008c1ef20>] dump_stack+0xb8/0xf0

> [    0.020267] [<ffff0000081128c0>] ___might_sleep+0x1c8/0x1f8

> [    0.020291] [<ffff000008112948>] __might_sleep+0x58/0x90

> [    0.020313] [<ffff0000082171b8>] __alloc_pages_nodemask+0x1c0/0x12e8

> [    0.020338] [<ffff00000827a110>] alloc_page_interleave+0x38/0x88

> [    0.020363] [<ffff00000827a904>] alloc_pages_current+0xdc/0xf0

> [    0.020387] [<ffff000008211f38>] __get_free_pages+0x28/0x50

> [    0.020411] [<ffff0000086566a4>] evtchn_fifo_alloc_control_block+0x2c/0xa0

> [    0.020437] [<ffff0000091747b0>] xen_evtchn_fifo_init+0x38/0xb4

> [    0.020461] [<ffff0000091746c0>] xen_init_IRQ+0x44/0xc8

> [    0.020484] [<ffff000009128adc>] xen_guest_init+0x250/0x300

> [    0.020507] [<ffff000008083974>] do_one_initcall+0x44/0x130

> [    0.020531] [<ffff000009120df8>] kernel_init_freeable+0x120/0x288

> [    0.020556] [<ffff000008c31ca8>] kernel_init+0x18/0x110

> [    0.020578] [<ffff000008083710>] ret_from_fork+0x10/0x40

> [    0.020606] xen:events: Using FIFO-based ABI

> [    0.020658] Xen: initializing cpu0

> [    0.027727] Hierarchical SRCU implementation.

> [    0.036235] EFI services will not be available.

> [    0.043810] smp: Bringing up secondary CPUs ...

>

> This is because get_cpu() in xen_evtchn_fifo_init() will disable

> preemption, but __get_free_page() might sleep (GFP_ATOMIC is not set).

>

> xen_evtchn_fifo_init() will always be called before SMP is initialized,

> so {get,put}_cpu() could be replaced by a simple smp_processor_id().


On x86 this will be called out of init_IRQ(), which is already preceded
by preempt_disable().

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Julien Grall Aug. 18, 2017, 10:15 a.m. UTC | #2
Hi Boris,

On 17/08/17 18:36, Boris Ostrovsky wrote:
> On 08/17/2017 12:14 PM, Julien Grall wrote:

>> When booting Linux as Xen guest with CONFIG_DEBUG_ATOMIC, the following

>> splat appears:

>>

>> [    0.002323] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)

>> [    0.019717] ASID allocator initialised with 65536 entries

>> [    0.020019] xen:grant_table: Grant tables using version 1 layout

>> [    0.020051] Grant table initialized

>> [    0.020069] BUG: sleeping function called from invalid context at /data/src/linux/mm/page_alloc.c:4046

>> [    0.020100] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0

>> [    0.020123] no locks held by swapper/0/1.

>> [    0.020143] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc5 #598

>> [    0.020166] Hardware name: FVP Base (DT)

>> [    0.020182] Call trace:

>> [    0.020199] [<ffff00000808a5c0>] dump_backtrace+0x0/0x270

>> [    0.020222] [<ffff00000808a95c>] show_stack+0x24/0x30

>> [    0.020244] [<ffff000008c1ef20>] dump_stack+0xb8/0xf0

>> [    0.020267] [<ffff0000081128c0>] ___might_sleep+0x1c8/0x1f8

>> [    0.020291] [<ffff000008112948>] __might_sleep+0x58/0x90

>> [    0.020313] [<ffff0000082171b8>] __alloc_pages_nodemask+0x1c0/0x12e8

>> [    0.020338] [<ffff00000827a110>] alloc_page_interleave+0x38/0x88

>> [    0.020363] [<ffff00000827a904>] alloc_pages_current+0xdc/0xf0

>> [    0.020387] [<ffff000008211f38>] __get_free_pages+0x28/0x50

>> [    0.020411] [<ffff0000086566a4>] evtchn_fifo_alloc_control_block+0x2c/0xa0

>> [    0.020437] [<ffff0000091747b0>] xen_evtchn_fifo_init+0x38/0xb4

>> [    0.020461] [<ffff0000091746c0>] xen_init_IRQ+0x44/0xc8

>> [    0.020484] [<ffff000009128adc>] xen_guest_init+0x250/0x300

>> [    0.020507] [<ffff000008083974>] do_one_initcall+0x44/0x130

>> [    0.020531] [<ffff000009120df8>] kernel_init_freeable+0x120/0x288

>> [    0.020556] [<ffff000008c31ca8>] kernel_init+0x18/0x110

>> [    0.020578] [<ffff000008083710>] ret_from_fork+0x10/0x40

>> [    0.020606] xen:events: Using FIFO-based ABI

>> [    0.020658] Xen: initializing cpu0

>> [    0.027727] Hierarchical SRCU implementation.

>> [    0.036235] EFI services will not be available.

>> [    0.043810] smp: Bringing up secondary CPUs ...

>>

>> This is because get_cpu() in xen_evtchn_fifo_init() will disable

>> preemption, but __get_free_page() might sleep (GFP_ATOMIC is not set).

>>

>> xen_evtchn_fifo_init() will always be called before SMP is initialized,

>> so {get,put}_cpu() could be replaced by a simple smp_processor_id().

>

> On x86 this will be called out of init_IRQ(), which is already preceded

> by preempt_disable().


Well the main problem is preempt_disable() itself. in_atomic() will 
check preempt_count and return 1 if it is non-zero.

__get_free_page might sleep if GFP_ATOMIC is not set and therefore you 
will see the splat when CONFIG_DEBUG_ATOMIC is enabled. However, those 
checks don't happen before the scheduler is setup. Hence why you don't 
see the error on x86.

Cheers,

>

> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

>


-- 
Julien Grall
Julien Grall Aug. 22, 2017, 4:15 p.m. UTC | #3
Hi,

Gentle ping. This patch was reviewed but not queued. Are we waiting for 
other reviewed?

Cheers,

On 18/08/17 11:15, Julien Grall wrote:
> Hi Boris,

>

> On 17/08/17 18:36, Boris Ostrovsky wrote:

>> On 08/17/2017 12:14 PM, Julien Grall wrote:

>>> When booting Linux as Xen guest with CONFIG_DEBUG_ATOMIC, the following

>>> splat appears:

>>>

>>> [    0.002323] Mountpoint-cache hash table entries: 1024 (order: 1,

>>> 8192 bytes)

>>> [    0.019717] ASID allocator initialised with 65536 entries

>>> [    0.020019] xen:grant_table: Grant tables using version 1 layout

>>> [    0.020051] Grant table initialized

>>> [    0.020069] BUG: sleeping function called from invalid context at

>>> /data/src/linux/mm/page_alloc.c:4046

>>> [    0.020100] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name:

>>> swapper/0

>>> [    0.020123] no locks held by swapper/0/1.

>>> [    0.020143] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc5 #598

>>> [    0.020166] Hardware name: FVP Base (DT)

>>> [    0.020182] Call trace:

>>> [    0.020199] [<ffff00000808a5c0>] dump_backtrace+0x0/0x270

>>> [    0.020222] [<ffff00000808a95c>] show_stack+0x24/0x30

>>> [    0.020244] [<ffff000008c1ef20>] dump_stack+0xb8/0xf0

>>> [    0.020267] [<ffff0000081128c0>] ___might_sleep+0x1c8/0x1f8

>>> [    0.020291] [<ffff000008112948>] __might_sleep+0x58/0x90

>>> [    0.020313] [<ffff0000082171b8>] __alloc_pages_nodemask+0x1c0/0x12e8

>>> [    0.020338] [<ffff00000827a110>] alloc_page_interleave+0x38/0x88

>>> [    0.020363] [<ffff00000827a904>] alloc_pages_current+0xdc/0xf0

>>> [    0.020387] [<ffff000008211f38>] __get_free_pages+0x28/0x50

>>> [    0.020411] [<ffff0000086566a4>]

>>> evtchn_fifo_alloc_control_block+0x2c/0xa0

>>> [    0.020437] [<ffff0000091747b0>] xen_evtchn_fifo_init+0x38/0xb4

>>> [    0.020461] [<ffff0000091746c0>] xen_init_IRQ+0x44/0xc8

>>> [    0.020484] [<ffff000009128adc>] xen_guest_init+0x250/0x300

>>> [    0.020507] [<ffff000008083974>] do_one_initcall+0x44/0x130

>>> [    0.020531] [<ffff000009120df8>] kernel_init_freeable+0x120/0x288

>>> [    0.020556] [<ffff000008c31ca8>] kernel_init+0x18/0x110

>>> [    0.020578] [<ffff000008083710>] ret_from_fork+0x10/0x40

>>> [    0.020606] xen:events: Using FIFO-based ABI

>>> [    0.020658] Xen: initializing cpu0

>>> [    0.027727] Hierarchical SRCU implementation.

>>> [    0.036235] EFI services will not be available.

>>> [    0.043810] smp: Bringing up secondary CPUs ...

>>>

>>> This is because get_cpu() in xen_evtchn_fifo_init() will disable

>>> preemption, but __get_free_page() might sleep (GFP_ATOMIC is not set).

>>>

>>> xen_evtchn_fifo_init() will always be called before SMP is initialized,

>>> so {get,put}_cpu() could be replaced by a simple smp_processor_id().

>>

>> On x86 this will be called out of init_IRQ(), which is already preceded

>> by preempt_disable().

>

> Well the main problem is preempt_disable() itself. in_atomic() will

> check preempt_count and return 1 if it is non-zero.

>

> __get_free_page might sleep if GFP_ATOMIC is not set and therefore you

> will see the splat when CONFIG_DEBUG_ATOMIC is enabled. However, those

> checks don't happen before the scheduler is setup. Hence why you don't

> see the error on x86.

>

> Cheers,

>

>>

>> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

>>

>


-- 
Julien Grall
Boris Ostrovsky Aug. 22, 2017, 8:14 p.m. UTC | #4
On 08/22/2017 12:15 PM, Julien Grall wrote:
> Hi,

>

> Gentle ping. This patch was reviewed but not queued. Are we waiting

> for other reviewed?


Applied to for-linus-4.14.

-boris
diff mbox

Patch

diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c
index 3c41470c7fc4..76b318e88382 100644
--- a/drivers/xen/events/events_fifo.c
+++ b/drivers/xen/events/events_fifo.c
@@ -432,12 +432,12 @@  static int xen_evtchn_cpu_dead(unsigned int cpu)
 
 int __init xen_evtchn_fifo_init(void)
 {
-	int cpu = get_cpu();
+	int cpu = smp_processor_id();
 	int ret;
 
 	ret = evtchn_fifo_alloc_control_block(cpu);
 	if (ret < 0)
-		goto out;
+		return ret;
 
 	pr_info("Using FIFO-based ABI\n");
 
@@ -446,7 +446,6 @@  int __init xen_evtchn_fifo_init(void)
 	cpuhp_setup_state_nocalls(CPUHP_XEN_EVTCHN_PREPARE,
 				  "xen/evtchn:prepare",
 				  xen_evtchn_cpu_prepare, xen_evtchn_cpu_dead);
-out:
-	put_cpu();
+
 	return ret;
 }