xen/events: xen_evtchn_fifo_init can be called very late

Message ID 1390869269-12502-1-git-send-email-julien.grall@linaro.org
State Rejected, archived

Commit Message

Julien Grall Jan. 28, 2014, 12:34 a.m. UTC
On ARM, xen_init_IRQ (which calls xen_evtchn_fifo_init) is called after
all CPUs are online. As a result, the CPU notifier registered by
xen_evtchn_fifo_init is never called for the secondary CPUs.

Therefore, when a secondary CPU receives an interrupt, Linux crashes
because the event channel structure for that processor is not initialized.

This can be fixed by calling the init function on every online CPU when the
event channel FIFO driver is initialized.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
---
 drivers/xen/events/events_fifo.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)
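
The ordering problem described in the commit message can be illustrated with a small user-space model. This is only a sketch, not kernel code: NR_CPUS, bring_up_cpu(), fifo_init_current_only(), fifo_init_all_online() and the other names are hypothetical stand-ins, and the single callback pointer merely imitates a CPU notifier. It assumes the notifier only fires for CPUs that come online after it has been registered.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical CPU count for the model */

static bool cpu_online[NR_CPUS];
static bool block_ready[NR_CPUS];	/* models "control block initialized" */
static void (*cpu_up_callback)(int cpu);	/* models the CPU notifier */

/* Stand-in for evtchn_fifo_init_control_block(); never fails here. */
static void init_control_block(int cpu)
{
	block_ready[cpu] = true;
}

static void reset_state(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		cpu_online[cpu] = false;
		block_ready[cpu] = false;
	}
	cpu_up_callback = NULL;
}

/* The notifier only runs for CPUs that come up after registration. */
static void bring_up_cpu(int cpu)
{
	cpu_online[cpu] = true;
	if (cpu_up_callback)
		cpu_up_callback(cpu);
}

/* Behaviour before the patch: init the current CPU, then register. */
static void fifo_init_current_only(int current_cpu)
{
	init_control_block(current_cpu);
	cpu_up_callback = init_control_block;
}

/* Behaviour with the patch: init every online CPU, then register. */
static void fifo_init_all_online(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			init_control_block(cpu);
	cpu_up_callback = init_control_block;
}

/* Number of online CPUs that would fault on their first event. */
static int count_online_without_block(void)
{
	int cpu, missing = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu] && !block_ready[cpu])
			missing++;
	return missing;
}
```

In this model, bringing up all four CPUs before calling fifo_init_current_only(0) leaves three online CPUs without a control block, while fifo_init_all_online() leaves none. The model ignores the failure path, which is what the review comments focus on.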

Comments

Stefano Stabellini Jan. 28, 2014, 2:30 p.m. UTC | #1
On Tue, 28 Jan 2014, David Vrabel wrote:
> On 28/01/14 00:34, Julien Grall wrote:
> > On ARM, xen_init_IRQ (which calls xen_evtchn_fifo_init) is called after
> > all CPUs are online. It would mean that the notifier will never be called.
> 
> Why does ARM call xen_init_IRQ() so late?  Is it possible to call it
> earlier when only the boot CPU is online?  There are problems with
> attempting to init FIFO event channels after all CPUs are online.
> 
> If evtchn_fifo_init_control_block(cpu) fails on anything other than the
> first CPU, that CPU will be unable to receive any events.  Xen will have
> been switched to FIFO mode and it is not possible to revert back to
> 2-level mode.

On ARM we simply didn't need xen_init_IRQ() to be called that early.
Most of xen_guest_init() could be moved to an early_initcall, if that is
necessary.

> > Therefore, when a secondary CPU will receive an interrupt, Linux will segfault
> > because the event channel structure for this processor is not initialized.
> > 
> > This can be fixed by calling the init function on every online cpu when the
> > event channel fifo driver is initialized.
> > 
> > Signed-off-by: Julien Grall <julien.grall@linaro.org>
> > ---
> >  drivers/xen/events/events_fifo.c |   11 ++++++-----
> >  1 file changed, 6 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c
> > index 1de2a19..15498ab 100644
> > --- a/drivers/xen/events/events_fifo.c
> > +++ b/drivers/xen/events/events_fifo.c
> > @@ -410,12 +410,14 @@ static struct notifier_block evtchn_fifo_cpu_notifier = {
> >  
> >  int __init xen_evtchn_fifo_init(void)
> >  {
> > -	int cpu = get_cpu();
> > +	int cpu;
> >  	int ret;
> >  
> > -	ret = evtchn_fifo_init_control_block(cpu);
> > -	if (ret < 0)
> > -		goto out;
> > +	for_each_online_cpu(cpu) {
> > +		ret = evtchn_fifo_init_control_block(cpu);
> > +		if (ret < 0)
> > +			goto out;
> 
> You need to handle this error differently depending on whether the first
> call fails or not.
> 
> Failure on first CPU: return an error and the caller will fallback to
> using 2-level mode.
> 
> Failure on second or later CPU: you need to offline that CPU.  It may
> not be possible to offline a CPU with standard calls (e.g., cpu_down())
> as it won't have working interrupts.
> 
> David
> 
Julien Grall Jan. 28, 2014, 2:36 p.m. UTC | #2
On 01/28/2014 02:30 PM, Stefano Stabellini wrote:
> On Tue, 28 Jan 2014, David Vrabel wrote:
>> On 28/01/14 00:34, Julien Grall wrote:
>>> On ARM, xen_init_IRQ (which calls xen_evtchn_fifo_init) is called after
>>> all CPUs are online. It would mean that the notifier will never be called.
>>
>> Why does ARM call xen_init_IRQ() so late?  Is it possible to call it
>> earlier when only the boot CPU is online?  There are problems with
>> attempting to init FIFO event channels after all CPUs are online.
>>
>> If evtchn_fifo_init_control_block(cpu) fails on anything other than the
>> first CPU, that CPU will be unable to receive any events.  Xen will have
>> been switched to FIFO mode and it is not possible to revert back to
>> 2-level mode.
> 
> We simply didn't need to be called that early.
> Most of xen_guest_init could be moved to an early_initcall, if that is
> necessary.
> 

I'm actually working on a patch to move xen_init_IRQ() into an early_initcall.

Patch

diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c
index 1de2a19..15498ab 100644
--- a/drivers/xen/events/events_fifo.c
+++ b/drivers/xen/events/events_fifo.c
@@ -410,12 +410,14 @@  static struct notifier_block evtchn_fifo_cpu_notifier = {
 
 int __init xen_evtchn_fifo_init(void)
 {
-	int cpu = get_cpu();
+	int cpu;
 	int ret;
 
-	ret = evtchn_fifo_init_control_block(cpu);
-	if (ret < 0)
-		goto out;
+	for_each_online_cpu(cpu) {
+		ret = evtchn_fifo_init_control_block(cpu);
+		if (ret < 0)
+			goto out;
+	}
 
 	pr_info("Using FIFO-based ABI\n");
 
@@ -423,6 +425,5 @@  int __init xen_evtchn_fifo_init(void)
 
 	register_cpu_notifier(&evtchn_fifo_cpu_notifier);
 out:
-	put_cpu();
 	return ret;
 }
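
The patch above treats a failure on any CPU the same way. The distinction David Vrabel asks for in the review could be sketched roughly as follows. This is a user-space model under stated assumptions, not a proposed kernel change: fifo_init_all(), enum fifo_init_result, init_block_fn and fake_init_control_block() are hypothetical names, and the sketch only records which CPUs would need to be taken offline, since the review notes that cpu_down() may not work for a CPU without working interrupts.

```c
#include <stdbool.h>
#include <stddef.h>

/* Outcome of initializing FIFO event channels on a set of online CPUs. */
enum fifo_init_result {
	FIFO_INIT_OK,		/* every CPU got a control block */
	FIFO_INIT_FALLBACK,	/* first call failed: caller may use 2-level */
	FIFO_INIT_DEGRADED,	/* a later call failed: FIFO mode is already on */
};

/* Hypothetical stand-in for evtchn_fifo_init_control_block(cpu). */
typedef int (*init_block_fn)(int cpu);

static enum fifo_init_result fifo_init_all(const int *cpus, size_t ncpus,
					   init_block_fn init,
					   bool *needs_offline)
{
	enum fifo_init_result result = FIFO_INIT_OK;
	size_t i;

	for (i = 0; i < ncpus; i++)
		needs_offline[i] = false;

	for (i = 0; i < ncpus; i++) {
		if (init(cpus[i]) >= 0)
			continue;

		/* First call failed: nothing has been switched yet. */
		if (i == 0)
			return FIFO_INIT_FALLBACK;

		/*
		 * A later call failed: per the review, Xen cannot revert
		 * to 2-level mode, so this CPU cannot receive events and
		 * has to be flagged for offlining by some other means.
		 */
		needs_offline[i] = true;
		result = FIFO_INIT_DEGRADED;
	}
	return result;
}

/* Demo stub: fails only for the CPU selected by failing_cpu. */
static int failing_cpu = -1;

static int fake_init_control_block(int cpu)
{
	return cpu == failing_cpu ? -1 : 0;
}
```

With four CPUs and failing_cpu set to 0, the sketch reports FIFO_INIT_FALLBACK and flags no CPU. With failing_cpu set to 2, it reports FIFO_INIT_DEGRADED and flags only that CPU. How such a CPU would actually be taken offline is left open, as it is in the thread.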