diff mbox

[Xen-devel] xen: domain_update_node_affinity: Correct the ASSERT

Message ID 1406302204-13992-1-git-send-email-julien.grall@linaro.org
State Superseded, archived
Headers show

Commit Message

Julien Grall July 25, 2014, 3:30 p.m. UTC
The commit "move domain to cpupool0 before destroying it" make Xen crashes
when a domain is destroyed with d->vcpus allocated but no VCPU initialized.

Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:452
Xen call trace:
    [<00207bd8>] domain_update_node_affinity+0x10c/0x238 (PC)
    [<00000004>] 00000004 (LR)
    [<00226870>] sched_move_domain+0x3cc/0x42c
    [<0020925c>] domain_kill+0xc8/0x178
    [<00206a0c>] do_domctl+0xaac/0x15e4
    [<002529c0>] do_trap_hypervisor+0xc5c/0xf94
    [<002559f0>] return_from_trap+0/0x4

Fix the ASSERT to check if d->vcpu is allocated and VCPU 0 is initialized.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Tim Deegan <tim@xen.org>

---
    This patch should be backported to Xen 4.4
---
 xen/common/domain.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jan Beulich July 25, 2014, 3:42 p.m. UTC | #1
>>> On 25.07.14 at 17:30, <julien.grall@linaro.org> wrote:
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -449,7 +449,7 @@ void domain_update_node_affinity(struct domain *d)
>          }
>          /* Filter out non-online cpus */
>          cpumask_and(dom_cpumask, dom_cpumask, online);
> -        ASSERT(!cpumask_empty(dom_cpumask));
> +        ASSERT( !d->vcpu || !d->vcpu[0] || !cpumask_empty(dom_cpumask));

Wouldn't it be better for the function to bail early in that case?

Jan
Julien Grall July 25, 2014, 3:44 p.m. UTC | #2
Hi Jan,

On 07/25/2014 04:42 PM, Jan Beulich wrote:
>>>> On 25.07.14 at 17:30, <julien.grall@linaro.org> wrote:
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -449,7 +449,7 @@ void domain_update_node_affinity(struct domain *d)
>>          }
>>          /* Filter out non-online cpus */
>>          cpumask_and(dom_cpumask, dom_cpumask, online);
>> -        ASSERT(!cpumask_empty(dom_cpumask));
>> +        ASSERT( !d->vcpu || !d->vcpu[0] || !cpumask_empty(dom_cpumask));
> 
> Wouldn't it be better for the function to bail early in that case?

I've no idea. I mostly followed the advice on this thread:

http://www.gossamer-threads.com/lists/xen/devel/340233

Regards,
Andrew Cooper July 25, 2014, 3:44 p.m. UTC | #3
On 25/07/14 16:30, Julien Grall wrote:
> The commit "move domain to cpupool0 before destroying it" make Xen crashes
> when a domain is destroyed with d->vcpus allocated but no VCPU initialized.
>
> Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:452
> Xen call trace:
>     [<00207bd8>] domain_update_node_affinity+0x10c/0x238 (PC)
>     [<00000004>] 00000004 (LR)
>     [<00226870>] sched_move_domain+0x3cc/0x42c
>     [<0020925c>] domain_kill+0xc8/0x178
>     [<00206a0c>] do_domctl+0xaac/0x15e4
>     [<002529c0>] do_trap_hypervisor+0xc5c/0xf94
>     [<002559f0>] return_from_trap+0/0x4
>
> Fix the ASSERT to check if d->vcpu is allocated and VCPU 0 is initialized.
>
> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Dario Faggioli <dario.faggioli@citrix.com>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Tim Deegan <tim@xen.org>
>
> ---
>     This patch should be backported to Xen 4.4
> ---
>  xen/common/domain.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index d7a84cf..188b769 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -449,7 +449,7 @@ void domain_update_node_affinity(struct domain *d)
>          }
>          /* Filter out non-online cpus */
>          cpumask_and(dom_cpumask, dom_cpumask, online);
> -        ASSERT(!cpumask_empty(dom_cpumask));
> +        ASSERT( !d->vcpu || !d->vcpu[0] || !cpumask_empty(dom_cpumask));

You don't actually care for the vcpu pointers themselves.  You only care
whether the domain has vcpus.

d->max_vcpus == 0 is a more appropriate check.

~Andrew

>          /* And compute the intersection between hard, online and soft */
>          cpumask_and(dom_cpumask_soft, dom_cpumask_soft, dom_cpumask);
>
Julien Grall July 25, 2014, 3:48 p.m. UTC | #4
On 07/25/2014 04:44 PM, Andrew Cooper wrote:
> On 25/07/14 16:30, Julien Grall wrote:
>> The commit "move domain to cpupool0 before destroying it" make Xen crashes
>> when a domain is destroyed with d->vcpus allocated but no VCPU initialized.
>>
>> Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:452
>> Xen call trace:
>>     [<00207bd8>] domain_update_node_affinity+0x10c/0x238 (PC)
>>     [<00000004>] 00000004 (LR)
>>     [<00226870>] sched_move_domain+0x3cc/0x42c
>>     [<0020925c>] domain_kill+0xc8/0x178
>>     [<00206a0c>] do_domctl+0xaac/0x15e4
>>     [<002529c0>] do_trap_hypervisor+0xc5c/0xf94
>>     [<002559f0>] return_from_trap+0/0x4
>>
>> Fix the ASSERT to check if d->vcpu is allocated and VCPU 0 is initialized.
>>
>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>> Cc: George Dunlap <george.dunlap@citrix.com>
>> Cc: Dario Faggioli <dario.faggioli@citrix.com>
>> Cc: Juergen Gross <jgross@suse.com>
>> Cc: Ian Campbell <ian.campbell@citrix.com>
>> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>> Cc: Jan Beulich <jbeulich@suse.com>
>> Cc: Keir Fraser <keir@xen.org>
>> Cc: Tim Deegan <tim@xen.org>
>>
>> ---
>>     This patch should be backported to Xen 4.4
>> ---
>>  xen/common/domain.c |    2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>> index d7a84cf..188b769 100644
>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -449,7 +449,7 @@ void domain_update_node_affinity(struct domain *d)
>>          }
>>          /* Filter out non-online cpus */
>>          cpumask_and(dom_cpumask, dom_cpumask, online);
>> -        ASSERT(!cpumask_empty(dom_cpumask));
>> +        ASSERT( !d->vcpu || !d->vcpu[0] || !cpumask_empty(dom_cpumask));
> 
> You don't actually care for the vcpu pointers themselves.  You only care
> whether the domain has vcpus.
> 
> d->max_vcpus == 0 is a more appropriate check.

Actually no... max_vcpus has been set before Xen is allocating
alloc_vcpu (see DOMCTL_max_vpus in common common/domctl.c).

So if Xen fails to allocate VCPU0, the hypervisor will still crash when
the domain is killed.

Regards,
Jan Beulich July 25, 2014, 3:59 p.m. UTC | #5
>>> On 25.07.14 at 17:44, <julien.grall@linaro.org> wrote:
> Hi Jan,
> 
> On 07/25/2014 04:42 PM, Jan Beulich wrote:
>>>>> On 25.07.14 at 17:30, <julien.grall@linaro.org> wrote:
>>> --- a/xen/common/domain.c
>>> +++ b/xen/common/domain.c
>>> @@ -449,7 +449,7 @@ void domain_update_node_affinity(struct domain *d)
>>>          }
>>>          /* Filter out non-online cpus */
>>>          cpumask_and(dom_cpumask, dom_cpumask, online);
>>> -        ASSERT(!cpumask_empty(dom_cpumask));
>>> +        ASSERT( !d->vcpu || !d->vcpu[0] || !cpumask_empty(dom_cpumask));
>> 
>> Wouldn't it be better for the function to bail early in that case?
> 
> I've no idea. I mostly followed the advice on this thread:
> 
> http://www.gossamer-threads.com/lists/xen/devel/340233 

Yeah, I recall that discussion. But looking at the function, nothing
useful will be done when the domain has no vCPU yet. Dario?

Jan
Andrew Cooper July 25, 2014, 4:05 p.m. UTC | #6
On 25/07/14 16:48, Julien Grall wrote:
> On 07/25/2014 04:44 PM, Andrew Cooper wrote:
>> On 25/07/14 16:30, Julien Grall wrote:
>>> The commit "move domain to cpupool0 before destroying it" make Xen crashes
>>> when a domain is destroyed with d->vcpus allocated but no VCPU initialized.
>>>
>>> Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:452
>>> Xen call trace:
>>>     [<00207bd8>] domain_update_node_affinity+0x10c/0x238 (PC)
>>>     [<00000004>] 00000004 (LR)
>>>     [<00226870>] sched_move_domain+0x3cc/0x42c
>>>     [<0020925c>] domain_kill+0xc8/0x178
>>>     [<00206a0c>] do_domctl+0xaac/0x15e4
>>>     [<002529c0>] do_trap_hypervisor+0xc5c/0xf94
>>>     [<002559f0>] return_from_trap+0/0x4
>>>
>>> Fix the ASSERT to check if d->vcpu is allocated and VCPU 0 is initialized.
>>>
>>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>>> Cc: George Dunlap <george.dunlap@citrix.com>
>>> Cc: Dario Faggioli <dario.faggioli@citrix.com>
>>> Cc: Juergen Gross <jgross@suse.com>
>>> Cc: Ian Campbell <ian.campbell@citrix.com>
>>> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>>> Cc: Jan Beulich <jbeulich@suse.com>
>>> Cc: Keir Fraser <keir@xen.org>
>>> Cc: Tim Deegan <tim@xen.org>
>>>
>>> ---
>>>     This patch should be backported to Xen 4.4
>>> ---
>>>  xen/common/domain.c |    2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/xen/common/domain.c b/xen/common/domain.c
>>> index d7a84cf..188b769 100644
>>> --- a/xen/common/domain.c
>>> +++ b/xen/common/domain.c
>>> @@ -449,7 +449,7 @@ void domain_update_node_affinity(struct domain *d)
>>>          }
>>>          /* Filter out non-online cpus */
>>>          cpumask_and(dom_cpumask, dom_cpumask, online);
>>> -        ASSERT(!cpumask_empty(dom_cpumask));
>>> +        ASSERT( !d->vcpu || !d->vcpu[0] || !cpumask_empty(dom_cpumask));
>> You don't actually care for the vcpu pointers themselves.  You only care
>> whether the domain has vcpus.
>>
>> d->max_vcpus == 0 is a more appropriate check.
> Actually no... max_vcpus has been set before Xen is allocating
> alloc_vcpu (see DOMCTL_max_vpus in common common/domctl.c).
>
> So if Xen fails to allocate VCPU0, the hypervisor will still crash when
> the domain is killed.
>
> Regards,
>

Ah yes - I had mis-remembered the problem at hand.  Sorry for the noise.

~Andrew
Dario Faggioli July 28, 2014, 5:23 p.m. UTC | #7
On ven, 2014-07-25 at 16:59 +0100, Jan Beulich wrote:
> >>> On 25.07.14 at 17:44, <julien.grall@linaro.org> wrote:
> > Hi Jan,
> > 
> > On 07/25/2014 04:42 PM, Jan Beulich wrote:
> >>>>> On 25.07.14 at 17:30, <julien.grall@linaro.org> wrote:
> >>> --- a/xen/common/domain.c
> >>> +++ b/xen/common/domain.c
> >>> @@ -449,7 +449,7 @@ void domain_update_node_affinity(struct domain *d)
> >>>          }
> >>>          /* Filter out non-online cpus */
> >>>          cpumask_and(dom_cpumask, dom_cpumask, online);
> >>> -        ASSERT(!cpumask_empty(dom_cpumask));
> >>> +        ASSERT( !d->vcpu || !d->vcpu[0] || !cpumask_empty(dom_cpumask));
> >> 
> >> Wouldn't it be better for the function to bail early in that case?
> > 
> > I've no idea. I mostly followed the advice on this thread:
> > 
> > http://www.gossamer-threads.com/lists/xen/devel/340233 
> 
> Yeah, I recall that discussion. But looking at the function, nothing
> useful will be done when the domain has no vCPU yet. Dario?
> 
For sure nothing useful happens if the for_each_vcpu() loop never
executes, because of lack of vcpus.

Functionally wise, bailing or going ahead, but avoiding the ASSERT to
trigger (as this patch does) is exactly the same.

I'm not sure which approach I personally prefer... If calling the
function without any allocated vcpus is a "legitimate" and frequent
enough use case, I'd say let's bail explicitly. If it's a corner case,
then I think this patch is ok.

It looks to me like we're in the latter situation (corner case), so I'm
actually fine with this patch. That's why I'm acking it, but really,
it's a matter of taste.. FWIW, I'd ack a version that bails too.

Regards,
Dario
Dario Faggioli July 28, 2014, 5:25 p.m. UTC | #8
On ven, 2014-07-25 at 16:30 +0100, Julien Grall wrote:
> The commit "move domain to cpupool0 before destroying it" make Xen crashes
> when a domain is destroyed with d->vcpus allocated but no VCPU initialized.
> 
The title of the commit is certainly useful. Perhaps the beginning of
the hash would have been too.

> Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:452
> Xen call trace:
>     [<00207bd8>] domain_update_node_affinity+0x10c/0x238 (PC)
>     [<00000004>] 00000004 (LR)
>     [<00226870>] sched_move_domain+0x3cc/0x42c
>     [<0020925c>] domain_kill+0xc8/0x178
>     [<00206a0c>] do_domctl+0xaac/0x15e4
>     [<002529c0>] do_trap_hypervisor+0xc5c/0xf94
>     [<002559f0>] return_from_trap+0/0x4
> 
> Fix the ASSERT to check if d->vcpu is allocated and VCPU 0 is initialized.
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Dario Faggioli <dario.faggioli@citrix.com>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Keir Fraser <keir@xen.org>
> Cc: Tim Deegan <tim@xen.org>
> 
In any case:

Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Julien Grall July 28, 2014, 5:31 p.m. UTC | #9
Hi Dario,

On 07/28/2014 06:25 PM, Dario Faggioli wrote:
> On ven, 2014-07-25 at 16:30 +0100, Julien Grall wrote:
>> The commit "move domain to cpupool0 before destroying it" make Xen crashes
>> when a domain is destroyed with d->vcpus allocated but no VCPU initialized.
>>
> The title of the commit is certainly useful. Perhaps the beginning of
> the hash would have been too.

I didn't add the commit. Because it will be backported to Xen 4.4.

Would it be fine to point to a Xen 4.5 commit in Xen 4.4 branch?

>> Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:452
>> Xen call trace:
>>     [<00207bd8>] domain_update_node_affinity+0x10c/0x238 (PC)
>>     [<00000004>] 00000004 (LR)
>>     [<00226870>] sched_move_domain+0x3cc/0x42c
>>     [<0020925c>] domain_kill+0xc8/0x178
>>     [<00206a0c>] do_domctl+0xaac/0x15e4
>>     [<002529c0>] do_trap_hypervisor+0xc5c/0xf94
>>     [<002559f0>] return_from_trap+0/0x4
>>
>> Fix the ASSERT to check if d->vcpu is allocated and VCPU 0 is initialized.
>>
>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>> Cc: George Dunlap <george.dunlap@citrix.com>
>> Cc: Dario Faggioli <dario.faggioli@citrix.com>
>> Cc: Juergen Gross <jgross@suse.com>
>> Cc: Ian Campbell <ian.campbell@citrix.com>
>> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>> Cc: Jan Beulich <jbeulich@suse.com>
>> Cc: Keir Fraser <keir@xen.org>
>> Cc: Tim Deegan <tim@xen.org>
>>
> In any case:
> 
> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

Thanks!

Regards,
Jan Beulich July 29, 2014, 6:40 a.m. UTC | #10
>>> On 28.07.14 at 19:31, <julien.grall@linaro.org> wrote:
> On 07/28/2014 06:25 PM, Dario Faggioli wrote:
>> On ven, 2014-07-25 at 16:30 +0100, Julien Grall wrote:
>>> The commit "move domain to cpupool0 before destroying it" make Xen crashes
>>> when a domain is destroyed with d->vcpus allocated but no VCPU initialized.
>>>
>> The title of the commit is certainly useful. Perhaps the beginning of
>> the hash would have been too.
> 
> I didn't add the commit. Because it will be backported to Xen 4.4.
> 
> Would it be fine to point to a Xen 4.5 commit in Xen 4.4 branch?

Yes. Backports are (supposed to be) identifiable (via some kind of
reference to the backported commit), which would implicitly mean
any commit IDs referenced in the description refer to the master
branch. Furthermore, if someone wanted to look up the
referenced commit, having the (beginning of the) hash in hands
makes this a much more immediate action than just the title; I
believe such a lookup would even succeed on any of the stable
branches, as - luckily - we decided to make them branches rather
than standalone trees.

Jan
Julien Grall July 29, 2014, 10:46 a.m. UTC | #11
Hi Jan,

On 07/29/2014 07:40 AM, Jan Beulich wrote:
>>>> On 28.07.14 at 19:31, <julien.grall@linaro.org> wrote:
>> On 07/28/2014 06:25 PM, Dario Faggioli wrote:
>>> On ven, 2014-07-25 at 16:30 +0100, Julien Grall wrote:
>>>> The commit "move domain to cpupool0 before destroying it" make Xen crashes
>>>> when a domain is destroyed with d->vcpus allocated but no VCPU initialized.
>>>>
>>> The title of the commit is certainly useful. Perhaps the beginning of
>>> the hash would have been too.
>>
>> I didn't add the commit. Because it will be backported to Xen 4.4.
>>
>> Would it be fine to point to a Xen 4.5 commit in Xen 4.4 branch?
> 
> Yes. Backports are (supposed to be) identifiable (via some kind of
> reference to the backported commit), which would implicitly mean
> any commit IDs referenced in the description refer to the master
> branch. Furthermore, if someone wanted to look up the
> referenced commit, having the (beginning of the) hash in hands
> makes this a much more immediate action than just the title; I
> believe such a lookup would even succeed on any of the stable
> branches, as - luckily - we decided to make them branches rather
> than standalone trees.

Thank you for the explanation. I will send a v2 today with the commit ID
of xen unstable in the message.

Regards,
diff mbox

Patch

diff --git a/xen/common/domain.c b/xen/common/domain.c
index d7a84cf..188b769 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -449,7 +449,7 @@  void domain_update_node_affinity(struct domain *d)
         }
         /* Filter out non-online cpus */
         cpumask_and(dom_cpumask, dom_cpumask, online);
-        ASSERT(!cpumask_empty(dom_cpumask));
+        ASSERT( !d->vcpu || !d->vcpu[0] || !cpumask_empty(dom_cpumask));
         /* And compute the intersection between hard, online and soft */
         cpumask_and(dom_cpumask_soft, dom_cpumask_soft, dom_cpumask);