
[Xen-devel] Xen crashing when killing a domain with no VCPUs allocated

Message ID 53C982FF.7070608@linaro.org
State Not Applicable, archived

Commit Message

Julien Grall July 18, 2014, 8:26 p.m. UTC
On 18/07/14 17:39, Ian Campbell wrote:
> On Fri, 2014-07-18 at 14:27 +0100, Julien Grall wrote:
>> Hi all,
>>
>> I've been playing with the function alloc_vcpu on ARM, and I hit one case
>> where this function can fail.
>>
>> During domain creation, the toolstack will call DOMCTL_max_vcpus which may
>> fail, for instance because alloc_vcpu didn't succeed. In this case, the
>> toolstack will call DOMCTL_domaindestroy. And I got the below stack trace.
>>
>> It can be reproduced on Xen 4.5 (and I also suspect Xen 4.4) by returning
>> an error in vcpu_initialize.
>>
>> I'm not sure how to correctly fix it.
> 
> I think a simple check at the head of the function would be ok.
> 
> Alternatively perhaps in sched_move_domain, which could either detect
> this or could detect a domain in pool0 being moved to pool0 and short
> circuit.

I was thinking about the small fix below. If it's fine for everyone, I can
send a patch next week.



Regards,
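
(A minimal sketch of the "simple check at the head of the function" alternative mentioned above, assuming the function in question is domain_update_node_affinity(); an illustration only, not what the thread eventually settled on.)

void domain_update_node_affinity(struct domain *d)
{
    /* Hypothetical early-exit guard: nothing to recalculate while the
     * domain has no vcpus allocated yet. */
    if ( !d->vcpu || !d->vcpu[0] )
        return;

    /* ... existing node-affinity calculation ... */
}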

Comments

George Dunlap July 21, 2014, 10:33 a.m. UTC | #1
On 07/18/2014 09:26 PM, Julien Grall wrote:
>
> On 18/07/14 17:39, Ian Campbell wrote:
>> On Fri, 2014-07-18 at 14:27 +0100, Julien Grall wrote:
>>> Hi all,
>>>
>>> I've been playing with the function alloc_vcpu on ARM, and I hit one case
>>> where this function can fail.
>>>
>>> During domain creation, the toolstack will call DOMCTL_max_vcpus which may
>>> fail, for instance because alloc_vcpu didn't succeed. In this case, the
>>> toolstack will call DOMCTL_domaindestroy. And I got the below stack trace.
>>>
>>> It can be reproduced on Xen 4.5 (and I also suspect Xen 4.4) by returning
>>> an error in vcpu_initialize.
>>>
>>> I'm not sure how to correctly fix it.
>> I think a simple check at the head of the function would be ok.
>>
>> Alternatively perhaps in sched_move_domain, which could either detect
>> this or could detect a domain in pool0 being moved to pool0 and short
>> circuit.
> I was thinking about the small fix below. If it's fine for everyone, I can
> send a patch next week.
>
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index e9eb0bc..c44d047 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -311,7 +311,7 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
>       }
>   
>       /* Do we have vcpus already? If not, no need to update node-affinity */
> -    if ( d->vcpu )
> +    if ( d->vcpu && d->vcpu[0] != NULL )
>           domain_update_node_affinity(d);

So is the problem that we're allocating the vcpu array area, but not 
putting any vcpus in it?

Overall it seems like those checks for the existence of vcpus should be 
moved into domain_update_node_affinity().  The ASSERT() there I think is 
just a sanity check to make sure we're not getting a ridiculous result 
out of our calculation; but of course if there actually are no vcpus, 
it's not ridiculous at all.

One solution might be to change the ASSERT to 
ASSERT(!cpumask_empty(dom_cpumask) || !d->vcpu || !d->vcpu[0]).  Then we 
could probably even remove the d->vcpu conditional when calling it.

  -George
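
(A sketch of how the suggestion above might look inside domain_update_node_affinity(); the dom_cpumask name is taken from the quoted ASSERT, the surrounding lines are assumed.)

void domain_update_node_affinity(struct domain *d)
{
    /* ... compute dom_cpumask from the vcpus' affinity masks ... */

    /* Relaxed sanity check: an empty mask is fine when the domain has no
     * vcpus yet (d->vcpu not allocated, or allocated but all NULL). */
    ASSERT(!cpumask_empty(dom_cpumask) || !d->vcpu || !d->vcpu[0]);

    /* ... */
}
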
Andrew Cooper July 21, 2014, 10:42 a.m. UTC | #2
On 21/07/14 11:33, George Dunlap wrote:
> On 07/18/2014 09:26 PM, Julien Grall wrote:
>>
>> On 18/07/14 17:39, Ian Campbell wrote:
>>> On Fri, 2014-07-18 at 14:27 +0100, Julien Grall wrote:
>>>> Hi all,
>>>>
>>>> I've been playing with the function alloc_vcpu on ARM, and I hit one
>>>> case
>>>> where this function can fail.
>>>>
>>>> During domain creation, the toolstack will call DOMCTL_max_vcpus
>>>> which may
>>>> fail, for instance because alloc_vcpu didn't succeed. In this case,
>>>> the
>>>> toolstack will call DOMCTL_domaindestroy. And I got the below stack
>>>> trace.
>>>>
>>>> It can be reproduced on Xen 4.5 (and I also suspect Xen 4.4) by
>>>> returning
>>>> an error in vcpu_initialize.
>>>>
>>>> I'm not sure how to correctly fix it.
>>> I think a simple check at the head of the function would be ok.
>>>
>>> Alternatively perhaps in sched_move_domain, which could either detect
>>> this or could detect a domain in pool0 being moved to pool0 and short
>>> circuit.
>> I was thinking about the small fix below. If it's fine for everyone,
>> I can
>> send a patch next week.
>>
>> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
>> index e9eb0bc..c44d047 100644
>> --- a/xen/common/schedule.c
>> +++ b/xen/common/schedule.c
>> @@ -311,7 +311,7 @@ int sched_move_domain(struct domain *d, struct
>> cpupool *c)
>>       }
>>         /* Do we have vcpus already? If not, no need to update
>> node-affinity */
>> -    if ( d->vcpu )
>> +    if ( d->vcpu && d->vcpu[0] != NULL )
>>           domain_update_node_affinity(d);
>
> So is the problem that we're allocating the vcpu array area, but not
> putting any vcpus in it?

The problem (as I recall) was that domain_create() got midway through
and alloc_vcpu(0) failed with -ENOMEM.  Following that failure, the
toolstack called domain_destroy().

Having d->vcpu properly allocated but containing only NULL pointers is
a valid position to be in, especially in error or teardown paths.

>
> Overall it seems like those checks for the existence of vcpus should be
> moved into domain_update_node_affinity().  The ASSERT() there I think
> is just a sanity check to make sure we're not getting a ridiculous
> result out of our calculation; but of course if there actually are no
> vcpus, it's not ridiculous at all.
>
> One solution might be to change the ASSERT to
> ASSERT(!cpumask_empty(dom_cpumask) || !d->vcpu || !d->vcpu[0]).  Then
> we could probably even remove the d->vcpu conditional when calling it.

If you were going along this line, the pointer checks are substantially
less expensive than cpumask_empty(), so the ||'s should be reordered. 
However, I am not convinced that it is necessarily the best solution,
given my previous observation.

~Andrew
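
(The same check with the reordering above applied, so the cheap pointer tests short-circuit before cpumask_empty() is evaluated; a sketch only.)

ASSERT(!d->vcpu || !d->vcpu[0] || !cpumask_empty(dom_cpumask));
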
George Dunlap July 21, 2014, 10:49 a.m. UTC | #3
On 07/21/2014 11:42 AM, Andrew Cooper wrote:
> On 21/07/14 11:33, George Dunlap wrote:
>> On 07/18/2014 09:26 PM, Julien Grall wrote:
>>> On 18/07/14 17:39, Ian Campbell wrote:
>>>> On Fri, 2014-07-18 at 14:27 +0100, Julien Grall wrote:
>>>>> Hi all,
>>>>>
>>>>> I've been playing with the function alloc_vcpu on ARM, and I hit one
>>>>> case
>>>>> where this function can fail.
>>>>>
>>>>> During domain creation, the toolstack will call DOMCTL_max_vcpus
>>>>> which may
>>>>> fail, for instance because alloc_vcpu didn't succeed. In this case,
>>>>> the
>>>>> toolstack will call DOMCTL_domaindestroy. And I got the below stack
>>>>> trace.
>>>>>
>>>>> It can be reproduced on Xen 4.5 (and I also suspect Xen 4.4) by
>>>>> returning
>>>>> an error in vcpu_initialize.
>>>>>
>>>>> I'm not sure how to correctly fix it.
>>>> I think a simple check at the head of the function would be ok.
>>>>
>>>> Alternatively perhaps in sched_move_domain, which could either detect
>>>> this or could detect a domain in pool0 being moved to pool0 and short
>>>> circuit.
>>> I was thinking about the small fix below. If it's fine for everyone,
>>> I can
>>> send a patch next week.
>>>
>>> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
>>> index e9eb0bc..c44d047 100644
>>> --- a/xen/common/schedule.c
>>> +++ b/xen/common/schedule.c
>>> @@ -311,7 +311,7 @@ int sched_move_domain(struct domain *d, struct
>>> cpupool *c)
>>>        }
>>>          /* Do we have vcpus already? If not, no need to update
>>> node-affinity */
>>> -    if ( d->vcpu )
>>> +    if ( d->vcpu && d->vcpu[0] != NULL )
>>>            domain_update_node_affinity(d);
>> So is the problem that we're allocating the vcpu array area, but not
>> putting any vcpus in it?
> The problem (as I recall) was that domain_create() got midway through
> and alloc_vcpu(0) failed with -ENOMEM.  Following that failure, the
> toolstack called domain_destroy().
>
> Having d->vcpu properly allocated but containing only NULL pointers is
> a valid position to be in, especially in error or teardown paths.
>
>> Overall it seems like those checks for the existence of vcpus should be
>> moved into domain_update_node_affinity().  The ASSERT() there I think
>> is just a sanity check to make sure we're not getting a ridiculous
>> result out of our calculation; but of course if there actually are no
>> vcpus, it's not ridiculous at all.
>>
>> One solution might be to change the ASSERT to
>> ASSERT(!cpumask_empty(dom_cpumask) || !d->vcpu || !d->vcpu[0]).  Then
>> we could probably even remove the d->vcpu conditional when calling it.
> If you were going along this line, the pointer checks are substantially
> less expensive than cpumask_empty(), so the ||'s should be reordered.
> However, I am not convinced that it is necessarily the best solution,
> given my previous observation.

Er, I was with you until the last part.  What's wrong with changing the 
assert from "Make sure I have *something* in there" to "Make sure I have 
*something* in there *if I have any vcpus*"?  That seems to be accepting 
that having d->vcpu allocated but full of null pointers is a valid 
condition.

  -George
Julien Grall July 21, 2014, 11:46 a.m. UTC | #4
Hi George,

On 07/21/2014 11:33 AM, George Dunlap wrote:
> On 07/18/2014 09:26 PM, Julien Grall wrote:
>>
>> On 18/07/14 17:39, Ian Campbell wrote:
>>> On Fri, 2014-07-18 at 14:27 +0100, Julien Grall wrote:
>>>> Hi all,
>>>>
>>>> I've been playing with the function alloc_vcpu on ARM, and I hit one
>>>> case
>>>> where this function can fail.
>>>>
>>>> During domain creation, the toolstack will call DOMCTL_max_vcpus
>>>> which may
>>>> fail, for instance because alloc_vcpu didn't succeed. In this case, the
>>>> toolstack will call DOMCTL_domaindestroy. And I got the below stack
>>>> trace.
>>>>
>>>> It can be reproduced on Xen 4.5 (and I also suspect Xen 4.4) by
>>>> returning
>>>> an error in vcpu_initialize.
>>>>
>>>> I'm not sure how to correctly fix it.
>>> I think a simple check at the head of the function would be ok.
>>>
>>> Alternatively perhaps in sched_move_domain, which could either detect
>>> this or could detect a domain in pool0 being moved to pool0 and short
>>> circuit.
>> I was thinking about the small fix below. If it's fine for everyone, I
>> can
>> send a patch next week.
>>
>> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
>> index e9eb0bc..c44d047 100644
>> --- a/xen/common/schedule.c
>> +++ b/xen/common/schedule.c
>> @@ -311,7 +311,7 @@ int sched_move_domain(struct domain *d, struct
>> cpupool *c)
>>       }
>>         /* Do we have vcpus already? If not, no need to update
>> node-affinity */
>> -    if ( d->vcpu )
>> +    if ( d->vcpu && d->vcpu[0] != NULL )
>>           domain_update_node_affinity(d);
> 
> So is the problem that we're allocating the vcpu array area, but not
> putting any vcpus in it?

Yes.


> Overall it seems like those checks for the existence of vcpus should be
> moved into domain_update_node_affinity().  The ASSERT() there I think is
> just a sanity check to make sure we're not getting a ridiculous result
> out of our calculation; but of course if there actually are no vcpus,
> it's not ridiculous at all.
> 
> One solution might be to change the ASSERT to
> ASSERT(!cpumask_empty(dom_cpumask) || !d->vcpu || !d->vcpu[0]).  Then we
> could probably even remove the d->vcpu conditional when calling it.

This solution also works for me. Which change do you prefer?

Regards,
Dario Faggioli July 21, 2014, 12:57 p.m. UTC | #5
On Mon, 2014-07-21 at 12:46 +0100, Julien Grall wrote:
> On 07/21/2014 11:33 AM, George Dunlap wrote:
> > On 07/18/2014 09:26 PM, Julien Grall wrote:

> >> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> >> index e9eb0bc..c44d047 100644
> >> --- a/xen/common/schedule.c
> >> +++ b/xen/common/schedule.c
> >> @@ -311,7 +311,7 @@ int sched_move_domain(struct domain *d, struct
> >> cpupool *c)
> >>       }
> >>         /* Do we have vcpus already? If not, no need to update
> >> node-affinity */
> >> -    if ( d->vcpu )
> >> +    if ( d->vcpu && d->vcpu[0] != NULL )
> >>           domain_update_node_affinity(d);
> > 

> > Overall it seems like those checks for the existence of vcpus should be
> > moved into domain_update_node_affinity().  The ASSERT() there I think is
> > just a sanity check to make sure we're not getting a ridiculous result
> > out of our calculation; but of course if there actually are no vcpus,
> > it's not ridiculous at all.
> > 
> > One solution might be to change the ASSERT to
> > ASSERT(!cpumask_empty(dom_cpumask) || !d->vcpu || !d->vcpu[0]).  Then we
> > could probably even remove the d->vcpu conditional when calling it.
> 
> This solution also works for me. Which change do you prefer?
> 
FWIW, I think I like changing the ASSERT() in
domain_update_node_affinity(), as George suggested (and perhaps with the
reordering Andrew suggested) better.

Regards,
Dario
Jan Beulich July 23, 2014, 3:31 p.m. UTC | #6
>>> On 21.07.14 at 14:57, <dario.faggioli@citrix.com> wrote:
> On Mon, 2014-07-21 at 12:46 +0100, Julien Grall wrote:
>> On 07/21/2014 11:33 AM, George Dunlap wrote:
>> > On 07/18/2014 09:26 PM, Julien Grall wrote:
> 
>> >> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
>> >> index e9eb0bc..c44d047 100644
>> >> --- a/xen/common/schedule.c
>> >> +++ b/xen/common/schedule.c
>> >> @@ -311,7 +311,7 @@ int sched_move_domain(struct domain *d, struct
>> >> cpupool *c)
>> >>       }
>> >>         /* Do we have vcpus already? If not, no need to update
>> >> node-affinity */
>> >> -    if ( d->vcpu )
>> >> +    if ( d->vcpu && d->vcpu[0] != NULL )
>> >>           domain_update_node_affinity(d);
>> > 
> 
>> > Overall it seems like those checks for the existence of vcpus should be
>> > moved into domain_update_node_affinity().  The ASSERT() there I think is
>> > just a sanity check to make sure we're not getting a ridiculous result
>> > out of our calculation; but of course if there actually are no vcpus,
>> > it's not ridiculous at all.
>> > 
>> > One solution might be to change the ASSERT to
>> > ASSERT(!cpumask_empty(dom_cpumask) || !d->vcpu || !d->vcpu[0]).  Then we
>> > could probably even remove the d->vcpu conditional when calling it.
>> 
>> This solution also works for me. Which change do you prefer?
>> 
> FWIW, I think I like changing the ASSERT() in
> domain_update_node_affinity(), as George suggested (and perhaps with the
> reordering Andrew suggested) better.

+1

Jan
Julien Grall July 24, 2014, 2:04 p.m. UTC | #7
Hi,

On 07/23/2014 04:31 PM, Jan Beulich wrote:
>>>> On 21.07.14 at 14:57, <dario.faggioli@citrix.com> wrote:
>> On Mon, 2014-07-21 at 12:46 +0100, Julien Grall wrote:
>>> On 07/21/2014 11:33 AM, George Dunlap wrote:
>>>> On 07/18/2014 09:26 PM, Julien Grall wrote:
>>
>>>>> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
>>>>> index e9eb0bc..c44d047 100644
>>>>> --- a/xen/common/schedule.c
>>>>> +++ b/xen/common/schedule.c
>>>>> @@ -311,7 +311,7 @@ int sched_move_domain(struct domain *d, struct
>>>>> cpupool *c)
>>>>>       }
>>>>>         /* Do we have vcpus already? If not, no need to update
>>>>> node-affinity */
>>>>> -    if ( d->vcpu )
>>>>> +    if ( d->vcpu && d->vcpu[0] != NULL )
>>>>>           domain_update_node_affinity(d);
>>>>
>>
>>>> Overall it seems like those checks for the existence of vcpus should be
>>>> moved into domain_update_node_affinity().  The ASSERT() there I think is
>>>> just a sanity check to make sure we're not getting a ridiculous result
>>>> out of our calculation; but of course if there actually are no vcpus,
>>>> it's not ridiculous at all.
>>>>
>>>> One solution might be to change the ASSERT to
>>>> ASSERT(!cpumask_empty(dom_cpumask) || !d->vcpu || !d->vcpu[0]).  Then we
>>>> could probably even remove the d->vcpu conditional when calling it.
>>>
>>> This solution also works for me. Which change do you prefer?
>>>
>> FWIW, I think I like changing the ASSERT() in
>> domain_update_node_affinity(), as George suggested (and perhaps with the
>> reordering Andrew suggested) better.
> 
> +1

Thanks. I will send a patch in the next couple of days to fix this issue.

Regards,

Patch

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index e9eb0bc..c44d047 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -311,7 +311,7 @@  int sched_move_domain(struct domain *d, struct cpupool *c)
     }
 
     /* Do we have vcpus already? If not, no need to update node-affinity */
-    if ( d->vcpu )
+    if ( d->vcpu && d->vcpu[0] != NULL )
         domain_update_node_affinity(d);
 
     domain_unpause(d);