diff mbox

[Xen-devel,v2] xen/arm: vcpu: Correctly release resource when the VCPU failed to initialized

Message ID 1398885355-13200-1-git-send-email-julien.grall@linaro.org
State Accepted, archived
Headers show

Commit Message

Julien Grall April 30, 2014, 7:15 p.m. UTC
While I was adding new failing code at the end of the function, I've noticed
that the vtimers are not freed which mess all the timers and will crash Xen
quickly when the page will be reused.

Currently neither vcpu_vgic_init nor vcpu_vtimer_init fail, so we
are safe for now. With the new GICv3 code, the former function will be able
to fail. This will result to a memory leak.

Call vcpu_destroy if the initialization has failed. We also need to add a
boolean to know if the vtimers are correctly setup as the timer common code
doesn't have safe guard against removing non-initialized timer.

Signed-off-by: Julien Grall <julien.grall@linaro.org>

---
    Changes in v2:
        - Update commit message
---
 xen/arch/arm/domain.c        |    8 ++++++--
 xen/arch/arm/vtimer.c        |    5 +++++
 xen/include/asm-arm/domain.h |    1 +
 3 files changed, 12 insertions(+), 2 deletions(-)

Comments

Ian Campbell May 2, 2014, 12:25 p.m. UTC | #1
On Wed, 2014-04-30 at 20:15 +0100, Julien Grall wrote:
> While I was adding new failing code at the end of the function, I've noticed
> that the vtimers are not freed which mess all the timers and will crash Xen
> quickly when the page will be reused.
> 
> Currently neither vcpu_vgic_init nor vcpu_vtimer_init fail, so we
> are safe for now. With the new GICv3 code, the former function will be able
> to fail. This will result to a memory leak.
> 
> Call vcpu_destroy if the initialization has failed. We also need to add a
> boolean to know if the vtimers are correctly setup as the timer common code
> doesn't have safe guard against removing non-initialized timer.
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>

I was about to acked + apply but it failed to build on arm64 with:

        domain.c: In function 'alloc_vcpu_struct':
        /local/scratch/ianc/devel/committer.git/xen/include/xen/lib.h:19:31: error: static assertion failed: "!(sizeof(*v) > PAGE_SIZE)"
         #define BUILD_BUG_ON(cond) ({ _Static_assert(!(cond), "!(" #cond ")"); })
                                       ^
        domain.c:415:5: note: in expansion of macro 'BUILD_BUG_ON'
             BUILD_BUG_ON(sizeof(*v) > PAGE_SIZE);
             ^
struct arch_vcpu is apparently now too large.

I had also reworded your commit message somewhat:
    xen/arm: vcpu: Correctly release resources when a VCPU fails to initialize
    
    While I was adding new failing code at the end of the function, I noticed
    that the vtimers are not freed which messes up all the timers and will crash
    Xen quickly when the page s reused.
    
    Currently neither vcpu_vgic_init nor vcpu_vtimer_init fails, so we
    are safe for now. With the new GICv3 code, the former function will be able
    to fail. This will result in a memory leak.
    
    Call vcpu_destroy if the initialization has failed. We also need to add a
    boolean to know if the vtimers are correctly setup as the timer common code
    doesn't have any safeguard against removing a non-initialized timer.

Ian.

> 
> ---
>     Changes in v2:
>         - Update commit message
> ---
>  xen/arch/arm/domain.c        |    8 ++++++--
>  xen/arch/arm/vtimer.c        |    5 +++++
>  xen/include/asm-arm/domain.h |    1 +
>  3 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index ccccb77..c47db4a 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -468,12 +468,16 @@ int vcpu_initialise(struct vcpu *v)
>      processor_vcpu_initialise(v);
>  
>      if ( (rc = vcpu_vgic_init(v)) != 0 )
> -        return rc;
> +        goto fail;
>  
>      if ( (rc = vcpu_vtimer_init(v)) != 0 )
> -        return rc;
> +        goto fail;
>  
>      return rc;
> +
> +fail:
> +    vcpu_destroy(v);
> +    return rc;
>  }
>  
>  void vcpu_destroy(struct vcpu *v)
> diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
> index cb690bb..c515e7e 100644
> --- a/xen/arch/arm/vtimer.c
> +++ b/xen/arch/arm/vtimer.c
> @@ -77,11 +77,16 @@ int vcpu_vtimer_init(struct vcpu *v)
>          : GUEST_TIMER_VIRT_PPI;
>      t->v = v;
>  
> +    v->arch.vtimer_initialized = 1;
> +
>      return 0;
>  }
>  
>  void vcpu_timer_destroy(struct vcpu *v)
>  {
> +    if ( !v->arch.vtimer_initialized )
> +        return;
> +
>      kill_timer(&v->arch.virt_timer.timer);
>      kill_timer(&v->arch.phys_timer.timer);
>  }
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index ec66a4e..1be3da2 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -285,6 +285,7 @@ struct arch_vcpu
>  
>      struct vtimer phys_timer;
>      struct vtimer virt_timer;
> +    bool_t vtimer_initialized;
>  }  __cacheline_aligned;
>  
>  void vcpu_show_execution_state(struct vcpu *);
Julien Grall May 2, 2014, 2:09 p.m. UTC | #2
On 05/02/2014 01:25 PM, Ian Campbell wrote:
> On Wed, 2014-04-30 at 20:15 +0100, Julien Grall wrote:
>> While I was adding new failing code at the end of the function, I've noticed
>> that the vtimers are not freed which mess all the timers and will crash Xen
>> quickly when the page will be reused.
>>
>> Currently neither vcpu_vgic_init nor vcpu_vtimer_init fail, so we
>> are safe for now. With the new GICv3 code, the former function will be able
>> to fail. This will result to a memory leak.
>>
>> Call vcpu_destroy if the initialization has failed. We also need to add a
>> boolean to know if the vtimers are correctly setup as the timer common code
>> doesn't have safe guard against removing non-initialized timer.
>>
>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> 
> I was about to acked + apply but it failed to build on arm64 with:
> 
>         domain.c: In function 'alloc_vcpu_struct':
>         /local/scratch/ianc/devel/committer.git/xen/include/xen/lib.h:19:31: error: static assertion failed: "!(sizeof(*v) > PAGE_SIZE)"
>          #define BUILD_BUG_ON(cond) ({ _Static_assert(!(cond), "!(" #cond ")"); })
>                                        ^
>         domain.c:415:5: note: in expansion of macro 'BUILD_BUG_ON'
>              BUILD_BUG_ON(sizeof(*v) > PAGE_SIZE);
>              ^
> struct arch_vcpu is apparently now too large.

Hmmm... I'm not sure what is the best solution. Can:
	1) Allocate 2 pages for the VCPU structure
	2) Allocate vgic structure outside.

Any opinions?

Regards,
Andrew Cooper May 2, 2014, 2:17 p.m. UTC | #3
On 02/05/14 15:09, Julien Grall wrote:
> On 05/02/2014 01:25 PM, Ian Campbell wrote:
>> On Wed, 2014-04-30 at 20:15 +0100, Julien Grall wrote:
>>> While I was adding new failing code at the end of the function, I've noticed
>>> that the vtimers are not freed which mess all the timers and will crash Xen
>>> quickly when the page will be reused.
>>>
>>> Currently neither vcpu_vgic_init nor vcpu_vtimer_init fail, so we
>>> are safe for now. With the new GICv3 code, the former function will be able
>>> to fail. This will result to a memory leak.
>>>
>>> Call vcpu_destroy if the initialization has failed. We also need to add a
>>> boolean to know if the vtimers are correctly setup as the timer common code
>>> doesn't have safe guard against removing non-initialized timer.
>>>
>>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>> I was about to acked + apply but it failed to build on arm64 with:
>>
>>         domain.c: In function 'alloc_vcpu_struct':
>>         /local/scratch/ianc/devel/committer.git/xen/include/xen/lib.h:19:31: error: static assertion failed: "!(sizeof(*v) > PAGE_SIZE)"
>>          #define BUILD_BUG_ON(cond) ({ _Static_assert(!(cond), "!(" #cond ")"); })
>>                                        ^
>>         domain.c:415:5: note: in expansion of macro 'BUILD_BUG_ON'
>>              BUILD_BUG_ON(sizeof(*v) > PAGE_SIZE);
>>              ^
>> struct arch_vcpu is apparently now too large.
> Hmmm... I'm not sure what is the best solution. Can:
> 	1) Allocate 2 pages for the VCPU structure
> 	2) Allocate vgic structure outside.
>
> Any opinions?
>
> Regards,
>

2)

The reason structs vcpu/domain were reduced to this size to was avoid
needing multi-page allocations, which risk allocation failures on
systems with sufficiently fragmented memory.

~Andrew
Ian Campbell May 2, 2014, 3:15 p.m. UTC | #4
On Fri, 2014-05-02 at 15:17 +0100, Andrew Cooper wrote:
> On 02/05/14 15:09, Julien Grall wrote:
> > On 05/02/2014 01:25 PM, Ian Campbell wrote:
> >> On Wed, 2014-04-30 at 20:15 +0100, Julien Grall wrote:
> >>> While I was adding new failing code at the end of the function, I've noticed
> >>> that the vtimers are not freed which mess all the timers and will crash Xen
> >>> quickly when the page will be reused.
> >>>
> >>> Currently neither vcpu_vgic_init nor vcpu_vtimer_init fail, so we
> >>> are safe for now. With the new GICv3 code, the former function will be able
> >>> to fail. This will result to a memory leak.
> >>>
> >>> Call vcpu_destroy if the initialization has failed. We also need to add a
> >>> boolean to know if the vtimers are correctly setup as the timer common code
> >>> doesn't have safe guard against removing non-initialized timer.
> >>>
> >>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> >> I was about to acked + apply but it failed to build on arm64 with:
> >>
> >>         domain.c: In function 'alloc_vcpu_struct':
> >>         /local/scratch/ianc/devel/committer.git/xen/include/xen/lib.h:19:31: error: static assertion failed: "!(sizeof(*v) > PAGE_SIZE)"
> >>          #define BUILD_BUG_ON(cond) ({ _Static_assert(!(cond), "!(" #cond ")"); })
> >>                                        ^
> >>         domain.c:415:5: note: in expansion of macro 'BUILD_BUG_ON'
> >>              BUILD_BUG_ON(sizeof(*v) > PAGE_SIZE);
> >>              ^
> >> struct arch_vcpu is apparently now too large.
> > Hmmm... I'm not sure what is the best solution. Can:
> > 	1) Allocate 2 pages for the VCPU structure
> > 	2) Allocate vgic structure outside.
> >
> > Any opinions?
> >
> > Regards,
> >
> 
> 2)
> 
> The reason structs vcpu/domain were reduced to this size to was avoid
> needing multi-page allocations, which risk allocation failures on
> systems with sufficiently fragmented memory.

Ack. #2 (with s/vgic/anything suitably self contained/) is the answer.

Ian.
Ian Campbell May 2, 2014, 3:27 p.m. UTC | #5
On Fri, 2014-05-02 at 16:15 +0100, Ian Campbell wrote:
> On Fri, 2014-05-02 at 15:17 +0100, Andrew Cooper wrote:
> > On 02/05/14 15:09, Julien Grall wrote:
> > > On 05/02/2014 01:25 PM, Ian Campbell wrote:
> > >> On Wed, 2014-04-30 at 20:15 +0100, Julien Grall wrote:
> > >>> While I was adding new failing code at the end of the function, I've noticed
> > >>> that the vtimers are not freed which mess all the timers and will crash Xen
> > >>> quickly when the page will be reused.
> > >>>
> > >>> Currently neither vcpu_vgic_init nor vcpu_vtimer_init fail, so we
> > >>> are safe for now. With the new GICv3 code, the former function will be able
> > >>> to fail. This will result to a memory leak.
> > >>>
> > >>> Call vcpu_destroy if the initialization has failed. We also need to add a
> > >>> boolean to know if the vtimers are correctly setup as the timer common code
> > >>> doesn't have safe guard against removing non-initialized timer.
> > >>>
> > >>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> > >> I was about to acked + apply but it failed to build on arm64 with:
> > >>
> > >>         domain.c: In function 'alloc_vcpu_struct':
> > >>         /local/scratch/ianc/devel/committer.git/xen/include/xen/lib.h:19:31: error: static assertion failed: "!(sizeof(*v) > PAGE_SIZE)"
> > >>          #define BUILD_BUG_ON(cond) ({ _Static_assert(!(cond), "!(" #cond ")"); })
> > >>                                        ^
> > >>         domain.c:415:5: note: in expansion of macro 'BUILD_BUG_ON'
> > >>              BUILD_BUG_ON(sizeof(*v) > PAGE_SIZE);
> > >>              ^
> > >> struct arch_vcpu is apparently now too large.
> > > Hmmm... I'm not sure what is the best solution. Can:
> > > 	1) Allocate 2 pages for the VCPU structure
> > > 	2) Allocate vgic structure outside.
> > >
> > > Any opinions?
> > >
> > > Regards,
> > >
> > 
> > 2)
> > 
> > The reason structs vcpu/domain were reduced to this size to was avoid
> > needing multi-page allocations, which risk allocation failures on
> > systems with sufficiently fragmented memory.
> 
> Ack. #2 (with s/vgic/anything suitably self contained/) is the answer.

Was Vijay not moving the vgtic stuff out in one of the gicv3 patches?
Julien Grall May 2, 2014, 3:36 p.m. UTC | #6
On 05/02/2014 04:27 PM, Ian Campbell wrote:
> On Fri, 2014-05-02 at 16:15 +0100, Ian Campbell wrote:
>> On Fri, 2014-05-02 at 15:17 +0100, Andrew Cooper wrote:
>>> On 02/05/14 15:09, Julien Grall wrote:
>>>> On 05/02/2014 01:25 PM, Ian Campbell wrote:
>>>>> On Wed, 2014-04-30 at 20:15 +0100, Julien Grall wrote:
>>>>>> While I was adding new failing code at the end of the function, I've noticed
>>>>>> that the vtimers are not freed which mess all the timers and will crash Xen
>>>>>> quickly when the page will be reused.
>>>>>>
>>>>>> Currently neither vcpu_vgic_init nor vcpu_vtimer_init fail, so we
>>>>>> are safe for now. With the new GICv3 code, the former function will be able
>>>>>> to fail. This will result to a memory leak.
>>>>>>
>>>>>> Call vcpu_destroy if the initialization has failed. We also need to add a
>>>>>> boolean to know if the vtimers are correctly setup as the timer common code
>>>>>> doesn't have safe guard against removing non-initialized timer.
>>>>>>
>>>>>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>>>>> I was about to acked + apply but it failed to build on arm64 with:
>>>>>
>>>>>         domain.c: In function 'alloc_vcpu_struct':
>>>>>         /local/scratch/ianc/devel/committer.git/xen/include/xen/lib.h:19:31: error: static assertion failed: "!(sizeof(*v) > PAGE_SIZE)"
>>>>>          #define BUILD_BUG_ON(cond) ({ _Static_assert(!(cond), "!(" #cond ")"); })
>>>>>                                        ^
>>>>>         domain.c:415:5: note: in expansion of macro 'BUILD_BUG_ON'
>>>>>              BUILD_BUG_ON(sizeof(*v) > PAGE_SIZE);
>>>>>              ^
>>>>> struct arch_vcpu is apparently now too large.
>>>> Hmmm... I'm not sure what is the best solution. Can:
>>>> 	1) Allocate 2 pages for the VCPU structure
>>>> 	2) Allocate vgic structure outside.
>>>>
>>>> Any opinions?
>>>>
>>>> Regards,
>>>>
>>>
>>> 2)
>>>
>>> The reason structs vcpu/domain were reduced to this size to was avoid
>>> needing multi-page allocations, which risk allocation failures on
>>> systems with sufficiently fragmented memory.
>>
>> Ack. #2 (with s/vgic/anything suitably self contained/) is the answer.
> 
> Was Vijay not moving the vgtic stuff out in one of the gicv3 patches?

(CC him)

IIRC, he only moves the private_irqs field. I think we should move the
whole structure, to give more space for the future.

This patch will be necessary for the GICv3 serie has vcpu_vgic_init will
be able to fail (see patch #10).

Ideally, for bisection purpose, it should be applied before the patch #10.

Regards,
Julien Grall May 7, 2014, 3:09 p.m. UTC | #7
On 02/05/14 16:36, Julien Grall wrote:
>> Was Vijay not moving the vgtic stuff out in one of the gicv3 patches?
>
> (CC him)
>
> IIRC, he only moves the private_irqs field. I think we should move the
> whole structure, to give more space for the future.
>
> This patch will be necessary for the GICv3 serie has vcpu_vgic_init will
> be able to fail (see patch #10).
>
> Ideally, for bisection purpose, it should be applied before the patch #10.

After thinking, both patch are interdependent. If I want to move out 
some structure Xen will have to allocate memory. Which mean the function 
can fail and leak some resources (before this patch is applied).

As we are fine for now, I will wait GICv3 patches is pushed (or another 
patch related to arch vcpu initialization) before sending again this.

Regards,
Julien Grall May 21, 2014, 12:39 p.m. UTC | #8
Hi Ian,

On 05/02/2014 01:25 PM, Ian Campbell wrote:
> On Wed, 2014-04-30 at 20:15 +0100, Julien Grall wrote:
>> While I was adding new failing code at the end of the function, I've noticed
>> that the vtimers are not freed which mess all the timers and will crash Xen
>> quickly when the page will be reused.
>>
>> Currently neither vcpu_vgic_init nor vcpu_vtimer_init fail, so we
>> are safe for now. With the new GICv3 code, the former function will be able
>> to fail. This will result to a memory leak.
>>
>> Call vcpu_destroy if the initialization has failed. We also need to add a
>> boolean to know if the vtimers are correctly setup as the timer common code
>> doesn't have safe guard against removing non-initialized timer.
>>
>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> 
> I was about to acked + apply but it failed to build on arm64 with:
> 
>         domain.c: In function 'alloc_vcpu_struct':
>         /local/scratch/ianc/devel/committer.git/xen/include/xen/lib.h:19:31: error: static assertion failed: "!(sizeof(*v) > PAGE_SIZE)"
>          #define BUILD_BUG_ON(cond) ({ _Static_assert(!(cond), "!(" #cond ")"); })
>                                        ^
>         domain.c:415:5: note: in expansion of macro 'BUILD_BUG_ON'
>              BUILD_BUG_ON(sizeof(*v) > PAGE_SIZE);
>              ^
> struct arch_vcpu is apparently now too large.
> 
> I had also reworded your commit message somewhat:
>     xen/arm: vcpu: Correctly release resources when a VCPU fails to initialize
>     
>     While I was adding new failing code at the end of the function, I noticed
>     that the vtimers are not freed which messes up all the timers and will crash
>     Xen quickly when the page s reused.
>     
>     Currently neither vcpu_vgic_init nor vcpu_vtimer_init fails, so we
>     are safe for now. With the new GICv3 code, the former function will be able
>     to fail. This will result in a memory leak.
>     
>     Call vcpu_destroy if the initialization has failed. We also need to add a
>     boolean to know if the vtimers are correctly setup as the timer common code
>     doesn't have any safeguard against removing a non-initialized timer.

The commit 6fedf29 "xen/arm: Drop event_mask in arch_vcpu" which makes
this patch compiles on both arm32 and arm64.

I think you can safely push this patch now :).

Thanks,
Ian Campbell June 2, 2014, 2:39 p.m. UTC | #9
On Wed, 2014-05-21 at 13:39 +0100, Julien Grall wrote:
> 
> The commit 6fedf29 "xen/arm: Drop event_mask in arch_vcpu" which makes
> this patch compiles on both arm32 and arm64.
> 
> I think you can safely push this patch now :).

Acked + applied.
diff mbox

Patch

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index ccccb77..c47db4a 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -468,12 +468,16 @@  int vcpu_initialise(struct vcpu *v)
     processor_vcpu_initialise(v);
 
     if ( (rc = vcpu_vgic_init(v)) != 0 )
-        return rc;
+        goto fail;
 
     if ( (rc = vcpu_vtimer_init(v)) != 0 )
-        return rc;
+        goto fail;
 
     return rc;
+
+fail:
+    vcpu_destroy(v);
+    return rc;
 }
 
 void vcpu_destroy(struct vcpu *v)
diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
index cb690bb..c515e7e 100644
--- a/xen/arch/arm/vtimer.c
+++ b/xen/arch/arm/vtimer.c
@@ -77,11 +77,16 @@  int vcpu_vtimer_init(struct vcpu *v)
         : GUEST_TIMER_VIRT_PPI;
     t->v = v;
 
+    v->arch.vtimer_initialized = 1;
+
     return 0;
 }
 
 void vcpu_timer_destroy(struct vcpu *v)
 {
+    if ( !v->arch.vtimer_initialized )
+        return;
+
     kill_timer(&v->arch.virt_timer.timer);
     kill_timer(&v->arch.phys_timer.timer);
 }
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index ec66a4e..1be3da2 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -285,6 +285,7 @@  struct arch_vcpu
 
     struct vtimer phys_timer;
     struct vtimer virt_timer;
+    bool_t vtimer_initialized;
 }  __cacheline_aligned;
 
 void vcpu_show_execution_state(struct vcpu *);