[Xen-devel,for-4.10] libs/evtchn: Remove active handler on clean-up or failure

Message ID 20171110171050.19836-1-julien.grall@linaro.org
State Accepted
Commit 0de212b03066571523f3174535bb4fb1264ca1de

Commit Message

Julien Grall Nov. 10, 2017, 5:10 p.m. UTC
Commit 89d55473ed16543044a31d1e0d4660cf5a3f49df "xentoolcore_restrict_all:
Implement for libxenevtchn" added a call to register the handle so that
the event channel can be restricted.

However, the matching call to deregister the handle was not made when
open failed or when the event channel was closed. This corrupts the
list of handles and can crash the application later on.

Fix it by calling xentoolcore__deregister_active_handle on failure and
on closure.

Signed-off-by: Julien Grall <julien.grall@linaro.org>

---

This patch fixes a bug introduced after the code freeze by
"xentoolcore_restrict_all: Implement for libxenevtchn".

The call to xentoolcore__deregister_active_handle is made at the same
place as for the grants. But I am not convinced this is thread safe, as
there is a potential race between closing the event channel and the
restrict handler. Do we care about that?
---
 tools/libs/evtchn/core.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Ross Lagerwall Nov. 13, 2017, 9:04 a.m. UTC | #1
On 11/10/2017 05:10 PM, Julien Grall wrote:
> Commit 89d55473ed16543044a31d1e0d4660cf5a3f49df "xentoolcore_restrict_all:
> Implement for libxenevtchn" added a call to register allowing to
> restrict the event channel.
> 
> However, the call to deregister the handler was not performed if open
> failed or when closing the event channel. This will corrupt
> the list of handlers and potentially crash the application later on.
> 
> Fix it by calling xentoolcore_deregister_active_handle on failure and
> closure.

Thanks for fixing this.

> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> 
> ---
> 
> This patch is fixing a bug introduced after the code freeze by
> "xentoolcore_restrict_all: Implement for libxenevtchn".
> 
> The call to xentoolcore_deregister_active_handle is done at the same
> place as for the grants. But I am not convinced this is thread safe as
> there is a potential race between closing the event channel and the
> restrict handler. Do we care about that?

Both xentoolcore__deregister_active_handle() and 
xentoolcore_restrict_all() hold the same lock when mutating the list so 
there shouldn't be a problem with the list itself.

However, I think it should call xentoolcore__deregister_active_handle() 
_before_ calling osdep_evtchn_close() to avoid trying to restrict a 
closed fd or some other fd that happens to have the same number.

I think all the other libs need to be fixed as well, unless there was a 
reason it was done this way.
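
The concern about "some other fd that happens to have the same number" can be illustrated with a minimal sketch. It assumes a single-threaded POSIX process, where open() returns the lowest free descriptor number; the function names are hypothetical and not part of any Xen library.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if the number released by close() is handed out again by the
 * next open(), 0 if not, -1 if /dev/null could not be opened. */
int toy_fd_number_reused(void)
{
    int a = open("/dev/null", O_RDONLY);
    if (a < 0)
        return -1;
    close(a);                       /* the number 'a' is now free */

    int b = open("/dev/null", O_RDONLY);
    if (b < 0)
        return -1;
    int same = (a == b);            /* a stale copy of 'a' now names 'b' */
    close(b);
    return same;
}

/* Single-threaded check: the stale number refers to the new file. */
void toy_demo_reuse(void)
{
    assert(toy_fd_number_reused() == 1);
}
```

A handle that is still on the active list after its fd has been closed holds exactly such a stale number, which is why deregistering before closing avoids restricting an unrelated descriptor.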
Ian Jackson Nov. 14, 2017, 11:51 a.m. UTC | #2
Ross Lagerwall writes ("Re: [PATCH for-4.10] libs/evtchn: Remove active handler on clean-up or failure"):
> On 11/10/2017 05:10 PM, Julien Grall wrote:
> > Commit 89d55473ed16543044a31d1e0d4660cf5a3f49df "xentoolcore_restrict_all:
> > Implement for libxenevtchn" added a call to register allowing to
> > restrict the event channel.
> > 
> > However, the call to deregister the handler was not performed if open
> > failed or when closing the event channel. This will corrupt
> > the list of handlers and potentially crash the application later on.

Sorry for not spotting this during review.
The fix is correct as far as it goes, so:

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

> > The call to xentoolcore_deregister_active_handle is done at the same
> > place as for the grants. But I am not convinced this is thread safe as
> > there is a potential race between closing the event channel and the
> > restrict handler. Do we care about that?
...
> However, I think it should call xentoolcore__deregister_active_handle() 
> _before_ calling osdep_evtchn_close() to avoid trying to restrict a 
> closed fd or some other fd that happens to have the same number.

You are right.  But this slightly weakens the guarantee provided by
xentoolcore_restrict_all.

> I think all the other libs need to be fixed as well, unless there was a 
> reason it was done this way.

I will send a further patch.  In the meantime I suggest we apply
Julien's fix.

Ian.
Ross Lagerwall Nov. 14, 2017, 12:05 p.m. UTC | #3
On 11/14/2017 11:51 AM, Ian Jackson wrote:
> Ross Lagerwall writes ("Re: [PATCH for-4.10] libs/evtchn: Remove active handler on clean-up or failure"):
>> On 11/10/2017 05:10 PM, Julien Grall wrote:
>>> Commit 89d55473ed16543044a31d1e0d4660cf5a3f49df "xentoolcore_restrict_all:
>>> Implement for libxenevtchn" added a call to register allowing to
>>> restrict the event channel.
>>>
>>> However, the call to deregister the handler was not performed if open
>>> failed or when closing the event channel. This will corrupt
>>> the list of handlers and potentially crash the application later on.
> 
> Sorry for not spotting this during review.
> The fix is correct as far as it goes, so:
> 
> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
> 
>>> The call to xentoolcore_deregister_active_handle is done at the same
>>> place as for the grants. But I am not convinced this is thread safe as
>>> there is a potential race between closing the event channel and the
>>> restrict handler. Do we care about that?
> ...
>> However, I think it should call xentoolcore__deregister_active_handle()
>> _before_ calling osdep_evtchn_close() to avoid trying to restrict a
>> closed fd or some other fd that happens to have the same number.
> 
> You are right.  But this slightly weakens the guarantee provided by
> xentoolcore_restrict_all.
> 

Now that I look at it, a similar scenario can happen during open. Since 
the handle is registered before it is actually opened, a concurrent 
xentoolcore_restrict_all() will try to restrict a handle that is not 
properly set up.

I think it is OK if xentoolcore_restrict_all() works with any open 
handle where a handle is defined as open if it has _completed_ the call 
to e.g. xenevtchn_open() and has not yet called xenevtchn_close().
Julien Grall Nov. 14, 2017, 12:14 p.m. UTC | #4
Hi,

On 14/11/17 11:51, Ian Jackson wrote:
> Ross Lagerwall writes ("Re: [PATCH for-4.10] libs/evtchn: Remove active handler on clean-up or failure"):
>> On 11/10/2017 05:10 PM, Julien Grall wrote:
>>> Commit 89d55473ed16543044a31d1e0d4660cf5a3f49df "xentoolcore_restrict_all:
>>> Implement for libxenevtchn" added a call to register allowing to
>>> restrict the event channel.
>>>
>>> However, the call to deregister the handler was not performed if open
>>> failed or when closing the event channel. This will corrupt
>>> the list of handlers and potentially crash the application later on.
> 
> Sorry for not spotting this during review.
> The fix is correct as far as it goes, so:
> 
> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
> 
>>> The call to xentoolcore_deregister_active_handle is done at the same
>>> place as for the grants. But I am not convinced this is thread safe as
>>> there is a potential race between closing the event channel and the
>>> restrict handler. Do we care about that?
> ...
>> However, I think it should call xentoolcore__deregister_active_handle()
>> _before_ calling osdep_evtchn_close() to avoid trying to restrict a
>> closed fd or some other fd that happens to have the same number.
> 
> You are right.  But this slightly weakens the guarantee provided by
> xentoolcore_restrict_all.
> 
>> I think all the other libs need to be fixed as well, unless there was a
>> reason it was done this way.
> 
> I will send a further patch.  In the meantime I suggest we apply
> Julien's fix.

I am going to leave the decision to you and Wei. It feels a bit odd to 
release-ack my patch :).

Cheers,
Ian Jackson Nov. 14, 2017, 12:15 p.m. UTC | #5
Ross Lagerwall writes ("Re: [PATCH for-4.10] libs/evtchn: Remove active handler on clean-up or failure"):
> Now that I look at it, a similar scenario can happen during open. Since 
> the handle is registered before it is actually opened, a concurrent 
> xentoolcore_restrict_all() will try to restrict a handle that is not 
> properly set up.

I think this is not a problem because the handle has thing->fd = -1.
So the restrict call will be a no-op (or give EBADF).

Ian.
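
The EBADF behaviour mentioned above can be checked with a small sketch. The callback here is a hypothetical stand-in that only queries the descriptor flags with fcntl(); what a real restrict callback does to the fd is not shown in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* Hypothetical restrict callback: returns 0 on success or the errno value
 * reported for the descriptor. */
int toy_restrict_fd(int fd)
{
    if (fcntl(fd, F_GETFD) == -1)
        return errno;               /* EBADF when fd is not open */
    return 0;
}

/* A handle that is registered but not yet opened carries fd == -1, so the
 * per-fd operation fails cleanly instead of touching another descriptor. */
void toy_demo_unopened(void)
{
    assert(toy_restrict_fd(-1) == EBADF);
}
```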
Wei Liu Nov. 14, 2017, 1:53 p.m. UTC | #6
On Tue, Nov 14, 2017 at 12:14:14PM +0000, Julien Grall wrote:
> Hi,
> 
> On 14/11/17 11:51, Ian Jackson wrote:
> > Ross Lagerwall writes ("Re: [PATCH for-4.10] libs/evtchn: Remove active handler on clean-up or failure"):
> > > On 11/10/2017 05:10 PM, Julien Grall wrote:
> > > > Commit 89d55473ed16543044a31d1e0d4660cf5a3f49df "xentoolcore_restrict_all:
> > > > Implement for libxenevtchn" added a call to register allowing to
> > > > restrict the event channel.
> > > > 
> > > > However, the call to deregister the handler was not performed if open
> > > > failed or when closing the event channel. This will corrupt
> > > > the list of handlers and potentially crash the application later on.
> > 
> > Sorry for not spotting this during review.
> > The fix is correct as far as it goes, so:
> > 
> > Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
> > 
> > > > The call to xentoolcore_deregister_active_handle is done at the same
> > > > place as for the grants. But I am not convinced this is thread safe as
> > > > there is a potential race between closing the event channel and the
> > > > restrict handler. Do we care about that?
> > ...
> > > However, I think it should call xentoolcore__deregister_active_handle()
> > > _before_ calling osdep_evtchn_close() to avoid trying to restrict a
> > > closed fd or some other fd that happens to have the same number.
> > 
> > You are right.  But this slightly weakens the guarantee provided by
> > xentoolcore_restrict_all.
> > 
> > > I think all the other libs need to be fixed as well, unless there was a
> > > reason it was done this way.
> > 
> > I will send a further patch.  In the meantime I suggest we apply
> > Julien's fix.
> 
> I am going to leave the decision to you and Wei. It feels a bit odd to
> release-ack my patch :).

We can only commit patches that are both acked and release-acked. The
latter gives RM control over when the patch should be applied.
Sometimes it is better to wait until something else happens (like
getting the tree to a stable state).

That's how I used release-ack anyway.

For this particular patch, my interpretation of what you just said
is you've given us release-ack and we can apply this patch anytime. I
will commit it soon.
Wei Liu Nov. 14, 2017, 2:02 p.m. UTC | #7
On Tue, Nov 14, 2017 at 12:15:42PM +0000, Ian Jackson wrote:
> Closing the fd before unhooking it from the list runs the risk that a
> concurrent thread calls xentoolcore_restrict_all will operate on the
> old fd value, which might refer to a new fd by then.  So we need to do
> it in the other order.
> 
> Sadly this weakens the guarantee provided by xentoolcore_restrict_all
> slight, but not (I think) in a problematic way.  It would be possible

slightly

> to implement the previous guarantee, but it would involve replacing
> all of the close() calls in all of the individual osdep parts of all
> of the individual libraries with calls to a new function which does
>    dup2("/dev/null", thing->fd);
>    pthread_mutex_lock(&handles_lock);
>    thing->fd = -1;
>    pthread_mutex_unlock(&handles_lock);
>    close(fd);
> which would be terribly tedious.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>
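
The close sequence quoted above is written as pseudocode: dup2() takes two descriptors, so "/dev/null" would have to be opened first. A minimal sketch of that sequence follows, assuming a hypothetical handle type and lock (toy_thing and toy_handles_lock are illustrative names, not the real xentoolcore ones).

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

struct toy_thing { int fd; };       /* hypothetical handle type */

static pthread_mutex_t toy_handles_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns 0 on success, -1 if any step failed. */
int toy_neutralise_and_close(struct toy_thing *thing)
{
    int fd = thing->fd;
    int rc = 0;

    assert(fd >= 0);

    /* Point the number at /dev/null so a concurrent restrict call that
     * still sees it acts on something harmless. */
    int nullfd = open("/dev/null", O_RDWR);
    if (nullfd < 0 || dup2(nullfd, fd) < 0)
        rc = -1;
    if (nullfd >= 0)
        close(nullfd);

    /* Hide the number from list walkers before it can be reused. */
    pthread_mutex_lock(&toy_handles_lock);
    thing->fd = -1;
    pthread_mutex_unlock(&toy_handles_lock);

    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

Every close() in every osdep part would need to go through a helper like this, which is the tedium the message refers to.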
Julien Grall Nov. 14, 2017, 2:19 p.m. UTC | #8
Hi,

On 14/11/17 14:02, Wei Liu wrote:
> On Tue, Nov 14, 2017 at 12:15:42PM +0000, Ian Jackson wrote:
>> Closing the fd before unhooking it from the list runs the risk that a
>> concurrent thread calls xentoolcore_restrict_all will operate on the
>> old fd value, which might refer to a new fd by then.  So we need to do
>> it in the other order.
>>
>> Sadly this weakens the guarantee provided by xentoolcore_restrict_all
>> slight, but not (I think) in a problematic way.  It would be possible
> 
> slightly
> 
>> to implement the previous guarantee, but it would involve replacing
>> all of the close() calls in all of the individual osdep parts of all
>> of the individual libraries with calls to a new function which does
>>     dup2("/dev/null", thing->fd);
>>     pthread_mutex_lock(&handles_lock);
>>     thing->fd = -1;
>>     pthread_mutex_unlock(&handles_lock);
>>     close(fd);
>> which would be terribly tedious.
>>
>> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> 
> Acked-by: Wei Liu <wei.liu2@citrix.com>

I think this is 4.10 material, xentoolcore was introduced in this 
release and it would be good to have it right from now. I want to 
confirm that you are both happy with that?

Cheers,
Julien Grall Nov. 14, 2017, 2:26 p.m. UTC | #9
Hi Wei,

On 14/11/17 13:53, Wei Liu wrote:
> On Tue, Nov 14, 2017 at 12:14:14PM +0000, Julien Grall wrote:
>> Hi,
>>
>> On 14/11/17 11:51, Ian Jackson wrote:
>>> Ross Lagerwall writes ("Re: [PATCH for-4.10] libs/evtchn: Remove active handler on clean-up or failure"):
>>>> On 11/10/2017 05:10 PM, Julien Grall wrote:
>>>>> Commit 89d55473ed16543044a31d1e0d4660cf5a3f49df "xentoolcore_restrict_all:
>>>>> Implement for libxenevtchn" added a call to register allowing to
>>>>> restrict the event channel.
>>>>>
>>>>> However, the call to deregister the handler was not performed if open
>>>>> failed or when closing the event channel. This will corrupt
>>>>> the list of handlers and potentially crash the application later on.
>>>
>>> Sorry for not spotting this during review.
>>> The fix is correct as far as it goes, so:
>>>
>>> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
>>>
>>>>> The call to xentoolcore_deregister_active_handle is done at the same
>>>>> place as for the grants. But I am not convinced this is thread safe as
>>>>> there is a potential race between closing the event channel and the
>>>>> restrict handler. Do we care about that?
>>> ...
>>>> However, I think it should call xentoolcore__deregister_active_handle()
>>>> _before_ calling osdep_evtchn_close() to avoid trying to restrict a
>>>> closed fd or some other fd that happens to have the same number.
>>>
>>> You are right.  But this slightly weakens the guarantee provided by
>>> xentoolcore_restrict_all.
>>>
>>>> I think all the other libs need to be fixed as well, unless there was a
>>>> reason it was done this way.
>>>
>>> I will send a further patch.  In the meantime I suggest we apply
>>> Julien's fix.
>>
>> I am going to leave the decision to you and Wei. It feels a bit odd to
>> release-ack my patch :).
> 
> We can only commit patches that are both acked and release-acked. The
> latter gives RM control over when the patch should be applied.
> Sometimes it is better to wait until something else happens (like
> getting the tree to a stable state).
> 
> That's how I used release-ack anyway.

It feels a bit odd to release-ack my own patch. For Arm patches I 
usually defer to Stefano the decision on whether a patch is suitable 
for the release.

> 
> For this particular patch, my interpretation of what you just said
> is you've given us release-ack and we can apply this patch anytime. I
> will commit it soon.

Thanks! I hope it will fix some osstest failures.

Cheers,
Ross Lagerwall Nov. 14, 2017, 2:26 p.m. UTC | #10
On 11/14/2017 12:15 PM, Ian Jackson wrote:
> Closing the fd before unhooking it from the list runs the risk that a
> concurrent thread calls xentoolcore_restrict_all will operate on the
> old fd value, which might refer to a new fd by then.  So we need to do
> it in the other order.
> 
> Sadly this weakens the guarantee provided by xentoolcore_restrict_all
> slight, but not (I think) in a problematic way.  It would be possible
> to implement the previous guarantee, but it would involve replacing
> all of the close() calls in all of the individual osdep parts of all
> of the individual libraries with calls to a new function which does
>     dup2("/dev/null", thing->fd);
>     pthread_mutex_lock(&handles_lock);
>     thing->fd = -1;
>     pthread_mutex_unlock(&handles_lock);
>     close(fd);
> which would be terribly tedious.
> 
...
> diff --git a/tools/libs/toolcore/include/xentoolcore.h b/tools/libs/toolcore/include/xentoolcore.h
> index 8d28c2d..b3a3c93 100644
> --- a/tools/libs/toolcore/include/xentoolcore.h
> +++ b/tools/libs/toolcore/include/xentoolcore.h
> @@ -39,6 +39,15 @@
>    * fail (even though such a call is potentially meaningful).
>    * (If called again with a different domid, it will necessarily fail.)
>    *
> + * Note for multi-threaded programs: If xentoolcore_restrict_all is
> + * called concurrently with a function which /or closes Xen library

"which /or closes..." - Is this a typo?

> + * handles (e.g.  libxl_ctx_free, xs_close), the restriction is only
> + * guaranteed to be effective after all of the closing functions have
> + * returned, even if that is later than the return from
> + * xentoolcore_restrict_all.  (Of course if xentoolcore_restrict_all
> + * it is called concurrently with opening functions, the new handles
> + * might or might not be restricted.)
> + *
>    *  ====================================================================
>    *  IMPORTANT - IMPLEMENTATION STATUS
>    *
> diff --git a/tools/libs/toolcore/include/xentoolcore_internal.h b/tools/libs/toolcore/include/xentoolcore_internal.h
> index dbdb1dd..04f5848 100644
> --- a/tools/libs/toolcore/include/xentoolcore_internal.h
> +++ b/tools/libs/toolcore/include/xentoolcore_internal.h
> @@ -48,8 +48,10 @@
>    *     4. ONLY THEN actually open the relevant fd or whatever
>    *
>    *   III. during the "close handle" function
> - *     1. FIRST close the relevant fd or whatever
> - *     2. call xentoolcore__deregister_active_handle
> + *     1. FIRST call xentoolcore__deregister_active_handle
> + *     2. close the relevant fd or whatever
> + *
> + * [ III(b). Do the same as III for error exit from the open function. ]
>    *
>    *   IV. in the restrict_callback function
>    *     * Arrange that the fd (or other handle) can no longer by used
> diff --git a/tools/xenstore/xs.c b/tools/xenstore/xs.c
> index 23f3f09..abffd9c 100644
> --- a/tools/xenstore/xs.c
> +++ b/tools/xenstore/xs.c
> @@ -279,9 +279,9 @@ err:
>   	saved_errno = errno;
>   
>   	if (h) {
> +		xentoolcore__deregister_active_handle(&h->tc_ah);
>   		if (h->fd >= 0)
>   			close(h->fd);
> -		xentoolcore__deregister_active_handle(&h->tc_ah);
>   	}
>   	free(h);
>   
> @@ -342,8 +342,8 @@ static void close_fds_free(struct xs_handle *h) {
>   		close(h->watch_pipe[1]);
>   	}
>   
> -        close(h->fd);
>   	xentoolcore__deregister_active_handle(&h->tc_ah);
> +        close(h->fd);
>           

Since the rest of this file uses tabs, you may as well use tabs for this 
line as well.

Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Ian Jackson Nov. 14, 2017, 2:57 p.m. UTC | #11
Julien Grall writes ("Re: [PATCH] tools: xentoolcore_restrict_all: Do deregistration before close"):
> I think this is 4.10 material, xentoolcore was introduced in this 
> release and it would be good to have it right from now. I want to 
> confirm that you are both happy with that?

Yes, absolutely.  Sorry, I forgot the for-4.10 tag in the Subject.

Ian.
Ian Jackson Nov. 14, 2017, 3:01 p.m. UTC | #12
Ross Lagerwall writes ("Re: [PATCH] tools: xentoolcore_restrict_all: Do deregistration before close"):
> On 11/14/2017 12:15 PM, Ian Jackson wrote:
> > + * Note for multi-threaded programs: If xentoolcore_restrict_all is
> > + * called concurrently with a function which /or closes Xen library
> 
> "which /or closes..." - Is this a typo?

Yes, fixed, thanks.

> > -        close(h->fd);
> >   	xentoolcore__deregister_active_handle(&h->tc_ah);
> > +        close(h->fd);
> >           
> 
> Since the rest of this file uses tabs, you may as well use tabs for this 
> line as well.

I didn't change the use of tabs vs. the use of spaces.

> Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>

Thanks,
Ian.
Julien Grall Nov. 16, 2017, 3:01 p.m. UTC | #13
Hi Ian,

On 14/11/17 14:57, Ian Jackson wrote:
> Julien Grall writes ("Re: [PATCH] tools: xentoolcore_restrict_all: Do deregistration before close"):
>> I think this is 4.10 material, xentoolcore was introduced in this
>> release and it would be good to have it right from now. I want to
>> confirm that you are both happy with that?
> 
> Yes, absolutely.  Sorry, I forgot the for-4.10 tag in the Subject.

Release-acked-by: Julien Grall <julien.grall@linaro.org>

Cheers,

Patch

diff --git a/tools/libs/evtchn/core.c b/tools/libs/evtchn/core.c
index 14b7549a6b..2dba58bf00 100644
--- a/tools/libs/evtchn/core.c
+++ b/tools/libs/evtchn/core.c
@@ -56,6 +56,7 @@  xenevtchn_handle *xenevtchn_open(xentoollog_logger *logger, unsigned open_flags)
 
 err:
     osdep_evtchn_close(xce);
+    xentoolcore__deregister_active_handle(&xce->tc_ah);
     xtl_logger_destroy(xce->logger_tofree);
     free(xce);
     return NULL;
@@ -69,6 +70,7 @@  int xenevtchn_close(xenevtchn_handle *xce)
         return 0;
 
     rc = osdep_evtchn_close(xce);
+    xentoolcore__deregister_active_handle(&xce->tc_ah);
     xtl_logger_destroy(xce->logger_tofree);
     free(xce);
     return rc;
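
The failure mode described in the commit message can be sketched with a toy registry. This is only an illustration under stated assumptions: the toy_* names, the list layout and the simulate_failure flag are hypothetical and do not reproduce the real libxenevtchn or xentoolcore code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical handle; the real xenevtchn_handle has more fields. */
struct toy_evtchn {
    int fd;
    struct toy_evtchn *next;        /* link in the active-handle list */
};

static pthread_mutex_t toy_list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct toy_evtchn *toy_list;

static void toy_list_add(struct toy_evtchn *h)
{
    assert(h != NULL);
    pthread_mutex_lock(&toy_list_lock);
    h->next = toy_list;
    toy_list = h;
    pthread_mutex_unlock(&toy_list_lock);
}

static void toy_list_del(struct toy_evtchn *h)
{
    pthread_mutex_lock(&toy_list_lock);
    for (struct toy_evtchn **pp = &toy_list; *pp; pp = &(*pp)->next) {
        if (*pp == h) {
            *pp = h->next;
            break;
        }
    }
    pthread_mutex_unlock(&toy_list_lock);
}

/* Number of handles a "restrict all" walk would visit. */
int toy_list_len(void)
{
    int n = 0;
    pthread_mutex_lock(&toy_list_lock);
    for (struct toy_evtchn *h = toy_list; h; h = h->next)
        n++;
    pthread_mutex_unlock(&toy_list_lock);
    return n;
}

/* simulate_failure stands in for the OS-specific open step failing. */
struct toy_evtchn *toy_evtchn_open(int simulate_failure)
{
    struct toy_evtchn *h = malloc(sizeof(*h));
    if (!h)
        return NULL;
    h->fd = -1;
    toy_list_add(h);                /* registered before the fd exists */
    if (simulate_failure)
        goto err;
    return h;

err:
    toy_list_del(h);                /* omit this and the list keeps a
                                     * pointer to freed memory */
    free(h);
    return NULL;
}

void toy_evtchn_close(struct toy_evtchn *h)
{
    if (!h)
        return;
    toy_list_del(h);                /* same rule on the normal close path */
    free(h);
}
```

Without the toy_list_del() calls, a later walk of the list would dereference freed memory, which matches the crash the commit message warns about.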