ceph: cancel delayed work instead of flushing on mdsc teardown

Message ID 20210727201230.178286-1-jlayton@kernel.org
State New

Commit Message

Jeff Layton July 27, 2021, 8:12 p.m. UTC
The first thing metric_delayed_work does is check mdsc->stopping,
and then return immediately if it's set...which is good since we would
have already torn down the metric structures at this point, otherwise.

Worse yet, it's possible that the ceph_metric_destroy call could race
with the delayed_work, in which case we could end up accessing
destroyed percpu variables.

At this point in the mdsc teardown, the "stopping" flag has already been
set, so there's no benefit to flushing the work. Just cancel it instead,
and do so before we tear down the metrics structures.

Cc: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/mds_client.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Xiubo Li July 29, 2021, 2:56 a.m. UTC | #1
On 7/28/21 4:12 AM, Jeff Layton wrote:
> The first thing metric_delayed_work does is check mdsc->stopping,
> and then return immediately if it's set...which is good since we would
> have already torn down the metric structures at this point, otherwise.
>
> Worse yet, it's possible that the ceph_metric_destroy call could race
> with the delayed_work, in which case we could end up accessing
> destroyed percpu variables.
>
> At this point in the mdsc teardown, the "stopping" flag has already been
> set, so there's no benefit to flushing the work. Just cancel it instead,
> and do so before we tear down the metrics structures.
>
> Cc: Xiubo Li <xiubli@redhat.com>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>   fs/ceph/mds_client.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index c43091a30ba8..d3f2baf3c352 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -4977,9 +4977,9 @@ void ceph_mdsc_destroy(struct ceph_fs_client *fsc)
>
>   	ceph_mdsc_stop(mdsc);
>
> +	cancel_delayed_work_sync(&mdsc->metric.delayed_work);
>   	ceph_metric_destroy(&mdsc->metric);
>

In "ceph_metric_destroy()" it will also do
"cancel_delayed_work_sync(&mdsc->metric.delayed_work)".

We can just move it to the front of the _destroy().

> -	flush_delayed_work(&mdsc->metric.delayed_work);
>   	fsc->mdsc = NULL;
>   	kfree(mdsc);
>   	dout("mdsc_destroy %p done\n", mdsc);
Jeff Layton July 29, 2021, 11:34 a.m. UTC | #2
On Thu, 2021-07-29 at 10:56 +0800, Xiubo Li wrote:
> On 7/28/21 4:12 AM, Jeff Layton wrote:
> > The first thing metric_delayed_work does is check mdsc->stopping,
> > and then return immediately if it's set...which is good since we would
> > have already torn down the metric structures at this point, otherwise.
> >
> > Worse yet, it's possible that the ceph_metric_destroy call could race
> > with the delayed_work, in which case we could end up accessing
> > destroyed percpu variables.
> >
> > At this point in the mdsc teardown, the "stopping" flag has already been
> > set, so there's no benefit to flushing the work. Just cancel it instead,
> > and do so before we tear down the metrics structures.
> >
> > Cc: Xiubo Li <xiubli@redhat.com>
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >   fs/ceph/mds_client.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> > index c43091a30ba8..d3f2baf3c352 100644
> > --- a/fs/ceph/mds_client.c
> > +++ b/fs/ceph/mds_client.c
> > @@ -4977,9 +4977,9 @@ void ceph_mdsc_destroy(struct ceph_fs_client *fsc)
> >
> >   	ceph_mdsc_stop(mdsc);
> >
> > +	cancel_delayed_work_sync(&mdsc->metric.delayed_work);
> >   	ceph_metric_destroy(&mdsc->metric);
> >
>
> In "ceph_metric_destroy()" it will also do
> "cancel_delayed_work_sync(&mdsc->metric.delayed_work)".
>
> We can just move it to the front of the _destroy().
>

Good point! I'll send a v2 after I test it out.

>
> > -	flush_delayed_work(&mdsc->metric.delayed_work);
> >   	fsc->mdsc = NULL;
> >   	kfree(mdsc);
> >   	dout("mdsc_destroy %p done\n", mdsc);
>

Thanks,
-- 
Jeff Layton <jlayton@kernel.org>
Patch

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index c43091a30ba8..d3f2baf3c352 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -4977,9 +4977,9 @@  void ceph_mdsc_destroy(struct ceph_fs_client *fsc)
 
 	ceph_mdsc_stop(mdsc);
 
+	cancel_delayed_work_sync(&mdsc->metric.delayed_work);
 	ceph_metric_destroy(&mdsc->metric);
 
-	flush_delayed_work(&mdsc->metric.delayed_work);
 	fsc->mdsc = NULL;
 	kfree(mdsc);
 	dout("mdsc_destroy %p done\n", mdsc);
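
The ordering argument in the commit message can be sketched with a small
userspace model. This is only an illustration under stated assumptions:
every toy_* name below is hypothetical, none of it is kernel code, and the
model is single-threaded, so it shows why the ordering matters but does not
reproduce the race itself. It compares "destroy, then flush" against
"cancel, then destroy" and counts how often the work function would touch
already-destroyed metric state.

```c
#include <stdbool.h>

/* Toy stand-ins for illustration only; not the kernel's real types. */
struct toy_metric {
	bool destroyed;     /* true once the metric state is torn down */
	bool work_pending;  /* models a queued delayed work item */
	int bad_accesses;   /* times the work touched destroyed state */
};

struct toy_mdsc {
	bool stopping;
	struct toy_metric metric;
};

/* Models the work function: return at once if teardown has begun. */
void toy_metric_work(struct toy_mdsc *mdsc)
{
	if (mdsc->stopping)
		return;
	if (mdsc->metric.destroyed)
		mdsc->metric.bad_accesses++;
}

/* Models a flush: run the pending work now. */
void toy_flush_work(struct toy_mdsc *mdsc)
{
	if (mdsc->metric.work_pending) {
		mdsc->metric.work_pending = false;
		toy_metric_work(mdsc);
	}
}

/* Models a cancel: drop the pending work without running it. */
void toy_cancel_work(struct toy_mdsc *mdsc)
{
	mdsc->metric.work_pending = false;
}

/* Models tearing down the metric state. */
void toy_metric_destroy(struct toy_metric *m)
{
	m->destroyed = true;
}

/* Ordering before the patch: destroy, then flush. */
int toy_teardown_flush_after_destroy(bool stopping)
{
	struct toy_mdsc mdsc = { .stopping = stopping,
				 .metric = { .work_pending = true } };

	toy_metric_destroy(&mdsc.metric);
	toy_flush_work(&mdsc);
	return mdsc.metric.bad_accesses;
}

/* Ordering after the patch: cancel, then destroy. */
int toy_teardown_cancel_first(bool stopping)
{
	struct toy_mdsc mdsc = { .stopping = stopping,
				 .metric = { .work_pending = true } };

	toy_cancel_work(&mdsc);
	toy_metric_destroy(&mdsc.metric);
	return mdsc.metric.bad_accesses;
}
```

In this model, toy_teardown_flush_after_destroy() is only safe because of
the "stopping" check: with stopping set to false it reports one bad access,
while toy_teardown_cancel_first() reports none either way, since the pending
work never runs.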