Message ID | 20210727201230.178286-1-jlayton@kernel.org |
---|---|
State | New |
Headers | show |
Series | ceph: cancel delayed work instead of flushing on mdsc teardown | expand |
On 7/28/21 4:12 AM, Jeff Layton wrote: > The first thing metric_delayed_work does is check mdsc->stopping, > and then return immediately if it's set...which is good since we would > have already torn down the metric structures at this point, otherwise. > > Worse yet, it's possible that the ceph_metric_destroy call could race > with the delayed_work, in which case we could end up a end up accessing > destroyed percpu variables. > > At this point in the mdsc teardown, the "stopping" flag has already been > set, so there's no benefit to flushing the work. Just cancel it instead, > and do so before we tear down the metrics structures. > > Cc: Xiubo Li <xiubli@redhat.com> > Signed-off-by: Jeff Layton <jlayton@kernel.org> > --- > fs/ceph/mds_client.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c > index c43091a30ba8..d3f2baf3c352 100644 > --- a/fs/ceph/mds_client.c > +++ b/fs/ceph/mds_client.c > @@ -4977,9 +4977,9 @@ void ceph_mdsc_destroy(struct ceph_fs_client *fsc) > > ceph_mdsc_stop(mdsc); > > + cancel_delayed_work_sync(&mdsc->metric.delayed_work); > ceph_metric_destroy(&mdsc->metric); > In the "ceph_metric_destroy()" it will also do "cancel_delayed_work_sync(&mdsc->metric.delayed_work)". We can just move the it to the front of the _destory(). > - flush_delayed_work(&mdsc->metric.delayed_work); > fsc->mdsc = NULL; > kfree(mdsc); > dout("mdsc_destroy %p done\n", mdsc);
On Thu, 2021-07-29 at 10:56 +0800, Xiubo Li wrote: > On 7/28/21 4:12 AM, Jeff Layton wrote: > > The first thing metric_delayed_work does is check mdsc->stopping, > > and then return immediately if it's set...which is good since we would > > have already torn down the metric structures at this point, otherwise. > > > > Worse yet, it's possible that the ceph_metric_destroy call could race > > with the delayed_work, in which case we could end up a end up accessing > > destroyed percpu variables. > > > > At this point in the mdsc teardown, the "stopping" flag has already been > > set, so there's no benefit to flushing the work. Just cancel it instead, > > and do so before we tear down the metrics structures. > > > > Cc: Xiubo Li <xiubli@redhat.com> > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > --- > > fs/ceph/mds_client.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c > > index c43091a30ba8..d3f2baf3c352 100644 > > --- a/fs/ceph/mds_client.c > > +++ b/fs/ceph/mds_client.c > > @@ -4977,9 +4977,9 @@ void ceph_mdsc_destroy(struct ceph_fs_client *fsc) > > > > ceph_mdsc_stop(mdsc); > > > > + cancel_delayed_work_sync(&mdsc->metric.delayed_work); > > ceph_metric_destroy(&mdsc->metric); > > > > In the "ceph_metric_destroy()" it will also do > "cancel_delayed_work_sync(&mdsc->metric.delayed_work)". > > We can just move the it to the front of the _destory(). > > Good point! I'll send a v2 after I test it out. > > > - flush_delayed_work(&mdsc->metric.delayed_work); > > fsc->mdsc = NULL; > > kfree(mdsc); > > dout("mdsc_destroy %p done\n", mdsc); > Thanks, -- Jeff Layton <jlayton@kernel.org>
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index c43091a30ba8..d3f2baf3c352 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -4977,9 +4977,9 @@ void ceph_mdsc_destroy(struct ceph_fs_client *fsc) ceph_mdsc_stop(mdsc); + cancel_delayed_work_sync(&mdsc->metric.delayed_work); ceph_metric_destroy(&mdsc->metric); - flush_delayed_work(&mdsc->metric.delayed_work); fsc->mdsc = NULL; kfree(mdsc); dout("mdsc_destroy %p done\n", mdsc);
The first thing metric_delayed_work does is check mdsc->stopping, and then return immediately if it's set...which is good since we would have already torn down the metric structures at this point, otherwise. Worse yet, it's possible that the ceph_metric_destroy call could race with the delayed_work, in which case we could end up a end up accessing destroyed percpu variables. At this point in the mdsc teardown, the "stopping" flag has already been set, so there's no benefit to flushing the work. Just cancel it instead, and do so before we tear down the metrics structures. Cc: Xiubo Li <xiubli@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> --- fs/ceph/mds_client.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)