Message ID | 20220606233142.150457-1-jlayton@kernel.org |
---|---|
State | New |
Headers | show |
Series | ceph: wait on async create before checking caps for syncfs | expand |
On Thu, Jun 9, 2022 at 12:18 PM Xiubo Li <xiubli@redhat.com> wrote: > > > On 6/9/22 12:02 PM, Yan, Zheng wrote: > > On Thu, Jun 9, 2022 at 11:56 AM Xiubo Li <xiubli@redhat.com> wrote: > >> > >> On 6/9/22 11:29 AM, Yan, Zheng wrote: > >>> On Thu, Jun 9, 2022 at 11:19 AM Xiubo Li <xiubli@redhat.com> wrote: > >>>> On 6/9/22 10:15 AM, Yan, Zheng wrote: > >>>>> The recent series of patches that add "wait on async xxxx" at various > >>>>> places do not seem correct. The correct fix should make mds avoid any > >>>>> wait when handling async requests. > >>>>> > >>>> In this case I am thinking what will happen if the async create request > >>>> is deferred, then the cap flush related request should fail to find the > >>>> ino. > >>>> > >>>> Should we wait ? Then how to distinguish from migrating a subtree and a > >>>> deferred async create cases ? > >>>> > >>> async op caps are revoked at freezingtree stage of subtree migration. > >>> see Locker::invalidate_lock_caches > >>> > >> Sorry I may not totally understand this issue. > >> > >> I think you mean in case of migration and then the MDS will revoke caps > >> for the async create files and then the kclient will send a MclientCap > >> request to mds, right ? > >> > >> If my understanding is correct, there is another case that: > >> > >> 1, async create a fileA > >> > >> 2, then write a lot of data to it and then release the Fw cap ref, and > >> if we should report the size to MDS, it will send a MclientCap request > >> to MDS too. > >> > >> 3, what if the async create of fileA was deferred due to some reason, > >> then the MclientCap request will fail to find the ino ? > >> > > Async op should not be deferred in any case. > > Recently we have hit a similar bug, caused by deferring a request and > requeuing it and the following request was executed before it. > that bug is mds bug. > >>>>> On Wed, Jun 8, 2022 at 12:56 PM Jeff Layton <jlayton@kernel.org> wrote: > >>>>>> Currently, we'll call ceph_check_caps, but if we're still waiting on the > >>>>>> reply, we'll end up spinning around on the same inode in > >>>>>> flush_dirty_session_caps. Wait for the async create reply before > >>>>>> flushing caps. > >>>>>> > >>>>>> Fixes: fbed7045f552 (ceph: wait for async create reply before sending any cap messages) > >>>>>> URL: https://tracker.ceph.com/issues/55823 > >>>>>> Signed-off-by: Jeff Layton <jlayton@kernel.org> > >>>>>> --- > >>>>>> fs/ceph/caps.c | 1 + > >>>>>> 1 file changed, 1 insertion(+) > >>>>>> > >>>>>> I don't know if this will fix the tx queue stalls completely, but I > >>>>>> haven't seen one with this patch in place. I think it makes sense on its > >>>>>> own, either way. > >>>>>> > >>>>>> diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c > >>>>>> index 0a48bf829671..5ecfff4b37c9 100644 > >>>>>> --- a/fs/ceph/caps.c > >>>>>> +++ b/fs/ceph/caps.c > >>>>>> @@ -4389,6 +4389,7 @@ static void flush_dirty_session_caps(struct ceph_mds_session *s) > >>>>>> ihold(inode); > >>>>>> dout("flush_dirty_caps %llx.%llx\n", ceph_vinop(inode)); > >>>>>> spin_unlock(&mdsc->cap_dirty_lock); > >>>>>> + ceph_wait_on_async_create(inode); > >>>>>> ceph_check_caps(ci, CHECK_CAPS_FLUSH, NULL); > >>>>>> iput(inode); > >>>>>> spin_lock(&mdsc->cap_dirty_lock); > >>>>>> -- > >>>>>> 2.36.1 > >>>>>> >
Are you suggesting that the MDS ought to hold a cap message for an inode before its create request is processed? Note that the MDS won't even be aware that the inode even _exists_ at that point. As far as the MDS knows, it's just be a delegated inode number to the client. At what point does the MDS give up on holding such a cap request if the create request never comes in for some reason? I don't see the harm in making the client wait until it gets a create reply before sending a cap message. If we want to revert fbed7045f552 instead, we can do that, but it'll cause a regression until the MDS is fixed [1]. Regardless, we need to either take this patch or revert that one. I move that we take this patch for now to address the softlockups. Once the MDS is fixed we could revert this and fbed7045f552 without causing a regression. [1]: https://tracker.ceph.com/issues/54107 On Thu, 2022-06-09 at 10:15 +0800, Yan, Zheng wrote: > The recent series of patches that add "wait on async xxxx" at various > places do not seem correct. The correct fix should make mds avoid any > wait when handling async requests. > > > On Wed, Jun 8, 2022 at 12:56 PM Jeff Layton <jlayton@kernel.org> wrote: > > > > Currently, we'll call ceph_check_caps, but if we're still waiting on the > > reply, we'll end up spinning around on the same inode in > > flush_dirty_session_caps. Wait for the async create reply before > > flushing caps. > > > > Fixes: fbed7045f552 (ceph: wait for async create reply before sending any cap messages) > > URL: https://tracker.ceph.com/issues/55823 > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > --- > > fs/ceph/caps.c | 1 + > > 1 file changed, 1 insertion(+) > > > > I don't know if this will fix the tx queue stalls completely, but I > > haven't seen one with this patch in place. I think it makes sense on its > > own, either way. > > > > diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c > > index 0a48bf829671..5ecfff4b37c9 100644 > > --- a/fs/ceph/caps.c > > +++ b/fs/ceph/caps.c > > @@ -4389,6 +4389,7 @@ static void flush_dirty_session_caps(struct ceph_mds_session *s) > > ihold(inode); > > dout("flush_dirty_caps %llx.%llx\n", ceph_vinop(inode)); > > spin_unlock(&mdsc->cap_dirty_lock); > > + ceph_wait_on_async_create(inode); > > ceph_check_caps(ci, CHECK_CAPS_FLUSH, NULL); > > iput(inode); > > spin_lock(&mdsc->cap_dirty_lock); > > -- > > 2.36.1 > >
On Wed, Jun 29, 2022 at 8:08 PM Jeff Layton <jlayton@kernel.org> wrote: > > Are you suggesting that the MDS ought to hold a cap message for an inode > before its create request is processed? Note that the MDS won't even be > aware that the inode even _exists_ at that point. As far as the MDS > knows, it's just be a delegated inode number to the client. At what > point does the MDS give up on holding such a cap request if the create > request never comes in for some reason? > For an async request, MDS should not process it immediately. If there is any wait when handling async request, it's mds bug. I suggest tracking down any wait, and fix it. > I don't see the harm in making the client wait until it gets a create > reply before sending a cap message. If we want to revert fbed7045f552 > instead, we can do that, but it'll cause a regression until the MDS is > fixed [1]. Regardless, we need to either take this patch or revert that > one. > > I move that we take this patch for now to address the softlockups. Once > the MDS is fixed we could revert this and fbed7045f552 without causing a > regression. > > [1]: https://tracker.ceph.com/issues/54107 > > > On Thu, 2022-06-09 at 10:15 +0800, Yan, Zheng wrote: > > The recent series of patches that add "wait on async xxxx" at various > > places do not seem correct. The correct fix should make mds avoid any > > wait when handling async requests. > > > > > > On Wed, Jun 8, 2022 at 12:56 PM Jeff Layton <jlayton@kernel.org> wrote: > > > > > > Currently, we'll call ceph_check_caps, but if we're still waiting on the > > > reply, we'll end up spinning around on the same inode in > > > flush_dirty_session_caps. Wait for the async create reply before > > > flushing caps. > > > > > > Fixes: fbed7045f552 (ceph: wait for async create reply before sending any cap messages) > > > URL: https://tracker.ceph.com/issues/55823 > > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > > --- > > > fs/ceph/caps.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > I don't know if this will fix the tx queue stalls completely, but I > > > haven't seen one with this patch in place. I think it makes sense on its > > > own, either way. > > > > > > diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c > > > index 0a48bf829671..5ecfff4b37c9 100644 > > > --- a/fs/ceph/caps.c > > > +++ b/fs/ceph/caps.c > > > @@ -4389,6 +4389,7 @@ static void flush_dirty_session_caps(struct ceph_mds_session *s) > > > ihold(inode); > > > dout("flush_dirty_caps %llx.%llx\n", ceph_vinop(inode)); > > > spin_unlock(&mdsc->cap_dirty_lock); > > > + ceph_wait_on_async_create(inode); > > > ceph_check_caps(ci, CHECK_CAPS_FLUSH, NULL); > > > iput(inode); > > > spin_lock(&mdsc->cap_dirty_lock); > > > -- > > > 2.36.1 > > > > > -- > Jeff Layton <jlayton@kernel.org>
On Thu, Jun 30, 2022 at 10:33 AM Yan, Zheng <ukernel@gmail.com> wrote: > > On Wed, Jun 29, 2022 at 8:08 PM Jeff Layton <jlayton@kernel.org> wrote: > > > > Are you suggesting that the MDS ought to hold a cap message for an inode > > before its create request is processed? Note that the MDS won't even be > > aware that the inode even _exists_ at that point. As far as the MDS > > knows, it's just be a delegated inode number to the client. At what > > point does the MDS give up on holding such a cap request if the create > > request never comes in for some reason? > > > For an async request, MDS should not process it immediately. If there > is any wait when handling async request, it's mds bug. I suggest > tracking down any wait, and fix it. > I mean: For an async request, MDS should process it immediately. This is important for preserving request ordering. > > > I don't see the harm in making the client wait until it gets a create > > reply before sending a cap message. If we want to revert fbed7045f552 > > instead, we can do that, but it'll cause a regression until the MDS is > > fixed [1]. Regardless, we need to either take this patch or revert that > > one. > > > > I move that we take this patch for now to address the softlockups. Once > > the MDS is fixed we could revert this and fbed7045f552 without causing a > > regression. > > > > [1]: https://tracker.ceph.com/issues/54107 > > > > > > On Thu, 2022-06-09 at 10:15 +0800, Yan, Zheng wrote: > > > The recent series of patches that add "wait on async xxxx" at various > > > places do not seem correct. The correct fix should make mds avoid any > > > wait when handling async requests. > > > > > > > > > On Wed, Jun 8, 2022 at 12:56 PM Jeff Layton <jlayton@kernel.org> wrote: > > > > > > > > Currently, we'll call ceph_check_caps, but if we're still waiting on the > > > > reply, we'll end up spinning around on the same inode in > > > > flush_dirty_session_caps. Wait for the async create reply before > > > > flushing caps. > > > > > > > > Fixes: fbed7045f552 (ceph: wait for async create reply before sending any cap messages) > > > > URL: https://tracker.ceph.com/issues/55823 > > > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > > > --- > > > > fs/ceph/caps.c | 1 + > > > > 1 file changed, 1 insertion(+) > > > > > > > > I don't know if this will fix the tx queue stalls completely, but I > > > > haven't seen one with this patch in place. I think it makes sense on its > > > > own, either way. > > > > > > > > diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c > > > > index 0a48bf829671..5ecfff4b37c9 100644 > > > > --- a/fs/ceph/caps.c > > > > +++ b/fs/ceph/caps.c > > > > @@ -4389,6 +4389,7 @@ static void flush_dirty_session_caps(struct ceph_mds_session *s) > > > > ihold(inode); > > > > dout("flush_dirty_caps %llx.%llx\n", ceph_vinop(inode)); > > > > spin_unlock(&mdsc->cap_dirty_lock); > > > > + ceph_wait_on_async_create(inode); > > > > ceph_check_caps(ci, CHECK_CAPS_FLUSH, NULL); > > > > iput(inode); > > > > spin_lock(&mdsc->cap_dirty_lock); > > > > -- > > > > 2.36.1 > > > > > > > > -- > > Jeff Layton <jlayton@kernel.org>
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 0a48bf829671..5ecfff4b37c9 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -4389,6 +4389,7 @@ static void flush_dirty_session_caps(struct ceph_mds_session *s) ihold(inode); dout("flush_dirty_caps %llx.%llx\n", ceph_vinop(inode)); spin_unlock(&mdsc->cap_dirty_lock); + ceph_wait_on_async_create(inode); ceph_check_caps(ci, CHECK_CAPS_FLUSH, NULL); iput(inode); spin_lock(&mdsc->cap_dirty_lock);
Currently, we'll call ceph_check_caps, but if we're still waiting on the reply, we'll end up spinning around on the same inode in flush_dirty_session_caps. Wait for the async create reply before flushing caps. Fixes: fbed7045f552 (ceph: wait for async create reply before sending any cap messages) URL: https://tracker.ceph.com/issues/55823 Signed-off-by: Jeff Layton <jlayton@kernel.org> --- fs/ceph/caps.c | 1 + 1 file changed, 1 insertion(+) I don't know if this will fix the tx queue stalls completely, but I haven't seen one with this patch in place. I think it makes sense on its own, either way.