Message ID | 20220804080624.14768-1-xiubli@redhat.com |
---|---|
State | New |
Headers | show |
Series | ceph: fail the open_by_handle_at() if the dentry is being unlinked | expand |
On Thu, 2022-08-04 at 16:06 +0800, xiubli@redhat.com wrote: > From: Xiubo Li <xiubli@redhat.com> > > When unlinking a file the kclient will send a unlink request to MDS > by holding the dentry reference, and then the MDS will return 2 replies, > which are unsafe reply and a deferred safe reply. > > After the unsafe reply received the kernel will return and succeed > the unlink request to user space apps. > > Only when the safe reply received the dentry's reference will be > released. Or the dentry will only be unhashed from dcache. But when > the open_by_handle_at() begins to open the unlinked files it will > succeed. > > URL: https://tracker.ceph.com/issues/56524 > Signed-off-by: Xiubo Li <xiubli@redhat.com> > --- > fs/ceph/export.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/fs/ceph/export.c b/fs/ceph/export.c > index 0ebf2bd93055..7d2ae977b8c9 100644 > --- a/fs/ceph/export.c > +++ b/fs/ceph/export.c > @@ -182,6 +182,7 @@ struct inode *ceph_lookup_inode(struct super_block *sb, u64 ino) > static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino) > { > struct inode *inode = __lookup_inode(sb, ino); > + struct dentry *dentry; > int err; > > if (IS_ERR(inode)) > @@ -197,7 +198,15 @@ static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino) > iput(inode); > return ERR_PTR(-ESTALE); > } > - return d_obtain_alias(inode); > + > + /* -ESTALE if the dentry is unhashed, which should being released */ > + dentry = d_obtain_alias(inode); > + if (d_unhashed(dentry)) { > + dput(dentry); > + return ERR_PTR(-ESTALE); > + } > + > + return dentry; > } > > static struct dentry *__snapfh_to_dentry(struct super_block *sb, Reviewed-by: Jeff Layton <jlayton@kernel.org>
While reviewing the implementation of __fh_to_dentry (in the CephFS client), I noticed a possible race condition. Linux has a syscall linkat(2) which allows, given an open file descriptor, to create a link for the file. So an inode that is unlinked can become linked. Now the problem: The line ((inode->i_nlink == 0) && !__ceph_is_file_opened(ci)) performs two checks. If, in between those checks, the file goes from the unlinked and open state to the linked and closed state, then we return -ESTALE even though the inode is linked. I don't think this is the intended behavior. I guess this (going from unlinked and open to linked and closed) can happen when a concurrent process calls linkat() and then close().
On 9/7/23 00:58, Sebastian Hasler wrote: > While reviewing the implementation of __fh_to_dentry (in the CephFS > client), I noticed a possible race condition. > > Linux has a syscall linkat(2) which allows, given an open file > descriptor, to create a link for the file. So an inode that is > unlinked can become linked. > > Now the problem: The line ((inode->i_nlink == 0) && > !__ceph_is_file_opened(ci)) performs two checks. If, in between those > checks, the file goes from the unlinked and open state to the linked > and closed state, then we return -ESTALE even though the inode is > linked. I don't think this is the intended behavior. I guess this > (going from unlinked and open to linked and closed) can happen when a > concurrent process calls linkat() and then close(). > Hi Sebastian, Thanks for your reporting. int linkat(int olddirfd, const char *oldpath, int newdirfd, const char *newpath, int flags); BTW, for "an open file descripter", do you mean "olddirfd" ? Because "olddirfd" is a dir's open file descripter, how is that possible it can become linked again ? Correct me if I'm misreading it. Thanks - Xiubo
On 07/09/2023 04:05, Xiubo Li wrote: > > int linkat(int olddirfd, const char *oldpath, int newdirfd, const char > *newpath, int flags); > > BTW, for "an open file descripter", do you mean "olddirfd" ? Because > "olddirfd" is a dir's open file descripter, how is that possible it > can become linked again ? Yes, I mean olddirfd, and the manual says: "If oldpath is an empty string, create a link to the file referenced by olddirfd".
On 11/09/2023 05:21, Xiubo Li wrote: > "This will generally not work if the file has a link count of > zero (files created with O_TMPFILE and without O_EXCL are an > exception)." In this sentence, "generally" probably means "typically", so it depends on the specific file system's behavior. I don't know the CephFS behavior in this case. However, at least files created with O_TMPFILE are an exception. Does CephFS support creating files with O_TMPFILE? Such files can (and often will) go from the unlinked and open state to the linked and closed state.
On 9/11/23 18:07, Sebastian Hasler wrote: > On 11/09/2023 05:21, Xiubo Li wrote: > >> "This will generally not work if the file has a link count of >> zero (files created with O_TMPFILE and without O_EXCL are an >> exception)." > In this sentence, "generally" probably means "typically", so it > depends on the specific file system's behavior. I don't know the > CephFS behavior in this case. However, at least files created with > O_TMPFILE are an exception. Does CephFS support creating files with > O_TMPFILE? Such files can (and often will) go from the unlinked and > open state to the linked and closed state. > Hmm, yeah. I didn't see anywhere is disallowing to create with O_TMPFILE. Let me have a look at this case carefully later. Thanks very much for your reporting Sebastian. - Xiubo
diff --git a/fs/ceph/export.c b/fs/ceph/export.c index 0ebf2bd93055..7d2ae977b8c9 100644 --- a/fs/ceph/export.c +++ b/fs/ceph/export.c @@ -182,6 +182,7 @@ struct inode *ceph_lookup_inode(struct super_block *sb, u64 ino) static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino) { struct inode *inode = __lookup_inode(sb, ino); + struct dentry *dentry; int err; if (IS_ERR(inode)) @@ -197,7 +198,15 @@ static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino) iput(inode); return ERR_PTR(-ESTALE); } - return d_obtain_alias(inode); + + /* -ESTALE if the dentry is unhashed, which should being released */ + dentry = d_obtain_alias(inode); + if (d_unhashed(dentry)) { + dput(dentry); + return ERR_PTR(-ESTALE); + } + + return dentry; } static struct dentry *__snapfh_to_dentry(struct super_block *sb,