[v3,0/3] ceph: fscrypt: fix atomic open bug for encrypted directories

Message ID 20230316181413.26916-1-lhenriques@suse.de

Message

Luis Henriques March 16, 2023, 6:14 p.m. UTC
Hi!

I started seeing fstest generic/123 failing in ceph fscrypt, when running it
with 'test_dummy_encryption'.  This test is quite simple:

1. Creates a directory with write permissions for root only
2. Writes into a file in that directory
3. Uses 'su' to try to modify that file as a different user, and
   gets -EPERM

All the test steps succeed, but the test fails to clean up: 'rm -rf <dir>'
fails with -ENOTEMPTY.  'strace' shows that the unlinkat() call to remove
the file got -ENOENT, and the one for the directory then got -ENOTEMPTY.

This is because 'su' does a drop_caches ('su (874): drop_caches: 2' in
dmesg), and ceph's atomic open will do:

	if (IS_ENCRYPTED(dir)) {
		set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags);
		if (!fscrypt_has_encryption_key(dir)) {
			spin_lock(&dentry->d_lock);
			dentry->d_flags |= DCACHE_NOKEY_NAME;
			spin_unlock(&dentry->d_lock);
		}
	}

Although 'dir' has the encryption key available, fscrypt_has_encryption_key()
will return 'false' because fscrypt info isn't yet set after the cache
cleanup.

The first patch adds a new fscrypt helper, usable from both lookup and
atomic open, that forces the fscrypt info to be loaded into an inode that
has been evicted recently but for which the key is still available.

The second and third patches switch ceph's lookup and atomic open to use
the new fscrypt helper.
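
To make the failure mode concrete, here is a minimal userspace sketch of the
behaviour described above.  It is not kernel code: every type and function
name in it (model_inode, model_dentry, model_has_key() and so on) is a
hypothetical stand-in, and it only models what this cover letter states,
namely that the key check reports 'false' until the fscrypt info is set, and
that DCACHE_NOKEY_NAME is not set when loading the info returns an error.

```c
#include <stdbool.h>

/* Userspace model only: hypothetical stand-ins, not kernel types. */
struct model_inode {
	bool key_available;      /* the encryption key has been added */
	bool crypt_info_loaded;  /* per-inode fscrypt info has been set up */
};

struct model_dentry {
	bool nokey_name;         /* stands in for DCACHE_NOKEY_NAME */
};

/* Stands in for fscrypt_has_encryption_key(): false until info is set. */
static bool model_has_key(const struct model_inode *dir)
{
	return dir->crypt_info_loaded;
}

/* Stands in for the cache drop: the inode loses its loaded info. */
static void model_drop_caches(struct model_inode *dir)
{
	dir->crypt_info_loaded = false;
}

/*
 * Stands in for fscrypt_get_encryption_info(): loads the info when the
 * key is available.  This model never fails; the real call can.
 */
static int model_get_encryption_info(struct model_inode *dir)
{
	if (dir->key_available)
		dir->crypt_info_loaded = true;
	return 0;
}

/* The check currently done in atomic open: no attempt to load the info. */
static void model_old_check(const struct model_inode *dir,
			    struct model_dentry *dentry)
{
	if (!model_has_key(dir))
		dentry->nokey_name = true;
}

/* Sketch of the new helper: load the info first, then decide. */
static int model_prepare_lookup_partial(struct model_inode *dir,
					struct model_dentry *dentry)
{
	int err = model_get_encryption_info(dir);

	if (!err && !model_has_key(dir))
		dentry->nokey_name = true;
	return err;
}
```

In this model, after model_drop_caches() on a directory whose key is
available, model_old_check() marks the dentry as a no-key name, while
model_prepare_lookup_partial() leaves it unmarked.  A directory without a
key is marked by both.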

Cheers,
--
Luís

Changes since v2:
- Make the helper more generic so that it can be used in both lookup and
  atomic open operations
- Modify ceph_lookup (patch 0002) and ceph_atomic_open (patch 0003) to use
  the new helper

Changes since v1:
- Dropped IS_ENCRYPTED() from the helper function because the kerneldoc
  already says that it applies to encrypted directories and, most
  importantly, because it would introduce different behaviour for
  CONFIG_FS_ENCRYPTION and !CONFIG_FS_ENCRYPTION.
- Rephrased helper kerneldoc

Changes since initial RFC (after Eric's review):
- Added kerneldoc comments to the new fscrypt helper
- Dropped '__' from helper name (now fscrypt_prepare_atomic_open())
- Added IS_ENCRYPTED() check in helper
- DCACHE_NOKEY_NAME is not set if fscrypt_get_encryption_info() returns an
  error
- Fixed helper for !CONFIG_FS_ENCRYPTION (now defined 'static inline')

Luís Henriques (3):
  fscrypt: new helper function - fscrypt_prepare_lookup_partial()
  ceph: switch ceph_open() to use new fscrypt helper
  ceph: switch ceph_open_atomic() to use the new fscrypt helper

 fs/ceph/dir.c           | 13 +++++++------
 fs/ceph/file.c          |  8 +++-----
 fs/crypto/hooks.c       | 37 +++++++++++++++++++++++++++++++++++++
 include/linux/fscrypt.h |  7 +++++++
 4 files changed, 54 insertions(+), 11 deletions(-)

Comments

Ilya Dryomov March 20, 2023, 11:20 a.m. UTC | #1
On Mon, Mar 20, 2023 at 2:07 AM Xiubo Li <xiubli@redhat.com> wrote:
>
>
> On 17/03/2023 02:14, Luís Henriques wrote:
>
> This series looks good to me.
>
> And I have run the test locally and it worked well.
>
>
> > Luís Henriques (3):
> >    fscrypt: new helper function - fscrypt_prepare_lookup_partial()
>
> Eric,
>
> If possible, we can pick this up together in the ceph repo, but we need
> your ack for that. Or you can pick it up in the crypto repo, in which case
> please feel free to add:
>
> Tested-by: Xiubo Li <xiubli@redhat.com>
> Reviewed-by: Xiubo Li <xiubli@redhat.com>

I would prefer the fscrypt helper to go through the fscrypt tree.

Thanks,

                Ilya

Luis Henriques March 20, 2023, 2:07 p.m. UTC | #2
Xiubo Li <xiubli@redhat.com> writes:

> On 17/03/2023 02:14, Luís Henriques wrote:
>
> This series looks good to me.
>
> And I have run the test locally and it worked well.

Awesome, thanks a lot Xiubo.  I've been testing it locally as well and I
haven't observed any breakage either.

Cheers,

Eric Biggers March 20, 2023, 10:16 p.m. UTC | #3
On Mon, Mar 20, 2023 at 08:47:18PM +0800, Xiubo Li wrote:
> 
> On 20/03/2023 19:20, Ilya Dryomov wrote:
> > On Mon, Mar 20, 2023 at 2:07 AM Xiubo Li <xiubli@redhat.com> wrote:
> > > 
> > > On 17/03/2023 02:14, Luís Henriques wrote:
> > > This series looks good to me.
> > > 
> > > And I have run the test locally and it worked well.
> > > 
> > > 
> > > > Luís Henriques (3):
> > > >     fscrypt: new helper function - fscrypt_prepare_lookup_partial()
> > > Eric,
> > > 
> > > If possible, we can pick this up together in the ceph repo, but we need
> > > your ack for that. Or you can pick it up in the crypto repo, in which
> > > case please feel free to add:
> > >
> > > Tested-by: Xiubo Li <xiubli@redhat.com>
> > > Reviewed-by: Xiubo Li <xiubli@redhat.com>
> > I would prefer the fscrypt helper to go through the fscrypt tree.
> 
> Sure. This also LGTM.
> 
> Thanks
> 

I've applied it to
https://git.kernel.org/pub/scm/fs/fscrypt/linux.git/log/?h=for-next

But I ended up reworking the comment a bit and moving the function to be just
below __fscrypt_prepare_lookup().  So I sent out v4 that matches what I applied.

BTW, I'm wondering if anyone has had any thoughts about the race condition I
described at https://lore.kernel.org/r/ZBC1P4Gn6eAKD61+@sol.localdomain/.  In
particular, I'm wondering whether this helper function will need to be changed
or not.  Maybe not, because ceph could look at DCACHE_NOKEY_NAME to determine
whether the name should be treated as a no-key name or not, instead of checking
fscrypt_has_encryption_key() again (as I think it is doing currently)?
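
A minimal userspace sketch of the difference between those two approaches
follows.  It is only one plausible reading of the concern, not a
reproduction of the race from the linked message, and all the names in it
(model_dir, model_dentry, model_lookup() and so on) are hypothetical
stand-ins rather than kernel or ceph code.

```c
#include <stdbool.h>

/* Userspace model only: hypothetical stand-ins, not kernel types. */
struct model_dir {
	bool has_key;       /* stands in for fscrypt_has_encryption_key(dir) */
};

struct model_dentry {
	bool nokey_name;    /* stands in for DCACHE_NOKEY_NAME */
};

/* Lookup time: record once how the name was interpreted. */
static void model_lookup(const struct model_dir *dir, struct model_dentry *d)
{
	d->nokey_name = !dir->has_key;
}

/* Later consumer that re-checks the key state instead of the flag. */
static bool model_treat_as_nokey_recheck(const struct model_dir *dir)
{
	return !dir->has_key;
}

/* Later consumer that trusts the flag recorded at lookup time. */
static bool model_treat_as_nokey_flag(const struct model_dentry *d)
{
	return d->nokey_name;
}
```

If the key is added between model_lookup() and the later check, the
re-checking variant treats the name as a regular name even though the lookup
treated it as a no-key name; the flag-based variant stays consistent with
what the lookup decided.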

- Eric

Luis Henriques March 21, 2023, 12:13 p.m. UTC | #4
Eric Biggers <ebiggers@kernel.org> writes:

> On Mon, Mar 20, 2023 at 08:47:18PM +0800, Xiubo Li wrote:
>> 
>> On 20/03/2023 19:20, Ilya Dryomov wrote:
>> > On Mon, Mar 20, 2023 at 2:07 AM Xiubo Li <xiubli@redhat.com> wrote:
>> > > 
>> > > On 17/03/2023 02:14, Luís Henriques wrote:
>> > > This series looks good to me.
>> > > 
>> > > And I have run the test locally and it worked well.
>> > > 
>> > > 
>> > > > Luís Henriques (3):
>> > > >     fscrypt: new helper function - fscrypt_prepare_lookup_partial()
>> > > Eric,
>> > > 
>> > > If possible, we can pick this up together in the ceph repo, but we need
>> > > your ack for that. Or you can pick it up in the crypto repo, in which
>> > > case please feel free to add:
>> > >
>> > > Tested-by: Xiubo Li <xiubli@redhat.com>
>> > > Reviewed-by: Xiubo Li <xiubli@redhat.com>
>> > I would prefer the fscrypt helper to go through the fscrypt tree.
>> 
>> Sure. This also LGTM.
>> 
>> Thanks
>> 
>
> I've applied it to
> https://git.kernel.org/pub/scm/fs/fscrypt/linux.git/log/?h=for-next
>
> But I ended up reworking the comment a bit and moving the function to be just
> below __fscrypt_prepare_lookup().  So I sent out v4 that matches what I applied.

Awesome, thanks a lot, Eric.

> BTW, I'm wondering if anyone has had any thoughts about the race condition I
> described at https://lore.kernel.org/r/ZBC1P4Gn6eAKD61+@sol.localdomain/.  In
> particular, I'm wondering whether this helper function will need to be changed
> or not.  Maybe not, because ceph could look at DCACHE_NOKEY_NAME to determine
> whether the name should be treated as a no-key name or not, instead of checking
> fscrypt_has_encryption_key() again (as I think it is doing currently)?

I started looking into that but, to be honest, I haven't reached any
conclusion yet.  It looks like the ceph code that handles filenames *may* have
this race too (I'm looking at ceph_fill_trace()) but I'm still not 100%
sure.  In any case, I think that a fix for this race (if it does indeed
exist!) will likely be restricted to the ceph code and won't touch the
generic fscrypt code.  But I'm still looking...

Cheers,