[1/3] storage: refresh volume before deletion

Message ID 4a3ceade1d4d5921e8e05452b7eadefba1a3a6c2.1457544659.git.crobinso@redhat.com
State New
Headers show

Commit Message

Cole Robinson March 9, 2016, 5:38 p.m.
file volume deletion, via virFileRemove, has some logic that depends
on uid/gid owner of the path. This info is cached in the volume XML.
We need to be sure we are using up to date uid/gid before attempting
to delete the volume, otherwise we can hit a case like this:

- test.img created with uid=root
- VM starts up using test.img, owner changed to uid=qemu
- test.img pool is refreshed, uid=qemu is cached in the volume XML
- VM shuts down, volume owner changed to gid=root
- vol-delete test.img thinks uid=qemu, virFileRemove does setuid(qemu),
  fails to delete test.img since it is actually uid=root

https://bugzilla.redhat.com/show_bug.cgi?id=1289327
---
 src/storage/storage_driver.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

-- 
2.5.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Comments

Cole Robinson March 9, 2016, 8:46 p.m. | #1
On 03/09/2016 03:28 PM, John Ferlan wrote:
> 

> 

> On 03/09/2016 12:38 PM, Cole Robinson wrote:

>> file volume deletion, via virFileRemove, has some logic that depends

>> on uid/gid owner of the path. This info is cached in the volume XML.

>> We need to be sure we are using up to date uid/gid before attempting

>> to delete the volume, otherwise we can hit a case like this:

>>

>> - test.img created with uid=root

>> - VM starts up using test.img, owner changed to uid=qemu

>> - test.img pool is refreshed, uid=qemu is cached in the volume XML

>> - VM shuts down, volume owner changed to gid=root

>> - vol-delete test.img thinks uid=qemu, virFileRemove does setuid(qemu),

>>   fails to delete test.img since it is actually uid=root

>>

>> https://bugzilla.redhat.com/show_bug.cgi?id=1289327

>> ---

>>  src/storage/storage_driver.c | 10 ++++++++++

>>  1 file changed, 10 insertions(+)

>>

> 

> Coincidentally I was thinking about this one today... Still not

> convinced this is the only "root" (ahem) cause...

> 

> For the startup/stop path, I assume your reference is the path through

> virSecurityDACSetOwnershipInternal from virSecurityDACSetOwnership and

> virSecurityDACRestoreFileLabelInternal...

> 

> But what if the mode, owner, group are provided in the XML when the

> volume is created:

> 

> # cat vol.xml

> <volume type='file'>

>   <name>test.img</name>

>   <source>

>   </source>

>   <capacity unit='bytes'>10485760</capacity>

>   <allocation unit='bytes'>10485760</allocation>

>   <target>

>     <path>/home/vm-images/test.img</path>

>     <format type='raw'/>

>     <permissions>

>       <mode>0600</mode>

>       <owner>107</owner>

>       <group>107</group>

>       <label>system_u:object_r:unlabeled_t:s0</label>

>     </permissions>

>   </target>

> </volume>

> 

> 

> # virsh vol-create default vol.xml

> Vol test.img created from vol.xml

> 

> # virsh vol-delete test.img default

> error: Failed to delete vol test.img

> error: cannot unlink file '/home/vm-images/test.img': Permission denied

> 

> #

> 

> Yes, I know the following two patches resolve this case...  What's

> interesting is the extents we've gone to on creation to set things up,

> but only a few bread crumbs are left for deletion. The assumption at

> deletion being that no one changed anything from creation, but we see

> that's not true.

> 

> I've been under the assumption that the owner gets set/changed during

> virStorageBackendCreateRaw, virStorageFileBackendFileCreate, or

> virStorageBackendCreateExecCommand.  The setting of the owner for the

> CreateRaw path would occur because the flags had FORCE_OWNER and

> FORCE_MODE set.  The setting of the owner for the CreateExecCommand in

> the non POOL_NETFS path was because they were supplied in the XML and

> not taken as the default from the running of the command to create the

> file (eg qemu-img) and taking the default file/directory ownership.

> Since we run as root, this can all happen without issues.

> 

> Then after any of the create processing happens, we can refresh the pool

> which calls virStorageBackendUpdateVolTargetInfoFD and changes the

> volume uid, gid, mode based on what the lstat(path, &sb) has found...

> This perhaps helps in the event

> 

> But when we get to delete, we have no idea how things could have been

> originally created or perhaps (likely) updated, so we only check what we

> can... If we really wanted to be elaborate - save a provided uid, gid,

> mode at creation... That would help at deletion to determine if

> something changed.  Doesn't work well for existing files (daemon start,

> restart, reload).  We'd need to save volume xml's somewhere. Probably

> far more effort...

> 


This description may be another manifestation of the issue, however
virt-install/virt-manager _never_ specify UID/GID when creating a volume, so
it's unlikely to be the case for any of the bug reporters so far.

> OK - so a too long winded way to say - I don't think the following patch

> matters as anything (including libvirt) could have changed the file's

> ownership and/or protections, but not updated the volume XML. The

> refresh updates the volume XML to match the file's protection and

> furthermore only matters in the current virFileRemove if the change is

> back to root:root and only because we compare vs. gete{uid|gid}().

> 


I don't follow this conclusion really... refreshVol syncs the XML/volume cache
with the current uid/gid/mode on disk, which is then passed down to
virFileRemove. If the cached uid is 'qemu' but the on disk uid is 'cole', the
internal cache is synced, virFileRemove will setuid(cole) and successfully
remove the disk image. Without this change, libvirt would setuid(qemu) and
fail to remove the disk.

So I don't see how this patch is unneeded, or only works for root:root

- Cole

> John

> 

>> diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c

>> index 1d96618..ded54c9 100644

>> --- a/src/storage/storage_driver.c

>> +++ b/src/storage/storage_driver.c

>> @@ -1801,6 +1801,16 @@ storageVolDelete(virStorageVolPtr obj,

>>          goto cleanup;

>>      }

>>  

>> +    /* Need to ensure we are using up to date uid/gid for deletion */

>> +    if (backend->refreshVol &&

>> +        backend->refreshVol(obj->conn, pool, vol) < 0) {

>> +        /* The file may have been removed behind libvirt's back. Don't

>> +           error here, let the deletion fail or succeed instead */

>> +        VIR_INFO("Failed to refresh volume before deletion: %s",

>> +                 virGetLastErrorMessage());

>> +        virResetLastError();

>> +    }

>> +

>>      if (storageVolDeleteInternal(obj, backend, pool, vol, flags, true) < 0)

>>          goto cleanup;

>>  

>>


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Cole Robinson March 10, 2016, 1:22 a.m. | #2
On 03/09/2016 08:20 PM, John Ferlan wrote:
> 

> [...]

> 

>>> OK - so a too long winded way to say - I don't think the following patch

>>> matters as anything (including libvirt) could have changed the file's

>>> ownership and/or protections, but not updated the volume XML. The

>>> refresh updates the volume XML to match the file's protection and

>>> furthermore only matters in the current virFileRemove if the change is

>>> back to root:root and only because we compare vs. gete{uid|gid}().

>>>

>>

>> I don't follow this conclusion really... refreshVol syncs the XML/volume cache

>> with the current uid/gid/mode on disk, which is then passed down to

>> virFileRemove. If the cached uid is 'qemu' but the on disk uid is 'cole', the

>> internal cache is synced, virFileRemove will setuid(cole) and successfully

>> remove the disk image. Without this change, libvirt would setuid(qemu) and

>> fail to remove the disk.

>>

>> So I don't see how this patch is unneeded, or only works for root:root

>>

> 

> I was considering the checks:

> 

> +    if (geteuid() != 0)

> +        return false;

> 

> We're running as root...

> 

> +

> +    /* uid/gid weren'd specified */

> +    if ((uid == (uid_t) -1) && (gid == (gid_t) -1))

> +        return false;

> 

> We've provided qemu:qemu...

> 

> +

> +    /* already running as proper uid/gid */

> +    if (uid == geteuid() && gid == getegid())

> +        return false;

> +

> 

> At this point geteuid() would return 0 (root)

> 

> Maybe I'm wrong...  I suppose it cannot hurt, but without patch 3 we'd

> go through the setuid path - I think. I could also be missing something

> really obvious.

> 


Right, we would go through the setuid path... but it would _succeed_, because
we would be setuid($correct-file-uid) and not setuid($out-of-date-file-uid)

This patch is infact still needed; consider the case when the cached uid is
out of date on NFS: we will do setuid($out-of-date-file-uid) and fail in the
same way as the original report.

- Cole

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Cole Robinson March 10, 2016, 3:53 p.m. | #3
On 03/10/2016 04:30 AM, Ján Tomko wrote:
> On Wed, Mar 09, 2016 at 12:38:58PM -0500, Cole Robinson wrote:

>> file volume deletion, via virFileRemove, has some logic that depends

>> on uid/gid owner of the path. This info is cached in the volume XML.

>> We need to be sure we are using up to date uid/gid before attempting

>> to delete the volume, otherwise we can hit a case like this:

>>

>> - test.img created with uid=root

>> - VM starts up using test.img, owner changed to uid=qemu

>> - test.img pool is refreshed, uid=qemu is cached in the volume XML

>> - VM shuts down, volume owner changed to gid=root

>> - vol-delete test.img thinks uid=qemu, virFileRemove does setuid(qemu),

>>   fails to delete test.img since it is actually uid=root

>>

> 

> For local filesystems, root should be able to remove anything.

> For root-squashed NFS, the uid:gid will usually be nobody:nobody

> the whole time, so this refresh is unlikely to learn something new.

> 

> I think the real fix is patch 3/3 that stops doing setuid on NFS.


Hmm yeah I was forgetting some NFS details :) Thanks for correcting me. I'll
drop this patch

- Cole

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Patch hide | download patch | download mbox

diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
index 1d96618..ded54c9 100644
--- a/src/storage/storage_driver.c
+++ b/src/storage/storage_driver.c
@@ -1801,6 +1801,16 @@  storageVolDelete(virStorageVolPtr obj,
         goto cleanup;
     }
 
+    /* Need to ensure we are using up to date uid/gid for deletion */
+    if (backend->refreshVol &&
+        backend->refreshVol(obj->conn, pool, vol) < 0) {
+        /* The file may have been removed behind libvirt's back. Don't
+           error here, let the deletion fail or succeed instead */
+        VIR_INFO("Failed to refresh volume before deletion: %s",
+                 virGetLastErrorMessage());
+        virResetLastError();
+    }
+
     if (storageVolDeleteInternal(obj, backend, pool, vol, flags, true) < 0)
         goto cleanup;