Message ID: 20221214194056.161492-63-michael.roth@amd.com
State: New
Series: Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support
Hi Dionna, Mike,

On 14/12/2022 21:40, Michael Roth wrote:
> From: Dionna Glaze <dionnaglaze@google.com>
>
> The /dev/sev device has the ability to store host-wide certificates for
> the key used by the AMD-SP for SEV-SNP attestation report signing,
> but for hosts that want to specify additional certificates that are
> specific to the image launched in a VM, a different way is needed to
> communicate those certificates.
>
> This patch adds two new KVM ioctl commands: KVM_SEV_SNP_{GET,SET}_CERTS
>
> The certificates that are set with this command are expected to follow
> the same format as the host certificates, but that format is opaque
> to the kernel.
>
> The new behavior for custom certificates is that the extended guest
> request command will now return the overridden certificates if they
> were installed for the instance. The error condition for a too small
> data buffer is changed to return the overridden certificate data size
> if there is an overridden certificate set installed.
>
> Setting a 0 length certificate returns the system state to only return
> the host certificates on an extended guest request.
>
> We also increase the SEV_FW_BLOB_MAX_SIZE another 4K page to allow
> space for an extra certificate.
>
> Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
>
> Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> ---
>  arch/x86/kvm/svm/sev.c   | 111 ++++++++++++++++++++++++++++++++++++++-
>  arch/x86/kvm/svm/svm.h   |   1 +
>  include/linux/psp-sev.h  |   2 +-
>  include/uapi/linux/kvm.h |  12 +++++
>  4 files changed, 123 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 4de952d1d446..d0e58cffd1ed 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2081,6 +2081,7 @@ static void *snp_context_create(struct kvm *kvm, struct kvm_sev_cmd *argp)
>          goto e_free;
>
>      sev->snp_certs_data = certs_data;
> +    sev->snp_certs_len = 0;
>
>      return context;
>
> @@ -2364,6 +2365,86 @@ static int snp_launch_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
>      return ret;
>  }
>
> +static int snp_get_instance_certs(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +    struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +    struct kvm_sev_snp_get_certs params;
> +
> +    if (!sev_snp_guest(kvm))
> +        return -ENOTTY;
> +
> +    if (!sev->snp_context)
> +        return -EINVAL;
> +
> +    if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
> +                       sizeof(params)))
> +        return -EFAULT;
> +
> +    /* No instance certs set. */
> +    if (!sev->snp_certs_len)
> +        return -ENOENT;
> +
> +    if (params.certs_len < sev->snp_certs_len) {
> +        /* Output buffer too small. Return the required size. */
> +        params.certs_len = sev->snp_certs_len;
> +
> +        if (copy_to_user((void __user *)(uintptr_t)argp->data, &params,
> +                         sizeof(params)))
> +            return -EFAULT;
> +
> +        return -EINVAL;
> +    }
> +
> +    if (copy_to_user((void __user *)(uintptr_t)params.certs_uaddr,
> +                     sev->snp_certs_data, sev->snp_certs_len))
> +        return -EFAULT;
> +
> +    return 0;
> +}
> +
> +static int snp_set_instance_certs(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +    struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +    unsigned long length = SEV_FW_BLOB_MAX_SIZE;
> +    void *to_certs = sev->snp_certs_data;
> +    struct kvm_sev_snp_set_certs params;
> +
> +    if (!sev_snp_guest(kvm))
> +        return -ENOTTY;
> +
> +    if (!sev->snp_context)
> +        return -EINVAL;
> +
> +    if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
> +                       sizeof(params)))
> +        return -EFAULT;
> +
> +    if (params.certs_len > SEV_FW_BLOB_MAX_SIZE)
> +        return -EINVAL;
> +
> +    /*
> +     * Setting a length of 0 is the same as "uninstalling" instance-
> +     * specific certificates.
> +     */
> +    if (params.certs_len == 0) {
> +        sev->snp_certs_len = 0;
> +        return 0;
> +    }
> +
> +    /* Page-align the length */
> +    length = (params.certs_len + PAGE_SIZE - 1) & PAGE_MASK;
> +
> +    if (copy_from_user(to_certs,
> +                       (void __user *)(uintptr_t)params.certs_uaddr,
> +                       params.certs_len)) {
> +        return -EFAULT;
> +    }
> +
> +    sev->snp_certs_len = length;

Here we set the length to the page-aligned value, but we copy only
params.cert_len bytes. If there are two subsequent
snp_set_instance_certs() calls where the second one has a shorter
length, we might "keep" some leftover bytes from the first call.

Consider:
1. snp_set_instance_certs(certs_addr point to "AAA...", certs_len=8192)
2. snp_set_instance_certs(certs_addr point to "BBB...", certs_len=4097)

If I understand correctly, on the second call we'll copy 4097 "BBB..."
bytes into the to_certs buffer, but length will be (4097 + PAGE_SIZE -
1) & PAGE_MASK which will be 8192.

Later when fetching the certs (for the extended report or in
snp_get_instance_certs()) the user will get a buffer of 8192 bytes
filled with 4097 BBBs and 4095 leftover AAAs.

Maybe zero sev->snp_certs_data entirely before writing to it?

Related question (not only for this patch) regarding snp_certs_data
(host or per-instance): why is its size page-aligned at all? why is it
limited by 16KB or 20KB? If I understand correctly, for SNP, this
buffer is never sent to the PSP.

> +
> +    return 0;
> +}
> +
>  int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
>  {
>      struct kvm_sev_cmd sev_cmd;

[...]

> diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
> index a1e6624540f3..970a9de0ed20 100644
> --- a/include/linux/psp-sev.h
> +++ b/include/linux/psp-sev.h
> @@ -22,7 +22,7 @@
>  #define __psp_pa(x)    __pa(x)
>  #endif
>
> -#define SEV_FW_BLOB_MAX_SIZE    0x4000    /* 16KB */
> +#define SEV_FW_BLOB_MAX_SIZE    0x5000    /* 20KB */
>

This has effects in drivers/crypto/ccp/sev-dev.c (for example in
alloc_snp_host_map). Is that OK?

-Dov

>  /**
>   * SEV platform state
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 61b1e26ced01..48bcc59cf86b 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1949,6 +1949,8 @@ enum sev_cmd_id {
>      KVM_SEV_SNP_LAUNCH_START,
>      KVM_SEV_SNP_LAUNCH_UPDATE,
>      KVM_SEV_SNP_LAUNCH_FINISH,
> +    KVM_SEV_SNP_GET_CERTS,
> +    KVM_SEV_SNP_SET_CERTS,
>
>      KVM_SEV_NR_MAX,
>  };
> @@ -2096,6 +2098,16 @@ struct kvm_sev_snp_launch_finish {
>      __u8 pad[6];
>  };
>
> +struct kvm_sev_snp_get_certs {
> +    __u64 certs_uaddr;
> +    __u64 certs_len;
> +};
> +
> +struct kvm_sev_snp_set_certs {
> +    __u64 certs_uaddr;
> +    __u64 certs_len;
> +};
> +
>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU    (1 << 0)
>  #define KVM_DEV_ASSIGN_PCI_2_3        (1 << 1)
>  #define KVM_DEV_ASSIGN_MASK_INTX    (1 << 2)
> > +
> > +static int snp_set_instance_certs(struct kvm *kvm, struct kvm_sev_cmd *argp)
> > +{
> [...]
>
> Here we set the length to the page-aligned value, but we copy only
> params.cert_len bytes. If there are two subsequent
> snp_set_instance_certs() calls where the second one has a shorter
> length, we might "keep" some leftover bytes from the first call.
>
> Consider:
> 1. snp_set_instance_certs(certs_addr point to "AAA...", certs_len=8192)
> 2. snp_set_instance_certs(certs_addr point to "BBB...", certs_len=4097)
>
> If I understand correctly, on the second call we'll copy 4097 "BBB..."
> bytes into the to_certs buffer, but length will be (4097 + PAGE_SIZE -
> 1) & PAGE_MASK which will be 8192.
>
> Later when fetching the certs (for the extended report or in
> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
> filled with 4097 BBBs and 4095 leftover AAAs.
>
> Maybe zero sev->snp_certs_data entirely before writing to it?
>

Yes, I agree it should be zeroed, at least if the previous length is
greater than the new length. Good catch.

> Related question (not only for this patch) regarding snp_certs_data
> (host or per-instance): why is its size page-aligned at all? why is it
> limited by 16KB or 20KB? If I understand correctly, for SNP, this
> buffer is never sent to the PSP.
>

The buffer is meant to be copied into the guest driver following the
GHCB extended guest request protocol. The data to copy back are
expected to be in 4K page granularity.

> [...]
> >
> > -#define SEV_FW_BLOB_MAX_SIZE 0x4000 /* 16KB */
> > +#define SEV_FW_BLOB_MAX_SIZE 0x5000 /* 20KB */
> >
>
> This has effects in drivers/crypto/ccp/sev-dev.c (for example in
> alloc_snp_host_map). Is that OK?
>
> -Dov
>

No, this was a mistake of mine because I was using a bloated data
encoding that needed 5 pages for the GUID table plus 4 small
certificates. I've since fixed that in our user space code.
We shouldn't change this size and instead wait for a better size
negotiation protocol between the guest and host to avoid this awkward
hard-coding.
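[Editor's note: the leftover-bytes problem Dov describes, and the zeroing fix Dionna agrees to, can be modeled outside the kernel. The sketch below is a hypothetical userspace stand-in for `snp_set_instance_certs()` (a plain static buffer replaces `sev->snp_certs_data`; the function and buffer names are illustrative, not kernel API), showing why a `memset()` before the copy prevents a shorter second install from exposing tail bytes of an earlier, longer one.]

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096UL
#define PAGE_MASK (~(PAGE_SIZE - 1))
#define CERTS_BUF_SIZE (5 * PAGE_SIZE)   /* stand-in for SEV_FW_BLOB_MAX_SIZE */

static uint8_t certs_data[CERTS_BUF_SIZE];   /* models sev->snp_certs_data */
static unsigned long certs_len;              /* models sev->snp_certs_len  */

/* Userspace model of snp_set_instance_certs() with the proposed fix:
 * zero the whole backing buffer before copying in the new blob, so the
 * page-aligned stored length can never cover stale bytes. */
static int set_instance_certs(const uint8_t *src, unsigned long len)
{
	if (len > CERTS_BUF_SIZE)
		return -1;

	/* A length of 0 "uninstalls" the instance certificates. */
	if (len == 0) {
		certs_len = 0;
		return 0;
	}

	memset(certs_data, 0, sizeof(certs_data));   /* the proposed fix */
	memcpy(certs_data, src, len);

	/* Page-align the stored length, as the patch does. */
	certs_len = (len + PAGE_SIZE - 1) & PAGE_MASK;
	return 0;
}
```

With the `memset()` in place, Dov's scenario (8192 'A' bytes, then 4097 'B' bytes) still reports a page-aligned length of 8192, but the 4095 trailing bytes read back as zeros instead of leftover 'A's.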
On 1/9/23 10:55, Dionna Amalie Glaze wrote:
>>> +
>>> +static int snp_set_instance_certs(struct kvm *kvm, struct kvm_sev_cmd *argp)
>>> +{
>> [...]
>>
>> Here we set the length to the page-aligned value, but we copy only
>> params.cert_len bytes. If there are two subsequent
>> snp_set_instance_certs() calls where the second one has a shorter
>> length, we might "keep" some leftover bytes from the first call.
>>
>> Consider:
>> 1. snp_set_instance_certs(certs_addr point to "AAA...", certs_len=8192)
>> 2. snp_set_instance_certs(certs_addr point to "BBB...", certs_len=4097)
>>
>> If I understand correctly, on the second call we'll copy 4097 "BBB..."
>> bytes into the to_certs buffer, but length will be (4097 + PAGE_SIZE -
>> 1) & PAGE_MASK which will be 8192.
>>
>> Later when fetching the certs (for the extended report or in
>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
>> filled with 4097 BBBs and 4095 leftover AAAs.
>>
>> Maybe zero sev->snp_certs_data entirely before writing to it?
>>
>
> Yes, I agree it should be zeroed, at least if the previous length is
> greater than the new length. Good catch.
>
>
>> Related question (not only for this patch) regarding snp_certs_data
>> (host or per-instance): why is its size page-aligned at all? why is it
>> limited by 16KB or 20KB? If I understand correctly, for SNP, this
>> buffer is never sent to the PSP.
>>
>
> The buffer is meant to be copied into the guest driver following the
> GHCB extended guest request protocol. The data to copy back are
> expected to be in 4K page granularity.

I don't think the data has to be in 4K page granularity. Why do you
think it does?

Thanks,
Tom

>
>> [...]
>>>
>>> -#define SEV_FW_BLOB_MAX_SIZE 0x4000 /* 16KB */
>>> +#define SEV_FW_BLOB_MAX_SIZE 0x5000 /* 20KB */
>>>
>>
>> This has effects in drivers/crypto/ccp/sev-dev.c (for example in
>> alloc_snp_host_map). Is that OK?
>>
>
> No, this was a mistake of mine because I was using a bloated data
> encoding that needed 5 pages for the GUID table plus 4 small
> certificates. I've since fixed that in our user space code.
> We shouldn't change this size and instead wait for a better size
> negotiation protocol between the guest and host to avoid this awkward
> hard-coding.
>
Hi Tom,

On 10/01/2023 0:27, Tom Lendacky wrote:
> On 1/9/23 10:55, Dionna Amalie Glaze wrote:
>>>> +
>>>> +static int snp_set_instance_certs(struct kvm *kvm, struct
>>>> kvm_sev_cmd *argp)
>>>> +{
>>> [...]
>>>
>>> Here we set the length to the page-aligned value, but we copy only
>>> params.cert_len bytes. If there are two subsequent
>>> snp_set_instance_certs() calls where the second one has a shorter
>>> length, we might "keep" some leftover bytes from the first call.
>>>
>>> Consider:
>>> 1. snp_set_instance_certs(certs_addr point to "AAA...", certs_len=8192)
>>> 2. snp_set_instance_certs(certs_addr point to "BBB...", certs_len=4097)
>>>
>>> If I understand correctly, on the second call we'll copy 4097 "BBB..."
>>> bytes into the to_certs buffer, but length will be (4097 + PAGE_SIZE -
>>> 1) & PAGE_MASK which will be 8192.
>>>
>>> Later when fetching the certs (for the extended report or in
>>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
>>> filled with 4097 BBBs and 4095 leftover AAAs.
>>>
>>> Maybe zero sev->snp_certs_data entirely before writing to it?
>>>
>>
>> Yes, I agree it should be zeroed, at least if the previous length is
>> greater than the new length. Good catch.
>>
>>
>>> Related question (not only for this patch) regarding snp_certs_data
>>> (host or per-instance): why is its size page-aligned at all? why is it
>>> limited by 16KB or 20KB? If I understand correctly, for SNP, this buffer
>>> is never sent to the PSP.
>>>
>>
>> The buffer is meant to be copied into the guest driver following the
>> GHCB extended guest request protocol. The data to copy back are
>> expected to be in 4K page granularity.
>
> I don't think the data has to be in 4K page granularity. Why do you
> think it does?
>

I looked at AMD publication 56421 SEV-ES Guest-Hypervisor Communication
Block Standardization (July 2022), page 37. The table says:

--------------

NAE Event: SNP Extended Guest Request

Notes:

RAX will have the guest physical address of the page(s) to hold returned
data

RBX
  State to Hypervisor: will contain the number of guest contiguous
  pages supplied to hold returned data
  State from Hypervisor: on error will contain the number of guest
  contiguous pages required to hold the data to be returned

...

The request page, response page and data page(s) must be assigned to the
hypervisor (shared).

--------------

According to this spec, it looks like the sizes are communicated as
number of pages in RBX. So the data should start at a 4KB alignment
(this is verified in snp_handle_ext_guest_request()) and its length
should be 4KB-aligned, as Dionna noted.

I see no reason (in the spec and in the kernel code) for the data length
to be limited to 16KB (SEV_FW_BLOB_MAX_SIZE) but I might be missing some
flow because Dionna ran into this limit.

-Dov

> Thanks,
> Tom
>
>>
>>> [...]
>>>>
>>>> -#define SEV_FW_BLOB_MAX_SIZE 0x4000 /* 16KB */
>>>> +#define SEV_FW_BLOB_MAX_SIZE 0x5000 /* 20KB */
>>>>
>>>
>>> This has effects in drivers/crypto/ccp/sev-dev.c (for example in
>>> alloc_snp_host_map). Is that OK?
>>>
>>
>> No, this was a mistake of mine because I was using a bloated data
>> encoding that needed 5 pages for the GUID table plus 4 small
>> certificates. I've since fixed that in our user space code.
>> We shouldn't change this size and instead wait for a better size
>> negotiation protocol between the guest and host to avoid this awkward
>> hard-coding.
>>
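[Editor's note: the RBX convention Dov quotes — page counts in, required page count back on error — boils down to a round-up division. A minimal sketch of that arithmetic (the function name is illustrative, not from the kernel or the spec):]

```c
#define PAGE_SIZE 4096UL

/* Number of guest-contiguous pages needed to hold len bytes of
 * returned data: what the hypervisor would report back in RBX on
 * error for the SNP Extended Guest Request NAE event. Equivalent to
 * the kernel's DIV_ROUND_UP(len, PAGE_SIZE). */
static unsigned long certs_pages_needed(unsigned long len)
{
	return (len + PAGE_SIZE - 1) / PAGE_SIZE;
}
```

Note this only constrains the granularity of the *communication* (page counts in RBX), which is the crux of the disagreement: the buffer must occupy whole shared pages, but nothing here forces the payload itself to be a multiple of 4KB.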
On 1/10/23 01:10, Dov Murik wrote:
> Hi Tom,
>
> On 10/01/2023 0:27, Tom Lendacky wrote:
>> On 1/9/23 10:55, Dionna Amalie Glaze wrote:
>>>>> +
>>>>> +static int snp_set_instance_certs(struct kvm *kvm, struct
>>>>> kvm_sev_cmd *argp)
>>>>> +{
>>>> [...]
>>>>
>>>> Here we set the length to the page-aligned value, but we copy only
>>>> params.cert_len bytes. If there are two subsequent
>>>> snp_set_instance_certs() calls where the second one has a shorter
>>>> length, we might "keep" some leftover bytes from the first call.
>>>>
>>>> Consider:
>>>> 1. snp_set_instance_certs(certs_addr point to "AAA...", certs_len=8192)
>>>> 2. snp_set_instance_certs(certs_addr point to "BBB...", certs_len=4097)
>>>>
>>>> If I understand correctly, on the second call we'll copy 4097 "BBB..."
>>>> bytes into the to_certs buffer, but length will be (4097 + PAGE_SIZE -
>>>> 1) & PAGE_MASK which will be 8192.
>>>>
>>>> Later when fetching the certs (for the extended report or in
>>>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
>>>> filled with 4097 BBBs and 4095 leftover AAAs.
>>>>
>>>> Maybe zero sev->snp_certs_data entirely before writing to it?
>>>>
>>>
>>> Yes, I agree it should be zeroed, at least if the previous length is
>>> greater than the new length. Good catch.
>>>
>>>
>>>> Related question (not only for this patch) regarding snp_certs_data
>>>> (host or per-instance): why is its size page-aligned at all? why is it
>>>> limited by 16KB or 20KB? If I understand correctly, for SNP, this buffer
>>>> is never sent to the PSP.
>>>>
>>>
>>> The buffer is meant to be copied into the guest driver following the
>>> GHCB extended guest request protocol. The data to copy back are
>>> expected to be in 4K page granularity.
>>
>> I don't think the data has to be in 4K page granularity. Why do you
>> think it does?
>>
>
> I looked at AMD publication 56421 SEV-ES Guest-Hypervisor Communication
> Block Standardization (July 2022), page 37. The table says:
>
> --------------
>
> NAE Event: SNP Extended Guest Request
>
> Notes:
>
> RAX will have the guest physical address of the page(s) to hold returned
> data
>
> RBX
>   State to Hypervisor: will contain the number of guest contiguous
>   pages supplied to hold returned data
>   State from Hypervisor: on error will contain the number of guest
>   contiguous pages required to hold the data to be returned
>
> ...
>
> The request page, response page and data page(s) must be assigned to the
> hypervisor (shared).
>
> --------------
>
>
> According to this spec, it looks like the sizes are communicated as
> number of pages in RBX. So the data should start at a 4KB alignment
> (this is verified in snp_handle_ext_guest_request()) and its length
> should be 4KB-aligned, as Dionna noted.

That only indicates how many pages are required to hold the data, but
the hypervisor only has to copy however much data is present. If the
data is 20 bytes, then you only have to copy 20 bytes. If the user
supplied 0 for the number of pages, then the code returns 1 in RBX to
indicate that one page is required to hold the 20 bytes.

>
> I see no reason (in the spec and in the kernel code) for the data length
> to be limited to 16KB (SEV_FW_BLOB_MAX_SIZE) but I might be missing some
> flow because Dionna ran into this limit.

Correct, there is no limit. I believe that SEV_FW_BLOB_MAX_SIZE is a way
to keep the memory usage controlled because data is coming from userspace
and it isn't expected that the data would be larger than that.

I'm not sure if that was in from the start or as a result of a review
comment. Not sure what the best approach is.

Thanks,
Tom

>
>
> -Dov
>
>
>
>> Thanks,
>> Tom
>>
>>>
>>>> [...]
>>>>>
>>>>> -#define SEV_FW_BLOB_MAX_SIZE 0x4000 /* 16KB */
>>>>> +#define SEV_FW_BLOB_MAX_SIZE 0x5000 /* 20KB */
>>>>>
>>>>
>>>> This has effects in drivers/crypto/ccp/sev-dev.c (for example in
>>>> alloc_snp_host_map). Is that OK?
>>>>
>>>
>>> No, this was a mistake of mine because I was using a bloated data
>>> encoding that needed 5 pages for the GUID table plus 4 small
>>> certificates. I've since fixed that in our user space code.
>>> We shouldn't change this size and instead wait for a better size
>>> negotiation protocol between the guest and host to avoid this awkward
>>> hard-coding.
>>>
>>>
On Tue, Jan 10, 2023 at 8:10 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 1/10/23 01:10, Dov Murik wrote:
> > Hi Tom,
> >
> > On 10/01/2023 0:27, Tom Lendacky wrote:
> >> On 1/9/23 10:55, Dionna Amalie Glaze wrote:
> >>>>> +
> >>>>> +static int snp_set_instance_certs(struct kvm *kvm, struct
> >>>>> kvm_sev_cmd *argp)
> >>>>> +{
> >>>> [...]
> >>>>
> >>>> Here we set the length to the page-aligned value, but we copy only
> >>>> params.cert_len bytes. If there are two subsequent
> >>>> snp_set_instance_certs() calls where the second one has a shorter
> >>>> length, we might "keep" some leftover bytes from the first call.
> >>>>
> >>>> Consider:
> >>>> 1. snp_set_instance_certs(certs_addr point to "AAA...", certs_len=8192)
> >>>> 2. snp_set_instance_certs(certs_addr point to "BBB...", certs_len=4097)
> >>>>
> >>>> If I understand correctly, on the second call we'll copy 4097 "BBB..."
> >>>> bytes into the to_certs buffer, but length will be (4097 + PAGE_SIZE -
> >>>> 1) & PAGE_MASK which will be 8192.
> >>>>
> >>>> Later when fetching the certs (for the extended report or in
> >>>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
> >>>> filled with 4097 BBBs and 4095 leftover AAAs.
> >>>>
> >>>> Maybe zero sev->snp_certs_data entirely before writing to it?
> >>>>
> >>>
> >>> Yes, I agree it should be zeroed, at least if the previous length is
> >>> greater than the new length. Good catch.
> >>>
> >>>
> >>>> Related question (not only for this patch) regarding snp_certs_data
> >>>> (host or per-instance): why is its size page-aligned at all? why is it
> >>>> limited by 16KB or 20KB? If I understand correctly, for SNP, this buffer
> >>>> is never sent to the PSP.
> >>>>
> >>>
> >>> The buffer is meant to be copied into the guest driver following the
> >>> GHCB extended guest request protocol. The data to copy back are
> >>> expected to be in 4K page granularity.
> >>
> >> I don't think the data has to be in 4K page granularity. Why do you
> >> think it does?
> >>
> >
> > I looked at AMD publication 56421 SEV-ES Guest-Hypervisor Communication
> > Block Standardization (July 2022), page 37. The table says:
> >
> > --------------
> >
> > NAE Event: SNP Extended Guest Request
> >
> > Notes:
> >
> > RAX will have the guest physical address of the page(s) to hold returned
> > data
> >
> > RBX
> >   State to Hypervisor: will contain the number of guest contiguous
> >   pages supplied to hold returned data
> >   State from Hypervisor: on error will contain the number of guest
> >   contiguous pages required to hold the data to be returned
> >
> > ...
> >
> > The request page, response page and data page(s) must be assigned to the
> > hypervisor (shared).
> >
> > --------------
> >
> >
> > According to this spec, it looks like the sizes are communicated as
> > number of pages in RBX. So the data should start at a 4KB alignment
> > (this is verified in snp_handle_ext_guest_request()) and its length
> > should be 4KB-aligned, as Dionna noted.
>
> That only indicates how many pages are required to hold the data, but the
> hypervisor only has to copy however much data is present. If the data is
> 20 bytes, then you only have to copy 20 bytes. If the user supplied 0 for
> the number of pages, then the code returns 1 in RBX to indicate that one
> page is required to hold the 20 bytes.
>
> >
> > I see no reason (in the spec and in the kernel code) for the data length
> > to be limited to 16KB (SEV_FW_BLOB_MAX_SIZE) but I might be missing some
> > flow because Dionna ran into this limit.
>
> Correct, there is no limit. I believe that SEV_FW_BLOB_MAX_SIZE is a way
> to keep the memory usage controlled because data is coming from userspace
> and it isn't expected that the data would be larger than that.
>
> I'm not sure if that was in from the start or as a result of a review
> comment. Not sure what the best approach is.

This was discussed a bit in the recent guest driver changes too:
SEV_FW_BLOB_MAX_SIZE is also used in the guest driver code for the max
cert length. We discussed increasing the limit there after fixing the
IV reuse issue.

Maybe we could introduce SEV_CERT_BLOB_MAX_SIZE here to be more clear
there is no firmware-based limit? Then we could switch the guest driver
to use that too. Dionna confirmed 4 pages is enough for our current
usecase; Dov, would you recommend something larger to start?

>
> Thanks,
> Tom
>
> >
> >
> > -Dov
> >
> >
> >
> >> Thanks,
> >> Tom
> >>
> >>>
> >>>> [...]
> >>>>>
> >>>>> -#define SEV_FW_BLOB_MAX_SIZE 0x4000 /* 16KB */
> >>>>> +#define SEV_FW_BLOB_MAX_SIZE 0x5000 /* 20KB */
> >>>>>
> >>>>
> >>>> This has effects in drivers/crypto/ccp/sev-dev.c (for example in
> >>>> alloc_snp_host_map). Is that OK?
> >>>>
> >>>
> >>> No, this was a mistake of mine because I was using a bloated data
> >>> encoding that needed 5 pages for the GUID table plus 4 small
> >>> certificates. I've since fixed that in our user space code.
> >>> We shouldn't change this size and instead wait for a better size
> >>> negotiation protocol between the guest and host to avoid this awkward
> >>> hard-coding.
> >>>
> >>>
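[Editor's note: Peter's suggestion amounts to decoupling the two limits so the cert cap can be tuned without touching the PSP firmware-blob path. A hypothetical sketch of that split (the `SEV_CERT_BLOB_MAX_SIZE` name and the 4-page value come from this discussion; neither is in the posted patch):]

```c
/* Existing cap on blobs exchanged with the PSP firmware. */
#define SEV_FW_BLOB_MAX_SIZE	0x4000UL	/* 16KB */

/* Proposed separate cap for instance certificate blobs; 4 pages per
 * Dionna's current need, and explicitly NOT a firmware limit, so it
 * can grow independently of SEV_FW_BLOB_MAX_SIZE. */
#define SEV_CERT_BLOB_MAX_SIZE	0x4000UL	/* 16KB, tunable */

/* Bound check the set-certs path would use instead of the FW limit. */
static int certs_len_ok(unsigned long certs_len)
{
	return certs_len <= SEV_CERT_BLOB_MAX_SIZE;
}
```

The design point is that the 16KB/20KB number was never a PSP constraint for this path, only a memory-usage guard on userspace input, so a dedicated constant documents that and lets host and guest share one agreed value.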
On 10/01/2023 17:10, Tom Lendacky wrote:
> On 1/10/23 01:10, Dov Murik wrote:
>> Hi Tom,
>>
>> On 10/01/2023 0:27, Tom Lendacky wrote:
>>> On 1/9/23 10:55, Dionna Amalie Glaze wrote:
>>>>>> +
>>>>>> +static int snp_set_instance_certs(struct kvm *kvm, struct
>>>>>> kvm_sev_cmd *argp)
>>>>>> +{
>>>>> [...]
>>>>>
>>>>> Here we set the length to the page-aligned value, but we copy only
>>>>> params.cert_len bytes. If there are two subsequent
>>>>> snp_set_instance_certs() calls where the second one has a shorter
>>>>> length, we might "keep" some leftover bytes from the first call.
>>>>>
>>>>> Consider:
>>>>> 1. snp_set_instance_certs(certs_addr point to "AAA...",
>>>>> certs_len=8192)
>>>>> 2. snp_set_instance_certs(certs_addr point to "BBB...",
>>>>> certs_len=4097)
>>>>>
>>>>> If I understand correctly, on the second call we'll copy 4097 "BBB..."
>>>>> bytes into the to_certs buffer, but length will be (4097 + PAGE_SIZE -
>>>>> 1) & PAGE_MASK which will be 8192.
>>>>>
>>>>> Later when fetching the certs (for the extended report or in
>>>>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
>>>>> filled with 4097 BBBs and 4095 leftover AAAs.
>>>>>
>>>>> Maybe zero sev->snp_certs_data entirely before writing to it?
>>>>>
>>>>
>>>> Yes, I agree it should be zeroed, at least if the previous length is
>>>> greater than the new length. Good catch.
>>>>
>>>>
>>>>> Related question (not only for this patch) regarding snp_certs_data
>>>>> (host or per-instance): why is its size page-aligned at all? why is it
>>>>> limited by 16KB or 20KB? If I understand correctly, for SNP, this
>>>>> buffer
>>>>> is never sent to the PSP.
>>>>>
>>>>
>>>> The buffer is meant to be copied into the guest driver following the
>>>> GHCB extended guest request protocol. The data to copy back are
>>>> expected to be in 4K page granularity.
>>>
>>> I don't think the data has to be in 4K page granularity. Why do you
>>> think it does?
>>>
>>
>> I looked at AMD publication 56421 SEV-ES Guest-Hypervisor Communication
>> Block Standardization (July 2022), page 37. The table says:
>>
>> --------------
>>
>> NAE Event: SNP Extended Guest Request
>>
>> Notes:
>>
>> RAX will have the guest physical address of the page(s) to hold returned
>> data
>>
>> RBX
>>   State to Hypervisor: will contain the number of guest contiguous
>>   pages supplied to hold returned data
>>   State from Hypervisor: on error will contain the number of guest
>>   contiguous pages required to hold the data to be returned
>>
>> ...
>>
>> The request page, response page and data page(s) must be assigned to the
>> hypervisor (shared).
>>
>> --------------
>>
>>
>> According to this spec, it looks like the sizes are communicated as
>> number of pages in RBX. So the data should start at a 4KB alignment
>> (this is verified in snp_handle_ext_guest_request()) and its length
>> should be 4KB-aligned, as Dionna noted.
>
> That only indicates how many pages are required to hold the data, but
> the hypervisor only has to copy however much data is present. If the
> data is 20 bytes, then you only have to copy 20 bytes. If the user
> supplied 0 for the number of pages, then the code returns 1 in RBX to
> indicate that one page is required to hold the 20 bytes.
>

Maybe it should only copy 20 bytes, but the current implementation
copies whole 4KB pages:

	if (sev->snp_certs_len)
		data_npages = sev->snp_certs_len >> PAGE_SHIFT;
	...
	...
	/* Copy the certificate blob in the guest memory */
	if (data_npages &&
	    kvm_write_guest(kvm, data_gpa, sev->snp_certs_data,
			    data_npages << PAGE_SHIFT))
		rc = SEV_RET_INVALID_ADDRESS;

(elsewhere we ensure that sev->snp_certs_len is page-aligned, so the
assignment to data_npages is in fact correct even though it looks
off-by-one; as an aside, maybe it's better to use some DIV_ROUND_UP
macro anywhere we calculate the number of needed pages.)

Also -- how does the guest know they got only 20 bytes and not 4096?
Do they have to read all the 'struct cert_table' entries at the
beginning of the received data?

-Dov

>>
>> I see no reason (in the spec and in the kernel code) for the data length
>> to be limited to 16KB (SEV_FW_BLOB_MAX_SIZE) but I might be missing some
>> flow because Dionna ran into this limit.
>
> Correct, there is no limit. I believe that SEV_FW_BLOB_MAX_SIZE is a way
> to keep the memory usage controlled because data is coming from
> userspace and it isn't expected that the data would be larger than that.
>
> I'm not sure if that was in from the start or as a result of a review
> comment. Not sure what the best approach is.
>
> Thanks,
> Tom
>
>>
>>
>> -Dov
>>
>>
>>
>>> Thanks,
>>> Tom
>>>
>>>>
>>>>> [...]
>>>>>>
>>>>>> -#define SEV_FW_BLOB_MAX_SIZE 0x4000 /* 16KB */
>>>>>> +#define SEV_FW_BLOB_MAX_SIZE 0x5000 /* 20KB */
>>>>>>
>>>>>
>>>>> This has effects in drivers/crypto/ccp/sev-dev.c (for example in
>>>>> alloc_snp_host_map). Is that OK?
>>>>>
>>>>
>>>> No, this was a mistake of mine because I was using a bloated data
>>>> encoding that needed 5 pages for the GUID table plus 4 small
>>>> certificates. I've since fixed that in our user space code.
>>>> We shouldn't change this size and instead wait for a better size
>>>> negotiation protocol between the guest and host to avoid this awkward
>>>> hard-coding.
>>>>
>>>>
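[Editor's note: the answer to Dov's question is the GUID certificate table that starts the returned data pages: the guest walks {GUID, offset, length} entries, terminated by a zeroed entry, to learn how much of the page-granular buffer is real data. The sketch below assumes that table layout (as in the GHCB spec and the Linux sev-guest driver's `struct cert_table`); the walker function itself is illustrative, not an existing API.]

```c
#include <stdint.h>
#include <string.h>

/* One record of the certificate table at the start of the returned
 * data page(s); the table ends with an all-zero entry. */
struct cert_table_entry {
	uint8_t  guid[16];
	uint32_t offset;	/* byte offset of the cert in the buffer */
	uint32_t length;	/* byte length of the cert */
};

/* Walk the table and return the end offset of the last certificate;
 * everything past that in the page-aligned buffer is padding, so the
 * guest can tell "20 real bytes" apart from "one 4096-byte page". */
static uint32_t certs_data_extent(const uint8_t *buf)
{
	static const struct cert_table_entry zero;	/* terminator */
	const struct cert_table_entry *e = (const struct cert_table_entry *)buf;
	uint32_t end = 0;

	for (; memcmp(e, &zero, sizeof(zero)) != 0; e++)
		if (e->offset + e->length > end)
			end = e->offset + e->length;
	return end;
}
```

So yes: the guest has to parse the cert-table entries, and the trailing bytes of the last page are ignorable padding, which is also why zeroing the host-side buffer (rather than shipping stale bytes) is the safe behavior.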
Hi Peter, On 10/01/2023 17:23, Peter Gonda wrote: > On Tue, Jan 10, 2023 at 8:10 AM Tom Lendacky <thomas.lendacky@amd.com> wrote: >> >> On 1/10/23 01:10, Dov Murik wrote: >>> Hi Tom, >>> >>> On 10/01/2023 0:27, Tom Lendacky wrote: >>>> On 1/9/23 10:55, Dionna Amalie Glaze wrote: >>>>>>> + >>>>>>> +static int snp_set_instance_certs(struct kvm *kvm, struct >>>>>>> kvm_sev_cmd *argp) >>>>>>> +{ >>>>>> [...] >>>>>> >>>>>> Here we set the length to the page-aligned value, but we copy only >>>>>> params.cert_len bytes. If there are two subsequent >>>>>> snp_set_instance_certs() calls where the second one has a shorter >>>>>> length, we might "keep" some leftover bytes from the first call. >>>>>> >>>>>> Consider: >>>>>> 1. snp_set_instance_certs(certs_addr point to "AAA...", certs_len=8192) >>>>>> 2. snp_set_instance_certs(certs_addr point to "BBB...", certs_len=4097) >>>>>> >>>>>> If I understand correctly, on the second call we'll copy 4097 "BBB..." >>>>>> bytes into the to_certs buffer, but length will be (4096 + PAGE_SIZE - >>>>>> 1) & PAGE_MASK which will be 8192. >>>>>> >>>>>> Later when fetching the certs (for the extended report or in >>>>>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes >>>>>> filled with 4097 BBBs and 4095 leftover AAAs. >>>>>> >>>>>> Maybe zero sev->snp_certs_data entirely before writing to it? >>>>>> >>>>> >>>>> Yes, I agree it should be zeroed, at least if the previous length is >>>>> greater than the new length. Good catch. >>>>> >>>>> >>>>>> Related question (not only for this patch) regarding snp_certs_data >>>>>> (host or per-instance): why is its size page-aligned at all? why is it >>>>>> limited by 16KB or 20KB? If I understand correctly, for SNP, this buffer >>>>>> is never sent to the PSP. >>>>>> >>>>> >>>>> The buffer is meant to be copied into the guest driver following the >>>>> GHCB extended guest request protocol. The data to copy back are >>>>> expected to be in 4K page granularity. 
>>>> >>>> I don't think the data has to be in 4K page granularity. Why do you >>>> think it does? >>>> >>> >>> I looked at AMD publication 56421 SEV-ES Guest-Hypervisor Communication >>> Block Standardization (July 2022), page 37. The table says: >>> >>> -------------- >>> >>> NAE Event: SNP Extended Guest Request >>> >>> Notes: >>> >>> RAX will have the guest physical address of the page(s) to hold returned >>> data >>> >>> RBX >>> State to Hypervisor: will contain the number of guest contiguous >>> pages supplied to hold returned data >>> State from Hypervisor: on error will contain the number of guest >>> contiguous pages required to hold the data to be returned >>> >>> ... >>> >>> The request page, response page and data page(s) must be assigned to the >>> hypervisor (shared). >>> >>> -------------- >>> >>> >>> According to this spec, it looks like the sizes are communicated as >>> number of pages in RBX. So the data should start at a 4KB alignment >>> (this is verified in snp_handle_ext_guest_request()) and its length >>> should be 4KB-aligned, as Dionna noted. >> >> That only indicates how many pages are required to hold the data, but the >> hypervisor only has to copy however much data is present. If the data is >> 20 bytes, then you only have to copy 20 bytes. If the user supplied 0 for >> the number of pages, then the code returns 1 in RBX to indicate that one >> page is required to hold the 20 bytes. >> >>> >>> I see no reason (in the spec and in the kernel code) for the data length >>> to be limited to 16KB (SEV_FW_BLOB_MAX_SIZE) but I might be missing some >>> flow because Dionna ran into this limit. >> >> Correct, there is no limit. I believe that SEV_FW_BLOB_MAX_SIZE is a way >> to keep the memory usage controlled because data is coming from userspace >> and it isn't expected that the data would be larger than that. >> >> I'm not sure if that was in from the start or as a result of a review >> comment. Not sure what is the best approach is. 
>
> This was discussed a bit in the guest driver changes recently too that
> SEV_FW_BLOB_MAX_SIZE is used in the guest driver code for the max cert
> length. We discussed increasing the limit there after fixing the IV
> reuse issue.

I see it now.

(Joerg, maybe we should add F: drivers/virt/coco/ to the MAINTAINERS
list so that patches there are hopefully sent to linux-coco?)

>
> Maybe we could introduce SEV_CERT_BLOB_MAX_SIZE here to be more clear
> there is no firmware based limit? Then we could switch the guest
> driver to use that too. Dionna confirmed 4 pages is enough for our
> current usecase, Dov would you recommend something larger to start?
>

Introducing a new constant sounds good to me (and use the same constant
in the guest driver).

I think 4 pages are OK; I also don't see real harm in increasing this
limit to 1 MB (if the host+guest agree to pass more stuff there, besides
certificates). But maybe that's just abusing this channel, and for other
data we should use other mechanisms (like vsock).

-Dov
On 1/11/23 00:00, Dov Murik wrote:
>
>
> On 10/01/2023 17:10, Tom Lendacky wrote:
>> On 1/10/23 01:10, Dov Murik wrote:
>>> Hi Tom,
>>>
>>> On 10/01/2023 0:27, Tom Lendacky wrote:
>>>> On 1/9/23 10:55, Dionna Amalie Glaze wrote:
>>>>>>> +
>>>>>>> +static int snp_set_instance_certs(struct kvm *kvm, struct
>>>>>>> kvm_sev_cmd *argp)
>>>>>>> +{
>>>>>> [...]
>>>>>>
>>>>>> Here we set the length to the page-aligned value, but we copy only
>>>>>> params.cert_len bytes. If there are two subsequent
>>>>>> snp_set_instance_certs() calls where the second one has a shorter
>>>>>> length, we might "keep" some leftover bytes from the first call.
>>>>>>
>>>>>> Consider:
>>>>>> 1. snp_set_instance_certs(certs_addr point to "AAA...",
>>>>>> certs_len=8192)
>>>>>> 2. snp_set_instance_certs(certs_addr point to "BBB...",
>>>>>> certs_len=4097)
>>>>>>
>>>>>> If I understand correctly, on the second call we'll copy 4097 "BBB..."
>>>>>> bytes into the to_certs buffer, but length will be (4096 + PAGE_SIZE -
>>>>>> 1) & PAGE_MASK which will be 8192.
>>>>>>
>>>>>> Later when fetching the certs (for the extended report or in
>>>>>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
>>>>>> filled with 4097 BBBs and 4095 leftover AAAs.
>>>>>>
>>>>>> Maybe zero sev->snp_certs_data entirely before writing to it?
>>>>>>
>>>>>
>>>>> Yes, I agree it should be zeroed, at least if the previous length is
>>>>> greater than the new length. Good catch.
>>>>>
>>>>>
>>>>>> Related question (not only for this patch) regarding snp_certs_data
>>>>>> (host or per-instance): why is its size page-aligned at all? why is it
>>>>>> limited by 16KB or 20KB? If I understand correctly, for SNP, this
>>>>>> buffer
>>>>>> is never sent to the PSP.
>>>>>>
>>>>>
>>>>> The buffer is meant to be copied into the guest driver following the
>>>>> GHCB extended guest request protocol. The data to copy back are
>>>>> expected to be in 4K page granularity.
>>>>
>>>> I don't think the data has to be in 4K page granularity. Why do you
>>>> think it does?
>>>>
>>>
>>> I looked at AMD publication 56421, SEV-ES Guest-Hypervisor Communication
>>> Block Standardization (July 2022), page 37. The table says:
>>>
>>> --------------
>>>
>>> NAE Event: SNP Extended Guest Request
>>>
>>> Notes:
>>>
>>> RAX will have the guest physical address of the page(s) to hold returned
>>> data
>>>
>>> RBX
>>> State to Hypervisor: will contain the number of guest contiguous
>>> pages supplied to hold returned data
>>> State from Hypervisor: on error will contain the number of guest
>>> contiguous pages required to hold the data to be returned
>>>
>>> ...
>>>
>>> The request page, response page and data page(s) must be assigned to the
>>> hypervisor (shared).
>>>
>>> --------------
>>>
>>>
>>> According to this spec, it looks like the sizes are communicated as a
>>> number of pages in RBX. So the data should start at a 4KB alignment
>>> (this is verified in snp_handle_ext_guest_request()) and its length
>>> should be 4KB-aligned, as Dionna noted.
>>
>> That only indicates how many pages are required to hold the data, but
>> the hypervisor only has to copy however much data is present. If the
>> data is 20 bytes, then you only have to copy 20 bytes. If the user
>> supplied 0 for the number of pages, then the code returns 1 in RBX to
>> indicate that one page is required to hold the 20 bytes.
>>

Maybe it should only copy 20 bytes, but the current implementation
copies whole 4KB pages:

	if (sev->snp_certs_len)
		data_npages = sev->snp_certs_len >> PAGE_SHIFT;
	...
	...
	/* Copy the certificate blob in the guest memory */
	if (data_npages &&
	    kvm_write_guest(kvm, data_gpa, sev->snp_certs_data,
			    data_npages << PAGE_SHIFT))
		rc = SEV_RET_INVALID_ADDRESS;

(Elsewhere we ensure that sev->snp_certs_len is page-aligned, so the
assignment to data_npages is in fact correct even though it looks
off-by-one; as an aside, maybe it's better to use some DIV_ROUND_UP
macro anywhere we calculate the number of needed pages.)

Hmmm... yeah, not sure why it was implemented that way. I guess it can
always be changed later if desired.

>
> Also -- how does the guest know they got only 20 bytes and not 4096? Do
> they have to read all the 'struct cert_table' entries at the beginning
> of the received data?

Yes, they should walk the cert table entries.

Thanks,
Tom

>
> -Dov
>
>
>>>
>>> I see no reason (in the spec and in the kernel code) for the data length
>>> to be limited to 16KB (SEV_FW_BLOB_MAX_SIZE) but I might be missing some
>>> flow because Dionna ran into this limit.
>>
>> Correct, there is no limit. I believe that SEV_FW_BLOB_MAX_SIZE is a way
>> to keep the memory usage controlled because data is coming from
>> userspace and it isn't expected that the data would be larger than that.
>>
>> I'm not sure if that was in from the start or as a result of a review
>> comment. Not sure what the best approach is.
>>
>> Thanks,
>> Tom
>>
>>>
>>>
>>> -Dov
>>>
>>>
>>>
>>>> Thanks,
>>>> Tom
>>>>
>>>>>
>>>>>> [...]
>>>>>>>
>>>>>>> -#define SEV_FW_BLOB_MAX_SIZE	0x4000	/* 16KB */
>>>>>>> +#define SEV_FW_BLOB_MAX_SIZE	0x5000	/* 20KB */
>>>>>>>
>>>>>>
>>>>>> This has effects in drivers/crypto/ccp/sev-dev.c (for
>>>>>> example in alloc_snp_host_map). Is that OK?
>>>>>>
>>>>>
>>>>> No, this was a mistake of mine because I was using a bloated data
>>>>> encoding that needed 5 pages for the GUID table plus 4 small
>>>>> certificates. I've since fixed that in our user space code.
>>>>> We shouldn't change this size and instead wait for a better size
>>>>> negotiation protocol between the guest and host to avoid this awkward
>>>>> hard-coding.
>>>>>
>>>>>
> +
> +	/* Page-align the length */
> +	length = (params.certs_len + PAGE_SIZE - 1) & PAGE_MASK;
> +

I believe Ashish wanted this to be PAGE_ALIGN(params.certs_len)
On Thu, Jan 19, 2023 at 2:18 PM Kalra, Ashish <ashish.kalra@amd.com> wrote:
>
> Hello Dionna,
>
> Do you also have other updates to this patch with regard to review
> comments from Dov ?
>

Apart from the PAGE_ALIGN change, the result of the whole discussion
appears to only need the following immediately before the copy_from_user
of certs_uaddr in the snp_set_instance_certs function:

	/* The size could shrink and leave garbage at the end. */
	memset(sev->snp_certs_data, 0, SEV_FW_BLOB_MAX_SIZE);

I don't believe there is an off-by-one with the page shifting for the
number of pages because snp_certs_len is already rounded up to the
nearest page size.

Any other change wrt the way the blob size is decided between the guest
and host should come later.
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 4de952d1d446..d0e58cffd1ed 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2081,6 +2081,7 @@ static void *snp_context_create(struct kvm *kvm, struct kvm_sev_cmd *argp)
 		goto e_free;
 
 	sev->snp_certs_data = certs_data;
+	sev->snp_certs_len = 0;
 
 	return context;
 
@@ -2364,6 +2365,86 @@ static int snp_launch_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	return ret;
 }
 
+static int snp_get_instance_certs(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	struct kvm_sev_snp_get_certs params;
+
+	if (!sev_snp_guest(kvm))
+		return -ENOTTY;
+
+	if (!sev->snp_context)
+		return -EINVAL;
+
+	if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
+			   sizeof(params)))
+		return -EFAULT;
+
+	/* No instance certs set. */
+	if (!sev->snp_certs_len)
+		return -ENOENT;
+
+	if (params.certs_len < sev->snp_certs_len) {
+		/* Output buffer too small. Return the required size. */
+		params.certs_len = sev->snp_certs_len;
+
+		if (copy_to_user((void __user *)(uintptr_t)argp->data,
+				 &params, sizeof(params)))
+			return -EFAULT;
+
+		return -EINVAL;
+	}
+
+	if (copy_to_user((void __user *)(uintptr_t)params.certs_uaddr,
+			 sev->snp_certs_data, sev->snp_certs_len))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int snp_set_instance_certs(struct kvm *kvm, struct kvm_sev_cmd *argp)
+{
+	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
+	unsigned long length = SEV_FW_BLOB_MAX_SIZE;
+	void *to_certs = sev->snp_certs_data;
+	struct kvm_sev_snp_set_certs params;
+
+	if (!sev_snp_guest(kvm))
+		return -ENOTTY;
+
+	if (!sev->snp_context)
+		return -EINVAL;
+
+	if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
+			   sizeof(params)))
+		return -EFAULT;
+
+	if (params.certs_len > SEV_FW_BLOB_MAX_SIZE)
+		return -EINVAL;
+
+	/*
+	 * Setting a length of 0 is the same as "uninstalling" instance-
+	 * specific certificates.
+	 */
+	if (params.certs_len == 0) {
+		sev->snp_certs_len = 0;
+		return 0;
+	}
+
+	/* Page-align the length */
+	length = (params.certs_len + PAGE_SIZE - 1) & PAGE_MASK;
+
+	if (copy_from_user(to_certs,
+			   (void __user *)(uintptr_t)params.certs_uaddr,
+			   params.certs_len)) {
+		return -EFAULT;
+	}
+
+	sev->snp_certs_len = length;
+
+	return 0;
+}
+
 int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_sev_cmd sev_cmd;
@@ -2463,6 +2544,12 @@ int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 	case KVM_SEV_SNP_LAUNCH_FINISH:
 		r = snp_launch_finish(kvm, &sev_cmd);
 		break;
+	case KVM_SEV_SNP_GET_CERTS:
+		r = snp_get_instance_certs(kvm, &sev_cmd);
+		break;
+	case KVM_SEV_SNP_SET_CERTS:
+		r = snp_set_instance_certs(kvm, &sev_cmd);
+		break;
 	default:
 		r = -EINVAL;
 		goto out;
@@ -3575,8 +3662,28 @@ static void snp_handle_ext_guest_request(struct vcpu_svm *svm, gpa_t req_gpa, gp
 	if (rc)
 		goto unlock;
 
-	rc = snp_guest_ext_guest_request(&req, (unsigned long)sev->snp_certs_data,
-					 &data_npages, &err);
+	/*
+	 * If the VMM has overridden the certs, then change the error message
+	 * if the size is inappropriate for the override. Otherwise, use a
+	 * regular guest request and copy back the instance certs.
+	 */
+	if (sev->snp_certs_len) {
+		if ((data_npages << PAGE_SHIFT) < sev->snp_certs_len) {
+			rc = -EINVAL;
+			err = SNP_GUEST_REQ_INVALID_LEN;
+			goto datalen;
+		}
+		rc = sev_issue_cmd(kvm, SEV_CMD_SNP_GUEST_REQUEST, &req,
+				   (int *)&err);
+	} else {
+		rc = snp_guest_ext_guest_request(&req,
+						 (unsigned long)sev->snp_certs_data,
+						 &data_npages, &err);
+	}
+datalen:
+	if (sev->snp_certs_len)
+		data_npages = sev->snp_certs_len >> PAGE_SHIFT;
+
 	if (rc) {
 		/*
 		 * If buffer length is small then return the expected
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 38aa579f6f70..8d1ba66860a4 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -102,6 +102,7 @@ struct kvm_sev_info {
 	void *snp_context;      /* SNP guest context page */
 	spinlock_t psc_lock;
 	void *snp_certs_data;
+	unsigned int snp_certs_len; /* Size of instance override for certs */
 	struct mutex guest_req_lock;
 	u64 sev_features;	/* Features set at VMSA creation */
diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
index a1e6624540f3..970a9de0ed20 100644
--- a/include/linux/psp-sev.h
+++ b/include/linux/psp-sev.h
@@ -22,7 +22,7 @@
 #define __psp_pa(x)	__pa(x)
 #endif
 
-#define SEV_FW_BLOB_MAX_SIZE	0x4000	/* 16KB */
+#define SEV_FW_BLOB_MAX_SIZE	0x5000	/* 20KB */
 
 /**
  * SEV platform state
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 61b1e26ced01..48bcc59cf86b 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1949,6 +1949,8 @@ enum sev_cmd_id {
 	KVM_SEV_SNP_LAUNCH_START,
 	KVM_SEV_SNP_LAUNCH_UPDATE,
 	KVM_SEV_SNP_LAUNCH_FINISH,
+	KVM_SEV_SNP_GET_CERTS,
+	KVM_SEV_SNP_SET_CERTS,
 	KVM_SEV_NR_MAX,
 };
 
@@ -2096,6 +2098,16 @@ struct kvm_sev_snp_launch_finish {
 	__u8 pad[6];
 };
 
+struct kvm_sev_snp_get_certs {
+	__u64 certs_uaddr;
+	__u64 certs_len;
+};
+
+struct kvm_sev_snp_set_certs {
+	__u64 certs_uaddr;
+	__u64 certs_len;
+};
+
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
 #define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
 #define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)