Message ID | 20250514-venus-fixes-v3-0-32298566011f@quicinc.com |
---|---|
Headers | show |
Series | venus driver fixes for vulnerabilities due to unexpected firmware payload | expand |
On 15/05/2025 10:56, Vikash Garodia wrote: > memcpy(hfi_dev->pkt_buf, rd_ptr from shared queue, dwords..) > > pkt_hdr = (struct hfi_pkt_hdr *) (hfi_dev->pkt_buf); > > if ((pkt_hdr->size >> 2) != dwords) > return -EINVAL; Yeah it would be better wrt the commit log. But does it really give additional data-confidence - I don't believe it does. => The firmware can update the pkt header after our subsequent-to-memcpy() check. Again this is a data-lifetime expectation question. You validate the received data against a maximum size reading to a buffer you know the size of - and do it once. The firmware might corrupt that data in-between but that is not catastrophic for the APSS which has a buffer of a known size containing potential bad data. Fine - and additional check after the mempcy() only imparts verisimilitude - only validates our data at the time of the check. my-linear-uninterrupted-context: memcpy(); if(*rd_ptr >> 2 > len) <- doesn't branch return -EBAD if(*rd_ptr >> 2 > len) <- does branch firmware went nuts return -EBAD Superficially you might say this addresses the problem if (*rd_ptr > MAX) return -EBAD; memcpy(); if (*rd_ptr > MAX) return -EBAD; But what if the "malicious" firmware only updated the data in the packet, not the length - or another field we are not checking ? As I say if this can happen if (*rd_ptr > MAX) return -EBAD; memcpy(); if (*rd_ptr > MAX) // good return -EBAD; if (*rd_ptr > MAX) //bad return -EBAD; then this can happen if (*rd_ptr > MAX) return -EBAD; memcpy(); if (*rd_ptr > MAX) // good return -EBAD; if (*rd_ptr > MAX) //good return -EBAD; if (*rd_ptr > MAX) //bad return -EBAD; We need to have a crisp and clear definition of the data-lifetime. Since we don't have a checksum element in the header the only check that makes sense is to validate the buffer size data_len = *ptr_val >> 2; if (data_len > max) return BAD; Using the data_len in memcpy if the *ptr_val can change is _NOT_ TOCTOU This is TOCTOU if (*ptr_val > max) return EBAD; memcpy(dst, src, *ptr_val); Because I validated the content of the pointer and then I relied on the data that pointer pointed to, which might have changed. TBH I think the entire premise of this patch is bogus. --- bod
On 5/15/2025 3:58 PM, Bryan O'Donoghue wrote: > On 15/05/2025 10:56, Vikash Garodia wrote: >> memcpy(hfi_dev->pkt_buf, rd_ptr from shared queue, dwords..) >> >> pkt_hdr = (struct hfi_pkt_hdr *) (hfi_dev->pkt_buf); >> >> if ((pkt_hdr->size >> 2) != dwords) >> return -EINVAL; > > Yeah it would be better wrt the commit log. > > But does it really give additional data-confidence - I don't believe it does. > > => The firmware can update the pkt header after our subsequent-to-memcpy() check. How will that matter if the queue is updated after memcpy to local packet ? All processing of data would be from local packet. > > Again this is a data-lifetime expectation question. > > You validate the received data against a maximum size reading to a buffer you > know the size of - and do it once. > > The firmware might corrupt that data in-between but that is not catastrophic for > the APSS which has a buffer of a known size containing potential bad data. There is no way to authenticate the content of payload. We are trying to avoid vulnerability by ensuring OOB does not happen by validating the size _alone_. Do you see rest of the data in payload can lead to any kind of vulnerability ? > > Fine - and additional check after the mempcy() only imparts verisimilitude - > only validates our data at the time of the check. > > my-linear-uninterrupted-context: > > memcpy(); > > if(*rd_ptr >> 2 > len) <- doesn't branch > return -EBAD > > if(*rd_ptr >> 2 > len) <- does branch firmware went nuts > return -EBAD > > Superficially you might say this addresses the problem > > if (*rd_ptr > MAX) > return -EBAD; > > memcpy(); > > if (*rd_ptr > MAX) > return -EBAD; > > But what if the "malicious" firmware only updated the data in the packet, not > the length - or another field we are not checking ? That does not cause any vulnerability. You can check and suggest if you see a vulnerability when the data outside length is an issue w.r.t vulnerability. > > As I say if this can happen > > > if (*rd_ptr > MAX) > return -EBAD; > > memcpy(); > > if (*rd_ptr > MAX) // good > return -EBAD; > > > if (*rd_ptr > MAX) //bad > return -EBAD; > > then this can happen > > if (*rd_ptr > MAX) > return -EBAD; > > memcpy(); > > if (*rd_ptr > MAX) // good > return -EBAD; > > > if (*rd_ptr > MAX) //good > return -EBAD; > > if (*rd_ptr > MAX) //bad > return -EBAD; > > We need to have a crisp and clear definition of the data-lifetime. Since we > don't have a checksum element in the header the only check that makes sense is > to validate the buffer size > > data_len = *ptr_val >> 2; > if (data_len > max) > return BAD; > > Using the data_len in memcpy if the *ptr_val can change is _NOT_ TOCTOU > > This is TOCTOU > > if (*ptr_val > max) > return EBAD; > > memcpy(dst, src, *ptr_val); > > Because I validated the content of the pointer and then I relied on the data > that pointer pointed to, which might have changed. Yes, precisely for that, memcpy() does not rely on ptr_val. The one you are referring as data_len is same as dword. Regards, Vikash > > TBH I think the entire premise of this patch is bogus. > > --- > bod
On 15/05/2025 13:11, Vikash Garodia wrote: >> But what if the "malicious" firmware only updated the data in the packet, not >> the length - or another field we are not checking ? > That does not cause any vulnerability. You can check and suggest if you see a > vulnerability when the data outside length is an issue w.r.t vulnerability. I don't believe you have identified a vulnerability here. You read a length field, you check that length field against a MAX size. Re-reading to see if the firmware wrote new bad data to the transmitted packet in-memory is not a fix before or after the memcpy() because the time you do that re-read is not fixed - locked wrt the freerunning firmware. --- bod
On 5/15/2025 6:17 PM, Bryan O'Donoghue wrote: > On 15/05/2025 13:11, Vikash Garodia wrote: >>> But what if the "malicious" firmware only updated the data in the packet, not >>> the length - or another field we are not checking ? >> That does not cause any vulnerability. You can check and suggest if you see a >> vulnerability when the data outside length is an issue w.r.t vulnerability. > > I don't believe you have identified a vulnerability here. > > You read a length field, you check that length field against a MAX size. > > Re-reading to see if the firmware wrote new bad data to the transmitted packet > in-memory is not a fix before or after the memcpy() because the time you do that > re-read is not fixed - locked wrt the freerunning firmware. It would be more meaningful if you can suggest the vulnerability you see with the changes suggested i.e adding the check in local packet against the size read from shared queue. Based on that we can see how to fix it, otherwise this discussion in not leading to any conclusion. Regards, Vikash
On 15/05/2025 14:23, Vikash Garodia wrote: >> Re-reading to see if the firmware wrote new bad data to the transmitted packet >> in-memory is not a fix before or after the memcpy() because the time you do that >> re-read is not fixed - locked wrt the freerunning firmware. > It would be more meaningful if you can suggest the vulnerability you see with > the changes suggested i.e adding the check in local packet against the size read > from shared queue. Based on that we can see how to fix it, otherwise this > discussion in not leading to any conclusion. So to re-iterate. TOCTOU is this if (*ptr_val >> 2 >= MAX) return -EBAD; memcpy(dst, src, *ptr_val >> 2); Here a malicious actor can change *ptr_val between our check and our use. not data_value = *ptr_val >> 2; if (data_value >= MAX) return -EBAD; memcpy(dst, src, data_value); Here the taking a copy of the value and subsequently relying on that value mitigates TOCTOU, because the value *ptr_val is latched - read into a local variable which cannot be manipulated from an outside agent i.e. venus firmware. The example in the commit log is not a TOCTOU for that reason. Adding an additional check _after_ the memcpy() seems silly to me because data_value = *ptr_val >> 2; if (data_value >= MAX) return -EBAD; memcpy(dst, src, data_value); // This statement could be false if (data_value != *ptr_value >> 2) return -EBAD; // while this subsequent statement is true if (data_value != *ptr_value >> 2) return -EBAD; And in any case this is a post-use sanity check not a mitigation for TOCTOU which we don't have. --- bod
On 5/15/2025 11:21 PM, Bryan O'Donoghue wrote: > On 15/05/2025 14:23, Vikash Garodia wrote: >>> Re-reading to see if the firmware wrote new bad data to the transmitted packet >>> in-memory is not a fix before or after the memcpy() because the time you do that >>> re-read is not fixed - locked wrt the freerunning firmware. >> It would be more meaningful if you can suggest the vulnerability you see with >> the changes suggested i.e adding the check in local packet against the size read >> from shared queue. Based on that we can see how to fix it, otherwise this >> discussion in not leading to any conclusion. > > So to re-iterate. > > TOCTOU is this > > if (*ptr_val >> 2 >= MAX) > return -EBAD; > > memcpy(dst, src, *ptr_val >> 2); > > Here a malicious actor can change *ptr_val between our check and our use. > > not > > data_value = *ptr_val >> 2; > > if (data_value >= MAX) > return -EBAD; > > memcpy(dst, src, data_value); > > Here the taking a copy of the value and subsequently relying on that value > mitigates TOCTOU, because the value *ptr_val is latched - read into a local > variable which cannot be manipulated from an outside agent i.e. venus firmware. > > The example in the commit log is not a TOCTOU for that reason. > > Adding an additional check _after_ the memcpy() seems silly to me because > > data_value = *ptr_val >> 2; > > if (data_value >= MAX) > return -EBAD; > > memcpy(dst, src, data_value); > > // This statement could be false > if (data_value != *ptr_value >> 2) > return -EBAD; > > // while this subsequent statement is true > if (data_value != *ptr_value >> 2) > return -EBAD; > Check the pseudo code which i proposed earlier in this conversation [1]. It does not rely on ptr_val at all to check the sanity after memcpy. [1] https://lore.kernel.org/all/0c50c24a-35fa-acfb-a807-b4ed5394506b@quicinc.com/ > And in any case this is a post-use sanity check not a mitigation for TOCTOU > which we don't have. > > --- > bod
This series primarily adds check at relevant places in venus driver where there are possible OOB accesses due to unexpected payload from venus firmware. The patches describes the specific OOB possibility. Changes in v3: - Add check for validating the size instead of forcefully updating it (Bryan) - Reduce duplication of code while handling sequence change event (Vikash) - Update the inst->error for failure case instead of slienly breaking (Bryan) - Link to v2: https://lore.kernel.org/lkml/20250215-venus-security-fixes-v2-0-cfc7e4b87168@quicinc.com/ Changes in v2: - Decompose sequence change event function. - Fix repopulating the packet .with the first read during read_queue. - Link to v1: https://lore.kernel.org/r/20250104-venus-security-fixes-v1-0-9d0dd4594cb4@quicinc.com Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> --- Vedang Nagar (2): media: venus: fix TOCTOU vulnerability when reading packets from shared memory media: venus: Fix OOB read due to missing payload bound check drivers/media/platform/qcom/venus/hfi_msgs.c | 83 +++++++++++++++++++-------- drivers/media/platform/qcom/venus/hfi_venus.c | 3 + 2 files changed, 61 insertions(+), 25 deletions(-) --- base-commit: b64b134942c8cf4801ea288b3fd38b509aedec21 change-id: 20250514-venus-fixes-8d93bccd9b9d Best regards,