diff mbox series

[v2,4/4] venus: hfi_parser: Add check to keep the number of codecs within range

Message ID 1691634304-2158-5-git-send-email-quic_vgarodia@quicinc.com
State Accepted
Commit 0768a9dd809ef52440b5df7dce5a1c1c7e97abbd
Series Venus driver fixes to avoid possible OOB accesses

Commit Message

Vikash Garodia Aug. 10, 2023, 2:25 a.m. UTC
The supported codec bitmask is populated from the payload sent by the venus firmware.
There is a possible case in which all the bits in the codec bitmask are set. In
that case, the core caps for the decoder are filled and all MAX_CODEC_NUM entries
are used. Filling the caps for the encoder can then access the caps array beyond
its 32 entries, leading to an OOB write.
The fix counts the supported encoders and decoders. If the count is more than
the maximum, it skips accessing the caps.

Cc: stable@vger.kernel.org
Fixes: 1a73374a04e5 ("media: venus: hfi_parser: add common capability parser")
Signed-off-by: Vikash Garodia <quic_vgarodia@quicinc.com>
---
 drivers/media/platform/qcom/venus/hfi_parser.c | 3 +++
 1 file changed, 3 insertions(+)
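
The arithmetic behind the check can be sketched in user space. This is a minimal illustration, assuming 32-bit codec masks; popcount32(), slots_needed() and too_many_codecs() are hypothetical names that do not exist in the driver, and popcount32() merely stands in for the kernel's hweight_long().

```c
#include <stdint.h>

#define MAX_CODEC_NUM 32

/* Portable population count, standing in for the kernel's hweight_long(). */
static unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Number of caps[] slots the two codec loops would consume (hypothetical helper). */
static unsigned int slots_needed(uint32_t dec_codecs, uint32_t enc_codecs)
{
	return popcount32(dec_codecs) + popcount32(enc_codecs);
}

/* Mirrors the condition added by the patch: nonzero means "skip filling caps". */
static int too_many_codecs(uint32_t dec_codecs, uint32_t enc_codecs)
{
	return slots_needed(dec_codecs, enc_codecs) > MAX_CODEC_NUM;
}
```

With both masks fully set, slots_needed() reports 64 slots against an array of 32, which is the case the check is meant to reject.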

Comments

Vikash Garodia Aug. 11, 2023, 6:04 a.m. UTC | #1
On 8/10/2023 5:03 PM, Bryan O'Donoghue wrote:
> On 10/08/2023 03:25, Vikash Garodia wrote:
>> +    if (hweight_long(core->dec_codecs) + hweight_long(core->enc_codecs) >
>> MAX_CODEC_NUM)
>> +        return;
>> +
> 
> Shouldn't this be >= ?
Not needed. Let's take a hypothetical case where core->dec_codecs has the first 16
bits (0-15) set and core->enc_codecs has the next 16 bits (16-31) set. The bit count
would be 32. The codec loops after this check would then fill caps array indices 0-31.
I do not see a possibility of OOB access in this case.

> 
> struct hfi_plat_caps caps[MAX_CODEC_NUM];
> 
> ---
> bod
>
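
The boundary case argued above can be checked with a small user-space walk of the two masks. This is only a sketch under the assumption that each set bit consumes exactly one caps[] slot, as in the hunk under review; highest_index_written() is a hypothetical helper and it never touches a real array.

```c
#include <stdint.h>

#define MAX_CODEC_NUM 32

/*
 * Walk both masks the way the decoder and encoder loops do and return the
 * highest caps[] index that would be written, or -1 if no bit is set.
 * Hypothetical helper for illustration only.
 */
static int highest_index_written(uint32_t dec_codecs, uint32_t enc_codecs)
{
	int count = 0;

	for (int bit = 0; bit < MAX_CODEC_NUM; bit++)
		if (dec_codecs & (UINT32_C(1) << bit))
			count++;	/* decoder entry takes the next slot */

	for (int bit = 0; bit < MAX_CODEC_NUM; bit++)
		if (enc_codecs & (UINT32_C(1) << bit))
			count++;	/* encoder entry takes the next slot */

	return count - 1;
}
```

For 16 decoder bits plus 16 encoder bits the highest index is 31, the last valid slot of a 32-entry array. Adding one more bit to either mask yields index 32, which is the first case the `>` comparison rejects.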
Bryan O'Donoghue Aug. 11, 2023, 8:42 a.m. UTC | #2
On 11/08/2023 07:04, Vikash Garodia wrote:
> 
> On 8/10/2023 5:03 PM, Bryan O'Donoghue wrote:
>> On 10/08/2023 03:25, Vikash Garodia wrote:
>>> +    if (hweight_long(core->dec_codecs) + hweight_long(core->enc_codecs) >
>>> MAX_CODEC_NUM)
>>> +        return;
>>> +
>>
>> Shouldn't this be >= ?
> Not needed. Lets take a hypothetical case when core->dec_codecs has initial 16
> (0-15) bits set and core->enc_codecs has next 16 bits (16-31) set. The bit count
> would be 32. The codec loop after this check would run on caps array index 0-31.
> I do not see a possibility for OOB access in this case.
> 
>>
>> struct hfi_plat_caps caps[MAX_CODEC_NUM];
>>
>> ---
>> bod
>>

Are you not doing a general defensive coding pass in this series, i.e.

"[PATCH v2 2/4] venus: hfi: fix the check to handle session buffer requirement"

---
bod
Vikash Garodia Aug. 11, 2023, 8:49 a.m. UTC | #3
On 8/11/2023 2:12 PM, Bryan O'Donoghue wrote:
> On 11/08/2023 07:04, Vikash Garodia wrote:
>>
>> On 8/10/2023 5:03 PM, Bryan O'Donoghue wrote:
>>> On 10/08/2023 03:25, Vikash Garodia wrote:
>>>> +    if (hweight_long(core->dec_codecs) + hweight_long(core->enc_codecs) >
>>>> MAX_CODEC_NUM)
>>>> +        return;
>>>> +
>>>
>>> Shouldn't this be >= ?
>> Not needed. Lets take a hypothetical case when core->dec_codecs has initial 16
>> (0-15) bits set and core->enc_codecs has next 16 bits (16-31) set. The bit count
>> would be 32. The codec loop after this check would run on caps array index 0-31.
>> I do not see a possibility for OOB access in this case.
>>
>>>
>>> struct hfi_plat_caps caps[MAX_CODEC_NUM];
>>>
>>> ---
>>> bod
>>>
> 
> Are you not doing a general defensive coding pass in this series ie
> 
> "[PATCH v2 2/4] venus: hfi: fix the check to handle session buffer requirement"

In "PATCH v2 2/4", there is a possibility of OOB access if the check does not include "=".
Here in this patch, I do not see such a possibility.

> 
> ---
> bod
Bryan O'Donoghue Aug. 11, 2023, 10:41 a.m. UTC | #4
On 11/08/2023 09:49, Vikash Garodia wrote:
> 
> On 8/11/2023 2:12 PM, Bryan O'Donoghue wrote:
>> On 11/08/2023 07:04, Vikash Garodia wrote:
>>>
>>> On 8/10/2023 5:03 PM, Bryan O'Donoghue wrote:
>>>> On 10/08/2023 03:25, Vikash Garodia wrote:
>>>>> +    if (hweight_long(core->dec_codecs) + hweight_long(core->enc_codecs) >
>>>>> MAX_CODEC_NUM)
>>>>> +        return;
>>>>> +
>>>>
>>>> Shouldn't this be >= ?
>>> Not needed. Lets take a hypothetical case when core->dec_codecs has initial 16
>>> (0-15) bits set and core->enc_codecs has next 16 bits (16-31) set. The bit count
>>> would be 32. The codec loop after this check would run on caps array index 0-31.
>>> I do not see a possibility for OOB access in this case.
>>>
>>>>
>>>> struct hfi_plat_caps caps[MAX_CODEC_NUM];
>>>>
>>>> ---
>>>> bod
>>>>
>>
>> Are you not doing a general defensive coding pass in this series ie
>>
>> "[PATCH v2 2/4] venus: hfi: fix the check to handle session buffer requirement"
> 
> In "PATCH v2 2/4", there is a possibility if the check does not consider "=".
> Here in this patch, I do not see a possibility.
> 
>>
>> ---
>> bod

But surely hweight_long(core->dec_codecs) + 
hweight_long(core->enc_codecs) == MAX_CODEC_NUM is an invalid offset ?

---
bod
Vikash Garodia Aug. 14, 2023, 6:34 a.m. UTC | #5
On 8/12/2023 12:21 AM, Bryan O'Donoghue wrote:
> On 11/08/2023 17:02, Vikash Garodia wrote:
>>
>>
>> On 8/11/2023 4:11 PM, Bryan O'Donoghue wrote:
>>> On 11/08/2023 09:49, Vikash Garodia wrote:
>>>>
>>>> On 8/11/2023 2:12 PM, Bryan O'Donoghue wrote:
>>>>> On 11/08/2023 07:04, Vikash Garodia wrote:
>>>>>>
>>>>>> On 8/10/2023 5:03 PM, Bryan O'Donoghue wrote:
>>>>>>> On 10/08/2023 03:25, Vikash Garodia wrote:
>>>>>>>> +    if (hweight_long(core->dec_codecs) + hweight_long(core->enc_codecs) >
>>>>>>>> MAX_CODEC_NUM)
>>>>>>>> +        return;
>>>>>>>> +
>>>>>>>
>>>>>>> Shouldn't this be >= ?
>>>>>> Not needed. Lets take a hypothetical case when core->dec_codecs has
>>>>>> initial 16
>>>>>> (0-15) bits set and core->enc_codecs has next 16 bits (16-31) set. The bit
>>>>>> count
>>>>>> would be 32. The codec loop after this check would run on caps array index
>>>>>> 0-31.
>>>>>> I do not see a possibility for OOB access in this case.
>>>>>>
>>>>>>>
>>>>>>> struct hfi_plat_caps caps[MAX_CODEC_NUM];
>>>>>>>
>>>>>>> ---
>>>>>>> bod
>>>>>>>
>>>>>
>>>>> Are you not doing a general defensive coding pass in this series ie
>>>>>
>>>>> "[PATCH v2 2/4] venus: hfi: fix the check to handle session buffer
>>>>> requirement"
>>>>
>>>> In "PATCH v2 2/4", there is a possibility if the check does not consider "=".
>>>> Here in this patch, I do not see a possibility.
>>>>
>>>>>
>>>>> ---
>>>>> bod
>>>
>>> But surely hweight_long(core->dec_codecs) + hweight_long(core->enc_codecs) ==
>>> MAX_CODEC_NUM is an invalid offset ?
>>
>> No, it isn't. Please run through the loop with the bitmasks adding up to 32 and
>> see if there is a possibility of OOB.
> 
> IDK Vikash, the logic here seems suspect.
> 
> We have two loops that check for up to 32 indexes per loop. Why not have a
> capabilities index that can accommodate all 64 bits ?
The maximum number of codecs supported can be 32, which is already a very high number.
At most, the hardware supports 5-6 codecs, including both decoders and encoders.
64 indices would not be needed.

> Why is it valid to have 16 encoder bits and 16 decoder bits but invalid to have
> 16 encoder bits with 17 decoder bits ? While at the same time valid to have 0
> encoder bits but 17 decoder bits ?
The sum of the encoder and decoder counts must not exceed 32. Any combination that adds
up to 32 or less passes the check. For example, (17 dec + 15 enc) OR (32 dec + 0 enc) OR
(0 dec + 32 enc) etc. are valid combinations theoretically, though only a few
decoders and encoders are actually supported by hardware.

Regards,
Vikash
Bryan O'Donoghue Aug. 14, 2023, 2:15 p.m. UTC | #6
On 14/08/2023 07:34, Vikash Garodia wrote:
>> We have two loops that check for up to 32 indexes per loop. Why not have a
>> capabilities index that can accommodate all 64 bits ?
> Max codecs supported can be 32, which is also a very high number. At max the
> hardware supports 5-6 codecs, including both decoder and encoder. 64 indices is
> would not be needed.
> 

But the bug you are fixing here is an overflow where we have received a
full-range 32-bit mask for each of decode and encode.

How is the right fix not to extend the storage to the maximum possible 2 
x 32 ? Or indeed why not constrain the input data to 32/2 for each 
encode/decode path ?

The bug here is that we can copy two arrays of size X into one array of 
size X.

Please consider expanding the size of the storage array to accommodate 
the full size the protocol supports 2 x 32.

---
bod
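
The alternative proposed above, sizing the storage for the protocol's worst case of 2 x 32 entries, could look roughly like the following user-space sketch. It is not the change that was merged; MAX_CAPS, struct demo_cap, struct demo_core and demo_init_codecs() are all hypothetical names invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CODEC_NUM 32
#define MAX_CAPS (2 * MAX_CODEC_NUM)	/* hypothetical: room for both full masks */

struct demo_cap {
	uint32_t codec;		/* single-bit codec identifier */
	int is_encoder;		/* 0 = decoder entry, 1 = encoder entry */
};

struct demo_core {
	struct demo_cap caps[MAX_CAPS];
	size_t codecs_count;
};

/* Fill caps[] from both masks; with MAX_CAPS slots no mask pair can overflow. */
static size_t demo_init_codecs(struct demo_core *core,
			       uint32_t dec_codecs, uint32_t enc_codecs)
{
	core->codecs_count = 0;

	for (int bit = 0; bit < MAX_CODEC_NUM; bit++) {
		if (dec_codecs & (UINT32_C(1) << bit)) {
			struct demo_cap *cap = &core->caps[core->codecs_count++];

			cap->codec = UINT32_C(1) << bit;
			cap->is_encoder = 0;
		}
	}

	for (int bit = 0; bit < MAX_CODEC_NUM; bit++) {
		if (enc_codecs & (UINT32_C(1) << bit)) {
			struct demo_cap *cap = &core->caps[core->codecs_count++];

			cap->codec = UINT32_C(1) << bit;
			cap->is_encoder = 1;
		}
	}

	return core->codecs_count;
}
```

The trade-off in this sketch is a larger array in exchange for removing the count check: the worst case of 64 entries fits by construction, but the caller still has to decide what to do with a payload that no real hardware would produce.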
Vikash Garodia Aug. 29, 2023, 8 a.m. UTC | #7
Hi Bryan,

On 8/14/2023 7:45 PM, Bryan O'Donoghue wrote:
> On 14/08/2023 07:34, Vikash Garodia wrote:
>>> We have two loops that check for up to 32 indexes per loop. Why not have a
>>> capabilities index that can accommodate all 64 bits ?
>> Max codecs supported can be 32, which is also a very high number. At max the
>> hardware supports 5-6 codecs, including both decoder and encoder. 64 indices is
>> would not be needed.
>>
> 
> But the bug you are fixing here is an overflow where we have received a full
> range 32 bit for each decode and encode.
> 
> How is the right fix not to extend the storage to the maximum possible 2 x 32 ?
> Or indeed why not constrain the input data to 32/2 for each encode/decode path ?
At this point, we agree that there is little or no possibility of this being
a real use case, i.e. having 64 (or more than 32) codecs supported in video
hardware. There seems to be no added value in extending the cap array from
32 to 64, as anything beyond 32 itself indicates rogue firmware. The idea here
is to exit gracefully when firmware responds with such a
data payload.
Again, let's think about constraining the data to 32/2. We have two 32-bit masks for
decoder and encoder. Malfunctioning firmware could still send a payload with all
bits enabled in those masks. The driver would then need the same check to avoid
the memcpy in such a case.

> The bug here is that we can copy two arrays of size X into one array of size X.
> 
> Please consider expanding the size of the storage array to accommodate the full
> size the protocol supports 2 x 32.
I see this as an alternative implementation to the existing handling. 64 indices would
never exist in practice, so accommodating them only means storing the data of an
invalid response and then gracefully closing the session.

Thanks,
Vikash
Bryan O'Donoghue Aug. 29, 2023, 11:59 a.m. UTC | #8
On 29/08/2023 09:00, Vikash Garodia wrote:
> Hi Bryan,
> 
> On 8/14/2023 7:45 PM, Bryan O'Donoghue wrote:
>> On 14/08/2023 07:34, Vikash Garodia wrote:
>>>> We have two loops that check for up to 32 indexes per loop. Why not have a
>>>> capabilities index that can accommodate all 64 bits ?
>>> Max codecs supported can be 32, which is also a very high number. At max the
>>> hardware supports 5-6 codecs, including both decoder and encoder. 64 indices is
>>> would not be needed.
>>>
>>
>> But the bug you are fixing here is an overflow where we have received a full
>> range 32 bit for each decode and encode.
>>
>> How is the right fix not to extend the storage to the maximum possible 2 x 32 ?
>> Or indeed why not constrain the input data to 32/2 for each encode/decode path ?
> At this point, we agree that there is very less or no possibility to have this
> as a real usecase i.e having 64 (or more than 32) codecs supported in video
> hardware. There seem to be no value add if we are extending the cap array from
> 32 to 64, as anything beyond 32 itself indicates rogue firmware. The idea here
> is to gracefully come out of such case when firmware is responding with such
> data payload.
> Again, lets think of constraining the data to 32/2. We have 2 32 bit masks for
> decoder and encoder. Malfunctioning firmware could still send payload with all
> bits enabled in those masks. Then the driver needs to add same check to avoid
> the memcpy in such case.
> 
>> The bug here is that we can copy two arrays of size X into one array of size X.
>>
>> Please consider expanding the size of the storage array to accommodate the full
>> size the protocol supports 2 x 32.
> I see this as an alternate implementation to existing handling. 64 index would
> never exist practically, so accommodating it only implies to store the data for
> invalid response and gracefully close the session.

What's the contractual definition of "this many bits per encoder and 
decoder" between firmware and APSS in that case ?

Where do we get the idea that 32/2 per encoder/decoder is valid but 32
per encoder/decoder is invalid ?

At this moment in time 16 encoder/decoder bits would be equally invalid.

I suggest the right answer is either to size the RX buffer for the protocol
data unit (PDU) maximum, or to constrain the maximum number of
encoder/decoder bits based on HFI version.

ie.

- Either constrain on the PDU or
- Constrain on the known number of maximum bits per f/w version

---
bod
Vikash Garodia Aug. 29, 2023, 2:06 p.m. UTC | #9
On 8/29/2023 5:29 PM, Bryan O'Donoghue wrote:
> On 29/08/2023 09:00, Vikash Garodia wrote:
>> Hi Bryan,
>>
>> On 8/14/2023 7:45 PM, Bryan O'Donoghue wrote:
>>> On 14/08/2023 07:34, Vikash Garodia wrote:
>>>>> We have two loops that check for up to 32 indexes per loop. Why not have a
>>>>> capabilities index that can accommodate all 64 bits ?
>>>> Max codecs supported can be 32, which is also a very high number. At max the
>>>> hardware supports 5-6 codecs, including both decoder and encoder. 64 indices is
>>>> would not be needed.
>>>>
>>>
>>> But the bug you are fixing here is an overflow where we have received a full
>>> range 32 bit for each decode and encode.
>>>
>>> How is the right fix not to extend the storage to the maximum possible 2 x 32 ?
>>> Or indeed why not constrain the input data to 32/2 for each encode/decode path ?
>> At this point, we agree that there is very less or no possibility to have this
>> as a real usecase i.e having 64 (or more than 32) codecs supported in video
>> hardware. There seem to be no value add if we are extending the cap array from
>> 32 to 64, as anything beyond 32 itself indicates rogue firmware. The idea here
>> is to gracefully come out of such case when firmware is responding with such
>> data payload.
>> Again, lets think of constraining the data to 32/2. We have 2 32 bit masks for
>> decoder and encoder. Malfunctioning firmware could still send payload with all
>> bits enabled in those masks. Then the driver needs to add same check to avoid
>> the memcpy in such case.
>>
>>> The bug here is that we can copy two arrays of size X into one array of size X.
>>>
>>> Please consider expanding the size of the storage array to accommodate the full
>>> size the protocol supports 2 x 32.
>> I see this as an alternate implementation to existing handling. 64 index would
>> never exist practically, so accommodating it only implies to store the data for
>> invalid response and gracefully close the session.
> 
> What's the contractual definition of "this many bits per encoder and decoder"
> between firmware and APSS in that case ?
> 
> Where do we get the idea that 32/2 per encoder/decoder is valid but 32 per
> encoder decoder is invalid ?
> 
> At this moment in time 16 encoder/decoder bits would be equally invalid.
> 
> I suggest the right answer is to buffer the protocol data unit - PDU maximum as
> an RX or constrain the maximum number of encoder/decoder bits based on HFI version.
> 
> ie.
> 
> - Either constrain on the PDU or
> - Constrain on the known number of maximum bits per f/w version

Let me simply ask this: what benefit would we get from the above approaches
over the existing handling?

Thanks,
Vikash
> ---
> bod
>

Patch

diff --git a/drivers/media/platform/qcom/venus/hfi_parser.c b/drivers/media/platform/qcom/venus/hfi_parser.c
index 9d6ba22..c438395 100644
--- a/drivers/media/platform/qcom/venus/hfi_parser.c
+++ b/drivers/media/platform/qcom/venus/hfi_parser.c
@@ -19,6 +19,9 @@  static void init_codecs(struct venus_core *core)
 	struct hfi_plat_caps *caps = core->caps, *cap;
 	unsigned long bit;
 
+	if (hweight_long(core->dec_codecs) + hweight_long(core->enc_codecs) > MAX_CODEC_NUM)
+		return;
+
 	for_each_set_bit(bit, &core->dec_codecs, MAX_CODEC_NUM) {
 		cap = &caps[core->codecs_count++];
 		cap->codec = BIT(bit);