diff mbox series

[v2] Bluetooth: RFCOMM: Use skb_trim to trim checksum

Message ID 20220610110749.110881-1-soenke.huster@eknoes.de
State New
Headers show
Series [v2] Bluetooth: RFCOMM: Use skb_trim to trim checksum | expand

Commit Message

Soenke Huster June 10, 2022, 11:07 a.m. UTC
As skb->tail might be zero, it can underflow. This leads to a page
fault: skb_tail_pointer simply adds skb->tail (which is now MAX_UINT)
to skb->head.

    BUG: unable to handle page fault for address: ffffed1021de29ff
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    RIP: 0010:rfcomm_run+0x831/0x4040 (net/bluetooth/rfcomm/core.c:1751)

By using skb_trim instead of the direct manipulation, skb->tail
is reset. Thus, the correct pointer to the checksum is used.

Signed-off-by: Soenke Huster <soenke.huster@eknoes.de>
---
v2: Clarified how the bug triggers, minimize code change

 net/bluetooth/rfcomm/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

bluez.test.bot@gmail.com June 10, 2022, 11:57 a.m. UTC | #1
This is automated email and please do not reply to this email!

Dear submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
This is a CI test results with your patch series:
PW Link:https://patchwork.kernel.org/project/bluetooth/list/?series=649266

---Test result---

Test Summary:
CheckPatch                    PASS      0.78 seconds
GitLint                       PASS      0.42 seconds
SubjectPrefix                 PASS      0.31 seconds
BuildKernel                   PASS      31.50 seconds
BuildKernel32                 PASS      30.28 seconds
Incremental Build with patchesPASS      39.69 seconds
TestRunner: Setup             PASS      487.85 seconds
TestRunner: l2cap-tester      PASS      18.02 seconds
TestRunner: bnep-tester       PASS      5.79 seconds
TestRunner: mgmt-tester       FAIL      106.97 seconds
TestRunner: rfcomm-tester     PASS      9.38 seconds
TestRunner: sco-tester        PASS      9.27 seconds
TestRunner: smp-tester        PASS      9.11 seconds
TestRunner: userchan-tester   PASS      6.15 seconds

Details
##############################
Test: TestRunner: mgmt-tester - FAIL - 106.97 seconds
Run test-runner with mgmt-tester
Total: 493, Passed: 491 (99.6%), Failed: 2, Not Run: 0

Failed Test Cases
Add Advertising - Success (Name+data+appear)         Timed out    2.769 seconds
Add Ext Advertising - Success (Name+data+appear)     Timed out    2.405 seconds



---
Regards,
Linux Bluetooth
Soenke Huster June 10, 2022, 3:35 p.m. UTC | #2
Hi Eric,

On 10.06.22 15:59, Eric Dumazet wrote:
> On Fri, Jun 10, 2022 at 4:08 AM Soenke Huster <soenke.huster@eknoes.de> wrote:
>>
>> As skb->tail might be zero, it can underflow. This leads to a page
>> fault: skb_tail_pointer simply adds skb->tail (which is now MAX_UINT)
>> to skb->head.
>>
>>     BUG: unable to handle page fault for address: ffffed1021de29ff
>>     #PF: supervisor read access in kernel mode
>>     #PF: error_code(0x0000) - not-present page
>>     RIP: 0010:rfcomm_run+0x831/0x4040 (net/bluetooth/rfcomm/core.c:1751)
>>
>> By using skb_trim instead of the direct manipulation, skb->tail
>> is reset. Thus, the correct pointer to the checksum is used.
>>
>> Signed-off-by: Soenke Huster <soenke.huster@eknoes.de>
>> ---
>> v2: Clarified how the bug triggers, minimize code change
>>
>>  net/bluetooth/rfcomm/core.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
>> index 7324764384b6..443b55edb3ab 100644
>> --- a/net/bluetooth/rfcomm/core.c
>> +++ b/net/bluetooth/rfcomm/core.c
>> @@ -1747,7 +1747,7 @@ static struct rfcomm_session *rfcomm_recv_frame(struct rfcomm_session *s,
>>         type = __get_type(hdr->ctrl);
>>
>>         /* Trim FCS */
>> -       skb->len--; skb->tail--;
>> +       skb_trim(skb, skb->len - 1);
>>         fcs = *(u8 *)skb_tail_pointer(skb);
>>
>>         if (__check_fcs(skb->data, type, fcs)) {
>> --
>> 2.36.1
>>
> 
> Again, I do not see how skb->tail could possibly zero at this point.
> 
> If it was, skb with illegal layout has been queued in the first place,
> we need to fix the producer, not the consumer.
> 

Sorry, I thought that might be a right place as there is not much code in the kernel
that manipulates ->tail directly.

> A driver missed an skb_put() perhaps.
> 

I am using the (I guess quite unused) virtio_bt driver, and figured out that the following
fixes the bug:

--- a/drivers/bluetooth/virtio_bt.c
+++ b/drivers/bluetooth/virtio_bt.c
@@ -219,7 +219,7 @@ static void virtbt_rx_work(struct work_struct *work)
        if (!skb)
                return;
 
-       skb->len = len;
+       skb_put(skb, len);
        virtbt_rx_handle(vbt, skb);
 
        if (virtbt_add_inbuf(vbt) < 0)

I guess this is the root cause? I just used Bluetooth for a while in the VM
and no error occurred, everything worked fine.

> Can you please dump the skb here  ?
> 
> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
> index 7324764384b6773074032ad671777bf86bd3360e..358ccb4fe7214aea0bb4084188c7658316fe0ff7
> 100644
> --- a/net/bluetooth/rfcomm/core.c
> +++ b/net/bluetooth/rfcomm/core.c
> @@ -1746,6 +1746,11 @@ static struct rfcomm_session
> *rfcomm_recv_frame(struct rfcomm_session *s,
>         dlci = __get_dlci(hdr->addr);
>         type = __get_type(hdr->ctrl);
> 
> +       if (!skb->tail) {
> +               DO_ONCE_LITE(skb_dump(KERN_ERR, skb, false));
> +               kfree_skb(skb);
> +               return s;
> +       }
>         /* Trim FCS */
>         skb->len--; skb->tail--;
>         fcs = *(u8 *)skb_tail_pointer(skb);

If it might still help:

skb len=4 headroom=9 headlen=4 tailroom=1728          
mac=(-1,-1) net=(0,-1) trans=-1                       
shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)   
hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0      
skb linear:   00000000: 03 3f 01 1c
Soenke Huster June 14, 2022, 1:42 p.m. UTC | #3
Hi Eric,

On 10.06.22 18:55, Eric Dumazet wrote:
> On Fri, Jun 10, 2022 at 8:35 AM Sönke Huster <soenke.huster@eknoes.de> wrote:
>>
>> Hi Eric,
>>
>> On 10.06.22 15:59, Eric Dumazet wrote:
>>> On Fri, Jun 10, 2022 at 4:08 AM Soenke Huster <soenke.huster@eknoes.de> wrote:
>>>>
>>>> As skb->tail might be zero, it can underflow. This leads to a page
>>>> fault: skb_tail_pointer simply adds skb->tail (which is now MAX_UINT)
>>>> to skb->head.
>>>>
>>>>     BUG: unable to handle page fault for address: ffffed1021de29ff
>>>>     #PF: supervisor read access in kernel mode
>>>>     #PF: error_code(0x0000) - not-present page
>>>>     RIP: 0010:rfcomm_run+0x831/0x4040 (net/bluetooth/rfcomm/core.c:1751)
>>>>
>>>> By using skb_trim instead of the direct manipulation, skb->tail
>>>> is reset. Thus, the correct pointer to the checksum is used.
>>>>
>>>> Signed-off-by: Soenke Huster <soenke.huster@eknoes.de>
>>>> ---
>>>> v2: Clarified how the bug triggers, minimize code change
>>>>
>>>>  net/bluetooth/rfcomm/core.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
>>>> index 7324764384b6..443b55edb3ab 100644
>>>> --- a/net/bluetooth/rfcomm/core.c
>>>> +++ b/net/bluetooth/rfcomm/core.c
>>>> @@ -1747,7 +1747,7 @@ static struct rfcomm_session *rfcomm_recv_frame(struct rfcomm_session *s,
>>>>         type = __get_type(hdr->ctrl);
>>>>
>>>>         /* Trim FCS */
>>>> -       skb->len--; skb->tail--;
>>>> +       skb_trim(skb, skb->len - 1);
>>>>         fcs = *(u8 *)skb_tail_pointer(skb);
>>>>
>>>>         if (__check_fcs(skb->data, type, fcs)) {
>>>> --
>>>> 2.36.1
>>>>
>>>
>>> Again, I do not see how skb->tail could possibly zero at this point.
>>>
>>> If it was, skb with illegal layout has been queued in the first place,
>>> we need to fix the producer, not the consumer.
>>>
>>
>> Sorry, I thought that might be a right place as there is not much code in the kernel
>> that manipulates ->tail directly.
>>
>>> A driver missed an skb_put() perhaps.
>>>
>>
>> I am using the (I guess quite unused) virtio_bt driver, and figured out that the following
>> fixes the bug:
>>
>> --- a/drivers/bluetooth/virtio_bt.c
>> +++ b/drivers/bluetooth/virtio_bt.c
>> @@ -219,7 +219,7 @@ static void virtbt_rx_work(struct work_struct *work)
>>         if (!skb)
>>                 return;
>>
>> -       skb->len = len;
>> +       skb_put(skb, len);
> 
> Removing skb->len=len seems about right.
> But skb_put() should be done earlier.
> 
> We are approaching the skb producer :)
> 
> Now you have to find/check who added this illegal skb in the virt queue.
> 
> Maybe virtbt_add_inbuf() ?

I think here, the length of the skb can't really be known - an empty SKB is put into
the virtqueue, and then filled with data in the device, which is implemented in a Hypervisor.
Maybe my implementation of that device might then be wrong, on the other hand I am pretty
sure the driver should be the one that sets the length of the skb. But the driver only
knows it in virtbt_rx_work, as it learns the size of the added buffer there for the first time.

> 
> Also there is kernel info leak I think.
> 

I think your are right!

> diff --git a/drivers/bluetooth/virtio_bt.c b/drivers/bluetooth/virtio_bt.c
> index 67c21263f9e0f250f0719b8e7f1fe15b0eba5ee0..c9b832c447ee451f027430b284d7bb246f6ecb24
> 100644
> --- a/drivers/bluetooth/virtio_bt.c
> +++ b/drivers/bluetooth/virtio_bt.c
> @@ -37,6 +37,9 @@ static int virtbt_add_inbuf(struct virtio_bluetooth *vbt)
>         if (!skb)
>                 return -ENOMEM;
> 
> +       skb_put(skb, 1000);
> +       memset(skb->data, 0, 1000);
> +
>         sg_init_one(sg, skb->data, 1000);
> 
>         err = virtqueue_add_inbuf(vq, sg, 1, skb, GFP_KERNEL);
> 
> 
>>         virtbt_rx_handle(vbt, skb);
>>
>>         if (virtbt_add_inbuf(vbt) < 0)
>>
>> I guess this is the root cause? I just used Bluetooth for a while in the VM
>> and no error occurred, everything worked fine.
>>
>>> Can you please dump the skb here  ?
>>>
>>> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
>>> index 7324764384b6773074032ad671777bf86bd3360e..358ccb4fe7214aea0bb4084188c7658316fe0ff7
>>> 100644
>>> --- a/net/bluetooth/rfcomm/core.c
>>> +++ b/net/bluetooth/rfcomm/core.c
>>> @@ -1746,6 +1746,11 @@ static struct rfcomm_session
>>> *rfcomm_recv_frame(struct rfcomm_session *s,
>>>         dlci = __get_dlci(hdr->addr);
>>>         type = __get_type(hdr->ctrl);
>>>
>>> +       if (!skb->tail) {
>>> +               DO_ONCE_LITE(skb_dump(KERN_ERR, skb, false));
>>> +               kfree_skb(skb);
>>> +               return s;
>>> +       }
>>>         /* Trim FCS */
>>>         skb->len--; skb->tail--;
>>>         fcs = *(u8 *)skb_tail_pointer(skb);
>>
>> If it might still help:
>>
>> skb len=4 headroom=9 headlen=4 tailroom=1728
>> mac=(-1,-1) net=(0,-1) trans=-1
>> shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
>> csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
>> hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0
>> skb linear:   00000000: 03 3f 01 1c
>>
Eric Dumazet June 14, 2022, 2:17 p.m. UTC | #4
On Tue, Jun 14, 2022 at 6:42 AM Sönke Huster <soenke.huster@eknoes.de> wrote:
>
> Hi Eric,
>
> On 10.06.22 18:55, Eric Dumazet wrote:
> > On Fri, Jun 10, 2022 at 8:35 AM Sönke Huster <soenke.huster@eknoes.de> wrote:
> >>
> >> Hi Eric,
> >>
> >> On 10.06.22 15:59, Eric Dumazet wrote:
> >>> On Fri, Jun 10, 2022 at 4:08 AM Soenke Huster <soenke.huster@eknoes.de> wrote:
> >>>>
> >>>> As skb->tail might be zero, it can underflow. This leads to a page
> >>>> fault: skb_tail_pointer simply adds skb->tail (which is now MAX_UINT)
> >>>> to skb->head.
> >>>>
> >>>>     BUG: unable to handle page fault for address: ffffed1021de29ff
> >>>>     #PF: supervisor read access in kernel mode
> >>>>     #PF: error_code(0x0000) - not-present page
> >>>>     RIP: 0010:rfcomm_run+0x831/0x4040 (net/bluetooth/rfcomm/core.c:1751)
> >>>>
> >>>> By using skb_trim instead of the direct manipulation, skb->tail
> >>>> is reset. Thus, the correct pointer to the checksum is used.
> >>>>
> >>>> Signed-off-by: Soenke Huster <soenke.huster@eknoes.de>
> >>>> ---
> >>>> v2: Clarified how the bug triggers, minimize code change
> >>>>
> >>>>  net/bluetooth/rfcomm/core.c | 2 +-
> >>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
> >>>> index 7324764384b6..443b55edb3ab 100644
> >>>> --- a/net/bluetooth/rfcomm/core.c
> >>>> +++ b/net/bluetooth/rfcomm/core.c
> >>>> @@ -1747,7 +1747,7 @@ static struct rfcomm_session *rfcomm_recv_frame(struct rfcomm_session *s,
> >>>>         type = __get_type(hdr->ctrl);
> >>>>
> >>>>         /* Trim FCS */
> >>>> -       skb->len--; skb->tail--;
> >>>> +       skb_trim(skb, skb->len - 1);
> >>>>         fcs = *(u8 *)skb_tail_pointer(skb);
> >>>>
> >>>>         if (__check_fcs(skb->data, type, fcs)) {
> >>>> --
> >>>> 2.36.1
> >>>>
> >>>
> >>> Again, I do not see how skb->tail could possibly zero at this point.
> >>>
> >>> If it was, skb with illegal layout has been queued in the first place,
> >>> we need to fix the producer, not the consumer.
> >>>
> >>
> >> Sorry, I thought that might be a right place as there is not much code in the kernel
> >> that manipulates ->tail directly.
> >>
> >>> A driver missed an skb_put() perhaps.
> >>>
> >>
> >> I am using the (I guess quite unused) virtio_bt driver, and figured out that the following
> >> fixes the bug:
> >>
> >> --- a/drivers/bluetooth/virtio_bt.c
> >> +++ b/drivers/bluetooth/virtio_bt.c
> >> @@ -219,7 +219,7 @@ static void virtbt_rx_work(struct work_struct *work)
> >>         if (!skb)
> >>                 return;
> >>
> >> -       skb->len = len;
> >> +       skb_put(skb, len);
> >
> > Removing skb->len=len seems about right.
> > But skb_put() should be done earlier.
> >
> > We are approaching the skb producer :)
> >
> > Now you have to find/check who added this illegal skb in the virt queue.
> >
> > Maybe virtbt_add_inbuf() ?
>
> I think here, the length of the skb can't really be known - an empty SKB is put into
> the virtqueue, and then filled with data in the device, which is implemented in a Hypervisor.
> Maybe my implementation of that device might then be wrong, on the other hand I am pretty
> sure the driver should be the one that sets the length of the skb. But the driver only
> knows it in virtbt_rx_work, as it learns the size of the added buffer there for the first time.
>
> >
> > Also there is kernel info leak I think.
> >
>
> I think your are right!

If this patch in drivers/bluetooth/virtio_bt.c fixes the issue, please
submit a formal patch.
You can take ownership of it, of course.

If not, more investigation is needed on your side ;)

Thanks !

>
> > diff --git a/drivers/bluetooth/virtio_bt.c b/drivers/bluetooth/virtio_bt.c
> > index 67c21263f9e0f250f0719b8e7f1fe15b0eba5ee0..c9b832c447ee451f027430b284d7bb246f6ecb24
> > 100644
> > --- a/drivers/bluetooth/virtio_bt.c
> > +++ b/drivers/bluetooth/virtio_bt.c
> > @@ -37,6 +37,9 @@ static int virtbt_add_inbuf(struct virtio_bluetooth *vbt)
> >         if (!skb)
> >                 return -ENOMEM;
> >
> > +       skb_put(skb, 1000);
> > +       memset(skb->data, 0, 1000);
> > +
> >         sg_init_one(sg, skb->data, 1000);
> >
> >         err = virtqueue_add_inbuf(vq, sg, 1, skb, GFP_KERNEL);
> >
> >
> >>         virtbt_rx_handle(vbt, skb);
> >>
> >>         if (virtbt_add_inbuf(vbt) < 0)
> >>
> >> I guess this is the root cause? I just used Bluetooth for a while in the VM
> >> and no error occurred, everything worked fine.
> >>
> >>> Can you please dump the skb here  ?
> >>>
> >>> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
> >>> index 7324764384b6773074032ad671777bf86bd3360e..358ccb4fe7214aea0bb4084188c7658316fe0ff7
> >>> 100644
> >>> --- a/net/bluetooth/rfcomm/core.c
> >>> +++ b/net/bluetooth/rfcomm/core.c
> >>> @@ -1746,6 +1746,11 @@ static struct rfcomm_session
> >>> *rfcomm_recv_frame(struct rfcomm_session *s,
> >>>         dlci = __get_dlci(hdr->addr);
> >>>         type = __get_type(hdr->ctrl);
> >>>
> >>> +       if (!skb->tail) {
> >>> +               DO_ONCE_LITE(skb_dump(KERN_ERR, skb, false));
> >>> +               kfree_skb(skb);
> >>> +               return s;
> >>> +       }
> >>>         /* Trim FCS */
> >>>         skb->len--; skb->tail--;
> >>>         fcs = *(u8 *)skb_tail_pointer(skb);
> >>
> >> If it might still help:
> >>
> >> skb len=4 headroom=9 headlen=4 tailroom=1728
> >> mac=(-1,-1) net=(0,-1) trans=-1
> >> shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
> >> csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
> >> hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0
> >> skb linear:   00000000: 03 3f 01 1c
> >>
diff mbox series

Patch

diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index 7324764384b6..443b55edb3ab 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -1747,7 +1747,7 @@  static struct rfcomm_session *rfcomm_recv_frame(struct rfcomm_session *s,
 	type = __get_type(hdr->ctrl);
 
 	/* Trim FCS */
-	skb->len--; skb->tail--;
+	skb_trim(skb, skb->len - 1);
 	fcs = *(u8 *)skb_tail_pointer(skb);
 
 	if (__check_fcs(skb->data, type, fcs)) {