Message ID | 20201001194329.9328-1-anant.thazhemadam@gmail.com |
---|---|
State | Superseded |
Headers | show |
Series | [Linux-kernel-mentees,v2] bluetooth: hci_h5: close serdev device and free hu in h5_close | expand |
Hi, On 10/1/20 9:43 PM, Anant Thazhemadam wrote: > When h5_close() gets called, the memory allocated for the hu gets > freed only if hu->serdev doesn't exist. This leads to a memory leak. > So when h5_close() is requested, close the serdev device instance and > free the memory allocated to the hu entirely instead. > > Fixes: ce945552fde4 ("Bluetooth: hci_h5: Add support for serdev enumerated devices") > Reported-by: syzbot+6ce141c55b2f7aafd1c4@syzkaller.appspotmail.com > Tested-by: syzbot+6ce141c55b2f7aafd1c4@syzkaller.appspotmail.com > Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com> So 2 comments to this: 1. You claim this change fixes a memory-leak, but in the serdev case the memory is allocated in h5_serdev_probe() like this: h5 = devm_kzalloc(dev, sizeof(*h5), GFP_KERNEL); So its lifetime is tied to the lifetime of the driver being bound to the serdev and it is automatically freed when the driver gets unbound. If you had looked at where the h5 struct gets allocated in h5_close()-s counterpart, h5_open() then you could have seen this there: if (hu->serdev) { h5 = serdev_device_get_drvdata(hu->serdev); } else { h5 = kzalloc(sizeof(*h5), GFP_KERNEL); if (!h5) return -ENOMEM; } So it is very clear here, that the kzalloc only happens in the if (!hu->serdev) which clearly makes the kfree() have the same condition the right thing todo. In the hu->serdev the data which was allocated by h5_serdev_probe() is used instead and no alloc happens inside h5_open() In general when looking at resource teardown you should also look at where they are allocated in the mirror function and the teardown should mirror the alloc code. So the main change of your commit is wrong: NACK. 2. You are making multiple changes in a single commit here, I count at least 3. When ever you are making multiple changes like this, you should really split the commit up in 1 commit per change and explain in each commit message why that change is being made (why it is necessary). Writing commit messages like this also leads to you double-checking your own work by carefully considering why you are making the changes. So about the 3 different changes: 2a) Make the kfree(h5) call unconditionally, which as mentioned above is wrong. 2b) Introduce a call to kfree_skb(h5->rx_skb); which is not mentioned in the commit message at all. This looks like it would actually be a sensible change, at least in the "if (!hu->serdev)" case. When using the serdev interface then just freeing it will leave a dangling pointer, so it would be better (for both the hu->serdev and the !hu->serdev cases) to call h5_reset_rx() on close instead which does: if (h5->rx_skb) { kfree_skb(h5->rx_skb); h5->rx_skb = NULL; } 2c) Introduce a call to serdev_device_close(), without really explaining why in the commit message. Again if you would have looked at the mirror h5_close() function then you see no serdev_device_open() there. Actually serdev_device_open() is not called anywhere in the hci_h5.c code. Digging a little deeper (using grep) shows that hci_uart_register_device() calls serdev_device_open() and hci_uart_register_device() gets called from: h5_serdev_probe(), likewise hci_uart_unregister_device() calls serdev_device_close() for us and that gets called from h5_serdev_remove(), so there is no need to call serdev_device_close() from h5_close() and doing so will actually break things, because then the serdev will be left closed on a subsequent h5_open() call. Regards, Hans > --- > Changes in v2: > * Fixed the Fixes tag > > drivers/bluetooth/hci_h5.c | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/drivers/bluetooth/hci_h5.c b/drivers/bluetooth/hci_h5.c > index e41854e0d79a..3d1585add572 100644 > --- a/drivers/bluetooth/hci_h5.c > +++ b/drivers/bluetooth/hci_h5.c > @@ -248,8 +248,12 @@ static int h5_close(struct hci_uart *hu) > if (h5->vnd && h5->vnd->close) > h5->vnd->close(h5); > > - if (!hu->serdev) > - kfree(h5); > + if (hu->serdev) > + serdev_device_close(hu->serdev); > + > + kfree_skb(h5->rx_skb); > + kfree(h5); > + h5 = NULL; > > return 0; > } >
On 02/10/20 3:52 pm, Hans de Goede wrote: > Hi, > > On 10/1/20 9:43 PM, Anant Thazhemadam wrote: >> When h5_close() gets called, the memory allocated for the hu gets >> freed only if hu->serdev doesn't exist. This leads to a memory leak. >> So when h5_close() is requested, close the serdev device instance and >> free the memory allocated to the hu entirely instead. >> >> Fixes: ce945552fde4 ("Bluetooth: hci_h5: Add support for serdev enumerated devices") >> Reported-by: syzbot+6ce141c55b2f7aafd1c4@syzkaller.appspotmail.com >> Tested-by: syzbot+6ce141c55b2f7aafd1c4@syzkaller.appspotmail.com >> Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com> > > So 2 comments to this: > > 1. You claim this change fixes a memory-leak, but in the serdev case > the memory is allocated in h5_serdev_probe() like this: > > h5 = devm_kzalloc(dev, sizeof(*h5), GFP_KERNEL); > > So its lifetime is tied to the lifetime of the driver being bound > to the serdev and it is automatically freed when the driver gets > unbound. If you had looked at where the h5 struct gets allocated > in h5_close()-s counterpart, h5_open() then you could have seen > this there: > > if (hu->serdev) { > h5 = serdev_device_get_drvdata(hu->serdev); > } else { > h5 = kzalloc(sizeof(*h5), GFP_KERNEL); > if (!h5) > return -ENOMEM; > } > > So it is very clear here, that the kzalloc only happens in the > if (!hu->serdev) which clearly makes the kfree() have the same > condition the right thing todo. In the hu->serdev the data which > was allocated by h5_serdev_probe() is used instead and no alloc > happens inside h5_open() > > In general when looking at resource teardown you should also look > at where they are allocated in the mirror function > and the teardown should mirror the alloc code. > > So the main change of your commit is wrong: > > NACK. > > > 2. You are making multiple changes in a single commit here, I count at > least 3. When ever you are making multiple changes like this, you should really > split the commit up in 1 commit per change and explain in each commit > message why that change is being made (why it is necessary). Writing > commit messages like this also leads to you double-checking your own > work by carefully considering why you are making the changes. > > So about the 3 different changes: > > 2a) Make the kfree(h5) call unconditionally, which as mentioned above > is wrong. > Hmm, I understand, thank you. > 2b) Introduce a call to kfree_skb(h5->rx_skb); which is not mentioned in > the commit message at all. This looks like it would actually be a sensible > change, at least in the "if (!hu->serdev)" case. When using the serdev > interface then just freeing it will leave a dangling pointer, so it > would be better (for both the hu->serdev and the !hu->serdev cases) > to call h5_reset_rx() on close instead which does: > > if (h5->rx_skb) { > kfree_skb(h5->rx_skb); > h5->rx_skb = NULL; > } > This was the first thing I did! (and I tested and it works too) https://syzkaller.appspot.com/text?tag=Patch&x=13acf917900000 However, I was worried that doing so might mean freeing an skb that might be required somewhere by the serdev connection (which could possibly lead to a null ptr dereference), so I thought closing the connection altogether might be a better alternative. But now, since I've been told better, I'll submit a v3, that makes the more sensible change and doesn't close the serdev_device, with a much more informative commit message! I also found a similar implementation in mrvl_close() where the serdev_device was being closed and felt it might be the right way to go. But I now know that I shouldn't have overlooked the fact that the open functions for both differ. I'm sorry for this. Like you mentioned, I should've given more priority to the mirror function, and looked to see if even they were along the same lines, and I will be sure to keep that in mind. > 2c) Introduce a call to serdev_device_close(), without really explaining > why in the commit message. Again if you would have looked at the mirror > h5_close() function then you see no serdev_device_open() there. > Actually serdev_device_open() is not called anywhere in the hci_h5.c code. > > Digging a little deeper (using grep) shows that hci_uart_register_device() > calls serdev_device_open() and hci_uart_register_device() gets called from: > h5_serdev_probe(), likewise hci_uart_unregister_device() calls > serdev_device_close() for us and that gets called from h5_serdev_remove(), > so there is no need to call serdev_device_close() from h5_close() and doing > so will actually break things, because then the serdev will be left closed > on a subsequent h5_open() call. > > Regards, > > Hans > Thank you so much for this review, and for your time! :) Thanks, Anant
On 02/10/20 3:52 pm, Hans de Goede wrote: > Hi, > > On 10/1/20 9:43 PM, Anant Thazhemadam wrote: >> When h5_close() gets called, the memory allocated for the hu gets >> freed only if hu->serdev doesn't exist. This leads to a memory leak. >> So when h5_close() is requested, close the serdev device instance and >> free the memory allocated to the hu entirely instead. >> >> Fixes: ce945552fde4 ("Bluetooth: hci_h5: Add support for serdev enumerated devices") >> Reported-by: syzbot+6ce141c55b2f7aafd1c4@syzkaller.appspotmail.com >> Tested-by: syzbot+6ce141c55b2f7aafd1c4@syzkaller.appspotmail.com >> Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com> > > So 2 comments to this: > > 1. You claim this change fixes a memory-leak, but in the serdev case > the memory is allocated in h5_serdev_probe() like this: > > h5 = devm_kzalloc(dev, sizeof(*h5), GFP_KERNEL); > > So its lifetime is tied to the lifetime of the driver being bound > to the serdev and it is automatically freed when the driver gets > unbound. If you had looked at where the h5 struct gets allocated > in h5_close()-s counterpart, h5_open() then you could have seen > this there: > > if (hu->serdev) { > h5 = serdev_device_get_drvdata(hu->serdev); > } else { > h5 = kzalloc(sizeof(*h5), GFP_KERNEL); > if (!h5) > return -ENOMEM; > } > > So it is very clear here, that the kzalloc only happens in the > if (!hu->serdev) which clearly makes the kfree() have the same > condition the right thing todo. In the hu->serdev the data which > was allocated by h5_serdev_probe() is used instead and no alloc > happens inside h5_open() > > In general when looking at resource teardown you should also look > at where they are allocated in the mirror function > and the teardown should mirror the alloc code. > > So the main change of your commit is wrong: > > NACK. > > > 2. You are making multiple changes in a single commit here, I count at > least 3. When ever you are making multiple changes like this, you should really > split the commit up in 1 commit per change and explain in each commit > message why that change is being made (why it is necessary). Writing > commit messages like this also leads to you double-checking your own > work by carefully considering why you are making the changes. > > So about the 3 different changes: > > 2a) Make the kfree(h5) call unconditionally, which as mentioned above > is wrong. > > 2b) Introduce a call to kfree_skb(h5->rx_skb); which is not mentioned in > the commit message at all. This looks like it would actually be a sensible > change, at least in the "if (!hu->serdev)" case. When using the serdev > interface then just freeing it will leave a dangling pointer, so it > would be better (for both the hu->serdev and the !hu->serdev cases) > to call h5_reset_rx() on close instead which does: > > if (h5->rx_skb) { > kfree_skb(h5->rx_skb); > h5->rx_skb = NULL; > } > > 2c) Introduce a call to serdev_device_close(), without really explaining > why in the commit message. Again if you would have looked at the mirror > h5_close() function then you see no serdev_device_open() there. > Actually serdev_device_open() is not called anywhere in the hci_h5.c code. > > Digging a little deeper (using grep) shows that hci_uart_register_device() > calls serdev_device_open() and hci_uart_register_device() gets called from: > h5_serdev_probe(), likewise hci_uart_unregister_device() calls > serdev_device_close() for us and that gets called from h5_serdev_remove(), > so there is no need to call serdev_device_close() from h5_close() and doing > so will actually break things, because then the serdev will be left closed > on a subsequent h5_open() call. > > Regards, > > Hans > Hi, I did some more investigating and testing for this bug, and turns out I was very wrong. I'm truly sorry for that. The memory leak that is caused is caused when !hu->serdev itself, since we free h5, but not h5->rx_skb. This is what causes the memory leak. I'll send in a v3 that corrects this issue soon enough, by freeing h5->rx_skb first followed by h5 when !hu->serdev; and otherwise (when hu->serdev exists) calling h5_reset_rx(). Sorry for the trouble. Thanks, Anant
Hi, On 10/4/20 7:17 AM, Anant Thazhemadam wrote: > When h5_close() is called and !hu->serdev, h5 is directly freed. > However, h5->rx_skb is not freed before h5 is freed, which causes > a memory leak. > Freeing h5->rx_skb (when !hu->serdev) fixes this memory leak before > freeing h5. > > Fixes: ce945552fde4 ("Bluetooth: hci_h5: Add support for serdev enumerated devices") > Reported-by: syzbot+6ce141c55b2f7aafd1c4@syzkaller.appspotmail.com > Tested-by: syzbot+6ce141c55b2f7aafd1c4@syzkaller.appspotmail.com > Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com> > --- > Changes in v3: > * Free h5->rx_skb when !hu->serdev, and fix the memory leak > * Do not incorrectly and unnecessarily call serdev_device_close() > > Changes in v2: > * Fixed the Fixes tag > > Hans de Goede also suggested calling h5_reset_rx() on close (for both, !hu->serdev > and hu->serdev cases). > However, doing so seems to lead to a null-ptr-dereference error, > https://syzkaller.appspot.com/text?tag=CrashReport&x=136a9a5d900000, > and for this reason, it has not been implemented. > > Instead, directly freeing h5->rx_skb seems to suffice in preventing the memory leak > reported. > And since h5 is freed immediately after freeing h5->rx_skb, assigning h5->rx_skb to > be NULL isn't necessary. > > drivers/bluetooth/hci_h5.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/bluetooth/hci_h5.c b/drivers/bluetooth/hci_h5.c > index e41854e0d79a..171e55c080ce 100644 > --- a/drivers/bluetooth/hci_h5.c > +++ b/drivers/bluetooth/hci_h5.c > @@ -248,8 +248,10 @@ static int h5_close(struct hci_uart *hu) > if (h5->vnd && h5->vnd->close) > h5->vnd->close(h5); > > - if (!hu->serdev) > + if (!hu->serdev){ > + kfree_skb(h5->rx_skb); > kfree(h5); > + } > > return 0; > } > To fully fix the memleak you also need to add a kfree_skb(h5->rx_skb); call to the end of h5_serdev_remove(), because in the hu->serdev case that is where the h5 struct will be free-ed (it is free-ed after that function exits). Regards, Hans
On 05-10-2020 14:48, Hans de Goede wrote: > To fully fix the memleak you also need to add a kfree_skb(h5->rx_skb); > call to the end of h5_serdev_remove(), because in the hu->serdev case > that is where the h5 struct will be free-ed (it is free-ed after that > function exits). Hi Hans, I'm not entirely convinced that it might be entirely the best idea to do that. * The bug detected by syzbot only provides us with reproducer and information about this bug (which gets triggered when !hu->serdev). Even if like you said, there might be a memory leak unattended to when hu->serdev exists, then this might not necessarily be the right place to fix it. * From what I can see, all the drivers that have modified to provide serdev support have different close() mechanisms. However, one thing they do have in common (in this context) is that their respective serdev_remove() function simply calls hci_uart_unregister_device() to unregister the device. It is primarily for this reason that I feel adding a kfree_skb() call at the end of h5_serdev_remove() might not exactly be the best way we could solve this (and since this hasn't been picked up by syzbot yet, there's no way to know if this just fixes things or ends up causing unforeseen complications). Alternatively, wouldn't freeing h5->rx_skb and assigning it to NULL, for both hu->serdev and !hu->serdev cases within h5_close() itself be a better approach? I've also taken the liberty of testing a patch that does this, and it seems to work correctly too. :) But then again, I'm not exactly an authority on how this works. Thanks, Anant
Hi, On 10/6/20 4:44 AM, Anant Thazhemadam wrote: > On 05-10-2020 14:48, Hans de Goede wrote: >> To fully fix the memleak you also need to add a kfree_skb(h5->rx_skb); >> call to the end of h5_serdev_remove(), because in the hu->serdev case >> that is where the h5 struct will be free-ed (it is free-ed after that >> function exits). > > Hi Hans, > > I'm not entirely convinced that it might be entirely the best idea to do > that. > > * The bug detected by syzbot only provides us with reproducer and > information about this bug (which gets triggered when !hu->serdev). > Even if like you said, there might be a memory leak unattended to when > hu->serdev exists, then this might not necessarily be the right place to fix > it. > > * From what I can see, all the drivers that have modified to provide serdev > support have different close() mechanisms. > However, one thing they do have in common (in this context) is that their > respective serdev_remove() function simply calls hci_uart_unregister_device() > to unregister the device. > It is primarily for this reason that I feel adding a kfree_skb() call at the end > of h5_serdev_remove() might not exactly be the best way we could solve this > (and since this hasn't been picked up by syzbot yet, there's no way to know if > this just fixes things or ends up causing unforeseen complications). > > Alternatively, wouldn't freeing h5->rx_skb and assigning it to NULL, for both > hu->serdev and !hu->serdev cases within h5_close() itself be a better > approach? That will indeed also fix the leak in both cases and is more consistent wrt when the free_skb happens. So this sounds good to me, go for it. Regards, Hans > I've also taken the liberty of testing a patch that does this, and it > seems to work correctly too. :) > > But then again, I'm not exactly an authority on how this works. > > Thanks, > Anant >
diff --git a/drivers/bluetooth/hci_h5.c b/drivers/bluetooth/hci_h5.c index e41854e0d79a..3d1585add572 100644 --- a/drivers/bluetooth/hci_h5.c +++ b/drivers/bluetooth/hci_h5.c @@ -248,8 +248,12 @@ static int h5_close(struct hci_uart *hu) if (h5->vnd && h5->vnd->close) h5->vnd->close(h5); - if (!hu->serdev) - kfree(h5); + if (hu->serdev) + serdev_device_close(hu->serdev); + + kfree_skb(h5->rx_skb); + kfree(h5); + h5 = NULL; return 0; }