Message ID: 20241127212027.2704515-1-max.kellermann@ionos.com
State: New
Series: fs/ceph/file: fix memory leaks in __ceph_sync_read()
On Fri, Nov 29, 2024 at 9:06 AM Max Kellermann <max.kellermann@ionos.com> wrote:
>
> On Thu, Nov 28, 2024 at 1:18 PM Alex Markuze <amarkuze@redhat.com> wrote:
> > Pages are freed in `ceph_osdc_put_request`, trying to release them
> > this way will end badly.
>
> Is there anybody else who can explain this to me?
> I believe Alex is wrong and my patch is correct, but maybe I'm missing
> something.

It's been a week. Is there really nobody who understands this piece of
code? I believe I do understand it, but my understanding conflicts with
Alex's, and he's the expert (and I'm not).
On Thu, Dec 5, 2024 at 9:32 AM Max Kellermann <max.kellermann@ionos.com> wrote:
>
> On Fri, Nov 29, 2024 at 9:06 AM Max Kellermann <max.kellermann@ionos.com> wrote:
> >
> > On Thu, Nov 28, 2024 at 1:18 PM Alex Markuze <amarkuze@redhat.com> wrote:
> > > Pages are freed in `ceph_osdc_put_request`, trying to release them
> > > this way will end badly.
> >
> > Is there anybody else who can explain this to me?
> > I believe Alex is wrong and my patch is correct, but maybe I'm missing
> > something.
>
> It's been a week. Is there really nobody who understands this piece of
> code? I believe I do understand it, but my understanding conflicts
> with Alex's, and he's the expert (and I'm not).

Hi Max,

Your understanding is correct. Pages would be freed automatically
together with the request only if the ownership is transferred by
passing true for own_pages to osd_req_op_extent_osd_data_pages(), which
__ceph_sync_read() doesn't do.

These error path leaks were introduced in commits 03bc06c7b0bd ("ceph:
add new mount option to enable sparse reads") and f0fe1e54cfcf ("ceph:
plumb in decryption during reads") with support for fscrypt. Looking
at the former commit, it looks like a similar leak was introduced in
ceph_direct_read_write() too -- on bvecs instead of pages.

I have applied this patch and will take care of the leak on bvecs
myself because I think I see other issues there.

Thanks,

Ilya
This is a bad patch, I don't appreciate partial fixes that introduce
unnecessary complications to the code, and it conflicts with the
complete fix in the other thread. Please coordinate with me in the
future.

On Thu, Dec 5, 2024 at 1:25 PM Ilya Dryomov <idryomov@gmail.com> wrote:
>
> On Thu, Dec 5, 2024 at 9:32 AM Max Kellermann <max.kellermann@ionos.com> wrote:
> >
> > On Fri, Nov 29, 2024 at 9:06 AM Max Kellermann <max.kellermann@ionos.com> wrote:
> > >
> > > On Thu, Nov 28, 2024 at 1:18 PM Alex Markuze <amarkuze@redhat.com> wrote:
> > > > Pages are freed in `ceph_osdc_put_request`, trying to release them
> > > > this way will end badly.
> > >
> > > Is there anybody else who can explain this to me?
> > > I believe Alex is wrong and my patch is correct, but maybe I'm missing
> > > something.
> >
> > It's been a week. Is there really nobody who understands this piece of
> > code? I believe I do understand it, but my understanding conflicts
> > with Alex's, and he's the expert (and I'm not).
>
> Hi Max,
>
> Your understanding is correct. Pages would be freed automatically
> together with the request only if the ownership is transferred by
> passing true for own_pages to osd_req_op_extent_osd_data_pages(), which
> __ceph_sync_read() doesn't do.
>
> These error path leaks were introduced in commits 03bc06c7b0bd ("ceph:
> add new mount option to enable sparse reads") and f0fe1e54cfcf ("ceph:
> plumb in decryption during reads") with support for fscrypt. Looking
> at the former commit, it looks like a similar leak was introduced in
> ceph_direct_read_write() too -- on bvecs instead of pages.
>
> I have applied this patch and will take care of the leak on bvecs
> myself because I think I see other issues there.
>
> Thanks,
>
> Ilya
On Thu, Dec 5, 2024 at 12:31 PM Alex Markuze <amarkuze@redhat.com> wrote:
> This is a bad patch, I don't appreciate partial fixes that introduce
> unnecessary complications to the code, and it conflicts with the
> complete fix in the other thread.

Alex, and I don't appreciate the unnecessary complications you
introduce to the Ceph contribution process!

The mistake you made in your first review ("will end badly") is not a
big deal; happens to everybody - but you still don't admit the mistake
and you ghosted me for a week. But then saying you don't appreciate
the work of somebody who found a bug and posted a simple fix is not
good communication. You can say you prefer a different patch and
explain the technical reasons; but saying you don't appreciate it is
quite condescending.

Now back to the technical facts:

- What exactly about my patch is "bad"?
- Do you believe my patch is not strictly an improvement?
- Why do you believe my fix is only "partial"?
- What unnecessary complications are introduced by my two-line patch
  in your opinion?
- What "other thread"? I can't find anything on LKML and ceph-devel.

Max
The full fix is now in the testing branch.

Max, please follow the mailing list, I posted the patch last week on
the initial thread regarding this issue. Please, comment on the
correct thread, having two threads regarding the same issue introduces
unnecessary confusion.

The fix resolves the following tracker.
https://tracker.ceph.com/issues/67524

Additionally, these issues are no longer recreated after the fix.
https://tracker.ceph.com/issues/68981
https://tracker.ceph.com/issues/68980

I will make a couple runs with KASAN and its peers, as it's not
immediately clear why these two issues are affected.

On Thu, Dec 5, 2024 at 2:02 PM Max Kellermann <max.kellermann@ionos.com> wrote:
>
> On Thu, Dec 5, 2024 at 12:31 PM Alex Markuze <amarkuze@redhat.com> wrote:
> > This is a bad patch, I don't appreciate partial fixes that introduce
> > unnecessary complications to the code, and it conflicts with the
> > complete fix in the other thread.
>
> Alex, and I don't appreciate the unnecessary complications you
> introduce to the Ceph contribution process!
>
> The mistake you made in your first review ("will end badly") is not a
> big deal; happens to everybody - but you still don't admit the mistake
> and you ghosted me for a week. But then saying you don't appreciate
> the work of somebody who found a bug and posted a simple fix is not
> good communication. You can say you prefer a different patch and
> explain the technical reasons; but saying you don't appreciate it is
> quite condescending.
>
> Now back to the technical facts:
>
> - What exactly about my patch is "bad"?
> - Do you believe my patch is not strictly an improvement?
> - Why do you believe my fix is only "partial"?
> - What unnecessary complications are introduced by my two-line patch
>   in your opinion?
> - What "other thread"? I can't find anything on LKML and ceph-devel.
>
> Max
On Thu, Dec 5, 2024 at 1:02 PM Max Kellermann <max.kellermann@ionos.com> wrote:
> - What "other thread"? I can't find anything on LKML and ceph-devel.
Just in case you mean this patch authored by you:
https://github.com/ceph/ceph-client/commit/2a802a906f9c89f8ae492dbfcd82ff41272abab1
I don't think that's a good patch, and if I had the power, I would
certainly reject it, because:
- it's big and confusing; hard to review
- it's badly documented; the commit message is just "fixing a race
condition when a file shrinks" but other than that, doesn't explain
anything; a proper explanation is necessary for such a complicated
diff
- the patch changes many distinct things in one patch, but it should
really be split
- this patch is about the buffer overflow for which my patch is much
simpler: https://lore.kernel.org/lkml/20241127212130.2704804-1-max.kellermann@ionos.com/
which I suggested merging instead of all the other candidate patches
https://lore.kernel.org/lkml/CAKPOu+9kdcjMf36bF3HAW4K8v0mHxXQX3_oQfGSshmXBKtS43A@mail.gmail.com/
but you did not reply (as usual, sigh!)
- deeply hidden in this patch is also a fix for the memory leak, but
instead of making one large patch which fixes everything, you should
first merge my trivial leak bug fix and then the fix for the buffer
overflow on top
On Thu, Dec 5, 2024 at 1:17 PM Alex Markuze <amarkuze@redhat.com> wrote:
>
> The full fix is now in the testing branch.

Alex, yet again, you did not reply to any of my questions! This is
tiring.

The full fix may be a "full" fix for something else, and just happens
to mix in several other unrelated things, e.g. a fix for the memory
leak bug I found. That is what I call "bad".

My patch is not a "partial" fix. It is a full fix for the memory leak
bug. It just doesn't fix anything else, but that's how it's supposed
to be (and certainly not "bad"): a tiny patch that can be reviewed
easily that addresses just one issue. That's the opposite of
"complications".

> Max, please follow the mailing list, I posted the patch last week on
> the initial thread regarding this issue. Please, comment on the
> correct thread, having two threads regarding the same issue introduces
> unnecessary confusion.

But... THIS is the initial thread regarding this issue (= memory
leak). It is you who creates unnecessary confusion: with your emails
and with your code.
I will explain the process for ceph client patches. It's important to
note: The process itself and part of the automation is still evolving,
so many things have to be done manually.

1. A patch is created, pushed into a local testing branch, and tested
   with xfstests on a kernel compiled with KASAN; several static
   analysis tools are run as well.
2. The local testing branch is merged with the `testing` branch, which
   implies that all Sepia lab teuthology tests run on this new kernel
   -- so care must be taken before pushing.
3. The patch is reviewed on the mailing list by the community; when
   acked, Ilya takes it to the main branch and eventually to the
   upstream kernel.

I will break it up into three separate patches, as it addresses three
different issues. So that's a good comment. Any other constructive
comments will be appreciated, both regarding the patch and the
process.

I didn't send an RFC yet; I need to understand how the other two
issues were fixed as well. I will send an RFC when I'm done, as I need
to make sure all root causes are fixed. I prefer fixing things once.

On Thu, Dec 5, 2024 at 2:22 PM Max Kellermann <max.kellermann@ionos.com> wrote:
>
> On Thu, Dec 5, 2024 at 1:02 PM Max Kellermann <max.kellermann@ionos.com> wrote:
> > - What "other thread"? I can't find anything on LKML and ceph-devel.
>
> Just in case you mean this patch authored by you:
> https://github.com/ceph/ceph-client/commit/2a802a906f9c89f8ae492dbfcd82ff41272abab1
>
> I don't think that's a good patch, and if I had the power, I would
> certainly reject it, because:
>
> - it's big and confusing; hard to review
> - it's badly documented; the commit message is just "fixing a race
>   condition when a file shrinks" but other than that, doesn't explain
>   anything; a proper explanation is necessary for such a complicated
>   diff
> - the patch changes many distinct things in one patch, but it should
>   really be split
> - this patch is about the buffer overflow for which my patch is much
>   simpler: https://lore.kernel.org/lkml/20241127212130.2704804-1-max.kellermann@ionos.com/
>   which I suggested merging instead of all the other candidate patches
>   https://lore.kernel.org/lkml/CAKPOu+9kdcjMf36bF3HAW4K8v0mHxXQX3_oQfGSshmXBKtS43A@mail.gmail.com/
>   but you did not reply (as usual, sigh!)
> - deeply hidden in this patch is also a fix for the memory leak, but
>   instead of making one large patch which fixes everything, you should
>   first merge my trivial leak bug fix and then the fix for the buffer
>   overflow on top
On Thu, Dec 5, 2024 at 1:57 PM Alex Markuze <amarkuze@redhat.com> wrote:
>
> I will explain the process for ceph client patches. It's important to
> note: The process itself and part of the automation is still evolving
> and so many things have to be done manually.

None of this answers any of my questions on your negative review
comments.

> I will break it up into three separate patches, as it addresses three
> different issues.

... one of which will be my patch.
On Thu, Dec 5, 2024 at 2:08 PM Max Kellermann <max.kellermann@ionos.com> wrote:
>
> On Thu, Dec 5, 2024 at 1:57 PM Alex Markuze <amarkuze@redhat.com> wrote:
> >
> > I will explain the process for ceph client patches. It's important to
> > note: The process itself and part of the automation is still evolving
> > and so many things have to be done manually.
>
> None of this answers any of my questions on your negative review comments.
>
> > I will break it up into three separate patches, as it addresses three
> > different issues.
>
> ... one of which will be my patch.

It looks like Alex's preference is to address these memory leaks by
passing true for own_pages to osd_req_op_extent_osd_data_pages() and
adjusting where the OSD request is put accordingly -- this is what he
folded in his WIP patch which was posted in Luis' thread [1] the day
after this patch and pushed to the testing branch earlier today [2].
Unfortunately an update to this thread was missed, leaving everyone
including myself confused.

Max, would you be willing to redo this patch to pass true for own_pages
and post a v2? There is nothing "bad", "partial" or otherwise wrong
with this version, but having the pages be taken care of automatically
is a bit nicer and a conflict with Alex's ongoing work would be avoided.

[1] https://lore.kernel.org/ceph-devel/CAO8a2ShzHuTizjY+T+RNr28XLGOJPNoXJf1rQUburMBVwJywMA@mail.gmail.com/
[2] https://github.com/ceph/ceph-client/commit/2a802a906f9c89f8ae492dbfcd82ff41272abab1

Thanks,

Ilya
On Thu, Dec 5, 2024 at 4:12 PM Ilya Dryomov <idryomov@gmail.com> wrote:
> Max, would you be willing to redo this patch to pass true for own_pages
> and post a v2? There is nothing "bad", "partial" or otherwise wrong
> with this version, but having the pages be taken care of automatically
> is a bit nicer and a conflict with Alex's ongoing work would be avoided.

Yes, I will send a patch for this. Even though I don't agree that this
is the best way forward; I'd prefer to fix the leak with a minimal
patch adding only the necessary calls, and not mix this with code
refactoring. A minimal fix is easier to review. Mixing a bug fix with
refactoring is more dangerous, which is important to avoid, because
this patch will be put in all stable branches. But ... if you want to,
I'll do that.

btw. Alex's patch
(https://github.com/ceph/ceph-client/commit/2a802a906f9c89f8ae4)
introduces another memory leak: by postponing the
ceph_osdc_put_request() call, the -EFAULT branch now leaks the
ceph_osd_request object. I'll take care not to make the same mistake
in my v2.
On Thu, Dec 5, 2024 at 4:37 PM Max Kellermann <max.kellermann@ionos.com> wrote:
> btw. Alex's patch
> (https://github.com/ceph/ceph-client/commit/2a802a906f9c89f8ae4)
> introduces another memory leak

Sorry, that was wrong. The "break" only exits the inner "while" loop.
I got confused by nested loops.
On Thu, Dec 5, 2024 at 5:30 PM Alex Markuze <amarkuze@redhat.com> wrote:
>
> Good.
> This sequence has not been tested independently, but it should be fine.

Applied.

Thanks,

Ilya

> On Thu, Dec 5, 2024 at 5:49 PM Max Kellermann <max.kellermann@ionos.com> wrote:
> >
> > In two `break` statements, the call to ceph_release_page_vector() was
> > missing, leaking the allocation from ceph_alloc_page_vector().
> >
> > Instead of adding the missing ceph_release_page_vector() calls, the
> > Ceph maintainers preferred to transfer page ownership to the
> > `ceph_osd_request` by passing `own_pages=true` to
> > osd_req_op_extent_osd_data_pages(). This requires postponing the
> > ceph_osdc_put_request() call until after the block that accesses the
> > `pages`.
> >
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
> > ---
> >  fs/ceph/file.c | 7 +++----
> >  1 file changed, 3 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> > index 4b8d59ebda00..ce342a5d4b8b 100644
> > --- a/fs/ceph/file.c
> > +++ b/fs/ceph/file.c
> > @@ -1127,7 +1127,7 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
> >
> >  		osd_req_op_extent_osd_data_pages(req, 0, pages, read_len,
> >  						 offset_in_page(read_off),
> > -						 false, false);
> > +						 false, true);
> >
> >  		op = &req->r_ops[0];
> >  		if (sparse) {
> > @@ -1186,8 +1186,6 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
> >  			ret = min_t(ssize_t, fret, len);
> >  		}
> >
> > -		ceph_osdc_put_request(req);
> > -
> >  		/* Short read but not EOF? Zero out the remainder. */
> >  		if (ret >= 0 && ret < len && (off + ret < i_size)) {
> >  			int zlen = min(len - ret, i_size - off - ret);
> > @@ -1221,7 +1219,8 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
> >  				break;
> >  			}
> >  		}
> > -		ceph_release_page_vector(pages, num_pages);
> > +
> > +		ceph_osdc_put_request(req);
> >
> >  		if (ret < 0) {
> >  			if (ret == -EBLOCKLISTED)
> > --
> > 2.45.2
> >
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 4b8d59ebda00..24d0f1cc9aac 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1134,6 +1134,7 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
 			extent_cnt = __ceph_sparse_read_ext_count(inode, read_len);
 			ret = ceph_alloc_sparse_ext_map(op, extent_cnt);
 			if (ret) {
+				ceph_release_page_vector(pages, num_pages);
 				ceph_osdc_put_request(req);
 				break;
 			}
@@ -1168,6 +1169,7 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
 				op->extent.sparse_ext_cnt);
 		if (fret < 0) {
 			ret = fret;
+			ceph_release_page_vector(pages, num_pages);
 			ceph_osdc_put_request(req);
 			break;
 		}
In two `break` statements, the call to ceph_release_page_vector() was
missing, leaking the allocation from ceph_alloc_page_vector().

Cc: stable@vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
---
 fs/ceph/file.c | 2 ++
 1 file changed, 2 insertions(+)