[1/1] dma-buf: heaps: Map system heap pages as managed by linux vm

Message ID 20210128083817.314315-1-surenb@google.com
State New

Commit Message

Suren Baghdasaryan Jan. 28, 2021, 8:38 a.m. UTC
Currently system heap maps its buffers with VM_PFNMAP flag using
remap_pfn_range. This results in such buffers not being accounted
for in PSS calculations because vm treats this memory as having no
page structs. Without page structs there are no counters representing
how many processes are mapping a page and therefore PSS calculation
is impossible.
Historically, ION driver used to map its buffers as VM_PFNMAP areas
due to memory carveouts that did not have page structs [1]. That
is not the case anymore and it seems there was desire to move away
from remap_pfn_range [2].
Dmabuf system heap design inherits this ION behavior and maps its
pages using remap_pfn_range even though allocated pages are backed
by page structs.
Clear VM_IO and VM_PFNMAP flags when mapping memory allocated by the
system heap and replace remap_pfn_range with vm_insert_page, following
Laura's suggestion in [1]. This would allow correct PSS calculation
for dmabufs.

[1] https://driverdev-devel.linuxdriverproject.narkive.com/v0fJGpaD/using-ion-memory-for-direct-io
[2] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-October/127519.html
(sorry, could not find lore links for these discussions)

Suggested-by: Laura Abbott <labbott@kernel.org>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 drivers/dma-buf/heaps/system_heap.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
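
To make the PSS argument above concrete, here is a small userspace sketch (not kernel code) of how a proportional share is accumulated. It assumes a simplified model in which each page contributes its size divided by the number of processes mapping it, and in which pages without a page struct (the VM_PFNMAP case) are skipped entirely. All names (sketch_mapping, sketch_pss_bytes, SKETCH_PAGE_SIZE) are hypothetical and exist only for illustration; the kernel's real smaps accounting is more involved.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Hypothetical model of one page mapped into the process being measured. */
struct sketch_mapping {
	unsigned int mapcount;	/* number of processes mapping this page */
	int has_struct_page;	/* 0 models a page mapped via VM_PFNMAP */
};

/*
 * Sum the proportional share of every mapped page, in bytes.
 * Pages without a page struct have no map counter to divide by,
 * so they contribute nothing, which is the accounting gap
 * described in the commit message.
 */
uint64_t sketch_pss_bytes(const struct sketch_mapping *maps, size_t n)
{
	uint64_t pss = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!maps[i].has_struct_page || maps[i].mapcount == 0)
			continue;
		pss += SKETCH_PAGE_SIZE / maps[i].mapcount;
	}
	return pss;
}
```

In this model, a buffer mapped only through VM_PFNMAP always reports zero regardless of how many processes share it, while the same buffer mapped with page-struct-backed pages is split among its mappers.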

Comments

Suren Baghdasaryan Feb. 2, 2021, 1:08 a.m. UTC | #1
On Thu, Jan 28, 2021 at 11:00 AM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Thu, Jan 28, 2021 at 10:19 AM Minchan Kim <minchan@kernel.org> wrote:
> >
> > On Thu, Jan 28, 2021 at 09:52:59AM -0800, Suren Baghdasaryan wrote:
> > > On Thu, Jan 28, 2021 at 1:13 AM Christoph Hellwig <hch@infradead.org> wrote:
> > > >
> > > > On Thu, Jan 28, 2021 at 12:38:17AM -0800, Suren Baghdasaryan wrote:
> > > > > Currently system heap maps its buffers with VM_PFNMAP flag using
> > > > > remap_pfn_range. This results in such buffers not being accounted
> > > > > for in PSS calculations because vm treats this memory as having no
> > > > > page structs. Without page structs there are no counters representing
> > > > > how many processes are mapping a page and therefore PSS calculation
> > > > > is impossible.
> > > > > Historically, ION driver used to map its buffers as VM_PFNMAP areas
> > > > > due to memory carveouts that did not have page structs [1]. That
> > > > > is not the case anymore and it seems there was desire to move away
> > > > > from remap_pfn_range [2].
> > > > > Dmabuf system heap design inherits this ION behavior and maps its
> > > > > pages using remap_pfn_range even though allocated pages are backed
> > > > > by page structs.
> > > > > Clear VM_IO and VM_PFNMAP flags when mapping memory allocated by the
> > > > > system heap and replace remap_pfn_range with vm_insert_page, following
> > > > > Laura's suggestion in [1]. This would allow correct PSS calculation
> > > > > for dmabufs.
> > > > >
> > > > > [1] https://driverdev-devel.linuxdriverproject.narkive.com/v0fJGpaD/using-ion-memory-for-direct-io
> > > > > [2] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-October/127519.html
> > > > > (sorry, could not find lore links for these discussions)
> > > > >
> > > > > Suggested-by: Laura Abbott <labbott@kernel.org>
> > > > > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > > > > ---
> > > > >  drivers/dma-buf/heaps/system_heap.c | 6 ++++--
> > > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
> > > > > index 17e0e9a68baf..0e92e42b2251 100644
> > > > > --- a/drivers/dma-buf/heaps/system_heap.c
> > > > > +++ b/drivers/dma-buf/heaps/system_heap.c
> > > > > @@ -200,11 +200,13 @@ static int system_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
> > > > >       struct sg_page_iter piter;
> > > > >       int ret;
> > > > >
> > > > > +     /* All pages are backed by a "struct page" */
> > > > > +     vma->vm_flags &= ~VM_PFNMAP;
> > > >
> > > > Why do we clear this flag?  It shouldn't even be set here as far as I
> > > > can tell.
> > >
> > > Thanks for the question, Christoph.
> > > I tracked down that flag being set by drm_gem_mmap_obj() which DRM
> > > drivers use to "Set up the VMA to prepare mapping of the GEM object"
> > > (according to drm_gem_mmap_obj comments). I also see a pattern in
> > > several DRM drivers to call drm_gem_mmap_obj()/drm_gem_mmap(), then
> > > clear VM_PFNMAP and then map the VMA (for example here:
> > > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/rockchip/rockchip_drm_gem.c#L246).
> > > I thought that dmabuf allocator (in this case the system heap) would
> > > be the right place to set these flags because it controls how memory
> > > is allocated before mapping. However it's quite possible that I'm
> >
> > However, you're not setting a flag but removing one from under the caller.
> > That is different from appending more flags (i.e., removing a condition
> > vs. adding more conditions). If we have to remove the flag, the caller
> > didn't need to set it in the first place. Hiding that under this API
> > will keep enabling wrong use cases in the future.
>
> Which takes us back to the question of why VM_PFNMAP is being set by
> the caller in the first place.
>
> >
> > > missing the real reason for VM_PFNMAP being set in drm_gem_mmap_obj()
> > > before dma_buf_mmap() is called. I could not find the answer to that,
> > > so I hope someone here can clarify that.
> >
> > I guess DRM used carved-out pure PFN memory a long time ago and
> > switched to dmabuf at some point.
>
> It would be really good to know the reason for sure to address the
> issue properly.
>
> > Whatever the history is, rather than removing the flag from
> > under them, let's add WARN_ON(vma->vm_flags & VM_PFNMAP) so
> > we can catch such callers, clean them up, and start a discussion.
>
> The issue with not clearing the flag here is that vm_insert_page() has
> a BUG_ON(vma->vm_flags & VM_PFNMAP). If we do not clear this flag I
> suspect we will get many angry developers :)
> If your above guess is correct and we can mandate dmabuf heap users
> not to use VM_PFNMAP then I think the following code might be the best
> way forward:
>
> +       bool pfn_requested = !!(vma->vm_flags & VM_PFNMAP);
> +       WARN_ON_ONCE(pfn_requested);
>
>         for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
>                 struct page *page = sg_page_iter_page(&piter);
>
> -               ret = remap_pfn_range(vma, addr, page_to_pfn(page), PAGE_SIZE,
> -                                     vma->vm_page_prot);
> +               ret = pfn_requested ?
> +                       remap_pfn_range(vma, addr, page_to_pfn(page), PAGE_SIZE,
> +                                     vma->vm_page_prot) :
> +                       vm_insert_page(vma, addr, page);

Folks, any objections to the approach above?

Suren Baghdasaryan Feb. 2, 2021, 8:44 a.m. UTC | #2
On Mon, Feb 1, 2021 at 11:03 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> IMHO the
>
>         BUG_ON(vma->vm_flags & VM_PFNMAP);
>
> in vm_insert_page should just become a WARN_ON_ONCE with an error
> return, and then we just need to gradually fix up the callers that
> trigger it instead of coming up with workarounds like this.

For the existing vm_insert_page users this should be fine since
BUG_ON() guarantees that none of them sets VM_PFNMAP. However, for the
system_heap_mmap I have one concern. When vm_insert_page returns an
error due to VM_PFNMAP flag, the whole mmap operation should fail
(system_heap_mmap returning an error leading to dma_buf_mmap failure).
Could there be cases when a heap user (DRM driver for example) would
be expected to work with a heap which requires VM_PFNMAP and at the
same time with another heap which requires !VM_PFNMAP? IOW, this
introduces a dependency between the heap and its
user. The user would have to know expectations of the heap it uses and
can't work with another heap that has the opposite expectation. This
usecase is purely theoretical and maybe I should not worry about it
for now?

Christoph Hellwig Feb. 2, 2021, 8:51 a.m. UTC | #3
On Tue, Feb 02, 2021 at 12:44:44AM -0800, Suren Baghdasaryan wrote:
> On Mon, Feb 1, 2021 at 11:03 PM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > IMHO the
> >
> >         BUG_ON(vma->vm_flags & VM_PFNMAP);
> >
> > in vm_insert_page should just become a WARN_ON_ONCE with an error
> > return, and then we just need to gradually fix up the callers that
> > trigger it instead of coming up with workarounds like this.
> 
> For the existing vm_insert_page users this should be fine since
> BUG_ON() guarantees that none of them sets VM_PFNMAP.

Even for them, WARN_ON_ONCE plus an actual error return is a far
better assert that is much more developer friendly.

> However, for the
> system_heap_mmap I have one concern. When vm_insert_page returns an
> error due to VM_PFNMAP flag, the whole mmap operation should fail
> (system_heap_mmap returning an error leading to dma_buf_mmap failure).
> Could there be cases when a heap user (DRM driver for example) would
> be expected to work with a heap which requires VM_PFNMAP and at the
> same time with another heap which requires !VM_PFNMAP? IOW, this
> introduces a dependency between the heap and its
> user. The user would have to know expectations of the heap it uses and
> can't work with another heap that has the opposite expectation. This
> usecase is purely theoretical and maybe I should not worry about it
> for now?

If such a case ever arises we can look into it.
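
The warn-once-and-return-an-error behaviour discussed in this exchange can be sketched in plain userspace C. This is only a model under stated assumptions, not the kernel implementation: SKETCH_VM_PFNMAP, struct sketch_vma, sketch_insert_page() and sketch_heap_mmap() are hypothetical stand-ins, and the flag value is arbitrary. It shows how a rejected insert propagates up and fails the whole mmap operation instead of crashing.

```c
#include <errno.h>
#include <stdio.h>

#define SKETCH_VM_PFNMAP 0x1ul	/* arbitrary value, for illustration only */

/* Hypothetical stand-in for the part of the VMA this sketch cares about. */
struct sketch_vma {
	unsigned long vm_flags;
};

/*
 * Models an insert that refuses PFN-mapped VMAs: warn the first time,
 * then return an error on every offending call instead of halting.
 */
int sketch_insert_page(const struct sketch_vma *vma)
{
	static int warned;

	if (vma->vm_flags & SKETCH_VM_PFNMAP) {
		if (!warned) {
			warned = 1;
			fprintf(stderr, "sketch: VM_PFNMAP set on insert\n");
		}
		return -EINVAL;
	}
	return 0;
}

/* Models the heap mmap loop: the first failing page fails the whole mmap. */
int sketch_heap_mmap(const struct sketch_vma *vma, unsigned int nr_pages)
{
	unsigned int i;
	int ret;

	for (i = 0; i < nr_pages; i++) {
		ret = sketch_insert_page(vma);
		if (ret)
			return ret;
	}
	return 0;
}
```

With this shape, a caller that leaves the flag set gets an error back on every attempt and one warning in the log, which is the failure mode raised above for system_heap_mmap.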

Patch

diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 17e0e9a68baf..0e92e42b2251 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -200,11 +200,13 @@  static int system_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
 	struct sg_page_iter piter;
 	int ret;
 
+	/* All pages are backed by a "struct page" */
+	vma->vm_flags &= ~VM_PFNMAP;
+
 	for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
 		struct page *page = sg_page_iter_page(&piter);
 
-		ret = remap_pfn_range(vma, addr, page_to_pfn(page), PAGE_SIZE,
-				      vma->vm_page_prot);
+		ret = vm_insert_page(vma, addr, page);
 		if (ret)
 			return ret;
 		addr += PAGE_SIZE;