[v2,1/3] dma-buf: heaps: Add deferred-free-helper library code

Message ID 20210123034655.102813-1-john.stultz@linaro.org
State Superseded
Series [v2,1/3] dma-buf: heaps: Add deferred-free-helper library code

Commit Message

John Stultz Jan. 23, 2021, 3:46 a.m. UTC
This patch provides infrastructure for deferring buffer frees.

This is a feature ION provided which, when used with some form of
page pool, provides a nice performance boost in an allocation
microbenchmark. It helps because it allows the page zeroing to be
moved out of the normal allocation/free path and pushed off to a
kthread.

As not all heaps will find this useful, it is implemented as an
optional helper library that heaps can utilize.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
v2:
* Fix sleep-in-atomic issue from using a mutex by switching
  to a spinlock, as Reported-by: kernel test robot <oliver.sang@intel.com>
* Clean up the API to use a reason enum for clarity and add some
  documentation comments, as suggested by Suren Baghdasaryan.
---
 drivers/dma-buf/heaps/Kconfig                |   3 +
 drivers/dma-buf/heaps/Makefile               |   1 +
 drivers/dma-buf/heaps/deferred-free-helper.c | 136 +++++++++++++++++++
 drivers/dma-buf/heaps/deferred-free-helper.h |  55 ++++++++
 4 files changed, 195 insertions(+)
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.c
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.h

Comments

Daniel Mentz Jan. 27, 2021, 8:21 p.m. UTC | #1
On Fri, Jan 22, 2021 at 7:47 PM John Stultz <john.stultz@linaro.org> wrote:
> +static int system_heap_clear_pages(struct page **pages, int num, pgprot_t pgprot)
> +{
> +       void *addr = vmap(pages, num, VM_MAP, pgprot);
> +
> +       if (!addr)
> +               return -ENOMEM;
> +       memset(addr, 0, PAGE_SIZE * num);
> +       vunmap(addr);
> +       return 0;
> +}

I thought that vmap/vunmap are expensive, and I am wondering if
there's a faster way that avoids vmap.

How about lifting this code from lib/iov_iter.c
static void memzero_page(struct page *page, size_t offset, size_t len)
{
        char *addr = kmap_atomic(page);
        memset(addr + offset, 0, len);
        kunmap_atomic(addr);
}

Or what about lifting that code from the old ion_cma_heap.c

if (PageHighMem(pages)) {
        unsigned long nr_clear_pages = nr_pages;
        struct page *page = pages;

        while (nr_clear_pages > 0) {
                void *vaddr = kmap_atomic(page);

                memset(vaddr, 0, PAGE_SIZE);
                kunmap_atomic(vaddr);
                page++;
                nr_clear_pages--;
        }
} else {
        memset(page_address(pages), 0, size);
}
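
For illustration, a minimal sketch of how the lifted memzero_page()
could replace the vmap-based clearing, assuming the caller walks the
heap's sg_table one PAGE_SIZE chunk at a time (the caller's function
name is illustrative, not from this series):

static void memzero_page(struct page *page, size_t offset, size_t len)
{
	char *addr = kmap_atomic(page);

	memset(addr + offset, 0, len);
	kunmap_atomic(addr);
}

/* Illustrative caller: zero every PAGE_SIZE chunk of an sg_table. */
static void heap_zero_sgtable(struct sg_table *table)
{
	struct sg_page_iter piter;

	for_each_sgtable_page(table, &piter, 0)
		memzero_page(sg_page_iter_page(&piter), 0, PAGE_SIZE);
}
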
John Stultz Jan. 28, 2021, 5:10 a.m. UTC | #2
On Wed, Jan 27, 2021 at 12:21 PM Daniel Mentz <danielmentz@google.com> wrote:
>
> On Fri, Jan 22, 2021 at 7:47 PM John Stultz <john.stultz@linaro.org> wrote:
> > +static int system_heap_clear_pages(struct page **pages, int num, pgprot_t pgprot)
> > +{
> > +       void *addr = vmap(pages, num, VM_MAP, pgprot);
> > +
> > +       if (!addr)
> > +               return -ENOMEM;
> > +       memset(addr, 0, PAGE_SIZE * num);
> > +       vunmap(addr);
> > +       return 0;
> > +}
>
> I thought that vmap/vunmap are expensive, and I am wondering if
> there's a faster way that avoids vmap.
>
> How about lifting this code from lib/iov_iter.c
> static void memzero_page(struct page *page, size_t offset, size_t len)
> {
>         char *addr = kmap_atomic(page);
>         memset(addr + offset, 0, len);
>         kunmap_atomic(addr);
> }
>
> Or what about lifting that code from the old ion_cma_heap.c
>
> if (PageHighMem(pages)) {
>         unsigned long nr_clear_pages = nr_pages;
>         struct page *page = pages;
>
>         while (nr_clear_pages > 0) {
>                 void *vaddr = kmap_atomic(page);
>
>                 memset(vaddr, 0, PAGE_SIZE);
>                 kunmap_atomic(vaddr);
>                 page++;
>                 nr_clear_pages--;
>         }
> } else {
>         memset(page_address(pages), 0, size);
> }

Though, this last memset only works since CMA is contiguous, so it
probably needs to always do the kmap_atomic for each page, right?

I'm still a little worried about whether this is right, as the current
implementation with the vmap comes from the old ion_heap_sglist_zero
logic, which similarly tries to batch the vmaps 32 pages at a time,
but I'll give it a try.

thanks
-john
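
For reference, the old ION batching pattern John mentions looked
roughly like this (a sketch reconstructed from the former
drivers/staging/android/ion/ion_heap.c, not quoted verbatim):

static int ion_heap_clear_pages(struct page **pages, int num,
				pgprot_t pgprot)
{
	void *addr = vmap(pages, num, VM_MAP, pgprot);

	if (!addr)
		return -ENOMEM;
	memset(addr, 0, PAGE_SIZE * num);
	vunmap(addr);
	return 0;
}

static int ion_heap_sglist_zero(struct scatterlist *sgl, unsigned int nents,
				pgprot_t pgprot)
{
	int p = 0;
	int ret = 0;
	struct sg_page_iter piter;
	struct page *pages[32];	/* batch the vmaps 32 pages at a time */

	for_each_sg_page(sgl, &piter, nents, 0) {
		pages[p++] = sg_page_iter_page(&piter);
		if (p == ARRAY_SIZE(pages)) {
			ret = ion_heap_clear_pages(pages, p, pgprot);
			if (ret)
				return ret;
			p = 0;
		}
	}
	if (p)
		ret = ion_heap_clear_pages(pages, p, pgprot);

	return ret;
}
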
Daniel Mentz Jan. 28, 2021, 7:04 a.m. UTC | #3
On Wed, Jan 27, 2021 at 9:10 PM John Stultz <john.stultz@linaro.org> wrote:
>
> On Wed, Jan 27, 2021 at 12:21 PM Daniel Mentz <danielmentz@google.com> wrote:
> >
> > On Fri, Jan 22, 2021 at 7:47 PM John Stultz <john.stultz@linaro.org> wrote:
> > > +static int system_heap_clear_pages(struct page **pages, int num, pgprot_t pgprot)
> > > +{
> > > +       void *addr = vmap(pages, num, VM_MAP, pgprot);
> > > +
> > > +       if (!addr)
> > > +               return -ENOMEM;
> > > +       memset(addr, 0, PAGE_SIZE * num);
> > > +       vunmap(addr);
> > > +       return 0;
> > > +}
> >
> > I thought that vmap/vunmap are expensive, and I am wondering if
> > there's a faster way that avoids vmap.
> >
> > How about lifting this code from lib/iov_iter.c
> > static void memzero_page(struct page *page, size_t offset, size_t len)
> > {
> >         char *addr = kmap_atomic(page);
> >         memset(addr + offset, 0, len);
> >         kunmap_atomic(addr);
> > }
> >
> > Or what about lifting that code from the old ion_cma_heap.c
> >
> > if (PageHighMem(pages)) {
> >         unsigned long nr_clear_pages = nr_pages;
> >         struct page *page = pages;
> >
> >         while (nr_clear_pages > 0) {
> >                 void *vaddr = kmap_atomic(page);
> >
> >                 memset(vaddr, 0, PAGE_SIZE);
> >                 kunmap_atomic(vaddr);
> >                 page++;
> >                 nr_clear_pages--;
> >         }
> > } else {
> >         memset(page_address(pages), 0, size);
> > }
>
> Though, this last memset only works since CMA is contiguous, so it
> probably needs to always do the kmap_atomic for each page, right?

Yeah, but with the system heap page pool, some of these pages might be
64KB or 1MB large. kmap_atomic(page) just maps to page_address(page)
in most cases. I think iterating over all pages individually in this
manner might still be faster than using vmap.

> I'm still a little worried about whether this is right, as the current
> implementation with the vmap comes from the old ion_heap_sglist_zero
> logic, which similarly tries to batch the vmaps 32 pages at a time,
> but I'll give it a try.
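
A sketch of that per-page approach, assuming a possibly higher-order
pool page (with 4K pages, 64KB is order 4 and 1MB is order 8); the
function name is illustrative, not from this series. kmap_atomic()
collapses to page_address() for lowmem pages, so the loop stays cheap
there:

static void heap_zero_highorder_page(struct page *page, unsigned int order)
{
	unsigned long nr_pages = 1UL << order;
	unsigned long i;

	/* Zero one PAGE_SIZE sub-page at a time; no vmap needed. */
	for (i = 0; i < nr_pages; i++) {
		void *vaddr = kmap_atomic(page + i);

		memset(vaddr, 0, PAGE_SIZE);
		kunmap_atomic(vaddr);
	}
}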

Patch

diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig
index a5eef06c4226..ecf65204f714 100644
--- a/drivers/dma-buf/heaps/Kconfig
+++ b/drivers/dma-buf/heaps/Kconfig
@@ -1,3 +1,6 @@ 
+config DMABUF_HEAPS_DEFERRED_FREE
+	bool
+
 config DMABUF_HEAPS_SYSTEM
 	bool "DMA-BUF System Heap"
 	depends on DMABUF_HEAPS
diff --git a/drivers/dma-buf/heaps/Makefile b/drivers/dma-buf/heaps/Makefile
index 974467791032..4e7839875615 100644
--- a/drivers/dma-buf/heaps/Makefile
+++ b/drivers/dma-buf/heaps/Makefile
@@ -1,3 +1,4 @@ 
 # SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_DMABUF_HEAPS_DEFERRED_FREE) += deferred-free-helper.o
 obj-$(CONFIG_DMABUF_HEAPS_SYSTEM)	+= system_heap.o
 obj-$(CONFIG_DMABUF_HEAPS_CMA)		+= cma_heap.o
diff --git a/drivers/dma-buf/heaps/deferred-free-helper.c b/drivers/dma-buf/heaps/deferred-free-helper.c
new file mode 100644
index 000000000000..cf04148167a2
--- /dev/null
+++ b/drivers/dma-buf/heaps/deferred-free-helper.c
@@ -0,0 +1,136 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Deferred dmabuf freeing helper
+ *
+ * Copyright (C) 2020 Linaro, Ltd.
+ *
+ * Based on the ION page pool code
+ * Copyright (C) 2011 Google, Inc.
+ */
+
+#include <linux/freezer.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/swap.h>
+#include <linux/sched/signal.h>
+
+#include "deferred-free-helper.h"
+
+static LIST_HEAD(free_list);
+static size_t list_size;
+wait_queue_head_t freelist_waitqueue;
+struct task_struct *freelist_task;
+static DEFINE_SPINLOCK(free_list_lock);
+
+void deferred_free(struct deferred_freelist_item *item,
+		   void (*free)(struct deferred_freelist_item*,
+				enum df_reason),
+		   size_t size)
+{
+	unsigned long flags;
+
+	INIT_LIST_HEAD(&item->list);
+	item->size = size;
+	item->free = free;
+
+	spin_lock_irqsave(&free_list_lock, flags);
+	list_add(&item->list, &free_list);
+	list_size += size;
+	spin_unlock_irqrestore(&free_list_lock, flags);
+	wake_up(&freelist_waitqueue);
+}
+
+static size_t free_one_item(enum df_reason reason)
+{
+	unsigned long flags;
+	size_t size = 0;
+	struct deferred_freelist_item *item;
+
+	spin_lock_irqsave(&free_list_lock, flags);
+	if (list_empty(&free_list)) {
+		spin_unlock_irqrestore(&free_list_lock, flags);
+		return 0;
+	}
+	item = list_first_entry(&free_list, struct deferred_freelist_item, list);
+	list_del(&item->list);
+	size = item->size;
+	list_size -= size;
+	spin_unlock_irqrestore(&free_list_lock, flags);
+
+	item->free(item, reason);
+	return size;
+}
+
+static unsigned long get_freelist_size(void)
+{
+	unsigned long size;
+	unsigned long flags;
+
+	spin_lock_irqsave(&free_list_lock, flags);
+	size = list_size;
+	spin_unlock_irqrestore(&free_list_lock, flags);
+	return size;
+}
+
+static unsigned long freelist_shrink_count(struct shrinker *shrinker,
+					   struct shrink_control *sc)
+{
+	return get_freelist_size();
+}
+
+static unsigned long freelist_shrink_scan(struct shrinker *shrinker,
+					  struct shrink_control *sc)
+{
+	int total_freed = 0;
+
+	if (sc->nr_to_scan == 0)
+		return 0;
+
+	while (total_freed < sc->nr_to_scan) {
+		int freed = free_one_item(DF_UNDER_PRESSURE);
+
+		if (!freed)
+			break;
+
+		total_freed += freed;
+	}
+
+	return total_freed;
+}
+
+static struct shrinker freelist_shrinker = {
+	.count_objects = freelist_shrink_count,
+	.scan_objects = freelist_shrink_scan,
+	.seeks = DEFAULT_SEEKS,
+	.batch = 0,
+};
+
+static int deferred_free_thread(void *data)
+{
+	while (true) {
+		wait_event_freezable(freelist_waitqueue,
+				     get_freelist_size() > 0);
+
+		free_one_item(DF_NORMAL);
+	}
+
+	return 0;
+}
+
+static int deferred_freelist_init(void)
+{
+	list_size = 0;
+
+	init_waitqueue_head(&freelist_waitqueue);
+	freelist_task = kthread_run(deferred_free_thread, NULL,
+				    "%s", "dmabuf-deferred-free-worker");
+	if (IS_ERR(freelist_task)) {
+		pr_err("%s: creating thread for deferred free failed\n",
+		       __func__);
+		return -1;
+	}
+	sched_set_normal(freelist_task, 19);
+
+	return register_shrinker(&freelist_shrinker);
+}
+device_initcall(deferred_freelist_init);
diff --git a/drivers/dma-buf/heaps/deferred-free-helper.h b/drivers/dma-buf/heaps/deferred-free-helper.h
new file mode 100644
index 000000000000..2c43dd5a3eda
--- /dev/null
+++ b/drivers/dma-buf/heaps/deferred-free-helper.h
@@ -0,0 +1,55 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef DEFERRED_FREE_HELPER_H
+#define DEFERRED_FREE_HELPER_H
+
+/**
+ * df_reason - enum for reason why item was freed
+ *
+ * This provides a reason for why the free function was called
+ * on the item. This is useful when deferred_free is used in
+ * combination with a page pool, so under pressure the page can
+ * be immediately freed.
+ *
+ * DF_NORMAL:         Normal deferred free
+ *
+ * DF_UNDER_PRESSURE: Free was called because the system
+ *                    is under memory pressure. Usually
+ *                    from a shrinker. Avoid allocating
+ *                    memory in the free call, as it may
+ *                    fail.
+ */
+enum df_reason {
+	DF_NORMAL,
+	DF_UNDER_PRESSURE,
+};
+
+/**
+ * deferred_freelist_item - item structure for deferred freelist
+ *
+ * This is to be added to the structure for whatever you want to
+ * defer freeing on.
+ *
+ * @size: size of the item to be freed
+ * @free: function pointer to be called when freeing the item
+ * @list: list entry for the deferred list
+ */
+struct deferred_freelist_item {
+	size_t size;
+	void (*free)(struct deferred_freelist_item *i,
+		     enum df_reason reason);
+	struct list_head list;
+};
+
+/**
+ * deferred_free - call to add item to the deferred free list
+ *
+ * @item: Pointer to deferred_freelist_item field of a structure
+ * @free: Function pointer to the free call
+ * @size: Size of the item to be freed
+ */
+void deferred_free(struct deferred_freelist_item *item,
+		   void (*free)(struct deferred_freelist_item *i,
+				enum df_reason reason),
+		   size_t size);
+#endif
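
For context, a minimal usage sketch of this helper from a heap's point
of view; the buffer type and field names below are hypothetical, not
part of this series:

/* Hypothetical heap buffer embedding the deferred-free item. */
struct my_heap_buffer {
	struct sg_table sg_table;
	size_t len;
	struct deferred_freelist_item deferred_free;
};

static void my_heap_buffer_free(struct deferred_freelist_item *item,
				enum df_reason reason)
{
	struct my_heap_buffer *buffer =
		container_of(item, struct my_heap_buffer, deferred_free);

	/*
	 * On DF_UNDER_PRESSURE, release pages back to the system
	 * immediately rather than recycling them into a pool, and
	 * avoid allocating memory here since it may fail.
	 */
	/* ... release buffer->sg_table pages here, then: */
	kfree(buffer);
}

static void my_heap_release(struct my_heap_buffer *buffer)
{
	/*
	 * Queue the buffer; the helper's kthread (or the shrinker,
	 * under memory pressure) invokes my_heap_buffer_free() later.
	 */
	deferred_free(&buffer->deferred_free, my_heap_buffer_free,
		      buffer->len);
}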