
[1/4] dma-mapping: add a new dma_need_sync API

Message ID 20200629130359.2690853-2-hch@lst.de
State New
Series [1/4] dma-mapping: add a new dma_need_sync API

Commit Message

hch June 29, 2020, 1:03 p.m. UTC
Add a new API to check if calls to dma_sync_single_for_{device,cpu} are
required for a given DMA streaming mapping.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 Documentation/core-api/dma-api.rst |  8 ++++++++
 include/linux/dma-direct.h         |  1 +
 include/linux/dma-mapping.h        |  5 +++++
 kernel/dma/direct.c                |  6 ++++++
 kernel/dma/mapping.c               | 10 ++++++++++
 5 files changed, 30 insertions(+)
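As a rough illustration of the intended fast-path usage, the sketch below stubs out the kernel types and a simplified dma_need_sync() (mirroring the non-coherent-device case of dma_direct_need_sync(), with the swiotlb bounce-buffer check omitted); the rx_complete() path and its names are hypothetical, not from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the kernel types; everything here is a stub for
 * illustration -- the real dma_need_sync() is added by this patch in
 * kernel/dma/mapping.c. */
typedef uint64_t dma_addr_t;
struct device { bool dma_coherent; };

/* Stub mirroring the dma-direct case: a sync is needed whenever the
 * device is not cache-coherent (the swiotlb bounce-buffer check from
 * dma_direct_need_sync() is omitted here). */
static bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
{
	(void)dma_addr;
	return !dev->dma_coherent;
}

/* Hypothetical rx completion path: the per-packet sync call can be
 * skipped entirely when the mapping never needs ownership transfers.
 * Returns whether a sync was performed. */
static bool rx_complete(struct device *dev, dma_addr_t dma)
{
	if (!dma_need_sync(dev, dma))
		return false;		/* sync elided */
	/* dma_sync_single_for_cpu(dev, dma, len, DMA_FROM_DEVICE); */
	return true;
}
```

The point of the API is exactly this pattern: hoist the "does this device ever need syncs" decision out of the per-buffer hot path.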

Comments

Jonathan Lemon July 6, 2020, 7:42 p.m. UTC | #1
On Mon, Jun 29, 2020 at 03:03:56PM +0200, Christoph Hellwig wrote:
> Add a new API to check if calls to dma_sync_single_for_{device,cpu} are
> required for a given DMA streaming mapping.
> 
> +::
> +
> +	bool
> +	dma_need_sync(struct device *dev, dma_addr_t dma_addr);
> +
> +Returns %true if dma_sync_single_for_{device,cpu} calls are required to
> +transfer memory ownership.  Returns %false if those calls can be skipped.

Hi Christoph -

The call above is for a specific dma_addr.  For correctness, would I
need to check every addr, or can I assume that for a specific memory
type (pages returned from malloc) the answer would be identical?
-- 
Jonathan
hch July 7, 2020, 6:47 a.m. UTC | #2
On Mon, Jul 06, 2020 at 12:42:27PM -0700, Jonathan Lemon wrote:
> On Mon, Jun 29, 2020 at 03:03:56PM +0200, Christoph Hellwig wrote:
> > Add a new API to check if calls to dma_sync_single_for_{device,cpu} are
> > required for a given DMA streaming mapping.
> > 
> > +::
> > +
> > +	bool
> > +	dma_need_sync(struct device *dev, dma_addr_t dma_addr);
> > +
> > +Returns %true if dma_sync_single_for_{device,cpu} calls are required to
> > +transfer memory ownership.  Returns %false if those calls can be skipped.
> 
> Hi Christoph -
> 
> The call above is for a specific dma_addr.  For correctness, would I
> need to check every addr, or can I assume that for a specific memory
> type (pages returned from malloc) the answer would be identical?

You need to check every mapping.  E.g. this API pairs with a
dma_map_single/page call.  For S/G mappings you'd need to call it for
each entry, although if you have a use case for that we really should
add a dma_sg_need_sync helper instead of open coding the scatterlist walk.
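The open-coded walk mentioned here could look like the sketch below, with stubbed stand-ins for struct device and struct scatterlist (real code would use sg_dma_address() and for_each_sg()); dma_sg_need_sync() itself is the hypothetical helper being proposed, not an existing kernel function:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for illustration only; real code would use the kernel's
 * struct scatterlist, sg_dma_address() and for_each_sg(). */
typedef uint64_t dma_addr_t;
struct device { bool dma_coherent; };
struct scatterlist { dma_addr_t dma_address; };

static bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
{
	(void)dma_addr;
	return !dev->dma_coherent;	/* simplified: no swiotlb check */
}

/* An S/G mapping needs syncing if any single entry does; a hypothetical
 * dma_sg_need_sync() helper would wrap exactly this per-entry loop. */
static bool dma_sg_need_sync(struct device *dev,
			     struct scatterlist *sgl, int nents)
{
	for (int i = 0; i < nents; i++)
		if (dma_need_sync(dev, sgl[i].dma_address))
			return true;
	return false;
}
```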
Jonathan Lemon July 7, 2020, 3:11 p.m. UTC | #3
On Tue, Jul 07, 2020 at 08:47:30AM +0200, Christoph Hellwig wrote:
> On Mon, Jul 06, 2020 at 12:42:27PM -0700, Jonathan Lemon wrote:
> > On Mon, Jun 29, 2020 at 03:03:56PM +0200, Christoph Hellwig wrote:
> > > Add a new API to check if calls to dma_sync_single_for_{device,cpu} are
> > > required for a given DMA streaming mapping.
> > > 
> > > +::
> > > +
> > > +	bool
> > > +	dma_need_sync(struct device *dev, dma_addr_t dma_addr);
> > > +
> > > +Returns %true if dma_sync_single_for_{device,cpu} calls are required to
> > > +transfer memory ownership.  Returns %false if those calls can be skipped.
> > 
> > Hi Christoph -
> > 
> > The call above is for a specific dma_addr.  For correctness, would I
> > need to check every addr, or can I assume that for a specific memory
> > type (pages returned from malloc) the answer would be identical?
> 
> You need to check every mapping.  E.g. this API pairs with a
> dma_map_single/page call.  For S/G mappings you'd need to call it for
> each entry, although if you have a use case for that we really should
> add a dma_sg_need_sync helper instead of open coding the scatterlist walk.


My use case is setting up a pinned memory area, and caching the dma
mappings.  I'd like to bypass storing the DMA addresses if they aren't
needed.  For example:

setup()
{
    if (dma_need_sync(dev, addr, len)) {
        kvmalloc_array(...)
        cache_dma_mappings(...)
    }
}

dev_get_dma(page)
{
    if (!cache)
        return page_to_phys(page)

    return dma_cache_lookup(...)
}


The reason for doing it this way is that the page in question may be
backed by either system memory, or device memory such as a GPU.  For the
latter, the GPU provides a table of DMA addresses where data may be
accessed, so I'm unable to use the dma_map_page() API.
-- 
Jonathan
hch July 7, 2020, 3:14 p.m. UTC | #4
On Tue, Jul 07, 2020 at 08:11:09AM -0700, Jonathan Lemon wrote:
> > You need to check every mapping.  E.g. this API pairs with a
> > dma_map_single/page call.  For S/G mappings you'd need to call it for
> > each entry, although if you have a use case for that we really should
> > add a dma_sg_need_sync helper instead of open coding the scatterlist walk.
> 
> My use case is setting up a pinned memory area, and caching the dma
> mappings.  I'd like to bypass storing the DMA addresses if they aren't
> needed.  For example:
> 
> setup()
> {
>     if (dma_need_sync(dev, addr, len)) {
>         kvmalloc_array(...)
>         cache_dma_mappings(...)
>     }
> }
> 
> dev_get_dma(page)
> {
>     if (!cache)
>         return page_to_phys(page)
> 
>     return dma_cache_lookup(...)
> }
> 
> The reason for doing it this way is that the page in question may be
> backed by either system memory, or device memory such as a GPU.  For the
> latter, the GPU provides a table of DMA addresses where data may be
> accessed, so I'm unable to use the dma_map_page() API.

dma_need_sync doesn't tell you if the unmap needs the dma_addr_t.
I've been thinking about replacing CONFIG_NEED_DMA_MAP_STATE with a
runtime check for a while, which would give you exactly what you need.
For now it isn't very useful as there are very few configs left that do
not have CONFIG_NEED_DMA_MAP_STATE set.
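For context, CONFIG_NEED_DMA_MAP_STATE works at compile time through the dma_unmap_addr helper macros in include/linux/dma-mapping.h; when the option is off, the saved dma_addr_t field and every access to it compile away. A simplified stand-alone rendition of that mechanism (not the kernel's actual header, and with a local NEED_DMA_MAP_STATE knob standing in for the Kconfig option):

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;

/* Flip to 0 to model the (now rare) compiled-out case. */
#define NEED_DMA_MAP_STATE 1

#if NEED_DMA_MAP_STATE
#define DEFINE_DMA_UNMAP_ADDR(name)		dma_addr_t name
#define dma_unmap_addr(ptr, name)		((ptr)->name)
#define dma_unmap_addr_set(ptr, name, val)	(((ptr)->name) = (val))
#else
#define DEFINE_DMA_UNMAP_ADDR(name)		/* no storage at all */
#define dma_unmap_addr(ptr, name)		(0)
#define dma_unmap_addr_set(ptr, name, val)	do { } while (0)
#endif

/* A driver's per-buffer bookkeeping: the dma member only exists (and
 * only costs memory) when unmap actually needs the address back. */
struct rx_buf {
	void *vaddr;
	DEFINE_DMA_UNMAP_ADDR(dma);
};
```

A runtime replacement would move this compile-time decision into a per-device or per-mapping check, which is what the caching scheme above would want.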

Patch

diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
index 2d8d2fed731720..f41620439ef349 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -204,6 +204,14 @@  Returns the maximum size of a mapping for the device. The size parameter
 of the mapping functions like dma_map_single(), dma_map_page() and
 others should not be larger than the returned value.
 
+::
+
+	bool
+	dma_need_sync(struct device *dev, dma_addr_t dma_addr);
+
+Returns %true if dma_sync_single_for_{device,cpu} calls are required to
+transfer memory ownership.  Returns %false if those calls can be skipped.
+
 ::
 
 	unsigned long
diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index cdfa400f89b3d3..5184735a0fe8eb 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -85,4 +85,5 @@  int dma_direct_mmap(struct device *dev, struct vm_area_struct *vma,
 		void *cpu_addr, dma_addr_t dma_addr, size_t size,
 		unsigned long attrs);
 int dma_direct_supported(struct device *dev, u64 mask);
+bool dma_direct_need_sync(struct device *dev, dma_addr_t dma_addr);
 #endif /* _LINUX_DMA_DIRECT_H */
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 78f677cf45ab69..a33ed3954ed465 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -461,6 +461,7 @@  int dma_set_mask(struct device *dev, u64 mask);
 int dma_set_coherent_mask(struct device *dev, u64 mask);
 u64 dma_get_required_mask(struct device *dev);
 size_t dma_max_mapping_size(struct device *dev);
+bool dma_need_sync(struct device *dev, dma_addr_t dma_addr);
 unsigned long dma_get_merge_boundary(struct device *dev);
 #else /* CONFIG_HAS_DMA */
 static inline dma_addr_t dma_map_page_attrs(struct device *dev,
@@ -571,6 +572,10 @@  static inline size_t dma_max_mapping_size(struct device *dev)
 {
 	return 0;
 }
+static inline bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
+{
+	return false;
+}
 static inline unsigned long dma_get_merge_boundary(struct device *dev)
 {
 	return 0;
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 93f578a8e613ba..95866b64758100 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -539,3 +539,9 @@  size_t dma_direct_max_mapping_size(struct device *dev)
 		return swiotlb_max_mapping_size(dev);
 	return SIZE_MAX;
 }
+
+bool dma_direct_need_sync(struct device *dev, dma_addr_t dma_addr)
+{
+	return !dev_is_dma_coherent(dev) ||
+		is_swiotlb_buffer(dma_to_phys(dev, dma_addr));
+}
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 98e3d873792ea4..a8c18c9a796fdc 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -397,6 +397,16 @@  size_t dma_max_mapping_size(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dma_max_mapping_size);
 
+bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+
+	if (dma_is_direct(ops))
+		return dma_direct_need_sync(dev, dma_addr);
+	return ops->sync_single_for_cpu || ops->sync_single_for_device;
+}
+EXPORT_SYMBOL_GPL(dma_need_sync);
+
 unsigned long dma_get_merge_boundary(struct device *dev)
 {
 	const struct dma_map_ops *ops = get_dma_ops(dev);