diff mbox

For the problem when using swiotlb

Message ID 20141121093509.GA19783@e104818-lin.cambridge.arm.com
State New
Headers show

Commit Message

Catalin Marinas Nov. 21, 2014, 9:35 a.m. UTC
On Thu, Nov 20, 2014 at 07:40:00AM +0000, Arnd Bergmann wrote:
> On Thursday 20 November 2014 10:57:53 Ding Tianhong wrote:
> > On 2014/11/19 16:45, Arnd Bergmann wrote:
> > > On Wednesday 19 November 2014 11:17:15 Ding Tianhong wrote:
> > >> On 2014/11/18 2:09, Catalin Marinas wrote:
> > >>> On Mon, Nov 17, 2014 at 12:18:42PM +0000, Arnd Bergmann wrote:
> > >> Thanks everyone, I think I found the way to fix it, need to enable DMA_CMA, to reserve a big memory
> > >> for CMA and set coherent mask for dev, then dma_alloc and dma_mapping will not use the swiotlb until
> > >> the memory out of mask or swiotlb_force is enabled.
> > >>
> > >> If I still understand uncorrectly, please inform me.
> > > 
> > > Please do not use CMA to work around the problem, but fix the underlying bug
> > > instead.
> > > 
> > > The driver should call 'dma_set_mask_and_coherent()' with the appropriate
> > > dma mask, and check whether that succeeded. However, the code implementing
> > > dma_set_mask_and_coherent on arm64 also needs to be changed to look up
> > > the dma-ranges property (see of_dma_configure()), and check if the mask
> > > is possible.
> > > 
> > The dma_pfn_offset looks only support arm32, but my platform is
> > aarch64 and I check the latest kernel version, I think the dma-rangs
> > still could not work for aarch64, so maybe we should add
> > dma_pfn_offset for aarch64 first.
> 
> I didn't mean the dma_pfn_offset. The problem is that the of_dma_configure
> code currently doesn't look at the mask. As I explained in my reply to
> Catalin, it should set the mask to the size of the dma-ranges if that is
> 32-bit or smaller, and dma_set_mask should look at the same dma-ranges
> property to decide what to set the mask to when a driver asks for a
> mask larger than 64-bit.

But this wouldn't help Ding's case, here the driver needs to set the
wider DMA mask.

Anyway, back to your point, to make sure I understand what you meant (I
can send a proper patch with log afterwards):

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Comments

Arnd Bergmann Nov. 21, 2014, 12:48 p.m. UTC | #1
On Friday 21 November 2014 09:35:10 Catalin Marinas wrote:
> On Thu, Nov 20, 2014 at 07:40:00AM +0000, Arnd Bergmann wrote:
> > On Thursday 20 November 2014 10:57:53 Ding Tianhong wrote:
> 
> But this wouldn't help Ding's case, here the driver needs to set the
> wider DMA mask.
> 
> Anyway, back to your point, to make sure I understand what you meant (I
> can send a proper patch with log afterwards):

Thanks for putting this into code!

> @@ -88,11 +89,24 @@ static inline int dma_set_mask(struct device *dev, u64 mask)
>  {
>  	if (!dev->dma_mask || !dma_supported(dev, mask))
>  		return -EIO;
> +	/* if asking for bigger dma mask, limit it to the bus dma ranges */
> +	if (mask > *dev->dma_mask)
> +		mask &= of_dma_get_range_mask(dev);
>  	*dev->dma_mask = mask;
>  
>  	return 0;
>  }

As you commented later, the dma_supported check indeed needs to happen
after the masking.

> +static inline int dma_set_coherent_mask(struct device *dev, u64 mask)
> +{
> +	if (!dma_supported(dev, mask))
> +		return -EIO;
> +	if (mask > dev->coherent_dma_mask)
> +		mask &= of_dma_get_range_mask(dev);
> +	dev->coherent_dma_mask = mask;
> +	return 0;
> +}

There is an interesting side problem here: the dma mask points to
coherent_dma_mask for all devices probed from DT, so this breaks
if we have any driver that sets them to different values. It is a
preexisting problem them.

>  EXPORT_SYMBOL_GPL(of_dma_get_range);
>  
> +u64 of_dma_get_range_mask(struct device *dev)
> +{
> +	u64 dma_addr, paddr, size;
> +
> +	/* no dma mask limiting if no of_node or no dma-ranges property */
> +	if (!dev->of_node ||
> +	    of_dma_get_range(dev->of_node, &dma_addr, &paddr, &size) < 0)
> +		return DMA_BIT_MASK(64);

If no dma-ranges are present, we should assume that the bus only supports
32-bit DMA, or we could make it architecture specific. It would probably
be best for arm64 to require a dma-ranges property for doing any DMA
at all, but we can't do that on arm32 any more now.

> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> index 3b64d0bf5bba..50d1ac4739e6 100644
> --- a/drivers/of/platform.c
> +++ b/drivers/of/platform.c
> @@ -200,6 +200,10 @@ static void of_dma_configure(struct device *dev)
>  	/* DMA ranges found. Calculate and set dma_pfn_offset */
>  	dev->dma_pfn_offset = PFN_DOWN(paddr - dma_addr);
>  	dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", dev->dma_pfn_offset);
> +
> +	/* limit the coherent_dma_mask to the dma-ranges size property */

I would change the comment to clarify that we are actually changing
the dma_mask here as well.

> +	if (size < (1ULL << 32))
> +		dev->coherent_dma_mask = DMA_BIT_MASK(ilog2(size));
>  }
>  


As you mentioned in another mail in this thread, we wouldn't be
able to suuport this case on arm64.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Catalin Marinas Nov. 21, 2014, 4:57 p.m. UTC | #2
On Fri, Nov 21, 2014 at 12:48:09PM +0000, Arnd Bergmann wrote:
> On Friday 21 November 2014 09:35:10 Catalin Marinas wrote:
> > +static inline int dma_set_coherent_mask(struct device *dev, u64 mask)
> > +{
> > +	if (!dma_supported(dev, mask))
> > +		return -EIO;
> > +	if (mask > dev->coherent_dma_mask)
> > +		mask &= of_dma_get_range_mask(dev);
> > +	dev->coherent_dma_mask = mask;
> > +	return 0;
> > +}
> 
> There is an interesting side problem here: the dma mask points to
> coherent_dma_mask for all devices probed from DT, so this breaks
> if we have any driver that sets them to different values. It is a
> preexisting problem them.

Such drivers would have to set both masks separately (or call
dma_set_mask_and_coherent). What we assume though is that dma-ranges
refers to both dma_mask and coherent_dma_mask. I don't think that would
be a problem for ARM systems.

> >  EXPORT_SYMBOL_GPL(of_dma_get_range);
> >  
> > +u64 of_dma_get_range_mask(struct device *dev)
> > +{
> > +	u64 dma_addr, paddr, size;
> > +
> > +	/* no dma mask limiting if no of_node or no dma-ranges property */
> > +	if (!dev->of_node ||
> > +	    of_dma_get_range(dev->of_node, &dma_addr, &paddr, &size) < 0)
> > +		return DMA_BIT_MASK(64);
> 
> If no dma-ranges are present, we should assume that the bus only supports
> 32-bit DMA, or we could make it architecture specific. It would probably
> be best for arm64 to require a dma-ranges property for doing any DMA
> at all, but we can't do that on arm32 any more now.

I thought about this but it could break some existing arm64 DT files if
we mandate dma-ranges property (we could try though). The mask limiting
is arch-specific anyway.

> > diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> > index 3b64d0bf5bba..50d1ac4739e6 100644
> > --- a/drivers/of/platform.c
> > +++ b/drivers/of/platform.c
> > @@ -200,6 +200,10 @@ static void of_dma_configure(struct device *dev)
> >  	/* DMA ranges found. Calculate and set dma_pfn_offset */
> >  	dev->dma_pfn_offset = PFN_DOWN(paddr - dma_addr);
> >  	dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", dev->dma_pfn_offset);
> > +
> > +	/* limit the coherent_dma_mask to the dma-ranges size property */
> 
> I would change the comment to clarify that we are actually changing
> the dma_mask here as well.
> 
> > +	if (size < (1ULL << 32))
> > +		dev->coherent_dma_mask = DMA_BIT_MASK(ilog2(size));
> >  }
> 
> As you mentioned in another mail in this thread, we wouldn't be
> able to suuport this case on arm64.

The mask would still be valid and even usable if an IOMMU is present. Do
you mean we should not limit it at all here?

There is a scenario where smaller mask would work on arm64. For example
Juno, you can have 2GB of RAM in the 32-bit phys range (starting at
0x80000000). A device with 31-bit mask and a dma_pfn_offset of
0x80000000 would still work (there isn't any but just as an example). So
the check in dma_alloc_coherent() would be something like:

	phys_to_dma(top of ZONE_DMA) - dma_pfn_offset <= coherent_dma_mask

(or assuming RAM starts at 0 and ignoring dma_pfn_offset for now)

If the condition above fails, dma_alloc_coherent() would no longer fall
back to swiotlb but issue a dev_warn() and return NULL.
Arnd Bergmann Nov. 21, 2014, 5:04 p.m. UTC | #3
On Friday 21 November 2014 16:57:09 Catalin Marinas wrote:
> On Fri, Nov 21, 2014 at 12:48:09PM +0000, Arnd Bergmann wrote:
> > On Friday 21 November 2014 09:35:10 Catalin Marinas wrote:
> > > +static inline int dma_set_coherent_mask(struct device *dev, u64 mask)
> > > +{
> > > +	if (!dma_supported(dev, mask))
> > > +		return -EIO;
> > > +	if (mask > dev->coherent_dma_mask)
> > > +		mask &= of_dma_get_range_mask(dev);
> > > +	dev->coherent_dma_mask = mask;
> > > +	return 0;
> > > +}
> > 
> > There is an interesting side problem here: the dma mask points to
> > coherent_dma_mask for all devices probed from DT, so this breaks
> > if we have any driver that sets them to different values. It is a
> > preexisting problem them.
> 
> Such drivers would have to set both masks separately (or call
> dma_set_mask_and_coherent). What we assume though is that dma-ranges
> refers to both dma_mask and coherent_dma_mask. I don't think that would
> be a problem for ARM systems.

Right, I'm just saying that we don't have a way to detect drivers that
break this assumption, not that we have a serious problem already.

> > >  EXPORT_SYMBOL_GPL(of_dma_get_range);
> > >  
> > > +u64 of_dma_get_range_mask(struct device *dev)
> > > +{
> > > +	u64 dma_addr, paddr, size;
> > > +
> > > +	/* no dma mask limiting if no of_node or no dma-ranges property */
> > > +	if (!dev->of_node ||
> > > +	    of_dma_get_range(dev->of_node, &dma_addr, &paddr, &size) < 0)
> > > +		return DMA_BIT_MASK(64);
> > 
> > If no dma-ranges are present, we should assume that the bus only supports
> > 32-bit DMA, or we could make it architecture specific. It would probably
> > be best for arm64 to require a dma-ranges property for doing any DMA
> > at all, but we can't do that on arm32 any more now.
> 
> I thought about this but it could break some existing arm64 DT files if
> we mandate dma-ranges property (we could try though). The mask limiting
> is arch-specific anyway.

Yes, this has taken far too long to get addressed, we should have added
the properties right from the start. If we have a function in architecture
specific code, maybe we can just check for the short list of already
supported platforms that need backwards compatibility and require the
mask for everything else?

> > > diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> > > index 3b64d0bf5bba..50d1ac4739e6 100644
> > > --- a/drivers/of/platform.c
> > > +++ b/drivers/of/platform.c
> > > @@ -200,6 +200,10 @@ static void of_dma_configure(struct device *dev)
> > >  	/* DMA ranges found. Calculate and set dma_pfn_offset */
> > >  	dev->dma_pfn_offset = PFN_DOWN(paddr - dma_addr);
> > >  	dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", dev->dma_pfn_offset);
> > > +
> > > +	/* limit the coherent_dma_mask to the dma-ranges size property */
> > 
> > I would change the comment to clarify that we are actually changing
> > the dma_mask here as well.
> > 
> > > +	if (size < (1ULL << 32))
> > > +		dev->coherent_dma_mask = DMA_BIT_MASK(ilog2(size));
> > >  }
> > 
> > As you mentioned in another mail in this thread, we wouldn't be
> > able to suuport this case on arm64.
> 
> The mask would still be valid and even usable if an IOMMU is present. Do
> you mean we should not limit it at all here?

The code is certainly correct on arm32, as long as we have an appropriate
DMA zone.

> There is a scenario where smaller mask would work on arm64. For example
> Juno, you can have 2GB of RAM in the 32-bit phys range (starting at
> 0x80000000). A device with 31-bit mask and a dma_pfn_offset of
> 0x80000000 would still work (there isn't any but just as an example). So
> the check in dma_alloc_coherent() would be something like:
> 
> 	phys_to_dma(top of ZONE_DMA) - dma_pfn_offset <= coherent_dma_mask
> 
> (or assuming RAM starts at 0 and ignoring dma_pfn_offset for now)
> 
> If the condition above fails, dma_alloc_coherent() would no longer fall
> back to swiotlb but issue a dev_warn() and return NULL.

Ah, that looks like it should work on all architectures, very nice.
How about checking this condition, and then printing a small warning
(dev_warn, not WARN_ON) and setting the dma_mask pointer to NULL?

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Catalin Marinas Nov. 21, 2014, 5:51 p.m. UTC | #4
On Fri, Nov 21, 2014 at 05:04:28PM +0000, Arnd Bergmann wrote:
> On Friday 21 November 2014 16:57:09 Catalin Marinas wrote:
> > There is a scenario where smaller mask would work on arm64. For example
> > Juno, you can have 2GB of RAM in the 32-bit phys range (starting at
> > 0x80000000). A device with 31-bit mask and a dma_pfn_offset of
> > 0x80000000 would still work (there isn't any but just as an example). So
> > the check in dma_alloc_coherent() would be something like:
> > 
> > 	phys_to_dma(top of ZONE_DMA) - dma_pfn_offset <= coherent_dma_mask
> > 
> > (or assuming RAM starts at 0 and ignoring dma_pfn_offset for now)
> > 
> > If the condition above fails, dma_alloc_coherent() would no longer fall
> > back to swiotlb but issue a dev_warn() and return NULL.
> 
> Ah, that looks like it should work on all architectures, very nice.
> How about checking this condition, and then printing a small warning
> (dev_warn, not WARN_ON) and setting the dma_mask pointer to NULL?

I would not add the above ZONE_DMA check to of_dma_configure(). For
example on arm64, we may not support a small coherent_dma_mask but the
same value for dma_mask could be fine via swiotlb bouncing (or IOMMU).
However, that's an arch-specific decision. Maybe after the above setting
of dev->coherent_dma_mask in of_dma_configure(), we could add:

	if (!dma_supported(dev, dev->coherent_dma_mask))
		dev->dma_mask = NULL;

The arch dma_supported() can check the swiotlb bouncing or ZONE_DMA
limits.

Strangely, we don't have a coherent_dma_supported() but we can defer
such check to dma_alloc_coherent() and that's where we would check the
top of ZONE_DMA.
diff mbox

Patch

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1e0e4671dd25..d6a4b4619174 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -120,6 +120,9 @@  config HAVE_GENERIC_RCU_GUP
 config ARCH_DMA_ADDR_T_64BIT
 	def_bool y
 
+config ARCH_HAS_DMA_SET_COHERENT_MASK
+	def_bool y
+
 config NEED_DMA_MAP_STATE
 	def_bool y
 
diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
index adeae3f6f0fc..92dcd251e549 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -18,6 +18,7 @@ 
 
 #ifdef __KERNEL__
 
+#include <linux/of_address.h>
 #include <linux/types.h>
 #include <linux/vmalloc.h>
 
@@ -88,11 +89,24 @@  static inline int dma_set_mask(struct device *dev, u64 mask)
 {
 	if (!dev->dma_mask || !dma_supported(dev, mask))
 		return -EIO;
+	/* if asking for bigger dma mask, limit it to the bus dma ranges */
+	if (mask > *dev->dma_mask)
+		mask &= of_dma_get_range_mask(dev);
 	*dev->dma_mask = mask;
 
 	return 0;
 }
 
+static inline int dma_set_coherent_mask(struct device *dev, u64 mask)
+{
+	if (!dma_supported(dev, mask))
+		return -EIO;
+	if (mask > dev->coherent_dma_mask)
+		mask &= of_dma_get_range_mask(dev);
+	dev->coherent_dma_mask = mask;
+	return 0;
+}
+
 static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
 {
 	if (!dev->dma_mask)
diff --git a/drivers/of/address.c b/drivers/of/address.c
index afdb78299f61..89c04abdf9bb 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -1,5 +1,6 @@ 
 
 #include <linux/device.h>
+#include <linux/dma-mapping.h>
 #include <linux/io.h>
 #include <linux/ioport.h>
 #include <linux/module.h>
@@ -979,6 +980,19 @@  out:
 }
 EXPORT_SYMBOL_GPL(of_dma_get_range);
 
+u64 of_dma_get_range_mask(struct device *dev)
+{
+	u64 dma_addr, paddr, size;
+
+	/* no dma mask limiting if no of_node or no dma-ranges property */
+	if (!dev->of_node ||
+	    of_dma_get_range(dev->of_node, &dma_addr, &paddr, &size) < 0)
+		return DMA_BIT_MASK(64);
+
+	return DMA_BIT_MASK(ilog2(size));
+}
+EXPORT_SYMBOL_GPL(of_dma_get_range_mask);
+
 /**
  * of_dma_is_coherent - Check if device is coherent
  * @np:	device node
diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index 3b64d0bf5bba..50d1ac4739e6 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -200,6 +200,10 @@  static void of_dma_configure(struct device *dev)
 	/* DMA ranges found. Calculate and set dma_pfn_offset */
 	dev->dma_pfn_offset = PFN_DOWN(paddr - dma_addr);
 	dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", dev->dma_pfn_offset);
+
+	/* limit the coherent_dma_mask to the dma-ranges size property */
+	if (size < (1ULL << 32))
+		dev->coherent_dma_mask = DMA_BIT_MASK(ilog2(size));
 }
 
 /**
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 8cb14eb393d6..fffb1a49a1a7 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -55,6 +55,8 @@  extern struct of_pci_range *of_pci_range_parser_one(
 					struct of_pci_range *range);
 extern int of_dma_get_range(struct device_node *np, u64 *dma_addr,
 				u64 *paddr, u64 *size);
+extern u64 of_dma_get_range_mask(struct device *dev);
+
 extern bool of_dma_is_coherent(struct device_node *np);
 #else /* CONFIG_OF_ADDRESS */
 static inline struct device_node *of_find_matching_node_by_address(