diff mbox series

net: mvpp2: avoid bouncing buffers

Message ID 20180820024730.9147-1-brian.brooks@linaro.org
State New
Headers show
Series net: mvpp2: avoid bouncing buffers | expand

Commit Message

Brian Brooks Aug. 20, 2018, 2:47 a.m. UTC
Some memory regions used by this device need to share the same
upper 8 bits of the 40-bit bus address. Currently, a coherent
DMA mask of 32 bits is used so that dma_alloc_coherent() regions
have the same upper 8 bits. Packet buffers are not allocated via
DMA APIs, and the device does not require these memory regions
to have the same upper 8 bits. However, packet buffers are being
bounced during streaming mappings because streaming and coherent
DMA are using the same DMA mask, i.e. dev->dma_mask points to
dev->coherent_dma_mask.

Avoid bouncing packet buffers by ensuring streaming DMA uses a
mask of 40 bits and coherent DMA uses a mask of 32 bits. iperf3
shows throughput increases from 4.04 Gbps to 9.14 Gbps.

Signed-off-by: Brian Brooks <brian.brooks@linaro.org>

---
 drivers/net/ethernet/marvell/mvpp2/mvpp2.h      | 3 +++
 drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 6 ++++++
 2 files changed, 9 insertions(+)

-- 
2.17.1

Comments

David Miller Aug. 20, 2018, 2:55 a.m. UTC | #1
From: Brian Brooks <brian.brooks@linaro.org>

Date: Sun, 19 Aug 2018 21:47:30 -0500

> @@ -5126,6 +5126,12 @@ static int mvpp2_probe(struct platform_device *pdev)

>  	}

>  

>  	if (priv->hw_version == MVPP22) {

> +		/* Platform code may have set dev->dma_mask to point

> +		 * to dev->coherent_dma_mask, but we want to ensure

> +		 * they take different values due to comment below.

> +		 */

> +		pdev->dev.dma_mask = &priv->dma_mask;


The platform code might be doing this exactly because it cannot support
different coherent and streaming DMA masks.

Well, in any case, the platform code is doing it for a reason and
overriding this in a "driver" of all places seems totally
inappropriate and at best a layering violation.

I would rather you fix this in a way that involves well defined APIs
that set the DMA masks or whatever to the values that you need, rather
than going behind the platform code's back and changing the DMA mask
pointer like this.

Thanks.
Yan Markman Aug. 20, 2018, 7:02 a.m. UTC | #2
Hi everybody
The MVPP2 code already has DMA's patch taken from OLD Backport and placed by Antoine Tenart.
Please refer the attached 
Best regards
Yan Markman


-----Original Message-----
From: Christoph Hellwig [mailto:hch@infradead.org] 

Sent: Monday, August 20, 2018 9:23 AM
To: David Miller <davem@davemloft.net>
Cc: brian.brooks@linaro.org; antoine.tenart@bootlin.com; maxime.chevallier@bootlin.com; Yan Markman <ymarkman@marvell.com>; Stefan Chulski <stefanc@marvell.com>; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; bjorn.topel@intel.com; brian.brooks@arm.com
Subject: Re: [PATCH] net: mvpp2: avoid bouncing buffers

On Sun, Aug 19, 2018 at 07:55:05PM -0700, David Miller wrote:
> From: Brian Brooks <brian.brooks@linaro.org>

> Date: Sun, 19 Aug 2018 21:47:30 -0500

> 

> > @@ -5126,6 +5126,12 @@ static int mvpp2_probe(struct platform_device *pdev)

> >  	}

> >  

> >  	if (priv->hw_version == MVPP22) {

> > +		/* Platform code may have set dev->dma_mask to point

> > +		 * to dev->coherent_dma_mask, but we want to ensure

> > +		 * they take different values due to comment below.

> > +		 */

> > +		pdev->dev.dma_mask = &priv->dma_mask;

> 

> The platform code might be doing this exactly because it cannot 

> support different coherent and streaming DMA masks.

> 

> Well, in any case, the platform code is doing it for a reason and 

> overriding this in a "driver" of all places seems totally 

> inappropriate and at best a layering violation.

> 

> I would rather you fix this in a way that involves well defined APIs 

> that set the DMA masks or whatever to the values that you need, rather 

> than going behind the platform code's back and changing the DMA mask 

> pointer like this.


Agreed.  What platform do you see this issue on?
Brian Brooks Aug. 27, 2018, 1:55 p.m. UTC | #3
On 08/19 23:23:26, Christoph Hellwig wrote:
> On Sun, Aug 19, 2018 at 07:55:05PM -0700, David Miller wrote:

> > From: Brian Brooks <brian.brooks@linaro.org>

> > Date: Sun, 19 Aug 2018 21:47:30 -0500

> > 

> > > @@ -5126,6 +5126,12 @@ static int mvpp2_probe(struct platform_device *pdev)

> > >  	}

> > >  

> > >  	if (priv->hw_version == MVPP22) {

> > > +		/* Platform code may have set dev->dma_mask to point

> > > +		 * to dev->coherent_dma_mask, but we want to ensure

> > > +		 * they take different values due to comment below.

> > > +		 */

> > > +		pdev->dev.dma_mask = &priv->dma_mask;

> > 

> > The platform code might be doing this exactly because it cannot support

> > different coherent and streaming DMA masks.

> > 

> > Well, in any case, the platform code is doing it for a reason and

> > overriding this in a "driver" of all places seems totally

> > inappropriate and at best a layering violation.

> > 

> > I would rather you fix this in a way that involves well defined APIs

> > that set the DMA masks or whatever to the values that you need, rather

> > than going behind the platform code's back and changing the DMA mask

> > pointer like this.

> 

> Agreed.  What platform do you see this issue on?


This is Armada 8040 SoC with PPv2 net device on chip. MACCHIATObin board.

Both DT and ACPI have the following snippet:

	/*
	 * Set default coherent_dma_mask to 32 bit.  Drivers are expected to
	 * setup the correct supported mask.
	 */
	if (!dev->coherent_dma_mask)
		dev->coherent_dma_mask = DMA_BIT_MASK(32);

	/*
	 * Set it to coherent_dma_mask by default if the architecture
	 * code has not set it.
	 */
	if (!dev->dma_mask)
		dev->dma_mask = &dev->coherent_dma_mask;

Some architectures code setup DMA masks, but it does not seem appropriate
to add mvpp2 specific code in arch/arm64. The mvpp2 driver appeared to be
the least invasive place to resolve the limitation of this net device.

Another DMA API could be added to allocate storage for the streaming
mask to ensure masks can take on separate values when the existing DMA
APIs are used by the driver to set those values. But, this would be the
only driver using such API. Would that be how to handle this?
Christoph Hellwig Aug. 27, 2018, 3:48 p.m. UTC | #4
WE should basically never have dev->dma_mask = &dev->coherent_dma_mask,
so until that is the case you are doctoring around the symptoms and
not the problem.

Does the patch below help your case?

----
From 6294e0e330851ee06e66ab85b348f1d92d375d7a Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>

Date: Mon, 27 Aug 2018 17:23:24 +0200
Subject: driver core: initialize a default DMA mask for platform device

We still treat devices without a DMA mask as defaulting to 32-bits for
both mask, but a few releases ago we've started warning about such
cases, as they require special cases to work around this sloppyness.
Add a dma_mask field to struct platform_object so that we can initialize
the dma_mask pointer in struct device and initialize both masks to
32-bits by default.  Architectures can still override this in
arch_setup_pdev_archdata if needed.

Note that the code looks a little odd with the various conditionals
because we have to support platform_device structures that are
statically allocated.

Signed-off-by: Christoph Hellwig <hch@lst.de>

---
 drivers/base/platform.c         | 15 +++++++++++++--
 include/linux/platform_device.h |  1 +
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index dff82a3c2caa..baf4b06cf2d9 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -225,6 +225,17 @@ struct platform_object {
 	char name[];
 };
 
+static void setup_pdev_archdata(struct platform_device *pdev)
+{
+	if (!pdev->dev.coherent_dma_mask)
+		pdev->dev.coherent_dma_mask = DMA_BIT_MASK(32);
+	if (!pdev->dma_mask)
+		pdev->dma_mask = DMA_BIT_MASK(32);
+	if (!pdev->dev.dma_mask)
+		pdev->dev.dma_mask = &pdev->dma_mask;
+	arch_setup_pdev_archdata(pdev);
+};
+
 /**
  * platform_device_put - destroy a platform device
  * @pdev: platform device to free
@@ -271,7 +282,7 @@ struct platform_device *platform_device_alloc(const char *name, int id)
 		pa->pdev.id = id;
 		device_initialize(&pa->pdev.dev);
 		pa->pdev.dev.release = platform_device_release;
-		arch_setup_pdev_archdata(&pa->pdev);
+		setup_pdev_archdata(&pa->pdev);
 	}
 
 	return pa ? &pa->pdev : NULL;
@@ -472,7 +483,7 @@ EXPORT_SYMBOL_GPL(platform_device_del);
 int platform_device_register(struct platform_device *pdev)
 {
 	device_initialize(&pdev->dev);
-	arch_setup_pdev_archdata(pdev);
+	setup_pdev_archdata(pdev);
 	return platform_device_add(pdev);
 }
 EXPORT_SYMBOL_GPL(platform_device_register);
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 1a9f38f27f65..d84ec1de6022 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -25,6 +25,7 @@ struct platform_device {
 	int		id;
 	bool		id_auto;
 	struct device	dev;
+	dma_addr_t	dma_mask;
 	u32		num_resources;
 	struct resource	*resource;
 
-- 
2.18.0
Brian Brooks Sept. 2, 2018, 2:10 a.m. UTC | #5
On 08/27 08:48:43, Christoph Hellwig wrote:
> WE should basically never have dev->dma_mask = &dev->coherent_dma_mask,

> so until that is the case you are doctoring around the symptoms and

> not the problem.

> 

> Does the patch below help your case?


Yes. Just tested it. Works great.

I see how this patch addresses the issue in the platform code instead of
the driver code. Thanks.

> ----

> From 6294e0e330851ee06e66ab85b348f1d92d375d7a Mon Sep 17 00:00:00 2001

> From: Christoph Hellwig <hch@lst.de>

> Date: Mon, 27 Aug 2018 17:23:24 +0200

> Subject: driver core: initialize a default DMA mask for platform device

> 

> We still treat devices without a DMA mask as defaulting to 32-bits for

> both mask, but a few releases ago we've started warning about such

> cases, as they require special cases to work around this sloppyness.

> Add a dma_mask field to struct platform_object so that we can initialize

> the dma_mask pointer in struct device and initialize both masks to

> 32-bits by default.  Architectures can still override this in

> arch_setup_pdev_archdata if needed.

> 

> Note that the code looks a little odd with the various conditionals

> because we have to support platform_device structures that are

> statically allocated.

> 

> Signed-off-by: Christoph Hellwig <hch@lst.de>

> ---

>  drivers/base/platform.c         | 15 +++++++++++++--

>  include/linux/platform_device.h |  1 +

>  2 files changed, 14 insertions(+), 2 deletions(-)

> 

> diff --git a/drivers/base/platform.c b/drivers/base/platform.c

> index dff82a3c2caa..baf4b06cf2d9 100644

> --- a/drivers/base/platform.c

> +++ b/drivers/base/platform.c

> @@ -225,6 +225,17 @@ struct platform_object {

>  	char name[];

>  };

>  

> +static void setup_pdev_archdata(struct platform_device *pdev)

> +{

> +	if (!pdev->dev.coherent_dma_mask)

> +		pdev->dev.coherent_dma_mask = DMA_BIT_MASK(32);

> +	if (!pdev->dma_mask)

> +		pdev->dma_mask = DMA_BIT_MASK(32);

> +	if (!pdev->dev.dma_mask)

> +		pdev->dev.dma_mask = &pdev->dma_mask;

> +	arch_setup_pdev_archdata(pdev);

> +};

> +

>  /**

>   * platform_device_put - destroy a platform device

>   * @pdev: platform device to free

> @@ -271,7 +282,7 @@ struct platform_device *platform_device_alloc(const char *name, int id)

>  		pa->pdev.id = id;

>  		device_initialize(&pa->pdev.dev);

>  		pa->pdev.dev.release = platform_device_release;

> -		arch_setup_pdev_archdata(&pa->pdev);

> +		setup_pdev_archdata(&pa->pdev);

>  	}

>  

>  	return pa ? &pa->pdev : NULL;

> @@ -472,7 +483,7 @@ EXPORT_SYMBOL_GPL(platform_device_del);

>  int platform_device_register(struct platform_device *pdev)

>  {

>  	device_initialize(&pdev->dev);

> -	arch_setup_pdev_archdata(pdev);

> +	setup_pdev_archdata(pdev);

>  	return platform_device_add(pdev);

>  }

>  EXPORT_SYMBOL_GPL(platform_device_register);

> diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h

> index 1a9f38f27f65..d84ec1de6022 100644

> --- a/include/linux/platform_device.h

> +++ b/include/linux/platform_device.h

> @@ -25,6 +25,7 @@ struct platform_device {

>  	int		id;

>  	bool		id_auto;

>  	struct device	dev;

> +	dma_addr_t	dma_mask;


Hmm.. should struct device use dma_addr_t instead of u64 for masks too?

>  	u32		num_resources;

>  	struct resource	*resource;

>  

> -- 

> 2.18.0

>
Antoine Tenart March 1, 2019, 2:26 p.m. UTC | #6
Hi Christoph,

I saw you sent this patch as part of another series back in August, but
it seems it was never applied and I can't find it even in -next. We made
some tests and this patch is helping a lot the PPv2 engine driver in
improving its performances.

Do you plan on re-sending it, or reworking it? Are there current issues
with it that prevent it from being merged upstream? Can we help in any
way?

Thanks!
Antoine

On Mon, Aug 27, 2018 at 08:48:43AM -0700, Christoph Hellwig wrote:
> WE should basically never have dev->dma_mask = &dev->coherent_dma_mask,

> so until that is the case you are doctoring around the symptoms and

> not the problem.

> 

> Does the patch below help your case?

> 

> ----

> From 6294e0e330851ee06e66ab85b348f1d92d375d7a Mon Sep 17 00:00:00 2001

> From: Christoph Hellwig <hch@lst.de>

> Date: Mon, 27 Aug 2018 17:23:24 +0200

> Subject: driver core: initialize a default DMA mask for platform device

> 

> We still treat devices without a DMA mask as defaulting to 32-bits for

> both mask, but a few releases ago we've started warning about such

> cases, as they require special cases to work around this sloppyness.

> Add a dma_mask field to struct platform_object so that we can initialize

> the dma_mask pointer in struct device and initialize both masks to

> 32-bits by default.  Architectures can still override this in

> arch_setup_pdev_archdata if needed.

> 

> Note that the code looks a little odd with the various conditionals

> because we have to support platform_device structures that are

> statically allocated.

> 

> Signed-off-by: Christoph Hellwig <hch@lst.de>

> ---

>  drivers/base/platform.c         | 15 +++++++++++++--

>  include/linux/platform_device.h |  1 +

>  2 files changed, 14 insertions(+), 2 deletions(-)

> 

> diff --git a/drivers/base/platform.c b/drivers/base/platform.c

> index dff82a3c2caa..baf4b06cf2d9 100644

> --- a/drivers/base/platform.c

> +++ b/drivers/base/platform.c

> @@ -225,6 +225,17 @@ struct platform_object {

>  	char name[];

>  };

>  

> +static void setup_pdev_archdata(struct platform_device *pdev)

> +{

> +	if (!pdev->dev.coherent_dma_mask)

> +		pdev->dev.coherent_dma_mask = DMA_BIT_MASK(32);

> +	if (!pdev->dma_mask)

> +		pdev->dma_mask = DMA_BIT_MASK(32);

> +	if (!pdev->dev.dma_mask)

> +		pdev->dev.dma_mask = &pdev->dma_mask;

> +	arch_setup_pdev_archdata(pdev);

> +};

> +

>  /**

>   * platform_device_put - destroy a platform device

>   * @pdev: platform device to free

> @@ -271,7 +282,7 @@ struct platform_device *platform_device_alloc(const char *name, int id)

>  		pa->pdev.id = id;

>  		device_initialize(&pa->pdev.dev);

>  		pa->pdev.dev.release = platform_device_release;

> -		arch_setup_pdev_archdata(&pa->pdev);

> +		setup_pdev_archdata(&pa->pdev);

>  	}

>  

>  	return pa ? &pa->pdev : NULL;

> @@ -472,7 +483,7 @@ EXPORT_SYMBOL_GPL(platform_device_del);

>  int platform_device_register(struct platform_device *pdev)

>  {

>  	device_initialize(&pdev->dev);

> -	arch_setup_pdev_archdata(pdev);

> +	setup_pdev_archdata(pdev);

>  	return platform_device_add(pdev);

>  }

>  EXPORT_SYMBOL_GPL(platform_device_register);

> diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h

> index 1a9f38f27f65..d84ec1de6022 100644

> --- a/include/linux/platform_device.h

> +++ b/include/linux/platform_device.h

> @@ -25,6 +25,7 @@ struct platform_device {

>  	int		id;

>  	bool		id_auto;

>  	struct device	dev;

> +	dma_addr_t	dma_mask;

>  	u32		num_resources;

>  	struct resource	*resource;

>  

> -- 

> 2.18.0

> 


-- 
Antoine Ténart, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
Christoph Hellwig March 11, 2019, 3:46 p.m. UTC | #7
On Fri, Mar 01, 2019 at 03:26:02PM +0100, Antoine Tenart wrote:
> Hi Christoph,

> 

> I saw you sent this patch as part of another series back in August, but

> it seems it was never applied and I can't find it even in -next. We made

> some tests and this patch is helping a lot the PPv2 engine driver in

> improving its performances.

> 

> Do you plan on re-sending it, or reworking it? Are there current issues

> with it that prevent it from being merged upstream? Can we help in any

> way?


Hi Antoine,

it's been a while, but my vague memory is that this caused failures
in some ARM setups, and I was tought why for AMBA setting up the
default mask is a bad idea.  I don't really remember more, though.
diff mbox series

Patch

diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
index def00dc3eb4e..3eb0c3ede8d2 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
@@ -676,6 +676,9 @@  struct mvpp2 {
 	struct clk *mg_core_clk;
 	struct clk *axi_clk;
 
+	/* DMA mask for streaming mappings */
+	u64 dma_mask;
+
 	/* List of pointers to port structures */
 	int port_count;
 	struct mvpp2_port *port_list[MVPP2_MAX_PORTS];
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index 0319ed9ef8b8..3a190c489589 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -5126,6 +5126,12 @@  static int mvpp2_probe(struct platform_device *pdev)
 	}
 
 	if (priv->hw_version == MVPP22) {
+		/* Platform code may have set dev->dma_mask to point
+		 * to dev->coherent_dma_mask, but we want to ensure
+		 * they take different values due to comment below.
+		 */
+		pdev->dev.dma_mask = &priv->dma_mask;
+
 		err = dma_set_mask(&pdev->dev, MVPP2_DESC_DMA_MASK);
 		if (err)
 			goto err_axi_clk;