mbox series

[RESEND,v8,net-next,00/15] net: mvpp2: Add TX Flow Control support

Message ID 1612685964-21890-1-git-send-email-stefanc@marvell.com
Headers show
Series net: mvpp2: Add TX Flow Control support | expand

Message

Stefan Chulski Feb. 7, 2021, 8:19 a.m. UTC
From: Stefan Chulski <stefanc@marvell.com>

Armada hardware has a pause generation mechanism in GOP (MAC).
The GOP generate flow control frames based on an indication programmed in Ports Control 0 Register. There is a bit per port.
However assertion of the PortX Pause bits in the ports control 0 register only sends a one time pause.
To complement the function the GOP has a mechanism to periodically send pause control messages based on periodic counters.
This mechanism ensures that the pause is effective as long as the Appropriate PortX Pause is asserted.

Problem is that Packet Processor that actually can drop packets due to lack of resources not connected to the GOP flow control generation mechanism.
To solve this issue Armada has firmware running on CM3 CPU dedicated for Flow Control support.
Firmware monitors Packet Processor resources and asserts XON/XOFF by writing to Ports Control 0 Register.

MSS shared SRAM memory used to communicate between CM3 firmware and PP2 driver.
During init PP2 driver informs firmware about used BM pools, RXQs, congestion and depletion thresholds.

The pause frames are generated whenever congestion or depletion in resources is detected.
The back pressure is stopped when the resource reaches a sufficient level.
So the congestion/depletion and sufficient level implement a hysteresis that reduces the XON/XOFF toggle frequency.

Packet Processor v23 hardware introduces support for RX FIFO fill level monitor.
Patch "add PPv23 version definition" to differ between v23 and v22 hardware.
Patch "add TX FC firmware check" verifies that CM3 firmware supports Flow Control monitoring.

v7 --> v8
- Reorder "always compare hw-version vs MVPP21" and "add PPv23 version definition" commits
- Typo fixes
- Remove condition fix from "add RXQ flow control configurations"

v6 --> v7
- Reduce patch set from 18 to 15 patches
 - Documentation change combined into a single patch
 - RXQ and BM size change combined into a single patch
 - Ring size change check moved into "add RXQ flow control configurations" commit

v5 --> v6
- No change

v4 --> v5
- Add missed Signed-off
- Fix warnings in patches 3 and 12
- Add revision requirement to warning message
- Move mss_spinlock into RXQ flow control configurations patch
- Improve FCA RXQ non occupied descriptor threshold commit message

v3 --> v4
- Remove RFC tag

v2 --> v3
- Remove inline functions
- Add PPv2.3 description into marvell-pp2.txt
- Improve mvpp2_interrupts_mask/unmask procedure
- Improve FC enable/disable procedure
- Add priv->sram_pool check
- Remove gen_pool_destroy call
- Reduce Flow Control timer to x100 faster

v1 --> v2
- Add memory requirements information
- Add EPROBE_DEFER if of_gen_pool_get return NULL
- Move Flow control configuration to mvpp2_mac_link_up callback
- Add firmware version info with Flow control support

Konstantin Porotchkin (1):
  dts: marvell: add CM3 SRAM memory to cp11x ethernet device tree

Stefan Chulski (14):
  doc: marvell: add cm3-mem and PPv2.3 description
  net: mvpp2: add CM3 SRAM memory map
  net: mvpp2: always compare hw-version vs MVPP21
  net: mvpp2: add PPv23 version definition
  net: mvpp2: increase BM pool and RXQ size
  net: mvpp2: add FCA periodic timer configurations
  net: mvpp2: add FCA RXQ non occupied descriptor threshold
  net: mvpp2: enable global flow control
  net: mvpp2: add RXQ flow control configurations
  net: mvpp2: add ethtool flow control configuration support
  net: mvpp2: add BM protection underrun feature support
  net: mvpp2: add PPv23 RX FIFO flow control
  net: mvpp2: set 802.3x GoP Flow Control mode
  net: mvpp2: add TX FC firmware check

 Documentation/devicetree/bindings/net/marvell-pp2.txt |   4 +-
 arch/arm64/boot/dts/marvell/armada-cp11x.dtsi         |  10 +
 drivers/net/ethernet/marvell/mvpp2/mvpp2.h            | 128 ++++-
 drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c       | 563 ++++++++++++++++++--
 4 files changed, 655 insertions(+), 50 deletions(-)

Comments

Baruch Siach Feb. 7, 2021, 10:45 a.m. UTC | #1
Hi Stefan,

On Sun, Feb 07 2021, stefanc@marvell.com wrote:
> From: Stefan Chulski <stefanc@marvell.com>
>
> This patch adds CM3 memory map and CM3 read/write callbacks.
> No functionality changes.
>
> Signed-off-by: Stefan Chulski <stefanc@marvell.com>
> ---
>  drivers/net/ethernet/marvell/mvpp2/mvpp2.h      |  7 +++
>  drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 63 +++++++++++++++++++-
>  2 files changed, 67 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
> index 6bd7e40..aec9179 100644
> --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
> +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
> @@ -748,6 +748,9 @@
>  #define MVPP2_TX_FIFO_THRESHOLD(kb)	\
>  		((kb) * 1024 - MVPP2_TX_FIFO_THRESHOLD_MIN)
>  
> +/* MSS Flow control */
> +#define MSS_SRAM_SIZE	0x800
> +
>  /* RX buffer constants */
>  #define MVPP2_SKB_SHINFO_SIZE \
>  	SKB_DATA_ALIGN(sizeof(struct skb_shared_info))
> @@ -925,6 +928,7 @@ struct mvpp2 {
>  	/* Shared registers' base addresses */
>  	void __iomem *lms_base;
>  	void __iomem *iface_base;
> +	void __iomem *cm3_base;
>  
>  	/* On PPv2.2, each "software thread" can access the base
>  	 * register through a separate address space, each 64 KB apart
> @@ -996,6 +1000,9 @@ struct mvpp2 {
>  
>  	/* page_pool allocator */
>  	struct page_pool *page_pool[MVPP2_PORT_MAX_RXQ];
> +
> +	/* CM3 SRAM pool */
> +	struct gen_pool *sram_pool;
>  };
>  
>  struct mvpp2_pcpu_stats {
> diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> index a07cf60..307f9fd 100644
> --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> @@ -25,6 +25,7 @@
>  #include <linux/of_net.h>
>  #include <linux/of_address.h>
>  #include <linux/of_device.h>
> +#include <linux/genalloc.h>
>  #include <linux/phy.h>
>  #include <linux/phylink.h>
>  #include <linux/phy/phy.h>
> @@ -6846,6 +6847,44 @@ static int mvpp2_init(struct platform_device *pdev, struct mvpp2 *priv)
>  	return 0;
>  }
>  
> +static int mvpp2_get_sram(struct platform_device *pdev,
> +			  struct mvpp2 *priv)
> +{
> +	struct device_node *dn = pdev->dev.of_node;
> +	static bool defer_once;
> +	struct resource *res;
> +
> +	if (has_acpi_companion(&pdev->dev)) {
> +		res = platform_get_resource(pdev, IORESOURCE_MEM, 2);
> +		if (!res) {
> +			dev_warn(&pdev->dev, "ACPI is too old, Flow control not supported\n");
> +			return 0;
> +		}
> +		priv->cm3_base = devm_ioremap_resource(&pdev->dev, res);
> +		if (IS_ERR(priv->cm3_base))
> +			return PTR_ERR(priv->cm3_base);
> +	} else {
> +		priv->sram_pool = of_gen_pool_get(dn, "cm3-mem", 0);
> +		if (!priv->sram_pool) {
> +			if (!defer_once) {
> +				defer_once = true;
> +				/* Try defer once */
> +				return -EPROBE_DEFER;
> +			}
> +			dev_warn(&pdev->dev, "DT is too old, Flow control not supported\n");

This warning will show on every DT system with no cm3-mem property, right?

> +			return -ENOMEM;
> +		}
> +		/* cm3_base allocated with offset zero into the SRAM since mapping size
> +		 * is equal to requested size.
> +		 */
> +		priv->cm3_base = (void __iomem *)gen_pool_alloc(priv->sram_pool,
> +								MSS_SRAM_SIZE);
> +		if (!priv->cm3_base)
> +			return -ENOMEM;
> +	}
> +	return 0;
> +}
> +
>  static int mvpp2_probe(struct platform_device *pdev)
>  {
>  	const struct acpi_device_id *acpi_id;
> @@ -6902,6 +6941,13 @@ static int mvpp2_probe(struct platform_device *pdev)
>  		priv->iface_base = devm_ioremap_resource(&pdev->dev, res);
>  		if (IS_ERR(priv->iface_base))
>  			return PTR_ERR(priv->iface_base);
> +
> +		/* Map CM3 SRAM */
> +		err = mvpp2_get_sram(pdev, priv);
> +		if (err == -EPROBE_DEFER)
> +			return err;
> +		else if (err)
> +			dev_warn(&pdev->dev, "Fail to alloc CM3 SRAM\n");

This one will show as well.

I would not expect that from a patch that makes "no functional change".

baruch

>  	}
>  
>  	if (priv->hw_version == MVPP22 && dev_of_node(&pdev->dev)) {
> @@ -6947,11 +6993,13 @@ static int mvpp2_probe(struct platform_device *pdev)
>  
>  	if (dev_of_node(&pdev->dev)) {
>  		priv->pp_clk = devm_clk_get(&pdev->dev, "pp_clk");
> -		if (IS_ERR(priv->pp_clk))
> -			return PTR_ERR(priv->pp_clk);
> +		if (IS_ERR(priv->pp_clk)) {
> +			err = PTR_ERR(priv->pp_clk);
> +			goto err_cm3;
> +		}
>  		err = clk_prepare_enable(priv->pp_clk);
>  		if (err < 0)
> -			return err;
> +			goto err_cm3;
>  
>  		priv->gop_clk = devm_clk_get(&pdev->dev, "gop_clk");
>  		if (IS_ERR(priv->gop_clk)) {
> @@ -7087,6 +7135,11 @@ static int mvpp2_probe(struct platform_device *pdev)
>  	clk_disable_unprepare(priv->gop_clk);
>  err_pp_clk:
>  	clk_disable_unprepare(priv->pp_clk);
> +err_cm3:
> +	if (priv->sram_pool && priv->cm3_base)
> +		gen_pool_free(priv->sram_pool, (unsigned long)priv->cm3_base,
> +			      MSS_SRAM_SIZE);
> +
>  	return err;
>  }
>  
> @@ -7127,6 +7180,10 @@ static int mvpp2_remove(struct platform_device *pdev)
>  				  aggr_txq->descs_dma);
>  	}
>  
> +	if (priv->sram_pool && priv->cm3_base)
> +		gen_pool_free(priv->sram_pool, (unsigned long)priv->cm3_base,
> +			      MSS_SRAM_SIZE);
> +
>  	if (is_acpi_node(port_fwnode))
>  		return 0;
Stefan Chulski Feb. 7, 2021, 4:51 p.m. UTC | #2
> > +		priv->sram_pool = of_gen_pool_get(dn, "cm3-mem", 0);
> > +		if (!priv->sram_pool) {
> > +			if (!defer_once) {
> > +				defer_once = true;
> > +				/* Try defer once */
> > +				return -EPROBE_DEFER;
> > +			}
> > +			dev_warn(&pdev->dev, "DT is too old, Flow control
> not supported\n");
> > +			return -ENOMEM;
> > +		}
> > +		/* cm3_base allocated with offset zero into the SRAM since
> mapping size
> > +		 * is equal to requested size.
> > +		 */
> > +		priv->cm3_base = (void __iomem *)gen_pool_alloc(priv-
> >sram_pool,
> > +
> 	MSS_SRAM_SIZE);
> > +		if (!priv->cm3_base)
> > +			return -ENOMEM;
> > +	}
> 
> For v2 i asked:
> 
> > I'm wondering if using a pool even makes sense. The ACPI case just
> > ioremap() the memory region. Either this memory is dedicated, and then
> > there is no need to use a pool, or the memory is shared, and at some
> > point the ACPI code is going to run into problems when some other
> > driver also wants access.
> 
> There was never an answer to this.

Sorry probably missed this. Currently this memory not shared and I can just ioremap same way as in ACPI case.
In this case I can remove EPROBE_DEFER.

Thanks,
Stefan.