[v2,net-next,0/7] net: ethernet: ti: cpsw: Add XDP support

Message ID

20190530182039.4945-1-ivan.khoronzhuk@linaro.org

Headers

Received-SPF: pass (google.com: best guess record for domain of
	linux-kernel-owner@vger.kernel.org designates 209.132.180.67
	as permitted sender) client-ip=209.132.180.67; 
From: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
To: grygorii.strashko@ti.com, hawk@kernel.org, davem@davemloft.net
Cc: ast@kernel.org, linux-kernel@vger.kernel.org,
	linux-omap@vger.kernel.org, xdp-newbies@vger.kernel.org,
	ilias.apalodimas@linaro.org, netdev@vger.kernel.org,
	daniel@iogearbox.net, jakub.kicinski@netronome.com,
	john.fastabend@gmail.com, Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Subject: [PATCH v2 net-next 0/7] net: ethernet: ti: cpsw: Add XDP support
Date: Thu, 30 May 2019 21:20:32 +0300
Message-Id: <20190530182039.4945-1-ivan.khoronzhuk@linaro.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk

Series

net: ethernet: ti: cpsw: Add XDP support

Message

Ivan Khoronzhuk May 30, 2019, 6:20 p.m. UTC

This patchset adds XDP support to the TI cpsw driver, based on the
page_pool allocator. It was verified with af_xdp socket drop,
af_xdp l2f, ebpf XDP_DROP, XDP_REDIRECT, XDP_PASS and XDP_TX.

It was verified with the following configs enabled:
CONFIG_JIT=y
CONFIG_BPFILTER=y
CONFIG_BPF_SYSCALL=y
CONFIG_XDP_SOCKETS=y
CONFIG_BPF_EVENTS=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_JIT=y
CONFIG_CGROUP_BPF=y

Link to previous v1:
https://lkml.org/lkml/2019/5/23/795

Regular tests with iperf2 were also run to check the impact on
regular netstack performance, compared with the base commit:
https://pastebin.com/JSMT0iZ4

v1..v2:
- combined xdp_xmit functions
- used page allocation w/o refcnt juggle
- unmapped page for skb netstack
- moved rxq/page pool allocation to open/close pair
- added several preliminary patches:
  net: page_pool: add helper function to retrieve dma addresses
  net: page_pool: add helper function to unmap dma addresses
  net: ethernet: ti: cpsw: use cpsw as drv data
  net: ethernet: ti: cpsw_ethtool: simplify slave loops


Based on net-next/master

Ilias Apalodimas (2):
  net: page_pool: add helper function to retrieve dma addresses
  net: page_pool: add helper function to unmap dma addresses

Ivan Khoronzhuk (5):
  net: ethernet: ti: cpsw: use cpsw as drv data
  net: ethernet: ti: cpsw_ethtool: simplify slave loops
  net: ethernet: ti: davinci_cpdma: add dma mapped submit
  net: ethernet: ti: davinci_cpdma: return handler status
  net: ethernet: ti: cpsw: add XDP support

 drivers/net/ethernet/ti/Kconfig         |   1 +
 drivers/net/ethernet/ti/cpsw.c          | 537 ++++++++++++++++++++----
 drivers/net/ethernet/ti/cpsw_ethtool.c  | 136 +++++-
 drivers/net/ethernet/ti/cpsw_priv.h     |  12 +-
 drivers/net/ethernet/ti/davinci_cpdma.c | 122 ++++--
 drivers/net/ethernet/ti/davinci_cpdma.h |   6 +-
 drivers/net/ethernet/ti/davinci_emac.c  |  18 +-
 include/net/page_pool.h                 |   6 +
 net/core/page_pool.c                    |   7 +
 9 files changed, 710 insertions(+), 135 deletions(-)

-- 
2.17.1

Comments

Jesper Dangaard Brouer May 31, 2019, 3:46 p.m. UTC | #1

Hi Ivan,

From the code snippets below, it looks like you only allocated 1 page_pool
and are sharing it between several RX-queues. As I don't have the full
context and don't know this driver, I might be wrong?

To be clear, a page_pool object is needed per RX-queue, as it is
accessing a small RX page cache (which is protected by NAPI/softirq).
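
As a rough userspace illustration of the per-RX-queue rule (this is not
kernel code and not how page_pool is implemented): each queue owns its
own pool with a small cache that is accessed without locks, so only that
queue's poll context may touch it. All names here (toy_pool, toy_rxq,
TOY_CACHE_SIZE, ...) are made up for the sketch.

```c
#include <stdlib.h>

#define TOY_CACHE_SIZE 4	/* hypothetical size of the per-pool fast cache */
#define TOY_NUM_RXQ    2	/* hypothetical number of RX queues */

/* Toy stand-in for a page pool: a tiny cache that is touched without locks. */
struct toy_pool {
	void *cache[TOY_CACHE_SIZE];
	int count;
};

/* One pool per RX queue, so only that queue's poll context uses its cache. */
struct toy_rxq {
	struct toy_pool pool;
};

static void toy_pool_init(struct toy_pool *p)
{
	p->count = 0;
}

/* Take a buffer from the cache if possible, otherwise fall back to malloc(). */
static void *toy_pool_get(struct toy_pool *p, size_t size)
{
	if (p->count > 0)
		return p->cache[--p->count];
	return malloc(size);
}

/* Recycle a buffer into the cache; free it if the cache is already full. */
static void toy_pool_put(struct toy_pool *p, void *buf)
{
	if (p->count < TOY_CACHE_SIZE)
		p->cache[p->count++] = buf;
	else
		free(buf);
}

/* Release whatever is still cached. */
static void toy_pool_drain(struct toy_pool *p)
{
	while (p->count > 0)
		free(p->cache[--p->count]);
}
```

Recycling a buffer through one queue's pool leaves the other queue's
cache untouched, which is the property that makes the lockless cache
safe when each queue is polled from its own context.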

On Thu, 30 May 2019 21:20:39 +0300
Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:

> @@ -1404,6 +1711,14 @@ static int cpsw_ndo_open(struct net_device *ndev)
>  			enable_irq(cpsw->irqs_table[0]);
>  		}
>  
> +		pool_size = cpdma_get_num_rx_descs(cpsw->dma);
> +		cpsw->page_pool = cpsw_create_page_pool(cpsw, pool_size);
> +		if (IS_ERR(cpsw->page_pool)) {
> +			ret = PTR_ERR(cpsw->page_pool);
> +			cpsw->page_pool = NULL;
> +			goto err_cleanup;
> +		}


On Thu, 30 May 2019 21:20:39 +0300
Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:

> @@ -675,10 +742,33 @@ int cpsw_set_ringparam(struct net_device *ndev,
>  	if (cpsw->usage_count)
>  		cpdma_chan_split_pool(cpsw->dma);
>  
> +	for (i = 0; i < cpsw->data.slaves; i++) {
> +		struct net_device *ndev = cpsw->slaves[i].ndev;
> +
> +		if (!(ndev && netif_running(ndev)))
> +			continue;
> +
> +		cpsw_xdp_unreg_rxqs(netdev_priv(ndev));
> +	}
> +
> +	page_pool_destroy(cpsw->page_pool);
> +	cpsw->page_pool = pool;
> +


On Thu, 30 May 2019 21:20:39 +0300
Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:

> +void cpsw_xdp_unreg_rxqs(struct cpsw_priv *priv)
> +{
> +	struct cpsw_common *cpsw = priv->cpsw;
> +	int i;
> +
> +	for (i = 0; i < cpsw->rx_ch_num; i++)
> +		xdp_rxq_info_unreg(&priv->xdp_rxq[i]);
> +}



On Thu, 30 May 2019 21:20:39 +0300
Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:

> +int cpsw_xdp_reg_rxq(struct cpsw_priv *priv, int ch)
> +{
> +	struct xdp_rxq_info *xdp_rxq = &priv->xdp_rxq[ch];
> +	struct cpsw_common *cpsw = priv->cpsw;
> +	int ret;
> +
> +	ret = xdp_rxq_info_reg(xdp_rxq, priv->ndev, ch);
> +	if (ret)
> +		goto err_cleanup;
> +
> +	ret = xdp_rxq_info_reg_mem_model(xdp_rxq, MEM_TYPE_PAGE_POOL,
> +					 cpsw->page_pool);
> +	if (ret)
> +		goto err_cleanup;
> +
> +	return 0;




-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Ivan Khoronzhuk May 31, 2019, 4:25 p.m. UTC | #2

On Fri, May 31, 2019 at 05:46:43PM +0200, Jesper Dangaard Brouer wrote:

Hi Jesper,

>
>Hi Ivan,
>
>From below code snippets, it looks like you only allocated 1 page_pool
>and sharing it with several RX-queues, as I don't have the full context
>and don't know this driver, I might be wrong?
>
>To be clear, a page_pool object is needed per RX-queue, as it is
>accessing a small RX page cache (which protected by NAPI/softirq).


There is one RX interrupt and one RX NAPI for all rx channels.

>
>On Thu, 30 May 2019 21:20:39 +0300
>Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
>
>> @@ -1404,6 +1711,14 @@ static int cpsw_ndo_open(struct net_device *ndev)
>>  			enable_irq(cpsw->irqs_table[0]);
>>  		}
>>
>> +		pool_size = cpdma_get_num_rx_descs(cpsw->dma);
>> +		cpsw->page_pool = cpsw_create_page_pool(cpsw, pool_size);
>> +		if (IS_ERR(cpsw->page_pool)) {
>> +			ret = PTR_ERR(cpsw->page_pool);
>> +			cpsw->page_pool = NULL;
>> +			goto err_cleanup;
>> +		}
>
>On Thu, 30 May 2019 21:20:39 +0300
>Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
>
>> @@ -675,10 +742,33 @@ int cpsw_set_ringparam(struct net_device *ndev,
>>  	if (cpsw->usage_count)
>>  		cpdma_chan_split_pool(cpsw->dma);
>>
>> +	for (i = 0; i < cpsw->data.slaves; i++) {
>> +		struct net_device *ndev = cpsw->slaves[i].ndev;
>> +
>> +		if (!(ndev && netif_running(ndev)))
>> +			continue;
>> +
>> +		cpsw_xdp_unreg_rxqs(netdev_priv(ndev));
>> +	}
>> +
>> +	page_pool_destroy(cpsw->page_pool);
>> +	cpsw->page_pool = pool;
>> +
>
>On Thu, 30 May 2019 21:20:39 +0300
>Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
>
>> +void cpsw_xdp_unreg_rxqs(struct cpsw_priv *priv)
>> +{
>> +	struct cpsw_common *cpsw = priv->cpsw;
>> +	int i;
>> +
>> +	for (i = 0; i < cpsw->rx_ch_num; i++)
>> +		xdp_rxq_info_unreg(&priv->xdp_rxq[i]);
>> +}
>
>
>On Thu, 30 May 2019 21:20:39 +0300
>Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
>
>> +int cpsw_xdp_reg_rxq(struct cpsw_priv *priv, int ch)
>> +{
>> +	struct xdp_rxq_info *xdp_rxq = &priv->xdp_rxq[ch];
>> +	struct cpsw_common *cpsw = priv->cpsw;
>> +	int ret;
>> +
>> +	ret = xdp_rxq_info_reg(xdp_rxq, priv->ndev, ch);
>> +	if (ret)
>> +		goto err_cleanup;
>> +
>> +	ret = xdp_rxq_info_reg_mem_model(xdp_rxq, MEM_TYPE_PAGE_POOL,
>> +					 cpsw->page_pool);
>> +	if (ret)
>> +		goto err_cleanup;
>> +
>> +	return 0;
>
>
>
>-- 
>Best regards,
>  Jesper Dangaard Brouer
>  MSc.CS, Principal Kernel Engineer at Red Hat
>  LinkedIn: http://www.linkedin.com/in/brouer


-- 
Regards,
Ivan Khoronzhuk

Jesper Dangaard Brouer May 31, 2019, 10:37 p.m. UTC | #3

On Fri, 31 May 2019 20:03:33 +0300
Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:

> Probably it's not good example for others how it should be used, not
> a big problem to move it to separate pools.., even don't remember why
> I decided to use shared pool, there was some more reasons... need
> search in history.

Using a shared pool makes it a lot harder to solve the issue I'm
currently working on.  That is handling/waiting for in-flight frames to
complete, before removing the mem ID from the (r)hashtable lookup.  I
have working code that basically removes page_pool_destroy() from the
public API, and instead lets xdp_rxq_info_unreg() call it when the
in-flight count reaches zero (and delays fully removing the mem ID).
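
To make the idea concrete, here is a minimal single-threaded userspace
sketch of deferred teardown driven by an in-flight count. It is only a
model under stated assumptions, not the working code mentioned above:
every name (inflight_pool, inflight_pool_unreg, ...) is hypothetical,
and real code would also need atomics/RCU and the mem ID handling.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy pool that tracks how many buffers are still held by consumers. */
struct inflight_pool {
	int inflight;		/* buffers handed out and not yet returned */
	bool unreg_pending;	/* owner has asked for teardown */
	bool *destroyed;	/* hypothetical flag so callers can observe the free */
};

static struct inflight_pool *inflight_pool_create(bool *destroyed)
{
	struct inflight_pool *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;
	p->destroyed = destroyed;
	*destroyed = false;
	return p;
}

/* Final release of the pool object itself. */
static void inflight_pool_free(struct inflight_pool *p)
{
	*p->destroyed = true;
	free(p);
}

/* Hand out a buffer and account for it as in-flight. */
static void *inflight_pool_get(struct inflight_pool *p, size_t size)
{
	void *buf = malloc(size);

	if (buf)
		p->inflight++;
	return buf;
}

/* Return a buffer; the last return after unreg performs the deferred free. */
static void inflight_pool_put(struct inflight_pool *p, void *buf)
{
	free(buf);
	if (--p->inflight == 0 && p->unreg_pending)
		inflight_pool_free(p);
}

/* Teardown request: free immediately if idle, otherwise defer. */
static void inflight_pool_unreg(struct inflight_pool *p)
{
	p->unreg_pending = true;
	if (p->inflight == 0)
		inflight_pool_free(p);
}
```

With one pool per RX queue the unreg path has exactly one owner to wait
for; with a pool shared between queues, every queue's unreg would have
to agree on who performs the final free.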

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Ivan Khoronzhuk May 31, 2019, 11 p.m. UTC | #4

On Fri, May 31, 2019 at 10:08:03PM +0000, Saeed Mahameed wrote:
>On Fri, 2019-05-31 at 20:03 +0300, Ivan Khoronzhuk wrote:
>> On Fri, May 31, 2019 at 06:32:41PM +0200, Jesper Dangaard Brouer
>> wrote:
>> > On Fri, 31 May 2019 19:25:24 +0300 Ivan Khoronzhuk <
>> > ivan.khoronzhuk@linaro.org> wrote:
>> >
>> > > On Fri, May 31, 2019 at 05:46:43PM +0200, Jesper Dangaard Brouer
>> > > wrote:
>> > > > From below code snippets, it looks like you only allocated 1
>> > > > page_pool
>> > > > and sharing it with several RX-queues, as I don't have the full
>> > > > context
>> > > > and don't know this driver, I might be wrong?
>> > > >
>> > > > To be clear, a page_pool object is needed per RX-queue, as it
>> > > > is
>> > > > accessing a small RX page cache (which protected by
>> > > > NAPI/softirq).
>> > >
>> > > There is one RX interrupt and one RX NAPI for all rx channels.
>> >
>> > So, what are you saying?
>> >
>> > You _are_ sharing the page_pool between several RX-channels, but it
>> > is
>> > safe because this hardware only have one RX interrupt + NAPI
>> > instance??
>>
>> I can miss smth but in case of cpsw technically it means:
>> 1) RX interrupts are disabled while NAPI is scheduled,
>>    not for particular CPU or channel, but at all, for whole cpsw
>> module.
>> 2) RX channels are handled one by one by priority.
>
>Hi Ivan, I got a silly question..
>
>What is the reason behind having multiple RX rings and one CPU/NAPI
>handling all of them ? priority ? how do you priorities ?

Several.
First, from what I know, the h/w can be handled by several cpus/napi, but
because of an errata on some SoCs (or on all of them) that was dropped; the
idea was that it can. Second, it uses the same davinci_cpdma API as the tx
channels, which can be rate limited, and that API is used not only by cpsw
but also by another driver, so it can't be modified easily and there is no
reason to. Third, the h/w has the ability to steer some filtered traffic to
rx queues, which could potentially be configured with ethtool ntuples or
so, but it's not implemented... yet.

>
>> 3) After all of them handled and no more in budget - interrupts are
>> enabled.
>> 4) If page is returned to the pool, and it's within NAPI, no races as
>> it's
>>    returned protected by softirq. If it's returned not in softirq
>> it's protected
>>    by producer lock of the ring.
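
The quoted points 1) to 3) can be modelled roughly as below. This is a
userspace toy, not the cpsw poll routine: one RX interrupt for the whole
device, one poll that walks every channel within a single budget, and
the interrupt re-enabled only when the budget was not exhausted. The
names (toy_dev, toy_rx_poll, TOY_NUM_CH) are hypothetical, and treating
the highest channel index as the highest priority is an arbitrary choice
for the sketch.

```c
#include <stdbool.h>

#define TOY_NUM_CH 4	/* hypothetical number of RX channels */

struct toy_dev {
	int pending[TOY_NUM_CH];	/* packets waiting per channel */
	bool rx_irq_enabled;		/* the single RX interrupt of the device */
};

/* Interrupt handler: mask RX interrupts for the whole device. */
static void toy_rx_irq(struct toy_dev *dev)
{
	dev->rx_irq_enabled = false;
}

/*
 * Single poll for all channels. Channels are drained one by one, highest
 * index first (assumed priority order), sharing one budget. Returns the
 * number of packets processed.
 */
static int toy_rx_poll(struct toy_dev *dev, int budget)
{
	int done = 0;

	for (int ch = TOY_NUM_CH - 1; ch >= 0 && done < budget; ch--) {
		while (dev->pending[ch] > 0 && done < budget) {
			dev->pending[ch]--;
			done++;
		}
	}

	/* Budget left over means all channels are empty: unmask the interrupt. */
	if (done < budget)
		dev->rx_irq_enabled = true;

	return done;
}
```

Because only one poll instance ever runs for the device, all channels
are serviced from the same context, which is the basis of the argument
that a shared pool does not race here.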

>>
>> Probably it's not good example for others how it should be used, not
>> a big
>> problem to move it to separate pools.., even don't remember why I
>> decided to
>> use shared pool, there was some more reasons... need search in
>> history.
>>
>> > --
>> > Best regards,
>> >  Jesper Dangaard Brouer
>> >  MSc.CS, Principal Kernel Engineer at Red Hat
>> >  LinkedIn: http://www.linkedin.com/in/brouer


-- 
Regards,
Ivan Khoronzhuk