[net-next,0/3] net: ethernet: ti: cpsw: Add XDP support

Message ID 20190523182035.9283-1-ivan.khoronzhuk@linaro.org

Message

Ivan Khoronzhuk May 23, 2019, 6:20 p.m. UTC
This patchset adds XDP support for the TI cpsw driver and bases it on the
page_pool allocator. It was verified with af_xdp socket drop, af_xdp l2f,
and eBPF XDP_DROP, XDP_REDIRECT, XDP_PASS, XDP_TX.
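
(For reference, the XDP_DROP case boils down to attaching a trivial program like the
sketch below; this is illustrative only and is not part of the series.)

/* Minimal XDP program of the kind used for the XDP_DROP test.
 * Illustrative sketch only; compile with clang -O2 -target bpf.
 */
#include <linux/bpf.h>

#ifndef SEC
#define SEC(name) __attribute__((section(name), used))
#endif

SEC("xdp")
int xdp_drop_prog(struct xdp_md *ctx)
{
	return XDP_DROP;	/* drop every frame at the driver RX hook */
}

char _license[] SEC("license") = "GPL";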

It was verified with the following configs enabled:
CONFIG_JIT=y
CONFIG_BPFILTER=y
CONFIG_BPF_SYSCALL=y
CONFIG_XDP_SOCKETS=y
CONFIG_BPF_EVENTS=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_JIT=y
CONFIG_CGROUP_BPF=y

Link to the previous RFC:
https://lkml.org/lkml/2019/4/17/861

Regular iperf2 tests were also run to verify the impact on normal netstack
performance, compared with the base commit:
https://pastebin.com/JSMT0iZ4

Based on net-next/master

Ivan Khoronzhuk (3):
  net: ethernet: ti: davinci_cpdma: add dma mapped submit
  net: ethernet: ti: davinci_cpdma: return handler status
  net: ethernet: ti: cpsw: add XDP support

 drivers/net/ethernet/ti/Kconfig         |   1 +
 drivers/net/ethernet/ti/cpsw.c          | 570 +++++++++++++++++++++---
 drivers/net/ethernet/ti/cpsw_ethtool.c  |  55 ++-
 drivers/net/ethernet/ti/cpsw_priv.h     |   9 +-
 drivers/net/ethernet/ti/davinci_cpdma.c | 122 +++--
 drivers/net/ethernet/ti/davinci_cpdma.h |   6 +-
 drivers/net/ethernet/ti/davinci_emac.c  |  18 +-
 7 files changed, 675 insertions(+), 106 deletions(-)

-- 
2.17.1

Comments

Ivan Khoronzhuk May 27, 2019, 6:10 p.m. UTC | #1
On Fri, May 24, 2019 at 01:54:18PM +0200, Jesper Dangaard Brouer wrote:
>On Thu, 23 May 2019 21:20:35 +0300
>Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
>
>> Add XDP support based on rx page_pool allocator, one frame per page.
>> Page pool allocator is used with assumption that only one rx_handler
>> is running simultaneously. DMA map/unmap is reused from page pool
>> despite there is no need to map whole page.
>
>When using page_pool for DMA-mapping, your XDP-memory model must use
>1-page per packet, which you state you do.  This is because
>__page_pool_put_page() fallback mode does a __page_pool_clean_page()
>unmapping the DMA.  Ilias and I are looking at options for removing this
>restriction as Mlx5 would need it (when we extend the SKB to return
>pages to page_pool).
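
For context, the one-page-per-packet page_pool setup being discussed boils down to
something like the sketch below (the page_pool calls are the real API of this period;
the cpsw-specific names and sizes are placeholders, not the actual patch code):

	/* Sketch: RX page_pool with PP_FLAG_DMA_MAP and order-0 pages,
	 * i.e. one frame per page, so the pool owns the DMA mapping of
	 * every page it hands out.
	 */
	struct page_pool_params pp_params = {
		.flags		= PP_FLAG_DMA_MAP,
		.order		= 0,			/* one page per frame */
		.pool_size	= descs_num,		/* placeholder: RX ring size */
		.nid		= NUMA_NO_NODE,
		.dev		= cpsw->dev,
		.dma_dir	= DMA_BIDIRECTIONAL,	/* XDP_TX writes back via DMA */
	};
	struct page_pool *pool = page_pool_create(&pp_params);
	struct page *page;

	if (IS_ERR(pool))
		return PTR_ERR(pool);

	/* pages come out of the pool already DMA-mapped ... */
	page = page_pool_dev_alloc_pages(pool);
	/* ... and go back to it still mapped */
	page_pool_recycle_direct(pool, page);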

Thanks for what you do, it can simplify a lot...

>
>Unfortunately, I've found another blocker for drivers using the DMA
>mapping feature of page_pool.  We don't properly handle the case, where
>a remote TX-driver have xdp_frame's in-flight, and simultaneously the
>sending driver is unloaded and take down the page_pool.  Nothing crash,
>but we end-up calling put_page() on a page that is still DMA-mapped.


Seems so, ... for a generic solution. But it looks like in the case of cpsw there
is no issue, due to the "like direct" DMA map done by adding an offset, so whether
it's the page_pool DMA map or a DMA map/unmap per rx/xmit there shouldn't be a big
difference. Not sure about all SoCs though...

Despite that, for cpsw I keep the page_pool across down/up, which I'm going to
change in v2.

>
>I'm working on different solutions for fixing this, see here:
> https://github.com/xdp-project/xdp-project/blob/master/areas/mem/page_pool03_shutdown_inflight.org

Hope there will be no changes in the page_pool API.

>-- Best regards,
>  Jesper Dangaard Brouer
>  MSc.CS, Principal Kernel Engineer at Red Hat
>  LinkedIn: http://www.linkedin.com/in/brouer


-- 
Regards,
Ivan Khoronzhuk
Jesper Dangaard Brouer May 29, 2019, 8:16 a.m. UTC | #2
On Thu, 23 May 2019 21:20:35 +0300
Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:

> +static struct page *cpsw_alloc_page(struct cpsw_common *cpsw)
> +{
> +	struct page_pool *pool = cpsw->rx_page_pool;
> +	struct page *page, *prev_page = NULL;
> +	int try = pool->p.pool_size << 2;
> +	int start_free = 0, ret;
> +
> +	do {
> +		page = page_pool_dev_alloc_pages(pool);
> +		if (!page)
> +			return NULL;
> +
> +		/* if netstack has page_pool recycling remove the rest */
> +		if (page_ref_count(page) == 1)
> +			break;
> +
> +		/* start free pages in use, shouldn't happen */
> +		if (prev_page == page || start_free) {
> +			/* dma unmap/puts page if rfcnt != 1 */
> +			page_pool_recycle_direct(pool, page);
> +			start_free = 1;
> +			continue;
> +		}
> +
> +		/* if refcnt > 1, page has been holding by netstack, it's pity,
> +		 * so put it to the ring to be consumed later when fast cash is
> +		 * empty. If ring is full then free page by recycling as above.
> +		 */
> +		ret = ptr_ring_produce(&pool->ring, page);

This looks very wrong to me!  First of all, you are manipulating the internal
pool->ring directly instead of going through the API, which makes this code
unmaintainable.  Second, this is wrong because page_pool assumes the invariant
that pages on the ring have refcnt == 1.

> +		if (ret) {
> +			page_pool_recycle_direct(pool, page);
> +			continue;
> +		}
> +
> +		if (!prev_page)
> +			prev_page = page;
> +	} while (try--);
> +
> +	return page;
> +}
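
A rough sketch of what an API-only variant of that refcnt > 1 branch could look like
(only page_pool_recycle_direct() / page_pool_put_page() are the real API calls; the
rest follows the quoted code and is not a tested replacement):

	/* Sketch: never produce onto pool->ring by hand.  If the netstack
	 * still holds a reference (refcnt > 1), hand the page back through
	 * the API; its fallback path DMA-unmaps the page and drops the
	 * pool's reference instead of stashing a shared page on the ring.
	 */
	if (page_ref_count(page) != 1) {
		page_pool_recycle_direct(pool, page);
		continue;
	}

	/* refcnt == 1: the page is exclusively ours, use it for RX */
	break;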



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer