From patchwork Wed Mar 31 20:08:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vladimir Oltean X-Patchwork-Id: 413365 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0A180C43461 for ; Wed, 31 Mar 2021 20:09:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AB3FB610A7 for ; Wed, 31 Mar 2021 20:09:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236445AbhCaUJV (ORCPT ); Wed, 31 Mar 2021 16:09:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34628 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236421AbhCaUJR (ORCPT ); Wed, 31 Mar 2021 16:09:17 -0400 Received: from mail-ej1-x636.google.com (mail-ej1-x636.google.com [IPv6:2a00:1450:4864:20::636]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E64F3C061574; Wed, 31 Mar 2021 13:09:16 -0700 (PDT) Received: by mail-ej1-x636.google.com with SMTP id r12so31929771ejr.5; Wed, 31 Mar 2021 13:09:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=6BtFP60GJ3rBYTRbobD3G+mjNuY8Ark7ywqMvtfDSOU=; b=P9ZBZDeIxAyz+PzOcLZuJWz7MmmgwwYlr0YUepH5OLhjinQaqCRq7vnJPEf4CcnDVH 0TZxodwv3gfzmgIwMdfuqzY7NbcGXF3dj0OHpRBDMb+x5TsMsO2Rs+3qJxSq83+Lx5op ewAM5ATIpb4C+5unpvJbgPT3mAjkBJFHv0hV1w0jReaRWtVAASh5g5SNqbM4XCZHMhhC /cscMW935xUVM5t9rXwOtyEyNAEyeeSvl0+cGOu9DEbHeoflphtzGdAhd61X8bVjrdeg PezZLxz8UaNZnz/cPdl75Jhx8m+0MyPkG1ZdCPhEt2jEBdEGp3C9UmbEdhRCk7gMdu8G 1mcg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=6BtFP60GJ3rBYTRbobD3G+mjNuY8Ark7ywqMvtfDSOU=; b=MVtSOdLRcxTPUQKFonxhKavdnWX1WKxJIupiBtKHP7/ObgCivS2fwCyaEZGw8ZUoxB ryFO4/992PAUi5+tmfLMzgGYC7ADT1Fj0Z3BrX6UFu3Lg+okrKlUQqbhaR/yp6JXbmSa GhjCQKYK5rmRwfrV+p/qGsniU/vgOAU8pYcijsAO1p90K7oULIf6oBsxEemOHMZv7X/7 psG9zfTlP8iRDMHEo+82FEPwyP9qMVRleSqMDKaLxuW+eb5eMb5F0pc06NIGa/dq5RxY rRfu8BVppNQ7TEL4UHzN2sjTcr+neyR059sZf3uIq3H05bG89L3tr+dOE0VFApG3mE7H acOw== X-Gm-Message-State: AOAM533vRWwZjlek/9IStUrHWL3NdZA6okZPJtrG98rXYG3L9GiJU9AP p6Iq6RLcUwnaA9n53cydEiE= X-Google-Smtp-Source: ABdhPJyYoF8LPo571KICu6JTEsa4k0rFRfMyf5OhQiEU0ipa7QqT4EqiNE6h6pQdLWczlt35C1H5pA== X-Received: by 2002:a17:907:7745:: with SMTP id kx5mr5431657ejc.3.1617221355702; Wed, 31 Mar 2021 13:09:15 -0700 (PDT) Received: from localhost.localdomain (5-12-16-165.residential.rdsnet.ro. [5.12.16.165]) by smtp.gmail.com with ESMTPSA id r19sm1691305ejr.55.2021.03.31.13.09.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 31 Mar 2021 13:09:15 -0700 (PDT) From: Vladimir Oltean To: Jakub Kicinski , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , Andrii Nakryiko , Martin KaFai Lau , Song Liu , Yonghong Song , KP Singh , "David S. Miller" , netdev@vger.kernel.org, bpf@vger.kernel.org Cc: Alexander Duyck , Ioana Ciornei , Alex Marginean , Claudiu Manoil , Ilias Apalodimas , Vladimir Oltean Subject: [PATCH net-next 2/9] net: enetc: move skb creation into enetc_build_skb Date: Wed, 31 Mar 2021 23:08:50 +0300 Message-Id: <20210331200857.3274425-3-olteanv@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331200857.3274425-1-olteanv@gmail.com> References: <20210331200857.3274425-1-olteanv@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Vladimir Oltean We need to build an skb from two code paths now: from the plain RX data path and from the XDP data path when the verdict is XDP_PASS. Create a new enetc_build_skb function which contains the essential steps for building an skb based on the first and last positions of buffer descriptors within the RX ring. We also squash the enetc_process_skb function into enetc_build_skb, because what that function did wasn't very meaningful on its own. The "rx_frm_cnt++" instruction has been moved around napi_gro_receive for cosmetic reasons, to be in the same spot as rx_byte_cnt++, which itself must be before napi_gro_receive, because that's when we lose ownership of the skb. Signed-off-by: Vladimir Oltean --- drivers/net/ethernet/freescale/enetc/enetc.c | 81 +++++++++++--------- 1 file changed, 44 insertions(+), 37 deletions(-) diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c index 362cfba7ce14..b2071b8dc316 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -513,13 +513,6 @@ static void enetc_get_offloads(struct enetc_bdr *rx_ring, #endif } -static void enetc_process_skb(struct enetc_bdr *rx_ring, - struct sk_buff *skb) -{ - skb_record_rx_queue(skb, rx_ring->index); - skb->protocol = eth_type_trans(skb, rx_ring->ndev); -} - static bool enetc_page_reusable(struct page *page) { return (!page_is_pfmemalloc(page) && page_ref_count(page) == 1); @@ -627,6 +620,47 @@ static bool enetc_check_bd_errors_and_consume(struct enetc_bdr *rx_ring, return true; } +static struct sk_buff *enetc_build_skb(struct enetc_bdr *rx_ring, + u32 bd_status, union enetc_rx_bd **rxbd, + int *i, int *cleaned_cnt) +{ + struct sk_buff *skb; + u16 size; + + size = le16_to_cpu((*rxbd)->r.buf_len); + skb = enetc_map_rx_buff_to_skb(rx_ring, *i, size); + if (!skb) + return NULL; + + enetc_get_offloads(rx_ring, *rxbd, skb); + + (*cleaned_cnt)++; + + enetc_rxbd_next(rx_ring, rxbd, i); + + /* not last BD in frame? */ + while (!(bd_status & ENETC_RXBD_LSTATUS_F)) { + bd_status = le32_to_cpu((*rxbd)->r.lstatus); + size = ENETC_RXB_DMA_SIZE; + + if (bd_status & ENETC_RXBD_LSTATUS_F) { + dma_rmb(); + size = le16_to_cpu((*rxbd)->r.buf_len); + } + + enetc_add_rx_buff_to_skb(rx_ring, *i, size, skb); + + (*cleaned_cnt)++; + + enetc_rxbd_next(rx_ring, rxbd, i); + } + + skb_record_rx_queue(skb, rx_ring->index); + skb->protocol = eth_type_trans(skb, rx_ring->ndev); + + return skb; +} + #define ENETC_RXBD_BUNDLE 16 /* # of BDs to update at once */ static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring, @@ -643,7 +677,6 @@ static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring, union enetc_rx_bd *rxbd; struct sk_buff *skb; u32 bd_status; - u16 size; if (cleaned_cnt >= ENETC_RXBD_BUNDLE) cleaned_cnt -= enetc_refill_rx_ring(rx_ring, @@ -661,41 +694,15 @@ static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring, &rxbd, &i)) break; - size = le16_to_cpu(rxbd->r.buf_len); - skb = enetc_map_rx_buff_to_skb(rx_ring, i, size); + skb = enetc_build_skb(rx_ring, bd_status, &rxbd, &i, + &cleaned_cnt); if (!skb) break; - enetc_get_offloads(rx_ring, rxbd, skb); - - cleaned_cnt++; - - enetc_rxbd_next(rx_ring, &rxbd, &i); - - /* not last BD in frame? */ - while (!(bd_status & ENETC_RXBD_LSTATUS_F)) { - bd_status = le32_to_cpu(rxbd->r.lstatus); - size = ENETC_RXB_DMA_SIZE; - - if (bd_status & ENETC_RXBD_LSTATUS_F) { - dma_rmb(); - size = le16_to_cpu(rxbd->r.buf_len); - } - - enetc_add_rx_buff_to_skb(rx_ring, i, size, skb); - - cleaned_cnt++; - - enetc_rxbd_next(rx_ring, &rxbd, &i); - } - rx_byte_cnt += skb->len; - - enetc_process_skb(rx_ring, skb); + rx_frm_cnt++; napi_gro_receive(napi, skb); - - rx_frm_cnt++; } rx_ring->next_to_clean = i; From patchwork Wed Mar 31 20:08:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vladimir Oltean X-Patchwork-Id: 413363 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 40533C43617 for ; Wed, 31 Mar 2021 20:10:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 03AC1610A7 for ; Wed, 31 Mar 2021 20:10:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236489AbhCaUJz (ORCPT ); Wed, 31 Mar 2021 16:09:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34646 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236397AbhCaUJU (ORCPT ); Wed, 31 Mar 2021 16:09:20 -0400 Received: from mail-ej1-x62f.google.com (mail-ej1-x62f.google.com [IPv6:2a00:1450:4864:20::62f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BD5CDC061574; Wed, 31 Mar 2021 13:09:19 -0700 (PDT) Received: by mail-ej1-x62f.google.com with SMTP id w3so31958985ejc.4; Wed, 31 Mar 2021 13:09:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=YDYM5IcD4Re3JLEPXhD3YUxrHnmifCPM/Hta5XlLd2Y=; b=oobSI1gL4nqIQcV9WzqNoGmjuBni/DHXsFTMZ3WI5jnKiMIetHU4X/FV3JtY9x3NDc 0Mv4oywvpF6suX6qYNafbMdjzNvonWK/vYA4kv/fW6PRLDiFS8I/oRv65+XSndZRvgVe //pTbFxbT/bwxtfqVtdW3WN2NGWsi7Ss3AK68cGz/fbWR3Q1+rUF597FPhyHEBYOzdL2 2uWVlZsqewejEDmFbOamFAEkCksbHUKUtyLTFdSlyPS+/fZXD1H+frqUBQSn1Yc7HZv8 z7Y3nCQn9+0T0YN3QJOyzcwi15tIDEaGDtAhUoJQ6wvAQ6Ws7f+CHb997GFMqW6EKK/0 BQ0A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=YDYM5IcD4Re3JLEPXhD3YUxrHnmifCPM/Hta5XlLd2Y=; b=LqLkdqLbfpis65mas7ACdssav1YxoFFVQIJ3lu7egFKCw9heKdWHmUvXOOsGH/u8I7 KbNf7zvYVkYkLExl1mKaIYsb3o7hE0Dw+lxSX+oP8MTUTwkoJBI95tSyX7ppH78BSwCC m3CHBmA5tvKwQqfg/04DorS6p+SB0svXbXciV/Bq6I6+0twmVLFj5B8WeTBTGQ5oGoH9 66ohCpketAztUwx5FntpWbJ5zt/Ll94NRarifzaHlpWFKjdGbCqZ1iHKJjO/NadgQI2X pHWCHHaQEUoU+rgHANLIi3BzB4yyrPouP2mENS7W3h8KHSIZ5YYAMkj+3kBgw6pLeAFa ssVg== X-Gm-Message-State: AOAM533YNXhseNKhmhWKRLPvN6kXineaUJOsMIOZQ3cYje4WrHEfxOxU QAbfR/6l+ayBh1RgcluEu3E= X-Google-Smtp-Source: ABdhPJzF49ILummIdYctzGId5AjDp3kRO4kgPpqUTfWMmxuLxsD2nCPY9PENn2T4iMZnlasS7C06LQ== X-Received: by 2002:a17:906:4dce:: with SMTP id f14mr5392198ejw.349.1617221358459; Wed, 31 Mar 2021 13:09:18 -0700 (PDT) Received: from localhost.localdomain (5-12-16-165.residential.rdsnet.ro. [5.12.16.165]) by smtp.gmail.com with ESMTPSA id r19sm1691305ejr.55.2021.03.31.13.09.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 31 Mar 2021 13:09:18 -0700 (PDT) From: Vladimir Oltean To: Jakub Kicinski , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , Andrii Nakryiko , Martin KaFai Lau , Song Liu , Yonghong Song , KP Singh , "David S. Miller" , netdev@vger.kernel.org, bpf@vger.kernel.org Cc: Alexander Duyck , Ioana Ciornei , Alex Marginean , Claudiu Manoil , Ilias Apalodimas , Vladimir Oltean Subject: [PATCH net-next 4/9] net: enetc: clean the TX software BD on the TX confirmation path Date: Wed, 31 Mar 2021 23:08:52 +0300 Message-Id: <20210331200857.3274425-5-olteanv@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331200857.3274425-1-olteanv@gmail.com> References: <20210331200857.3274425-1-olteanv@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Vladimir Oltean With the future introduction of some new fields into enetc_tx_swbd such as is_xdp_tx, is_xdp_redirect etc, we need not only to set these bits to true from the XDP_TX/XDP_REDIRECT code path, but also to false from the old code paths. This is because TX software buffer descriptors are kept in a ring that is shadow of the hardware TX ring, so these structures keep getting reused, and there is always the possibility that when a software BD is reused (after we ran a full circle through the TX ring), the old user of the tx_swbd had set is_xdp_tx = true, and now we are sending a regular skb, which would need to set is_xdp_tx = false. To be minimally invasive to the old code paths, let's just scrub the software TX BD in the TX confirmation path (enetc_clean_tx_ring), once we know that nobody uses this software TX BD (tx_ring->next_to_clean hasn't yet been updated, and the TX paths check enetc_bd_unused which tells them if there's any more space in the TX ring for a new enqueue). Signed-off-by: Vladimir Oltean --- drivers/net/ethernet/freescale/enetc/enetc.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c index 37d2d142a744..ade05518b496 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -344,6 +344,10 @@ static bool enetc_clean_tx_ring(struct enetc_bdr *tx_ring, int napi_budget) } tx_byte_cnt += tx_swbd->len; + /* Scrub the swbd here so we don't have to do that + * when we reuse it during xmit + */ + memset(tx_swbd, 0, sizeof(*tx_swbd)); bds_to_clean--; tx_swbd++; From patchwork Wed Mar 31 20:08:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vladimir Oltean X-Patchwork-Id: 413364 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0ECCC43600 for ; Wed, 31 Mar 2021 20:10:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BC4306109E for ; Wed, 31 Mar 2021 20:10:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236467AbhCaUJx (ORCPT ); Wed, 31 Mar 2021 16:09:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34662 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236421AbhCaUJX (ORCPT ); Wed, 31 Mar 2021 16:09:23 -0400 Received: from mail-ej1-x62a.google.com (mail-ej1-x62a.google.com [IPv6:2a00:1450:4864:20::62a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BB346C06174A; Wed, 31 Mar 2021 13:09:22 -0700 (PDT) Received: by mail-ej1-x62a.google.com with SMTP id a7so31925315ejs.3; Wed, 31 Mar 2021 13:09:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=j5CGryKzNfO/b/2j2Zt95ia8ciPSb8SKAQXMHdOsDPU=; b=qfgUjtbKR+PNknsiwrSNPxn0RLDbm6TQuaXMCWA6VrcFNVQ5IL4967e9sUKVC+COw2 cMZhsDVypftNJPRxfc62i5D4yEh4bUvXrn82nPCCwS/TWM/ZgcrMJQykzZAXzCLGQmD0 +3ooaJbTrBwzSh6L9+eYignk6sHDj8eOqOThcUb0m8BOFJ9cjMtIcEDZmK1bE/N6oTNz cVfN8ses01TEpvT8YzZ+7AcrYNDoyWB9SHZXI+RvcbEudaqw1CokFhEZNNyaGOr52nwv TaG48pJU07zaUJalHzFAsnPZQnIKfXqXvf6QVyOxAmuqZW1uvccHKVRPXkQ71Zpivxiq TDrQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=j5CGryKzNfO/b/2j2Zt95ia8ciPSb8SKAQXMHdOsDPU=; b=ePO8M7fLBTgdq1MonGRdQYssBvitexig3nFUkwz0C2M3z+mcQn5IlbfomWOXZW4H4e /Lzp7fNIiM85OoiGzHMVYsQg1FM3AnxQl8Oji82TzHK9uP7KVPpTqXWAxuFc0XE0fzXF iKwYkbqP6zHrwo/PublWqR/eeQdK6KpcM+PfRvLO1ltlhtVLD55+/YB3EtPOvWjLCdGb LEgOqI7V7oN7qjDRHFLFq2M62eDsYZJfOoh9BnOZ6kqDRFbg6AuFJ88w5fScotKC1oKs 7xIv/0/zFZWUy99P5tCBYwyc0bDy790yJS4WSrYCzYjJ2XiNKjhg/aM1P5HxNx8Y7Np0 yjaA== X-Gm-Message-State: AOAM530qGK3ZxMviGcJAGC9OJPcyVQqM+LnZpXbjNAzb895OnNSCgj7X yy3sp359Twh1OnqGNNpcabU= X-Google-Smtp-Source: ABdhPJzlld0T9y+/visjkyjinbCUMSCHbRqyrOlA9PeTIximnavwLfoJLEK1yxUml+JcViB3h/Szng== X-Received: by 2002:a17:906:dfcc:: with SMTP id jt12mr5507911ejc.31.1617221361380; Wed, 31 Mar 2021 13:09:21 -0700 (PDT) Received: from localhost.localdomain (5-12-16-165.residential.rdsnet.ro. [5.12.16.165]) by smtp.gmail.com with ESMTPSA id r19sm1691305ejr.55.2021.03.31.13.09.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 31 Mar 2021 13:09:20 -0700 (PDT) From: Vladimir Oltean To: Jakub Kicinski , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , Andrii Nakryiko , Martin KaFai Lau , Song Liu , Yonghong Song , KP Singh , "David S. Miller" , netdev@vger.kernel.org, bpf@vger.kernel.org Cc: Alexander Duyck , Ioana Ciornei , Alex Marginean , Claudiu Manoil , Ilias Apalodimas , Vladimir Oltean Subject: [PATCH net-next 6/9] net: enetc: add support for XDP_DROP and XDP_PASS Date: Wed, 31 Mar 2021 23:08:54 +0300 Message-Id: <20210331200857.3274425-7-olteanv@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331200857.3274425-1-olteanv@gmail.com> References: <20210331200857.3274425-1-olteanv@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Vladimir Oltean For the RX ring, enetc uses an allocation scheme based on pages split into two buffers, which is already very efficient in terms of preventing reallocations / maximizing reuse, so I see no reason why I would change that. +--------+--------+--------+--------+--------+--------+--------+ | | | | | | | | | half B | half B | half B | half B | half B | half B | half B | | | | | | | | | +--------+--------+--------+--------+--------+--------+--------+ | | | | | | | | | half A | half A | half A | half A | half A | half A | half A | RX ring | | | | | | | | +--------+--------+--------+--------+--------+--------+--------+ ^ ^ | | next_to_clean next_to_alloc next_to_use +--------+--------+--------+--------+--------+ | | | | | | | half B | half B | half B | half B | half B | | | | | | | +--------+--------+--------+--------+--------+--------+--------+ | | | | | | | | | half B | half B | half A | half A | half A | half A | half A | RX ring | | | | | | | | +--------+--------+--------+--------+--------+--------+--------+ | | | ^ ^ | half A | half A | | | | | | next_to_clean next_to_use +--------+--------+ ^ | next_to_alloc then when enetc_refill_rx_ring is called, whose purpose is to advance next_to_use, it sees that it can take buffers up to next_to_alloc, and it says "oh, hey, rx_swbd->page isn't NULL, I don't need to allocate one!". The only problem is that for default PAGE_SIZE values of 4096, buffer sizes are 2048 bytes. While this is enough for normal skb allocations at an MTU of 1500 bytes, for XDP it isn't, because the XDP headroom is 256 bytes, and including skb_shared_info and alignment, we end up being able to make use of only 1472 bytes, which is insufficient for the default MTU. To solve that problem, we implement scatter/gather processing in the driver, because we would really like to keep the existing allocation scheme. A packet of 1500 bytes is received in a buffer of 1472 bytes and another one of 28 bytes. Because the headroom required by XDP is different (and much larger) than the one required by the network stack, whenever a BPF program is added or deleted on the port, we drain the existing RX buffers and seed new ones with the required headroom. We also keep the required headroom in rx_ring->buffer_offset. The simplest way to implement XDP_PASS, where an skb must be created, is to create an xdp_buff based on the next_to_clean RX BDs, but not clear those BDs from the RX ring yet, just keep the original index at which the BDs for this frame started. Then, if the verdict is XDP_PASS, instead of converting the xdb_buff to an skb, we replay a call to enetc_build_skb (just as in the normal enetc_clean_rx_ring case), starting from the original BD index. We would also like to be minimally invasive to the regular RX data path, and not check whether there is a BPF program attached to the ring on every packet. So we create a separate RX ring processing function for XDP. Because we only install/remove the BPF program while the interface is down, we forgo the rcu_read_lock() in enetc_clean_rx_ring, since there shouldn't be any circumstance in which we are processing packets and there is a potentially freed BPF program attached to the RX ring. Signed-off-by: Vladimir Oltean --- drivers/net/ethernet/freescale/enetc/enetc.c | 284 ++++++++++++++++-- drivers/net/ethernet/freescale/enetc/enetc.h | 14 + .../ethernet/freescale/enetc/enetc_ethtool.c | 2 + .../net/ethernet/freescale/enetc/enetc_pf.c | 1 + 4 files changed, 281 insertions(+), 20 deletions(-) diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c index 38301d0d7f0c..58bb0b78585a 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -2,6 +2,7 @@ /* Copyright 2017-2019 NXP */ #include "enetc.h" +#include #include #include #include @@ -420,7 +421,7 @@ static bool enetc_new_page(struct enetc_bdr *rx_ring, rx_swbd->dma = addr; rx_swbd->page = page; - rx_swbd->page_offset = ENETC_RXB_PAD; + rx_swbd->page_offset = rx_ring->buffer_offset; return true; } @@ -550,6 +551,8 @@ static void enetc_put_rx_buff(struct enetc_bdr *rx_ring, struct enetc_rx_swbd *rx_swbd) { if (likely(enetc_page_reusable(rx_swbd->page))) { + size_t buffer_size = ENETC_RXB_TRUESIZE - rx_ring->buffer_offset; + rx_swbd->page_offset ^= ENETC_RXB_TRUESIZE; page_ref_inc(rx_swbd->page); @@ -558,8 +561,7 @@ static void enetc_put_rx_buff(struct enetc_bdr *rx_ring, /* sync for use by the device */ dma_sync_single_range_for_device(rx_ring->dev, rx_swbd->dma, rx_swbd->page_offset, - ENETC_RXB_DMA_SIZE, - DMA_FROM_DEVICE); + buffer_size, DMA_FROM_DEVICE); } else { dma_unmap_page(rx_ring->dev, rx_swbd->dma, PAGE_SIZE, DMA_FROM_DEVICE); @@ -576,13 +578,13 @@ static struct sk_buff *enetc_map_rx_buff_to_skb(struct enetc_bdr *rx_ring, void *ba; ba = page_address(rx_swbd->page) + rx_swbd->page_offset; - skb = build_skb(ba - ENETC_RXB_PAD, ENETC_RXB_TRUESIZE); + skb = build_skb(ba - rx_ring->buffer_offset, ENETC_RXB_TRUESIZE); if (unlikely(!skb)) { rx_ring->stats.rx_alloc_errs++; return NULL; } - skb_reserve(skb, ENETC_RXB_PAD); + skb_reserve(skb, rx_ring->buffer_offset); __skb_put(skb, size); enetc_put_rx_buff(rx_ring, rx_swbd); @@ -625,7 +627,7 @@ static bool enetc_check_bd_errors_and_consume(struct enetc_bdr *rx_ring, static struct sk_buff *enetc_build_skb(struct enetc_bdr *rx_ring, u32 bd_status, union enetc_rx_bd **rxbd, - int *i, int *cleaned_cnt) + int *i, int *cleaned_cnt, int buffer_size) { struct sk_buff *skb; u16 size; @@ -644,7 +646,7 @@ static struct sk_buff *enetc_build_skb(struct enetc_bdr *rx_ring, /* not last BD in frame? */ while (!(bd_status & ENETC_RXBD_LSTATUS_F)) { bd_status = le32_to_cpu((*rxbd)->r.lstatus); - size = ENETC_RXB_DMA_SIZE; + size = buffer_size; if (bd_status & ENETC_RXBD_LSTATUS_F) { dma_rmb(); @@ -698,7 +700,7 @@ static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring, break; skb = enetc_build_skb(rx_ring, bd_status, &rxbd, &i, - &cleaned_cnt); + &cleaned_cnt, ENETC_RXB_DMA_SIZE); if (!skb) break; @@ -716,10 +718,174 @@ static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring, return rx_frm_cnt; } +static void enetc_map_rx_buff_to_xdp(struct enetc_bdr *rx_ring, int i, + struct xdp_buff *xdp_buff, u16 size) +{ + struct enetc_rx_swbd *rx_swbd = enetc_get_rx_buff(rx_ring, i, size); + void *hard_start = page_address(rx_swbd->page) + rx_swbd->page_offset; + struct skb_shared_info *shinfo; + + xdp_prepare_buff(xdp_buff, hard_start - rx_ring->buffer_offset, + rx_ring->buffer_offset, size, false); + + shinfo = xdp_get_shared_info_from_buff(xdp_buff); + shinfo->nr_frags = 0; +} + +static void enetc_add_rx_buff_to_xdp(struct enetc_bdr *rx_ring, int i, + u16 size, struct xdp_buff *xdp_buff) +{ + struct skb_shared_info *shinfo = xdp_get_shared_info_from_buff(xdp_buff); + struct enetc_rx_swbd *rx_swbd = enetc_get_rx_buff(rx_ring, i, size); + skb_frag_t *frag = &shinfo->frags[shinfo->nr_frags]; + + skb_frag_off_set(frag, rx_swbd->page_offset); + skb_frag_size_set(frag, size); + __skb_frag_set_page(frag, rx_swbd->page); + + shinfo->nr_frags++; +} + +static void enetc_build_xdp_buff(struct enetc_bdr *rx_ring, u32 bd_status, + union enetc_rx_bd **rxbd, int *i, + int *cleaned_cnt, struct xdp_buff *xdp_buff) +{ + u16 size = le16_to_cpu((*rxbd)->r.buf_len); + + xdp_init_buff(xdp_buff, ENETC_RXB_TRUESIZE, &rx_ring->xdp.rxq); + + enetc_map_rx_buff_to_xdp(rx_ring, *i, xdp_buff, size); + (*cleaned_cnt)++; + enetc_rxbd_next(rx_ring, rxbd, i); + + /* not last BD in frame? */ + while (!(bd_status & ENETC_RXBD_LSTATUS_F)) { + bd_status = le32_to_cpu((*rxbd)->r.lstatus); + size = ENETC_RXB_DMA_SIZE_XDP; + + if (bd_status & ENETC_RXBD_LSTATUS_F) { + dma_rmb(); + size = le16_to_cpu((*rxbd)->r.buf_len); + } + + enetc_add_rx_buff_to_xdp(rx_ring, *i, size, xdp_buff); + (*cleaned_cnt)++; + enetc_rxbd_next(rx_ring, rxbd, i); + } +} + +/* Reuse the current page without performing half-page buffer flipping */ +static void enetc_put_xdp_buff(struct enetc_bdr *rx_ring, + struct enetc_rx_swbd *rx_swbd) +{ + enetc_reuse_page(rx_ring, rx_swbd); + + /* sync for use by the device */ + dma_sync_single_range_for_device(rx_ring->dev, rx_swbd->dma, + rx_swbd->page_offset, + ENETC_RXB_DMA_SIZE_XDP, + DMA_FROM_DEVICE); + + rx_swbd->page = NULL; +} + +static void enetc_xdp_drop(struct enetc_bdr *rx_ring, int rx_ring_first, + int rx_ring_last) +{ + while (rx_ring_first != rx_ring_last) { + enetc_put_xdp_buff(rx_ring, + &rx_ring->rx_swbd[rx_ring_first]); + enetc_bdr_idx_inc(rx_ring, &rx_ring_first); + } + rx_ring->stats.xdp_drops++; +} + +static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, + struct napi_struct *napi, int work_limit, + struct bpf_prog *prog) +{ + int rx_frm_cnt = 0, rx_byte_cnt = 0; + int cleaned_cnt, i; + u32 xdp_act; + + cleaned_cnt = enetc_bd_unused(rx_ring); + /* next descriptor to process */ + i = rx_ring->next_to_clean; + + while (likely(rx_frm_cnt < work_limit)) { + union enetc_rx_bd *rxbd, *orig_rxbd; + int orig_i, orig_cleaned_cnt; + struct xdp_buff xdp_buff; + struct sk_buff *skb; + u32 bd_status; + + if (cleaned_cnt >= ENETC_RXBD_BUNDLE) + cleaned_cnt -= enetc_refill_rx_ring(rx_ring, + cleaned_cnt); + + rxbd = enetc_rxbd(rx_ring, i); + bd_status = le32_to_cpu(rxbd->r.lstatus); + if (!bd_status) + break; + + enetc_wr_reg_hot(rx_ring->idr, BIT(rx_ring->index)); + dma_rmb(); /* for reading other rxbd fields */ + + if (enetc_check_bd_errors_and_consume(rx_ring, bd_status, + &rxbd, &i)) + break; + + orig_rxbd = rxbd; + orig_cleaned_cnt = cleaned_cnt; + orig_i = i; + + enetc_build_xdp_buff(rx_ring, bd_status, &rxbd, &i, + &cleaned_cnt, &xdp_buff); + + xdp_act = bpf_prog_run_xdp(prog, &xdp_buff); + + switch (xdp_act) { + case XDP_ABORTED: + trace_xdp_exception(rx_ring->ndev, prog, xdp_act); + fallthrough; + case XDP_DROP: + enetc_xdp_drop(rx_ring, orig_i, i); + break; + case XDP_PASS: + rxbd = orig_rxbd; + cleaned_cnt = orig_cleaned_cnt; + i = orig_i; + + skb = enetc_build_skb(rx_ring, bd_status, &rxbd, + &i, &cleaned_cnt, + ENETC_RXB_DMA_SIZE_XDP); + if (unlikely(!skb)) + /* Exit the switch/case, not the loop */ + break; + + napi_gro_receive(napi, skb); + break; + default: + bpf_warn_invalid_xdp_action(xdp_act); + } + + rx_frm_cnt++; + } + + rx_ring->next_to_clean = i; + + rx_ring->stats.packets += rx_frm_cnt; + rx_ring->stats.bytes += rx_byte_cnt; + + return rx_frm_cnt; +} + static int enetc_poll(struct napi_struct *napi, int budget) { struct enetc_int_vector *v = container_of(napi, struct enetc_int_vector, napi); + struct enetc_bdr *rx_ring = &v->rx_ring; + struct bpf_prog *prog; bool complete = true; int work_done; int i; @@ -730,7 +896,11 @@ static int enetc_poll(struct napi_struct *napi, int budget) if (!enetc_clean_tx_ring(&v->tx_ring[i], budget)) complete = false; - work_done = enetc_clean_rx_ring(&v->rx_ring, napi, budget); + prog = rx_ring->xdp.prog; + if (prog) + work_done = enetc_clean_rx_ring_xdp(rx_ring, napi, budget, prog); + else + work_done = enetc_clean_rx_ring(rx_ring, napi, budget); if (work_done == budget) complete = false; if (work_done) @@ -1120,7 +1290,10 @@ static void enetc_setup_rxbdr(struct enetc_hw *hw, struct enetc_bdr *rx_ring) enetc_rxbdr_wr(hw, idx, ENETC_RBLENR, ENETC_RTBLENR_LEN(rx_ring->bd_count)); - enetc_rxbdr_wr(hw, idx, ENETC_RBBSR, ENETC_RXB_DMA_SIZE); + if (rx_ring->xdp.prog) + enetc_rxbdr_wr(hw, idx, ENETC_RBBSR, ENETC_RXB_DMA_SIZE_XDP); + else + enetc_rxbdr_wr(hw, idx, ENETC_RBBSR, ENETC_RXB_DMA_SIZE); enetc_rxbdr_wr(hw, idx, ENETC_RBPIR, 0); @@ -1511,6 +1684,54 @@ int enetc_setup_tc(struct net_device *ndev, enum tc_setup_type type, } } +static int enetc_setup_xdp_prog(struct net_device *dev, struct bpf_prog *prog, + struct netlink_ext_ack *extack) +{ + struct enetc_ndev_priv *priv = netdev_priv(dev); + struct bpf_prog *old_prog; + bool is_up; + int i; + + /* The buffer layout is changing, so we need to drain the old + * RX buffers and seed new ones. + */ + is_up = netif_running(dev); + if (is_up) + dev_close(dev); + + old_prog = xchg(&priv->xdp_prog, prog); + if (old_prog) + bpf_prog_put(old_prog); + + for (i = 0; i < priv->num_rx_rings; i++) { + struct enetc_bdr *rx_ring = priv->rx_ring[i]; + + rx_ring->xdp.prog = prog; + + if (prog) + rx_ring->buffer_offset = XDP_PACKET_HEADROOM; + else + rx_ring->buffer_offset = ENETC_RXB_PAD; + } + + if (is_up) + return dev_open(dev, extack); + + return 0; +} + +int enetc_setup_bpf(struct net_device *dev, struct netdev_bpf *xdp) +{ + switch (xdp->command) { + case XDP_SETUP_PROG: + return enetc_setup_xdp_prog(dev, xdp->prog, xdp->extack); + default: + return -EINVAL; + } + + return 0; +} + struct net_device_stats *enetc_get_stats(struct net_device *ndev) { struct enetc_ndev_priv *priv = netdev_priv(ndev); @@ -1727,6 +1948,28 @@ int enetc_alloc_msix(struct enetc_ndev_priv *priv) priv->int_vector[i] = v; + bdr = &v->rx_ring; + bdr->index = i; + bdr->ndev = priv->ndev; + bdr->dev = priv->dev; + bdr->bd_count = priv->rx_bd_count; + bdr->buffer_offset = ENETC_RXB_PAD; + priv->rx_ring[i] = bdr; + + err = xdp_rxq_info_reg(&bdr->xdp.rxq, priv->ndev, i, 0); + if (err) { + kfree(v); + goto fail; + } + + err = xdp_rxq_info_reg_mem_model(&bdr->xdp.rxq, + MEM_TYPE_PAGE_SHARED, NULL); + if (err) { + xdp_rxq_info_unreg(&bdr->xdp.rxq); + kfree(v); + goto fail; + } + /* init defaults for adaptive IC */ if (priv->ic_mode & ENETC_IC_RX_ADAPTIVE) { v->rx_ictt = 0x1; @@ -1754,22 +1997,20 @@ int enetc_alloc_msix(struct enetc_ndev_priv *priv) bdr->bd_count = priv->tx_bd_count; priv->tx_ring[idx] = bdr; } - - bdr = &v->rx_ring; - bdr->index = i; - bdr->ndev = priv->ndev; - bdr->dev = priv->dev; - bdr->bd_count = priv->rx_bd_count; - priv->rx_ring[i] = bdr; } return 0; fail: while (i--) { - netif_napi_del(&priv->int_vector[i]->napi); - cancel_work_sync(&priv->int_vector[i]->rx_dim.work); - kfree(priv->int_vector[i]); + struct enetc_int_vector *v = priv->int_vector[i]; + struct enetc_bdr *rx_ring = &v->rx_ring; + + xdp_rxq_info_unreg_mem_model(&rx_ring->xdp.rxq); + xdp_rxq_info_unreg(&rx_ring->xdp.rxq); + netif_napi_del(&v->napi); + cancel_work_sync(&v->rx_dim.work); + kfree(v); } pci_free_irq_vectors(pdev); @@ -1783,7 +2024,10 @@ void enetc_free_msix(struct enetc_ndev_priv *priv) for (i = 0; i < priv->bdr_int_num; i++) { struct enetc_int_vector *v = priv->int_vector[i]; + struct enetc_bdr *rx_ring = &v->rx_ring; + xdp_rxq_info_unreg_mem_model(&rx_ring->xdp.rxq); + xdp_rxq_info_unreg(&rx_ring->xdp.rxq); netif_napi_del(&v->napi); cancel_work_sync(&v->rx_dim.work); } diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h index d9e75644b89c..5815addfe966 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.h +++ b/drivers/net/ethernet/freescale/enetc/enetc.h @@ -33,6 +33,8 @@ struct enetc_tx_swbd { #define ENETC_RXB_PAD NET_SKB_PAD /* add extra space if needed */ #define ENETC_RXB_DMA_SIZE \ (SKB_WITH_OVERHEAD(ENETC_RXB_TRUESIZE) - ENETC_RXB_PAD) +#define ENETC_RXB_DMA_SIZE_XDP \ + (SKB_WITH_OVERHEAD(ENETC_RXB_TRUESIZE) - XDP_PACKET_HEADROOM) struct enetc_rx_swbd { dma_addr_t dma; @@ -44,6 +46,12 @@ struct enetc_ring_stats { unsigned int packets; unsigned int bytes; unsigned int rx_alloc_errs; + unsigned int xdp_drops; +}; + +struct enetc_xdp_data { + struct xdp_rxq_info rxq; + struct bpf_prog *prog; }; #define ENETC_RX_RING_DEFAULT_SIZE 512 @@ -72,6 +80,9 @@ struct enetc_bdr { }; void __iomem *idr; /* Interrupt Detect Register pointer */ + int buffer_offset; + struct enetc_xdp_data xdp; + struct enetc_ring_stats stats; dma_addr_t bd_dma_base; @@ -276,6 +287,8 @@ struct enetc_ndev_priv { struct phylink *phylink; int ic_mode; u32 tx_ictt; + + struct bpf_prog *xdp_prog; }; /* Messaging */ @@ -315,6 +328,7 @@ int enetc_set_features(struct net_device *ndev, int enetc_ioctl(struct net_device *ndev, struct ifreq *rq, int cmd); int enetc_setup_tc(struct net_device *ndev, enum tc_setup_type type, void *type_data); +int enetc_setup_bpf(struct net_device *dev, struct netdev_bpf *xdp); /* ethtool */ void enetc_set_ethtool_ops(struct net_device *ndev); diff --git a/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c b/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c index 89e558135432..0183c13f8c1e 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c @@ -192,6 +192,7 @@ static const struct { static const char rx_ring_stats[][ETH_GSTRING_LEN] = { "Rx ring %2d frames", "Rx ring %2d alloc errors", + "Rx ring %2d XDP drops", }; static const char tx_ring_stats[][ETH_GSTRING_LEN] = { @@ -273,6 +274,7 @@ static void enetc_get_ethtool_stats(struct net_device *ndev, for (i = 0; i < priv->num_rx_rings; i++) { data[o++] = priv->rx_ring[i]->stats.packets; data[o++] = priv->rx_ring[i]->stats.rx_alloc_errs; + data[o++] = priv->rx_ring[i]->stats.xdp_drops; } if (!enetc_si_is_pf(priv->si)) diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c index 5e95afd61c87..0484dbe13422 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c @@ -707,6 +707,7 @@ static const struct net_device_ops enetc_ndev_ops = { .ndo_set_features = enetc_pf_set_features, .ndo_do_ioctl = enetc_ioctl, .ndo_setup_tc = enetc_setup_tc, + .ndo_bpf = enetc_setup_bpf, }; static void enetc_pf_netdev_setup(struct enetc_si *si, struct net_device *ndev, From patchwork Wed Mar 31 20:08:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vladimir Oltean X-Patchwork-Id: 413362 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 70B7DC43619 for ; Wed, 31 Mar 2021 20:10:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3D3A6610CD for ; Wed, 31 Mar 2021 20:10:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236505AbhCaUJ6 (ORCPT ); Wed, 31 Mar 2021 16:09:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34684 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236449AbhCaUJ1 (ORCPT ); Wed, 31 Mar 2021 16:09:27 -0400 Received: from mail-ej1-x62b.google.com (mail-ej1-x62b.google.com [IPv6:2a00:1450:4864:20::62b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 24822C061574; Wed, 31 Mar 2021 13:09:27 -0700 (PDT) Received: by mail-ej1-x62b.google.com with SMTP id jy13so31954559ejc.2; Wed, 31 Mar 2021 13:09:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=f3KM4a6OLQRnYhBSYHdPt+GN1/+0sqBukLjCDrxOj9Y=; b=FwEa96JN8k3BvcxESsDmX5k7f6BT/orBb+SI9d7XD+b76fkMTdyhOAj4ONaQp3ihCW hJOdb2QVm9/1E73GSY9VaBhKFRZo5FjZBTEwYm8jgXVyawWJ3OCPo2zkIc7u67902dLt 4BYeo1HilwnlQrm8MUxYPmgiOOJVaI2ENcJ8dxu+OMp7ubNi5eFNPonI1M/LejI7BZSp S4m4vfc7stWeNI3WIRfUAxXGth6HWQRTYHjbMJ5hitzdRtlQyXq0ZgRdTwZSxLKh6xMK nLyjYw1TgHcseM27NIh//ZtWLnFk/NOrLci1JK46bfy859Ywm9zsQQwO8QRdUhd8eYAo Ysxg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=f3KM4a6OLQRnYhBSYHdPt+GN1/+0sqBukLjCDrxOj9Y=; b=dh/9NbhhPd4Fha4X2nSHI2fuPIDkcMOadPaO4AfSix3tIrTCOWXGmKMVyIYTKKyHQX 1K0ZT8Pes2F3uKBVjIEE5BlD6zYRYsRPfWKiXaqc+ZNUnNi0kgNOAJcXAOcT5WLuwa/5 N/LlFHTl8lAvtd+dLlTxrAcs5gf8hcrx2BHZaGq/87E/+hm6zPcA7zdB8SDG4EbbPprp aH6xONvndUtX+JPTDzXWVjnmyQNggAgpcXsBHm3XqWSP3/qUqIQFaqhD37mN2Xb1AVuZ lU6LbNvvC1FVMViMCA0XtttjlvGPHk2UT2T5Ul/AzOZ++iIRr2CUTYtsrhSXgxZVXVUa 12Wg== X-Gm-Message-State: AOAM531M7VTlZUnl4WpTcslDAjMAtuHGLlmzKkPQIctrSjgdT4BIeI13 jAnEMBRcmVkGKqfBrhDJrtQ= X-Google-Smtp-Source: ABdhPJy4DyywNuU+3QbyU8g8JiOvKzMrW1WP/u3zhISN0oYWI7u1Mb8ONoM/9h5BtBEr5YBPF90KYg== X-Received: by 2002:a17:906:1a16:: with SMTP id i22mr5546241ejf.522.1617221365819; Wed, 31 Mar 2021 13:09:25 -0700 (PDT) Received: from localhost.localdomain (5-12-16-165.residential.rdsnet.ro. [5.12.16.165]) by smtp.gmail.com with ESMTPSA id r19sm1691305ejr.55.2021.03.31.13.09.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 31 Mar 2021 13:09:25 -0700 (PDT) From: Vladimir Oltean To: Jakub Kicinski , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , Andrii Nakryiko , Martin KaFai Lau , Song Liu , Yonghong Song , KP Singh , "David S. Miller" , netdev@vger.kernel.org, bpf@vger.kernel.org Cc: Alexander Duyck , Ioana Ciornei , Alex Marginean , Claudiu Manoil , Ilias Apalodimas , Vladimir Oltean Subject: [PATCH net-next 9/9] net: enetc: add support for XDP_REDIRECT Date: Wed, 31 Mar 2021 23:08:57 +0300 Message-Id: <20210331200857.3274425-10-olteanv@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331200857.3274425-1-olteanv@gmail.com> References: <20210331200857.3274425-1-olteanv@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Vladimir Oltean The driver implementation of the XDP_REDIRECT action reuses parts from XDP_TX, most notably the enetc_xdp_tx function which transmits an array of TX software BDs. Only this time, the buffers don't have DMA mappings, we need to create them. When a BPF program reaches the XDP_REDIRECT verdict for a frame, we can employ the same buffer reuse strategy as for the normal processing path and for XDP_PASS: we can flip to the other page half and seed that to the RX ring. Note that scatter/gather support is there, but disabled due to lack of multi-buffer support in XDP (which is added by this series): https://patchwork.kernel.org/project/netdevbpf/cover/cover.1616179034.git.lorenzo@kernel.org/ Signed-off-by: Vladimir Oltean --- drivers/net/ethernet/freescale/enetc/enetc.c | 212 +++++++++++++++++- drivers/net/ethernet/freescale/enetc/enetc.h | 11 +- .../ethernet/freescale/enetc/enetc_ethtool.c | 6 + .../net/ethernet/freescale/enetc/enetc_pf.c | 1 + 4 files changed, 218 insertions(+), 12 deletions(-) diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c index ba5313a5d7a4..57049ae97201 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -8,6 +8,23 @@ #include #include +static struct sk_buff *enetc_tx_swbd_get_skb(struct enetc_tx_swbd *tx_swbd) +{ + if (tx_swbd->is_xdp_tx || tx_swbd->is_xdp_redirect) + return NULL; + + return tx_swbd->skb; +} + +static struct xdp_frame * +enetc_tx_swbd_get_xdp_frame(struct enetc_tx_swbd *tx_swbd) +{ + if (tx_swbd->is_xdp_redirect) + return tx_swbd->xdp_frame; + + return NULL; +} + static void enetc_unmap_tx_buff(struct enetc_bdr *tx_ring, struct enetc_tx_swbd *tx_swbd) { @@ -25,14 +42,20 @@ static void enetc_unmap_tx_buff(struct enetc_bdr *tx_ring, tx_swbd->dma = 0; } -static void enetc_free_tx_skb(struct enetc_bdr *tx_ring, - struct enetc_tx_swbd *tx_swbd) +static void enetc_free_tx_frame(struct enetc_bdr *tx_ring, + struct enetc_tx_swbd *tx_swbd) { + struct xdp_frame *xdp_frame = enetc_tx_swbd_get_xdp_frame(tx_swbd); + struct sk_buff *skb = enetc_tx_swbd_get_skb(tx_swbd); + if (tx_swbd->dma) enetc_unmap_tx_buff(tx_ring, tx_swbd); - if (tx_swbd->skb) { - dev_kfree_skb_any(tx_swbd->skb); + if (xdp_frame) { + xdp_return_frame(tx_swbd->xdp_frame); + tx_swbd->xdp_frame = NULL; + } else if (skb) { + dev_kfree_skb_any(skb); tx_swbd->skb = NULL; } } @@ -183,7 +206,7 @@ static int enetc_map_tx_buffs(struct enetc_bdr *tx_ring, struct sk_buff *skb, do { tx_swbd = &tx_ring->tx_swbd[i]; - enetc_free_tx_skb(tx_ring, tx_swbd); + enetc_free_tx_frame(tx_ring, tx_swbd); if (i == 0) i = tx_ring->bd_count; i--; @@ -381,6 +404,9 @@ static bool enetc_clean_tx_ring(struct enetc_bdr *tx_ring, int napi_budget) do_tstamp = false; while (bds_to_clean && tx_frm_cnt < ENETC_DEFAULT_TX_WORK) { + struct xdp_frame *xdp_frame = enetc_tx_swbd_get_xdp_frame(tx_swbd); + struct sk_buff *skb = enetc_tx_swbd_get_skb(tx_swbd); + if (unlikely(tx_swbd->check_wb)) { struct enetc_ndev_priv *priv = netdev_priv(ndev); union enetc_tx_bd *txbd; @@ -400,12 +426,15 @@ static bool enetc_clean_tx_ring(struct enetc_bdr *tx_ring, int napi_budget) else if (likely(tx_swbd->dma)) enetc_unmap_tx_buff(tx_ring, tx_swbd); - if (tx_swbd->skb) { + if (xdp_frame) { + xdp_return_frame(xdp_frame); + tx_swbd->xdp_frame = NULL; + } else if (skb) { if (unlikely(do_tstamp)) { - enetc_tstamp_tx(tx_swbd->skb, tstamp); + enetc_tstamp_tx(skb, tstamp); do_tstamp = false; } - napi_consume_skb(tx_swbd->skb, napi_budget); + napi_consume_skb(skb, napi_budget); tx_swbd->skb = NULL; } @@ -827,6 +856,109 @@ static bool enetc_xdp_tx(struct enetc_bdr *tx_ring, return true; } +static int enetc_xdp_frame_to_xdp_tx_swbd(struct enetc_bdr *tx_ring, + struct enetc_tx_swbd *xdp_tx_arr, + struct xdp_frame *xdp_frame) +{ + struct enetc_tx_swbd *xdp_tx_swbd = &xdp_tx_arr[0]; + struct skb_shared_info *shinfo; + void *data = xdp_frame->data; + int len = xdp_frame->len; + skb_frag_t *frag; + dma_addr_t dma; + unsigned int f; + int n = 0; + + dma = dma_map_single(tx_ring->dev, data, len, DMA_TO_DEVICE); + if (unlikely(dma_mapping_error(tx_ring->dev, dma))) { + netdev_err(tx_ring->ndev, "DMA map error\n"); + return -1; + } + + xdp_tx_swbd->dma = dma; + xdp_tx_swbd->dir = DMA_TO_DEVICE; + xdp_tx_swbd->len = len; + xdp_tx_swbd->is_xdp_redirect = true; + xdp_tx_swbd->is_eof = false; + xdp_tx_swbd->xdp_frame = NULL; + + n++; + xdp_tx_swbd = &xdp_tx_arr[n]; + + shinfo = xdp_get_shared_info_from_frame(xdp_frame); + + for (f = 0, frag = &shinfo->frags[0]; f < shinfo->nr_frags; + f++, frag++) { + data = skb_frag_address(frag); + len = skb_frag_size(frag); + + dma = dma_map_single(tx_ring->dev, data, len, DMA_TO_DEVICE); + if (unlikely(dma_mapping_error(tx_ring->dev, dma))) { + /* Undo the DMA mapping for all fragments */ + while (n-- >= 0) + enetc_unmap_tx_buff(tx_ring, &xdp_tx_arr[n]); + + netdev_err(tx_ring->ndev, "DMA map error\n"); + return -1; + } + + xdp_tx_swbd->dma = dma; + xdp_tx_swbd->dir = DMA_TO_DEVICE; + xdp_tx_swbd->len = len; + xdp_tx_swbd->is_xdp_redirect = true; + xdp_tx_swbd->is_eof = false; + xdp_tx_swbd->xdp_frame = NULL; + + n++; + xdp_tx_swbd = &xdp_tx_arr[n]; + } + + xdp_tx_arr[n - 1].is_eof = true; + xdp_tx_arr[n - 1].xdp_frame = xdp_frame; + + return n; +} + +int enetc_xdp_xmit(struct net_device *ndev, int num_frames, + struct xdp_frame **frames, u32 flags) +{ + struct enetc_tx_swbd xdp_redirect_arr[ENETC_MAX_SKB_FRAGS] = {0}; + struct enetc_ndev_priv *priv = netdev_priv(ndev); + struct enetc_bdr *tx_ring; + int xdp_tx_bd_cnt, i, k; + int xdp_tx_frm_cnt = 0; + + tx_ring = priv->tx_ring[smp_processor_id()]; + + prefetchw(ENETC_TXBD(*tx_ring, tx_ring->next_to_use)); + + for (k = 0; k < num_frames; k++) { + xdp_tx_bd_cnt = enetc_xdp_frame_to_xdp_tx_swbd(tx_ring, + xdp_redirect_arr, + frames[k]); + if (unlikely(xdp_tx_bd_cnt < 0)) + break; + + if (unlikely(!enetc_xdp_tx(tx_ring, xdp_redirect_arr, + xdp_tx_bd_cnt))) { + for (i = 0; i < xdp_tx_bd_cnt; i++) + enetc_unmap_tx_buff(tx_ring, + &xdp_redirect_arr[i]); + tx_ring->stats.xdp_tx_drops++; + break; + } + + xdp_tx_frm_cnt++; + } + + if (unlikely((flags & XDP_XMIT_FLUSH) || k != xdp_tx_frm_cnt)) + enetc_update_tx_ring_tail(tx_ring); + + tx_ring->stats.xdp_tx += xdp_tx_frm_cnt; + + return xdp_tx_frm_cnt; +} + static void enetc_map_rx_buff_to_xdp(struct enetc_bdr *rx_ring, int i, struct xdp_buff *xdp_buff, u16 size) { @@ -948,14 +1080,31 @@ static void enetc_xdp_drop(struct enetc_bdr *rx_ring, int rx_ring_first, rx_ring->stats.xdp_drops++; } +static void enetc_xdp_free(struct enetc_bdr *rx_ring, int rx_ring_first, + int rx_ring_last) +{ + while (rx_ring_first != rx_ring_last) { + struct enetc_rx_swbd *rx_swbd = &rx_ring->rx_swbd[rx_ring_first]; + + if (rx_swbd->page) { + dma_unmap_page(rx_ring->dev, rx_swbd->dma, PAGE_SIZE, + rx_swbd->dir); + __free_page(rx_swbd->page); + rx_swbd->page = NULL; + } + enetc_bdr_idx_inc(rx_ring, &rx_ring_first); + } + rx_ring->stats.xdp_redirect_failures++; +} + static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, struct napi_struct *napi, int work_limit, struct bpf_prog *prog) { + int xdp_tx_bd_cnt, xdp_tx_frm_cnt = 0, xdp_redirect_frm_cnt = 0; struct enetc_tx_swbd xdp_tx_arr[ENETC_MAX_SKB_FRAGS] = {0}; struct enetc_ndev_priv *priv = netdev_priv(rx_ring->ndev); struct enetc_bdr *tx_ring = priv->tx_ring[rx_ring->index]; - int xdp_tx_bd_cnt, xdp_tx_frm_cnt = 0; int rx_frm_cnt = 0, rx_byte_cnt = 0; int cleaned_cnt, i; u32 xdp_act; @@ -969,6 +1118,7 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, int orig_i, orig_cleaned_cnt; struct xdp_buff xdp_buff; struct sk_buff *skb; + int tmp_orig_i, err; u32 bd_status; rxbd = enetc_rxbd(rx_ring, i); @@ -1026,6 +1176,43 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, rx_ring->xdp.xdp_tx_in_flight += xdp_tx_bd_cnt; xdp_tx_frm_cnt++; } + break; + case XDP_REDIRECT: + /* xdp_return_frame does not support S/G in the sense + * that it leaks the fragments (__xdp_return should not + * call page_frag_free only for the initial buffer). + * Until XDP_REDIRECT gains support for S/G let's keep + * the code structure in place, but dead. We drop the + * S/G frames ourselves to avoid memory leaks which + * would otherwise leave the kernel OOM. + */ + if (unlikely(cleaned_cnt - orig_cleaned_cnt != 1)) { + enetc_xdp_drop(rx_ring, orig_i, i); + rx_ring->stats.xdp_redirect_sg++; + break; + } + + tmp_orig_i = orig_i; + + while (orig_i != i) { + enetc_put_rx_buff(rx_ring, + &rx_ring->rx_swbd[orig_i]); + enetc_bdr_idx_inc(rx_ring, &orig_i); + } + + err = xdp_do_redirect(rx_ring->ndev, &xdp_buff, prog); + if (unlikely(err)) { + enetc_xdp_free(rx_ring, tmp_orig_i, i); + } else { + xdp_redirect_frm_cnt++; + rx_ring->stats.xdp_redirect++; + } + + if (unlikely(xdp_redirect_frm_cnt > ENETC_DEFAULT_TX_WORK)) { + xdp_do_flush_map(); + xdp_redirect_frm_cnt = 0; + } + break; default: bpf_warn_invalid_xdp_action(xdp_act); @@ -1039,6 +1226,9 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, rx_ring->stats.packets += rx_frm_cnt; rx_ring->stats.bytes += rx_byte_cnt; + if (xdp_redirect_frm_cnt) + xdp_do_flush_map(); + if (xdp_tx_frm_cnt) enetc_update_tx_ring_tail(tx_ring); @@ -1173,7 +1363,7 @@ static void enetc_free_txbdr(struct enetc_bdr *txr) int size, i; for (i = 0; i < txr->bd_count; i++) - enetc_free_tx_skb(txr, &txr->tx_swbd[i]); + enetc_free_tx_frame(txr, &txr->tx_swbd[i]); size = txr->bd_count * sizeof(union enetc_tx_bd); @@ -1290,7 +1480,7 @@ static void enetc_free_tx_ring(struct enetc_bdr *tx_ring) for (i = 0; i < tx_ring->bd_count; i++) { struct enetc_tx_swbd *tx_swbd = &tx_ring->tx_swbd[i]; - enetc_free_tx_skb(tx_ring, tx_swbd); + enetc_free_tx_frame(tx_ring, tx_swbd); } tx_ring->next_to_clean = 0; diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h index d0619fcbbe97..05474f46b0d9 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.h +++ b/drivers/net/ethernet/freescale/enetc/enetc.h @@ -19,7 +19,10 @@ (ETH_FCS_LEN + ETH_HLEN + VLAN_HLEN)) struct enetc_tx_swbd { - struct sk_buff *skb; + union { + struct sk_buff *skb; + struct xdp_frame *xdp_frame; + }; dma_addr_t dma; struct page *page; /* valid only if is_xdp_tx */ u16 page_offset; /* valid only if is_xdp_tx */ @@ -30,6 +33,7 @@ struct enetc_tx_swbd { u8 do_tstamp:1; u8 is_eof:1; u8 is_xdp_tx:1; + u8 is_xdp_redirect:1; }; #define ENETC_RX_MAXFRM_SIZE ENETC_MAC_MAXFRM_SIZE @@ -61,6 +65,9 @@ struct enetc_ring_stats { unsigned int xdp_drops; unsigned int xdp_tx; unsigned int xdp_tx_drops; + unsigned int xdp_redirect; + unsigned int xdp_redirect_failures; + unsigned int xdp_redirect_sg; unsigned int recycles; unsigned int recycle_failures; }; @@ -354,6 +361,8 @@ int enetc_ioctl(struct net_device *ndev, struct ifreq *rq, int cmd); int enetc_setup_tc(struct net_device *ndev, enum tc_setup_type type, void *type_data); int enetc_setup_bpf(struct net_device *dev, struct netdev_bpf *xdp); +int enetc_xdp_xmit(struct net_device *ndev, int num_frames, + struct xdp_frame **frames, u32 flags); /* ethtool */ void enetc_set_ethtool_ops(struct net_device *ndev); diff --git a/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c b/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c index 37821a8b225e..7cc81b453bd7 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c @@ -195,6 +195,9 @@ static const char rx_ring_stats[][ETH_GSTRING_LEN] = { "Rx ring %2d XDP drops", "Rx ring %2d recycles", "Rx ring %2d recycle failures", + "Rx ring %2d redirects", + "Rx ring %2d redirect failures", + "Rx ring %2d redirect S/G", }; static const char tx_ring_stats[][ETH_GSTRING_LEN] = { @@ -284,6 +287,9 @@ static void enetc_get_ethtool_stats(struct net_device *ndev, data[o++] = priv->rx_ring[i]->stats.xdp_drops; data[o++] = priv->rx_ring[i]->stats.recycles; data[o++] = priv->rx_ring[i]->stats.recycle_failures; + data[o++] = priv->rx_ring[i]->stats.xdp_redirect; + data[o++] = priv->rx_ring[i]->stats.xdp_redirect_failures; + data[o++] = priv->rx_ring[i]->stats.xdp_redirect_sg; } if (!enetc_si_is_pf(priv->si)) diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c index 0484dbe13422..f61fedf462e5 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c @@ -708,6 +708,7 @@ static const struct net_device_ops enetc_ndev_ops = { .ndo_do_ioctl = enetc_ioctl, .ndo_setup_tc = enetc_setup_tc, .ndo_bpf = enetc_setup_bpf, + .ndo_xdp_xmit = enetc_xdp_xmit, }; static void enetc_pf_netdev_setup(struct enetc_si *si, struct net_device *ndev,