From patchwork Mon Nov 9 23:36:56 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Awogbemila
X-Patchwork-Id: 322024
Date: Mon, 9 Nov 2020 15:36:56 -0800
In-Reply-To: <20201109233659.1953461-1-awogbemila@google.com>
Message-Id: <20201109233659.1953461-2-awogbemila@google.com>
References: <20201109233659.1953461-1-awogbemila@google.com>
X-Mailer: git-send-email 2.29.2.222.g5d2a92d10f8-goog
Subject: [PATCH net-next v6 1/4] gve: Add support for raw addressing device option
From: David Awogbemila
To: netdev@vger.kernel.org
Cc: Catherine Sullivan , Yangchun
 Fu , David Awogbemila
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

From: Catherine Sullivan

Add support for parsing device options to gve_adminq_describe_device().
As the first device option, add raw addressing.

"Raw Addressing" mode (as opposed to the current "qpl" mode) is an
operational mode which allows the driver to avoid the bounce buffer copies
which it currently performs using pre-allocated qpls (queue_page_lists)
when sending and receiving packets.
For egress packets, the provided skb data addresses will be dma_map'ed and
passed to the device, allowing the NIC to perform DMA directly - the
driver will not have to copy the buffer content into pre-allocated
buffers/qpls (as in qpl mode).
For ingress packets, copies are also eliminated as buffers are handed to
the networking stack and then recycled or re-allocated as necessary,
avoiding the use of skb_copy_to_linear_data().

This patch only introduces the option to the driver.
Subsequent patches will add the ingress and egress functionality.

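Illustration only, not part of the patch: a minimal user-space sketch of
the option walk described above, assuming the options are laid out back to
back after the descriptor and that each starts with an 8-byte header
(16-bit id, 16-bit payload length, 32-bit feature mask, all big-endian).
The names read_be16, next_option_offset and find_option are hypothetical.
Unlike gve_get_next_option() in this patch, the sketch also checks that the
header itself fits before reading the length field.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed header size: id (2) + length (2) + feat_mask (4). */
#define OPT_HDR_LEN 8u

/* Decode a big-endian 16-bit value without relying on host byte order. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Return the offset just past the option that starts at `off`, or 0 if the
 * option (header plus payload) would run past `total_len`. 0 is safe as a
 * sentinel because a valid result is always at least OPT_HDR_LEN.
 */
static size_t next_option_offset(const uint8_t *desc, size_t total_len,
				 size_t off)
{
	size_t end;

	if (off > total_len || total_len - off < OPT_HDR_LEN)
		return 0;
	end = off + OPT_HDR_LEN + read_be16(desc + off + 2);
	return end > total_len ? 0 : end;
}

/*
 * Walk `num_options` options starting at `first_off`.
 * Returns 1 if an option with `want_id` was seen, 0 if not, and -1 if any
 * option overruns the descriptor (the patch returns -EINVAL in that case).
 */
static int find_option(const uint8_t *desc, size_t total_len, size_t first_off,
		       unsigned int num_options, uint16_t want_id)
{
	size_t off = first_off;
	unsigned int i;
	int found = 0;

	for (i = 0; i < num_options; i++) {
		size_t next = next_option_offset(desc, total_len, off);

		if (!next)
			return -1;
		if (read_be16(desc + off) == want_id)
			found = 1;
		off = next;
	}
	return found;
}
```

Unknown option ids are skipped rather than rejected, which matches the
default case in the patch and lets a newer device advertise options an
older driver does not know about.
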
Reviewed-by: Yangchun Fu
Signed-off-by: Catherine Sullivan
Signed-off-by: David Awogbemila
Reviewed-by: Alexander Duyck
---
 drivers/net/ethernet/google/gve/gve.h        |  1 +
 drivers/net/ethernet/google/gve/gve_adminq.c | 64 ++++++++++++++++++++
 drivers/net/ethernet/google/gve/gve_adminq.h | 15 +++--
 drivers/net/ethernet/google/gve/gve_main.c   |  9 +++
 4 files changed, 85 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/google/gve/gve.h b/drivers/net/ethernet/google/gve/gve.h
index f5c80229ea96..80cdae06ee39 100644
--- a/drivers/net/ethernet/google/gve/gve.h
+++ b/drivers/net/ethernet/google/gve/gve.h
@@ -199,6 +199,7 @@ struct gve_priv {
 	u64 num_registered_pages; /* num pages registered with NIC */
 	u32 rx_copybreak; /* copy packets smaller than this */
 	u16 default_num_queues; /* default num queues to set up */
+	bool raw_addressing; /* true if this dev supports raw addressing */
 
 	struct gve_queue_config tx_cfg;
 	struct gve_queue_config rx_cfg;
diff --git a/drivers/net/ethernet/google/gve/gve_adminq.c b/drivers/net/ethernet/google/gve/gve_adminq.c
index 24ae6a28a806..3e6de659b274 100644
--- a/drivers/net/ethernet/google/gve/gve_adminq.c
+++ b/drivers/net/ethernet/google/gve/gve_adminq.c
@@ -14,6 +14,18 @@
 #define GVE_ADMINQ_SLEEP_LEN 20
 #define GVE_MAX_ADMINQ_EVENT_COUNTER_CHECK 100
 
+static inline
+struct gve_device_option *gve_get_next_option(struct gve_device_descriptor *descriptor,
+					      struct gve_device_option *option)
+{
+	void *option_end, *descriptor_end;
+
+	option_end = (void *)option + sizeof(*option) + be16_to_cpu(option->option_length);
+	descriptor_end = (void *)descriptor + be16_to_cpu(descriptor->total_length);
+
+	return option_end > descriptor_end ? NULL : (struct gve_device_option *)option_end;
+}
+
 int gve_adminq_alloc(struct device *dev, struct gve_priv *priv)
 {
 	priv->adminq = dma_alloc_coherent(dev, PAGE_SIZE,
@@ -460,11 +472,14 @@ int gve_adminq_destroy_rx_queues(struct gve_priv *priv, u32 num_queues)
 int gve_adminq_describe_device(struct gve_priv *priv)
 {
 	struct gve_device_descriptor *descriptor;
+	struct gve_device_option *dev_opt;
 	union gve_adminq_command cmd;
 	dma_addr_t descriptor_bus;
+	u16 num_options;
 	int err = 0;
 	u8 *mac;
 	u16 mtu;
+	int i;
 
 	memset(&cmd, 0, sizeof(cmd));
 	descriptor = dma_alloc_coherent(&priv->pdev->dev, PAGE_SIZE,
@@ -518,6 +533,55 @@ int gve_adminq_describe_device(struct gve_priv *priv)
 		priv->rx_desc_cnt = priv->rx_pages_per_qpl;
 	}
 	priv->default_num_queues = be16_to_cpu(descriptor->default_num_queues);
+	dev_opt = (void *)(descriptor + 1);
+
+	num_options = be16_to_cpu(descriptor->num_device_options);
+	for (i = 0; i < num_options; i++) {
+		u16 option_length = be16_to_cpu(dev_opt->option_length);
+		u16 option_id = be16_to_cpu(dev_opt->option_id);
+		struct gve_device_option *next_opt;
+
+		next_opt = gve_get_next_option(descriptor, dev_opt);
+		if (!next_opt) {
+			dev_err(&priv->dev->dev,
+				"options exceed device_descriptor's total length.\n");
+			err = -EINVAL;
+			goto free_device_descriptor;
+		}
+
+		switch (option_id) {
+		case GVE_DEV_OPT_ID_RAW_ADDRESSING:
+			/* If the length or feature mask doesn't match,
+			 * continue without enabling the feature.
+			 */
+			if (option_length != GVE_DEV_OPT_LEN_RAW_ADDRESSING ||
+			    dev_opt->feat_mask !=
+			    cpu_to_be32(GVE_DEV_OPT_FEAT_MASK_RAW_ADDRESSING)) {
+				dev_warn(&priv->pdev->dev,
+					 "Raw addressing option error:\n"
+					 " Expected: length=%d, feature_mask=%x.\n"
+					 " Actual: length=%d, feature_mask=%x.\n",
+					 GVE_DEV_OPT_LEN_RAW_ADDRESSING,
+					 cpu_to_be32(GVE_DEV_OPT_FEAT_MASK_RAW_ADDRESSING),
+					 option_length, dev_opt->feat_mask);
+				priv->raw_addressing = false;
+			} else {
+				dev_info(&priv->pdev->dev,
+					 "Raw addressing device option enabled.\n");
+				priv->raw_addressing = true;
+			}
+			break;
+		default:
+			/* If we don't recognize the option just continue
+			 * without doing anything.
+			 */
+			dev_dbg(&priv->pdev->dev,
+				"Unrecognized device option 0x%hx not enabled.\n",
+				option_id);
+			break;
+		}
+		dev_opt = next_opt;
+	}
 
 free_device_descriptor:
 	dma_free_coherent(&priv->pdev->dev, sizeof(*descriptor), descriptor,
diff --git a/drivers/net/ethernet/google/gve/gve_adminq.h b/drivers/net/ethernet/google/gve/gve_adminq.h
index 281de8326bc5..af5f586167bd 100644
--- a/drivers/net/ethernet/google/gve/gve_adminq.h
+++ b/drivers/net/ethernet/google/gve/gve_adminq.h
@@ -79,12 +79,17 @@ struct gve_device_descriptor {
 
 static_assert(sizeof(struct gve_device_descriptor) == 40);
 
-struct device_option {
-	__be32 option_id;
-	__be32 option_length;
+struct gve_device_option {
+	__be16 option_id;
+	__be16 option_length;
+	__be32 feat_mask;
 };
 
-static_assert(sizeof(struct device_option) == 8);
+static_assert(sizeof(struct gve_device_option) == 8);
+
+#define GVE_DEV_OPT_ID_RAW_ADDRESSING 0x1
+#define GVE_DEV_OPT_LEN_RAW_ADDRESSING 0x0
+#define GVE_DEV_OPT_FEAT_MASK_RAW_ADDRESSING 0x0
 
 struct gve_adminq_configure_device_resources {
 	__be64 counter_array;
@@ -111,6 +116,8 @@ struct gve_adminq_unregister_page_list {
 
 static_assert(sizeof(struct gve_adminq_unregister_page_list) == 4);
 
+#define GVE_RAW_ADDRESSING_QPL_ID 0xFFFFFFFF
+
 struct gve_adminq_create_tx_queue {
 	__be32 queue_id;
 	__be32 reserved;
diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
index 48a433154ce0..70685c10db0e 100644
--- a/drivers/net/ethernet/google/gve/gve_main.c
+++ b/drivers/net/ethernet/google/gve/gve_main.c
@@ -678,6 +678,10 @@ static int gve_alloc_qpls(struct gve_priv *priv)
 	int i, j;
 	int err;
 
+	/* Raw addressing means no QPLs */
+	if (priv->raw_addressing)
+		return 0;
+
 	priv->qpls = kvzalloc(num_qpls * sizeof(*priv->qpls), GFP_KERNEL);
 	if (!priv->qpls)
 		return -ENOMEM;
@@ -718,6 +722,10 @@ static void gve_free_qpls(struct gve_priv *priv)
 	int num_qpls = gve_num_tx_qpls(priv) + gve_num_rx_qpls(priv);
 	int i;
 
+	/* Raw addressing means no QPLs */
+	if (priv->raw_addressing)
+		return;
+
 	kvfree(priv->qpl_cfg.qpl_id_map);
 
 	for (i = 0; i < num_qpls; i++)
@@ -1078,6 +1086,7 @@ static int gve_init_priv(struct gve_priv *priv, bool skip_describe_device)
 	if (skip_describe_device)
 		goto setup_device;
 
+	priv->raw_addressing = false;
 	/* Get the initial information we need from the device */
 	err = gve_adminq_describe_device(priv);
 	if (err) {

From patchwork Mon Nov 9 23:36:58 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Awogbemila
X-Patchwork-Id: 322023
Date: Mon, 9 Nov 2020 15:36:58 -0800
In-Reply-To: <20201109233659.1953461-1-awogbemila@google.com>
Message-Id: <20201109233659.1953461-4-awogbemila@google.com>
References: <20201109233659.1953461-1-awogbemila@google.com>
X-Mailer: git-send-email 2.29.2.222.g5d2a92d10f8-goog
Subject: [PATCH net-next v6 3/4] gve: Rx Buffer Recycling
From: David Awogbemila
To: netdev@vger.kernel.org
Cc: David Awogbemila
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

This patch lets the driver reuse buffers that have been freed by the
networking stack.

In the raw addressing case, this allows the driver to avoid allocating new
buffers.
In the qpl case, the driver can avoid copies.

Signed-off-by: David Awogbemila
---
 drivers/net/ethernet/google/gve/gve.h    |  10 +-
 drivers/net/ethernet/google/gve/gve_rx.c | 196 +++++++++++++++--------
 2 files changed, 131 insertions(+), 75 deletions(-)

diff --git a/drivers/net/ethernet/google/gve/gve.h b/drivers/net/ethernet/google/gve/gve.h
index a8c589dd14e4..9dcf9fd8d128 100644
--- a/drivers/net/ethernet/google/gve/gve.h
+++ b/drivers/net/ethernet/google/gve/gve.h
@@ -50,6 +50,7 @@ struct gve_rx_slot_page_info {
 	struct page *page;
 	void *page_address;
 	u32 page_offset; /* offset to write to in page */
+	bool can_flip;
 };
 
 /* A list of pages registered with the device during setup and used by a queue
@@ -500,15 +501,6 @@ static inline enum dma_data_direction gve_qpl_dma_dir(struct gve_priv *priv,
 		return DMA_FROM_DEVICE;
 }
 
-/* Returns true if the max mtu allows page recycling */
-static inline bool gve_can_recycle_pages(struct net_device *dev)
-{
-	/* We can't recycle the pages if we can't fit a packet into half a
-	 * page.
-	 */
-	return dev->max_mtu <= PAGE_SIZE / 2;
-}
-
 /* buffers */
 int gve_alloc_page(struct gve_priv *priv, struct device *dev,
 		   struct page **page, dma_addr_t *dma,
diff --git a/drivers/net/ethernet/google/gve/gve_rx.c b/drivers/net/ethernet/google/gve/gve_rx.c
index 49646caf930c..ff28581f4427 100644
--- a/drivers/net/ethernet/google/gve/gve_rx.c
+++ b/drivers/net/ethernet/google/gve/gve_rx.c
@@ -287,8 +287,7 @@ static enum pkt_hash_types gve_rss_type(__be16 pkt_flags)
 	return PKT_HASH_TYPE_L2;
 }
 
-static struct sk_buff *gve_rx_copy(struct gve_rx_ring *rx,
-				   struct net_device *dev,
+static struct sk_buff *gve_rx_copy(struct net_device *dev,
 				   struct napi_struct *napi,
 				   struct gve_rx_slot_page_info *page_info,
 				   u16 len)
@@ -306,10 +305,6 @@ static struct sk_buff *gve_rx_copy(struct gve_rx_ring *rx,
 
 	skb->protocol = eth_type_trans(skb, dev);
 
-	u64_stats_update_begin(&rx->statss);
-	rx->rx_copied_pkt++;
-	u64_stats_update_end(&rx->statss);
-
 	return skb;
 }
 
@@ -334,22 +329,90 @@ static void gve_rx_flip_buff(struct gve_rx_slot_page_info *page_info,
 {
 	u64 addr = be64_to_cpu(data_ring->addr);
 
+	/* "flip" to other packet buffer on this page */
 	page_info->page_offset ^= PAGE_SIZE / 2;
 	addr ^= PAGE_SIZE / 2;
 	data_ring->addr = cpu_to_be64(addr);
 }
 
+static bool gve_rx_can_flip_buffers(struct net_device *netdev)
+{
+#if PAGE_SIZE == 4096
+	/* We can't flip a buffer if we can't fit a packet
+	 * into half a page.
+	 */
+	return netdev->mtu + GVE_RX_PAD + ETH_HLEN <= PAGE_SIZE / 2;
+#else
+	/* PAGE_SIZE != 4096 - don't try to reuse */
+	return false;
+#endif
+}
+
+static int gve_rx_can_recycle_buffer(struct page *page)
+{
+	int pagecount = page_count(page);
+
+	/* This page is not being used by any SKBs - reuse */
+	if (pagecount == 1)
+		return 1;
+	/* This page is still being used by an SKB - we can't reuse */
+	else if (pagecount >= 2)
+		return 0;
+	WARN(pagecount < 1, "Pagecount should never be < 1");
+	return -1;
+}
+
 static struct sk_buff *
 gve_rx_raw_addressing(struct device *dev, struct net_device *netdev,
 		      struct gve_rx_slot_page_info *page_info, u16 len,
 		      struct napi_struct *napi,
-		      struct gve_rx_data_slot *data_slot)
+		      struct gve_rx_data_slot *data_slot, bool can_flip)
 {
-	struct sk_buff *skb = gve_rx_add_frags(napi, page_info, len);
+	struct sk_buff *skb;
 
+	skb = gve_rx_add_frags(napi, page_info, len);
 	if (!skb)
 		return NULL;
 
+	/* Optimistically stop the kernel from freeing the page by increasing
+	 * the page bias. We will check the refcount in refill to determine if
+	 * we need to alloc a new page.
+	 */
+	get_page(page_info->page);
+	page_info->can_flip = can_flip;
+
+	return skb;
+}
+
+static struct sk_buff *
+gve_rx_qpl(struct device *dev, struct net_device *netdev,
+	   struct gve_rx_ring *rx, struct gve_rx_slot_page_info *page_info,
+	   u16 len, struct napi_struct *napi,
+	   struct gve_rx_data_slot *data_slot, bool recycle)
+{
+	struct sk_buff *skb;
+
+	/* if raw_addressing mode is not enabled gvnic can only receive into
+	 * registered segments. If the buffer can't be recycled, our only
+	 * choice is to copy the data out of it so that we can return it to the
+	 * device.
+	 */
+	if (recycle) {
+		skb = gve_rx_add_frags(napi, page_info, len);
+		/* No point in recycling if we didn't get the skb */
+		if (skb) {
+			/* Make sure the networking stack can't free the page */
+			get_page(page_info->page);
+			gve_rx_flip_buff(page_info, data_slot);
+		}
+	} else {
+		skb = gve_rx_copy(netdev, napi, page_info, len);
+		if (skb) {
+			u64_stats_update_begin(&rx->statss);
+			rx->rx_copied_pkt++;
+			u64_stats_update_end(&rx->statss);
+		}
+	}
 	return skb;
 }
 
@@ -363,7 +426,6 @@ static bool gve_rx(struct gve_rx_ring *rx, struct gve_rx_desc *rx_desc,
 	struct gve_rx_data_slot *data_slot;
 	struct sk_buff *skb = NULL;
 	dma_addr_t page_bus;
-	int pagecount;
 	u16 len;
 
 	/* drop this packet */
@@ -384,64 +446,37 @@ static bool gve_rx(struct gve_rx_ring *rx, struct gve_rx_desc *rx_desc,
 	dma_sync_single_for_cpu(&priv->pdev->dev, page_bus,
 				PAGE_SIZE, DMA_FROM_DEVICE);
 
-	if (PAGE_SIZE == 4096) {
-		if (len <= priv->rx_copybreak) {
-			/* Just copy small packets */
-			skb = gve_rx_copy(rx, dev, napi, page_info, len);
-			u64_stats_update_begin(&rx->statss);
-			rx->rx_copybreak_pkt++;
-			u64_stats_update_end(&rx->statss);
-			goto have_skb;
+	if (len <= priv->rx_copybreak) {
+		/* Just copy small packets */
+		skb = gve_rx_copy(dev, napi, page_info, len);
+		u64_stats_update_begin(&rx->statss);
+		rx->rx_copied_pkt++;
+		rx->rx_copybreak_pkt++;
+		u64_stats_update_end(&rx->statss);
+	} else {
+		bool can_flip = gve_rx_can_flip_buffers(dev);
+		int recycle = 0;
+
+		if (can_flip) {
+			recycle = gve_rx_can_recycle_buffer(page_info->page);
+			if (recycle < 0) {
+				if (!rx->data.raw_addressing)
+					gve_schedule_reset(priv);
+				return false;
+			}
 		}
 		if (rx->data.raw_addressing) {
 			skb = gve_rx_raw_addressing(&priv->pdev->dev, dev,
 						    page_info, len, napi,
-						    data_slot);
-			goto have_skb;
-		}
-		if (unlikely(!gve_can_recycle_pages(dev))) {
-			skb = gve_rx_copy(rx, dev, napi, page_info, len);
-			goto have_skb;
-		}
-		pagecount = page_count(page_info->page);
-		if (pagecount == 1) {
-			/* No part of this page is used by any SKBs; we attach
-			 * the page fragment to a new SKB and pass it up the
-			 * stack.
-			 */
-			skb = gve_rx_add_frags(napi, page_info, len);
-			if (!skb) {
-				u64_stats_update_begin(&rx->statss);
-				rx->rx_skb_alloc_fail++;
-				u64_stats_update_end(&rx->statss);
-				return false;
-			}
-			/* Make sure the kernel stack can't release the page */
-			get_page(page_info->page);
-			/* "flip" to other packet buffer on this page */
-			gve_rx_flip_buff(page_info, &rx->data.data_ring[idx]);
-		} else if (pagecount >= 2) {
-			/* We have previously passed the other half of this
-			 * page up the stack, but it has not yet been freed.
-			 */
-			skb = gve_rx_copy(rx, dev, napi, page_info, len);
+						    data_slot,
+						    can_flip && recycle);
 		} else {
-			WARN(pagecount < 1, "Pagecount should never be < 1");
-			return false;
+			skb = gve_rx_qpl(&priv->pdev->dev, dev, rx,
+					 page_info, len, napi, data_slot,
+					 can_flip && recycle);
 		}
-	} else {
-		if (rx->data.raw_addressing)
-			skb = gve_rx_raw_addressing(&priv->pdev->dev, dev,
-						    page_info, len, napi,
-						    data_slot);
-		else
-			skb = gve_rx_copy(rx, dev, napi, page_info, len);
 	}
 
-have_skb:
-	/* We didn't manage to allocate an skb but we haven't had any
-	 * reset worthy failures.
-	 */
 	if (!skb) {
 		u64_stats_update_begin(&rx->statss);
 		rx->rx_skb_alloc_fail++;
@@ -494,16 +529,45 @@ static bool gve_rx_refill_buffers(struct gve_priv *priv, struct gve_rx_ring *rx)
 
 	while (empty || ((fill_cnt & rx->mask) != (rx->cnt & rx->mask))) {
 		struct gve_rx_slot_page_info *page_info;
-		struct device *dev = &priv->pdev->dev;
-		struct gve_rx_data_slot *data_slot;
 		u32 idx = fill_cnt & rx->mask;
 
 		page_info = &rx->data.page_info[idx];
-		data_slot = &rx->data.data_ring[idx];
-		gve_rx_free_buffer(dev, page_info, data_slot);
-		page_info->page = NULL;
-		if (gve_rx_alloc_buffer(priv, dev, page_info, data_slot, rx))
-			break;
+		if (page_info->can_flip) {
+			/* The other half of the page is free because it was
+			 * free when we processed the descriptor. Flip to it.
+			 */
+			struct gve_rx_data_slot *data_slot =
+						&rx->data.data_ring[idx];
+
+			gve_rx_flip_buff(page_info, data_slot);
+			page_info->can_flip = false;
+		} else {
+			/* It is possible that the networking stack has already
+			 * finished processing all outstanding packets in the buffer
+			 * and it can be reused.
+			 * Flipping is unnecessary here - if the networking stack still
+			 * owns half the page it is impossible to tell which half. Either
+			 * the whole page is free or it needs to be replaced.
+			 */
+			int recycle = gve_rx_can_recycle_buffer(page_info->page);
+
+			if (recycle < 0) {
+				if (!rx->data.raw_addressing)
+					gve_schedule_reset(priv);
+				return false;
+			}
+			if (!recycle) {
+				/* We can't reuse the buffer - alloc a new one*/
+				struct gve_rx_data_slot *data_slot =
+						&rx->data.data_ring[idx];
+				struct device *dev = &priv->pdev->dev;
+
+				gve_rx_free_buffer(dev, page_info, data_slot);
+				page_info->page = NULL;
+				if (gve_rx_alloc_buffer(priv, dev, page_info, data_slot, rx))
+					break;
+			}
+		}
 		empty = false;
 		fill_cnt++;
 	}
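
Illustration only, not part of the patch: a small user-space sketch of the
three decisions the Rx recycling path combines, assuming a 4096-byte page
split into two half-page buffers. The names flip_offset, can_recycle and
can_flip are hypothetical, a plain integer stands in for page_count(), and
the pad and Ethernet header sizes are passed in as parameters because their
values are not shown in this diff.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size, as in the #if above */

/* Toggle between the two half-page buffers, as gve_rx_flip_buff() does. */
static uint32_t flip_offset(uint32_t page_offset)
{
	return page_offset ^ (SKETCH_PAGE_SIZE / 2);
}

/*
 * Mirror the three-way result of gve_rx_can_recycle_buffer():
 *   1  only the driver holds a reference, so the page can be reused
 *   0  an skb still references the page, so it cannot be reused yet
 *  -1  the count is invalid (the driver warns and may schedule a reset)
 */
static int can_recycle(int refcount)
{
	if (refcount == 1)
		return 1;
	if (refcount >= 2)
		return 0;
	return -1;
}

/* A buffer can only be flipped if a full frame fits in half a page. */
static bool can_flip(unsigned int mtu, unsigned int pad, unsigned int eth_hlen)
{
	return mtu + pad + eth_hlen <= SKETCH_PAGE_SIZE / 2;
}
```

Because the flip is an XOR with half the page size, applying it twice
returns the original offset, so no per-slot record of which half is
currently in use is needed.
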