From patchwork Wed Jun 16 06:36:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "huangguangbin \(A\)" X-Patchwork-Id: 461925 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3370DC48BE5 for ; Wed, 16 Jun 2021 06:39:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1CB0C61375 for ; Wed, 16 Jun 2021 06:39:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231689AbhFPGln (ORCPT ); Wed, 16 Jun 2021 02:41:43 -0400 Received: from szxga02-in.huawei.com ([45.249.212.188]:7453 "EHLO szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231132AbhFPGlh (ORCPT ); Wed, 16 Jun 2021 02:41:37 -0400 Received: from dggemv704-chm.china.huawei.com (unknown [172.30.72.53]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4G4b7Q3kNWzZhbw; Wed, 16 Jun 2021 14:36:34 +0800 (CST) Received: from dggemi759-chm.china.huawei.com (10.1.198.145) by dggemv704-chm.china.huawei.com (10.3.19.47) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.1.2176.2; Wed, 16 Jun 2021 14:39:29 +0800 Received: from localhost.localdomain (10.67.165.24) by dggemi759-chm.china.huawei.com (10.1.198.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Wed, 16 Jun 2021 14:39:29 +0800 From: Guangbin Huang To: , CC: , , , , Subject: [PATCH net-next 1/7] net: hns3: minor refactor related to desc_cb handling Date: Wed, 16 Jun 2021 14:36:11 +0800 Message-ID: <1623825377-41948-2-git-send-email-huangguangbin2@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1623825377-41948-1-git-send-email-huangguangbin2@huawei.com> References: <1623825377-41948-1-git-send-email-huangguangbin2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.165.24] X-ClientProxiedBy: dggems704-chm.china.huawei.com (10.3.19.181) To dggemi759-chm.china.huawei.com (10.1.198.145) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Yunsheng Lin desc_cb is used to store mapping and freeing info for the corresponding desc, which is used in the cleaning process. There will be more desc_cb type coming up when supporting the tx bounce buffer, change desc_cb type to bit-wise value in order to reduce the desc_cb type checking operation in the data path. Also move the desc_cb type definition to hns3_enet.h because it is only used in hns3_enet.c, and declare a local variable desc_cb in hns3_clear_desc() to reduce lines of code. Signed-off-by: Yunsheng Lin Signed-off-by: Guangbin Huang --- drivers/net/ethernet/hisilicon/hns3/hnae3.h | 7 ----- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 40 +++++++++++-------------- drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 7 +++++ 3 files changed, 25 insertions(+), 29 deletions(-) diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h index ba883b0a19f0..5822fc06f767 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h +++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h @@ -159,13 +159,6 @@ enum HNAE3_PF_CAP_BITS { #define ring_ptr_move_bw(ring, p) \ ((ring)->p = ((ring)->p - 1 + (ring)->desc_num) % (ring)->desc_num) -enum hns_desc_type { - DESC_TYPE_UNKNOWN, - DESC_TYPE_SKB, - DESC_TYPE_FRAGLIST_SKB, - DESC_TYPE_PAGE, -}; - struct hnae3_handle; struct hnae3_queue { diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index 9a45f3cde6a2..f03a7a962eb0 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -1413,7 +1413,7 @@ static int hns3_fill_skb_desc(struct hns3_enet_ring *ring, } static int hns3_fill_desc(struct hns3_enet_ring *ring, void *priv, - unsigned int size, enum hns_desc_type type) + unsigned int size, unsigned int type) { #define HNS3_LIKELY_BD_NUM 1 @@ -1425,8 +1425,7 @@ static int hns3_fill_desc(struct hns3_enet_ring *ring, void *priv, int k, sizeoflast; dma_addr_t dma; - if (type == DESC_TYPE_FRAGLIST_SKB || - type == DESC_TYPE_SKB) { + if (type & (DESC_TYPE_FRAGLIST_SKB | DESC_TYPE_SKB)) { struct sk_buff *skb = (struct sk_buff *)priv; dma = dma_map_single(dev, skb->data, size, DMA_TO_DEVICE); @@ -1704,6 +1703,7 @@ static void hns3_clear_desc(struct hns3_enet_ring *ring, int next_to_use_orig) for (i = 0; i < ring->desc_num; i++) { struct hns3_desc *desc = &ring->desc[ring->next_to_use]; + struct hns3_desc_cb *desc_cb; memset(desc, 0, sizeof(*desc)); @@ -1714,31 +1714,27 @@ static void hns3_clear_desc(struct hns3_enet_ring *ring, int next_to_use_orig) /* rollback one */ ring_ptr_move_bw(ring, next_to_use); - if (!ring->desc_cb[ring->next_to_use].dma) + desc_cb = &ring->desc_cb[ring->next_to_use]; + + if (!desc_cb->dma) continue; /* unmap the descriptor dma address */ - if (ring->desc_cb[ring->next_to_use].type == DESC_TYPE_SKB || - ring->desc_cb[ring->next_to_use].type == - DESC_TYPE_FRAGLIST_SKB) - dma_unmap_single(dev, - ring->desc_cb[ring->next_to_use].dma, - ring->desc_cb[ring->next_to_use].length, - DMA_TO_DEVICE); - else if (ring->desc_cb[ring->next_to_use].length) - dma_unmap_page(dev, - ring->desc_cb[ring->next_to_use].dma, - ring->desc_cb[ring->next_to_use].length, + if (desc_cb->type & (DESC_TYPE_SKB | DESC_TYPE_FRAGLIST_SKB)) + dma_unmap_single(dev, desc_cb->dma, desc_cb->length, + DMA_TO_DEVICE); + else if (desc_cb->length) + dma_unmap_page(dev, desc_cb->dma, desc_cb->length, DMA_TO_DEVICE); - ring->desc_cb[ring->next_to_use].length = 0; - ring->desc_cb[ring->next_to_use].dma = 0; - ring->desc_cb[ring->next_to_use].type = DESC_TYPE_UNKNOWN; + desc_cb->length = 0; + desc_cb->dma = 0; + desc_cb->type = DESC_TYPE_UNKNOWN; } } static int hns3_fill_skb_to_desc(struct hns3_enet_ring *ring, - struct sk_buff *skb, enum hns_desc_type type) + struct sk_buff *skb, unsigned int type) { unsigned int size = skb_headlen(skb); struct sk_buff *frag_skb; @@ -2859,7 +2855,7 @@ static int hns3_alloc_buffer(struct hns3_enet_ring *ring, static void hns3_free_buffer(struct hns3_enet_ring *ring, struct hns3_desc_cb *cb, int budget) { - if (cb->type == DESC_TYPE_SKB) + if (cb->type & DESC_TYPE_SKB) napi_consume_skb(cb->priv, budget); else if (!HNAE3_IS_TX_RING(ring) && cb->pagecnt_bias) __page_frag_cache_drain(cb->priv, cb->pagecnt_bias); @@ -2880,7 +2876,7 @@ static int hns3_map_buffer(struct hns3_enet_ring *ring, struct hns3_desc_cb *cb) static void hns3_unmap_buffer(struct hns3_enet_ring *ring, struct hns3_desc_cb *cb) { - if (cb->type == DESC_TYPE_SKB || cb->type == DESC_TYPE_FRAGLIST_SKB) + if (cb->type & (DESC_TYPE_SKB | DESC_TYPE_FRAGLIST_SKB)) dma_unmap_single(ring_to_dev(ring), cb->dma, cb->length, ring_to_dma_dir(ring)); else if (cb->length) @@ -3037,7 +3033,7 @@ static bool hns3_nic_reclaim_desc(struct hns3_enet_ring *ring, desc_cb = &ring->desc_cb[ntc]; - if (desc_cb->type == DESC_TYPE_SKB) { + if (desc_cb->type & DESC_TYPE_SKB) { (*pkts)++; (*bytes) += desc_cb->send_bytes; } diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h index 79821c7bdc16..9d18b9430b54 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h @@ -299,6 +299,13 @@ struct __packed hns3_desc { }; }; +enum hns3_desc_type { + DESC_TYPE_UNKNOWN = 0, + DESC_TYPE_SKB = 1 << 0, + DESC_TYPE_FRAGLIST_SKB = 1 << 1, + DESC_TYPE_PAGE = 1 << 2, +}; + struct hns3_desc_cb { dma_addr_t dma; /* dma address of this desc */ void *buf; /* cpu addr for a desc */ From patchwork Wed Jun 16 06:36:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "huangguangbin \(A\)" X-Patchwork-Id: 461924 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3D9F0C48BE6 for ; Wed, 16 Jun 2021 06:40:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 244AB613C2 for ; Wed, 16 Jun 2021 06:40:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231861AbhFPGlv (ORCPT ); Wed, 16 Jun 2021 02:41:51 -0400 Received: from szxga01-in.huawei.com ([45.249.212.187]:4802 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231400AbhFPGlj (ORCPT ); Wed, 16 Jun 2021 02:41:39 -0400 Received: from dggemv703-chm.china.huawei.com (unknown [172.30.72.56]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4G4b533rkFzWtL4; Wed, 16 Jun 2021 14:34:31 +0800 (CST) Received: from dggemi759-chm.china.huawei.com (10.1.198.145) by dggemv703-chm.china.huawei.com (10.3.19.46) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.1.2176.2; Wed, 16 Jun 2021 14:39:30 +0800 Received: from localhost.localdomain (10.67.165.24) by dggemi759-chm.china.huawei.com (10.1.198.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Wed, 16 Jun 2021 14:39:30 +0800 From: Guangbin Huang To: , CC: , , , , Subject: [PATCH net-next 4/7] net: hns3: add support to query tx spare buffer size for pf Date: Wed, 16 Jun 2021 14:36:14 +0800 Message-ID: <1623825377-41948-5-git-send-email-huangguangbin2@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1623825377-41948-1-git-send-email-huangguangbin2@huawei.com> References: <1623825377-41948-1-git-send-email-huangguangbin2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.165.24] X-ClientProxiedBy: dggems704-chm.china.huawei.com (10.3.19.181) To dggemi759-chm.china.huawei.com (10.1.198.145) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Huazhong Tan Add support to query tx spare buffer size from configuration file, and use this info to do spare buffer initialization when the module parameter 'tx_spare_buf_size' is not specified. Signed-off-by: Huazhong Tan Signed-off-by: Guangbin Huang --- drivers/net/ethernet/hisilicon/hns3/hnae3.h | 1 + drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 7 +++++-- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h | 2 ++ drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 14 ++++++++++++++ drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h | 2 ++ 5 files changed, 24 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h index 5822fc06f767..0b202f4def83 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h +++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h @@ -760,6 +760,7 @@ struct hnae3_knic_private_info { u16 rx_buf_len; u16 num_tx_desc; u16 num_rx_desc; + u32 tx_spare_buf_size; struct hnae3_tc_info tc_info; diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index e5466daac1c4..d86b3735aa9f 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -1005,13 +1005,16 @@ static void hns3_init_tx_spare_buffer(struct hns3_enet_ring *ring) { struct hns3_tx_spare *tx_spare; struct page *page; + u32 alloc_size; dma_addr_t dma; int order; - if (!tx_spare_buf_size) + alloc_size = tx_spare_buf_size ? tx_spare_buf_size : + ring->tqp->handle->kinfo.tx_spare_buf_size; + if (!alloc_size) return; - order = get_order(tx_spare_buf_size); + order = get_order(alloc_size); tx_spare = devm_kzalloc(ring_to_dev(ring), sizeof(*tx_spare), GFP_KERNEL); if (!tx_spare) { diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h index 51be76f1795e..a322dfeba5cf 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h @@ -542,6 +542,8 @@ struct hclge_pf_res_cmd { #define HCLGE_CFG_UMV_TBL_SPACE_M GENMASK(31, 16) #define HCLGE_CFG_PF_RSS_SIZE_S 0 #define HCLGE_CFG_PF_RSS_SIZE_M GENMASK(3, 0) +#define HCLGE_CFG_TX_SPARE_BUF_SIZE_S 4 +#define HCLGE_CFG_TX_SPARE_BUF_SIZE_M GENMASK(15, 4) #define HCLGE_CFG_CMD_CNT 4 diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index f6fdf93c8cad..f3e482ab3c71 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -1279,6 +1279,7 @@ static u32 hclge_get_max_speed(u16 speed_ability) static void hclge_parse_cfg(struct hclge_cfg *cfg, struct hclge_desc *desc) { +#define HCLGE_TX_SPARE_SIZE_UNIT 4096 #define SPEED_ABILITY_EXT_SHIFT 8 struct hclge_cfg_param_cmd *req; @@ -1358,6 +1359,15 @@ static void hclge_parse_cfg(struct hclge_cfg *cfg, struct hclge_desc *desc) cfg->pf_rss_size_max = cfg->pf_rss_size_max ? 1U << cfg->pf_rss_size_max : cfg->vf_rss_size_max; + + /* The unit of the tx spare buffer size queried from configuration + * file is HCLGE_TX_SPARE_SIZE_UNIT(4096) bytes, so a conversion is + * needed here. + */ + cfg->tx_spare_buf_size = hnae3_get_field(__le32_to_cpu(req->param[2]), + HCLGE_CFG_TX_SPARE_BUF_SIZE_M, + HCLGE_CFG_TX_SPARE_BUF_SIZE_S); + cfg->tx_spare_buf_size *= HCLGE_TX_SPARE_SIZE_UNIT; } /* hclge_get_cfg: query the static parameter from flash @@ -1539,6 +1549,7 @@ static int hclge_configure(struct hclge_dev *hdev) hdev->tc_max = cfg.tc_num; hdev->tm_info.hw_pfc_map = 0; hdev->wanted_umv_size = cfg.umv_space; + hdev->tx_spare_buf_size = cfg.tx_spare_buf_size; if (cfg.vlan_fliter_cap == HCLGE_VLAN_FLTR_CAN_MDF) set_bit(HNAE3_DEV_SUPPORT_VLAN_FLTR_MDF_B, ae_dev->caps); @@ -1736,6 +1747,7 @@ static int hclge_knic_setup(struct hclge_vport *vport, u16 num_tqps, kinfo->num_rx_desc = num_rx_desc; kinfo->rx_buf_len = hdev->rx_buf_len; + kinfo->tx_spare_buf_size = hdev->tx_spare_buf_size; kinfo->tqp = devm_kcalloc(&hdev->pdev->dev, num_tqps, sizeof(struct hnae3_queue *), GFP_KERNEL); @@ -11059,6 +11071,8 @@ static void hclge_info_show(struct hclge_dev *hdev) hdev->flag & HCLGE_FLAG_DCB_ENABLE ? "enable" : "disable"); dev_info(dev, "MQPRIO %s\n", hdev->flag & HCLGE_FLAG_MQPRIO_ENABLE ? "enable" : "disable"); + dev_info(dev, "Default tx spare buffer size: %u\n", + hdev->tx_spare_buf_size); dev_info(dev, "PF info end.\n"); } diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h index 02852738ce21..3d3352491dba 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h @@ -384,6 +384,7 @@ struct hclge_cfg { u8 mac_addr[ETH_ALEN]; u8 default_speed; u32 numa_node_map; + u32 tx_spare_buf_size; u16 speed_ability; u16 umv_space; }; @@ -848,6 +849,7 @@ struct hclge_dev { u16 alloc_rss_size; /* Allocated RSS task queue */ u16 vf_rss_size_max; /* HW defined VF max RSS task queue */ u16 pf_rss_size_max; /* HW defined PF max RSS task queue */ + u32 tx_spare_buf_size; /* HW defined TX spare buffer size */ u16 fdir_pf_filter_count; /* Num of guaranteed filters for this PF */ u16 num_alloc_vport; /* Num vports this driver supports */ From patchwork Wed Jun 16 06:36:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "huangguangbin \(A\)" X-Patchwork-Id: 461922 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 604CFC48BE5 for ; Wed, 16 Jun 2021 06:40:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 48542613B9 for ; Wed, 16 Jun 2021 06:40:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232100AbhFPGmb (ORCPT ); Wed, 16 Jun 2021 02:42:31 -0400 Received: from szxga03-in.huawei.com ([45.249.212.189]:7327 "EHLO szxga03-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231309AbhFPGli (ORCPT ); Wed, 16 Jun 2021 02:41:38 -0400 Received: from dggemv711-chm.china.huawei.com (unknown [172.30.72.54]) by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4G4b6C46MHz6y9v; Wed, 16 Jun 2021 14:35:31 +0800 (CST) Received: from dggemi759-chm.china.huawei.com (10.1.198.145) by dggemv711-chm.china.huawei.com (10.1.198.66) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.1.2176.2; Wed, 16 Jun 2021 14:39:30 +0800 Received: from localhost.localdomain (10.67.165.24) by dggemi759-chm.china.huawei.com (10.1.198.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Wed, 16 Jun 2021 14:39:30 +0800 From: Guangbin Huang To: , CC: , , , , Subject: [PATCH net-next 5/7] net: hns3: support dma_map_sg() for multi frags skb Date: Wed, 16 Jun 2021 14:36:15 +0800 Message-ID: <1623825377-41948-6-git-send-email-huangguangbin2@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1623825377-41948-1-git-send-email-huangguangbin2@huawei.com> References: <1623825377-41948-1-git-send-email-huangguangbin2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.165.24] X-ClientProxiedBy: dggems704-chm.china.huawei.com (10.3.19.181) To dggemi759-chm.china.huawei.com (10.1.198.145) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Yunsheng Lin Using the queue based tx buffer, it is also possible to allocate a sgl buffer, and use skb_to_sgvec() to convert the skb to the sgvec in order to support the dma_map_sg() to decreases the overhead of IOMMU mapping and unmapping. Firstly, it reduces the number of buffers. For example, a tcp skb may have a 66-byte header and 3 fragments of 4328, 32768, and 28064 bytes. With this patch, dma_map_sg() will combine them into two buffers, 66-bytes header and one 65160-bytes fragment by using IOMMU. Secondly, it reduces the number of dma mapping and unmapping. All the original 4 buffers are mapped only once rather than 4 times. The throughput improves above 10% when running single thread of iperf using TCP when IOMMU is in strict mode. Suggested-by: Barry Song Signed-off-by: Yunsheng Lin Signed-off-by: Guangbin Huang --- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 111 ++++++++++++++++++++- drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 4 + drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 3 + 3 files changed, 113 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index d86b3735aa9f..f60a344a6a9f 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -57,6 +57,15 @@ static unsigned int tx_spare_buf_size; module_param(tx_spare_buf_size, uint, 0400); MODULE_PARM_DESC(tx_spare_buf_size, "Size used to allocate tx spare buffer"); +static unsigned int tx_sgl = 1; +module_param(tx_sgl, uint, 0600); +MODULE_PARM_DESC(tx_sgl, "Minimum number of frags when using dma_map_sg() to optimize the IOMMU mapping"); + +#define HNS3_SGL_SIZE(nfrag) (sizeof(struct scatterlist) * (nfrag) + \ + sizeof(struct sg_table)) +#define HNS3_MAX_SGL_SIZE ALIGN(HNS3_SGL_SIZE(HNS3_MAX_TSO_BD_NUM),\ + dma_get_cache_alignment()) + #define DEFAULT_MSG_LEVEL (NETIF_MSG_PROBE | NETIF_MSG_LINK | \ NETIF_MSG_IFDOWN | NETIF_MSG_IFUP) @@ -1001,6 +1010,25 @@ static bool hns3_can_use_tx_bounce(struct hns3_enet_ring *ring, return true; } +static bool hns3_can_use_tx_sgl(struct hns3_enet_ring *ring, + struct sk_buff *skb, + u32 space) +{ + if (skb->len <= ring->tx_copybreak || !tx_sgl || + (!skb_has_frag_list(skb) && + skb_shinfo(skb)->nr_frags < tx_sgl)) + return false; + + if (space < HNS3_MAX_SGL_SIZE) { + u64_stats_update_begin(&ring->syncp); + ring->stats.tx_spare_full++; + u64_stats_update_end(&ring->syncp); + return false; + } + + return true; +} + static void hns3_init_tx_spare_buffer(struct hns3_enet_ring *ring) { struct hns3_tx_spare *tx_spare; @@ -1108,14 +1136,19 @@ static void hns3_tx_spare_reclaim_cb(struct hns3_enet_ring *ring, /* This tx spare buffer is only really reclaimed after calling * hns3_tx_spare_update(), so it is still safe to use the info in - * the tx buffer to do the dma sync after tx_spare->next_to_clean - * is moved forword. + * the tx buffer to do the dma sync or sg unmapping after + * tx_spare->next_to_clean is moved forword. */ if (cb->type & (DESC_TYPE_BOUNCE_HEAD | DESC_TYPE_BOUNCE_ALL)) { dma_addr_t dma = tx_spare->dma + ntc; dma_sync_single_for_cpu(ring_to_dev(ring), dma, len, DMA_TO_DEVICE); + } else { + struct sg_table *sgt = tx_spare->buf + ntc; + + dma_unmap_sg(ring_to_dev(ring), sgt->sgl, sgt->orig_nents, + DMA_TO_DEVICE); } } @@ -2058,6 +2091,65 @@ static int hns3_handle_tx_bounce(struct hns3_enet_ring *ring, return bd_num; } +static int hns3_handle_tx_sgl(struct hns3_enet_ring *ring, + struct sk_buff *skb) +{ + struct hns3_desc_cb *desc_cb = &ring->desc_cb[ring->next_to_use]; + u32 nfrag = skb_shinfo(skb)->nr_frags + 1; + struct sg_table *sgt; + int i, bd_num = 0; + dma_addr_t dma; + u32 cb_len; + int nents; + + if (skb_has_frag_list(skb)) + nfrag = HNS3_MAX_TSO_BD_NUM; + + /* hns3_can_use_tx_sgl() is called to ensure the below + * function can always return the tx buffer. + */ + sgt = hns3_tx_spare_alloc(ring, HNS3_SGL_SIZE(nfrag), + &dma, &cb_len); + + /* scatterlist follows by the sg table */ + sgt->sgl = (struct scatterlist *)(sgt + 1); + sg_init_table(sgt->sgl, nfrag); + nents = skb_to_sgvec(skb, sgt->sgl, 0, skb->len); + if (unlikely(nents < 0)) { + hns3_tx_spare_rollback(ring, cb_len); + u64_stats_update_begin(&ring->syncp); + ring->stats.skb2sgl_err++; + u64_stats_update_end(&ring->syncp); + return -ENOMEM; + } + + sgt->orig_nents = nents; + sgt->nents = dma_map_sg(ring_to_dev(ring), sgt->sgl, sgt->orig_nents, + DMA_TO_DEVICE); + if (unlikely(!sgt->nents)) { + hns3_tx_spare_rollback(ring, cb_len); + u64_stats_update_begin(&ring->syncp); + ring->stats.map_sg_err++; + u64_stats_update_end(&ring->syncp); + return -ENOMEM; + } + + desc_cb->priv = skb; + desc_cb->length = cb_len; + desc_cb->dma = dma; + desc_cb->type = DESC_TYPE_SGL_SKB; + + for (i = 0; i < sgt->nents; i++) + bd_num += hns3_fill_desc(ring, sg_dma_address(sgt->sgl + i), + sg_dma_len(sgt->sgl + i)); + + u64_stats_update_begin(&ring->syncp); + ring->stats.tx_sgl++; + u64_stats_update_end(&ring->syncp); + + return bd_num; +} + static int hns3_handle_desc_filling(struct hns3_enet_ring *ring, struct sk_buff *skb) { @@ -2068,6 +2160,9 @@ static int hns3_handle_desc_filling(struct hns3_enet_ring *ring, space = hns3_tx_spare_space(ring); + if (hns3_can_use_tx_sgl(ring, skb, space)) + return hns3_handle_tx_sgl(ring, skb); + if (hns3_can_use_tx_bounce(ring, skb, space)) return hns3_handle_tx_bounce(ring, skb); @@ -2324,6 +2419,8 @@ static void hns3_nic_get_stats64(struct net_device *netdev, tx_drop += ring->stats.over_max_recursion; tx_drop += ring->stats.hw_limitation; tx_drop += ring->stats.copy_bits_err; + tx_drop += ring->stats.skb2sgl_err; + tx_drop += ring->stats.map_sg_err; tx_errors += ring->stats.sw_err_cnt; tx_errors += ring->stats.tx_vlan_err; tx_errors += ring->stats.tx_l4_proto_err; @@ -2332,6 +2429,8 @@ static void hns3_nic_get_stats64(struct net_device *netdev, tx_errors += ring->stats.over_max_recursion; tx_errors += ring->stats.hw_limitation; tx_errors += ring->stats.copy_bits_err; + tx_errors += ring->stats.skb2sgl_err; + tx_errors += ring->stats.map_sg_err; } while (u64_stats_fetch_retry_irq(&ring->syncp, start)); /* fetch the rx stats */ @@ -3126,7 +3225,7 @@ static void hns3_free_buffer(struct hns3_enet_ring *ring, struct hns3_desc_cb *cb, int budget) { if (cb->type & (DESC_TYPE_SKB | DESC_TYPE_BOUNCE_HEAD | - DESC_TYPE_BOUNCE_ALL)) + DESC_TYPE_BOUNCE_ALL | DESC_TYPE_SGL_SKB)) napi_consume_skb(cb->priv, budget); else if (!HNAE3_IS_TX_RING(ring) && cb->pagecnt_bias) __page_frag_cache_drain(cb->priv, cb->pagecnt_bias); @@ -3153,7 +3252,8 @@ static void hns3_unmap_buffer(struct hns3_enet_ring *ring, else if ((cb->type & DESC_TYPE_PAGE) && cb->length) dma_unmap_page(ring_to_dev(ring), cb->dma, cb->length, ring_to_dma_dir(ring)); - else if (cb->type & (DESC_TYPE_BOUNCE_ALL | DESC_TYPE_BOUNCE_HEAD)) + else if (cb->type & (DESC_TYPE_BOUNCE_ALL | DESC_TYPE_BOUNCE_HEAD | + DESC_TYPE_SGL_SKB)) hns3_tx_spare_reclaim_cb(ring, cb); } @@ -3307,7 +3407,8 @@ static bool hns3_nic_reclaim_desc(struct hns3_enet_ring *ring, desc_cb = &ring->desc_cb[ntc]; if (desc_cb->type & (DESC_TYPE_SKB | DESC_TYPE_BOUNCE_ALL | - DESC_TYPE_BOUNCE_HEAD)) { + DESC_TYPE_BOUNCE_HEAD | + DESC_TYPE_SGL_SKB)) { (*pkts)++; (*bytes) += desc_cb->send_bytes; } diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h index 8d147c1dab2c..22ae291471aa 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h @@ -306,6 +306,7 @@ enum hns3_desc_type { DESC_TYPE_PAGE = 1 << 2, DESC_TYPE_BOUNCE_ALL = 1 << 3, DESC_TYPE_BOUNCE_HEAD = 1 << 4, + DESC_TYPE_SGL_SKB = 1 << 5, }; struct hns3_desc_cb { @@ -410,6 +411,9 @@ struct ring_stats { u64 tx_bounce; u64 tx_spare_full; u64 copy_bits_err; + u64 tx_sgl; + u64 skb2sgl_err; + u64 map_sg_err; }; struct { u64 rx_pkts; diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c index f306de16d73f..d7852716aaad 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c @@ -49,6 +49,9 @@ static const struct hns3_stats hns3_txq_stats[] = { HNS3_TQP_STAT("bounce", tx_bounce), HNS3_TQP_STAT("spare_full", tx_spare_full), HNS3_TQP_STAT("copy_bits_err", copy_bits_err), + HNS3_TQP_STAT("sgl", tx_sgl), + HNS3_TQP_STAT("skb2sgl_err", skb2sgl_err), + HNS3_TQP_STAT("map_sg_err", map_sg_err), }; #define HNS3_TXQ_STATS_COUNT ARRAY_SIZE(hns3_txq_stats) From patchwork Wed Jun 16 06:36:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "huangguangbin \(A\)" X-Patchwork-Id: 461923 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BEB78C48BE5 for ; Wed, 16 Jun 2021 06:40:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AA92E613BF for ; Wed, 16 Jun 2021 06:40:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232047AbhFPGmX (ORCPT ); Wed, 16 Jun 2021 02:42:23 -0400 Received: from szxga01-in.huawei.com ([45.249.212.187]:4800 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231383AbhFPGlj (ORCPT ); Wed, 16 Jun 2021 02:41:39 -0400 Received: from dggemv703-chm.china.huawei.com (unknown [172.30.72.57]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4G4b533xc8zWtQx; Wed, 16 Jun 2021 14:34:31 +0800 (CST) Received: from dggemi759-chm.china.huawei.com (10.1.198.145) by dggemv703-chm.china.huawei.com (10.3.19.46) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.1.2176.2; Wed, 16 Jun 2021 14:39:30 +0800 Received: from localhost.localdomain (10.67.165.24) by dggemi759-chm.china.huawei.com (10.1.198.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Wed, 16 Jun 2021 14:39:30 +0800 From: Guangbin Huang To: , CC: , , , , Subject: [PATCH net-next 7/7] net: hns3: use bounce buffer when rx page can not be reused Date: Wed, 16 Jun 2021 14:36:17 +0800 Message-ID: <1623825377-41948-8-git-send-email-huangguangbin2@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1623825377-41948-1-git-send-email-huangguangbin2@huawei.com> References: <1623825377-41948-1-git-send-email-huangguangbin2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.165.24] X-ClientProxiedBy: dggems704-chm.china.huawei.com (10.3.19.181) To dggemi759-chm.china.huawei.com (10.1.198.145) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Yunsheng Lin Currently rx page will be reused to receive future packet when the stack releases the previous skb quickly. If the old page can not be reused, a new page will be allocated and mapped, which comsumes a lot of cpu when IOMMU is in the strict mode, especially when the application and irq/NAPI happens to run on the same cpu. So allocate a new frag to memcpy the data to avoid the costly IOMMU unmapping/mapping operation, and add "frag_alloc_err" and "frag_alloc" stats in "ethtool -S ethX" cmd. The throughput improves above 50% when running single thread of iperf using TCP when IOMMU is in strict mode and iperf shares the same cpu with irq/NAPI(rx_copybreak = 2048 and mtu = 1500). Signed-off-by: Yunsheng Lin Signed-off-by: Guangbin Huang --- drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c | 2 ++ drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 23 ++++++++++++++++++++++ drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 4 ++++ drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 12 +++++++++++ 4 files changed, 41 insertions(+) diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c index a24a75c47cad..34b6cd904a1a 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c @@ -450,6 +450,7 @@ static const struct hns3_dbg_item rx_queue_info_items[] = { { "HEAD", 2 }, { "FBDNUM", 2 }, { "PKTNUM", 2 }, + { "COPYBREAK", 2 }, { "RING_EN", 2 }, { "RX_RING_EN", 2 }, { "BASE_ADDR", 10 }, @@ -481,6 +482,7 @@ static void hns3_dump_rx_queue_info(struct hns3_enet_ring *ring, sprintf(result[j++], "%6u", readl_relaxed(ring->tqp->io_base + HNS3_RING_RX_RING_PKTNUM_RECORD_REG)); + sprintf(result[j++], "%9u", ring->rx_copybreak); sprintf(result[j++], "%7s", readl_relaxed(ring->tqp->io_base + HNS3_RING_EN_REG) ? "on" : "off"); diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index 98e8a548edb8..51bbf5f760c5 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -3552,6 +3552,28 @@ static void hns3_nic_reuse_page(struct sk_buff *skb, int i, hns3_page_size(ring)) { desc_cb->page_offset += truesize; desc_cb->reuse_flag = 1; + } else if (frag_size <= ring->rx_copybreak) { + void *frag = napi_alloc_frag(frag_size); + + if (unlikely(!frag)) { + u64_stats_update_begin(&ring->syncp); + ring->stats.frag_alloc_err++; + u64_stats_update_end(&ring->syncp); + + hns3_rl_err(ring_to_netdev(ring), + "failed to allocate rx frag\n"); + goto out; + } + + desc_cb->reuse_flag = 1; + memcpy(frag, desc_cb->buf + frag_offset, frag_size); + skb_add_rx_frag(skb, i, virt_to_page(frag), + offset_in_page(frag), frag_size, frag_size); + + u64_stats_update_begin(&ring->syncp); + ring->stats.frag_alloc++; + u64_stats_update_end(&ring->syncp); + return; } out: @@ -4620,6 +4642,7 @@ static void hns3_ring_get_cfg(struct hnae3_queue *q, struct hns3_nic_priv *priv, ring = &priv->ring[q->tqp_index + queue_num]; desc_num = priv->ae_handle->kinfo.num_rx_desc; ring->queue_index = q->tqp_index; + ring->rx_copybreak = priv->rx_copybreak; } hnae3_set_bit(ring->flag, HNAE3_RING_TYPE_B, ring_type); diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h index 22ae291471aa..15af3d93857b 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h @@ -427,6 +427,8 @@ struct ring_stats { u64 csum_complete; u64 rx_multicast; u64 non_reuse_pg; + u64 frag_alloc_err; + u64 frag_alloc; }; __le16 csum; }; @@ -478,6 +480,7 @@ struct hns3_enet_ring { /* for Rx ring */ struct { u32 pull_len; /* memcpy len for current rx packet */ + u32 rx_copybreak; u32 frag_num; /* first buffer address for current packet */ unsigned char *va; @@ -569,6 +572,7 @@ struct hns3_nic_priv { struct hns3_enet_coalesce tx_coal; struct hns3_enet_coalesce rx_coal; u32 tx_copybreak; + u32 rx_copybreak; }; union l3_hdr_info { diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c index d7852716aaad..82061ab6930f 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c @@ -71,6 +71,8 @@ static const struct hns3_stats hns3_rxq_stats[] = { HNS3_TQP_STAT("csum_complete", csum_complete), HNS3_TQP_STAT("multicast", rx_multicast), HNS3_TQP_STAT("non_reuse_pg", non_reuse_pg), + HNS3_TQP_STAT("frag_alloc_err", frag_alloc_err), + HNS3_TQP_STAT("frag_alloc", frag_alloc), }; #define HNS3_PRIV_FLAGS_LEN ARRAY_SIZE(hns3_priv_flags) @@ -1610,6 +1612,9 @@ static int hns3_get_tunable(struct net_device *netdev, /* all the tx rings have the same tx_copybreak */ *(u32 *)data = priv->tx_copybreak; break; + case ETHTOOL_RX_COPYBREAK: + *(u32 *)data = priv->rx_copybreak; + break; default: ret = -EOPNOTSUPP; break; @@ -1634,6 +1639,13 @@ static int hns3_set_tunable(struct net_device *netdev, priv->ring[i].tx_copybreak = priv->tx_copybreak; break; + case ETHTOOL_RX_COPYBREAK: + priv->rx_copybreak = *(u32 *)data; + + for (i = h->kinfo.num_tqps; i < h->kinfo.num_tqps * 2; i++) + priv->ring[i].rx_copybreak = priv->rx_copybreak; + + break; default: ret = -EOPNOTSUPP; break;