From patchwork Wed Feb 10 16:28:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 380722 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 77742C433DB for ; Wed, 10 Feb 2021 16:30:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 341E364E7A for ; Wed, 10 Feb 2021 16:30:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232516AbhBJQaP (ORCPT ); Wed, 10 Feb 2021 11:30:15 -0500 Received: from mail-40136.protonmail.ch ([185.70.40.136]:31166 "EHLO mail-40136.protonmail.ch" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232389AbhBJQ31 (ORCPT ); Wed, 10 Feb 2021 11:29:27 -0500 Date: Wed, 10 Feb 2021 16:28:35 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pm.me; s=protonmail; t=1612974524; bh=vxbPqVTP+yyafvrJZY9PHRZqLx4yQzye7I3rd2J7vYA=; h=Date:To:From:Cc:Reply-To:Subject:In-Reply-To:References:From; b=EbxFHNC20eEJDuv+NCounvy7JGbNTqrrASRJCqFaH1NPfAdd9noeNXRGDnYZx4zix LF2yc7A8pi/w+kp9fDgDgwuciz7cgX9HwseajCwWCrt8eDUfHDvZmM7LCYaKSZHeOS nM+bBLkkWw7zo5ltbIj10shwfB6izlJYG9wPgKzVxVpmUhq9POv+bfW5BJQRcba5Bg iIwWu06b6gEg5C3BB1w2E/e5nYRJyi0DOpXDskih0YMv8+9iaK8hEa8ogIa6Defy+D JaMWRYe9RHaT7EdoG8uN142USIrOri36vpQpk/Zh2bvIwUOaBIRApQWjaen6vAJ/SV n+Ww9u4hU6doQ== To: "David S. 
Miller" , Jakub Kicinski From: Alexander Lobakin Cc: Jonathan Lemon , Eric Dumazet , Dmitry Vyukov , Willem de Bruijn , Alexander Lobakin , Randy Dunlap , Kevin Hao , Pablo Neira Ayuso , Jakub Sitnicki , Marco Elver , Dexuan Cui , Paolo Abeni , Jesper Dangaard Brouer , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Taehee Yoo , Cong Wang , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Miaohe Lin , Guillaume Nault , Yonghong Song , zhudi , Michal Kubecek , Marcelo Ricardo Leitner , Dmitry Safonov <0x7f454c46@gmail.com>, Yang Yingliang , Florian Westphal , Edward Cree , linux-kernel@vger.kernel.org, netdev@vger.kernel.org Reply-To: Alexander Lobakin Subject: [PATCH v4 net-next 01/11] skbuff: move __alloc_skb() next to the other skb allocation functions Message-ID: <20210210162732.80467-2-alobakin@pm.me> In-Reply-To: <20210210162732.80467-1-alobakin@pm.me> References: <20210210162732.80467-1-alobakin@pm.me> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org In preparation for reusing several functions in all three skb allocation variants, move __alloc_skb() next to __netdev_alloc_skb() and __napi_alloc_skb(). No functional changes. Signed-off-by: Alexander Lobakin --- net/core/skbuff.c | 284 +++++++++++++++++++++++----------------------- 1 file changed, 142 insertions(+), 142 deletions(-) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index d380c7b5a12d..a0f846872d19 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -119,148 +119,6 @@ static void skb_under_panic(struct sk_buff *skb, unsigned int sz, void *addr) skb_panic(skb, sz, addr, __func__); } -/* - * kmalloc_reserve is a wrapper around kmalloc_node_track_caller that tells - * the caller if emergency pfmemalloc reserves are being used. If it is and - * the socket is later found to be SOCK_MEMALLOC then PFMEMALLOC reserves - * may be used.
Otherwise, the packet data may be discarded until enough - * memory is free - */ -#define kmalloc_reserve(size, gfp, node, pfmemalloc) \ - __kmalloc_reserve(size, gfp, node, _RET_IP_, pfmemalloc) - -static void *__kmalloc_reserve(size_t size, gfp_t flags, int node, - unsigned long ip, bool *pfmemalloc) -{ - void *obj; - bool ret_pfmemalloc = false; - - /* - * Try a regular allocation, when that fails and we're not entitled - * to the reserves, fail. - */ - obj = kmalloc_node_track_caller(size, - flags | __GFP_NOMEMALLOC | __GFP_NOWARN, - node); - if (obj || !(gfp_pfmemalloc_allowed(flags))) - goto out; - - /* Try again but now we are using pfmemalloc reserves */ - ret_pfmemalloc = true; - obj = kmalloc_node_track_caller(size, flags, node); - -out: - if (pfmemalloc) - *pfmemalloc = ret_pfmemalloc; - - return obj; -} - -/* Allocate a new skbuff. We do this ourselves so we can fill in a few - * 'private' fields and also do memory statistics to find all the - * [BEEP] leaks. - * - */ - -/** - * __alloc_skb - allocate a network buffer - * @size: size to allocate - * @gfp_mask: allocation mask - * @flags: If SKB_ALLOC_FCLONE is set, allocate from fclone cache - * instead of head cache and allocate a cloned (child) skb. - * If SKB_ALLOC_RX is set, __GFP_MEMALLOC will be used for - * allocations in case the data is required for writeback - * @node: numa node to allocate memory on - * - * Allocate a new &sk_buff. The returned buffer has no headroom and a - * tail room of at least size bytes. The object has a reference count - * of one. The return is the buffer. On a failure the return is %NULL. - * - * Buffers may only be allocated from interrupts using a @gfp_mask of - * %GFP_ATOMIC. - */ -struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask, - int flags, int node) -{ - struct kmem_cache *cache; - struct skb_shared_info *shinfo; - struct sk_buff *skb; - u8 *data; - bool pfmemalloc; - - cache = (flags & SKB_ALLOC_FCLONE) - ? 
skbuff_fclone_cache : skbuff_head_cache; - - if (sk_memalloc_socks() && (flags & SKB_ALLOC_RX)) - gfp_mask |= __GFP_MEMALLOC; - - /* Get the HEAD */ - skb = kmem_cache_alloc_node(cache, gfp_mask & ~__GFP_DMA, node); - if (!skb) - goto out; - prefetchw(skb); - - /* We do our best to align skb_shared_info on a separate cache - * line. It usually works because kmalloc(X > SMP_CACHE_BYTES) gives - * aligned memory blocks, unless SLUB/SLAB debug is enabled. - * Both skb->head and skb_shared_info are cache line aligned. - */ - size = SKB_DATA_ALIGN(size); - size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); - data = kmalloc_reserve(size, gfp_mask, node, &pfmemalloc); - if (!data) - goto nodata; - /* kmalloc(size) might give us more room than requested. - * Put skb_shared_info exactly at the end of allocated zone, - * to allow max possible filling before reallocation. - */ - size = SKB_WITH_OVERHEAD(ksize(data)); - prefetchw(data + size); - - /* - * Only clear those fields we need to clear, not those that we will - * actually initialise below. Hence, don't put any more fields after - * the tail pointer in struct sk_buff! 
- */ - memset(skb, 0, offsetof(struct sk_buff, tail)); - /* Account for allocated memory : skb + skb->head */ - skb->truesize = SKB_TRUESIZE(size); - skb->pfmemalloc = pfmemalloc; - refcount_set(&skb->users, 1); - skb->head = data; - skb->data = data; - skb_reset_tail_pointer(skb); - skb->end = skb->tail + size; - skb->mac_header = (typeof(skb->mac_header))~0U; - skb->transport_header = (typeof(skb->transport_header))~0U; - - /* make sure we initialize shinfo sequentially */ - shinfo = skb_shinfo(skb); - memset(shinfo, 0, offsetof(struct skb_shared_info, dataref)); - atomic_set(&shinfo->dataref, 1); - - if (flags & SKB_ALLOC_FCLONE) { - struct sk_buff_fclones *fclones; - - fclones = container_of(skb, struct sk_buff_fclones, skb1); - - skb->fclone = SKB_FCLONE_ORIG; - refcount_set(&fclones->fclone_ref, 1); - - fclones->skb2.fclone = SKB_FCLONE_CLONE; - } - - skb_set_kcov_handle(skb, kcov_common_handle()); - -out: - return skb; -nodata: - kmem_cache_free(cache, skb); - skb = NULL; - goto out; -} -EXPORT_SYMBOL(__alloc_skb); - /* Caller must provide SKB that is memset cleared */ static struct sk_buff *__build_skb_around(struct sk_buff *skb, void *data, unsigned int frag_size) @@ -408,6 +266,148 @@ void *__netdev_alloc_frag_align(unsigned int fragsz, unsigned int align_mask) } EXPORT_SYMBOL(__netdev_alloc_frag_align); +/* + * kmalloc_reserve is a wrapper around kmalloc_node_track_caller that tells + * the caller if emergency pfmemalloc reserves are being used. If it is and + * the socket is later found to be SOCK_MEMALLOC then PFMEMALLOC reserves + * may be used. 
Otherwise, the packet data may be discarded until enough + * memory is free + */ +#define kmalloc_reserve(size, gfp, node, pfmemalloc) \ + __kmalloc_reserve(size, gfp, node, _RET_IP_, pfmemalloc) + +static void *__kmalloc_reserve(size_t size, gfp_t flags, int node, + unsigned long ip, bool *pfmemalloc) +{ + void *obj; + bool ret_pfmemalloc = false; + + /* + * Try a regular allocation, when that fails and we're not entitled + * to the reserves, fail. + */ + obj = kmalloc_node_track_caller(size, + flags | __GFP_NOMEMALLOC | __GFP_NOWARN, + node); + if (obj || !(gfp_pfmemalloc_allowed(flags))) + goto out; + + /* Try again but now we are using pfmemalloc reserves */ + ret_pfmemalloc = true; + obj = kmalloc_node_track_caller(size, flags, node); + +out: + if (pfmemalloc) + *pfmemalloc = ret_pfmemalloc; + + return obj; +} + +/* Allocate a new skbuff. We do this ourselves so we can fill in a few + * 'private' fields and also do memory statistics to find all the + * [BEEP] leaks. + * + */ + +/** + * __alloc_skb - allocate a network buffer + * @size: size to allocate + * @gfp_mask: allocation mask + * @flags: If SKB_ALLOC_FCLONE is set, allocate from fclone cache + * instead of head cache and allocate a cloned (child) skb. + * If SKB_ALLOC_RX is set, __GFP_MEMALLOC will be used for + * allocations in case the data is required for writeback + * @node: numa node to allocate memory on + * + * Allocate a new &sk_buff. The returned buffer has no headroom and a + * tail room of at least size bytes. The object has a reference count + * of one. The return is the buffer. On a failure the return is %NULL. + * + * Buffers may only be allocated from interrupts using a @gfp_mask of + * %GFP_ATOMIC. + */ +struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask, + int flags, int node) +{ + struct kmem_cache *cache; + struct skb_shared_info *shinfo; + struct sk_buff *skb; + u8 *data; + bool pfmemalloc; + + cache = (flags & SKB_ALLOC_FCLONE) + ? 
skbuff_fclone_cache : skbuff_head_cache; + + if (sk_memalloc_socks() && (flags & SKB_ALLOC_RX)) + gfp_mask |= __GFP_MEMALLOC; + + /* Get the HEAD */ + skb = kmem_cache_alloc_node(cache, gfp_mask & ~__GFP_DMA, node); + if (!skb) + goto out; + prefetchw(skb); + + /* We do our best to align skb_shared_info on a separate cache + * line. It usually works because kmalloc(X > SMP_CACHE_BYTES) gives + * aligned memory blocks, unless SLUB/SLAB debug is enabled. + * Both skb->head and skb_shared_info are cache line aligned. + */ + size = SKB_DATA_ALIGN(size); + size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); + data = kmalloc_reserve(size, gfp_mask, node, &pfmemalloc); + if (!data) + goto nodata; + /* kmalloc(size) might give us more room than requested. + * Put skb_shared_info exactly at the end of allocated zone, + * to allow max possible filling before reallocation. + */ + size = SKB_WITH_OVERHEAD(ksize(data)); + prefetchw(data + size); + + /* + * Only clear those fields we need to clear, not those that we will + * actually initialise below. Hence, don't put any more fields after + * the tail pointer in struct sk_buff! 
+ */ + memset(skb, 0, offsetof(struct sk_buff, tail)); + /* Account for allocated memory : skb + skb->head */ + skb->truesize = SKB_TRUESIZE(size); + skb->pfmemalloc = pfmemalloc; + refcount_set(&skb->users, 1); + skb->head = data; + skb->data = data; + skb_reset_tail_pointer(skb); + skb->end = skb->tail + size; + skb->mac_header = (typeof(skb->mac_header))~0U; + skb->transport_header = (typeof(skb->transport_header))~0U; + + /* make sure we initialize shinfo sequentially */ + shinfo = skb_shinfo(skb); + memset(shinfo, 0, offsetof(struct skb_shared_info, dataref)); + atomic_set(&shinfo->dataref, 1); + + if (flags & SKB_ALLOC_FCLONE) { + struct sk_buff_fclones *fclones; + + fclones = container_of(skb, struct sk_buff_fclones, skb1); + + skb->fclone = SKB_FCLONE_ORIG; + refcount_set(&fclones->fclone_ref, 1); + + fclones->skb2.fclone = SKB_FCLONE_CLONE; + } + + skb_set_kcov_handle(skb, kcov_common_handle()); + +out: + return skb; +nodata: + kmem_cache_free(cache, skb); + skb = NULL; + goto out; +} +EXPORT_SYMBOL(__alloc_skb); + /** * __netdev_alloc_skb - allocate an skbuff for rx on a specific device * @dev: network device to receive on From patchwork Wed Feb 10 16:29:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 380721 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D154AC43381 for ; Wed, 10 Feb 2021 16:31:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by 
mail.kernel.org (Postfix) with ESMTP id 8076764E28 for ; Wed, 10 Feb 2021 16:31:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232166AbhBJQbH (ORCPT ); Wed, 10 Feb 2021 11:31:07 -0500 Received: from mail-40136.protonmail.ch ([185.70.40.136]:18492 "EHLO mail-40136.protonmail.ch" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232489AbhBJQaC (ORCPT ); Wed, 10 Feb 2021 11:30:02 -0500 Date: Wed, 10 Feb 2021 16:29:03 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pm.me; s=protonmail; t=1612974548; bh=yJJMTncfQhh58ye6yAPpr/XHKg/skCQ3mD0O/lX7vIM=; h=Date:To:From:Cc:Reply-To:Subject:In-Reply-To:References:From; b=MIIGrlbocUP9/X5eBiQ6YcfpGp9ltvi+Th4d3DSHmsQarWwtzfHN49zBjeQHI4QOJ i/BW63QCJt8wAJ+BtnGiPY2Xov9j2D4fsjROnr/PAiWr11aJRj17b2K3bmLM6sorrT +AM1YN/fhcIITzOeqs20CTnXMoZD3r/TOxnOHW9SCMjtclxsXBSXJKLjLyVGTfqVyz cvJ4mGJOAsnnBe21knbrY+T+Pei9omB0QioRn3ACPFMsn0L+Wbjb3RxDG7gD9AlwmT Eaey5EWGdRXV3n/3iH1zzvFCyh2CzvHbV2xd/8NH0cJS1KQ3Sv5JFS5aTlGcAapfMr dwDf4bEh8gyKQ== To: "David S. 
Miller" , Jakub Kicinski From: Alexander Lobakin Cc: Jonathan Lemon , Eric Dumazet , Dmitry Vyukov , Willem de Bruijn , Alexander Lobakin , Randy Dunlap , Kevin Hao , Pablo Neira Ayuso , Jakub Sitnicki , Marco Elver , Dexuan Cui , Paolo Abeni , Jesper Dangaard Brouer , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Taehee Yoo , Cong Wang , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Miaohe Lin , Guillaume Nault , Yonghong Song , zhudi , Michal Kubecek , Marcelo Ricardo Leitner , Dmitry Safonov <0x7f454c46@gmail.com>, Yang Yingliang , Florian Westphal , Edward Cree , linux-kernel@vger.kernel.org, netdev@vger.kernel.org Reply-To: Alexander Lobakin Subject: [PATCH v4 net-next 03/11] skbuff: make __build_skb_around() return void Message-ID: <20210210162732.80467-4-alobakin@pm.me> In-Reply-To: <20210210162732.80467-1-alobakin@pm.me> References: <20210210162732.80467-1-alobakin@pm.me> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org __build_skb_around() can never fail and always returns the skb passed to it. Make it return void to simplify and optimize the code. Signed-off-by: Alexander Lobakin --- net/core/skbuff.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 70289f22a6f4..c7d184e11547 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -120,8 +120,8 @@ static void skb_under_panic(struct sk_buff *skb, unsigned int sz, void *addr) } /* Caller must provide SKB that is memset cleared */ -static struct sk_buff *__build_skb_around(struct sk_buff *skb, - void *data, unsigned int frag_size) +static void __build_skb_around(struct sk_buff *skb, void *data, + unsigned int frag_size) { struct skb_shared_info *shinfo; unsigned int size = frag_size ?
: ksize(data); @@ -144,8 +144,6 @@ static struct sk_buff *__build_skb_around(struct sk_buff *skb, atomic_set(&shinfo->dataref, 1); skb_set_kcov_handle(skb, kcov_common_handle()); - - return skb; } /** @@ -176,8 +174,9 @@ struct sk_buff *__build_skb(void *data, unsigned int frag_size) return NULL; memset(skb, 0, offsetof(struct sk_buff, tail)); + __build_skb_around(skb, data, frag_size); - return __build_skb_around(skb, data, frag_size); + return skb; } /* build_skb() is wrapper over __build_skb(), that specifically @@ -210,9 +209,9 @@ struct sk_buff *build_skb_around(struct sk_buff *skb, if (unlikely(!skb)) return NULL; - skb = __build_skb_around(skb, data, frag_size); + __build_skb_around(skb, data, frag_size); - if (skb && frag_size) { + if (frag_size) { skb->head_frag = 1; if (page_is_pfmemalloc(virt_to_head_page(data))) skb->pfmemalloc = 1; From patchwork Wed Feb 10 16:29:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 380720 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BFE7DC433DB for ; Wed, 10 Feb 2021 16:32:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6CC5D64E87 for ; Wed, 10 Feb 2021 16:32:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232414AbhBJQcA (ORCPT ); Wed, 10 Feb 2021 11:32:00 -0500 Received: from mail-40136.protonmail.ch ([185.70.40.136]:20594 "EHLO 
mail-40136.protonmail.ch" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232537AbhBJQae (ORCPT ); Wed, 10 Feb 2021 11:30:34 -0500 Date: Wed, 10 Feb 2021 16:29:24 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pm.me; s=protonmail; t=1612974568; bh=vHu/tYcVI950QN5Rrw2gSn3oUyiXAKr/E1pyrHPMHnU=; h=Date:To:From:Cc:Reply-To:Subject:In-Reply-To:References:From; b=X2mB14Zre1cBumX33fMkvOTJrgIUhnlZlMNHlZEncYmFJReS5ErVDr0Bp9B0D3mGO 7H+gj7Gpe1uh3P+QB8yv5oSLYY2z4oW3xM2BdrpOxRZ5uyZ1S+PiG5dWknocuaqhqJ lUr0uH2xWyo0z8AiHzONnplhpbtJAm0CHHgSimMq+jYz8eFsskMee/B9+HebJe/qYM ArD4Z1m7p06IOoQOeEgSgmzerc9pVnwQjakxYyL7sMkrCvE0sW5cNiS8hhc+g2juD3 3p5cU5hh/AoLLWVWAQJn1ae4ciutBOzwdM4DW2p6ao6UylqERj9GDe9G4KCbenhagN TfaNWejyoMgPA== To: "David S. Miller" , Jakub Kicinski From: Alexander Lobakin Cc: Jonathan Lemon , Eric Dumazet , Dmitry Vyukov , Willem de Bruijn , Alexander Lobakin , Randy Dunlap , Kevin Hao , Pablo Neira Ayuso , Jakub Sitnicki , Marco Elver , Dexuan Cui , Paolo Abeni , Jesper Dangaard Brouer , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Taehee Yoo , Cong Wang , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Miaohe Lin , Guillaume Nault , Yonghong Song , zhudi , Michal Kubecek , Marcelo Ricardo Leitner , Dmitry Safonov <0x7f454c46@gmail.com>, Yang Yingliang , Florian Westphal , Edward Cree , linux-kernel@vger.kernel.org, netdev@vger.kernel.org Reply-To: Alexander Lobakin Subject: [PATCH v4 net-next 04/11] skbuff: simplify __alloc_skb() a bit Message-ID: <20210210162732.80467-5-alobakin@pm.me> In-Reply-To: <20210210162732.80467-1-alobakin@pm.me> References: <20210210162732.80467-1-alobakin@pm.me> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Use unlikely() annotations for skbuff_head and data, as the two other allocation functions do, and remove the totally redundant goto.
Signed-off-by: Alexander Lobakin --- net/core/skbuff.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index c7d184e11547..88566de26cd1 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -339,8 +339,8 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask, /* Get the HEAD */ skb = kmem_cache_alloc_node(cache, gfp_mask & ~__GFP_DMA, node); - if (!skb) - goto out; + if (unlikely(!skb)) + return NULL; prefetchw(skb); /* We do our best to align skb_shared_info on a separate cache @@ -351,7 +351,7 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask, size = SKB_DATA_ALIGN(size); size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); data = kmalloc_reserve(size, gfp_mask, node, &pfmemalloc); - if (!data) + if (unlikely(!data)) goto nodata; /* kmalloc(size) might give us more room than requested. * Put skb_shared_info exactly at the end of allocated zone, @@ -395,12 +395,11 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask, skb_set_kcov_handle(skb, kcov_common_handle()); -out: return skb; + nodata: kmem_cache_free(cache, skb); - skb = NULL; - goto out; + return NULL; } EXPORT_SYMBOL(__alloc_skb); From patchwork Wed Feb 10 16:30:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 380718 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 279CAC433E9 for ; Wed, 10 Feb 2021 16:33:03 +0000 (UTC) Received: 
from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CDF0A64E8C for ; Wed, 10 Feb 2021 16:33:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232787AbhBJQcz (ORCPT ); Wed, 10 Feb 2021 11:32:55 -0500 Received: from mail-40133.protonmail.ch ([185.70.40.133]:17023 "EHLO mail-40133.protonmail.ch" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232541AbhBJQbC (ORCPT ); Wed, 10 Feb 2021 11:31:02 -0500 Date: Wed, 10 Feb 2021 16:30:09 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pm.me; s=protonmail; t=1612974620; bh=HZHaNJCAig0YSUt683wmcffffm0YebCE8doSL87HOwk=; h=Date:To:From:Cc:Reply-To:Subject:In-Reply-To:References:From; b=AAVz4XLzmIZ7ZKgQf5/pTKIllBOLjbg5ahLMuzgVNUV3Ck8VZrvugYGJrz18g57tg 7FB9nGwKd4MeViCBO4mQuZYW2s0x8P1tgR0X9kVhY6X9ZC5RPauq6O1V4IgxM6QcPx ZDF80s8j9zAvu04cQWoVbzYK8KV0voV5SIbnSIr/l1d1aVesv0jFfUTWwnSXh5oxYJ dDIZbOVv1JLkyHaRM/LGqgNvYY394LNADNQ0kUwMr+RuE0cQphYItqIfIcKkq8Hrxz SH2+NL6gDXSJf/a2dYD2YjYSBRV5KS0CIQvepy3MLzpcF8dTT6W2Ui/cxzD//pRRK+ oh0ErizMVcu3g== To: "David S. 
Miller" , Jakub Kicinski From: Alexander Lobakin Cc: Jonathan Lemon , Eric Dumazet , Dmitry Vyukov , Willem de Bruijn , Alexander Lobakin , Randy Dunlap , Kevin Hao , Pablo Neira Ayuso , Jakub Sitnicki , Marco Elver , Dexuan Cui , Paolo Abeni , Jesper Dangaard Brouer , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Taehee Yoo , Cong Wang , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Miaohe Lin , Guillaume Nault , Yonghong Song , zhudi , Michal Kubecek , Marcelo Ricardo Leitner , Dmitry Safonov <0x7f454c46@gmail.com>, Yang Yingliang , Florian Westphal , Edward Cree , linux-kernel@vger.kernel.org, netdev@vger.kernel.org Reply-To: Alexander Lobakin Subject: [PATCH v4 net-next 07/11] skbuff: move NAPI cache declarations upper in the file Message-ID: <20210210162732.80467-8-alobakin@pm.me> In-Reply-To: <20210210162732.80467-1-alobakin@pm.me> References: <20210210162732.80467-1-alobakin@pm.me> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org NAPI cache structures will be used for allocating skbuff_heads, so move their declarations higher up in the file.
Signed-off-by: Alexander Lobakin --- net/core/skbuff.c | 90 +++++++++++++++++++++++------------------------ 1 file changed, 45 insertions(+), 45 deletions(-) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 4be2bb969535..860a9d4f752f 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -119,6 +119,51 @@ static void skb_under_panic(struct sk_buff *skb, unsigned int sz, void *addr) skb_panic(skb, sz, addr, __func__); } +#define NAPI_SKB_CACHE_SIZE 64 + +struct napi_alloc_cache { + struct page_frag_cache page; + unsigned int skb_count; + void *skb_cache[NAPI_SKB_CACHE_SIZE]; +}; + +static DEFINE_PER_CPU(struct page_frag_cache, netdev_alloc_cache); +static DEFINE_PER_CPU(struct napi_alloc_cache, napi_alloc_cache); + +static void *__alloc_frag_align(unsigned int fragsz, gfp_t gfp_mask, + unsigned int align_mask) +{ + struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); + + return page_frag_alloc_align(&nc->page, fragsz, gfp_mask, align_mask); +} + +void *__napi_alloc_frag_align(unsigned int fragsz, unsigned int align_mask) +{ + fragsz = SKB_DATA_ALIGN(fragsz); + + return __alloc_frag_align(fragsz, GFP_ATOMIC, align_mask); +} +EXPORT_SYMBOL(__napi_alloc_frag_align); + +void *__netdev_alloc_frag_align(unsigned int fragsz, unsigned int align_mask) +{ + struct page_frag_cache *nc; + void *data; + + fragsz = SKB_DATA_ALIGN(fragsz); + if (in_irq() || irqs_disabled()) { + nc = this_cpu_ptr(&netdev_alloc_cache); + data = page_frag_alloc_align(nc, fragsz, GFP_ATOMIC, align_mask); + } else { + local_bh_disable(); + data = __alloc_frag_align(fragsz, GFP_ATOMIC, align_mask); + local_bh_enable(); + } + return data; +} +EXPORT_SYMBOL(__netdev_alloc_frag_align); + /* Caller must provide SKB that is memset cleared */ static void __build_skb_around(struct sk_buff *skb, void *data, unsigned int frag_size) @@ -220,51 +265,6 @@ struct sk_buff *build_skb_around(struct sk_buff *skb, } EXPORT_SYMBOL(build_skb_around); -#define NAPI_SKB_CACHE_SIZE 64 - -struct 
napi_alloc_cache { - struct page_frag_cache page; - unsigned int skb_count; - void *skb_cache[NAPI_SKB_CACHE_SIZE]; -}; - -static DEFINE_PER_CPU(struct page_frag_cache, netdev_alloc_cache); -static DEFINE_PER_CPU(struct napi_alloc_cache, napi_alloc_cache); - -static void *__alloc_frag_align(unsigned int fragsz, gfp_t gfp_mask, - unsigned int align_mask) -{ - struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); - - return page_frag_alloc_align(&nc->page, fragsz, gfp_mask, align_mask); -} - -void *__napi_alloc_frag_align(unsigned int fragsz, unsigned int align_mask) -{ - fragsz = SKB_DATA_ALIGN(fragsz); - - return __alloc_frag_align(fragsz, GFP_ATOMIC, align_mask); -} -EXPORT_SYMBOL(__napi_alloc_frag_align); - -void *__netdev_alloc_frag_align(unsigned int fragsz, unsigned int align_mask) -{ - struct page_frag_cache *nc; - void *data; - - fragsz = SKB_DATA_ALIGN(fragsz); - if (in_irq() || irqs_disabled()) { - nc = this_cpu_ptr(&netdev_alloc_cache); - data = page_frag_alloc_align(nc, fragsz, GFP_ATOMIC, align_mask); - } else { - local_bh_disable(); - data = __alloc_frag_align(fragsz, GFP_ATOMIC, align_mask); - local_bh_enable(); - } - return data; -} -EXPORT_SYMBOL(__netdev_alloc_frag_align); - /* * kmalloc_reserve is a wrapper around kmalloc_node_track_caller that tells * the caller if emergency pfmemalloc reserves are being used. 
If it is and From patchwork Wed Feb 10 16:30:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 380719 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 07071C433DB for ; Wed, 10 Feb 2021 16:33:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AC52364E8A for ; Wed, 10 Feb 2021 16:33:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232743AbhBJQcs (ORCPT ); Wed, 10 Feb 2021 11:32:48 -0500 Received: from mail-40131.protonmail.ch ([185.70.40.131]:13336 "EHLO mail-40131.protonmail.ch" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232565AbhBJQbO (ORCPT ); Wed, 10 Feb 2021 11:31:14 -0500 Date: Wed, 10 Feb 2021 16:30:23 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pm.me; s=protonmail; t=1612974630; bh=OJItbnlO9/miYI8tp2rCg3ReZOArter9GMDQWwcIoz0=; h=Date:To:From:Cc:Reply-To:Subject:In-Reply-To:References:From; b=pCxz9nMakoYTN7EV8AAK71I0iI+zWLIZQ2vRRCjvIImEmug5z9bgRhNA08/4fIC+J BKBiZmsH+sDQCNHeR7r/tahSzm8/FftqYFQtjRPH63Ji/1VjzYLR6SMrX4Om2Whbgx W/6g5s8x4GmvKkk2+vN/IU6mEdP/9kizjZtc21F7vQmaqEwqWj9Abze9SUsE7notf4 3v8zgYV706gfYwE5Gbu4ACElPWyn167lGaSC7G9YVFK74kOPW5LNJd89kBfI6zoyE8 qenuWGACsCE4OR7/qdR3S54d0O6Mil3clvnyO3S4pflGNZVOWg4gwqsJbvLYVAAY3D qE/S8XiU8sJkQ== To: "David S. 
Miller" , Jakub Kicinski From: Alexander Lobakin Cc: Jonathan Lemon , Eric Dumazet , Dmitry Vyukov , Willem de Bruijn , Alexander Lobakin , Randy Dunlap , Kevin Hao , Pablo Neira Ayuso , Jakub Sitnicki , Marco Elver , Dexuan Cui , Paolo Abeni , Jesper Dangaard Brouer , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Taehee Yoo , Cong Wang , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Miaohe Lin , Guillaume Nault , Yonghong Song , zhudi , Michal Kubecek , Marcelo Ricardo Leitner , Dmitry Safonov <0x7f454c46@gmail.com>, Yang Yingliang , Florian Westphal , Edward Cree , linux-kernel@vger.kernel.org, netdev@vger.kernel.org Reply-To: Alexander Lobakin Subject: [PATCH v4 net-next 08/11] skbuff: introduce {, __}napi_build_skb() which reuses NAPI cache heads Message-ID: <20210210162732.80467-9-alobakin@pm.me> In-Reply-To: <20210210162732.80467-1-alobakin@pm.me> References: <20210210162732.80467-1-alobakin@pm.me> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Instead of just bulk-flushing skbuff_heads queued up through napi_consume_skb() or __kfree_skb_defer(), try to reuse them on the allocation path. If the cache is empty on allocation, bulk-allocate the first 16 elements, which is more efficient than per-skb allocation. If the cache is full on freeing, bulk-wipe the second half of the cache (32 elements). This also includes custom KASAN poisoning/unpoisoning to be doubly sure there are no use-after-free cases. To avoid changing the current behaviour, introduce a new function, napi_build_skb(), so that drivers can opt in to the new approach later. Note on the selected bulk size, 16: - this equals XDP_BULK_QUEUE_SIZE, DEV_MAP_BULK_SIZE and especially VETH_XDP_BATCH, which is also used to bulk-allocate skbuff_heads and was tested on powerful setups; - this also showed the best performance in the actual test series (from the array of {8, 16, 32}).
Suggested-by: Edward Cree # Divide on two halves
Suggested-by: Eric Dumazet # KASAN poisoning
Cc: Dmitry Vyukov # Help with KASAN
Cc: Paolo Abeni # Reduced batch size
Signed-off-by: Alexander Lobakin
---
 include/linux/skbuff.h |  2 +
 net/core/skbuff.c      | 94 ++++++++++++++++++++++++++++++++++++------
 2 files changed, 83 insertions(+), 13 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 0e0707296098..906122eac82a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1087,6 +1087,8 @@ struct sk_buff *build_skb(void *data, unsigned int frag_size);
 struct sk_buff *build_skb_around(struct sk_buff *skb,
 				 void *data, unsigned int frag_size);
 
+struct sk_buff *napi_build_skb(void *data, unsigned int frag_size);
+
 /**
  * alloc_skb - allocate a network buffer
  * @size: size to allocate
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 860a9d4f752f..9e1a8ded4acc 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -120,6 +120,8 @@ static void skb_under_panic(struct sk_buff *skb, unsigned int sz, void *addr)
 }
 
 #define NAPI_SKB_CACHE_SIZE	64
+#define NAPI_SKB_CACHE_BULK	16
+#define NAPI_SKB_CACHE_HALF	(NAPI_SKB_CACHE_SIZE / 2)
 
 struct napi_alloc_cache {
 	struct page_frag_cache page;
@@ -164,6 +166,25 @@ void *__netdev_alloc_frag_align(unsigned int fragsz, unsigned int align_mask)
 }
 EXPORT_SYMBOL(__netdev_alloc_frag_align);
 
+static struct sk_buff *napi_skb_cache_get(void)
+{
+	struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache);
+	struct sk_buff *skb;
+
+	if (unlikely(!nc->skb_count))
+		nc->skb_count = kmem_cache_alloc_bulk(skbuff_head_cache,
+						      GFP_ATOMIC,
+						      NAPI_SKB_CACHE_BULK,
+						      nc->skb_cache);
+	if (unlikely(!nc->skb_count))
+		return NULL;
+
+	skb = nc->skb_cache[--nc->skb_count];
+	kasan_unpoison_object_data(skbuff_head_cache, skb);
+
+	return skb;
+}
+
 /* Caller must provide SKB that is memset cleared */
 static void __build_skb_around(struct sk_buff *skb, void *data,
 			       unsigned int frag_size)
@@ -265,6 +286,53 @@ struct sk_buff *build_skb_around(struct sk_buff *skb,
 }
 EXPORT_SYMBOL(build_skb_around);
 
+/**
+ * __napi_build_skb - build a network buffer
+ * @data: data buffer provided by caller
+ * @frag_size: size of data, or 0 if head was kmalloced
+ *
+ * Version of __build_skb() that uses NAPI percpu caches to obtain
+ * skbuff_head instead of inplace allocation.
+ *
+ * Returns a new &sk_buff on success, %NULL on allocation failure.
+ */
+static struct sk_buff *__napi_build_skb(void *data, unsigned int frag_size)
+{
+	struct sk_buff *skb;
+
+	skb = napi_skb_cache_get();
+	if (unlikely(!skb))
+		return NULL;
+
+	memset(skb, 0, offsetof(struct sk_buff, tail));
+	__build_skb_around(skb, data, frag_size);
+
+	return skb;
+}
+
+/**
+ * napi_build_skb - build a network buffer
+ * @data: data buffer provided by caller
+ * @frag_size: size of data, or 0 if head was kmalloced
+ *
+ * Version of __napi_build_skb() that takes care of skb->head_frag
+ * and skb->pfmemalloc when the data is a page or page fragment.
+ *
+ * Returns a new &sk_buff on success, %NULL on allocation failure.
+ */
+struct sk_buff *napi_build_skb(void *data, unsigned int frag_size)
+{
+	struct sk_buff *skb = __napi_build_skb(data, frag_size);
+
+	if (likely(skb) && frag_size) {
+		skb->head_frag = 1;
+		skb_propagate_pfmemalloc(virt_to_head_page(data), skb);
+	}
+
+	return skb;
+}
+EXPORT_SYMBOL(napi_build_skb);
+
 /*
  * kmalloc_reserve is a wrapper around kmalloc_node_track_caller that tells
  * the caller if emergency pfmemalloc reserves are being used. If it is and
@@ -838,31 +906,31 @@ void __consume_stateless_skb(struct sk_buff *skb)
 	kfree_skbmem(skb);
 }
 
-static inline void _kfree_skb_defer(struct sk_buff *skb)
+static void napi_skb_cache_put(struct sk_buff *skb)
 {
 	struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache);
+	u32 i;
 
 	/* drop skb->head and call any destructors for packet */
 	skb_release_all(skb);
 
-	/* record skb to CPU local list */
+	kasan_poison_object_data(skbuff_head_cache, skb);
 	nc->skb_cache[nc->skb_count++] = skb;
 
-#ifdef CONFIG_SLUB
-	/* SLUB writes into objects when freeing */
-	prefetchw(skb);
-#endif
-
-	/* flush skb_cache if it is filled */
 	if (unlikely(nc->skb_count == NAPI_SKB_CACHE_SIZE)) {
-		kmem_cache_free_bulk(skbuff_head_cache, NAPI_SKB_CACHE_SIZE,
-				     nc->skb_cache);
-		nc->skb_count = 0;
+		for (i = NAPI_SKB_CACHE_HALF; i < NAPI_SKB_CACHE_SIZE; i++)
+			kasan_unpoison_object_data(skbuff_head_cache,
+						   nc->skb_cache[i]);
+
+		kmem_cache_free_bulk(skbuff_head_cache, NAPI_SKB_CACHE_HALF,
+				     nc->skb_cache + NAPI_SKB_CACHE_HALF);
+		nc->skb_count = NAPI_SKB_CACHE_HALF;
 	}
 }
+
 void __kfree_skb_defer(struct sk_buff *skb)
 {
-	_kfree_skb_defer(skb);
+	napi_skb_cache_put(skb);
 }
 
 void napi_consume_skb(struct sk_buff *skb, int budget)
@@ -887,7 +955,7 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
 		return;
 	}
 
-	_kfree_skb_defer(skb);
+	napi_skb_cache_put(skb);
 }
 EXPORT_SYMBOL(napi_consume_skb);

From patchwork Wed Feb 10 16:31:10 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexander Lobakin
X-Patchwork-Id: 380717
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS,
	INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE,
	SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no
	version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 69BB4C433E0
	for ; Wed, 10 Feb 2021 16:34:43 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 35F2B64E87
	for ; Wed, 10 Feb 2021 16:34:43 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S232439AbhBJQeB (ORCPT ); Wed, 10 Feb 2021 11:34:01 -0500
Received: from mail-40134.protonmail.ch ([185.70.40.134]:17907 "EHLO
	mail-40134.protonmail.ch" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S232658AbhBJQcL (ORCPT ); Wed, 10 Feb 2021 11:32:11 -0500
Date: Wed, 10 Feb 2021 16:31:10 +0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pm.me;
	s=protonmail; t=1612974681;
	bh=o2wiiUInU+gAyFME41HQ/iCz8nud3W/AU2pVW55YAus=;
	h=Date:To:From:Cc:Reply-To:Subject:In-Reply-To:References:From;
	b=B9DsQvFOMGOV4Yd/Qwrj2OjbCRcFkCUSXxophs7z5tDqdKz0r+iJPOgT6qZqQKcv+
	 yWOM5BIS50Sy+HlAbRocmO3Wx9No+PB8IZE7Zg2IVI3Kd+Z8i+l0f1C3GU9ZZRk+Sq
	 IHNasTC+vNwHzIE4OvFpGUXSxkDKMewvROqqIZh48jmmLyS1SbfPIK+b2nRWuBsRyw
	 tqT4ZzLaQShAE3LVWdJ25YTl7GaU/IOz/Vm9b5/tM38L84Ro8ljFoLmevHzKGT1IVS
	 X307clRi009uWvEmb+ItJocgjQU6lT7ZyHZy4fq9+YzPnAIDd0hGGsLrOnZAWp9rdM
	 aItweEqRwreaQ==
To: "David S. Miller", Jakub Kicinski
From: Alexander Lobakin
Cc: Jonathan Lemon, Eric Dumazet, Dmitry Vyukov, Willem de Bruijn,
	Alexander Lobakin, Randy Dunlap, Kevin Hao, Pablo Neira Ayuso,
	Jakub Sitnicki, Marco Elver, Dexuan Cui, Paolo Abeni,
	Jesper Dangaard Brouer, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Taehee Yoo, Cong Wang,
	=?utf-8?b?QmrDtnJuIFTDtnBlbA==?=, Miaohe Lin, Guillaume Nault,
	Yonghong Song, zhudi, Michal Kubecek, Marcelo Ricardo Leitner,
	Dmitry Safonov <0x7f454c46@gmail.com>, Yang Yingliang,
	Florian Westphal, Edward Cree, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org
Reply-To: Alexander Lobakin
Subject: [PATCH v4 net-next 11/11] skbuff: queue NAPI_MERGED_FREE skbs into NAPI cache instead of freeing
Message-ID: <20210210162732.80467-12-alobakin@pm.me>
In-Reply-To: <20210210162732.80467-1-alobakin@pm.me>
References: <20210210162732.80467-1-alobakin@pm.me>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

napi_frags_finish() and napi_skb_finish() can only be called inside
NAPI Rx context, so we can feed the NAPI cache with skbuff_heads that
got a NAPI_MERGED_FREE verdict instead of freeing them immediately.
Replace __kfree_skb() with __kfree_skb_defer() in napi_skb_finish()
and move napi_skb_free_stolen_head() to skbuff.c, so it can drop skbs
to the NAPI cache.
As many drivers call napi_alloc_skb()/napi_get_frags() on their
receive path, this becomes especially useful.
Signed-off-by: Alexander Lobakin
---
 include/linux/skbuff.h |  1 +
 net/core/dev.c         |  9 +--------
 net/core/skbuff.c      | 12 +++++++++---
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 906122eac82a..6d0a33d1c0db 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2921,6 +2921,7 @@ static inline struct sk_buff *napi_alloc_skb(struct napi_struct *napi,
 }
 
 void napi_consume_skb(struct sk_buff *skb, int budget);
+void napi_skb_free_stolen_head(struct sk_buff *skb);
 void __kfree_skb_defer(struct sk_buff *skb);
 
 /**
diff --git a/net/core/dev.c b/net/core/dev.c
index 7134ae2fc0db..f04877295b4f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6094,13 +6094,6 @@ struct packet_offload *gro_find_complete_by_type(__be16 type)
 }
 EXPORT_SYMBOL(gro_find_complete_by_type);
 
-static void napi_skb_free_stolen_head(struct sk_buff *skb)
-{
-	skb_dst_drop(skb);
-	skb_ext_put(skb);
-	kmem_cache_free(skbuff_head_cache, skb);
-}
-
 static gro_result_t napi_skb_finish(struct napi_struct *napi,
 				    struct sk_buff *skb,
 				    gro_result_t ret)
@@ -6114,7 +6107,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi,
 		if (NAPI_GRO_CB(skb)->free == NAPI_GRO_FREE_STOLEN_HEAD)
 			napi_skb_free_stolen_head(skb);
 		else
-			__kfree_skb(skb);
+			__kfree_skb_defer(skb);
 		break;
 
 	case GRO_HELD:
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index ac6e0172f206..9ff701afa837 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -917,9 +917,6 @@ static void napi_skb_cache_put(struct sk_buff *skb)
 	struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache);
 	u32 i;
 
-	/* drop skb->head and call any destructors for packet */
-	skb_release_all(skb);
-
 	kasan_poison_object_data(skbuff_head_cache, skb);
 	nc->skb_cache[nc->skb_count++] = skb;
 
@@ -936,6 +933,14 @@ static void napi_skb_cache_put(struct sk_buff *skb)
 
 void __kfree_skb_defer(struct sk_buff *skb)
 {
+	skb_release_all(skb);
+	napi_skb_cache_put(skb);
+}
+
+void napi_skb_free_stolen_head(struct sk_buff *skb)
+{
+	skb_dst_drop(skb);
+	skb_ext_put(skb);
 	napi_skb_cache_put(skb);
 }
 
@@ -961,6 +966,7 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
 		return;
 	}
 
+	skb_release_all(skb);
 	napi_skb_cache_put(skb);
 }
 EXPORT_SYMBOL(napi_consume_skb);
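
The last patch moves skb_release_all() out of napi_skb_cache_put() and
into its callers, so that napi_skb_free_stolen_head() can queue a head
without releasing the data. A toy userspace model of that split, for
illustration only: struct fake_skb, the two counters and all model_*
functions are invented here, and the premise that a stolen head's data
must be left alone because it now belongs to the skb it was merged into
is my reading of NAPI_GRO_FREE_STOLEN_HEAD, not something stated in the
patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the bits of struct sk_buff that matter here. */
struct fake_skb {
	bool has_dst;   /* models what skb_dst_drop() releases */
	bool has_ext;   /* models what skb_ext_put() releases */
	bool has_head;  /* models skb->head still pointing at packet data */
};

static unsigned int cached_heads; /* heads queued into the cache */
static unsigned int data_frees;   /* times packet data was released */

/* Models napi_skb_cache_put() after the patch: it stores the head only. */
static void model_cache_put(struct fake_skb *skb)
{
	/* Callers are expected to have dropped per-skb state already. */
	assert(!skb->has_dst && !skb->has_ext);
	cached_heads++;
}

/* Models skb_release_all(): drops state and releases the data too. */
static void model_release_all(struct fake_skb *skb)
{
	skb->has_dst = false;
	skb->has_ext = false;
	if (skb->has_head) {
		skb->has_head = false;
		data_frees++;
	}
}

/* Models __kfree_skb_defer(): full teardown, then queue the head. */
static void model_kfree_skb_defer(struct fake_skb *skb)
{
	model_release_all(skb);
	model_cache_put(skb);
}

/* Models napi_skb_free_stolen_head(): state only, data is left alone. */
static void model_free_stolen_head(struct fake_skb *skb)
{
	skb->has_dst = false;
	skb->has_ext = false;
	model_cache_put(skb);
}
```

Had the release stayed inside the cache-put helper, the stolen-head path
could not have reused it without also releasing data it does not own.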