From patchwork Thu Feb 11 18:52:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 382314 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 493FBC433DB for ; Thu, 11 Feb 2021 19:01:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 097CE64E66 for ; Thu, 11 Feb 2021 19:01:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230320AbhBKSyJ (ORCPT ); Thu, 11 Feb 2021 13:54:09 -0500 Received: from mail2.protonmail.ch ([185.70.40.22]:54782 "EHLO mail2.protonmail.ch" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230197AbhBKSxm (ORCPT ); Thu, 11 Feb 2021 13:53:42 -0500 Date: Thu, 11 Feb 2021 18:52:56 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pm.me; s=protonmail; t=1613069578; bh=vxbPqVTP+yyafvrJZY9PHRZqLx4yQzye7I3rd2J7vYA=; h=Date:To:From:Cc:Reply-To:Subject:In-Reply-To:References:From; b=hdpHWAPPVJSfSTDCnL7IrePIyXrssmVMikJKnEfj1zNtT60PgPQEptiqX8ao3gDdz rJOjTHAUrg5IQD9fFYBgDNAAIhxwdS1grrpSxwW+NCwire8854n34cPIvt2LZhTLcJ 3O5snRqMytivIrAU4FPpQW/0crg2s47l/803LviV2XtFRvwPLS54IZV2ga+YiEH3Ai CLi1lkVQOIcrjGORtThg7d+TMcMWMuwBJZUV4E/i61ozMt/GnWHNVikGJm3SFg5zH2 lXRP6nLLmJIqKPlw0GVmxJj219VhnR4NETA71MqFnK+pwImYG6FnzpxLPOw4pN81Hp WMiwE4Md2J6TA== To: "David S. Miller" , Jakub Kicinski From: Alexander Lobakin Cc: Jonathan Lemon , Eric Dumazet , Dmitry Vyukov , Willem de Bruijn , Alexander Lobakin , Randy Dunlap , Kevin Hao , Pablo Neira Ayuso , Jakub Sitnicki , Marco Elver , Dexuan Cui , Paolo Abeni , Jesper Dangaard Brouer , Alexander Duyck , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Taehee Yoo , Cong Wang , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Miaohe Lin , Guillaume Nault , Yonghong Song , zhudi , Michal Kubecek , Marcelo Ricardo Leitner , Dmitry Safonov <0x7f454c46@gmail.com>, Yang Yingliang , Florian Westphal , Edward Cree , linux-kernel@vger.kernel.org, netdev@vger.kernel.org Reply-To: Alexander Lobakin Subject: [PATCH v5 net-next 01/11] skbuff: move __alloc_skb() next to the other skb allocation functions Message-ID: <20210211185220.9753-2-alobakin@pm.me> In-Reply-To: <20210211185220.9753-1-alobakin@pm.me> References: <20210211185220.9753-1-alobakin@pm.me> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org In preparation before reusing several functions in all three skb allocation variants, move __alloc_skb() next to the __netdev_alloc_skb() and __napi_alloc_skb(). No functional changes. Signed-off-by: Alexander Lobakin --- net/core/skbuff.c | 284 +++++++++++++++++++++++----------------------- 1 file changed, 142 insertions(+), 142 deletions(-) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index d380c7b5a12d..a0f846872d19 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -119,148 +119,6 @@ static void skb_under_panic(struct sk_buff *skb, unsigned int sz, void *addr) skb_panic(skb, sz, addr, __func__); } -/* - * kmalloc_reserve is a wrapper around kmalloc_node_track_caller that tells - * the caller if emergency pfmemalloc reserves are being used. If it is and - * the socket is later found to be SOCK_MEMALLOC then PFMEMALLOC reserves - * may be used. Otherwise, the packet data may be discarded until enough - * memory is free - */ -#define kmalloc_reserve(size, gfp, node, pfmemalloc) \ - __kmalloc_reserve(size, gfp, node, _RET_IP_, pfmemalloc) - -static void *__kmalloc_reserve(size_t size, gfp_t flags, int node, - unsigned long ip, bool *pfmemalloc) -{ - void *obj; - bool ret_pfmemalloc = false; - - /* - * Try a regular allocation, when that fails and we're not entitled - * to the reserves, fail. - */ - obj = kmalloc_node_track_caller(size, - flags | __GFP_NOMEMALLOC | __GFP_NOWARN, - node); - if (obj || !(gfp_pfmemalloc_allowed(flags))) - goto out; - - /* Try again but now we are using pfmemalloc reserves */ - ret_pfmemalloc = true; - obj = kmalloc_node_track_caller(size, flags, node); - -out: - if (pfmemalloc) - *pfmemalloc = ret_pfmemalloc; - - return obj; -} - -/* Allocate a new skbuff. We do this ourselves so we can fill in a few - * 'private' fields and also do memory statistics to find all the - * [BEEP] leaks. - * - */ - -/** - * __alloc_skb - allocate a network buffer - * @size: size to allocate - * @gfp_mask: allocation mask - * @flags: If SKB_ALLOC_FCLONE is set, allocate from fclone cache - * instead of head cache and allocate a cloned (child) skb. - * If SKB_ALLOC_RX is set, __GFP_MEMALLOC will be used for - * allocations in case the data is required for writeback - * @node: numa node to allocate memory on - * - * Allocate a new &sk_buff. The returned buffer has no headroom and a - * tail room of at least size bytes. The object has a reference count - * of one. The return is the buffer. On a failure the return is %NULL. - * - * Buffers may only be allocated from interrupts using a @gfp_mask of - * %GFP_ATOMIC. - */ -struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask, - int flags, int node) -{ - struct kmem_cache *cache; - struct skb_shared_info *shinfo; - struct sk_buff *skb; - u8 *data; - bool pfmemalloc; - - cache = (flags & SKB_ALLOC_FCLONE) - ? skbuff_fclone_cache : skbuff_head_cache; - - if (sk_memalloc_socks() && (flags & SKB_ALLOC_RX)) - gfp_mask |= __GFP_MEMALLOC; - - /* Get the HEAD */ - skb = kmem_cache_alloc_node(cache, gfp_mask & ~__GFP_DMA, node); - if (!skb) - goto out; - prefetchw(skb); - - /* We do our best to align skb_shared_info on a separate cache - * line. It usually works because kmalloc(X > SMP_CACHE_BYTES) gives - * aligned memory blocks, unless SLUB/SLAB debug is enabled. - * Both skb->head and skb_shared_info are cache line aligned. - */ - size = SKB_DATA_ALIGN(size); - size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); - data = kmalloc_reserve(size, gfp_mask, node, &pfmemalloc); - if (!data) - goto nodata; - /* kmalloc(size) might give us more room than requested. - * Put skb_shared_info exactly at the end of allocated zone, - * to allow max possible filling before reallocation. - */ - size = SKB_WITH_OVERHEAD(ksize(data)); - prefetchw(data + size); - - /* - * Only clear those fields we need to clear, not those that we will - * actually initialise below. Hence, don't put any more fields after - * the tail pointer in struct sk_buff! - */ - memset(skb, 0, offsetof(struct sk_buff, tail)); - /* Account for allocated memory : skb + skb->head */ - skb->truesize = SKB_TRUESIZE(size); - skb->pfmemalloc = pfmemalloc; - refcount_set(&skb->users, 1); - skb->head = data; - skb->data = data; - skb_reset_tail_pointer(skb); - skb->end = skb->tail + size; - skb->mac_header = (typeof(skb->mac_header))~0U; - skb->transport_header = (typeof(skb->transport_header))~0U; - - /* make sure we initialize shinfo sequentially */ - shinfo = skb_shinfo(skb); - memset(shinfo, 0, offsetof(struct skb_shared_info, dataref)); - atomic_set(&shinfo->dataref, 1); - - if (flags & SKB_ALLOC_FCLONE) { - struct sk_buff_fclones *fclones; - - fclones = container_of(skb, struct sk_buff_fclones, skb1); - - skb->fclone = SKB_FCLONE_ORIG; - refcount_set(&fclones->fclone_ref, 1); - - fclones->skb2.fclone = SKB_FCLONE_CLONE; - } - - skb_set_kcov_handle(skb, kcov_common_handle()); - -out: - return skb; -nodata: - kmem_cache_free(cache, skb); - skb = NULL; - goto out; -} -EXPORT_SYMBOL(__alloc_skb); - /* Caller must provide SKB that is memset cleared */ static struct sk_buff *__build_skb_around(struct sk_buff *skb, void *data, unsigned int frag_size) @@ -408,6 +266,148 @@ void *__netdev_alloc_frag_align(unsigned int fragsz, unsigned int align_mask) } EXPORT_SYMBOL(__netdev_alloc_frag_align); +/* + * kmalloc_reserve is a wrapper around kmalloc_node_track_caller that tells + * the caller if emergency pfmemalloc reserves are being used. If it is and + * the socket is later found to be SOCK_MEMALLOC then PFMEMALLOC reserves + * may be used. Otherwise, the packet data may be discarded until enough + * memory is free + */ +#define kmalloc_reserve(size, gfp, node, pfmemalloc) \ + __kmalloc_reserve(size, gfp, node, _RET_IP_, pfmemalloc) + +static void *__kmalloc_reserve(size_t size, gfp_t flags, int node, + unsigned long ip, bool *pfmemalloc) +{ + void *obj; + bool ret_pfmemalloc = false; + + /* + * Try a regular allocation, when that fails and we're not entitled + * to the reserves, fail. + */ + obj = kmalloc_node_track_caller(size, + flags | __GFP_NOMEMALLOC | __GFP_NOWARN, + node); + if (obj || !(gfp_pfmemalloc_allowed(flags))) + goto out; + + /* Try again but now we are using pfmemalloc reserves */ + ret_pfmemalloc = true; + obj = kmalloc_node_track_caller(size, flags, node); + +out: + if (pfmemalloc) + *pfmemalloc = ret_pfmemalloc; + + return obj; +} + +/* Allocate a new skbuff. We do this ourselves so we can fill in a few + * 'private' fields and also do memory statistics to find all the + * [BEEP] leaks. + * + */ + +/** + * __alloc_skb - allocate a network buffer + * @size: size to allocate + * @gfp_mask: allocation mask + * @flags: If SKB_ALLOC_FCLONE is set, allocate from fclone cache + * instead of head cache and allocate a cloned (child) skb. + * If SKB_ALLOC_RX is set, __GFP_MEMALLOC will be used for + * allocations in case the data is required for writeback + * @node: numa node to allocate memory on + * + * Allocate a new &sk_buff. The returned buffer has no headroom and a + * tail room of at least size bytes. The object has a reference count + * of one. The return is the buffer. On a failure the return is %NULL. + * + * Buffers may only be allocated from interrupts using a @gfp_mask of + * %GFP_ATOMIC. + */ +struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask, + int flags, int node) +{ + struct kmem_cache *cache; + struct skb_shared_info *shinfo; + struct sk_buff *skb; + u8 *data; + bool pfmemalloc; + + cache = (flags & SKB_ALLOC_FCLONE) + ? skbuff_fclone_cache : skbuff_head_cache; + + if (sk_memalloc_socks() && (flags & SKB_ALLOC_RX)) + gfp_mask |= __GFP_MEMALLOC; + + /* Get the HEAD */ + skb = kmem_cache_alloc_node(cache, gfp_mask & ~__GFP_DMA, node); + if (!skb) + goto out; + prefetchw(skb); + + /* We do our best to align skb_shared_info on a separate cache + * line. It usually works because kmalloc(X > SMP_CACHE_BYTES) gives + * aligned memory blocks, unless SLUB/SLAB debug is enabled. + * Both skb->head and skb_shared_info are cache line aligned. + */ + size = SKB_DATA_ALIGN(size); + size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); + data = kmalloc_reserve(size, gfp_mask, node, &pfmemalloc); + if (!data) + goto nodata; + /* kmalloc(size) might give us more room than requested. + * Put skb_shared_info exactly at the end of allocated zone, + * to allow max possible filling before reallocation. + */ + size = SKB_WITH_OVERHEAD(ksize(data)); + prefetchw(data + size); + + /* + * Only clear those fields we need to clear, not those that we will + * actually initialise below. Hence, don't put any more fields after + * the tail pointer in struct sk_buff! + */ + memset(skb, 0, offsetof(struct sk_buff, tail)); + /* Account for allocated memory : skb + skb->head */ + skb->truesize = SKB_TRUESIZE(size); + skb->pfmemalloc = pfmemalloc; + refcount_set(&skb->users, 1); + skb->head = data; + skb->data = data; + skb_reset_tail_pointer(skb); + skb->end = skb->tail + size; + skb->mac_header = (typeof(skb->mac_header))~0U; + skb->transport_header = (typeof(skb->transport_header))~0U; + + /* make sure we initialize shinfo sequentially */ + shinfo = skb_shinfo(skb); + memset(shinfo, 0, offsetof(struct skb_shared_info, dataref)); + atomic_set(&shinfo->dataref, 1); + + if (flags & SKB_ALLOC_FCLONE) { + struct sk_buff_fclones *fclones; + + fclones = container_of(skb, struct sk_buff_fclones, skb1); + + skb->fclone = SKB_FCLONE_ORIG; + refcount_set(&fclones->fclone_ref, 1); + + fclones->skb2.fclone = SKB_FCLONE_CLONE; + } + + skb_set_kcov_handle(skb, kcov_common_handle()); + +out: + return skb; +nodata: + kmem_cache_free(cache, skb); + skb = NULL; + goto out; +} +EXPORT_SYMBOL(__alloc_skb); + /** * __netdev_alloc_skb - allocate an skbuff for rx on a specific device * @dev: network device to receive on