From patchwork Fri Oct 2 22:25:10 2020
X-Patchwork-Submitter: Wei Wang
X-Patchwork-Id: 289052
Date: Fri, 2 Oct 2020 15:25:10 -0700
In-Reply-To: <20201002222514.1159492-1-weiwan@google.com>
Message-Id: <20201002222514.1159492-2-weiwan@google.com>
References: <20201002222514.1159492-1-weiwan@google.com>
Subject: [PATCH net-next v2 1/5] net: implement threaded-able napi poll loop support
From: Wei Wang
To: "David S. Miller", netdev@vger.kernel.org
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Hannes Frederic Sowa, Felix Fietkau, Wei Wang
X-Mailing-List: netdev@vger.kernel.org

From: Paolo Abeni

This patch allows running each napi poll loop inside its own
kernel thread.
The rx mode can be enabled per napi instance via the
newly added napi_set_threaded() API; the requested kthread
will be created on demand and shut down on device stop.

Once threaded mode is enabled and the kthread is
started, napi_schedule() will wake up that thread instead
of scheduling the softirq.

The threaded poll loop behaves much like net_rx_action(),
but it does not have to manipulate local irqs and uses
an explicit scheduling point based on netdev_budget.

Signed-off-by: Paolo Abeni
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: Wei Wang
---
 include/linux/netdevice.h |   5 ++
 net/core/dev.c            | 113 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 118 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 28cfa53daf72..b3516e77371e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -348,6 +348,7 @@ struct napi_struct {
 	struct list_head dev_list;
 	struct hlist_node napi_hash_node;
 	unsigned int napi_id;
+	struct task_struct *thread;
 };

 enum {
@@ -358,6 +359,7 @@ enum {
 	NAPI_STATE_LISTED, /* NAPI added to system lists */
 	NAPI_STATE_NO_BUSY_POLL,/* Do not add in napi_hash, no busy polling */
 	NAPI_STATE_IN_BUSY_POLL,/* sk_busy_loop() owns this NAPI */
+	NAPI_STATE_THREADED, /* The poll is performed inside its own thread */
 };

 enum {
@@ -368,6 +370,7 @@ enum {
 	NAPIF_STATE_LISTED = BIT(NAPI_STATE_LISTED),
 	NAPIF_STATE_NO_BUSY_POLL = BIT(NAPI_STATE_NO_BUSY_POLL),
 	NAPIF_STATE_IN_BUSY_POLL = BIT(NAPI_STATE_IN_BUSY_POLL),
+	NAPIF_STATE_THREADED = BIT(NAPI_STATE_THREADED),
 };

 enum gro_result {
@@ -489,6 +492,8 @@ static inline bool napi_complete(struct napi_struct *n)
 	return napi_complete_done(n, 0);
 }

+int napi_set_threaded(struct napi_struct *n, bool threaded);
+
 /**
  * napi_disable - prevent NAPI from scheduling
  * @n: NAPI context
diff --git a/net/core/dev.c b/net/core/dev.c
index 9d55bf5d1a65..259cd7f3434f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -91,6 +91,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1487,9 +1488,19 @@ void netdev_notify_peers(struct net_device *dev)
 }
 EXPORT_SYMBOL(netdev_notify_peers);

+static int napi_threaded_poll(void *data);
+
+static void napi_thread_start(struct napi_struct *n)
+{
+	if (test_bit(NAPI_STATE_THREADED, &n->state) && !n->thread)
+		n->thread = kthread_create(napi_threaded_poll, n, "%s-%d",
+					   n->dev->name, n->napi_id);
+}
+
 static int __dev_open(struct net_device *dev, struct netlink_ext_ack *extack)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
+	struct napi_struct *n;
 	int ret;

 	ASSERT_RTNL();
@@ -1521,6 +1532,9 @@ static int __dev_open(struct net_device *dev, struct netlink_ext_ack *extack)
 	if (!ret && ops->ndo_open)
 		ret = ops->ndo_open(dev);

+	list_for_each_entry(n, &dev->napi_list, dev_list)
+		napi_thread_start(n);
+
 	netpoll_poll_enable(dev);

 	if (ret)
@@ -1566,6 +1580,14 @@ int dev_open(struct net_device *dev, struct netlink_ext_ack *extack)
 }
 EXPORT_SYMBOL(dev_open);

+static void napi_thread_stop(struct napi_struct *n)
+{
+	if (!n->thread)
+		return;
+	kthread_stop(n->thread);
+	n->thread = NULL;
+}
+
 static void __dev_close_many(struct list_head *head)
 {
 	struct net_device *dev;
@@ -1594,6 +1616,7 @@ static void __dev_close_many(struct list_head *head)

 	list_for_each_entry(dev, head, close_list) {
 		const struct net_device_ops *ops = dev->netdev_ops;
+		struct napi_struct *n;

 		/*
 		 * Call the device specific close. This cannot fail.
@@ -1605,6 +1628,9 @@ static void __dev_close_many(struct list_head *head)
 		if (ops->ndo_stop)
 			ops->ndo_stop(dev);

+		list_for_each_entry(n, &dev->napi_list, dev_list)
+			napi_thread_stop(n);
+
 		dev->flags &= ~IFF_UP;
 		netpoll_poll_enable(dev);
 	}
@@ -4241,6 +4267,11 @@ int gro_normal_batch __read_mostly = 8;
 static inline void ____napi_schedule(struct softnet_data *sd,
 				     struct napi_struct *napi)
 {
+	if (napi->thread) {
+		wake_up_process(napi->thread);
+		return;
+	}
+
 	list_add_tail(&napi->poll_list, &sd->poll_list);
 	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
 }
@@ -6654,6 +6685,30 @@ static void init_gro_hash(struct napi_struct *napi)
 	napi->gro_bitmask = 0;
 }

+int napi_set_threaded(struct napi_struct *n, bool threaded)
+{
+	ASSERT_RTNL();
+
+	if (n->dev->flags & IFF_UP)
+		return -EBUSY;
+
+	if (threaded == !!test_bit(NAPI_STATE_THREADED, &n->state))
+		return 0;
+	if (threaded)
+		set_bit(NAPI_STATE_THREADED, &n->state);
+	else
+		clear_bit(NAPI_STATE_THREADED, &n->state);
+
+	/* if the device is initializing, nothing to do */
+	if (test_bit(__LINK_STATE_START, &n->dev->state))
+		return 0;
+
+	napi_thread_stop(n);
+	napi_thread_start(n);
+	return 0;
+}
+EXPORT_SYMBOL(napi_set_threaded);
+
 void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
 		    int (*poll)(struct napi_struct *, int), int weight)
 {
@@ -6794,6 +6849,64 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
 	return work;
 }

+static int napi_thread_wait(struct napi_struct *napi)
+{
+	set_current_state(TASK_INTERRUPTIBLE);
+
+	while (!kthread_should_stop() && !napi_disable_pending(napi)) {
+		if (test_bit(NAPI_STATE_SCHED, &napi->state)) {
+			__set_current_state(TASK_RUNNING);
+			return 0;
+		}
+
+		schedule();
+		set_current_state(TASK_INTERRUPTIBLE);
+	}
+	__set_current_state(TASK_RUNNING);
+	return -1;
+}
+
+static int napi_threaded_poll(void *data)
+{
+	struct napi_struct *napi = data;
+
+	while (!napi_thread_wait(napi)) {
+		struct list_head dummy_repoll;
+		int budget = netdev_budget;
+		unsigned long time_limit;
+		bool again = true;
+
+		INIT_LIST_HEAD(&dummy_repoll);
+		local_bh_disable();
+		time_limit = jiffies + 2;
+		do {
+			/* ensure that the poll list is not empty */
+			if (list_empty(&dummy_repoll))
+				list_add(&napi->poll_list, &dummy_repoll);
+
+			budget -= napi_poll(napi, &dummy_repoll);
+			if (unlikely(budget <= 0 ||
+				     time_after_eq(jiffies, time_limit))) {
+				cond_resched();
+
+				/* refresh the budget */
+				budget = netdev_budget;
+				__kfree_skb_flush();
+				time_limit = jiffies + 2;
+			}
+
+			if (napi_disable_pending(napi))
+				again = false;
+			else if (!test_bit(NAPI_STATE_SCHED, &napi->state))
+				again = false;
+		} while (again);
+
+		__kfree_skb_flush();
+		local_bh_enable();
+	}
+	return 0;
+}
+
 static __latent_entropy void net_rx_action(struct softirq_action *h)
 {
 	struct softnet_data *sd = this_cpu_ptr(&softnet_data);

From patchwork Fri Oct 2 22:25:11 2020
X-Patchwork-Submitter: Wei Wang
X-Patchwork-Id: 267659
Date: Fri, 2 Oct 2020 15:25:11 -0700
In-Reply-To: <20201002222514.1159492-1-weiwan@google.com>
Message-Id: <20201002222514.1159492-3-weiwan@google.com>
References: <20201002222514.1159492-1-weiwan@google.com>
Subject: [PATCH net-next v2 2/5] net: add sysfs attribute to control napi threaded mode
From: Wei Wang
To: "David S. Miller", netdev@vger.kernel.org
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Hannes Frederic Sowa, Felix Fietkau, Wei Wang
X-Mailing-List: netdev@vger.kernel.org

From: Paolo Abeni

This patch adds a new sysfs attribute to the network
device class. The attribute is a bitmask that controls
threaded mode for all the napi instances of the given
network device.

Threaded mode can be switched only if the related network
device is down.
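
For illustration only, the bit-to-instance mapping described above can be
modelled in userspace C. This is a rough sketch under stated assumptions,
not kernel code: struct model_dev and model_set_threaded_mask() are
hypothetical names, the mask is passed in directly instead of being parsed
from the sysfs buffer, and -1 stands in for -EBUSY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_NAPI 8

/* Hypothetical stand-in for a net device and its napi instances,
 * kept in the same order as the device's napi list.
 */
struct model_dev {
	bool up;                        /* models IFF_UP */
	size_t n_napi;                  /* number of napi instances */
	bool threaded[MODEL_MAX_NAPI];  /* models NAPI_STATE_THREADED */
};

/* Bit i of the mask selects threaded mode for the i-th napi instance.
 * Returns 0 on success, or -1 (standing in for -EBUSY) when the device
 * is up, in which case no instance is modified.
 */
static int model_set_threaded_mask(struct model_dev *dev, unsigned long mask)
{
	size_t i;

	assert(dev->n_napi <= MODEL_MAX_NAPI);

	if (dev->up)
		return -1;

	for (i = 0; i < dev->n_napi; i++)
		dev->threaded[i] = (mask >> i) & 1UL;

	return 0;
}
```

With four instances and the device down, a mask of 0x5 would mark the
first and third instances as threaded; the same call on a device that is
up is refused and leaves the flags untouched.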

Signed-off-by: Paolo Abeni
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: Wei Wang
---
 net/core/net-sysfs.c | 103 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 103 insertions(+)

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 94fff0700bdd..df8dd25e5e4b 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -538,6 +538,108 @@ static ssize_t phys_switch_id_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(phys_switch_id);

+static unsigned long *__alloc_thread_bitmap(struct net_device *netdev,
+					    int *bits)
+{
+	struct napi_struct *n;
+
+	*bits = 0;
+	list_for_each_entry(n, &netdev->napi_list, dev_list)
+		(*bits)++;
+
+	return kmalloc_array(BITS_TO_LONGS(*bits), sizeof(unsigned long),
+			     GFP_ATOMIC | __GFP_ZERO);
+}
+
+static ssize_t threaded_show(struct device *dev,
+			     struct device_attribute *attr, char *buf)
+{
+	struct net_device *netdev = to_net_dev(dev);
+	struct napi_struct *n;
+	unsigned long *bmap;
+	size_t count = 0;
+	int i, bits;
+
+	if (!rtnl_trylock())
+		return restart_syscall();
+
+	if (!dev_isalive(netdev))
+		goto unlock;
+
+	bmap = __alloc_thread_bitmap(netdev, &bits);
+	if (!bmap) {
+		count = -ENOMEM;
+		goto unlock;
+	}
+
+	i = 0;
+	list_for_each_entry(n, &netdev->napi_list, dev_list) {
+		if (test_bit(NAPI_STATE_THREADED, &n->state))
+			set_bit(i, bmap);
+		i++;
+	}
+
+	count = bitmap_print_to_pagebuf(true, buf, bmap, bits);
+	kfree(bmap);
+
+unlock:
+	rtnl_unlock();
+
+	return count;
+}
+
+static ssize_t threaded_store(struct device *dev,
+			      struct device_attribute *attr,
+			      const char *buf, size_t len)
+{
+	struct net_device *netdev = to_net_dev(dev);
+	struct napi_struct *n;
+	unsigned long *bmap;
+	int i, bits;
+	size_t ret;
+
+	if (!capable(CAP_NET_ADMIN))
+		return -EPERM;
+
+	if (!rtnl_trylock())
+		return restart_syscall();
+
+	if (!dev_isalive(netdev)) {
+		ret = len;
+		goto unlock;
+	}
+
+	if (netdev->flags & IFF_UP) {
+		ret = -EBUSY;
+		goto unlock;
+	}
+
+	bmap = __alloc_thread_bitmap(netdev, &bits);
+	if (!bmap) {
+		ret = -ENOMEM;
+		goto unlock;
+	}
+
+	ret = bitmap_parselist(buf, bmap, bits);
+	if (ret)
+		goto free_unlock;
+
+	i = 0;
+	list_for_each_entry(n, &netdev->napi_list, dev_list) {
+		napi_set_threaded(n, test_bit(i, bmap));
+		i++;
+	}
+	ret = len;
+
+free_unlock:
+	kfree(bmap);
+
+unlock:
+	rtnl_unlock();
+	return ret;
+}
+static DEVICE_ATTR_RW(threaded);
+
 static struct attribute *net_class_attrs[] __ro_after_init = {
 	&dev_attr_netdev_group.attr,
 	&dev_attr_type.attr,
@@ -570,6 +672,7 @@ static struct attribute *net_class_attrs[] __ro_after_init = {
 	&dev_attr_proto_down.attr,
 	&dev_attr_carrier_up_count.attr,
 	&dev_attr_carrier_down_count.attr,
+	&dev_attr_threaded.attr,
 	NULL,
 };
 ATTRIBUTE_GROUPS(net_class);

From patchwork Fri Oct 2 22:25:12 2020
X-Patchwork-Submitter: Wei Wang
X-Patchwork-Id: 289051
Date: Fri, 2 Oct 2020 15:25:12 -0700
In-Reply-To: <20201002222514.1159492-1-weiwan@google.com>
Message-Id: <20201002222514.1159492-4-weiwan@google.com>
References: <20201002222514.1159492-1-weiwan@google.com>
Subject: [PATCH net-next v2 3/5] net: extract napi poll functionality to __napi_poll()
From: Wei Wang
To: "David S. Miller", netdev@vger.kernel.org
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Hannes Frederic Sowa, Felix Fietkau, Wei Wang
X-Mailing-List: netdev@vger.kernel.org

From: Felix Fietkau

This commit introduces a new function, __napi_poll(), which contains the
main logic of the existing napi_poll() function and will be called by
other functions in later commits.
The idea and implementation are by Felix Fietkau and were proposed as
part of the patch to move napi work to workqueue context.
This commit by itself is only a code restructuring.
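
As a minimal userspace sketch of the split described above (an
illustration only, not the kernel code): the core step does the polling
and reports through a flag whether a repoll is needed, while the wrapper
keeps the list bookkeeping. struct model_napi, model_poll_core() and
model_poll() are hypothetical names, and the "queued" field merely stands
in for putting the instance back on the repoll list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a napi instance. */
struct model_napi {
	int pending;   /* packets waiting to be processed */
	int weight;    /* per-poll budget */
	bool queued;   /* models "added back to the repoll list" */
};

/* Core step, in the role of __napi_poll(): process up to weight
 * packets and set *repoll only when the whole weight was consumed.
 */
static int model_poll_core(struct model_napi *n, bool *repoll)
{
	int work;

	assert(n->weight > 0);

	work = n->pending < n->weight ? n->pending : n->weight;
	n->pending -= work;

	if (work < n->weight)
		return work;	/* under budget: nothing left to repoll */

	*repoll = true;
	return work;
}

/* Wrapper, in the role of napi_poll(): delegates the polling and
 * only decides whether the instance goes back on the list.
 */
static int model_poll(struct model_napi *n)
{
	bool do_repoll = false;
	int work = model_poll_core(n, &do_repoll);

	n->queued = do_repoll;
	return work;
}
```

A caller that does not need a repoll list, such as a dedicated polling
thread, could call the core step directly and loop on the flag instead.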

Signed-off-by: Felix Fietkau
Signed-off-by: Wei Wang
---
 net/core/dev.c | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 259cd7f3434f..c82522262ca8 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6783,15 +6783,10 @@ void __netif_napi_del(struct napi_struct *napi)
 }
 EXPORT_SYMBOL(__netif_napi_del);

-static int napi_poll(struct napi_struct *n, struct list_head *repoll)
+static int __napi_poll(struct napi_struct *n, bool *repoll)
 {
-	void *have;
 	int work, weight;

-	list_del_init(&n->poll_list);
-
-	have = netpoll_poll_lock(n);
-
 	weight = n->weight;

 	/* This NAPI_STATE_SCHED test is for avoiding a race
@@ -6811,7 +6806,7 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
 			  n->poll, work, weight);

 	if (likely(work < weight))
-		goto out_unlock;
+		return work;

 	/* Drivers must not modify the NAPI state if they
 	 * consume the entire weight. In such cases this code
@@ -6820,7 +6815,7 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
 	 */
 	if (unlikely(napi_disable_pending(n))) {
 		napi_complete(n);
-		goto out_unlock;
+		return work;
 	}

 	if (n->gro_bitmask) {
@@ -6832,6 +6827,26 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)

 	gro_normal_list(n);

+	*repoll = true;
+
+	return work;
+}
+
+static int napi_poll(struct napi_struct *n, struct list_head *repoll)
+{
+	bool do_repoll = false;
+	void *have;
+	int work;
+
+	list_del_init(&n->poll_list);
+
+	have = netpoll_poll_lock(n);
+
+	work = __napi_poll(n, &do_repoll);
+
+	if (!do_repoll)
+		goto out_unlock;
+
 	/* Some drivers may have called napi_schedule
 	 * prior to exhausting their budget.
 	 */

From patchwork Fri Oct 2 22:25:13 2020
X-Patchwork-Submitter: Wei Wang
X-Patchwork-Id: 267658
Date: Fri, 2 Oct 2020 15:25:13 -0700
In-Reply-To: <20201002222514.1159492-1-weiwan@google.com>
Message-Id: <20201002222514.1159492-5-weiwan@google.com>
References: <20201002222514.1159492-1-weiwan@google.com>
Subject: [PATCH net-next v2 4/5] net: modify kthread handler to use __napi_poll()
From: Wei Wang
To: "David S. Miller", netdev@vger.kernel.org
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Hannes Frederic Sowa, Felix Fietkau, Wei Wang
X-Mailing-List: netdev@vger.kernel.org

From: Jakub Kicinski

The current kthread handler calls napi_poll() and has to pass a dummy
repoll list to the function, which seems redundant. The newly proposed
kthread handler calls the newly proposed __napi_poll(), and respects
napi->weight as before. If a repoll is needed, cond_resched() is called
first to give other tasks a chance to run before repolling.
This change was proposed by Jakub Kicinski on top of the previous patch.

Signed-off-by: Jakub Kicinski
Signed-off-by: Wei Wang
---
 net/core/dev.c | 62 +++++++++++++++++++-------------------------------
 1 file changed, 24 insertions(+), 38 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index c82522262ca8..b4f33e442b5e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6827,6 +6827,15 @@ static int __napi_poll(struct napi_struct *n, bool *repoll)

 	gro_normal_list(n);

+	/* Some drivers may have called napi_schedule
+	 * prior to exhausting their budget.
+	 */
+	if (unlikely(!list_empty(&n->poll_list))) {
+		pr_warn_once("%s: Budget exhausted after napi rescheduled\n",
+			     n->dev ? n->dev->name : "backlog");
+		return work;
+	}
+
 	*repoll = true;

 	return work;
@@ -6847,15 +6856,6 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
 	if (!do_repoll)
 		goto out_unlock;

-	/* Some drivers may have called napi_schedule
-	 * prior to exhausting their budget.
-	 */
-	if (unlikely(!list_empty(&n->poll_list))) {
-		pr_warn_once("%s: Budget exhausted after napi rescheduled\n",
-			     n->dev ? n->dev->name : "backlog");
-		goto out_unlock;
-	}
-
 	list_add_tail(&n->poll_list, repoll);

 out_unlock:
@@ -6884,40 +6884,26 @@ static int napi_thread_wait(struct napi_struct *napi)
 static int napi_threaded_poll(void *data)
 {
 	struct napi_struct *napi = data;
+	void *have;

 	while (!napi_thread_wait(napi)) {
-		struct list_head dummy_repoll;
-		int budget = netdev_budget;
-		unsigned long time_limit;
-		bool again = true;
+		for (;;) {
+			bool repoll = false;

-		INIT_LIST_HEAD(&dummy_repoll);
-		local_bh_disable();
-		time_limit = jiffies + 2;
-		do {
-			/* ensure that the poll list is not empty */
-			if (list_empty(&dummy_repoll))
-				list_add(&napi->poll_list, &dummy_repoll);
-
-			budget -= napi_poll(napi, &dummy_repoll);
-			if (unlikely(budget <= 0 ||
-				     time_after_eq(jiffies, time_limit))) {
-				cond_resched();
-
-				/* refresh the budget */
-				budget = netdev_budget;
-				__kfree_skb_flush();
-				time_limit = jiffies + 2;
-			}
+			local_bh_disable();

-			if (napi_disable_pending(napi))
-				again = false;
-			else if (!test_bit(NAPI_STATE_SCHED, &napi->state))
-				again = false;
-		} while (again);
+			have = netpoll_poll_lock(napi);
+			__napi_poll(napi, &repoll);
+			netpoll_poll_unlock(have);

-		__kfree_skb_flush();
-		local_bh_enable();
+			__kfree_skb_flush();
+			local_bh_enable();
+
+			if (!repoll)
+				break;
+
+			cond_resched();
+		}
 	}
 	return 0;
 }

From patchwork Fri Oct 2 22:25:14 2020
X-Patchwork-Submitter: Wei Wang
X-Patchwork-Id: 289050
Date: Fri, 2 Oct 2020 15:25:14 -0700
In-Reply-To: <20201002222514.1159492-1-weiwan@google.com>
Message-Id: <20201002222514.1159492-6-weiwan@google.com>
References: <20201002222514.1159492-1-weiwan@google.com>
Subject: [PATCH net-next v2 5/5] net: improve napi threaded config
From: Wei Wang
To: "David S. Miller", netdev@vger.kernel.org
Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Hannes Frederic Sowa, Felix Fietkau, Wei Wang
X-Mailing-List: netdev@vger.kernel.org

This commit mainly reworks the threaded config so that switching
between softirq-based and kthread-based NAPI processing does not require
a device down/up.
It also moves the kthread_create() call to the sysfs handler, where it
runs when the user tries to enable "threaded" on napi, and properly
handles kthread_create() failure. This is because certain drivers do not
have the napi created and linked to the dev when dev_open() is called,
so the previous implementation does not work properly there.
Signed-off-by: Wei Wang
---
Changes since v1:
replaced kthread_create() with kthread_run()

Changes since RFC:
changed the thread name to napi/-

 net/core/dev.c       | 53 ++++++++++++++++++++++++++------------------
 net/core/net-sysfs.c |  9 +++-----
 2 files changed, 35 insertions(+), 27 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index b4f33e442b5e..e89a7f869c73 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1490,17 +1490,28 @@ EXPORT_SYMBOL(netdev_notify_peers);
 
 static int napi_threaded_poll(void *data);
 
-static void napi_thread_start(struct napi_struct *n)
+static int napi_kthread_create(struct napi_struct *n)
 {
-	if (test_bit(NAPI_STATE_THREADED, &n->state) && !n->thread)
-		n->thread = kthread_create(napi_threaded_poll, n, "%s-%d",
-					   n->dev->name, n->napi_id);
+	int err = 0;
+
+	/* Create and wake up the kthread once to put it in
+	 * TASK_INTERRUPTIBLE mode to avoid the blocked task
+	 * warning and work with loadavg.
+	 */
+	n->thread = kthread_run(napi_threaded_poll, n, "napi/%s-%d",
+				n->dev->name, n->napi_id);
+	if (IS_ERR(n->thread)) {
+		err = PTR_ERR(n->thread);
+		pr_err("kthread_run failed with err %d\n", err);
+		n->thread = NULL;
+	}
+
+	return err;
 }
 
 static int __dev_open(struct net_device *dev, struct netlink_ext_ack *extack)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
-	struct napi_struct *n;
 	int ret;
 
 	ASSERT_RTNL();
@@ -1532,9 +1543,6 @@ static int __dev_open(struct net_device *dev, struct netlink_ext_ack *extack)
 	if (!ret && ops->ndo_open)
 		ret = ops->ndo_open(dev);
 
-	list_for_each_entry(n, &dev->napi_list, dev_list)
-		napi_thread_start(n);
-
 	netpoll_poll_enable(dev);
 
 	if (ret)
@@ -1585,6 +1593,7 @@ static void napi_thread_stop(struct napi_struct *n)
 	if (!n->thread)
 		return;
 	kthread_stop(n->thread);
+	clear_bit(NAPI_STATE_THREADED, &n->state);
 	n->thread = NULL;
 }
 
@@ -4267,7 +4276,7 @@ int gro_normal_batch __read_mostly = 8;
 static inline void ____napi_schedule(struct softnet_data *sd,
 				     struct napi_struct *napi)
 {
-	if (napi->thread) {
+	if (test_bit(NAPI_STATE_THREADED, &napi->state)) {
 		wake_up_process(napi->thread);
 		return;
 	}
@@ -6687,25 +6696,25 @@ static void init_gro_hash(struct napi_struct *napi)
 
 int napi_set_threaded(struct napi_struct *n, bool threaded)
 {
-	ASSERT_RTNL();
+	int err = 0;
 
-	if (n->dev->flags & IFF_UP)
-		return -EBUSY;
+	ASSERT_RTNL();
 
 	if (threaded == !!test_bit(NAPI_STATE_THREADED, &n->state))
 		return 0;
-	if (threaded)
+	if (threaded) {
+		if (!n->thread) {
+			err = napi_kthread_create(n);
+			if (err)
+				goto out;
+		}
 		set_bit(NAPI_STATE_THREADED, &n->state);
-	else
+	} else {
 		clear_bit(NAPI_STATE_THREADED, &n->state);
+	}
 
-	/* if the device is initializing, nothing todo */
-	if (test_bit(__LINK_STATE_START, &n->dev->state))
-		return 0;
-
-	napi_thread_stop(n);
-	napi_thread_start(n);
-	return 0;
+out:
+	return err;
 }
 EXPORT_SYMBOL(napi_set_threaded);
 
@@ -6750,6 +6759,7 @@ void napi_disable(struct napi_struct *n)
 		msleep(1);
 
 	hrtimer_cancel(&n->timer);
+	napi_thread_stop(n);
 
 	clear_bit(NAPI_STATE_DISABLE, &n->state);
 }
@@ -6870,6 +6880,7 @@ static int napi_thread_wait(struct napi_struct *napi)
 
 	while (!kthread_should_stop() && !napi_disable_pending(napi)) {
 		if (test_bit(NAPI_STATE_SCHED, &napi->state)) {
+			WARN_ON(!list_empty(&napi->poll_list));
 			__set_current_state(TASK_RUNNING);
 			return 0;
 		}
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index df8dd25e5e4b..1e24c1e81ad8 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -609,11 +609,6 @@ static ssize_t threaded_store(struct device *dev,
 		goto unlock;
 	}
 
-	if (netdev->flags & IFF_UP) {
-		ret = -EBUSY;
-		goto unlock;
-	}
-
 	bmap = __alloc_thread_bitmap(netdev, &bits);
 	if (!bmap) {
 		ret = -ENOMEM;
@@ -626,7 +621,9 @@ static ssize_t threaded_store(struct device *dev,
 
 	i = 0;
 	list_for_each_entry(n, &netdev->napi_list, dev_list) {
-		napi_set_threaded(n, test_bit(i, bmap));
+		ret = napi_set_threaded(n, test_bit(i, bmap));
+		if (ret)
+			goto free_unlock;
 		i++;
 	}
 	ret = len;
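
For readers who want to trace the new enable/disable logic outside the
kernel, the following is a minimal user-space sketch of the
napi_set_threaded() flow in this patch. It is an illustration only, not
the kernel code: struct sketch_napi, sketch_set_threaded() and the
sketch_create_*() helpers are hypothetical names, the boolean field
stands in for NAPI_STATE_THREADED, and -ENOMEM is an assumed stand-in
for whatever error kthread_run() would report via PTR_ERR().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two napi_struct fields that matter here. */
struct sketch_napi {
	bool threaded;	/* plays the role of NAPI_STATE_THREADED */
	void *thread;	/* plays the role of n->thread */
};

/* Hypothetical worker factory: returns NULL on failure (assumption). */
typedef void *(*sketch_create_fn)(struct sketch_napi *n);

/* Mirrors the shape of napi_set_threaded() after this patch: the worker
 * is created lazily on the first enable, and the flag is only set once
 * creation has succeeded.
 */
int sketch_set_threaded(struct sketch_napi *n, bool threaded,
			sketch_create_fn create)
{
	assert(n != NULL);

	/* Nothing to do if the requested mode is already active. */
	if (threaded == n->threaded)
		return 0;

	if (threaded) {
		if (!n->thread) {
			n->thread = create(n);
			if (!n->thread)
				return -ENOMEM;	/* flag stays clear */
		}
		n->threaded = true;
	} else {
		/* Only the flag is cleared; the worker is kept for reuse. */
		n->threaded = false;
	}

	return 0;
}

/* Two trivial factories used to exercise both outcomes. */
void *sketch_create_ok(struct sketch_napi *n)
{
	return n;	/* any non-NULL token will do for the sketch */
}

void *sketch_create_fail(struct sketch_napi *n)
{
	(void)n;
	return NULL;
}
```

Calling sketch_set_threaded() with sketch_create_fail leaves the flag
clear and returns the error, which corresponds to threaded_store()
bailing out via free_unlock in the diff above. A later enable with an
existing worker succeeds without calling the factory again.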