From patchwork Fri Dec 18 13:45:24 2020
X-Patchwork-Submitter: Magnus Karlsson
X-Patchwork-Id: 345896
From: Magnus Karlsson
To: magnus.karlsson@intel.com, bjorn.topel@intel.com, ast@kernel.org,
    daniel@iogearbox.net, netdev@vger.kernel.org, jonathan.lemon@gmail.com
Cc: bpf@vger.kernel.org, A.Zema@falconvsystems.com,
    maciej.fijalkowski@intel.com, maciejromanfijalkowski@gmail.com,
    Xuan Zhuo
Subject: [PATCH bpf v2 1/2] xsk: fix race in SKB mode transmit with shared cq
Date: Fri, 18 Dec 2020 14:45:24 +0100
Message-Id: <20201218134525.13119-2-magnus.karlsson@gmail.com>
In-Reply-To:
<20201218134525.13119-1-magnus.karlsson@gmail.com>
References: <20201218134525.13119-1-magnus.karlsson@gmail.com>
X-Mailing-List: netdev@vger.kernel.org

From: Magnus Karlsson

Fix a race when multiple sockets are simultaneously calling sendto()
when the completion ring is shared in the SKB case. This is the case
when you share the same netdev and queue id through the
XDP_SHARED_UMEM bind flag. The problem is that multiple processes can
be in xsk_generic_xmit() and call the backpressure mechanism in
xskq_prod_reserve(xs->pool->cq). As this is a shared resource in this
specific scenario, and the rings are single-producer single-consumer,
a race can occur.

Fix this by moving the tx_completion_lock from the socket to the pool,
as the pool is shared between the sockets that share the completion
ring. (The pool is not shared when this is not the case.) Then protect
the accesses to xskq_prod_reserve() with this lock. The
tx_completion_lock is renamed cq_lock to better reflect that it
protects accesses to the potentially shared completion ring.

Fixes: 35fcde7f8deb ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson
Reported-by: Xuan Zhuo
Acked-by: Björn Töpel
---
 include/net/xdp_sock.h      | 4 ----
 include/net/xsk_buff_pool.h | 5 +++++
 net/xdp/xsk.c               | 9 ++++++---
 net/xdp/xsk_buff_pool.c     | 1 +
 4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
index 4f4e93bf814c..cc17bc957548 100644
--- a/include/net/xdp_sock.h
+++ b/include/net/xdp_sock.h
@@ -58,10 +58,6 @@ struct xdp_sock {
 	struct xsk_queue *tx ____cacheline_aligned_in_smp;
 	struct list_head tx_list;
-	/* Mutual exclusion of NAPI TX thread and sendmsg error paths
-	 * in the SKB destructor callback.
-	 */
-	spinlock_t tx_completion_lock;
 	/* Protects generic receive.
 	 */
 	spinlock_t rx_lock;
diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
index 01755b838c74..eaa8386dbc63 100644
--- a/include/net/xsk_buff_pool.h
+++ b/include/net/xsk_buff_pool.h
@@ -73,6 +73,11 @@ struct xsk_buff_pool {
 	bool dma_need_sync;
 	bool unaligned;
 	void *addrs;
+	/* Mutual exclusion of the completion ring in the SKB mode. Two cases to protect:
+	 * NAPI TX thread and sendmsg error paths in the SKB destructor callback and when
+	 * sockets share a single cq when the same netdev and queue id is shared.
+	 */
+	spinlock_t cq_lock;
 	struct xdp_buff_xsk *free_heads[];
 };
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index c6532d77fde7..d531f9cd0de6 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -423,9 +423,9 @@ static void xsk_destruct_skb(struct sk_buff *skb)
 	struct xdp_sock *xs = xdp_sk(skb->sk);
 	unsigned long flags;
 
-	spin_lock_irqsave(&xs->tx_completion_lock, flags);
+	spin_lock_irqsave(&xs->pool->cq_lock, flags);
 	xskq_prod_submit_addr(xs->pool->cq, addr);
-	spin_unlock_irqrestore(&xs->tx_completion_lock, flags);
+	spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
 	sock_wfree(skb);
 }
@@ -437,6 +437,7 @@ static int xsk_generic_xmit(struct sock *sk)
 	bool sent_frame = false;
 	struct xdp_desc desc;
 	struct sk_buff *skb;
+	unsigned long flags;
 	int err = 0;
 
 	mutex_lock(&xs->mutex);
@@ -468,10 +469,13 @@ static int xsk_generic_xmit(struct sock *sk)
 		 * if there is space in it. This avoids having to implement
 		 * any buffering in the Tx path.
 		 */
+		spin_lock_irqsave(&xs->pool->cq_lock, flags);
 		if (unlikely(err) || xskq_prod_reserve(xs->pool->cq)) {
+			spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
 			kfree_skb(skb);
 			goto out;
 		}
+		spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
 
 		skb->dev = xs->dev;
 		skb->priority = sk->sk_priority;
@@ -1303,7 +1307,6 @@ static int xsk_create(struct net *net, struct socket *sock, int protocol,
 	xs->state = XSK_READY;
 	mutex_init(&xs->mutex);
 	spin_lock_init(&xs->rx_lock);
-	spin_lock_init(&xs->tx_completion_lock);
 
 	INIT_LIST_HEAD(&xs->map_list);
 	spin_lock_init(&xs->map_list_lock);
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index 818b75060922..20598eea658c 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -71,6 +71,7 @@ struct xsk_buff_pool *xp_create_and_assign_umem(struct xdp_sock *xs,
 	INIT_LIST_HEAD(&pool->free_list);
 	INIT_LIST_HEAD(&pool->xsk_tx_list);
 	spin_lock_init(&pool->xsk_tx_list_lock);
+	spin_lock_init(&pool->cq_lock);
 	refcount_set(&pool->users, 1);
 
 	pool->fq = xs->fq_tmp;

From patchwork Fri Dec 18 13:45:25 2020
X-Patchwork-Submitter: Magnus Karlsson
X-Patchwork-Id: 346199
From: Magnus Karlsson
To: magnus.karlsson@intel.com, bjorn.topel@intel.com, ast@kernel.org,
    daniel@iogearbox.net, netdev@vger.kernel.org, jonathan.lemon@gmail.com
Cc: bpf@vger.kernel.org, A.Zema@falconvsystems.com,
    maciej.fijalkowski@intel.com, maciejromanfijalkowski@gmail.com,
    Xuan Zhuo
Subject: [PATCH bpf v2 2/2] xsk: rollback reservation at NETDEV_TX_BUSY
Date: Fri, 18 Dec 2020 14:45:25 +0100
Message-Id: <20201218134525.13119-3-magnus.karlsson@gmail.com>
In-Reply-To: <20201218134525.13119-1-magnus.karlsson@gmail.com>
References: <20201218134525.13119-1-magnus.karlsson@gmail.com>
X-Mailing-List: netdev@vger.kernel.org

From: Magnus Karlsson

Roll back the reservation in the completion ring when we get a
NETDEV_TX_BUSY error. When this error is received from the driver, we
are supposed to let the user application retry the transmit, and in
order to do this, we need to roll back the failed send so it can be
retried. Unfortunately, we did not cancel the reservation we had made
in the completion ring. By not doing this, we actually make the
completion ring one entry smaller per NETDEV_TX_BUSY error we get, and
after enough of these errors the completion ring will be of size zero
and transmit will stop working.

Fix this by cancelling the reservation when we get a NETDEV_TX_BUSY
error.
Fixes: 642e450b6b59 ("xsk: Do not discard packet when NETDEV_TX_BUSY")
Signed-off-by: Magnus Karlsson
Reported-by: Xuan Zhuo
Acked-by: Björn Töpel
---
 net/xdp/xsk.c       | 3 +++
 net/xdp/xsk_queue.h | 5 +++++
 2 files changed, 8 insertions(+)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index d531f9cd0de6..8037b04a9edd 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -487,6 +487,9 @@ static int xsk_generic_xmit(struct sock *sk)
 		if (err == NETDEV_TX_BUSY) {
 			/* Tell user-space to retry the send */
 			skb->destructor = sock_wfree;
+			spin_lock_irqsave(&xs->pool->cq_lock, flags);
+			xskq_prod_cancel(xs->pool->cq);
+			spin_unlock_irqrestore(&xs->pool->cq_lock, flags);
 			/* Free skb without triggering the perf drop trace */
 			consume_skb(skb);
 			err = -EAGAIN;
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index 4a9663aa7afe..2823b7c3302d 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -334,6 +334,11 @@ static inline bool xskq_prod_is_full(struct xsk_queue *q)
 	return xskq_prod_nb_free(q, 1) ? false : true;
 }
 
+static inline void xskq_prod_cancel(struct xsk_queue *q)
+{
+	q->cached_prod--;
+}
+
 static inline int xskq_prod_reserve(struct xsk_queue *q)
 {
 	if (xskq_prod_is_full(q))