From patchwork Fri Oct 2 18:06:41 2020
From: saeed@kernel.org
To: "David S. Miller", Jakub Kicinski
Cc: netdev@vger.kernel.org, Eran Ben Elisha, Moshe Shemesh, Saeed Mahameed
Subject: [net V3 01/14] net/mlx5: Fix a race when moving command interface to polling mode
Date: Fri, 2 Oct 2020 11:06:41 -0700
Message-Id: <20201002180654.262800-2-saeed@kernel.org>
X-Patchwork-Id: 267679

From: Eran Ben Elisha

During driver unload, the driver destroys the commands EQ (via an FW
command). Once the commands EQ is destroyed, FW will not generate EQEs
for any command the driver sends afterwards, so the driver must poll for
the status of later commands.

However, the driver's command-mode metadata is updated before the
commands EQ is actually destroyed. This can lead to a double completion
handling by the driver (both polling and interrupt) if a command is
executed and completed by FW after the mode was changed but before the
EQ was destroyed.

Fix this by using the mlx5_cmd_allowed_opcode mechanism to guarantee
that only the DESTROY_EQ command can be executed during this time
window.
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha
Reviewed-by: Moshe Shemesh
Signed-off-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 31ef9f8420c8..1318d774b18f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -656,8 +656,10 @@ static void destroy_async_eqs(struct mlx5_core_dev *dev)
 	cleanup_async_eq(dev, &table->pages_eq, "pages");
 	cleanup_async_eq(dev, &table->async_eq, "async");
+	mlx5_cmd_allowed_opcode(dev, MLX5_CMD_OP_DESTROY_EQ);
 	mlx5_cmd_use_polling(dev);
 	cleanup_async_eq(dev, &table->cmd_eq, "cmd");
+	mlx5_cmd_allowed_opcode(dev, CMD_ALLOWED_OPCODE_ALL);
 
 	mlx5_eq_notifier_unregister(dev, &table->cq_err_nb);
 }
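[For readers unfamiliar with the allowed-opcode gate used above, here is a
minimal userspace C sketch of the idea, not the driver code itself: a single
gate variable narrows the command interface to one opcode around the EQ
teardown. The opcode value and function names are illustrative only.]

#include <stdio.h>

#define CMD_ALLOWED_OPCODE_ALL 0xffffu
#define CMD_OP_DESTROY_EQ      0x0302u /* illustrative value, not from mlx5_ifc */

static unsigned int allowed_opcode = CMD_ALLOWED_OPCODE_ALL;

static void cmd_allowed_opcode(unsigned int opcode)
{
	allowed_opcode = opcode;
}

static int cmd_exec(unsigned int opcode)
{
	/* While the gate is narrowed, every other opcode is rejected, so no
	 * command can race with the commands-EQ teardown. */
	if (allowed_opcode != CMD_ALLOWED_OPCODE_ALL && opcode != allowed_opcode) {
		printf("opcode 0x%x rejected while gate is narrowed\n", opcode);
		return -1;
	}
	printf("executing opcode 0x%x\n", opcode);
	return 0;
}

int main(void)
{
	cmd_allowed_opcode(CMD_OP_DESTROY_EQ);      /* narrow the gate */
	/* switch to polling mode here; only DESTROY_EQ may run */
	cmd_exec(CMD_OP_DESTROY_EQ);                /* allowed: destroys the cmd EQ */
	cmd_exec(0x0101);                           /* any other opcode is rejected */
	cmd_allowed_opcode(CMD_ALLOWED_OPCODE_ALL); /* reopen the gate */
	return 0;
}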
Miller" , Jakub Kicinski Cc: netdev@vger.kernel.org, Eran Ben Elisha , Moshe Shemesh , Saeed Mahameed Subject: [net V3 02/14] net/mlx5: Avoid possible free of command entry while timeout comp handler Date: Fri, 2 Oct 2020 11:06:42 -0700 Message-Id: <20201002180654.262800-3-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201002180654.262800-1-saeed@kernel.org> References: <20201002180654.262800-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Eran Ben Elisha Upon command completion timeout, driver simulates a forced command completion. In a rare case where real interrupt for that command arrives simultaneously, it might release the command entry while the forced handler might still access it. Fix that by adding an entry refcount, to track current amount of allowed handlers. Command entry to be released only when this refcount is decremented to zero. Command refcount is always initialized to one. For callback commands, command completion handler is the symmetric flow to decrement it. For non-callback commands, it is wait_func(). Before ringing the doorbell, increment the refcount for the real completion handler. Once the real completion handler is called, it will decrement it. For callback commands, once the delayed work is scheduled, increment the refcount. Upon callback command completion handler, we will try to cancel the timeout callback. In case of success, we need to decrement the callback refcount as it will never run. In addition, gather the entry index free and the entry free into a one flow for all command types release. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha Reviewed-by: Moshe Shemesh Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 109 ++++++++++++------ include/linux/mlx5/driver.h | 2 + 2 files changed, 73 insertions(+), 38 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c index 1d91a0d0ab1d..c0055f5479ce 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c @@ -69,12 +69,10 @@ enum { MLX5_CMD_DELIVERY_STAT_CMD_DESCR_ERR = 0x10, }; -static struct mlx5_cmd_work_ent *alloc_cmd(struct mlx5_cmd *cmd, - struct mlx5_cmd_msg *in, - struct mlx5_cmd_msg *out, - void *uout, int uout_size, - mlx5_cmd_cbk_t cbk, - void *context, int page_queue) +static struct mlx5_cmd_work_ent * +cmd_alloc_ent(struct mlx5_cmd *cmd, struct mlx5_cmd_msg *in, + struct mlx5_cmd_msg *out, void *uout, int uout_size, + mlx5_cmd_cbk_t cbk, void *context, int page_queue) { gfp_t alloc_flags = cbk ? 
 {
 	gfp_t alloc_flags = cbk ? GFP_ATOMIC : GFP_KERNEL;
 	struct mlx5_cmd_work_ent *ent;
@@ -83,6 +81,7 @@ static struct mlx5_cmd_work_ent *alloc_cmd(struct mlx5_cmd *cmd,
 	if (!ent)
 		return ERR_PTR(-ENOMEM);
 
+	ent->idx	= -EINVAL;
 	ent->in		= in;
 	ent->out	= out;
 	ent->uout	= uout;
@@ -91,10 +90,16 @@ static struct mlx5_cmd_work_ent *alloc_cmd(struct mlx5_cmd *cmd,
 	ent->context	= context;
 	ent->cmd	= cmd;
 	ent->page_queue = page_queue;
+	refcount_set(&ent->refcnt, 1);
 
 	return ent;
 }
 
+static void cmd_free_ent(struct mlx5_cmd_work_ent *ent)
+{
+	kfree(ent);
+}
+
 static u8 alloc_token(struct mlx5_cmd *cmd)
 {
 	u8 token;
@@ -109,7 +114,7 @@ static u8 alloc_token(struct mlx5_cmd *cmd)
 	return token;
 }
 
-static int alloc_ent(struct mlx5_cmd *cmd)
+static int cmd_alloc_index(struct mlx5_cmd *cmd)
 {
 	unsigned long flags;
 	int ret;
@@ -123,7 +128,7 @@ static int alloc_ent(struct mlx5_cmd *cmd)
 	return ret < cmd->max_reg_cmds ? ret : -ENOMEM;
 }
 
-static void free_ent(struct mlx5_cmd *cmd, int idx)
+static void cmd_free_index(struct mlx5_cmd *cmd, int idx)
 {
 	unsigned long flags;
 
@@ -132,6 +137,22 @@ static void free_ent(struct mlx5_cmd *cmd, int idx)
 	spin_unlock_irqrestore(&cmd->alloc_lock, flags);
 }
 
+static void cmd_ent_get(struct mlx5_cmd_work_ent *ent)
+{
+	refcount_inc(&ent->refcnt);
+}
+
+static void cmd_ent_put(struct mlx5_cmd_work_ent *ent)
+{
+	if (!refcount_dec_and_test(&ent->refcnt))
+		return;
+
+	if (ent->idx >= 0)
+		cmd_free_index(ent->cmd, ent->idx);
+
+	cmd_free_ent(ent);
+}
+
 static struct mlx5_cmd_layout *get_inst(struct mlx5_cmd *cmd, int idx)
 {
 	return cmd->cmd_buf + (idx << cmd->log_stride);
@@ -219,11 +240,6 @@ static void poll_timeout(struct mlx5_cmd_work_ent *ent)
 	ent->ret = -ETIMEDOUT;
 }
 
-static void free_cmd(struct mlx5_cmd_work_ent *ent)
-{
-	kfree(ent);
-}
-
 static int verify_signature(struct mlx5_cmd_work_ent *ent)
 {
 	struct mlx5_cmd_mailbox *next = ent->out->next;
@@ -842,6 +858,7 @@ static void cb_timeout_handler(struct work_struct *work)
 		       mlx5_command_str(msg_to_opcode(ent->in)),
 		       msg_to_opcode(ent->in));
 	mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true);
+	cmd_ent_put(ent); /* for the cmd_ent_get() took on schedule delayed work */
 }
 
 static void free_msg(struct mlx5_core_dev *dev, struct mlx5_cmd_msg *msg);
@@ -873,14 +890,14 @@ static void cmd_work_handler(struct work_struct *work)
&cmd->pages_sem : &cmd->sem; down(sem); if (!ent->page_queue) { - alloc_ret = alloc_ent(cmd); + alloc_ret = cmd_alloc_index(cmd); if (alloc_ret < 0) { mlx5_core_err_rl(dev, "failed to allocate command entry\n"); if (ent->callback) { ent->callback(-EAGAIN, ent->context); mlx5_free_cmd_msg(dev, ent->out); free_msg(dev, ent->in); - free_cmd(ent); + cmd_ent_put(ent); } else { ent->ret = -EAGAIN; complete(&ent->done); @@ -916,8 +933,8 @@ static void cmd_work_handler(struct work_struct *work) ent->ts1 = ktime_get_ns(); cmd_mode = cmd->mode; - if (ent->callback) - schedule_delayed_work(&ent->cb_timeout_work, cb_timeout); + if (ent->callback && schedule_delayed_work(&ent->cb_timeout_work, cb_timeout)) + cmd_ent_get(ent); set_bit(MLX5_CMD_ENT_STATE_PENDING_COMP, &ent->state); /* Skip sending command to fw if internal error */ @@ -933,13 +950,10 @@ static void cmd_work_handler(struct work_struct *work) MLX5_SET(mbox_out, ent->out, syndrome, drv_synd); mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true); - /* no doorbell, no need to keep the entry */ - free_ent(cmd, ent->idx); - if (ent->callback) - free_cmd(ent); return; } + cmd_ent_get(ent); /* for the _real_ FW event on completion */ /* ring doorbell after the descriptor is valid */ mlx5_core_dbg(dev, "writing 0x%x to command doorbell\n", 1 << ent->idx); wmb(); @@ -1039,11 +1053,16 @@ static int mlx5_cmd_invoke(struct mlx5_core_dev *dev, struct mlx5_cmd_msg *in, if (callback && page_queue) return -EINVAL; - ent = alloc_cmd(cmd, in, out, uout, uout_size, callback, context, - page_queue); + ent = cmd_alloc_ent(cmd, in, out, uout, uout_size, + callback, context, page_queue); if (IS_ERR(ent)) return PTR_ERR(ent); + /* put for this ent is when consumed, depending on the use case + * 1) (!callback) blocking flow: by caller after wait_func completes + * 2) (callback) flow: by mlx5_cmd_comp_handler() when ent is handled + */ + ent->token = token; ent->polling = force_polling; @@ -1062,12 +1081,10 @@ static int mlx5_cmd_invoke(struct mlx5_core_dev *dev, struct mlx5_cmd_msg *in, } if (callback) - goto out; + goto out; /* mlx5_cmd_comp_handler() will put(ent) */ err = wait_func(dev, ent); - if (err == -ETIMEDOUT) - goto out; - if (err == -ECANCELED) + if (err == -ETIMEDOUT || err == -ECANCELED) goto out_free; ds = ent->ts2 - ent->ts1; @@ -1085,7 +1102,7 @@ static int mlx5_cmd_invoke(struct mlx5_core_dev *dev, struct mlx5_cmd_msg *in, *status = ent->status; out_free: - free_cmd(ent); + cmd_ent_put(ent); out: return err; } @@ -1516,14 +1533,19 @@ static void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, u64 vec, bool force if (!forced) { mlx5_core_err(dev, "Command completion arrived after timeout (entry idx = %d).\n", ent->idx); - free_ent(cmd, ent->idx); - free_cmd(ent); + cmd_ent_put(ent); } continue; } - if (ent->callback) - cancel_delayed_work(&ent->cb_timeout_work); + if (ent->callback && cancel_delayed_work(&ent->cb_timeout_work)) + cmd_ent_put(ent); /* timeout work was canceled */ + + if (!forced || /* Real FW completion */ + pci_channel_offline(dev->pdev) || /* FW is inaccessible */ + dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) + cmd_ent_put(ent); + if (ent->page_queue) sem = &cmd->pages_sem; else @@ -1545,10 +1567,6 @@ static void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, u64 vec, bool force ent->ret, deliv_status_to_str(ent->status), ent->status); } - /* only real completion will free the entry slot */ - if (!forced) - free_ent(cmd, ent->idx); - if (ent->callback) { ds = ent->ts2 - ent->ts1; if (ent->op < MLX5_CMD_OP_MAX) { @@ -1576,10 
@@ -1576,10 +1594,13 @@ static void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, u64 vec, bool forced)
 			free_msg(dev, ent->in);
 
 			err = err ? err : ent->status;
-			if (!forced)
-				free_cmd(ent);
+			/* final consumer is done, release ent */
+			cmd_ent_put(ent);
 			callback(err, context);
 		} else {
+			/* release wait_func() so mlx5_cmd_invoke()
+			 * can make the final ent_put()
+			 */
 			complete(&ent->done);
 		}
 		up(sem);
@@ -1589,8 +1610,11 @@ static void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, u64 vec, bool forced)
 
 void mlx5_cmd_trigger_completions(struct mlx5_core_dev *dev)
 {
+	struct mlx5_cmd *cmd = &dev->cmd;
+	unsigned long bitmask;
 	unsigned long flags;
 	u64 vector;
+	int i;
 
 	/* wait for pending handlers to complete */
 	mlx5_eq_synchronize_cmd_irq(dev);
@@ -1599,11 +1623,20 @@ void mlx5_cmd_trigger_completions(struct mlx5_core_dev *dev)
 	if (!vector)
 		goto no_trig;
 
+	bitmask = vector;
+	/* we must increment the allocated entries refcount before triggering the completions
+	 * to guarantee pending commands will not get freed in the meanwhile.
+	 * For that reason, it also has to be done inside the alloc_lock.
+	 */
+	for_each_set_bit(i, &bitmask, (1 << cmd->log_sz))
+		cmd_ent_get(cmd->ent_arr[i]);
 	vector |= MLX5_TRIGGERED_CMD_COMP;
 	spin_unlock_irqrestore(&dev->cmd.alloc_lock, flags);
 
 	mlx5_core_dbg(dev, "vector 0x%llx\n", vector);
 	mlx5_cmd_comp_handler(dev, vector, true);
+	for_each_set_bit(i, &bitmask, (1 << cmd->log_sz))
+		cmd_ent_put(cmd->ent_arr[i]);
 	return;
 
 no_trig:
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index c145de0473bc..897156822f0d 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -767,6 +767,8 @@ struct mlx5_cmd_work_ent {
 	u64 ts2;
 	u16 op;
 	bool polling;
+	/* Track the max comp handlers */
+	refcount_t refcnt;
 };
 
 struct mlx5_pas {
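[A minimal userspace sketch of the refcount lifecycle described in this
patch, using C11 atomics in place of the kernel's refcount_t. It is a model
of the scheme, not the driver code; names mirror the patch for readability.]

#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

struct cmd_ent {
	atomic_int refcnt;
	int idx;
};

static struct cmd_ent *ent_alloc(void)
{
	struct cmd_ent *ent = malloc(sizeof(*ent));

	if (!ent)
		return NULL;
	ent->idx = -1;                /* no slot yet, like ent->idx = -EINVAL */
	atomic_init(&ent->refcnt, 1); /* the allocator's own reference */
	return ent;
}

static void ent_get(struct cmd_ent *ent)
{
	atomic_fetch_add(&ent->refcnt, 1);
}

static void ent_put(struct cmd_ent *ent)
{
	/* The last put frees both the slot index and the entry itself,
	 * merging the two release flows exactly as the patch does. */
	if (atomic_fetch_sub(&ent->refcnt, 1) != 1)
		return;
	if (ent->idx >= 0)
		printf("freeing slot %d\n", ent->idx);
	free(ent);
}

int main(void)
{
	struct cmd_ent *ent = ent_alloc();

	ent->idx = 3;
	ent_get(ent); /* +1 before ringing the doorbell: the real completion */
	ent_get(ent); /* +1 when the timeout work is scheduled */

	ent_put(ent); /* timeout work canceled (or it ran) */
	ent_put(ent); /* real completion handled */
	ent_put(ent); /* final consumer (wait_func()/callback flow) done: frees */
	return 0;
}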
From patchwork Fri Oct 2 18:06:43 2020
From: saeed@kernel.org
To: "David S. Miller", Jakub Kicinski
Cc: netdev@vger.kernel.org, Eran Ben Elisha, Saeed Mahameed
Subject: [net V3 03/14] net/mlx5: poll cmd EQ in case of command timeout
Date: Fri, 2 Oct 2020 11:06:43 -0700
Message-Id: <20201002180654.262800-4-saeed@kernel.org>
X-Patchwork-Id: 289076

From: Eran Ben Elisha

Once the driver detects a command interface timeout, it warns the user
and returns a timeout error to the caller. In that case the command's
entry is not evacuated, because only a real event interrupt is allowed
to clear a command interface entry. If the HW event interrupt for this
entry never arrives, the entry is left unused forever. Command interface
entries are limited, so eventually the driver can end up unable to post
a new command at all.

In addition, if the driver does not consume the EQE of the lost
interrupt and re-arm the EQ, no new interrupts will arrive for other
commands either.

Add a resiliency mechanism that manually polls the command EQ on a
command timeout. If the resiliency flow finds an unhandled EQE, it
consumes it, and the command interface becomes fully functional again.
Once the resiliency flow finishes, wait another 5 seconds for the
command interface to complete this command entry.

Define mlx5_cmd_eq_recover() to manage the cmd EQ polling resiliency
flow, and add an async EQ spinlock to avoid races between resiliency
flows and real interrupts that might run simultaneously.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha
Signed-off-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c    | 53 ++++++++++++++++---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c     | 40 +++++++++++++-
 .../net/ethernet/mellanox/mlx5/core/lib/eq.h     |  2 +
 3 files changed, 86 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index c0055f5479ce..37dae95e61d5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -853,11 +853,21 @@ static void cb_timeout_handler(struct work_struct *work)
 	struct mlx5_core_dev *dev = container_of(ent->cmd, struct mlx5_core_dev,
 						 cmd);
 
+	mlx5_cmd_eq_recover(dev);
+
+	/* Maybe got handled by eq recover ? */
+	if (!test_bit(MLX5_CMD_ENT_STATE_PENDING_COMP, &ent->state)) {
+		mlx5_core_warn(dev, "cmd[%d]: %s(0x%x) Async, recovered after timeout\n", ent->idx,
+			       mlx5_command_str(msg_to_opcode(ent->in)), msg_to_opcode(ent->in));
+		goto out; /* phew, already handled */
+	}
+
 	ent->ret = -ETIMEDOUT;
-	mlx5_core_warn(dev, "%s(0x%x) timeout. Will cause a leak of a command resource\n",
-		       mlx5_command_str(msg_to_opcode(ent->in)),
-		       msg_to_opcode(ent->in));
+	mlx5_core_warn(dev, "cmd[%d]: %s(0x%x) Async, timeout. Will cause a leak of a command resource\n",
+		       ent->idx, mlx5_command_str(msg_to_opcode(ent->in)), msg_to_opcode(ent->in));
 	mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true);
+
+out:
 	cmd_ent_put(ent); /* for the cmd_ent_get() took on schedule delayed work */
 }
 
@@ -997,6 +1007,35 @@ static const char *deliv_status_to_str(u8 status)
 	}
 }
 
+enum {
+	MLX5_CMD_TIMEOUT_RECOVER_MSEC = 5 * 1000,
+};
+
+static void wait_func_handle_exec_timeout(struct mlx5_core_dev *dev,
+					  struct mlx5_cmd_work_ent *ent)
+{
+	unsigned long timeout = msecs_to_jiffies(MLX5_CMD_TIMEOUT_RECOVER_MSEC);
+
+	mlx5_cmd_eq_recover(dev);
+
+	/* Re-wait on the ent->done after executing the recovery flow. If the
+	 * recovery flow (or any other recovery flow running simultaneously)
+	 * has recovered an EQE, it should cause the entry to be completed by
+	 * the command interface.
+	 */
+	if (wait_for_completion_timeout(&ent->done, timeout)) {
+		mlx5_core_warn(dev, "cmd[%d]: %s(0x%x) recovered after timeout\n", ent->idx,
+			       mlx5_command_str(msg_to_opcode(ent->in)), msg_to_opcode(ent->in));
+		return;
+	}
+
+	mlx5_core_warn(dev, "cmd[%d]: %s(0x%x) No done completion\n", ent->idx,
+		       mlx5_command_str(msg_to_opcode(ent->in)), msg_to_opcode(ent->in));
+
+	ent->ret = -ETIMEDOUT;
+	mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true);
+}
+
 static int wait_func(struct mlx5_core_dev *dev, struct mlx5_cmd_work_ent *ent)
 {
 	unsigned long timeout = msecs_to_jiffies(MLX5_CMD_TIMEOUT_MSEC);
@@ -1008,12 +1047,10 @@ static int wait_func(struct mlx5_core_dev *dev, struct mlx5_cmd_work_ent *ent)
 		ent->ret = -ECANCELED;
 		goto out_err;
 	}
-	if (cmd->mode == CMD_MODE_POLLING || ent->polling) {
+	if (cmd->mode == CMD_MODE_POLLING || ent->polling)
 		wait_for_completion(&ent->done);
-	} else if (!wait_for_completion_timeout(&ent->done, timeout)) {
-		ent->ret = -ETIMEDOUT;
-		mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true);
-	}
+	else if (!wait_for_completion_timeout(&ent->done, timeout))
+		wait_func_handle_exec_timeout(dev, ent);
 
 out_err:
 	err = ent->ret;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 1318d774b18f..22a19d391e17 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -189,6 +189,29 @@ u32 mlx5_eq_poll_irq_disabled(struct mlx5_eq_comp *eq)
 	return count_eqe;
 }
 
+static void mlx5_eq_async_int_lock(struct mlx5_eq_async *eq, unsigned long *flags)
+	__acquires(&eq->lock)
+{
+	if (in_irq())
+		spin_lock(&eq->lock);
+	else
+		spin_lock_irqsave(&eq->lock, *flags);
+}
+
+static void mlx5_eq_async_int_unlock(struct mlx5_eq_async *eq, unsigned long *flags)
+	__releases(&eq->lock)
+{
+	if (in_irq())
+		spin_unlock(&eq->lock);
+	else
+		spin_unlock_irqrestore(&eq->lock, *flags);
+}
+
+enum async_eq_nb_action {
+	ASYNC_EQ_IRQ_HANDLER = 0,
+	ASYNC_EQ_RECOVER = 1,
+};
+
 static int mlx5_eq_async_int(struct notifier_block *nb,
 			     unsigned long action, void *data)
 {
@@ -198,11 +221,14 @@ static int mlx5_eq_async_int(struct notifier_block *nb,
 	struct mlx5_eq_table *eqt;
 	struct mlx5_core_dev *dev;
 	struct mlx5_eqe *eqe;
+	unsigned long flags;
 	int num_eqes = 0;
 
 	dev = eq->dev;
 	eqt = dev->priv.eq_table;
 
+	mlx5_eq_async_int_lock(eq_async, &flags);
+
 	eqe = next_eqe_sw(eq);
 	if (!eqe)
 		goto out;
@@ -223,8 +249,19 @@ static int mlx5_eq_async_int(struct notifier_block *nb,
 
 out:
 	eq_update_ci(eq, 1);
mlx5_eq_async_int_unlock(eq_async, &flags); - return 0; + return unlikely(action == ASYNC_EQ_RECOVER) ? num_eqes : 0; +} + +void mlx5_cmd_eq_recover(struct mlx5_core_dev *dev) +{ + struct mlx5_eq_async *eq = &dev->priv.eq_table->cmd_eq; + int eqes; + + eqes = mlx5_eq_async_int(&eq->irq_nb, ASYNC_EQ_RECOVER, NULL); + if (eqes) + mlx5_core_warn(dev, "Recovered %d EQEs on cmd_eq\n", eqes); } static void init_eq_buf(struct mlx5_eq *eq) @@ -569,6 +606,7 @@ setup_async_eq(struct mlx5_core_dev *dev, struct mlx5_eq_async *eq, int err; eq->irq_nb.notifier_call = mlx5_eq_async_int; + spin_lock_init(&eq->lock); err = create_async_eq(dev, &eq->core, param); if (err) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h index 4aaca7400fb2..5c681e31983b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h @@ -37,6 +37,7 @@ struct mlx5_eq { struct mlx5_eq_async { struct mlx5_eq core; struct notifier_block irq_nb; + spinlock_t lock; /* To avoid irq EQ handle races with resiliency flows */ }; struct mlx5_eq_comp { @@ -81,6 +82,7 @@ void mlx5_cq_tasklet_cb(unsigned long data); struct cpumask *mlx5_eq_comp_cpumask(struct mlx5_core_dev *dev, int ix); u32 mlx5_eq_poll_irq_disabled(struct mlx5_eq_comp *eq); +void mlx5_cmd_eq_recover(struct mlx5_core_dev *dev); void mlx5_eq_synchronize_async_irq(struct mlx5_core_dev *dev); void mlx5_eq_synchronize_cmd_irq(struct mlx5_core_dev *dev); From patchwork Fri Oct 2 18:06:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 267680 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.1 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6DCD3C4363D for ; Fri, 2 Oct 2020 18:07:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2DC86208A9 for ; Fri, 2 Oct 2020 18:07:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1601662035; bh=fUdoqY/U1lt6LJV0V8BgmigzUIZs6e0XLOTNZB6wCss=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=O21Y01QpMMEYaeG00K44dgvFR6JPg4EKct1LrLt9ytf3EwXLWftapl8QFh5BQXZQ/ GeX5lX+/XAq3TVDCFs+DUReMDex4+DNwljZErg5XTxsulube2veg6hOsHQqFJZf3lm OIeyb+G8/63wpOU/ODoPFa9dLd6vuNCjZt+UikSI= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388423AbgJBSHN (ORCPT ); Fri, 2 Oct 2020 14:07:13 -0400 Received: from mail.kernel.org ([198.145.29.99]:36702 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388174AbgJBSHI (ORCPT ); Fri, 2 Oct 2020 14:07:08 -0400 Received: from sx1.mtl.com (c-24-6-56-119.hsd1.ca.comcast.net [24.6.56.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 198D52177B; Fri, 2 Oct 2020 18:07:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1601662027; bh=fUdoqY/U1lt6LJV0V8BgmigzUIZs6e0XLOTNZB6wCss=; 
From patchwork Fri Oct 2 18:06:44 2020
From: saeed@kernel.org
To: "David S. Miller", Jakub Kicinski
Cc: netdev@vger.kernel.org, Eran Ben Elisha, Moshe Shemesh, Saeed Mahameed
Subject: [net V3 04/14] net/mlx5: Add retry mechanism to the command entry index allocation
Date: Fri, 2 Oct 2020 11:06:44 -0700
Message-Id: <20201002180654.262800-5-saeed@kernel.org>
X-Patchwork-Id: 267680

From: Eran Ben Elisha

It is possible for the allocation of a new command entry index to fail
temporarily. The new command already holds the semaphore, which means a
free entry should become available soon. Add a one-second retry
mechanism before returning an error.

The patch "net/mlx5: Avoid possible free of command entry while timeout
comp handler" increases the likelihood of hitting this temporary
failure, as it delays the entry index release for non-callback commands.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha
Reviewed-by: Moshe Shemesh
Signed-off-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 ++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 37dae95e61d5..2b597ac365f8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -883,6 +883,25 @@ static bool opcode_allowed(struct mlx5_cmd *cmd, u16 opcode)
 	return cmd->allowed_opcode == opcode;
 }
 
+static int cmd_alloc_index_retry(struct mlx5_cmd *cmd)
+{
+	unsigned long alloc_end = jiffies + msecs_to_jiffies(1000);
+	int idx;
+
+retry:
+	idx = cmd_alloc_index(cmd);
+	if (idx < 0 && time_before(jiffies, alloc_end)) {
+		/* Index allocation can fail on heavy load of commands. This is a temporary
+		 * situation as the current command already holds the semaphore, meaning that
+		 * another command completion is being handled and it is expected to release
+		 * the entry index soon.
+		 */
+		cpu_relax();
+		goto retry;
+	}
+	return idx;
+}
+
 static void cmd_work_handler(struct work_struct *work)
 {
 	struct mlx5_cmd_work_ent *ent = container_of(work, struct mlx5_cmd_work_ent, work);
@@ -900,7 +919,7 @@ static void cmd_work_handler(struct work_struct *work)
&cmd->pages_sem : &cmd->sem; down(sem); if (!ent->page_queue) { - alloc_ret = cmd_alloc_index(cmd); + alloc_ret = cmd_alloc_index_retry(cmd); if (alloc_ret < 0) { mlx5_core_err_rl(dev, "failed to allocate command entry\n"); if (ent->callback) { From patchwork Fri Oct 2 18:06:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 289072 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.1 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E8213C4363D for ; Fri, 2 Oct 2020 18:07:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AF3EC2085B for ; Fri, 2 Oct 2020 18:07:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1601662047; bh=dQthWUZ4CmnrlI3D25R7xvvObB7rhNaUH6ftP9xw8iA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=cybsNhfcxOIJrg4CargkQfLmAVVhO0kgTsysnYEi9tzR2cuKYDf/4l1Jc+t4VKqoA gSokciJKHTAtWiYeJGU8t7qQB6JC6vQ9oTAnWUtTIOjgO1RlMZmW556bZYAw32EP/q 5bNRPqgTUGCicpGB8DjB+4CYZsQAYRv+1dHCXFIc= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388342AbgJBSH1 (ORCPT ); Fri, 2 Oct 2020 14:07:27 -0400 Received: from mail.kernel.org ([198.145.29.99]:36736 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388209AbgJBSHI (ORCPT ); Fri, 2 Oct 2020 14:07:08 -0400 Received: from sx1.mtl.com (c-24-6-56-119.hsd1.ca.comcast.net [24.6.56.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 86B672184D; Fri, 2 Oct 2020 18:07:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1601662027; bh=dQthWUZ4CmnrlI3D25R7xvvObB7rhNaUH6ftP9xw8iA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=TSRfofJ1qWxDbAZwxCpuwJLVl/u6O4+j2npt5mGRdwrurK6ODmfHIICQyxDHQbqnH 5OkaBBNRsc+ZOd/DGisrgnU7jHbEqKrQw7RbRPnxU58MxTGrVrfDlolVWTGUp9ezhp 26dlEXEkeb2WraQbdv4HDKsmGKXa/939t2urqNIw= From: saeed@kernel.org To: "David S. Miller" , Jakub Kicinski Cc: netdev@vger.kernel.org, Saeed Mahameed , Moshe Shemesh Subject: [net V3 05/14] net/mlx5: cmdif, Avoid skipping reclaim pages if FW is not accessible Date: Fri, 2 Oct 2020 11:06:45 -0700 Message-Id: <20201002180654.262800-6-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201002180654.262800-1-saeed@kernel.org> References: <20201002180654.262800-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Saeed Mahameed In case of pci is offline reclaim_pages_cmd() will still try to call the FW to release FW pages, cmd_exec() in this case will return a silent success without actually calling the FW. This is wrong and will cause page leaks, what we should do is to detect pci offline or command interface un-available before tying to access the FW and manually release the FW pages in the driver. 
From patchwork Fri Oct 2 18:06:45 2020
From: saeed@kernel.org
To: "David S. Miller", Jakub Kicinski
Cc: netdev@vger.kernel.org, Saeed Mahameed, Moshe Shemesh
Subject: [net V3 05/14] net/mlx5: cmdif, Avoid skipping reclaim pages if FW is not accessible
Date: Fri, 2 Oct 2020 11:06:45 -0700
Message-Id: <20201002180654.262800-6-saeed@kernel.org>
X-Patchwork-Id: 289072

From: Saeed Mahameed

When the PCI device is offline, reclaim_pages_cmd() still tries to call
the FW to release FW pages; cmd_exec() in this case returns a silent
success without actually calling the FW. This is wrong and causes page
leaks. Instead, we should detect that the PCI device is offline or the
command interface is unavailable before trying to access the FW, and
manually release the FW pages in the driver.

In this patch we share the code that checks for FW command interface
availability and call it in the sensitive places, e.g.
reclaim_pages_cmd().

Alternative fixes considered:
 1. Remove MLX5_CMD_OP_MANAGE_PAGES from mlx5_internal_err_ret_value,
    the command-success simulation list.
 2. Always release FW pages even if cmd_exec fails in
    reclaim_pages_cmd().

Reviewed-by: Moshe Shemesh
Signed-off-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c        | 17 +++++++++--------
 .../net/ethernet/mellanox/mlx5/core/pagealloc.c      |  2 +-
 include/linux/mlx5/driver.h                          |  1 +
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 2b597ac365f8..2d1f4b3be9bf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -902,6 +902,13 @@ static int cmd_alloc_index_retry(struct mlx5_cmd *cmd)
 	return idx;
 }
 
+bool mlx5_cmd_is_down(struct mlx5_core_dev *dev)
+{
+	return pci_channel_offline(dev->pdev) ||
+	       dev->cmd.state != MLX5_CMDIF_STATE_UP ||
+	       dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR;
+}
+
 static void cmd_work_handler(struct work_struct *work)
 {
 	struct mlx5_cmd_work_ent *ent = container_of(work, struct mlx5_cmd_work_ent, work);
@@ -967,10 +974,7 @@ static void cmd_work_handler(struct work_struct *work)
 	set_bit(MLX5_CMD_ENT_STATE_PENDING_COMP, &ent->state);
 
 	/* Skip sending command to fw if internal error */
-	if (pci_channel_offline(dev->pdev) ||
-	    dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR ||
-	    cmd->state != MLX5_CMDIF_STATE_UP ||
-	    !opcode_allowed(&dev->cmd, ent->op)) {
+	if (mlx5_cmd_is_down(dev) || !opcode_allowed(&dev->cmd, ent->op)) {
 		u8 status = 0;
 		u32 drv_synd;
 
@@ -1800,10 +1804,7 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
 	u8 token;
 
 	opcode = MLX5_GET(mbox_in, in, opcode);
-	if (pci_channel_offline(dev->pdev) ||
-	    dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR ||
-	    dev->cmd.state != MLX5_CMDIF_STATE_UP ||
-	    !opcode_allowed(&dev->cmd, opcode)) {
+	if (mlx5_cmd_is_down(dev) || !opcode_allowed(&dev->cmd, opcode)) {
 		err = mlx5_internal_err_ret_value(dev, opcode, &drv_synd, &status);
 		MLX5_SET(mbox_out, out, status, status);
 		MLX5_SET(mbox_out, out, syndrome, drv_synd);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
index f9b798af6335..c0e18f2ade99 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c
@@ -432,7 +432,7 @@ static int reclaim_pages_cmd(struct mlx5_core_dev *dev,
 	u32 npages;
 	u32 i = 0;
 
-	if (dev->state != MLX5_DEVICE_STATE_INTERNAL_ERROR)
+	if (!mlx5_cmd_is_down(dev))
 		return mlx5_cmd_exec(dev, in, in_size, out, out_size);
 
 	/* No hard feelings, we want our pages back! */
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 897156822f0d..372100c755e7 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -935,6 +935,7 @@ int mlx5_cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
 int mlx5_cmd_exec_polling(struct mlx5_core_dev *dev, void *in, int in_size,
 			  void *out, int out_size);
 void mlx5_cmd_mbox_status(void *out, u8 *status, u32 *syndrome);
+bool mlx5_cmd_is_down(struct mlx5_core_dev *dev);
 
 int mlx5_core_get_caps(struct mlx5_core_dev *dev, enum mlx5_cap_type cap_type);
 int mlx5_cmd_alloc_uar(struct mlx5_core_dev *dev, u32 *uarn);
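[A userspace C sketch of the shared availability check introduced above: one
predicate consulted before any FW access, instead of three conditions
open-coded at each call site. The struct and flags are illustrative stand-ins
for the device state the driver actually inspects.]

#include <stdbool.h>
#include <stdio.h>

struct dev_state {
	bool pci_offline;
	bool cmdif_up;
	bool internal_error;
};

static bool cmd_is_down(const struct dev_state *d)
{
	return d->pci_offline || !d->cmdif_up || d->internal_error;
}

static void reclaim_pages(struct dev_state *d, int npages)
{
	if (!cmd_is_down(d)) {
		printf("asking FW to release %d pages\n", npages);
		return;
	}
	/* FW unreachable: reclaim the pages in the driver so a simulated
	 * command "success" does not silently leak them. */
	printf("FW down, manually freeing %d pages\n", npages);
}

int main(void)
{
	struct dev_state d = { .cmdif_up = false };

	reclaim_pages(&d, 64);
	return 0;
}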
From patchwork Fri Oct 2 18:06:46 2020
From: saeed@kernel.org
To: "David S. Miller", Jakub Kicinski
Cc: netdev@vger.kernel.org, Maor Gottlieb, Eran Ben Elisha, Saeed Mahameed
Subject: [net V3 06/14] net/mlx5: Fix request_irqs error flow
Date: Fri, 2 Oct 2020 11:06:46 -0700
Message-Id: <20201002180654.262800-7-saeed@kernel.org>
X-Patchwork-Id: 267676

From: Maor Gottlieb

Fix the error flow in request_irqs, which tried to free the irq that we
had just failed to request. It fixes the trace below.

 WARNING: CPU: 1 PID: 7587 at kernel/irq/manage.c:1684 free_irq+0x4d/0x60
 CPU: 1 PID: 7587 Comm: bash Tainted: G W OE 4.15.15-1.el7MELLANOXsmp-x86_64 #1
 Hardware name: Advantech SKY-6200/SKY-6200, BIOS F2.00 08/06/2020
 RIP: 0010:free_irq+0x4d/0x60
 RSP: 0018:ffffc9000ef47af0 EFLAGS: 00010282
 RAX: ffff88001476ae00 RBX: 0000000000000655 RCX: 0000000000000000
 RDX: ffff88001476ae00 RSI: ffffc9000ef47ab8 RDI: ffff8800398bb478
 RBP: ffff88001476a838 R08: ffff88001476ae00 R09: 000000000000156d
 R10: 0000000000000000 R11: 0000000000000004 R12: ffff88001476a838
 R13: 0000000000000006 R14: ffff88001476a888 R15: 00000000ffffffe4
 FS: 00007efeadd32740(0000) GS:ffff88047fc40000(0000) knlGS:0000000000000000
 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007fc9cc010008 CR3: 00000001a2380004 CR4: 00000000007606e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  mlx5_irq_table_create+0x38d/0x400 [mlx5_core]
  ? atomic_notifier_chain_register+0x50/0x60
  mlx5_load_one+0x7ee/0x1130 [mlx5_core]
  init_one+0x4c9/0x650 [mlx5_core]
  pci_device_probe+0xb8/0x120
  driver_probe_device+0x2a1/0x470
  ? driver_allows_async_probing+0x30/0x30
  bus_for_each_drv+0x54/0x80
  __device_attach+0xa3/0x100
  pci_bus_add_device+0x4a/0x90
  pci_iov_add_virtfn+0x2dc/0x2f0
  pci_enable_sriov+0x32e/0x420
  mlx5_core_sriov_configure+0x61/0x1b0 [mlx5_core]
  ? kstrtoll+0x22/0x70
  num_vf_store+0x4b/0x70 [mlx5_core]
  kernfs_fop_write+0x102/0x180
  __vfs_write+0x26/0x140
  ? rcu_all_qs+0x5/0x80
  ? _cond_resched+0x15/0x30
  ? __sb_start_write+0x41/0x80
  vfs_write+0xad/0x1a0
  SyS_write+0x42/0x90
  do_syscall_64+0x60/0x110
  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Fixes: 24163189da48 ("net/mlx5: Separate IRQ request/free from EQ life cycle")
Signed-off-by: Maor Gottlieb
Reviewed-by: Eran Ben Elisha
Signed-off-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
index 373981a659c7..6fd974920394 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
@@ -115,7 +115,7 @@ static int request_irqs(struct mlx5_core_dev *dev, int nvec)
 	return 0;
 
 err_request_irq:
-	for (; i >= 0; i--) {
+	while (i--) {
 		struct mlx5_irq *irq = mlx5_irq_get(dev, i);
 		int irqn = pci_irq_vector(dev->pdev, i);
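[A tiny self-contained C sketch of the unwind idiom the fix adopts: on
failure at index i, free only indices 0..i-1. `while (i--)` does exactly
that, whereas the old `for (; i >= 0; i--)` also touched index i, which was
never acquired. The helpers here are illustrative, not the driver's.]

#include <stdio.h>

#define NVEC 4

static int request_one(int i) { return i == 2 ? -1 : 0; } /* fails at i == 2 */
static void free_one(int i)   { printf("freeing irq %d\n", i); }

int main(void)
{
	int i;

	for (i = 0; i < NVEC; i++)
		if (request_one(i) < 0)
			goto err;
	return 0;

err:
	while (i--)        /* frees 1, then 0; never the failed index 2 */
		free_one(i);
	return 1;
}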
From patchwork Fri Oct 2 18:06:47 2020
From: saeed@kernel.org
To: "David S. Miller", Jakub Kicinski
Cc: netdev@vger.kernel.org, Aya Levin, Tariq Toukan, Saeed Mahameed
Subject: [net V3 07/14] net/mlx5e: Fix error path for RQ alloc
Date: Fri, 2 Oct 2020 11:06:47 -0700
Message-Id: <20201002180654.262800-8-saeed@kernel.org>
X-Patchwork-Id: 267677

From: Aya Levin

Increase the granularity of the error path to avoid unneeded
free/release operations, and fix the cleanup so it is symmetric to the
order of creation.

Fixes: 0ddf543226ac ("xdp/mlx5: setup xdp_rxq_info")
Fixes: 422d4c401edd ("net/mlx5e: RX, Split WQ objects for different RQ types")
Signed-off-by: Aya Levin
Reviewed-by: Tariq Toukan
Signed-off-by: Saeed Mahameed
---
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 32 ++++++++++---------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index b3cda7b6e5e1..1fbd7a0f3ca6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -396,7 +396,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 	rq_xdp_ix += params->num_channels * MLX5E_RQ_GROUP_XSK;
 	err = xdp_rxq_info_reg(&rq->xdp_rxq, rq->netdev, rq_xdp_ix);
 	if (err < 0)
-		goto err_rq_wq_destroy;
+		goto err_rq_xdp_prog;
 	rq->buff.map_dir = params->xdp_prog ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE;
 	rq->buff.headroom = mlx5e_get_rq_headroom(mdev, params, xsk);
@@ -407,7 +407,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 		err = mlx5_wq_ll_create(mdev, &rqp->wq, rqc_wq, &rq->mpwqe.wq,
 					&rq->wq_ctrl);
 		if (err)
-			goto err_rq_wq_destroy;
+			goto err_rq_xdp;
 
 		rq->mpwqe.wq.db = &rq->mpwqe.wq.db[MLX5_RCV_DBR];
@@ -429,13 +429,13 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 		err = mlx5e_rq_alloc_mpwqe_info(rq, c);
 		if (err)
-			goto err_free;
+			goto err_rq_mkey;
 		break;
 	default: /* MLX5_WQ_TYPE_CYCLIC */
 		err = mlx5_wq_cyc_create(mdev, &rqp->wq, rqc_wq, &rq->wqe.wq,
 					 &rq->wq_ctrl);
 		if (err)
-			goto err_rq_wq_destroy;
+			goto err_rq_xdp;
 
 		rq->wqe.wq.db = &rq->wqe.wq.db[MLX5_RCV_DBR];
@@ -450,19 +450,19 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 					      GFP_KERNEL, cpu_to_node(c->cpu));
 		if (!rq->wqe.frags) {
 			err = -ENOMEM;
-			goto err_free;
+			goto err_rq_wq_destroy;
 		}
 
 		err = mlx5e_init_di_list(rq, wq_sz, c->cpu);
 		if (err)
-			goto err_free;
+			goto err_rq_frags;
 
 		rq->mkey_be = c->mkey_be;
 	}
 
 	err = mlx5e_rq_set_handlers(rq, params, xsk);
 	if (err)
-		goto err_free;
+		goto err_free_by_rq_type;
 
 	if (xsk) {
 		err = xdp_rxq_info_reg_mem_model(&rq->xdp_rxq,
@@ -486,13 +486,13 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 		if (IS_ERR(rq->page_pool)) {
 			err = PTR_ERR(rq->page_pool);
 			rq->page_pool = NULL;
-			goto err_free;
+			goto err_free_by_rq_type;
 		}
 		err = xdp_rxq_info_reg_mem_model(&rq->xdp_rxq,
 						 MEM_TYPE_PAGE_POOL, rq->page_pool);
 	}
 	if (err)
-		goto err_free;
+		goto err_free_by_rq_type;
 
 	for (i = 0; i < wq_sz; i++) {
 		if (rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ) {
@@ -542,23 +542,25 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 
 	return 0;
 
-err_free:
+err_free_by_rq_type:
 	switch (rq->wq_type) {
 	case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
 		kvfree(rq->mpwqe.info);
+err_rq_mkey:
 		mlx5_core_destroy_mkey(mdev, &rq->umr_mkey);
 		break;
 	default: /* MLX5_WQ_TYPE_CYCLIC */
-		kvfree(rq->wqe.frags);
 		mlx5e_free_di_list(rq);
+err_rq_frags:
+		kvfree(rq->wqe.frags);
 	}
-
 err_rq_wq_destroy:
+	mlx5_wq_destroy(&rq->wq_ctrl);
+err_rq_xdp:
+	xdp_rxq_info_unreg(&rq->xdp_rxq);
+err_rq_xdp_prog:
 	if (params->xdp_prog)
 		bpf_prog_put(params->xdp_prog);
-	xdp_rxq_info_unreg(&rq->xdp_rxq);
-	page_pool_destroy(rq->page_pool);
-	mlx5_wq_destroy(&rq->wq_ctrl);
 
 	return err;
 }
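[A generic, self-contained C sketch of the layered goto-ladder error path
the patch moves to: one label per acquired resource, releasing in exact
reverse order of acquisition, so a failure at step N unwinds only steps
1..N-1. The resources here are plain allocations, purely for illustration.]

#include <stdio.h>
#include <stdlib.h>

static int setup(void)
{
	void *a, *b, *c;
	int err = -1;

	a = malloc(16);            /* step 1 */
	if (!a)
		goto err_out;
	b = malloc(16);            /* step 2 */
	if (!b)
		goto err_free_a;
	c = malloc(16);            /* step 3 */
	if (!c)
		goto err_free_b;

	free(c);                   /* success path for the demo */
	free(b);
	free(a);
	return 0;

err_free_b:                        /* unwinds step 2 only */
	free(b);
err_free_a:                        /* then step 1 */
	free(a);
err_out:
	return err;
}

int main(void)
{
	printf("setup returned %d\n", setup());
	return 0;
}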
From patchwork Fri Oct 2 18:06:48 2020
From: saeed@kernel.org
To: "David S. Miller", Jakub Kicinski
Cc: netdev@vger.kernel.org, Aya Levin, Tariq Toukan, Saeed Mahameed
Subject: [net V3 08/14] net/mlx5e: Add resiliency in Striding RQ mode for packets larger than MTU
Date: Fri, 2 Oct 2020 11:06:48 -0700
Message-Id: <20201002180654.262800-9-saeed@kernel.org>
X-Patchwork-Id: 289073

From: Aya Levin

Prior to this fix, in Striding RQ mode the driver was vulnerable when
receiving packets in the range (stride size - headroom, stride size],
where stride size is mtu+headroom+tailroom aligned up to the closest
power of 2.

Usually, this filtering is performed by the HW, except for a few cases:
- Between 2 VFs over the same PF with different MTUs
- On BlueField, when the host physical function sets a larger MTU than
  the ARM has configured on its representor and uplink representor.

When the HW filtering is not present, packets larger than MTU might be
harmful to the RQ's integrity, with the following impacts:

1) Overflow from one WQE to the next, causing a memory corruption that
   is in most cases unharmful: the write happens to the headroom of the
   next packet, which will be overwritten by build_skb(). In very rare
   cases (high stress/load) this is harmful, when the next WQE is not
   yet reposted and still points to an existing SKB head.

2) Each oversize packet overflows into the headroom of the next WQE. On
   the last WQE of the WQ, where addresses wrap around, the address of
   the remainder headroom does not belong to the next WQE, but is out of
   the memory region range. This results in a HW CQE error that moves
   the RQ into an error state.

Solution:
Add a page buffer at the end of each WQE to absorb the leak. The maximal
overflow size is actually the headroom, but since all memory units must
be of the same size, we use a full page to comply with UMR WQEs. The
increase in memory consumption is a single page per RQ.

Initialize the mkey with all MTTs pointing to a default page. When the
channels are activated, UMR WQEs will redirect the RX WQEs to the actual
memory from the RQ's pool, while the overflow MTTs remain mapped to the
default page.
Fixes: 73281b78a37a ("net/mlx5e: Derive Striding RQ size from MTU")
Signed-off-by: Aya Levin
Reviewed-by: Tariq Toukan
Signed-off-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |  8 ++-
 .../net/ethernet/mellanox/mlx5/core/en_main.c     | 55 +++++++++++++++++--
 2 files changed, 58 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 90d5caabd6af..356f5852955f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -91,7 +91,12 @@ struct page_pool;
 #define MLX5_MPWRQ_PAGES_PER_WQE	BIT(MLX5_MPWRQ_WQE_PAGE_ORDER)
 
 #define MLX5_MTT_OCTW(npages) (ALIGN(npages, 8) / 2)
-#define MLX5E_REQUIRED_WQE_MTTS		(ALIGN(MLX5_MPWRQ_PAGES_PER_WQE, 8))
+/* Add another page to MLX5E_REQUIRED_WQE_MTTS as a buffer between
+ * WQEs. This page will absorb write overflow by the hardware, when
+ * receiving packets larger than MTU. These oversize packets are
+ * dropped by the driver at a later stage.
+ */
+#define MLX5E_REQUIRED_WQE_MTTS		(ALIGN(MLX5_MPWRQ_PAGES_PER_WQE + 1, 8))
 #define MLX5E_LOG_ALIGNED_MPWQE_PPW	(ilog2(MLX5E_REQUIRED_WQE_MTTS))
 #define MLX5E_REQUIRED_MTTS(wqes)	(wqes * MLX5E_REQUIRED_WQE_MTTS)
 #define MLX5E_MAX_RQ_NUM_MTTS	\
@@ -617,6 +622,7 @@ struct mlx5e_rq {
 	u32			rqn;
 	struct mlx5_core_dev	*mdev;
 	struct mlx5_core_mkey	umr_mkey;
+	struct mlx5e_dma_info	wqe_overflow;
 
 	/* XDP read-mostly */
 	struct xdp_rxq_info	xdp_rxq;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 1fbd7a0f3ca6..b1a16fb9667e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -246,12 +246,17 @@ static int mlx5e_rq_alloc_mpwqe_info(struct mlx5e_rq *rq,
 
 static int mlx5e_create_umr_mkey(struct mlx5_core_dev *mdev,
 				 u64 npages, u8 page_shift,
-				 struct mlx5_core_mkey *umr_mkey)
+				 struct mlx5_core_mkey *umr_mkey,
+				 dma_addr_t filler_addr)
 {
-	int inlen = MLX5_ST_SZ_BYTES(create_mkey_in);
+	struct mlx5_mtt *mtt;
+	int inlen;
 	void *mkc;
 	u32 *in;
 	int err;
+	int i;
+
+	inlen = MLX5_ST_SZ_BYTES(create_mkey_in) + sizeof(*mtt) * npages;
 
 	in = kvzalloc(inlen, GFP_KERNEL);
 	if (!in)
@@ -271,6 +276,18 @@ static int mlx5e_create_umr_mkey(struct mlx5_core_dev *mdev,
 	MLX5_SET(mkc, mkc, translations_octword_size,
 		 MLX5_MTT_OCTW(npages));
 	MLX5_SET(mkc, mkc, log_page_size, page_shift);
+	MLX5_SET(create_mkey_in, in, translations_octword_actual_size,
+		 MLX5_MTT_OCTW(npages));
+
+	/* Initialize the mkey with all MTTs pointing to a default
+	 * page (filler_addr). When the channels are activated, UMR
+	 * WQEs will redirect the RX WQEs to the actual memory from
+	 * the RQ's pool, while the gaps (wqe_overflow) remain mapped
+	 * to the default page.
+	 */
+ */ + mtt = MLX5_ADDR_OF(create_mkey_in, in, klm_pas_mtt); + for (i = 0 ; i < npages ; i++) + mtt[i].ptag = cpu_to_be64(filler_addr); err = mlx5_core_create_mkey(mdev, umr_mkey, in, inlen); @@ -282,7 +299,8 @@ static int mlx5e_create_rq_umr_mkey(struct mlx5_core_dev *mdev, struct mlx5e_rq { u64 num_mtts = MLX5E_REQUIRED_MTTS(mlx5_wq_ll_get_size(&rq->mpwqe.wq)); - return mlx5e_create_umr_mkey(mdev, num_mtts, PAGE_SHIFT, &rq->umr_mkey); + return mlx5e_create_umr_mkey(mdev, num_mtts, PAGE_SHIFT, &rq->umr_mkey, + rq->wqe_overflow.addr); } static inline u64 mlx5e_get_mpwqe_offset(struct mlx5e_rq *rq, u16 wqe_ix) @@ -350,6 +368,28 @@ static void mlx5e_rq_err_cqe_work(struct work_struct *recover_work) mlx5e_reporter_rq_cqe_err(rq); } +static int mlx5e_alloc_mpwqe_rq_drop_page(struct mlx5e_rq *rq) +{ + rq->wqe_overflow.page = alloc_page(GFP_KERNEL); + if (!rq->wqe_overflow.page) + return -ENOMEM; + + rq->wqe_overflow.addr = dma_map_page(rq->pdev, rq->wqe_overflow.page, 0, + PAGE_SIZE, rq->buff.map_dir); + if (dma_mapping_error(rq->pdev, rq->wqe_overflow.addr)) { + __free_page(rq->wqe_overflow.page); + return -ENOMEM; + } + return 0; +} + +static void mlx5e_free_mpwqe_rq_drop_page(struct mlx5e_rq *rq) +{ + dma_unmap_page(rq->pdev, rq->wqe_overflow.addr, PAGE_SIZE, + rq->buff.map_dir); + __free_page(rq->wqe_overflow.page); +} + static int mlx5e_alloc_rq(struct mlx5e_channel *c, struct mlx5e_params *params, struct mlx5e_xsk_param *xsk, @@ -409,6 +449,10 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c, if (err) goto err_rq_xdp; + err = mlx5e_alloc_mpwqe_rq_drop_page(rq); + if (err) + goto err_rq_wq_destroy; + rq->mpwqe.wq.db = &rq->mpwqe.wq.db[MLX5_RCV_DBR]; wq_sz = mlx5_wq_ll_get_size(&rq->mpwqe.wq); @@ -424,7 +468,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c, err = mlx5e_create_rq_umr_mkey(mdev, rq); if (err) - goto err_rq_wq_destroy; + goto err_rq_drop_page; rq->mkey_be = cpu_to_be32(rq->umr_mkey.key); err = mlx5e_rq_alloc_mpwqe_info(rq, c); @@ -548,6 +592,8 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c, kvfree(rq->mpwqe.info); err_rq_mkey: mlx5_core_destroy_mkey(mdev, &rq->umr_mkey); +err_rq_drop_page: + mlx5e_free_mpwqe_rq_drop_page(rq); break; default: /* MLX5_WQ_TYPE_CYCLIC */ mlx5e_free_di_list(rq); @@ -582,6 +628,7 @@ static void mlx5e_free_rq(struct mlx5e_rq *rq) case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ: kvfree(rq->mpwqe.info); mlx5_core_destroy_mkey(rq->mdev, &rq->umr_mkey); + mlx5e_free_mpwqe_rq_drop_page(rq); break; default: /* MLX5_WQ_TYPE_CYCLIC */ kvfree(rq->wqe.frags);
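As an aside, the mkey initialization in the hunk above can be pictured with a standalone sketch: every translation entry starts out pointing at one shared overflow page, and only the real buffer slots are remapped later. All names, addresses and the array size below are made up for illustration:

/* Standalone sketch (not driver code) of the "filler page" idea:
 * every translation entry initially points at one shared drop page,
 * so a stray HW write always lands somewhere harmless.
 */
#include <stdint.h>
#include <stdio.h>

#define NPAGES 8	/* illustrative table size */

int main(void)
{
	uint64_t filler_addr = 0x1000;	/* pretend DMA address of the drop page */
	uint64_t mtt[NPAGES];
	int i;

	/* Step 1: as in the patched mlx5e_create_umr_mkey(), point every
	 * entry at the filler page.
	 */
	for (i = 0; i < NPAGES; i++)
		mtt[i] = filler_addr;

	/* Step 2: when buffers are posted, only their own slots are
	 * remapped; the trailing overflow slot keeps the filler mapping.
	 */
	for (i = 0; i < NPAGES - 1; i++)
		mtt[i] = 0x2000 + (uint64_t)i * 0x1000;

	for (i = 0; i < NPAGES; i++)
		printf("mtt[%d] -> %#llx\n", i, (unsigned long long)mtt[i]);
	return 0;
}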
From patchwork Fri Oct 2 18:06:49 2020
From: saeed@kernel.org
To: "David S. Miller" , Jakub Kicinski
Cc: netdev@vger.kernel.org, Maor Dickman , Roi Dayan , Saeed Mahameed
Subject: [net V3 09/14] net/mlx5e: CT, Fix coverity issue
Date: Fri, 2 Oct 2020 11:06:49 -0700
Message-Id: <20201002180654.262800-10-saeed@kernel.org>
In-Reply-To: <20201002180654.262800-1-saeed@kernel.org>

From: Maor Dickman

The cited commit introduced the following coverity issue at function mlx5_tc_ct_rule_to_tuple_nat:
- Memory corruption (OVERRUN): overrunning array "tuple->ip.src_v6.in6_u.u6_addr32" of 4 4-byte elements at element index 7 (byte offset 31) using index "ip6_offset" (which evaluates to 7).

In case of an IPv6 destination address rewrite, ip6_offset values are between 4 and 7, which causes a memory overrun from array "tuple->ip.src_v6.in6_u.u6_addr32" into array "tuple->ip.dst_v6.in6_u.u6_addr32".

Fix by writing the value directly to array "tuple->ip.dst_v6.in6_u.u6_addr32" when ip6_offset is between 4 and 7.
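The index math behind this fix can be demonstrated standalone; struct ipv6hdr_sketch below is a simplified stand-in for struct ipv6hdr, assumed only for illustration:

/* Illustration (not driver code) of the offset-to-word mapping: a 4-byte
 * rewrite at byte `offset` inside the IPv6 header maps to word
 * ip6_offset = (offset - offsetof(saddr)) / 4; words 0..3 fall in saddr,
 * words 4..7 in daddr, so indexing the src array with 4..7 overruns it.
 */
#include <stddef.h>
#include <stdio.h>

struct ipv6hdr_sketch {		/* simplified stand-in for struct ipv6hdr */
	unsigned char misc[8];	/* version/class/flow/len/nexthdr/hoplimit */
	unsigned int saddr[4];
	unsigned int daddr[4];
};

int main(void)
{
	size_t off;

	for (off = offsetof(struct ipv6hdr_sketch, saddr);
	     off < sizeof(struct ipv6hdr_sketch); off += 4) {
		int ip6_offset = (int)(off - offsetof(struct ipv6hdr_sketch, saddr)) / 4;

		if (ip6_offset < 4)
			printf("offset %zu -> src word %d\n", off, ip6_offset);
		else if (ip6_offset < 8)
			printf("offset %zu -> dst word %d\n", off, ip6_offset - 4);
	}
	return 0;
}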
Fixes: bc562be9674b ("net/mlx5e: CT: Save ct entries tuples in hashtables")
Signed-off-by: Maor Dickman
Reviewed-by: Roi Dayan
Signed-off-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c index bc5f72ec3623..a8be40cbe325 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c @@ -246,8 +246,10 @@ mlx5_tc_ct_rule_to_tuple_nat(struct mlx5_ct_tuple *tuple, case FLOW_ACT_MANGLE_HDR_TYPE_IP6: ip6_offset = (offset - offsetof(struct ipv6hdr, saddr)); ip6_offset /= 4; - if (ip6_offset < 8) + if (ip6_offset < 4) tuple->ip.src_v6.s6_addr32[ip6_offset] = cpu_to_be32(val); + else if (ip6_offset < 8) + tuple->ip.dst_v6.s6_addr32[ip6_offset - 4] = cpu_to_be32(val); else return -EOPNOTSUPP; break;

From patchwork Fri Oct 2 18:06:50 2020
Miller" , Jakub Kicinski Cc: netdev@vger.kernel.org, Aya Levin , Moshe Shemesh , Tariq Toukan , Saeed Mahameed Subject: [net V3 10/14] net/mlx5e: Fix driver's declaration to support GRE offload Date: Fri, 2 Oct 2020 11:06:50 -0700 Message-Id: <20201002180654.262800-11-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201002180654.262800-1-saeed@kernel.org> References: <20201002180654.262800-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Aya Levin Declare GRE offload support with respect to the inner protocol. Add a list of supported inner protocols on which the driver can offload checksum and GSO. For other protocols, inform the stack to do the needed operations. There is no noticeable impact on GRE performance. Fixes: 2729984149e6 ("net/mlx5e: Support TSO and TX checksum offloads for GRE tunnels") Signed-off-by: Aya Levin Reviewed-by: Moshe Shemesh Reviewed-by: Tariq Toukan Signed-off-by: Saeed Mahameed --- .../net/ethernet/mellanox/mlx5/core/en_main.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index b1a16fb9667e..42ec28e29834 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -4226,6 +4226,21 @@ int mlx5e_get_vf_stats(struct net_device *dev, } #endif +static bool mlx5e_gre_tunnel_inner_proto_offload_supported(struct mlx5_core_dev *mdev, + struct sk_buff *skb) +{ + switch (skb->inner_protocol) { + case htons(ETH_P_IP): + case htons(ETH_P_IPV6): + case htons(ETH_P_TEB): + return true; + case htons(ETH_P_MPLS_UC): + case htons(ETH_P_MPLS_MC): + return MLX5_CAP_ETH(mdev, tunnel_stateless_mpls_over_gre); + } + return false; +} + static netdev_features_t mlx5e_tunnel_features_check(struct mlx5e_priv *priv, struct sk_buff *skb, netdev_features_t features) @@ -4248,7 +4263,9 @@ static netdev_features_t mlx5e_tunnel_features_check(struct mlx5e_priv *priv, switch (proto) { case IPPROTO_GRE: - return features; + if (mlx5e_gre_tunnel_inner_proto_offload_supported(priv->mdev, skb)) + return features; + break; case IPPROTO_IPIP: case IPPROTO_IPV6: if (mlx5e_tunnel_proto_supported(priv->mdev, IPPROTO_IPIP)) From patchwork Fri Oct 2 18:06:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 289074 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.1 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4BC01C46466 for ; Fri, 2 Oct 2020 18:07:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 179C32085B for ; Fri, 2 Oct 2020 18:07:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1601662040; bh=1q9hPhhtfibTMsqVbaaAYf+OabxR7NCQzWZkd98DhL4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=GhR5qt2e7yvuwRcoRl79l5MrGvf0/4XBip/DfLdSZObLwZnVX4c64rvT1oF/oeIs0 
From patchwork Fri Oct 2 18:06:51 2020
From: saeed@kernel.org
To: "David S. Miller" , Jakub Kicinski
Cc: netdev@vger.kernel.org, Aya Levin , Eran Ben Elisha , Moshe Shemesh , Saeed Mahameed
Subject: [net V3 11/14] net/mlx5e: Fix return status when setting unsupported FEC mode
Date: Fri, 2 Oct 2020 11:06:51 -0700
Message-Id: <20201002180654.262800-12-saeed@kernel.org>
In-Reply-To: <20201002180654.262800-1-saeed@kernel.org>

From: Aya Levin

Verify that the configured FEC mode is supported by at least a single link mode before applying the command; otherwise fail the command and return "Operation not supported". Prior to this patch the command succeeded, yet it falsely set all link modes to FEC auto mode, as if FEC mode had been configured to auto. Auto mode is the default configuration when a link mode doesn't support the configured FEC mode.
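The intended behavior can be sketched standalone as follows; the capability table, values and function names are illustrative assumptions, not the driver's data model:

/* Sketch: reject a FEC policy up front unless at least one link mode
 * supports it, instead of silently falling back to auto mode.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define EOPNOTSUPP 95

/* per-link-mode bitmasks of supported FEC modes (made-up sample data) */
static const unsigned int fec_caps_per_link_mode[] = { 0x1, 0x3, 0x7 };

static bool fec_in_caps(unsigned int fec_policy)
{
	size_t i;

	for (i = 0; i < sizeof(fec_caps_per_link_mode) /
			sizeof(fec_caps_per_link_mode[0]); i++)
		if (fec_caps_per_link_mode[i] & fec_policy)
			return true;
	return false;
}

static int set_fec_mode(unsigned int fec_policy)
{
	if (fec_policy && !fec_in_caps(fec_policy))
		return -EOPNOTSUPP;	/* fail instead of silent auto fallback */
	/* ... apply the configuration ... */
	return 0;
}

int main(void)
{
	printf("policy 0x4 -> %d\n", set_fec_mode(0x4));	/* supported by a mode */
	printf("policy 0x8 -> %d\n", set_fec_mode(0x8));	/* -95: unsupported */
	return 0;
}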
Fixes: b5ede32d3329 ("net/mlx5e: Add support for FEC modes based on 50G per lane links")
Signed-off-by: Aya Levin
Reviewed-by: Eran Ben Elisha
Reviewed-by: Moshe Shemesh
Signed-off-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/en/port.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/port.c b/drivers/net/ethernet/mellanox/mlx5/core/en/port.c index 96608dbb9314..308fd279669e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/port.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/port.c @@ -569,6 +569,9 @@ int mlx5e_set_fec_mode(struct mlx5_core_dev *dev, u16 fec_policy) if (fec_policy >= (1 << MLX5E_FEC_LLRS_272_257_1) && !fec_50g_per_lane) return -EOPNOTSUPP; + if (fec_policy && !mlx5e_fec_in_caps(dev, fec_policy)) + return -EOPNOTSUPP; + MLX5_SET(pplm_reg, in, local_port, 1); err = mlx5_core_access_reg(dev, in, sz, out, sz, MLX5_REG_PPLM, 0, 0); if (err)

From patchwork Fri Oct 2 18:06:52 2020
Miller" , Jakub Kicinski Cc: netdev@vger.kernel.org, Aya Levin , Moshe Shemesh , Saeed Mahameed Subject: [net V3 12/14] net/mlx5e: Fix VLAN cleanup flow Date: Fri, 2 Oct 2020 11:06:52 -0700 Message-Id: <20201002180654.262800-13-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201002180654.262800-1-saeed@kernel.org> References: <20201002180654.262800-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Aya Levin Prior to this patch unloading an interface in promiscuous mode with RX VLAN filtering feature turned off - resulted in a warning. This is due to a wrong condition in the VLAN rules cleanup flow, which left the any-vid rules in the VLAN steering table. These rules prevented destroying the flow group and the flow table. The any-vid rules are removed in 2 flows, but none of them remove it in case both promiscuous is set and VLAN filtering is off. Fix the issue by changing the condition of the VLAN table cleanup flow to clean also in case of promiscuous mode. mlx5_core 0000:00:08.0: mlx5_destroy_flow_group:2123:(pid 28729): Flow group 20 wasn't destroyed, refcount > 1 mlx5_core 0000:00:08.0: mlx5_destroy_flow_group:2123:(pid 28729): Flow group 19 wasn't destroyed, refcount > 1 mlx5_core 0000:00:08.0: mlx5_destroy_flow_table:2112:(pid 28729): Flow table 262149 wasn't destroyed, refcount > 1 ... ... ------------[ cut here ]------------ FW pages counter is 11560 after reclaiming all pages WARNING: CPU: 1 PID: 28729 at drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:660 mlx5_reclaim_startup_pages+0x178/0x230 [mlx5_core] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: mlx5_function_teardown+0x2f/0x90 [mlx5_core] mlx5_unload_one+0x71/0x110 [mlx5_core] remove_one+0x44/0x80 [mlx5_core] pci_device_remove+0x3e/0xc0 device_release_driver_internal+0xfb/0x1c0 device_release_driver+0x12/0x20 pci_stop_bus_device+0x68/0x90 pci_stop_and_remove_bus_device+0x12/0x20 hv_eject_device_work+0x6f/0x170 [pci_hyperv] ? __schedule+0x349/0x790 process_one_work+0x206/0x400 worker_thread+0x34/0x3f0 ? process_one_work+0x400/0x400 kthread+0x126/0x140 ? 
Fixes: 9df30601c843 ("net/mlx5e: Restore vlan filter after seamless reset")
Signed-off-by: Aya Levin
Reviewed-by: Moshe Shemesh
Signed-off-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/en_fs.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c index 64d002d92250..55a4c3adaa05 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c @@ -415,8 +415,12 @@ static void mlx5e_del_vlan_rules(struct mlx5e_priv *priv) for_each_set_bit(i, priv->fs.vlan.active_svlans, VLAN_N_VID) mlx5e_del_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_MATCH_STAG_VID, i); - if (priv->fs.vlan.cvlan_filter_disabled && - !(priv->netdev->flags & IFF_PROMISC)) + WARN_ON_ONCE(!(test_bit(MLX5E_STATE_DESTROYING, &priv->state))); + + /* must be called after DESTROY bit is set and + * set_rx_mode is called and flushed + */ + if (priv->fs.vlan.cvlan_filter_disabled) mlx5e_del_any_vid_rules(priv); }

From patchwork Fri Oct 2 18:06:53 2020
Miller" , Jakub Kicinski Cc: netdev@vger.kernel.org, Aya Levin , Moshe Shemesh , Saeed Mahameed Subject: [net V3 13/14] net/mlx5e: Fix VLAN create flow Date: Fri, 2 Oct 2020 11:06:53 -0700 Message-Id: <20201002180654.262800-14-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201002180654.262800-1-saeed@kernel.org> References: <20201002180654.262800-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Aya Levin When interface is attached while in promiscuous mode and with VLAN filtering turned off, both configurations are not respected and VLAN filtering is performed. There are 2 flows which add the any-vid rules during interface attach: VLAN creation table and set rx mode. Each is relaying on the other to add any-vid rules, eventually non of them does. Fix this by adding any-vid rules on VLAN creation regardless of promiscuous mode. Fixes: 9df30601c843 ("net/mlx5e: Restore vlan filter after seamless reset") Signed-off-by: Aya Levin Reviewed-by: Moshe Shemesh Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/en_fs.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c index 55a4c3adaa05..1f48f99c0997 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c @@ -217,6 +217,9 @@ static int __mlx5e_add_vlan_rule(struct mlx5e_priv *priv, break; } + if (WARN_ONCE(*rule_p, "VLAN rule already exists type %d", rule_type)) + return 0; + *rule_p = mlx5_add_flow_rules(ft, spec, &flow_act, &dest, 1); if (IS_ERR(*rule_p)) { @@ -397,8 +400,7 @@ static void mlx5e_add_vlan_rules(struct mlx5e_priv *priv) for_each_set_bit(i, priv->fs.vlan.active_svlans, VLAN_N_VID) mlx5e_add_vlan_rule(priv, MLX5E_VLAN_RULE_TYPE_MATCH_STAG_VID, i); - if (priv->fs.vlan.cvlan_filter_disabled && - !(priv->netdev->flags & IFF_PROMISC)) + if (priv->fs.vlan.cvlan_filter_disabled) mlx5e_add_any_vid_rules(priv); } From patchwork Fri Oct 2 18:06:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 267675 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.1 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D9756C46466 for ; Fri, 2 Oct 2020 18:07:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A4E982085B for ; Fri, 2 Oct 2020 18:07:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1601662057; bh=nQ7XBHy384gkT3QC8Zm2HtCQoZ5HIHLd6k04/7tH+nQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=w/7zIIAkn+y5PDpIlT9KuUdVugb43mzOMTS7llB4Q6Ki1hTMqwwYnJ40CB6N54rh/ zgST5xfMO7BAWOdeneARLXb+Xgxnb07cFPvJQNPAtnvGjxYRZrprn4vpsLqFlH7mmU zhX91GmGUpepNC+OWZIHkf+ojk/UaIgOkdz1aJYk= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388475AbgJBSHf (ORCPT ); Fri, 2 Oct 2020 14:07:35 -0400 Received: from mail.kernel.org 
From patchwork Fri Oct 2 18:06:54 2020
From: saeed@kernel.org
To: "David S. Miller" , Jakub Kicinski
Cc: netdev@vger.kernel.org, Vlad Buslov , Roi Dayan , Saeed Mahameed
Subject: [net V3 14/14] net/mlx5e: Fix race condition on nhe->n pointer in neigh update
Date: Fri, 2 Oct 2020 11:06:54 -0700
Message-Id: <20201002180654.262800-15-saeed@kernel.org>
In-Reply-To: <20201002180654.262800-1-saeed@kernel.org>

From: Vlad Buslov

The current neigh update event handler implementation takes a reference to the neighbour structure, assigns it to nhe->n, tries to schedule a workqueue task and releases the reference if the task was already enqueued. This can overwrite an existing nhe->n pointer with another neighbour instance, which causes a double release of the instance (once in the neigh update handler that failed to enqueue to the workqueue, and again in the neigh update workqueue task that processes the updated nhe->n pointer instead of the original one):

[ 3376.512806] ------------[ cut here ]------------
[ 3376.513534] refcount_t: underflow; use-after-free.
[ 3376.521213] Modules linked in: act_skbedit act_mirred act_tunnel_key vxlan ip6_udp_tunnel udp_tunnel nfnetlink act_gact cls_flower sch_ingress openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 mlx5_ib mlx5_core mlxfw pci_hyperv_intf ptp pps_core nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp rpcrdma rdma_ucm ib_umad ib_ipoib ib_iser rdma_cm ib_cm iw_cm rfkill ib_uverbs ib_core sunrpc kvm_intel kvm iTCO_wdt iTCO_vendor_support virtio_net irqbypass net_failover crc32_pclmul lpc_ich i2c_i801 failover pcspkr i2c_smbus mfd_core ghash_clmulni_intel sch_fq_codel drm i2c_core ip_tables crc32c_intel serio_raw [last unloaded: mlxfw]
[ 3376.529468] CPU: 8 PID: 22756 Comm: kworker/u20:5 Not tainted 5.9.0-rc5+ #6
[ 3376.530399] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[ 3376.531975] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core]
[ 3376.532820] RIP: 0010:refcount_warn_saturate+0xd8/0xe0
[ 3376.533589] Code: ff 48 c7 c7 e0 b8 27 82 c6 05 0b b6 09 01 01 e8 94 93 c1 ff 0f 0b c3 48 c7 c7 88 b8 27 82 c6 05 f7 b5 09 01 01 e8 7e 93 c1 ff <0f> 0b c3 0f 1f 44 00 00 8b 07 3d 00 00 00 c0 74 12 83 f8 01 74 13
[ 3376.536017] RSP: 0018:ffffc90002a97e30 EFLAGS: 00010286
[ 3376.536793] RAX: 0000000000000000 RBX: ffff8882de30d648 RCX: 0000000000000000
[ 3376.537718] RDX: ffff8882f5c28f20 RSI: ffff8882f5c18e40 RDI: ffff8882f5c18e40
[ 3376.538654] RBP: ffff8882cdf56c00 R08: 000000000000c580 R09: 0000000000001a4d
[ 3376.539582] R10: 0000000000000731 R11: ffffc90002a97ccd R12: 0000000000000000
[ 3376.540519] R13: ffff8882de30d600 R14: ffff8882de30d640 R15: ffff88821e000900
[ 3376.541444] FS: 0000000000000000(0000) GS:ffff8882f5c00000(0000) knlGS:0000000000000000
[ 3376.542732] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3376.543545] CR2: 0000556e5504b248 CR3: 00000002c6f10005 CR4: 0000000000770ee0
[ 3376.544483] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3376.545419] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3376.546344] PKRU: 55555554
[ 3376.546911] Call Trace:
[ 3376.547479] mlx5e_rep_neigh_update.cold+0x33/0xe2 [mlx5_core]
[ 3376.548299] process_one_work+0x1d8/0x390
[ 3376.548977] worker_thread+0x4d/0x3e0
[ 3376.549631] ? rescuer_thread+0x3e0/0x3e0
[ 3376.550295] kthread+0x118/0x130
[ 3376.550914] ? kthread_create_worker_on_cpu+0x70/0x70
[ 3376.551675] ret_from_fork+0x1f/0x30
[ 3376.552312] ---[ end trace d84e8f46d2a77eec ]---

Fix the bug by moving the work_struct to a dedicated dynamically-allocated structure. This enables every event handler to work on its own private neighbour pointer and removes the need to handle the case when a task is already enqueued.
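The fix's pattern, a private heap-allocated work item per event instead of one shared pointer slot, can be sketched in a userspace analogy (illustrative only; no kernel APIs are used, and refcounting is reduced to a plain counter):

/* Sketch: each event gets its own work item carrying a private pointer,
 * so two concurrent events can never race on a single shared slot.
 */
#include <stdio.h>
#include <stdlib.h>

struct neigh { int refcnt; };

struct neigh_update_work {
	void (*fn)(struct neigh_update_work *);
	struct neigh *n;	/* private to this event */
};

static void do_update(struct neigh_update_work *w)
{
	printf("processing neigh with refcnt %d\n", w->n->refcnt);
	w->n->refcnt--;		/* release exactly once, here */
	free(w);
}

static struct neigh_update_work *alloc_work(struct neigh *n)
{
	struct neigh_update_work *w = malloc(sizeof(*w));

	if (!w)
		return NULL;
	n->refcnt++;		/* hold the neighbour for the lifetime of w */
	w->fn = do_update;
	w->n = n;
	return w;
}

int main(void)
{
	struct neigh n = { 1 };
	struct neigh_update_work *w1 = alloc_work(&n);
	struct neigh_update_work *w2 = alloc_work(&n);	/* no shared slot to clobber */

	if (w1)
		w1->fn(w1);
	if (w2)
		w2->fn(w2);
	printf("final refcnt %d\n", n.refcnt);
	return 0;
}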
Fixes: 232c001398ae ("net/mlx5e: Add support to neighbour update flow") Signed-off-by: Vlad Buslov Reviewed-by: Roi Dayan Signed-off-by: Saeed Mahameed --- .../mellanox/mlx5/core/en/rep/neigh.c | 81 ++++++++++++------- .../net/ethernet/mellanox/mlx5/core/en_rep.h | 6 -- 2 files changed, 50 insertions(+), 37 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/neigh.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/neigh.c index 906292035088..58e27038c947 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/neigh.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/neigh.c @@ -110,11 +110,25 @@ static void mlx5e_rep_neigh_stats_work(struct work_struct *work) rtnl_unlock(); } +struct neigh_update_work { + struct work_struct work; + struct neighbour *n; + struct mlx5e_neigh_hash_entry *nhe; +}; + +static void mlx5e_release_neigh_update_work(struct neigh_update_work *update_work) +{ + neigh_release(update_work->n); + mlx5e_rep_neigh_entry_release(update_work->nhe); + kfree(update_work); +} + static void mlx5e_rep_neigh_update(struct work_struct *work) { - struct mlx5e_neigh_hash_entry *nhe = - container_of(work, struct mlx5e_neigh_hash_entry, neigh_update_work); - struct neighbour *n = nhe->n; + struct neigh_update_work *update_work = container_of(work, struct neigh_update_work, + work); + struct mlx5e_neigh_hash_entry *nhe = update_work->nhe; + struct neighbour *n = update_work->n; struct mlx5e_encap_entry *e; unsigned char ha[ETH_ALEN]; struct mlx5e_priv *priv; @@ -146,30 +160,42 @@ static void mlx5e_rep_neigh_update(struct work_struct *work) mlx5e_rep_update_flows(priv, e, neigh_connected, ha); mlx5e_encap_put(priv, e); } - mlx5e_rep_neigh_entry_release(nhe); rtnl_unlock(); - neigh_release(n); + mlx5e_release_neigh_update_work(update_work); } -static void mlx5e_rep_queue_neigh_update_work(struct mlx5e_priv *priv, - struct mlx5e_neigh_hash_entry *nhe, - struct neighbour *n) +static struct neigh_update_work *mlx5e_alloc_neigh_update_work(struct mlx5e_priv *priv, + struct neighbour *n) { - /* Take a reference to ensure the neighbour and mlx5 encap - * entry won't be destructed until we drop the reference in - * delayed work. - */ - neigh_hold(n); + struct neigh_update_work *update_work; + struct mlx5e_neigh_hash_entry *nhe; + struct mlx5e_neigh m_neigh = {}; - /* This assignment is valid as long as the the neigh reference - * is taken - */ - nhe->n = n; + update_work = kzalloc(sizeof(*update_work), GFP_ATOMIC); + if (WARN_ON(!update_work)) + return NULL; - if (!queue_work(priv->wq, &nhe->neigh_update_work)) { - mlx5e_rep_neigh_entry_release(nhe); - neigh_release(n); + m_neigh.dev = n->dev; + m_neigh.family = n->ops->family; + memcpy(&m_neigh.dst_ip, n->primary_key, n->tbl->key_len); + + /* Obtain reference to nhe as last step in order not to release it in + * atomic context. 
+ */ + rcu_read_lock(); + nhe = mlx5e_rep_neigh_entry_lookup(priv, &m_neigh); + rcu_read_unlock(); + if (!nhe) { + kfree(update_work); + return NULL; } + + INIT_WORK(&update_work->work, mlx5e_rep_neigh_update); + neigh_hold(n); + update_work->n = n; + update_work->nhe = nhe; + + return update_work; } static int mlx5e_rep_netevent_event(struct notifier_block *nb, @@ -181,7 +207,7 @@ static int mlx5e_rep_netevent_event(struct notifier_block *nb, struct net_device *netdev = rpriv->netdev; struct mlx5e_priv *priv = netdev_priv(netdev); struct mlx5e_neigh_hash_entry *nhe = NULL; - struct mlx5e_neigh m_neigh = {}; + struct neigh_update_work *update_work; struct neigh_parms *p; struct neighbour *n; bool found = false; @@ -196,17 +222,11 @@ static int mlx5e_rep_netevent_event(struct notifier_block *nb, #endif return NOTIFY_DONE; - m_neigh.dev = n->dev; - m_neigh.family = n->ops->family; - memcpy(&m_neigh.dst_ip, n->primary_key, n->tbl->key_len); - - rcu_read_lock(); - nhe = mlx5e_rep_neigh_entry_lookup(priv, &m_neigh); - rcu_read_unlock(); - if (!nhe) + update_work = mlx5e_alloc_neigh_update_work(priv, n); + if (!update_work) return NOTIFY_DONE; - mlx5e_rep_queue_neigh_update_work(priv, nhe, n); + queue_work(priv->wq, &update_work->work); break; case NETEVENT_DELAY_PROBE_TIME_UPDATE: @@ -352,7 +372,6 @@ int mlx5e_rep_neigh_entry_create(struct mlx5e_priv *priv, (*nhe)->priv = priv; memcpy(&(*nhe)->m_neigh, &e->m_neigh, sizeof(e->m_neigh)); - INIT_WORK(&(*nhe)->neigh_update_work, mlx5e_rep_neigh_update); spin_lock_init(&(*nhe)->encap_list_lock); INIT_LIST_HEAD(&(*nhe)->encap_list); refcount_set(&(*nhe)->refcnt, 1); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h index 622c27ae4ac7..0d1562e20118 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h @@ -135,12 +135,6 @@ struct mlx5e_neigh_hash_entry { /* encap list sharing the same neigh */ struct list_head encap_list; - /* valid only when the neigh reference is taken during - * neigh_update_work workqueue callback. - */ - struct neighbour *n; - struct work_struct neigh_update_work; - /* neigh hash entry can be deleted only when the refcount is zero. * refcount is needed to avoid neigh hash entry removal by TC, while * it's used by the neigh notification call.