From patchwork Tue Aug 3 23:19:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 491331 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2AC02C4320A for ; Tue, 3 Aug 2021 23:20:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 12D676103C for ; Tue, 3 Aug 2021 23:20:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233444AbhHCXUc (ORCPT ); Tue, 3 Aug 2021 19:20:32 -0400 Received: from mail.kernel.org ([198.145.29.99]:38934 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233205AbhHCXUa (ORCPT ); Tue, 3 Aug 2021 19:20:30 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 447D660FA0; Tue, 3 Aug 2021 23:20:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032818; bh=Qktmr9TWKmVNw8s/sQBxRP1Jh/I+VL/hnnYwlQ34WnY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=pf4EHnVdsacIMfE4ibILfczso4Eeu4L79LLCnmFVVoQpwWOOUU8hzvIaOg5XTkhC/ O6Xxr0u+LgeMCOZR1JwL0QMGC7BcnQl34w92BtrPLnJ1TmTeKrctiNN7CtLvAEd5kM WlCuCr8RT9RW9smhpkjWgbdMzzDmYtgXECcovKxbtvtZP3s9NtdxPbGRlZd8X/Ks0e 0vhZVJd7P9uV9mpNJUN/KzSZnQ9bhBADjRtULQWie+WjEMCLaAKXwRD8qt9dCjo3xM VF+t0GP7LhH8+LThTw/R6dTMA+IJFLuZRYjve9MODmzED1ohlUueBnXps4yEs/sUtG 0pwycsco1B6Jw== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch , Mark Zhang Subject: [PATCH mlx5-next 02/14] net/mlx5: Lag, add initial logic for shared FDB Date: Tue, 3 Aug 2021 16:19:47 -0700 Message-Id: <20210803231959.26513-3-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Mark Bloch As shared FDB requires changes in two subsystems first expose the needed core functions so the RDMA side can be changed. mlx5_lag_is_master(): return true if a given mlx5 device is the lag master. mlx5_lag_is_shared_fdb(): Returns true if the lag mode is shared FDB. mlx5_lag_get_peer_mdev(): Return the peer mdev in lag. The mentioned functions will be used by downstream patches in order to add support for shared FDB for the RDMA side. 
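As a minimal illustration (not part of this patch), a downstream consumer such as the RDMA driver could combine the three new helpers as sketched below; the function name example_pick_shared_fdb_peer() and its placement are hypothetical, only the three exported helpers come from this patch.

#include <linux/mlx5/driver.h>

/* Illustrative sketch: decide whether this mlx5 instance should own the
 * shared-FDB IB device and, if so, which peer mdev serves the second port.
 */
static struct mlx5_core_dev *
example_pick_shared_fdb_peer(struct mlx5_core_dev *dev)
{
	/* Only the lag master (the PF of port 1) owns the shared FDB. */
	if (!mlx5_lag_is_master(dev))
		return NULL;

	/* Only meaningful when the lag was created in shared FDB mode. */
	if (!mlx5_lag_is_shared_fdb(dev))
		return NULL;

	/* The other PF in the lag provides the second port's resources. */
	return mlx5_lag_get_peer_mdev(dev);
}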
Signed-off-by: Mark Bloch Reviewed-by: Mark Zhang Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/lag.c | 49 +++++++++++++++++++ drivers/net/ethernet/mellanox/mlx5/core/lag.h | 1 + include/linux/mlx5/driver.h | 3 ++ 3 files changed, 53 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag.c index 5c043c5cc403..3049de648256 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.c @@ -746,6 +746,21 @@ bool mlx5_lag_is_active(struct mlx5_core_dev *dev) } EXPORT_SYMBOL(mlx5_lag_is_active); +bool mlx5_lag_is_master(struct mlx5_core_dev *dev) +{ + struct mlx5_lag *ldev; + bool res; + + spin_lock(&lag_lock); + ldev = mlx5_lag_dev(dev); + res = ldev && __mlx5_lag_is_active(ldev) && + dev == ldev->pf[MLX5_LAG_P1].dev; + spin_unlock(&lag_lock); + + return res; +} +EXPORT_SYMBOL(mlx5_lag_is_master); + bool mlx5_lag_is_sriov(struct mlx5_core_dev *dev) { struct mlx5_lag *ldev; @@ -760,6 +775,20 @@ bool mlx5_lag_is_sriov(struct mlx5_core_dev *dev) } EXPORT_SYMBOL(mlx5_lag_is_sriov); +bool mlx5_lag_is_shared_fdb(struct mlx5_core_dev *dev) +{ + struct mlx5_lag *ldev; + bool res; + + spin_lock(&lag_lock); + ldev = mlx5_lag_dev(dev); + res = ldev && __mlx5_lag_is_sriov(ldev) && ldev->shared_fdb; + spin_unlock(&lag_lock); + + return res; +} +EXPORT_SYMBOL(mlx5_lag_is_shared_fdb); + void mlx5_lag_update(struct mlx5_core_dev *dev) { struct mlx5_lag *ldev; @@ -827,6 +856,26 @@ u8 mlx5_lag_get_slave_port(struct mlx5_core_dev *dev, } EXPORT_SYMBOL(mlx5_lag_get_slave_port); +struct mlx5_core_dev *mlx5_lag_get_peer_mdev(struct mlx5_core_dev *dev) +{ + struct mlx5_core_dev *peer_dev = NULL; + struct mlx5_lag *ldev; + + spin_lock(&lag_lock); + ldev = mlx5_lag_dev(dev); + if (!ldev) + goto unlock; + + peer_dev = ldev->pf[MLX5_LAG_P1].dev == dev ? 
+ ldev->pf[MLX5_LAG_P2].dev : + ldev->pf[MLX5_LAG_P1].dev; + +unlock: + spin_unlock(&lag_lock); + return peer_dev; +} +EXPORT_SYMBOL(mlx5_lag_get_peer_mdev); + int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev, u64 *values, int num_counters, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag.h index 191392c37558..70b244b1a09e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.h @@ -39,6 +39,7 @@ struct lag_tracker { */ struct mlx5_lag { u8 flags; + bool shared_fdb; u8 v2p_map[MLX5_MAX_PORTS]; struct kref ref; struct lag_func pf[MLX5_MAX_PORTS]; diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h index 1efe37466969..af4dd6e9f97f 100644 --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -1138,6 +1138,8 @@ bool mlx5_lag_is_roce(struct mlx5_core_dev *dev); bool mlx5_lag_is_sriov(struct mlx5_core_dev *dev); bool mlx5_lag_is_multipath(struct mlx5_core_dev *dev); bool mlx5_lag_is_active(struct mlx5_core_dev *dev); +bool mlx5_lag_is_master(struct mlx5_core_dev *dev); +bool mlx5_lag_is_shared_fdb(struct mlx5_core_dev *dev); struct net_device *mlx5_lag_get_roce_netdev(struct mlx5_core_dev *dev); u8 mlx5_lag_get_slave_port(struct mlx5_core_dev *dev, struct net_device *slave); @@ -1145,6 +1147,7 @@ int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev, u64 *values, int num_counters, size_t *offsets); +struct mlx5_core_dev *mlx5_lag_get_peer_mdev(struct mlx5_core_dev *dev); struct mlx5_uars_page *mlx5_get_uars_page(struct mlx5_core_dev *mdev); void mlx5_put_uars_page(struct mlx5_core_dev *mdev, struct mlx5_uars_page *up); int mlx5_dm_sw_icm_alloc(struct mlx5_core_dev *dev, enum mlx5_sw_icm_type type, From patchwork Tue Aug 3 23:19:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 491330 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 653B2C4320A for ; Tue, 3 Aug 2021 23:20:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4BAE260FA0 for ; Tue, 3 Aug 2021 23:20:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233007AbhHCXUf (ORCPT ); Tue, 3 Aug 2021 19:20:35 -0400 Received: from mail.kernel.org ([198.145.29.99]:38950 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233362AbhHCXUb (ORCPT ); Tue, 3 Aug 2021 19:20:31 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 31ABC60F9C; Tue, 3 Aug 2021 23:20:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032819; bh=V5VFVrJWUkXo8/cOXZnkCjvTJpHkG/7Y51NWhGHWV0c=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Y3yvlFpQsrpwgSNYYBMJhEqsuaaVHjjFSSd0KM6wRTK4L7vnNA+rhq7XyndnIeZs4 v24zTCDjOAVAJMB/tXzeZmCLqMD0O0MJRVugGV0+aOMAHJuwJtAhhrtizpZ7Hx1iPO EYH6Lj2FjCK09cTg/CqywJI3G/nC1mtYBo9GFewqewoxExMydHX5890VWHrEkkpESC 
cGNqXnn1Vvo9Ryw4YN6hrZ4mOFw8+bSybb8m4PE6gNVRbCVVvmEZvkHp3N/9nY6g7Z c9KRmIQzQrF5Y9wJ0YQPtW0WZu09yJKH9FBgcxwFzqbG4RtE40j8swGDX/NAxy86N6 TaSDfRo513fQA== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch , Mark Zhang Subject: [PATCH mlx5-next 04/14] {net, RDMA}/mlx5: Extend send to vport rules Date: Tue, 3 Aug 2021 16:19:49 -0700 Message-Id: <20210803231959.26513-5-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Mark Bloch In shared FDB mode there is only one eswitch which is active and it receives traffic from all representors and all vports in the HCA. While the Ethernet representor will always reside on its native PF, the IB representor will not. Extend send to vport rule creation to support such flows. We need to account for the source vport that sends the traffic (on which the representor resides) and for the target eswitch the traffic should reach. Signed-off-by: Mark Bloch Reviewed-by: Mark Zhang Signed-off-by: Saeed Mahameed --- drivers/infiniband/hw/mlx5/ib_rep.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 5 +++-- include/linux/mlx5/eswitch.h | 1 + 4 files changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/ib_rep.c b/drivers/infiniband/hw/mlx5/ib_rep.c index b25e0b33a11a..bf5a6e4d1c03 100644 --- a/drivers/infiniband/hw/mlx5/ib_rep.c +++ b/drivers/infiniband/hw/mlx5/ib_rep.c @@ -123,7 +123,7 @@ struct mlx5_flow_handle *create_flow_rule_vport_sq(struct mlx5_ib_dev *dev, rep = dev->port[port - 1].rep; - return mlx5_eswitch_add_send_to_vport_rule(esw, rep, sq->base.mqp.qpn); + return mlx5_eswitch_add_send_to_vport_rule(esw, esw, rep, sq->base.mqp.qpn); } static int mlx5r_rep_probe(struct auxiliary_device *adev, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index bf94bcb6fa5d..1d016cc64015 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -337,7 +337,7 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw, } /* Add re-inject rule to the PF/representor sqs */ - flow_rule = mlx5_eswitch_add_send_to_vport_rule(esw, rep, + flow_rule = mlx5_eswitch_add_send_to_vport_rule(esw, esw, rep, sqns_array[i]); if (IS_ERR(flow_rule)) { err = PTR_ERR(flow_rule); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 7579f3402776..12567002997f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -925,6 +925,7 @@ int mlx5_eswitch_del_vlan_action(struct mlx5_eswitch *esw, struct mlx5_flow_handle * mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch *on_esw, + struct mlx5_eswitch *from_esw, struct mlx5_eswitch_rep *rep, u32 sqn) { @@ -943,10 +944,10 @@ mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch *on_esw, misc = MLX5_ADDR_OF(fte_match_param, spec->match_value, misc_parameters); MLX5_SET(fte_match_set_misc, misc, source_sqn, sqn); /* source vport is the esw manager */ - MLX5_SET(fte_match_set_misc, misc, source_port, rep->esw->manager_vport); + MLX5_SET(fte_match_set_misc, misc, source_port, from_esw->manager_vport); if 
(MLX5_CAP_ESW(on_esw->dev, merged_eswitch)) MLX5_SET(fte_match_set_misc, misc, source_eswitch_owner_vhca_id, - MLX5_CAP_GEN(rep->esw->dev, vhca_id)); + MLX5_CAP_GEN(from_esw->dev, vhca_id)); misc = MLX5_ADDR_OF(fte_match_param, spec->match_criteria, misc_parameters); MLX5_SET_TO_ONES(fte_match_set_misc, misc, source_sqn); diff --git a/include/linux/mlx5/eswitch.h b/include/linux/mlx5/eswitch.h index c2a34ff85188..0bfcf7b8ecf9 100644 --- a/include/linux/mlx5/eswitch.h +++ b/include/linux/mlx5/eswitch.h @@ -63,6 +63,7 @@ struct mlx5_eswitch_rep *mlx5_eswitch_vport_rep(struct mlx5_eswitch *esw, void *mlx5_eswitch_uplink_get_proto_dev(struct mlx5_eswitch *esw, u8 rep_type); struct mlx5_flow_handle * mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch *on_esw, + struct mlx5_eswitch *from_esw, struct mlx5_eswitch_rep *rep, u32 sqn); #ifdef CONFIG_MLX5_ESWITCH From patchwork Tue Aug 3 23:19:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 491329 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 761E7C43216 for ; Tue, 3 Aug 2021 23:20:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 60BAE60F9C for ; Tue, 3 Aug 2021 23:20:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233834AbhHCXUi (ORCPT ); Tue, 3 Aug 2021 19:20:38 -0400 Received: from mail.kernel.org ([198.145.29.99]:38958 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229986AbhHCXUc (ORCPT ); Tue, 3 Aug 2021 19:20:32 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 23EF260F93; Tue, 3 Aug 2021 23:20:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032820; bh=CKn/KvEeyZAMuD+iFiH3gFauEveHmSMXEL5LWTioEbc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=t6eXw/E7p6Ij0lxShllsqHoBbFM3/hiGROIRM2adIiIkdeNJgOIJgSTthlygfxJVm ioa0yfwArIxq+fexGzcqOeuo6JT3o4vyc5jZMXgtW62z2E4nwjC2GHMEj5tYiU7vxk nHAsgvI+5KU5xF0qyxijFIJpi4NaBjbq3HDEJKCocz/Zi0yzIO6TxYEg6kgWNor3g6 3liyiJUEbFSja5be6sPo/SGHLtivN4+EnjSVTHgszsuHi/D1ur6xxOnNZEUFjnl4fQ XvzfzHo2wW+3lwt+vSeX86q3GrJhr8tIvEbRLUqU1sVbZgBvbKNIYk3gv0mW1xEM3H 4l4gjMqJJHc6w== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Ariel Levkovich , Roi Dayan Subject: [PATCH mlx5-next 06/14] net/mlx5: E-Switch, set flow source for send to uplink rule Date: Tue, 3 Aug 2021 16:19:51 -0700 Message-Id: <20210803231959.26513-7-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Ariel Levkovich Set the flow source param to local vport for the uplink rep send-to-vport rule. This will comply with the recent changes in SW steering that use the flow source as an indication for the rule type - rx or tx. 
Since the uplink send-to-vport rule is forwarding traffic to the wire it has to indicate that it is an sx rule and can't use the any port value in the flow source. Signed-off-by: Ariel Levkovich Reviewed-by: Roi Dayan Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 12567002997f..1735be77e1fd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -963,6 +963,9 @@ mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch *on_esw, dest.vport.flags |= MLX5_FLOW_DEST_VPORT_VHCA_ID; flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; + if (rep->vport == MLX5_VPORT_UPLINK) + spec->flow_context.flow_source = MLX5_FLOW_CONTEXT_FLOW_SOURCE_LOCAL_VPORT; + flow_rule = mlx5_add_flow_rules(on_esw->fdb_table.offloads.slow_fdb, spec, &flow_act, &dest, 1); if (IS_ERR(flow_rule)) From patchwork Tue Aug 3 23:19:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 491328 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 65942C19F39 for ; Tue, 3 Aug 2021 23:20:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 49A6860F70 for ; Tue, 3 Aug 2021 23:20:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233939AbhHCXUk (ORCPT ); Tue, 3 Aug 2021 19:20:40 -0400 Received: from mail.kernel.org ([198.145.29.99]:38974 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233498AbhHCXUd (ORCPT ); Tue, 3 Aug 2021 19:20:33 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 07BB560F94; Tue, 3 Aug 2021 23:20:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032821; bh=AFc/6k5YwUwP9AAOKk7bsRXYfF/doRJC0xMN4+zaCwM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Z8rtetJNIy1A+OrVyI3/SqkUmx/G5HW7LAfoE5s1M8AK064Mkk/pDGL3LOY4C01pl Qy+NbvwGYEuiko8zSBAqG2L0CWkpfrkSRKfEKbqevVqAJuLSgSxMtA1JahPaV5I6Xb Nq3wtO+bB0/LoohoeEzRu/CrJ8bG6u0gYIwiletYdTkOflDGdrQSzO/MLaZyPeAeGe /z17Rg/WGMIqm+lqZIvUlZjR66FdHyMyXBCzKdyRp5PiU6C+5NC8rq3NIZqXV+3mu5 CmVkjOWpGdYxDzpYajjQQbfe636pceY52F8Aq/rn5UnB5Te+ekneZjedioGPJVnAoT /HheTpl/Dc3Zw== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Roi Dayan Subject: [PATCH mlx5-next 08/14] net/mlx5e: Use shared mappings for restoring from metadata Date: Tue, 3 Aug 2021 16:19:53 -0700 Message-Id: <20210803231959.26513-9-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Roi Dayan FTEs are added with mapped 
metadata which is saved per eswitch. When uplink reps are bonded and we are in a single FDB mode, we could fail to find metadata which was stored on one eswitch mapping but not the other or with a different id. To resolve this issue use shared mapping between eswitch ports. We do not have any conflict using a single mapping, for a type, between the ports. Signed-off-by: Roi Dayan Signed-off-by: Saeed Mahameed --- .../ethernet/mellanox/mlx5/core/en/tc_ct.c | 9 ++++++-- .../net/ethernet/mellanox/mlx5/core/en_tc.c | 21 ++++++++++++++----- .../net/ethernet/mellanox/mlx5/core/eswitch.h | 8 +++++++ .../mellanox/mlx5/core/eswitch_offloads.c | 11 +++++++--- 4 files changed, 39 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c index 91e7a01e32be..b1707b86aa16 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c @@ -2138,6 +2138,7 @@ mlx5_tc_ct_init(struct mlx5e_priv *priv, struct mlx5_fs_chains *chains, struct mlx5_tc_ct_priv *ct_priv; struct mlx5_core_dev *dev; const char *msg; + u64 mapping_id; int err; dev = priv->mdev; @@ -2153,13 +2154,17 @@ mlx5_tc_ct_init(struct mlx5e_priv *priv, struct mlx5_fs_chains *chains, if (!ct_priv) goto err_alloc; - ct_priv->zone_mapping = mapping_create(sizeof(u16), 0, true); + mapping_id = mlx5_query_nic_system_image_guid(dev); + + ct_priv->zone_mapping = mapping_create_for_id(mapping_id, MAPPING_TYPE_ZONE, + sizeof(u16), 0, true); if (IS_ERR(ct_priv->zone_mapping)) { err = PTR_ERR(ct_priv->zone_mapping); goto err_mapping_zone; } - ct_priv->labels_mapping = mapping_create(sizeof(u32) * 4, 0, true); + ct_priv->labels_mapping = mapping_create_for_id(mapping_id, MAPPING_TYPE_LABELS, + sizeof(u32) * 4, 0, true); if (IS_ERR(ct_priv->labels_mapping)) { err = PTR_ERR(ct_priv->labels_mapping); goto err_mapping_labels; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 629a61e8022f..aca677933423 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -4848,6 +4848,7 @@ int mlx5e_tc_nic_init(struct mlx5e_priv *priv) struct mlx5_core_dev *dev = priv->mdev; struct mapping_ctx *chains_mapping; struct mlx5_chains_attr attr = {}; + u64 mapping_id; int err; mlx5e_mod_hdr_tbl_init(&tc->mod_hdr); @@ -4861,8 +4862,12 @@ int mlx5e_tc_nic_init(struct mlx5e_priv *priv) lockdep_set_class(&tc->ht.mutex, &tc_ht_lock_key); - chains_mapping = mapping_create(sizeof(struct mlx5_mapped_obj), - MLX5E_TC_TABLE_CHAIN_TAG_MASK, true); + mapping_id = mlx5_query_nic_system_image_guid(dev); + + chains_mapping = mapping_create_for_id(mapping_id, MAPPING_TYPE_CHAIN, + sizeof(struct mlx5_mapped_obj), + MLX5E_TC_TABLE_CHAIN_TAG_MASK, true); + if (IS_ERR(chains_mapping)) { err = PTR_ERR(chains_mapping); goto err_mapping; @@ -4951,6 +4956,7 @@ int mlx5e_tc_esw_init(struct rhashtable *tc_ht) struct mapping_ctx *mapping; struct mlx5_eswitch *esw; struct mlx5e_priv *priv; + u64 mapping_id; int err = 0; uplink_priv = container_of(tc_ht, struct mlx5_rep_uplink_priv, tc_ht); @@ -4967,8 +4973,12 @@ int mlx5e_tc_esw_init(struct rhashtable *tc_ht) uplink_priv->esw_psample = mlx5_esw_sample_init(netdev_priv(priv->netdev)); #endif - mapping = mapping_create(sizeof(struct tunnel_match_key), - TUNNEL_INFO_BITS_MASK, true); + mapping_id = mlx5_query_nic_system_image_guid(esw->dev); + + mapping = mapping_create_for_id(mapping_id, 
MAPPING_TYPE_TUNNEL, + sizeof(struct tunnel_match_key), + TUNNEL_INFO_BITS_MASK, true); + if (IS_ERR(mapping)) { err = PTR_ERR(mapping); goto err_tun_mapping; @@ -4976,7 +4986,8 @@ int mlx5e_tc_esw_init(struct rhashtable *tc_ht) uplink_priv->tunnel_mapping = mapping; /* 0xFFF is reserved for stack devices slow path table mark */ - mapping = mapping_create(sz_enc_opts, ENC_OPTS_BITS_MASK - 1, true); + mapping = mapping_create_for_id(mapping_id, MAPPING_TYPE_TUNNEL_ENC_OPTS, + sz_enc_opts, ENC_OPTS_BITS_MASK - 1, true); if (IS_ERR(mapping)) { err = PTR_ERR(mapping); goto err_enc_opts_mapping; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h index 48cac5bf606d..c3a47349f447 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h @@ -86,6 +86,14 @@ struct mlx5_mapped_obj { #define esw_chains(esw) \ ((esw)->fdb_table.offloads.esw_chains_priv) +enum { + MAPPING_TYPE_CHAIN, + MAPPING_TYPE_TUNNEL, + MAPPING_TYPE_TUNNEL_ENC_OPTS, + MAPPING_TYPE_LABELS, + MAPPING_TYPE_ZONE, +}; + struct vport_ingress { struct mlx5_flow_table *acl; struct mlx5_flow_handle *allow_rule; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 1735be77e1fd..dd5eadd6047b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -2787,6 +2787,7 @@ int esw_offloads_enable(struct mlx5_eswitch *esw) struct mapping_ctx *reg_c0_obj_pool; struct mlx5_vport *vport; unsigned long i; + u64 mapping_id; int err; if (MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, reformat) && @@ -2810,9 +2811,13 @@ int esw_offloads_enable(struct mlx5_eswitch *esw) if (err) goto err_vport_metadata; - reg_c0_obj_pool = mapping_create(sizeof(struct mlx5_mapped_obj), - ESW_REG_C0_USER_DATA_METADATA_MASK, - true); + mapping_id = mlx5_query_nic_system_image_guid(esw->dev); + + reg_c0_obj_pool = mapping_create_for_id(mapping_id, MAPPING_TYPE_CHAIN, + sizeof(struct mlx5_mapped_obj), + ESW_REG_C0_USER_DATA_METADATA_MASK, + true); + if (IS_ERR(reg_c0_obj_pool)) { err = PTR_ERR(reg_c0_obj_pool); goto err_pool; From patchwork Tue Aug 3 23:19:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 491327 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 156ADC4320A for ; Tue, 3 Aug 2021 23:20:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 00F4B60F94 for ; Tue, 3 Aug 2021 23:20:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234017AbhHCXUp (ORCPT ); Tue, 3 Aug 2021 19:20:45 -0400 Received: from mail.kernel.org ([198.145.29.99]:38988 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233570AbhHCXUe (ORCPT ); Tue, 3 Aug 2021 19:20:34 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 
2A61461040; Tue, 3 Aug 2021 23:20:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032822; bh=g1GRdjhhmYnq4odN6yrFsA6z4OJcL9sJuSOMn+enSIA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=YcGfgtmdmwyGVqyUyWxGjIp4/Glg9XvPPHR8jN8kXdqskAACQqQB+fRYYqe9OSyN5 AsRvkpRBmiY1DEyGInfZOjxBJUei3H9g5ukiyfXO8wKy4qtoYKdq71jn+KlQYPID7h gOJIXXAsBDos31Mx2/DUjdeziiLSUU5USmVkdGpKrPBWvDCfDnFrTdrvW1kE4SNwss vmd/gbFAtmrrUCCdBgXYXp92SPxkzuOJliAE7jOPUvNNqO2b8sjddYEKtmfaeL5ueF 4Y86vzD0ajiwtpCNmAvmEgA+1rtNew4MgUe/asCsv5rtEKlhiNJI2ZtAsDFGr5Yd/v acCJPq88GPnbw== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch , Mark Zhang Subject: [PATCH mlx5-next 10/14] net/mlx5: Add send to vport rules on paired device Date: Tue, 3 Aug 2021 16:19:55 -0700 Message-Id: <20210803231959.26513-11-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Mark Bloch When two mlx5 devices are paired in switchdev mode, always offload the send-to-vport rule to the peer E-Switch. This allows to abstract the logic when this is really necessary (single FDB) and combine the logic of both cases into one. Signed-off-by: Mark Bloch Reviewed-by: Mark Zhang Signed-off-by: Saeed Mahameed --- .../net/ethernet/mellanox/mlx5/core/en_rep.c | 86 ++++++++++++++++++- .../net/ethernet/mellanox/mlx5/core/en_rep.h | 2 + .../mellanox/mlx5/core/eswitch_offloads.c | 16 +++- 3 files changed, 101 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index 1d016cc64015..cc34600b4dde 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -49,6 +49,7 @@ #include "en/devlink.h" #include "fs_core.h" #include "lib/mlx5.h" +#include "lib/devcom.h" #define CREATE_TRACE_POINTS #include "diag/en_rep_tracepoint.h" #include "en_accel/ipsec.h" @@ -310,6 +311,8 @@ static void mlx5e_sqs2vport_stop(struct mlx5_eswitch *esw, rpriv = mlx5e_rep_to_rep_priv(rep); list_for_each_entry_safe(rep_sq, tmp, &rpriv->vport_sqs_list, list) { mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule); + if (rep_sq->send_to_vport_rule_peer) + mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule_peer); list_del(&rep_sq->list); kfree(rep_sq); } @@ -319,6 +322,7 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw, struct mlx5_eswitch_rep *rep, u32 *sqns_array, int sqns_num) { + struct mlx5_eswitch *peer_esw = NULL; struct mlx5_flow_handle *flow_rule; struct mlx5e_rep_priv *rpriv; struct mlx5e_rep_sq *rep_sq; @@ -329,6 +333,10 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw, return 0; rpriv = mlx5e_rep_to_rep_priv(rep); + if (mlx5_devcom_is_paired(esw->dev->priv.devcom, MLX5_DEVCOM_ESW_OFFLOADS)) + peer_esw = mlx5_devcom_get_peer_data(esw->dev->priv.devcom, + MLX5_DEVCOM_ESW_OFFLOADS); + for (i = 0; i < sqns_num; i++) { rep_sq = kzalloc(sizeof(*rep_sq), GFP_KERNEL); if (!rep_sq) { @@ -345,12 +353,34 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw, goto out_err; } rep_sq->send_to_vport_rule = flow_rule; + rep_sq->sqn = sqns_array[i]; + + if (peer_esw) { + flow_rule = mlx5_eswitch_add_send_to_vport_rule(peer_esw, esw, + rep, sqns_array[i]); + if (IS_ERR(flow_rule)) { + err = 
PTR_ERR(flow_rule); + mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule); + kfree(rep_sq); + goto out_err; + } + rep_sq->send_to_vport_rule_peer = flow_rule; + } + list_add(&rep_sq->list, &rpriv->vport_sqs_list); } + + if (peer_esw) + mlx5_devcom_release_peer_data(esw->dev->priv.devcom, MLX5_DEVCOM_ESW_OFFLOADS); + return 0; out_err: mlx5e_sqs2vport_stop(esw, rep); + + if (peer_esw) + mlx5_devcom_release_peer_data(esw->dev->priv.devcom, MLX5_DEVCOM_ESW_OFFLOADS); + return err; } @@ -1264,10 +1294,64 @@ static void *mlx5e_vport_rep_get_proto_dev(struct mlx5_eswitch_rep *rep) return rpriv->netdev; } +static void mlx5e_vport_rep_event_unpair(struct mlx5_eswitch_rep *rep) +{ + struct mlx5e_rep_priv *rpriv; + struct mlx5e_rep_sq *rep_sq; + + rpriv = mlx5e_rep_to_rep_priv(rep); + list_for_each_entry(rep_sq, &rpriv->vport_sqs_list, list) { + if (!rep_sq->send_to_vport_rule_peer) + continue; + mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule_peer); + rep_sq->send_to_vport_rule_peer = NULL; + } +} + +static int mlx5e_vport_rep_event_pair(struct mlx5_eswitch *esw, + struct mlx5_eswitch_rep *rep, + struct mlx5_eswitch *peer_esw) +{ + struct mlx5_flow_handle *flow_rule; + struct mlx5e_rep_priv *rpriv; + struct mlx5e_rep_sq *rep_sq; + + rpriv = mlx5e_rep_to_rep_priv(rep); + list_for_each_entry(rep_sq, &rpriv->vport_sqs_list, list) { + if (rep_sq->send_to_vport_rule_peer) + continue; + flow_rule = mlx5_eswitch_add_send_to_vport_rule(peer_esw, esw, rep, rep_sq->sqn); + if (IS_ERR(flow_rule)) + goto err_out; + rep_sq->send_to_vport_rule_peer = flow_rule; + } + + return 0; +err_out: + mlx5e_vport_rep_event_unpair(rep); + return PTR_ERR(flow_rule); +} + +static int mlx5e_vport_rep_event(struct mlx5_eswitch *esw, + struct mlx5_eswitch_rep *rep, + enum mlx5_switchdev_event event, + void *data) +{ + int err = 0; + + if (event == MLX5_SWITCHDEV_EVENT_PAIR) + err = mlx5e_vport_rep_event_pair(esw, rep, data); + else if (event == MLX5_SWITCHDEV_EVENT_UNPAIR) + mlx5e_vport_rep_event_unpair(rep); + + return err; +} + static const struct mlx5_eswitch_rep_ops rep_ops = { .load = mlx5e_vport_rep_load, .unload = mlx5e_vport_rep_unload, - .get_proto_dev = mlx5e_vport_rep_get_proto_dev + .get_proto_dev = mlx5e_vport_rep_get_proto_dev, + .event = mlx5e_vport_rep_event, }; static int mlx5e_rep_probe(struct auxiliary_device *adev, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h index 47a2dfb7792a..8f0c82448eec 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h @@ -207,6 +207,8 @@ struct mlx5e_encap_entry { struct mlx5e_rep_sq { struct mlx5_flow_handle *send_to_vport_rule; + struct mlx5_flow_handle *send_to_vport_rule_peer; + u32 sqn; struct list_head list; }; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index b57a5c188832..e02a8bd2bd96 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -1616,7 +1616,18 @@ static int esw_create_offloads_fdb_tables(struct mlx5_eswitch *esw) goto ns_err; } - table_size = esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ + + /* To be strictly correct: + * MLX5_MAX_PORTS * (esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ) + * should be: + * esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ + + * peer_esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ + * but as the peer device 
might not be in switchdev mode it's not + * possible. We use the fact that by default FW sets max vfs and max sfs + * to the same value on both devices. If it needs to be changed in the future note + * the peer miss group should also be created based on the number of + * total vports of the peer (currently is also uses esw->total_vports). + */ + table_size = MLX5_MAX_PORTS * (esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ) + MLX5_ESW_MISS_FLOWS + esw->total_vports + esw->esw_funcs.num_vfs; /* create the slow path fdb with encap set, so further table instances @@ -1673,7 +1684,8 @@ static int esw_create_offloads_fdb_tables(struct mlx5_eswitch *esw) source_eswitch_owner_vhca_id_valid, 1); } - ix = esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ; + /* See comment above table_size calculation */ + ix = MLX5_MAX_PORTS * (esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ); MLX5_SET(create_flow_group_in, flow_group_in, start_flow_index, 0); MLX5_SET(create_flow_group_in, flow_group_in, end_flow_index, ix - 1); From patchwork Tue Aug 3 23:19:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 491326 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 99980C4338F for ; Tue, 3 Aug 2021 23:20:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7C61A60F58 for ; Tue, 3 Aug 2021 23:20:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234175AbhHCXUu (ORCPT ); Tue, 3 Aug 2021 19:20:50 -0400 Received: from mail.kernel.org ([198.145.29.99]:38966 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233670AbhHCXUf (ORCPT ); Tue, 3 Aug 2021 19:20:35 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 21CFA60F93; Tue, 3 Aug 2021 23:20:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032823; bh=JcEopKgjR7krcW6ZEC6zhfogi5KM4huoKxlKQeSHzDQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HGF2wfSd2w6Dpu4I+XW80N4ha3px7GwDxpBJVREXt+rgusGBfT8ru+6L9UYqb4b4E w6lpCQRyzj8o3NMiuBbuw+OHYC32cSaXmTGT4IDCQmFaFLB0SH2vaDfQq73P6PAJx/ kP+Mkvg6P6/bzKO1oOmDsNLIpsKJcldnQg09pBnkzMqXilEm9/qUtDpc03UvVonFFo rN6FHlj2FaZM0LcfE7sOjtE4eHBcu0ngM/PlxLSlSey2onOjTmXUMZ5fakkQ4wOAZ6 UzvWh0WPU+SJouQ+Zmlnj34KpomkPW5PKaRWRdc1URGKJ/kKI9jg89yJk5V/cuLZYL gYxNRyrxiNU7g== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch , Mark Zhang Subject: [PATCH mlx5-next 12/14] net/mlx5: Lag, move lag destruction to a workqueue Date: Tue, 3 Aug 2021 16:19:57 -0700 Message-Id: <20210803231959.26513-13-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Mark Bloch If a netdev is removed from the lag the lag should be destroyed. 
With downstream patches this might trigger a reconfiguration of representors on a different eswitch, and as such we don't have the proper locking to do so from this path. Move the destruction to be done by the workqueue. As the destruction won't affect the netdev side, it is okay to do so. The RDMA side will be reconfigured, and it is already coded to handle such reconfiguration. Signed-off-by: Mark Bloch Reviewed-by: Mark Zhang Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/lag.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag.c index 459e3e5ef13f..89cd2b2af50a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.c @@ -371,12 +371,13 @@ static void mlx5_do_bond(struct mlx5_lag *ldev) bool do_bond, roce_lag; int err; - if (!mlx5_lag_is_ready(ldev)) - return; - - tracker = ldev->tracker; + if (!mlx5_lag_is_ready(ldev)) { + do_bond = false; + } else { + tracker = ldev->tracker; - do_bond = tracker.is_bonded && mlx5_lag_check_prereq(ldev); + do_bond = tracker.is_bonded && mlx5_lag_check_prereq(ldev); + } if (do_bond && !__mlx5_lag_is_active(ldev)) { roce_lag = !mlx5_sriov_is_enabled(dev0) && @@ -733,11 +734,11 @@ void mlx5_lag_remove_netdev(struct mlx5_core_dev *dev, if (!ldev) return; - if (__mlx5_lag_is_active(ldev)) - mlx5_disable_lag(ldev); - mlx5_ldev_remove_netdev(ldev, netdev); ldev->flags &= ~MLX5_LAG_FLAG_READY; + + if (__mlx5_lag_is_active(ldev)) + mlx5_queue_bond_work(ldev, 0); } /* Must be called with intf_mutex held */ From patchwork Tue Aug 3 23:19:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 491325 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BD078C4338F for ; Tue, 3 Aug 2021 23:20:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A655B60F70 for ; Tue, 3 Aug 2021 23:20:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233856AbhHCXUy (ORCPT ); Tue, 3 Aug 2021 19:20:54 -0400 Received: from mail.kernel.org ([198.145.29.99]:38984 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233743AbhHCXUf (ORCPT ); Tue, 3 Aug 2021 19:20:35 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 1510760F70; Tue, 3 Aug 2021 23:20:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032824; bh=wPMbOjqoH1dBdJO1XE08P8TUVvPbWReG9PxCWFW6RTE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=TXyLsvAg7vUfVeQZiTI971az1RFIJbbbInyZ/Pyiz2iwc0k1uZOLPx5lhxmT5RbXY 6V37Lk42+Zt6xKOXy0ytIc3gLWM5M/uMk1Te9pb04GB686BjQyOm9iV4tgwjYL/C0k n3LRATgsI/CLSCUsVfNLerkacPAEZtt3N9pM+6UHzrDdcaOKH4u2Q4pa5blJejSukz xjdWhn6gVtqWCkeFztG1cynIB9VCGezMCBJQcUfg2rQtpaZxH0bKjSCQGbqfozxOtm ll4AaiMT6274LjZA2LD4HP3R1Hx37pDhu0oKJj63c1PosZkylbQO4AZAoqXRwDhA+A 7aPJyh8phggFQ== From: Saeed 
Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch Subject: [PATCH mlx5-next 14/14] net/mlx5: Lag, Create shared FDB when in switchdev mode Date: Tue, 3 Aug 2021 16:19:59 -0700 Message-Id: <20210803231959.26513-15-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Mark Bloch If both eswitches are in switchdev mode and the uplink representors are enslaved to the same bond device, create a shared FDB configuration. When moving to shared FDB mode, not only does the hardware need to be configured but the RDMA driver also needs to reconfigure itself. When such a change is done, unload the RDMA devices, configure the hardware and load the RDMA representors. When destroying the lag (which can happen if a PCI function is unbound, the driver is unloaded or a netdev is simply removed from the bond), make sure to restore the system to the previous state only when possible. For example, if a PCI function is unbound there is no need to load the representors as the device is going away. Signed-off-by: Mark Bloch Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/lag.c | 118 +++++++++++++++--- drivers/net/ethernet/mellanox/mlx5/core/lag.h | 3 +- .../net/ethernet/mellanox/mlx5/core/lag_mp.c | 2 +- 3 files changed, 105 insertions(+), 18 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag.c index 89cd2b2af50a..f4dfa55c8c7e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.c @@ -32,7 +32,9 @@ #include #include +#include #include +#include "lib/devcom.h" #include "mlx5_core.h" #include "eswitch.h" #include "lag.h" @@ -45,7 +47,7 @@ static DEFINE_SPINLOCK(lag_lock); static int mlx5_cmd_create_lag(struct mlx5_core_dev *dev, u8 remap_port1, - u8 remap_port2) + u8 remap_port2, bool shared_fdb) { u32 in[MLX5_ST_SZ_DW(create_lag_in)] = {}; void *lag_ctx = MLX5_ADDR_OF(create_lag_in, in, ctx); @@ -54,6 +56,7 @@ static int mlx5_cmd_create_lag(struct mlx5_core_dev *dev, u8 remap_port1, MLX5_SET(lagc, lag_ctx, tx_remap_affinity_1, remap_port1); MLX5_SET(lagc, lag_ctx, tx_remap_affinity_2, remap_port2); + MLX5_SET(lagc, lag_ctx, fdb_selection_mode, shared_fdb); return mlx5_cmd_exec_in(dev, create_lag, in); } @@ -224,35 +227,59 @@ void mlx5_modify_lag(struct mlx5_lag *ldev, } static int mlx5_create_lag(struct mlx5_lag *ldev, - struct lag_tracker *tracker) + struct lag_tracker *tracker, + bool shared_fdb) { struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev; + struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev; + u32 in[MLX5_ST_SZ_DW(destroy_lag_in)] = {}; int err; mlx5_infer_tx_affinity_mapping(tracker, &ldev->v2p_map[MLX5_LAG_P1], &ldev->v2p_map[MLX5_LAG_P2]); - mlx5_core_info(dev0, "lag map port 1:%d port 2:%d", - ldev->v2p_map[MLX5_LAG_P1], ldev->v2p_map[MLX5_LAG_P2]); + mlx5_core_info(dev0, "lag map port 1:%d port 2:%d shared_fdb:%d", + ldev->v2p_map[MLX5_LAG_P1], ldev->v2p_map[MLX5_LAG_P2], + shared_fdb); err = mlx5_cmd_create_lag(dev0, ldev->v2p_map[MLX5_LAG_P1], - ldev->v2p_map[MLX5_LAG_P2]); - if (err) + ldev->v2p_map[MLX5_LAG_P2], shared_fdb); + if (err) { mlx5_core_err(dev0, "Failed to create LAG (%d)\n", err); + return err; + } + + if (shared_fdb) { + err = mlx5_eswitch_offloads_config_single_fdb(dev0->priv.eswitch, + 
dev1->priv.eswitch); + if (err) + mlx5_core_err(dev0, "Can't enable single FDB mode\n"); + else + mlx5_core_info(dev0, "Operation mode is single FDB\n"); + } + + if (err) { + MLX5_SET(destroy_lag_in, in, opcode, MLX5_CMD_OP_DESTROY_LAG); + if (mlx5_cmd_exec_in(dev0, destroy_lag, in)) + mlx5_core_err(dev0, + "Failed to deactivate RoCE LAG; driver restart required\n"); + } + return err; } int mlx5_activate_lag(struct mlx5_lag *ldev, struct lag_tracker *tracker, - u8 flags) + u8 flags, + bool shared_fdb) { bool roce_lag = !!(flags & MLX5_LAG_FLAG_ROCE); struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev; int err; - err = mlx5_create_lag(ldev, tracker); + err = mlx5_create_lag(ldev, tracker, shared_fdb); if (err) { if (roce_lag) { mlx5_core_err(dev0, @@ -266,6 +293,7 @@ int mlx5_activate_lag(struct mlx5_lag *ldev, } ldev->flags |= flags; + ldev->shared_fdb = shared_fdb; return 0; } @@ -278,6 +306,12 @@ static int mlx5_deactivate_lag(struct mlx5_lag *ldev) ldev->flags &= ~MLX5_LAG_MODE_FLAGS; + if (ldev->shared_fdb) { + mlx5_eswitch_offloads_destroy_single_fdb(ldev->pf[MLX5_LAG_P1].dev->priv.eswitch, + ldev->pf[MLX5_LAG_P2].dev->priv.eswitch); + ldev->shared_fdb = false; + } + MLX5_SET(destroy_lag_in, in, opcode, MLX5_CMD_OP_DESTROY_LAG); err = mlx5_cmd_exec_in(dev0, destroy_lag, in); if (err) { @@ -333,6 +367,10 @@ static void mlx5_lag_remove_devices(struct mlx5_lag *ldev) if (!ldev->pf[i].dev) continue; + if (ldev->pf[i].dev->priv.flags & + MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV) + continue; + ldev->pf[i].dev->priv.flags |= MLX5_PRIV_FLAGS_DISABLE_IB_ADEV; mlx5_rescan_drivers_locked(ldev->pf[i].dev); } @@ -342,12 +380,15 @@ static void mlx5_disable_lag(struct mlx5_lag *ldev) { struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev; struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev; + bool shared_fdb = ldev->shared_fdb; bool roce_lag; int err; roce_lag = __mlx5_lag_is_roce(ldev); - if (roce_lag) { + if (shared_fdb) { + mlx5_lag_remove_devices(ldev); + } else if (roce_lag) { if (!(dev0->priv.flags & MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV)) { dev0->priv.flags |= MLX5_PRIV_FLAGS_DISABLE_IB_ADEV; mlx5_rescan_drivers_locked(dev0); @@ -359,8 +400,34 @@ static void mlx5_disable_lag(struct mlx5_lag *ldev) if (err) return; - if (roce_lag) + if (shared_fdb || roce_lag) mlx5_lag_add_devices(ldev); + + if (shared_fdb) { + if (!(dev0->priv.flags & MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV)) + mlx5_eswitch_reload_reps(dev0->priv.eswitch); + if (!(dev1->priv.flags & MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV)) + mlx5_eswitch_reload_reps(dev1->priv.eswitch); + } +} + +static bool mlx5_shared_fdb_supported(struct mlx5_lag *ldev) +{ + struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev; + struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev; + + if (is_mdev_switchdev_mode(dev0) && + is_mdev_switchdev_mode(dev1) && + mlx5_eswitch_vport_match_metadata_enabled(dev0->priv.eswitch) && + mlx5_eswitch_vport_match_metadata_enabled(dev1->priv.eswitch) && + mlx5_devcom_is_paired(dev0->priv.devcom, + MLX5_DEVCOM_ESW_OFFLOADS) && + MLX5_CAP_GEN(dev1, lag_native_fdb_selection) && + MLX5_CAP_ESW(dev1, root_ft_on_other_esw) && + MLX5_CAP_ESW(dev0, esw_shared_ingress_acl)) + return true; + + return false; } static void mlx5_do_bond(struct mlx5_lag *ldev) @@ -380,6 +447,8 @@ static void mlx5_do_bond(struct mlx5_lag *ldev) } if (do_bond && !__mlx5_lag_is_active(ldev)) { + bool shared_fdb = mlx5_shared_fdb_supported(ldev); + roce_lag = !mlx5_sriov_is_enabled(dev0) && !mlx5_sriov_is_enabled(dev1); @@ -389,23 +458,40 @@ static void 
mlx5_do_bond(struct mlx5_lag *ldev) dev1->priv.eswitch->mode == MLX5_ESWITCH_NONE; #endif - if (roce_lag) + if (shared_fdb || roce_lag) mlx5_lag_remove_devices(ldev); err = mlx5_activate_lag(ldev, &tracker, roce_lag ? MLX5_LAG_FLAG_ROCE : - MLX5_LAG_FLAG_SRIOV); + MLX5_LAG_FLAG_SRIOV, + shared_fdb); if (err) { - if (roce_lag) + if (shared_fdb || roce_lag) mlx5_lag_add_devices(ldev); return; - } - - if (roce_lag) { + } else if (roce_lag) { dev0->priv.flags &= ~MLX5_PRIV_FLAGS_DISABLE_IB_ADEV; mlx5_rescan_drivers_locked(dev0); mlx5_nic_vport_enable_roce(dev1); + } else if (shared_fdb) { + dev0->priv.flags &= ~MLX5_PRIV_FLAGS_DISABLE_IB_ADEV; + mlx5_rescan_drivers_locked(dev0); + + err = mlx5_eswitch_reload_reps(dev0->priv.eswitch); + if (!err) + err = mlx5_eswitch_reload_reps(dev1->priv.eswitch); + + if (err) { + dev0->priv.flags |= MLX5_PRIV_FLAGS_DISABLE_IB_ADEV; + mlx5_rescan_drivers_locked(dev0); + mlx5_deactivate_lag(ldev); + mlx5_lag_add_devices(ldev); + mlx5_eswitch_reload_reps(dev0->priv.eswitch); + mlx5_eswitch_reload_reps(dev1->priv.eswitch); + mlx5_core_err(dev0, "Failed to enable lag\n"); + return; + } } } else if (do_bond && __mlx5_lag_is_active(ldev)) { mlx5_modify_lag(ldev, &tracker); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag.h index e1d7a6671cf3..d4bae528954e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.h @@ -73,7 +73,8 @@ void mlx5_modify_lag(struct mlx5_lag *ldev, struct lag_tracker *tracker); int mlx5_activate_lag(struct mlx5_lag *ldev, struct lag_tracker *tracker, - u8 flags); + u8 flags, + bool shared_fdb); int mlx5_lag_dev_get_netdev_idx(struct mlx5_lag *ldev, struct net_device *ndev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c b/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c index c4bf8b679541..011b639b29bf 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c @@ -161,7 +161,7 @@ static void mlx5_lag_fib_route_event(struct mlx5_lag *ldev, struct lag_tracker tracker; tracker = ldev->tracker; - mlx5_activate_lag(ldev, &tracker, MLX5_LAG_FLAG_MULTIPATH); + mlx5_activate_lag(ldev, &tracker, MLX5_LAG_FLAG_MULTIPATH, false); } mlx5_lag_set_port_affinity(ldev, MLX5_LAG_NORMAL_AFFINITY);
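To recap how the series fits together, below is a condensed, illustrative sketch of the bond-activation decision after this last patch; example_do_bond_flow() is a hypothetical name and the body is a simplified paraphrase of mlx5_do_bond() above, with error handling and representor reloading omitted.

/* Simplified sketch of the decision flow once both ports request bonding. */
static void example_do_bond_flow(struct mlx5_lag *ldev)
{
	struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev;
	struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev;
	struct lag_tracker tracker = ldev->tracker;
	bool shared_fdb = mlx5_shared_fdb_supported(ldev);
	bool roce_lag = !mlx5_sriov_is_enabled(dev0) &&
			!mlx5_sriov_is_enabled(dev1);

	/* Both RoCE lag and shared FDB re-probe the IB devices, so the
	 * auxiliary devices are removed before the LAG object is created.
	 */
	if (shared_fdb || roce_lag)
		mlx5_lag_remove_devices(ldev);

	/* Multipath callers (lag_mp.c) keep passing shared_fdb == false. */
	mlx5_activate_lag(ldev, &tracker,
			  roce_lag ? MLX5_LAG_FLAG_ROCE : MLX5_LAG_FLAG_SRIOV,
			  shared_fdb);
}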