From patchwork Tue Dec 15 09:03:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 344890 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B5BE2C2BBCA for ; Tue, 15 Dec 2020 09:05:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6B89F2222A for ; Tue, 15 Dec 2020 09:05:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727834AbgLOJFZ (ORCPT ); Tue, 15 Dec 2020 04:05:25 -0500 Received: from mail.kernel.org ([198.145.29.99]:35822 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727084AbgLOJEs (ORCPT ); Tue, 15 Dec 2020 04:04:48 -0500 From: Saeed Mahameed Authentication-Results: mail.kernel.org; dkim=permerror (bad message/signature format) To: "David S. Miller" , Jakub Kicinski , Jason Gunthorpe Cc: Leon Romanovsky , netdev@vger.kernel.org, linux-rdma@vger.kernel.org, David Ahern , Jacob Keller , Sridhar Samudrala , david.m.ertman@intel.com, dan.j.williams@intel.com, kiran.patil@intel.com, gregkh@linuxfoundation.org, Parav Pandit , Stephen Rothwell , Saeed Mahameed Subject: [net-next v5 01/15] net/mlx5: Fix compilation warning for 32-bit platform Date: Tue, 15 Dec 2020 01:03:44 -0800 Message-Id: <20201215090358.240365-2-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201215090358.240365-1-saeed@kernel.org> References: <20201215090358.240365-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Parav Pandit MLX5_GENERAL_OBJECT_TYPES types bitfield is 64-bit field. Defining an enum for such bit fields on 32-bit platform results in below warning. ./include/vdso/bits.h:7:26: warning: left shift count >= width of type [-Wshift-count-overflow] ^ ./include/linux/mlx5/mlx5_ifc.h:10716:46: note: in expansion of macro ‘BIT’ MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_SAMPLER = BIT(0x20), ^~~ Use 32-bit friendly left shift. Fixes: 2a2970891647 ("net/mlx5: Add sample offload hardware bits and structures") Signed-off-by: Parav Pandit Reported-by: Stephen Rothwell Signed-off-by: Leon Romanovsky Signed-off-by: Saeed Mahameed --- include/linux/mlx5/mlx5_ifc.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index 0d6e287d614f..b9f15935dfe5 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -10711,9 +10711,9 @@ struct mlx5_ifc_affiliated_event_header_bits { }; enum { - MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_ENCRYPTION_KEY = BIT(0xc), - MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_IPSEC = BIT(0x13), - MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_SAMPLER = BIT(0x20), + MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_ENCRYPTION_KEY = 1ULL << 0xc, + MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_IPSEC = 1ULL << 0x13, + MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_SAMPLER = 1ULL << 0x20, }; enum { From patchwork Tue Dec 15 09:03:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 344321 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id ED462C4361B for ; Tue, 15 Dec 2020 09:11:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9A81922225 for ; Tue, 15 Dec 2020 09:11:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727793AbgLOJFZ (ORCPT ); Tue, 15 Dec 2020 04:05:25 -0500 Received: from mail.kernel.org ([198.145.29.99]:35860 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727198AbgLOJEu (ORCPT ); Tue, 15 Dec 2020 04:04:50 -0500 From: Saeed Mahameed Authentication-Results: mail.kernel.org; dkim=permerror (bad message/signature format) To: "David S. Miller" , Jakub Kicinski , Jason Gunthorpe Cc: Leon Romanovsky , netdev@vger.kernel.org, linux-rdma@vger.kernel.org, David Ahern , Jacob Keller , Sridhar Samudrala , david.m.ertman@intel.com, dan.j.williams@intel.com, kiran.patil@intel.com, gregkh@linuxfoundation.org, Parav Pandit , Jiri Pirko , Vu Pham , Saeed Mahameed Subject: [net-next v5 02/15] devlink: Prepare code to fill multiple port function attributes Date: Tue, 15 Dec 2020 01:03:45 -0800 Message-Id: <20201215090358.240365-3-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201215090358.240365-1-saeed@kernel.org> References: <20201215090358.240365-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Parav Pandit Prepare code to fill zero or more port function optional attributes. Subsequent patch makes use of this to fill more port function attributes. Signed-off-by: Parav Pandit Reviewed-by: Jiri Pirko Reviewed-by: Vu Pham Signed-off-by: Saeed Mahameed --- net/core/devlink.c | 63 +++++++++++++++++++++++----------------------- 1 file changed, 32 insertions(+), 31 deletions(-) diff --git a/net/core/devlink.c b/net/core/devlink.c index ee828e4b1007..13e0de80c4f9 100644 --- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -712,6 +712,31 @@ static int devlink_nl_port_attrs_put(struct sk_buff *msg, return 0; } +static int +devlink_port_function_hw_addr_fill(struct devlink *devlink, const struct devlink_ops *ops, + struct devlink_port *port, struct sk_buff *msg, + struct netlink_ext_ack *extack, bool *msg_updated) +{ + u8 hw_addr[MAX_ADDR_LEN]; + int hw_addr_len; + int err; + + if (!ops->port_function_hw_addr_get) + return 0; + + err = ops->port_function_hw_addr_get(devlink, port, hw_addr, &hw_addr_len, extack); + if (err) { + if (err == -EOPNOTSUPP) + return 0; + return err; + } + err = nla_put(msg, DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR, hw_addr_len, hw_addr); + if (err) + return err; + *msg_updated = true; + return 0; +} + static int devlink_nl_port_function_attrs_put(struct sk_buff *msg, struct devlink_port *port, struct netlink_ext_ack *extack) @@ -719,36 +744,16 @@ devlink_nl_port_function_attrs_put(struct sk_buff *msg, struct devlink_port *por struct devlink *devlink = port->devlink; const struct devlink_ops *ops; struct nlattr *function_attr; - bool empty_nest = true; - int err = 0; + bool msg_updated = false; + int err; function_attr = nla_nest_start_noflag(msg, DEVLINK_ATTR_PORT_FUNCTION); if (!function_attr) return -EMSGSIZE; ops = devlink->ops; - if (ops->port_function_hw_addr_get) { - int hw_addr_len; - u8 hw_addr[MAX_ADDR_LEN]; - - err = ops->port_function_hw_addr_get(devlink, port, hw_addr, &hw_addr_len, extack); - if (err == -EOPNOTSUPP) { - /* Port function attributes are optional for a port. If port doesn't - * support function attribute, returning -EOPNOTSUPP is not an error. - */ - err = 0; - goto out; - } else if (err) { - goto out; - } - err = nla_put(msg, DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR, hw_addr_len, hw_addr); - if (err) - goto out; - empty_nest = false; - } - -out: - if (err || empty_nest) + err = devlink_port_function_hw_addr_fill(devlink, ops, port, msg, extack, &msg_updated); + if (err || !msg_updated) nla_nest_cancel(msg, function_attr); else nla_nest_end(msg, function_attr); @@ -986,7 +991,6 @@ devlink_port_function_hw_addr_set(struct devlink *devlink, struct devlink_port * const struct devlink_ops *ops; const u8 *hw_addr; int hw_addr_len; - int err; hw_addr = nla_data(attr); hw_addr_len = nla_len(attr); @@ -1011,12 +1015,7 @@ devlink_port_function_hw_addr_set(struct devlink *devlink, struct devlink_port * return -EOPNOTSUPP; } - err = ops->port_function_hw_addr_set(devlink, port, hw_addr, hw_addr_len, extack); - if (err) - return err; - - devlink_port_notify(port, DEVLINK_CMD_PORT_NEW); - return 0; + return ops->port_function_hw_addr_set(devlink, port, hw_addr, hw_addr_len, extack); } static int @@ -1037,6 +1036,8 @@ devlink_port_function_set(struct devlink *devlink, struct devlink_port *port, if (attr) err = devlink_port_function_hw_addr_set(devlink, port, attr, extack); + if (!err) + devlink_port_notify(port, DEVLINK_CMD_PORT_NEW); return err; } From patchwork Tue Dec 15 09:03:46 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 344328 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 52F8EC4361B for ; Tue, 15 Dec 2020 09:05:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 13F1C2222A for ; Tue, 15 Dec 2020 09:05:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727734AbgLOJFO (ORCPT ); Tue, 15 Dec 2020 04:05:14 -0500 Received: from mail.kernel.org ([198.145.29.99]:35882 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726730AbgLOJEu (ORCPT ); Tue, 15 Dec 2020 04:04:50 -0500 From: Saeed Mahameed Authentication-Results: mail.kernel.org; dkim=permerror (bad message/signature format) To: "David S. Miller" , Jakub Kicinski , Jason Gunthorpe Cc: Leon Romanovsky , netdev@vger.kernel.org, linux-rdma@vger.kernel.org, David Ahern , Jacob Keller , Sridhar Samudrala , david.m.ertman@intel.com, dan.j.williams@intel.com, kiran.patil@intel.com, gregkh@linuxfoundation.org, Parav Pandit , Jiri Pirko , Vu Pham , Saeed Mahameed Subject: [net-next v5 03/15] devlink: Introduce PCI SF port flavour and port attribute Date: Tue, 15 Dec 2020 01:03:46 -0800 Message-Id: <20201215090358.240365-4-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201215090358.240365-1-saeed@kernel.org> References: <20201215090358.240365-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Parav Pandit A PCI sub-function (SF) represents a portion of the device similar to PCI VF. In an eswitch, PCI SF may have port which is normally represented using a representor netdevice. To have better visibility of eswitch port, its association with SF, and its representor netdevice, introduce a PCI SF port flavour. When devlink port flavour is PCI SF, fill up PCI SF attributes of the port. Extend port name creation using PCI PF and SF number scheme on best effort basis, so that vendor drivers can skip defining their own scheme. This is done as cApfNSfM, where A, N and M are controller, PCI PF and PCI SF number respectively. This is similar to existing naming for PCI PF and PCI VF ports. An example view of a PCI SF port: $ devlink port show pci/0000:06:00.0/32768 pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false function: hw_addr 00:00:00:00:88:88 state active opstate attached $ devlink port show pci/0000:06:00.0/32768 -jp { "port": { "pci/0000:06:00.0/32768": { "type": "eth", "netdev": "ens2f0npf0sf88", "flavour": "pcisf", "controller": 0, "pfnum": 0, "sfnum": 88, "external": false, "splittable": false, "function": { "hw_addr": "00:00:00:00:88:88", "state": "active", "opstate": "attached" } } } } Signed-off-by: Parav Pandit Reviewed-by: Jiri Pirko Reviewed-by: Vu Pham Signed-off-by: Saeed Mahameed --- include/net/devlink.h | 17 +++++++++++++ include/uapi/linux/devlink.h | 5 ++++ net/core/devlink.c | 46 ++++++++++++++++++++++++++++++++++++ 3 files changed, 68 insertions(+) diff --git a/include/net/devlink.h b/include/net/devlink.h index f466819cc477..5bd43f0a79a8 100644 --- a/include/net/devlink.h +++ b/include/net/devlink.h @@ -93,6 +93,20 @@ struct devlink_port_pci_vf_attrs { u8 external:1; }; +/** + * struct devlink_port_pci_sf_attrs - devlink port's PCI SF attributes + * @controller: Associated controller number + * @pf: Associated PCI PF number for this port. + * @sf: Associated PCI SF for of the PCI PF for this port. + * @external: when set, indicates if a port is for an external controller + */ +struct devlink_port_pci_sf_attrs { + u32 controller; + u16 pf; + u32 sf; + u8 external:1; +}; + /** * struct devlink_port_attrs - devlink port object * @flavour: flavour of the port @@ -114,6 +128,7 @@ struct devlink_port_attrs { struct devlink_port_phys_attrs phys; struct devlink_port_pci_pf_attrs pci_pf; struct devlink_port_pci_vf_attrs pci_vf; + struct devlink_port_pci_sf_attrs pci_sf; }; }; @@ -1404,6 +1419,8 @@ void devlink_port_attrs_pci_pf_set(struct devlink_port *devlink_port, u32 contro u16 pf, bool external); void devlink_port_attrs_pci_vf_set(struct devlink_port *devlink_port, u32 controller, u16 pf, u16 vf, bool external); +void devlink_port_attrs_pci_sf_set(struct devlink_port *devlink_port, u32 controller, + u16 pf, u32 sf, bool external); int devlink_sb_register(struct devlink *devlink, unsigned int sb_index, u32 size, u16 ingress_pools_count, u16 egress_pools_count, u16 ingress_tc_count, diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h index 5203f54a2be1..6fe00f10eb3f 100644 --- a/include/uapi/linux/devlink.h +++ b/include/uapi/linux/devlink.h @@ -200,6 +200,10 @@ enum devlink_port_flavour { DEVLINK_PORT_FLAVOUR_UNUSED, /* Port which exists in the switch, but * is not used in any way. */ + DEVLINK_PORT_FLAVOUR_PCI_SF, /* Represents eswitch port + * for the PCI SF. It is an internal + * port that faces the PCI SF. + */ }; enum devlink_param_cmode { @@ -529,6 +533,7 @@ enum devlink_attr { DEVLINK_ATTR_RELOAD_ACTION_INFO, /* nested */ DEVLINK_ATTR_RELOAD_ACTION_STATS, /* nested */ + DEVLINK_ATTR_PORT_PCI_SF_NUMBER, /* u32 */ /* add new attributes above here, update the policy in devlink.c */ __DEVLINK_ATTR_MAX, diff --git a/net/core/devlink.c b/net/core/devlink.c index 13e0de80c4f9..08eac247f200 100644 --- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -690,6 +690,15 @@ static int devlink_nl_port_attrs_put(struct sk_buff *msg, if (nla_put_u8(msg, DEVLINK_ATTR_PORT_EXTERNAL, attrs->pci_vf.external)) return -EMSGSIZE; break; + case DEVLINK_PORT_FLAVOUR_PCI_SF: + if (nla_put_u32(msg, DEVLINK_ATTR_PORT_CONTROLLER_NUMBER, + attrs->pci_sf.controller) || + nla_put_u16(msg, DEVLINK_ATTR_PORT_PCI_PF_NUMBER, attrs->pci_sf.pf) || + nla_put_u32(msg, DEVLINK_ATTR_PORT_PCI_SF_NUMBER, attrs->pci_sf.sf)) + return -EMSGSIZE; + if (nla_put_u8(msg, DEVLINK_ATTR_PORT_EXTERNAL, attrs->pci_sf.external)) + return -EMSGSIZE; + break; case DEVLINK_PORT_FLAVOUR_PHYSICAL: case DEVLINK_PORT_FLAVOUR_CPU: case DEVLINK_PORT_FLAVOUR_DSA: @@ -8373,6 +8382,33 @@ void devlink_port_attrs_pci_vf_set(struct devlink_port *devlink_port, u32 contro } EXPORT_SYMBOL_GPL(devlink_port_attrs_pci_vf_set); +/** + * devlink_port_attrs_pci_sf_set - Set PCI SF port attributes + * + * @devlink_port: devlink port + * @controller: associated controller number for the devlink port instance + * @pf: associated PF for the devlink port instance + * @sf: associated SF of a PF for the devlink port instance + * @external: indicates if the port is for an external controller + */ +void devlink_port_attrs_pci_sf_set(struct devlink_port *devlink_port, u32 controller, + u16 pf, u32 sf, bool external) +{ + struct devlink_port_attrs *attrs = &devlink_port->attrs; + int ret; + + if (WARN_ON(devlink_port->registered)) + return; + ret = __devlink_port_attrs_set(devlink_port, DEVLINK_PORT_FLAVOUR_PCI_SF); + if (ret) + return; + attrs->pci_sf.controller = controller; + attrs->pci_sf.pf = pf; + attrs->pci_sf.sf = sf; + attrs->pci_sf.external = external; +} +EXPORT_SYMBOL_GPL(devlink_port_attrs_pci_sf_set); + static int __devlink_port_phys_port_name_get(struct devlink_port *devlink_port, char *name, size_t len) { @@ -8421,6 +8457,16 @@ static int __devlink_port_phys_port_name_get(struct devlink_port *devlink_port, n = snprintf(name, len, "pf%uvf%u", attrs->pci_vf.pf, attrs->pci_vf.vf); break; + case DEVLINK_PORT_FLAVOUR_PCI_SF: + if (attrs->pci_sf.external) { + n = snprintf(name, len, "c%u", attrs->pci_sf.controller); + if (n >= len) + return -EINVAL; + len -= n; + name += n; + } + n = snprintf(name, len, "pf%usf%u", attrs->pci_sf.pf, attrs->pci_sf.sf); + break; } if (n >= len) From patchwork Tue Dec 15 09:03:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 344883 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9AD90C4361B for ; Tue, 15 Dec 2020 09:11:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 327D122255 for ; Tue, 15 Dec 2020 09:11:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727764AbgLOJFZ (ORCPT ); Tue, 15 Dec 2020 04:05:25 -0500 Received: from mail.kernel.org ([198.145.29.99]:35898 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726901AbgLOJEv (ORCPT ); Tue, 15 Dec 2020 04:04:51 -0500 From: Saeed Mahameed Authentication-Results: mail.kernel.org; dkim=permerror (bad message/signature format) To: "David S. Miller" , Jakub Kicinski , Jason Gunthorpe Cc: Leon Romanovsky , netdev@vger.kernel.org, linux-rdma@vger.kernel.org, David Ahern , Jacob Keller , Sridhar Samudrala , david.m.ertman@intel.com, dan.j.williams@intel.com, kiran.patil@intel.com, gregkh@linuxfoundation.org, Parav Pandit , Jiri Pirko , Vu Pham , Saeed Mahameed Subject: [net-next v5 04/15] devlink: Support add and delete devlink port Date: Tue, 15 Dec 2020 01:03:47 -0800 Message-Id: <20201215090358.240365-5-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201215090358.240365-1-saeed@kernel.org> References: <20201215090358.240365-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Parav Pandit Extended devlink interface for the user to add and delete port. Extend devlink to connect user requests to driver to add/delete such port in the device. When driver routines are invoked, devlink instance lock is not held. This enables driver to perform several devlink objects registration, unregistration such as (port, health reporter, resource etc) by using existing devlink APIs. This also helps to uniformly use the code for port unregistration during driver unload and during port deletion initiated by user. Examples of add, show and delete commands: $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev $ devlink port show pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical port 0 splittable false $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88 $ devlink port show pci/0000:06:00.0/32768 pci/0000:06:00.0/32768: type eth netdev eth0 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false function: hw_addr 00:00:00:00:88:88 state inactive opstate detached $ udevadm test-builtin net_id /sys/class/net/eth0 Load module index Parsed configuration file /usr/lib/systemd/network/99-default.link Created link configuration context. Using default interface naming scheme 'v245'. ID_NET_NAMING_SCHEME=v245 ID_NET_NAME_PATH=enp6s0f0npf0sf88 ID_NET_NAME_SLOT=ens2f0npf0sf88 Unload module index Unloaded link configuration context. Signed-off-by: Parav Pandit Reviewed-by: Jiri Pirko Reviewed-by: Vu Pham Signed-off-by: Saeed Mahameed --- include/net/devlink.h | 39 ++++++++++++++++++++++++ net/core/devlink.c | 71 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 110 insertions(+) diff --git a/include/net/devlink.h b/include/net/devlink.h index 5bd43f0a79a8..f8cff3e402da 100644 --- a/include/net/devlink.h +++ b/include/net/devlink.h @@ -153,6 +153,17 @@ struct devlink_port { struct mutex reporters_lock; /* Protects reporter_list */ }; +struct devlink_port_new_attrs { + enum devlink_port_flavour flavour; + unsigned int port_index; + u32 controller; + u32 sfnum; + u16 pfnum; + u8 port_index_valid:1, + controller_valid:1, + sfnum_valid:1; +}; + struct devlink_sb_pool_info { enum devlink_sb_pool_type pool_type; u32 size; @@ -1363,6 +1374,34 @@ struct devlink_ops { int (*port_function_hw_addr_set)(struct devlink *devlink, struct devlink_port *port, const u8 *hw_addr, int hw_addr_len, struct netlink_ext_ack *extack); + /** + * @port_new: Port add function. + * + * Should be used by device driver to let caller add new port of a + * specified flavour with optional attributes. + * Driver should return -EOPNOTSUPP if it doesn't support port addition + * of a specified flavour or specified attributes. Driver should set + * extack error message in case of fail to add the port. Devlink core + * does not hold a devlink instance lock when this callback is invoked. + * Driver must ensures synchronization when adding or deleting a port. + * Driver must register a port with devlink core. + */ + int (*port_new)(struct devlink *devlink, + const struct devlink_port_new_attrs *attrs, + struct netlink_ext_ack *extack); + /** + * @port_del: Port delete function. + * + * Should be used by device driver to let caller delete port which was + * previously created using port_new() callback. + * Driver should return -EOPNOTSUPP if it doesn't support port deletion. + * Driver should set extack error message in case of fail to delete the + * port. Devlink core does not hold a devlink instance lock when this + * callback is invoked. Driver must ensures synchronization when adding + * or deleting a port. Driver must register a port with devlink core. + */ + int (*port_del)(struct devlink *devlink, unsigned int port_index, + struct netlink_ext_ack *extack); }; static inline void *devlink_priv(struct devlink *devlink) diff --git a/net/core/devlink.c b/net/core/devlink.c index 08eac247f200..11043707f63f 100644 --- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -1146,6 +1146,61 @@ static int devlink_nl_cmd_port_unsplit_doit(struct sk_buff *skb, return devlink_port_unsplit(devlink, port_index, info->extack); } +static int devlink_nl_cmd_port_new_doit(struct sk_buff *skb, + struct genl_info *info) +{ + struct netlink_ext_ack *extack = info->extack; + struct devlink_port_new_attrs new_attrs = {}; + struct devlink *devlink = info->user_ptr[0]; + + if (!info->attrs[DEVLINK_ATTR_PORT_FLAVOUR] || + !info->attrs[DEVLINK_ATTR_PORT_PCI_PF_NUMBER]) { + NL_SET_ERR_MSG_MOD(extack, "Port flavour or PCI PF are not specified"); + return -EINVAL; + } + new_attrs.flavour = nla_get_u16(info->attrs[DEVLINK_ATTR_PORT_FLAVOUR]); + new_attrs.pfnum = + nla_get_u16(info->attrs[DEVLINK_ATTR_PORT_PCI_PF_NUMBER]); + + if (info->attrs[DEVLINK_ATTR_PORT_INDEX]) { + new_attrs.port_index = + nla_get_u32(info->attrs[DEVLINK_ATTR_PORT_INDEX]); + new_attrs.port_index_valid = true; + } + if (info->attrs[DEVLINK_ATTR_PORT_CONTROLLER_NUMBER]) { + new_attrs.controller = + nla_get_u16(info->attrs[DEVLINK_ATTR_PORT_CONTROLLER_NUMBER]); + new_attrs.controller_valid = true; + } + if (info->attrs[DEVLINK_ATTR_PORT_PCI_SF_NUMBER]) { + new_attrs.sfnum = nla_get_u32(info->attrs[DEVLINK_ATTR_PORT_PCI_SF_NUMBER]); + new_attrs.sfnum_valid = true; + } + + if (!devlink->ops->port_new) + return -EOPNOTSUPP; + + return devlink->ops->port_new(devlink, &new_attrs, extack); +} + +static int devlink_nl_cmd_port_del_doit(struct sk_buff *skb, + struct genl_info *info) +{ + struct netlink_ext_ack *extack = info->extack; + struct devlink *devlink = info->user_ptr[0]; + unsigned int port_index; + + if (!info->attrs[DEVLINK_ATTR_PORT_INDEX]) { + NL_SET_ERR_MSG_MOD(extack, "Port index is not specified"); + return -EINVAL; + } + port_index = nla_get_u32(info->attrs[DEVLINK_ATTR_PORT_INDEX]); + + if (!devlink->ops->port_del) + return -EOPNOTSUPP; + return devlink->ops->port_del(devlink, port_index, extack); +} + static int devlink_nl_sb_fill(struct sk_buff *msg, struct devlink *devlink, struct devlink_sb *devlink_sb, enum devlink_command cmd, u32 portid, @@ -7604,6 +7659,10 @@ static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + 1] = { [DEVLINK_ATTR_RELOAD_ACTION] = NLA_POLICY_RANGE(NLA_U8, DEVLINK_RELOAD_ACTION_DRIVER_REINIT, DEVLINK_RELOAD_ACTION_MAX), [DEVLINK_ATTR_RELOAD_LIMITS] = NLA_POLICY_BITFIELD32(DEVLINK_RELOAD_LIMITS_VALID_MASK), + [DEVLINK_ATTR_PORT_FLAVOUR] = { .type = NLA_U16 }, + [DEVLINK_ATTR_PORT_PCI_PF_NUMBER] = { .type = NLA_U16 }, + [DEVLINK_ATTR_PORT_PCI_SF_NUMBER] = { .type = NLA_U32 }, + [DEVLINK_ATTR_PORT_CONTROLLER_NUMBER] = { .type = NLA_U32 }, }; static const struct genl_small_ops devlink_nl_ops[] = { @@ -7643,6 +7702,18 @@ static const struct genl_small_ops devlink_nl_ops[] = { .flags = GENL_ADMIN_PERM, .internal_flags = DEVLINK_NL_FLAG_NO_LOCK, }, + { + .cmd = DEVLINK_CMD_PORT_NEW, + .doit = devlink_nl_cmd_port_new_doit, + .flags = GENL_ADMIN_PERM, + .internal_flags = DEVLINK_NL_FLAG_NO_LOCK, + }, + { + .cmd = DEVLINK_CMD_PORT_DEL, + .doit = devlink_nl_cmd_port_del_doit, + .flags = GENL_ADMIN_PERM, + .internal_flags = DEVLINK_NL_FLAG_NO_LOCK, + }, { .cmd = DEVLINK_CMD_SB_GET, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, From patchwork Tue Dec 15 09:03:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 344885 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 293A5C4361B for ; Tue, 15 Dec 2020 09:08:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C97B422225 for ; Tue, 15 Dec 2020 09:08:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728137AbgLOJIW (ORCPT ); Tue, 15 Dec 2020 04:08:22 -0500 Received: from mail.kernel.org ([198.145.29.99]:36400 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727845AbgLOJFb (ORCPT ); Tue, 15 Dec 2020 04:05:31 -0500 From: Saeed Mahameed Authentication-Results: mail.kernel.org; dkim=permerror (bad message/signature format) To: "David S. Miller" , Jakub Kicinski , Jason Gunthorpe Cc: Leon Romanovsky , netdev@vger.kernel.org, linux-rdma@vger.kernel.org, David Ahern , Jacob Keller , Sridhar Samudrala , david.m.ertman@intel.com, dan.j.williams@intel.com, kiran.patil@intel.com, gregkh@linuxfoundation.org, Parav Pandit , Jiri Pirko , Vu Pham , Saeed Mahameed Subject: [net-next v5 05/15] devlink: Support get and set state of port function Date: Tue, 15 Dec 2020 01:03:48 -0800 Message-Id: <20201215090358.240365-6-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201215090358.240365-1-saeed@kernel.org> References: <20201215090358.240365-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Parav Pandit devlink port function can be in active or inactive state. Allow users to get and set port function's state. When the port function it activated, its operational state may change after a while when the device is created and driver binds to it. Similarly on deactivation flow. To clearly describe the state of the port function and its device's operational state in the host system, define state and opstate attributes. Example of a PCI SF port which supports a port function: Create a device with ID=10 and one physical port. $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev $ devlink port show pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical port 0 splittable false $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88 $ devlink port show pci/0000:06:00.0/32768 pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false function: hw_addr 00:00:00:00:88:88 state inactive opstate detached $ devlink port function set pci/0000:06:00.0/32768 hw_addr 00:00:00:00:88:88 state active $ devlink port show pci/0000:06:00.0/32768 -jp { "port": { "pci/0000:06:00.0/32768": { "type": "eth", "netdev": "ens2f0npf0sf88", "flavour": "pcisf", "controller": 0, "pfnum": 0, "sfnum": 88, "external": false, "splittable": false, "function": { "hw_addr": "00:00:00:00:88:88", "state": "active", "opstate": "attached" } } } } Signed-off-by: Parav Pandit Reviewed-by: Jiri Pirko Reviewed-by: Vu Pham Signed-off-by: Saeed Mahameed --- include/net/devlink.h | 23 +++++++++ include/uapi/linux/devlink.h | 21 +++++++++ net/core/devlink.c | 90 +++++++++++++++++++++++++++++++++++- 3 files changed, 133 insertions(+), 1 deletion(-) diff --git a/include/net/devlink.h b/include/net/devlink.h index f8cff3e402da..18a7e66b7982 100644 --- a/include/net/devlink.h +++ b/include/net/devlink.h @@ -1374,6 +1374,29 @@ struct devlink_ops { int (*port_function_hw_addr_set)(struct devlink *devlink, struct devlink_port *port, const u8 *hw_addr, int hw_addr_len, struct netlink_ext_ack *extack); + /** + * @port_function_state_get: Port function's state get function. + * + * Should be used by device drivers to report the state of a function + * managed by the devlink port. Driver should return -EOPNOTSUPP if it + * doesn't support port function handling for a particular port. + */ + int (*port_function_state_get)(struct devlink *devlink, + struct devlink_port *port, + enum devlink_port_function_state *state, + enum devlink_port_function_opstate *opstate, + struct netlink_ext_ack *extack); + /** + * @port_function_state_set: Port function's state set function. + * + * Should be used by device drivers to set the state of a function + * managed by the devlink port. Driver should return -EOPNOTSUPP if it + * doesn't support port function handling for a particular port. + */ + int (*port_function_state_set)(struct devlink *devlink, + struct devlink_port *port, + enum devlink_port_function_state state, + struct netlink_ext_ack *extack); /** * @port_new: Port add function. * diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h index 6fe00f10eb3f..beeb30bb6b20 100644 --- a/include/uapi/linux/devlink.h +++ b/include/uapi/linux/devlink.h @@ -583,9 +583,30 @@ enum devlink_resource_unit { enum devlink_port_function_attr { DEVLINK_PORT_FUNCTION_ATTR_UNSPEC, DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR, /* binary */ + DEVLINK_PORT_FUNCTION_ATTR_STATE, /* u8 */ + DEVLINK_PORT_FUNCTION_ATTR_OPSTATE, /* u8 */ __DEVLINK_PORT_FUNCTION_ATTR_MAX, DEVLINK_PORT_FUNCTION_ATTR_MAX = __DEVLINK_PORT_FUNCTION_ATTR_MAX - 1 }; +enum devlink_port_function_state { + DEVLINK_PORT_FUNCTION_STATE_INACTIVE, + DEVLINK_PORT_FUNCTION_STATE_ACTIVE, +}; + +/** + * enum devlink_port_function_opstate - indicates operational state of port function + * @DEVLINK_PORT_FUNCTION_OPSTATE_ATTACHED: Driver is attached to the function of port, for + * gracefufl tear down of the function, after + * inactivation of the port function, user should wait + * for operational state to turn DETACHED. + * @DEVLINK_PORT_FUNCTION_OPSTATE_DETACHED: Driver is detached from the function of port; it is + * safe to delete the port. + */ +enum devlink_port_function_opstate { + DEVLINK_PORT_FUNCTION_OPSTATE_DETACHED, + DEVLINK_PORT_FUNCTION_OPSTATE_ATTACHED, +}; + #endif /* _UAPI_LINUX_DEVLINK_H_ */ diff --git a/net/core/devlink.c b/net/core/devlink.c index 11043707f63f..b8acb8842aa1 100644 --- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -87,6 +87,9 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(devlink_trap_report); static const struct nla_policy devlink_function_nl_policy[DEVLINK_PORT_FUNCTION_ATTR_MAX + 1] = { [DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR] = { .type = NLA_BINARY }, + [DEVLINK_PORT_FUNCTION_ATTR_STATE] = + NLA_POLICY_RANGE(NLA_U8, DEVLINK_PORT_FUNCTION_STATE_INACTIVE, + DEVLINK_PORT_FUNCTION_STATE_ACTIVE), }; static LIST_HEAD(devlink_list); @@ -746,6 +749,57 @@ devlink_port_function_hw_addr_fill(struct devlink *devlink, const struct devlink return 0; } +static bool +devlink_port_function_state_valid(enum devlink_port_function_state state) +{ + return state == DEVLINK_PORT_FUNCTION_STATE_INACTIVE || + state == DEVLINK_PORT_FUNCTION_STATE_ACTIVE; +} + +static bool +devlink_port_function_opstate_valid(enum devlink_port_function_opstate state) +{ + return state == DEVLINK_PORT_FUNCTION_OPSTATE_DETACHED || + state == DEVLINK_PORT_FUNCTION_OPSTATE_ATTACHED; +} + +static int +devlink_port_function_state_fill(struct devlink *devlink, + const struct devlink_ops *ops, + struct devlink_port *port, struct sk_buff *msg, + struct netlink_ext_ack *extack, + bool *msg_updated) +{ + enum devlink_port_function_opstate opstate; + enum devlink_port_function_state state; + int err; + + if (!ops->port_function_state_get) + return 0; + + err = ops->port_function_state_get(devlink, port, &state, &opstate, extack); + if (err) { + if (err == -EOPNOTSUPP) + return 0; + return err; + } + if (!devlink_port_function_state_valid(state)) { + WARN_ON_ONCE(1); + NL_SET_ERR_MSG_MOD(extack, "Invalid state value read from driver"); + return -EINVAL; + } + if (!devlink_port_function_opstate_valid(opstate)) { + WARN_ON_ONCE(1); + NL_SET_ERR_MSG_MOD(extack, "Invalid operational state value read from driver"); + return -EINVAL; + } + if (nla_put_u8(msg, DEVLINK_PORT_FUNCTION_ATTR_STATE, state) || + nla_put_u8(msg, DEVLINK_PORT_FUNCTION_ATTR_OPSTATE, opstate)) + return -EMSGSIZE; + *msg_updated = true; + return 0; +} + static int devlink_nl_port_function_attrs_put(struct sk_buff *msg, struct devlink_port *port, struct netlink_ext_ack *extack) @@ -762,6 +816,13 @@ devlink_nl_port_function_attrs_put(struct sk_buff *msg, struct devlink_port *por ops = devlink->ops; err = devlink_port_function_hw_addr_fill(devlink, ops, port, msg, extack, &msg_updated); + if (err) + goto out; + err = devlink_port_function_state_fill(devlink, ops, port, msg, extack, + &msg_updated); + if (err) + goto out; +out: if (err || !msg_updated) nla_nest_cancel(msg, function_attr); else @@ -1027,6 +1088,22 @@ devlink_port_function_hw_addr_set(struct devlink *devlink, struct devlink_port * return ops->port_function_hw_addr_set(devlink, port, hw_addr, hw_addr_len, extack); } +static int +devlink_port_function_state_set(struct devlink *devlink, struct devlink_port *port, + const struct nlattr *attr, struct netlink_ext_ack *extack) +{ + enum devlink_port_function_state state; + const struct devlink_ops *ops; + + state = nla_get_u8(attr); + ops = devlink->ops; + if (!ops->port_function_state_set) { + NL_SET_ERR_MSG_MOD(extack, "Port function does not support state setting"); + return -EOPNOTSUPP; + } + return ops->port_function_state_set(devlink, port, state, extack); +} + static int devlink_port_function_set(struct devlink *devlink, struct devlink_port *port, const struct nlattr *attr, struct netlink_ext_ack *extack) @@ -1042,8 +1119,19 @@ devlink_port_function_set(struct devlink *devlink, struct devlink_port *port, } attr = tb[DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR]; - if (attr) + if (attr) { err = devlink_port_function_hw_addr_set(devlink, port, attr, extack); + if (err) + return err; + } + /* Keep this as the last function attribute set, so that when + * multiple port function attributes are set along with state, + * Those can be applied first before activating the state. + */ + attr = tb[DEVLINK_PORT_FUNCTION_ATTR_STATE]; + if (attr) + err = devlink_port_function_state_set(devlink, port, attr, + extack); if (!err) devlink_port_notify(port, DEVLINK_CMD_PORT_NEW); From patchwork Tue Dec 15 09:03:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 344889 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9BBD9C4361B for ; Tue, 15 Dec 2020 09:06:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4C23E22255 for ; Tue, 15 Dec 2020 09:06:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727886AbgLOJFx (ORCPT ); Tue, 15 Dec 2020 04:05:53 -0500 Received: from mail.kernel.org ([198.145.29.99]:36398 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727844AbgLOJFd (ORCPT ); Tue, 15 Dec 2020 04:05:33 -0500 From: Saeed Mahameed Authentication-Results: mail.kernel.org; dkim=permerror (bad message/signature format) To: "David S. Miller" , Jakub Kicinski , Jason Gunthorpe Cc: Leon Romanovsky , netdev@vger.kernel.org, linux-rdma@vger.kernel.org, David Ahern , Jacob Keller , Sridhar Samudrala , david.m.ertman@intel.com, dan.j.williams@intel.com, kiran.patil@intel.com, gregkh@linuxfoundation.org, Parav Pandit , Vu Pham , Saeed Mahameed Subject: [net-next v5 06/15] net/mlx5: Introduce vhca state event notifier Date: Tue, 15 Dec 2020 01:03:49 -0800 Message-Id: <20201215090358.240365-7-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201215090358.240365-1-saeed@kernel.org> References: <20201215090358.240365-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Parav Pandit vhca state events indicates change in the state of the vhca that may occur due to a SF allocation, deallocation or enabling/disabling the SF HCA. Introduce vhca state event handler which will be used by SF devlink port manager and SF hardware id allocator in subsequent patches to act on the event. This enables single entity to subscribe, query and rearm the event for a function. Signed-off-by: Parav Pandit Reviewed-by: Vu Pham Signed-off-by: Saeed Mahameed --- .../net/ethernet/mellanox/mlx5/core/Kconfig | 9 + .../net/ethernet/mellanox/mlx5/core/Makefile | 4 + drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 4 + drivers/net/ethernet/mellanox/mlx5/core/eq.c | 3 + .../net/ethernet/mellanox/mlx5/core/events.c | 7 + .../net/ethernet/mellanox/mlx5/core/main.c | 16 ++ .../ethernet/mellanox/mlx5/core/mlx5_core.h | 2 + .../mlx5/core/sf/mlx5_ifc_vhca_event.h | 82 ++++++++ .../net/ethernet/mellanox/mlx5/core/sf/sf.h | 45 +++++ .../mellanox/mlx5/core/sf/vhca_event.c | 189 ++++++++++++++++++ .../mellanox/mlx5/core/sf/vhca_event.h | 57 ++++++ include/linux/mlx5/driver.h | 4 + 12 files changed, 422 insertions(+) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/mlx5_ifc_vhca_event.h create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig index 6e4d7bb7fea2..d6c48582e7a8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig +++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig @@ -203,3 +203,12 @@ config MLX5_SW_STEERING default y help Build support for software-managed steering in the NIC. + +config MLX5_SF + bool "Mellanox Technologies subfunction device support using auxiliary device" + depends on MLX5_CORE && MLX5_CORE_EN + default n + help + Build support for subfuction device in the NIC. A Mellanox subfunction + device can support RDMA, netdevice and vdpa device. + It is similar to a SRIOV VF but it doesn't require SRIOV support. diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index 77961643d5a9..292c02c4828c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -85,3 +85,7 @@ mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o steering/dr_ste.o steering/dr_send.o \ steering/dr_cmd.o steering/dr_fw.o \ steering/dr_action.o steering/fs_dr.o +# +# SF device +# +mlx5_core-$(CONFIG_MLX5_SF) += sf/vhca_event.o diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c index 50c7b9ee80c3..47dcc3ac2cf0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c @@ -464,6 +464,8 @@ static int mlx5_internal_err_ret_value(struct mlx5_core_dev *dev, u16 op, case MLX5_CMD_OP_ALLOC_MEMIC: case MLX5_CMD_OP_MODIFY_XRQ: case MLX5_CMD_OP_RELEASE_XRQ_ERROR: + case MLX5_CMD_OP_QUERY_VHCA_STATE: + case MLX5_CMD_OP_MODIFY_VHCA_STATE: *status = MLX5_DRIVER_STATUS_ABORTED; *synd = MLX5_DRIVER_SYND; return -EIO; @@ -657,6 +659,8 @@ const char *mlx5_command_str(int command) MLX5_COMMAND_STR_CASE(DESTROY_UMEM); MLX5_COMMAND_STR_CASE(RELEASE_XRQ_ERROR); MLX5_COMMAND_STR_CASE(MODIFY_XRQ); + MLX5_COMMAND_STR_CASE(QUERY_VHCA_STATE); + MLX5_COMMAND_STR_CASE(MODIFY_VHCA_STATE); default: return "unknown command opcode"; } } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c index fc0afa03d407..421febebc658 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c @@ -595,6 +595,9 @@ static void gather_async_events_mask(struct mlx5_core_dev *dev, u64 mask[4]) async_event_mask |= (1ull << MLX5_EVENT_TYPE_ESW_FUNCTIONS_CHANGED); + if (MLX5_CAP_GEN_MAX(dev, vhca_state)) + async_event_mask |= (1ull << MLX5_EVENT_TYPE_VHCA_STATE_CHANGE); + mask[0] = async_event_mask; if (MLX5_CAP_GEN(dev, event_cap)) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/events.c b/drivers/net/ethernet/mellanox/mlx5/core/events.c index 3ce17c3d7a00..5523d218e5fb 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/events.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/events.c @@ -110,6 +110,8 @@ static const char *eqe_type_str(u8 type) return "MLX5_EVENT_TYPE_CMD"; case MLX5_EVENT_TYPE_ESW_FUNCTIONS_CHANGED: return "MLX5_EVENT_TYPE_ESW_FUNCTIONS_CHANGED"; + case MLX5_EVENT_TYPE_VHCA_STATE_CHANGE: + return "MLX5_EVENT_TYPE_VHCA_STATE_CHANGE"; case MLX5_EVENT_TYPE_PAGE_REQUEST: return "MLX5_EVENT_TYPE_PAGE_REQUEST"; case MLX5_EVENT_TYPE_PAGE_FAULT: @@ -403,3 +405,8 @@ int mlx5_notifier_call_chain(struct mlx5_events *events, unsigned int event, voi { return atomic_notifier_call_chain(&events->nh, event, data); } + +void mlx5_events_work_enqueue(struct mlx5_core_dev *dev, struct work_struct *work) +{ + queue_work(dev->priv.events->wq, work); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index c08315b51fd3..6e67ad11c713 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -73,6 +73,7 @@ #include "ecpf.h" #include "lib/hv_vhca.h" #include "diag/rsc_dump.h" +#include "sf/vhca_event.h" MODULE_AUTHOR("Eli Cohen "); MODULE_DESCRIPTION("Mellanox 5th generation network adapters (ConnectX series) core driver"); @@ -567,6 +568,8 @@ static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx) if (MLX5_CAP_GEN_MAX(dev, mkey_by_name)) MLX5_SET(cmd_hca_cap, set_hca_cap, mkey_by_name, 1); + mlx5_vhca_state_cap_handle(dev, set_hca_cap); + return set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE); } @@ -884,6 +887,12 @@ static int mlx5_init_once(struct mlx5_core_dev *dev) goto err_eswitch_cleanup; } + err = mlx5_vhca_event_init(dev); + if (err) { + mlx5_core_err(dev, "Failed to init vhca event notifier %d\n", err); + goto err_fpga_cleanup; + } + dev->dm = mlx5_dm_create(dev); if (IS_ERR(dev->dm)) mlx5_core_warn(dev, "Failed to init device memory%d\n", err); @@ -894,6 +903,8 @@ static int mlx5_init_once(struct mlx5_core_dev *dev) return 0; +err_fpga_cleanup: + mlx5_fpga_cleanup(dev); err_eswitch_cleanup: mlx5_eswitch_cleanup(dev->priv.eswitch); err_sriov_cleanup: @@ -925,6 +936,7 @@ static void mlx5_cleanup_once(struct mlx5_core_dev *dev) mlx5_hv_vhca_destroy(dev->hv_vhca); mlx5_fw_tracer_destroy(dev->tracer); mlx5_dm_cleanup(dev); + mlx5_vhca_event_cleanup(dev); mlx5_fpga_cleanup(dev); mlx5_eswitch_cleanup(dev->priv.eswitch); mlx5_sriov_cleanup(dev); @@ -1129,6 +1141,8 @@ static int mlx5_load(struct mlx5_core_dev *dev) goto err_sriov; } + mlx5_vhca_event_start(dev); + err = mlx5_ec_init(dev); if (err) { mlx5_core_err(dev, "Failed to init embedded CPU\n"); @@ -1146,6 +1160,7 @@ static int mlx5_load(struct mlx5_core_dev *dev) err_sriov: mlx5_ec_cleanup(dev); err_ec: + mlx5_vhca_event_stop(dev); mlx5_cleanup_fs(dev); err_fs: mlx5_accel_tls_cleanup(dev); @@ -1173,6 +1188,7 @@ static void mlx5_unload(struct mlx5_core_dev *dev) { mlx5_sriov_detach(dev); mlx5_ec_cleanup(dev); + mlx5_vhca_event_stop(dev); mlx5_cleanup_fs(dev); mlx5_accel_ipsec_cleanup(dev); mlx5_accel_tls_cleanup(dev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h index 0a0302ce7144..a33b7496d748 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h @@ -259,4 +259,6 @@ void mlx5_set_nic_state(struct mlx5_core_dev *dev, u8 state); void mlx5_unload_one(struct mlx5_core_dev *dev, bool cleanup); int mlx5_load_one(struct mlx5_core_dev *dev, bool boot); + +void mlx5_events_work_enqueue(struct mlx5_core_dev *dev, struct work_struct *work); #endif /* __MLX5_CORE_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/mlx5_ifc_vhca_event.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/mlx5_ifc_vhca_event.h new file mode 100644 index 000000000000..1daf5a122ba3 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/mlx5_ifc_vhca_event.h @@ -0,0 +1,82 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#ifndef __MLX5_IFC_VHCA_EVENT_H__ +#define __MLX5_IFC_VHCA_EVENT_H__ + +enum mlx5_ifc_vhca_state { + MLX5_VHCA_STATE_INVALID = 0x0, + MLX5_VHCA_STATE_ALLOCATED = 0x1, + MLX5_VHCA_STATE_ACTIVE = 0x2, + MLX5_VHCA_STATE_IN_USE = 0x3, + MLX5_VHCA_STATE_TEARDOWN_REQUEST = 0x4, +}; + +struct mlx5_ifc_vhca_state_context_bits { + u8 arm_change_event[0x1]; + u8 reserved_at_1[0xb]; + u8 vhca_state[0x4]; + u8 reserved_at_10[0x10]; + + u8 sw_function_id[0x20]; + + u8 reserved_at_40[0x80]; +}; + +struct mlx5_ifc_query_vhca_state_out_bits { + u8 status[0x8]; + u8 reserved_at_8[0x18]; + + u8 syndrome[0x20]; + + u8 reserved_at_40[0x40]; + + struct mlx5_ifc_vhca_state_context_bits vhca_state_context; +}; + +struct mlx5_ifc_query_vhca_state_in_bits { + u8 opcode[0x10]; + u8 uid[0x10]; + + u8 reserved_at_20[0x10]; + u8 op_mod[0x10]; + + u8 embedded_cpu_function[0x1]; + u8 reserved_at_41[0xf]; + u8 function_id[0x10]; + + u8 reserved_at_60[0x20]; +}; + +struct mlx5_ifc_vhca_state_field_select_bits { + u8 reserved_at_0[0x1e]; + u8 sw_function_id[0x1]; + u8 arm_change_event[0x1]; +}; + +struct mlx5_ifc_modify_vhca_state_out_bits { + u8 status[0x8]; + u8 reserved_at_8[0x18]; + + u8 syndrome[0x20]; + + u8 reserved_at_40[0x40]; +}; + +struct mlx5_ifc_modify_vhca_state_in_bits { + u8 opcode[0x10]; + u8 uid[0x10]; + + u8 reserved_at_20[0x10]; + u8 op_mod[0x10]; + + u8 embedded_cpu_function[0x1]; + u8 reserved_at_41[0xf]; + u8 function_id[0x10]; + + struct mlx5_ifc_vhca_state_field_select_bits vhca_state_field_select; + + struct mlx5_ifc_vhca_state_context_bits vhca_state_context; +}; + +#endif diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h new file mode 100644 index 000000000000..623191679b49 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h @@ -0,0 +1,45 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#ifndef __MLX5_SF_H__ +#define __MLX5_SF_H__ + +#include + +static inline u16 mlx5_sf_start_function_id(const struct mlx5_core_dev *dev) +{ + return MLX5_CAP_GEN(dev, sf_base_id); +} + +#ifdef CONFIG_MLX5_SF + +static inline bool mlx5_sf_supported(const struct mlx5_core_dev *dev) +{ + return MLX5_CAP_GEN(dev, sf); +} + +static inline u16 mlx5_sf_max_functions(const struct mlx5_core_dev *dev) +{ + if (!mlx5_sf_supported(dev)) + return 0; + if (MLX5_CAP_GEN(dev, max_num_sf)) + return MLX5_CAP_GEN(dev, max_num_sf); + else + return 1 << MLX5_CAP_GEN(dev, log_max_sf); +} + +#else + +static inline bool mlx5_sf_supported(const struct mlx5_core_dev *dev) +{ + return false; +} + +static inline u16 mlx5_sf_max_functions(const struct mlx5_core_dev *dev) +{ + return 0; +} + +#endif + +#endif diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c new file mode 100644 index 000000000000..af2f2dd9db25 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c @@ -0,0 +1,189 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#include +#include "mlx5_ifc_vhca_event.h" +#include "mlx5_core.h" +#include "vhca_event.h" +#include "ecpf.h" + +struct mlx5_vhca_state_notifier { + struct mlx5_core_dev *dev; + struct mlx5_nb nb; + struct blocking_notifier_head n_head; +}; + +struct mlx5_vhca_event_work { + struct work_struct work; + struct mlx5_vhca_state_notifier *notifier; + struct mlx5_vhca_state_event event; +}; + +int mlx5_cmd_query_vhca_state(struct mlx5_core_dev *dev, u16 function_id, + bool ecpu, u32 *out, u32 outlen) +{ + u32 in[MLX5_ST_SZ_DW(query_vhca_state_in)] = {}; + + MLX5_SET(query_vhca_state_in, in, opcode, MLX5_CMD_OP_QUERY_VHCA_STATE); + MLX5_SET(query_vhca_state_in, in, function_id, function_id); + MLX5_SET(query_vhca_state_in, in, embedded_cpu_function, ecpu); + + return mlx5_cmd_exec(dev, in, sizeof(in), out, outlen); +} + +static int mlx5_cmd_modify_vhca_state(struct mlx5_core_dev *dev, u16 function_id, + bool ecpu, u32 *in, u32 inlen) +{ + u32 out[MLX5_ST_SZ_DW(modify_vhca_state_out)] = {}; + + MLX5_SET(modify_vhca_state_in, in, opcode, MLX5_CMD_OP_MODIFY_VHCA_STATE); + MLX5_SET(modify_vhca_state_in, in, function_id, function_id); + MLX5_SET(modify_vhca_state_in, in, embedded_cpu_function, ecpu); + + return mlx5_cmd_exec(dev, in, inlen, out, sizeof(out)); +} + +int mlx5_modify_vhca_sw_id(struct mlx5_core_dev *dev, u16 function_id, bool ecpu, u32 sw_fn_id) +{ + u32 out[MLX5_ST_SZ_DW(modify_vhca_state_out)] = {}; + u32 in[MLX5_ST_SZ_DW(modify_vhca_state_in)] = {}; + + MLX5_SET(modify_vhca_state_in, in, opcode, MLX5_CMD_OP_MODIFY_VHCA_STATE); + MLX5_SET(modify_vhca_state_in, in, function_id, function_id); + MLX5_SET(modify_vhca_state_in, in, embedded_cpu_function, ecpu); + MLX5_SET(modify_vhca_state_in, in, vhca_state_field_select.sw_function_id, 1); + MLX5_SET(modify_vhca_state_in, in, vhca_state_context.sw_function_id, sw_fn_id); + + return mlx5_cmd_exec_inout(dev, modify_vhca_state, in, out); +} + +int mlx5_vhca_event_arm(struct mlx5_core_dev *dev, u16 function_id, bool ecpu) +{ + u32 in[MLX5_ST_SZ_DW(modify_vhca_state_in)] = {}; + + MLX5_SET(modify_vhca_state_in, in, vhca_state_context.arm_change_event, 1); + MLX5_SET(modify_vhca_state_in, in, vhca_state_field_select.arm_change_event, 1); + + return mlx5_cmd_modify_vhca_state(dev, function_id, ecpu, in, sizeof(in)); +} + +static void +mlx5_vhca_event_notify(struct mlx5_core_dev *dev, struct mlx5_vhca_state_event *event) +{ + u32 out[MLX5_ST_SZ_DW(query_vhca_state_out)] = {}; + int err; + + err = mlx5_cmd_query_vhca_state(dev, event->function_id, event->ecpu, out, sizeof(out)); + if (err) + return; + + event->sw_function_id = MLX5_GET(query_vhca_state_out, out, + vhca_state_context.sw_function_id); + event->new_vhca_state = MLX5_GET(query_vhca_state_out, out, + vhca_state_context.vhca_state); + + mlx5_vhca_event_arm(dev, event->function_id, event->ecpu); + + blocking_notifier_call_chain(&dev->priv.vhca_state_notifier->n_head, 0, event); +} + +static void mlx5_vhca_state_work_handler(struct work_struct *_work) +{ + struct mlx5_vhca_event_work *work = container_of(_work, struct mlx5_vhca_event_work, work); + struct mlx5_vhca_state_notifier *notifier = work->notifier; + struct mlx5_core_dev *dev = notifier->dev; + + mlx5_vhca_event_notify(dev, &work->event); +} + +static int +mlx5_vhca_state_change_notifier(struct notifier_block *nb, unsigned long type, void *data) +{ + struct mlx5_vhca_state_notifier *notifier = + mlx5_nb_cof(nb, struct mlx5_vhca_state_notifier, nb); + struct mlx5_vhca_event_work *work; + struct mlx5_eqe *eqe = data; + + work = kzalloc(sizeof(*work), GFP_ATOMIC); + if (!work) + return NOTIFY_DONE; + INIT_WORK(&work->work, &mlx5_vhca_state_work_handler); + work->notifier = notifier; + work->event.function_id = be16_to_cpu(eqe->data.vhca_state.function_id); + work->event.ecpu = be16_to_cpu(eqe->data.vhca_state.ec_function); + mlx5_events_work_enqueue(notifier->dev, &work->work); + return NOTIFY_OK; +} + +void mlx5_vhca_state_cap_handle(struct mlx5_core_dev *dev, void *set_hca_cap) +{ + if (!mlx5_vhca_event_supported(dev)) + return; + + MLX5_SET(cmd_hca_cap, set_hca_cap, vhca_state, 1); + MLX5_SET(cmd_hca_cap, set_hca_cap, event_on_vhca_state_allocated, 1); + MLX5_SET(cmd_hca_cap, set_hca_cap, event_on_vhca_state_active, 1); + MLX5_SET(cmd_hca_cap, set_hca_cap, event_on_vhca_state_in_use, 1); + MLX5_SET(cmd_hca_cap, set_hca_cap, event_on_vhca_state_teardown_request, 1); +} + +int mlx5_vhca_event_init(struct mlx5_core_dev *dev) +{ + struct mlx5_vhca_state_notifier *notifier; + + if (!mlx5_vhca_event_supported(dev)) + return 0; + + notifier = kzalloc(sizeof(*notifier), GFP_KERNEL); + if (!notifier) + return -ENOMEM; + + dev->priv.vhca_state_notifier = notifier; + notifier->dev = dev; + BLOCKING_INIT_NOTIFIER_HEAD(¬ifier->n_head); + MLX5_NB_INIT(¬ifier->nb, mlx5_vhca_state_change_notifier, VHCA_STATE_CHANGE); + return 0; +} + +void mlx5_vhca_event_cleanup(struct mlx5_core_dev *dev) +{ + if (!mlx5_vhca_event_supported(dev)) + return; + + kfree(dev->priv.vhca_state_notifier); + dev->priv.vhca_state_notifier = NULL; +} + +void mlx5_vhca_event_start(struct mlx5_core_dev *dev) +{ + struct mlx5_vhca_state_notifier *notifier; + + if (!dev->priv.vhca_state_notifier) + return; + + notifier = dev->priv.vhca_state_notifier; + mlx5_eq_notifier_register(dev, ¬ifier->nb); +} + +void mlx5_vhca_event_stop(struct mlx5_core_dev *dev) +{ + struct mlx5_vhca_state_notifier *notifier; + + if (!dev->priv.vhca_state_notifier) + return; + + notifier = dev->priv.vhca_state_notifier; + mlx5_eq_notifier_unregister(dev, ¬ifier->nb); +} + +int mlx5_vhca_event_notifier_register(struct mlx5_core_dev *dev, struct notifier_block *nb) +{ + if (!dev->priv.vhca_state_notifier) + return -EOPNOTSUPP; + return blocking_notifier_chain_register(&dev->priv.vhca_state_notifier->n_head, nb); +} + +void mlx5_vhca_event_notifier_unregister(struct mlx5_core_dev *dev, struct notifier_block *nb) +{ + blocking_notifier_chain_unregister(&dev->priv.vhca_state_notifier->n_head, nb); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h new file mode 100644 index 000000000000..1fe1ec6f4d4b --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h @@ -0,0 +1,57 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#ifndef __MLX5_VHCA_EVENT_H__ +#define __MLX5_VHCA_EVENT_H__ + +#ifdef CONFIG_MLX5_SF + +struct mlx5_vhca_state_event { + u16 function_id; + u16 sw_function_id; + u8 new_vhca_state; + bool ecpu; +}; + +static inline bool mlx5_vhca_event_supported(const struct mlx5_core_dev *dev) +{ + return MLX5_CAP_GEN_MAX(dev, vhca_state); +} + +void mlx5_vhca_state_cap_handle(struct mlx5_core_dev *dev, void *set_hca_cap); +int mlx5_vhca_event_init(struct mlx5_core_dev *dev); +void mlx5_vhca_event_cleanup(struct mlx5_core_dev *dev); +void mlx5_vhca_event_start(struct mlx5_core_dev *dev); +void mlx5_vhca_event_stop(struct mlx5_core_dev *dev); +int mlx5_vhca_event_notifier_register(struct mlx5_core_dev *dev, struct notifier_block *nb); +void mlx5_vhca_event_notifier_unregister(struct mlx5_core_dev *dev, struct notifier_block *nb); +int mlx5_modify_vhca_sw_id(struct mlx5_core_dev *dev, u16 function_id, bool ecpu, u32 sw_fn_id); +int mlx5_vhca_event_arm(struct mlx5_core_dev *dev, u16 function_id, bool ecpu); +int mlx5_cmd_query_vhca_state(struct mlx5_core_dev *dev, u16 function_id, + bool ecpu, u32 *out, u32 outlen); +#else + +static inline void mlx5_vhca_state_cap_handle(struct mlx5_core_dev *dev, void *set_hca_cap) +{ +} + +static inline int mlx5_vhca_event_init(struct mlx5_core_dev *dev) +{ + return 0; +} + +static inline void mlx5_vhca_event_cleanup(struct mlx5_core_dev *dev) +{ +} + +static inline void mlx5_vhca_event_start(struct mlx5_core_dev *dev) +{ +} + +static inline void mlx5_vhca_event_stop(struct mlx5_core_dev *dev) +{ +} + +#endif + +#endif diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h index f93bfe7473aa..ffba0786051e 100644 --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -507,6 +507,7 @@ struct mlx5_devcom; struct mlx5_fw_reset; struct mlx5_eq_table; struct mlx5_irq_table; +struct mlx5_vhca_state_notifier; struct mlx5_rate_limit { u32 rate; @@ -603,6 +604,9 @@ struct mlx5_priv { struct mlx5_bfreg_data bfregs; struct mlx5_uars_page *uar; +#ifdef CONFIG_MLX5_SF + struct mlx5_vhca_state_notifier *vhca_state_notifier; +#endif }; enum mlx5_device_state { From patchwork Tue Dec 15 09:03:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 344327 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A72ADC2BB48 for ; Tue, 15 Dec 2020 09:06:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7476322225 for ; Tue, 15 Dec 2020 09:06:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727057AbgLOJFq (ORCPT ); Tue, 15 Dec 2020 04:05:46 -0500 Received: from mail.kernel.org ([198.145.29.99]:36402 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727836AbgLOJFb (ORCPT ); Tue, 15 Dec 2020 04:05:31 -0500 From: Saeed Mahameed Authentication-Results: mail.kernel.org; dkim=permerror (bad message/signature format) To: "David S. Miller" , Jakub Kicinski , Jason Gunthorpe Cc: Leon Romanovsky , netdev@vger.kernel.org, linux-rdma@vger.kernel.org, David Ahern , Jacob Keller , Sridhar Samudrala , david.m.ertman@intel.com, dan.j.williams@intel.com, kiran.patil@intel.com, gregkh@linuxfoundation.org, Parav Pandit , Vu Pham , Saeed Mahameed Subject: [net-next v5 07/15] net/mlx5: SF, Add auxiliary device support Date: Tue, 15 Dec 2020 01:03:50 -0800 Message-Id: <20201215090358.240365-8-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201215090358.240365-1-saeed@kernel.org> References: <20201215090358.240365-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Parav Pandit Introduce API to add and delete an auxiliary device for an SF. Each SF has its own dedicated window in the PCI BAR 2. SF device is similar to PCI PF and VF that supports multiple class of devices such as net, rdma and vdpa. SF device will be added or removed in subsequent patch during SF devlink port function state change command. A subfunction device exposes user supplied subfunction number which will be further used by systemd/udev to have deterministic name for its netdevice and rdma device. An mlx5 subfunction auxiliary device example: $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev $ devlink port show pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical port 0 splittable false $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88 $ devlink port show ens2f0npf0sf88 pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false function: hw_addr 00:00:00:00:88:88 state inactive opstate detached $ devlink port function set ens2f0npf0sf88 hw_addr 00:00:00:00:88:88 state active On activation, $ ls -l /sys/bus/auxiliary/devices/ mlx5_core.sf.4 -> ../../../devices/pci0000:00/0000:00:03.0/0000:06:00.0/mlx5_core.sf.4 $ cat /sys/bus/auxiliary/devices/mlx5_core.sf.4/sfnum 88 Signed-off-by: Parav Pandit Reviewed-by: Vu Pham Signed-off-by: Saeed Mahameed --- .../device_drivers/ethernet/mellanox/mlx5.rst | 5 + .../net/ethernet/mellanox/mlx5/core/Makefile | 2 +- .../net/ethernet/mellanox/mlx5/core/main.c | 4 + .../ethernet/mellanox/mlx5/core/sf/dev/dev.c | 261 ++++++++++++++++++ .../ethernet/mellanox/mlx5/core/sf/dev/dev.h | 35 +++ include/linux/mlx5/driver.h | 2 + 6 files changed, 308 insertions(+), 1 deletion(-) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst index e9b65035cd47..a5eb22793bb9 100644 --- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst +++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst @@ -97,6 +97,11 @@ Enabling the driver and kconfig options | Provides low-level InfiniBand/RDMA and `RoCE `_ support. +**CONFIG_MLX5_SF=(y/n)** + +| Build support for subfunction. +| Subfunctons are more light weight than PCI SRIOV VFs. Choosing this option +| will enable support for creating subfunction devices. **External options** ( Choose if the corresponding mlx5 feature is required ) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index 292c02c4828c..2aefbca404c3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -88,4 +88,4 @@ mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o # # SF device # -mlx5_core-$(CONFIG_MLX5_SF) += sf/vhca_event.o +mlx5_core-$(CONFIG_MLX5_SF) += sf/vhca_event.o sf/dev/dev.o diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index 6e67ad11c713..292c30e71d7f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -74,6 +74,7 @@ #include "lib/hv_vhca.h" #include "diag/rsc_dump.h" #include "sf/vhca_event.h" +#include "sf/dev/dev.h" MODULE_AUTHOR("Eli Cohen "); MODULE_DESCRIPTION("Mellanox 5th generation network adapters (ConnectX series) core driver"); @@ -1155,6 +1156,8 @@ static int mlx5_load(struct mlx5_core_dev *dev) goto err_sriov; } + mlx5_sf_dev_table_create(dev); + return 0; err_sriov: @@ -1186,6 +1189,7 @@ static int mlx5_load(struct mlx5_core_dev *dev) static void mlx5_unload(struct mlx5_core_dev *dev) { + mlx5_sf_dev_table_destroy(dev); mlx5_sriov_detach(dev); mlx5_ec_cleanup(dev); mlx5_vhca_event_stop(dev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c new file mode 100644 index 000000000000..6562bf63afaa --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c @@ -0,0 +1,261 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#include +#include +#include "mlx5_core.h" +#include "dev.h" +#include "sf/vhca_event.h" +#include "sf/sf.h" +#include "sf/mlx5_ifc_vhca_event.h" +#include "ecpf.h" + +struct mlx5_sf_dev_table { + struct xarray devices; + unsigned int max_sfs; + phys_addr_t base_address; + u64 sf_bar_length; + struct notifier_block nb; + struct mlx5_core_dev *dev; +}; + +static bool mlx5_sf_dev_supported(const struct mlx5_core_dev *dev) +{ + return MLX5_CAP_GEN(dev, sf) && mlx5_vhca_event_supported(dev); +} + +static ssize_t sfnum_show(struct device *dev, struct device_attribute *attr, char *buf) +{ + struct auxiliary_device *adev = container_of(dev, struct auxiliary_device, dev); + struct mlx5_sf_dev *sf_dev = container_of(adev, struct mlx5_sf_dev, adev); + + return scnprintf(buf, PAGE_SIZE, "%u\n", sf_dev->sfnum); +} +static DEVICE_ATTR_RO(sfnum); + +static struct attribute *sf_device_attrs[] = { + &dev_attr_sfnum.attr, + NULL, +}; + +static const struct attribute_group sf_attr_group = { + .attrs = sf_device_attrs, +}; + +static const struct attribute_group *sf_attr_groups[2] = { + &sf_attr_group, + NULL +}; + +static void mlx5_sf_dev_release(struct device *device) +{ + struct auxiliary_device *adev = container_of(device, struct auxiliary_device, dev); + struct mlx5_sf_dev *sf_dev = container_of(adev, struct mlx5_sf_dev, adev); + + mlx5_adev_idx_free(adev->id); + kfree(sf_dev); +} + +static void mlx5_sf_dev_remove(struct mlx5_sf_dev *sf_dev) +{ + auxiliary_device_delete(&sf_dev->adev); + auxiliary_device_uninit(&sf_dev->adev); +} + +static void mlx5_sf_dev_add(struct mlx5_core_dev *dev, u16 sf_index, u32 sfnum) +{ + struct mlx5_sf_dev_table *table = dev->priv.sf_dev_table; + struct mlx5_sf_dev *sf_dev; + struct pci_dev *pdev; + int err; + int id; + + id = mlx5_adev_idx_alloc(); + if (id < 0) { + err = id; + goto add_err; + } + + sf_dev = kzalloc(sizeof(*sf_dev), GFP_KERNEL); + if (!sf_dev) { + mlx5_adev_idx_free(id); + err = -ENOMEM; + goto add_err; + } + pdev = dev->pdev; + sf_dev->adev.id = id; + sf_dev->adev.name = MLX5_SF_DEV_ID_NAME; + sf_dev->adev.dev.release = mlx5_sf_dev_release; + sf_dev->adev.dev.parent = &pdev->dev; + sf_dev->adev.dev.groups = sf_attr_groups; + sf_dev->sfnum = sfnum; + sf_dev->parent_mdev = dev; + + if (!table->max_sfs) { + mlx5_adev_idx_free(id); + kfree(sf_dev); + err = -EOPNOTSUPP; + goto add_err; + } + sf_dev->bar_base_addr = table->base_address + (sf_index * table->sf_bar_length); + + err = auxiliary_device_init(&sf_dev->adev); + if (err) { + mlx5_adev_idx_free(id); + kfree(sf_dev); + goto add_err; + } + + err = auxiliary_device_add(&sf_dev->adev); + if (err) { + put_device(&sf_dev->adev.dev); + goto add_err; + } + + err = xa_insert(&table->devices, sf_index, sf_dev, GFP_KERNEL); + if (err) + goto xa_err; + return; + +xa_err: + mlx5_sf_dev_remove(sf_dev); +add_err: + mlx5_core_err(dev, "SF DEV: fail device add for index=%d sfnum=%d err=%d\n", + sf_index, sfnum, err); +} + +static void mlx5_sf_dev_del(struct mlx5_core_dev *dev, struct mlx5_sf_dev *sf_dev, u16 sf_index) +{ + struct mlx5_sf_dev_table *table = dev->priv.sf_dev_table; + + xa_erase(&table->devices, sf_index); + mlx5_sf_dev_remove(sf_dev); +} + +static int +mlx5_sf_dev_state_change_handler(struct notifier_block *nb, unsigned long event_code, void *data) +{ + struct mlx5_sf_dev_table *table = container_of(nb, struct mlx5_sf_dev_table, nb); + const struct mlx5_vhca_state_event *event = data; + struct mlx5_sf_dev *sf_dev; + u16 sf_index; + + sf_index = event->function_id - MLX5_CAP_GEN(table->dev, sf_base_id); + sf_dev = xa_load(&table->devices, sf_index); + switch (event->new_vhca_state) { + case MLX5_VHCA_STATE_TEARDOWN_REQUEST: + if (sf_dev) + mlx5_sf_dev_del(table->dev, sf_dev, sf_index); + else + mlx5_core_err(table->dev, + "SF DEV: teardown state for invalid dev index=%d fn_id=0x%x\n", + sf_index, event->sw_function_id); + break; + case MLX5_VHCA_STATE_ACTIVE: + if (!sf_dev) + mlx5_sf_dev_add(table->dev, sf_index, event->sw_function_id); + break; + default: + break; + } + return 0; +} + +static int mlx5_sf_dev_vhca_arm_all(struct mlx5_sf_dev_table *table) +{ + struct mlx5_core_dev *dev = table->dev; + u16 max_functions; + u16 function_id; + int err = 0; + bool ecpu; + int i; + + max_functions = mlx5_sf_max_functions(dev); + function_id = MLX5_CAP_GEN(dev, sf_base_id); + ecpu = mlx5_read_embedded_cpu(dev); + /* Arm the vhca context as the vhca event notifier */ + for (i = 0; i < max_functions; i++) { + err = mlx5_vhca_event_arm(dev, function_id, ecpu); + if (err) + return err; + + function_id++; + } + return 0; +} + +void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_dev_table *table; + unsigned int max_sfs; + int err; + + if (!mlx5_sf_dev_supported(dev) || !mlx5_vhca_event_supported(dev)) + return; + + table = kzalloc(sizeof(*table), GFP_KERNEL); + if (!table) { + err = -ENOMEM; + goto table_err; + } + + table->nb.notifier_call = mlx5_sf_dev_state_change_handler; + table->dev = dev; + if (MLX5_CAP_GEN(dev, max_num_sf)) + max_sfs = MLX5_CAP_GEN(dev, max_num_sf); + else + max_sfs = 1 << MLX5_CAP_GEN(dev, log_max_sf); + table->sf_bar_length = 1 << (MLX5_CAP_GEN(dev, log_min_sf_size) + 12); + table->base_address = pci_resource_start(dev->pdev, 2); + table->max_sfs = max_sfs; + xa_init(&table->devices); + dev->priv.sf_dev_table = table; + + err = mlx5_vhca_event_notifier_register(dev, &table->nb); + if (err) + goto vhca_err; + err = mlx5_sf_dev_vhca_arm_all(table); + if (err) + goto arm_err; + mlx5_core_dbg(dev, "SF DEV: max sf devices=%d\n", max_sfs); + return; + +arm_err: + mlx5_vhca_event_notifier_unregister(dev, &table->nb); +vhca_err: + table->max_sfs = 0; + kfree(table); + dev->priv.sf_dev_table = NULL; +table_err: + mlx5_core_err(dev, "SF DEV table create err = %d\n", err); +} + +static void mlx5_sf_dev_destroy_all(struct mlx5_sf_dev_table *table) +{ + struct mlx5_sf_dev *sf_dev; + unsigned long index; + + xa_for_each(&table->devices, index, sf_dev) { + xa_erase(&table->devices, index); + mlx5_sf_dev_remove(sf_dev); + } +} + +void mlx5_sf_dev_table_destroy(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_dev_table *table = dev->priv.sf_dev_table; + + if (!table) + return; + + mlx5_vhca_event_notifier_unregister(dev, &table->nb); + + /* Now that event handler is not running, it is safe to destroy + * the sf device without race. + */ + mlx5_sf_dev_destroy_all(table); + + WARN_ON(!xa_empty(&table->devices)); + kfree(table); + dev->priv.sf_dev_table = NULL; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h new file mode 100644 index 000000000000..a6fb7289ba2c --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h @@ -0,0 +1,35 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#ifndef __MLX5_SF_DEV_H__ +#define __MLX5_SF_DEV_H__ + +#ifdef CONFIG_MLX5_SF + +#include + +#define MLX5_SF_DEV_ID_NAME "sf" + +struct mlx5_sf_dev { + struct auxiliary_device adev; + struct mlx5_core_dev *parent_mdev; + phys_addr_t bar_base_addr; + u32 sfnum; +}; + +void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev); +void mlx5_sf_dev_table_destroy(struct mlx5_core_dev *dev); + +#else + +static inline void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev) +{ +} + +static inline void mlx5_sf_dev_table_destroy(struct mlx5_core_dev *dev) +{ +} + +#endif + +#endif diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h index ffba0786051e..08e5fbe97df0 100644 --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -508,6 +508,7 @@ struct mlx5_fw_reset; struct mlx5_eq_table; struct mlx5_irq_table; struct mlx5_vhca_state_notifier; +struct mlx5_sf_dev_table; struct mlx5_rate_limit { u32 rate; @@ -606,6 +607,7 @@ struct mlx5_priv { struct mlx5_uars_page *uar; #ifdef CONFIG_MLX5_SF struct mlx5_vhca_state_notifier *vhca_state_notifier; + struct mlx5_sf_dev_table *sf_dev_table; #endif }; From patchwork Tue Dec 15 09:03:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 344322 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 81DD6C2BB9A for ; Tue, 15 Dec 2020 09:08:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2ADD722255 for ; Tue, 15 Dec 2020 09:08:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728168AbgLOJIj (ORCPT ); Tue, 15 Dec 2020 04:08:39 -0500 Received: from mail.kernel.org ([198.145.29.99]:36404 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727843AbgLOJFb (ORCPT ); Tue, 15 Dec 2020 04:05:31 -0500 From: Saeed Mahameed Authentication-Results: mail.kernel.org; dkim=permerror (bad message/signature format) To: "David S. Miller" , Jakub Kicinski , Jason Gunthorpe Cc: Leon Romanovsky , netdev@vger.kernel.org, linux-rdma@vger.kernel.org, David Ahern , Jacob Keller , Sridhar Samudrala , david.m.ertman@intel.com, dan.j.williams@intel.com, kiran.patil@intel.com, gregkh@linuxfoundation.org, Parav Pandit , Vu Pham , Saeed Mahameed Subject: [net-next v5 08/15] net/mlx5: SF, Add auxiliary device driver Date: Tue, 15 Dec 2020 01:03:51 -0800 Message-Id: <20201215090358.240365-9-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201215090358.240365-1-saeed@kernel.org> References: <20201215090358.240365-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Parav Pandit Add auxiliary device driver for mlx5 subfunction auxiliary device. A mlx5 subfunction is similar to PCI PF and VF. For a subfunction an auxiliary device is created. As a result, when mlx5 SF auxiliary device binds to the driver, its netdev and rdma device are created, they appear as $ ls -l /sys/bus/auxiliary/devices/ mlx5_core.sf.4 -> ../../../devices/pci0000:00/0000:00:03.0/0000:06:00.0/mlx5_core.sf.4 $ ls -l /sys/class/net/eth1/device /sys/class/net/eth1/device -> ../../../mlx5_core.sf.4 $ cat /sys/bus/auxiliary/devices/mlx5_core.sf.4/sfnum 88 $ devlink dev show pci/0000:06:00.0 auxiliary/mlx5_core.sf.4 $ devlink port show auxiliary/mlx5_core.sf.4/1 auxiliary/mlx5_core.sf.4/1: type eth netdev p0sf88 flavour virtual port 0 splittable false $ rdma link show mlx5_0/1 link mlx5_0/1 state ACTIVE physical_state LINK_UP netdev p0sf88 $ rdma dev show 8: rocep6s0f1: node_type ca fw 16.29.0550 node_guid 248a:0703:00b3:d113 sys_image_guid 248a:0703:00b3:d112 13: mlx5_0: node_type ca fw 16.29.0550 node_guid 0000:00ff:fe00:8888 sys_image_guid 248a:0703:00b3:d112 In future, devlink device instance name will adapt to have sfnum annotation using either an alias or as devlink instance name described in RFC [1]. [1] https://lore.kernel.org/netdev/20200519092258.GF4655@nanopsycho/ Signed-off-by: Parav Pandit Reviewed-by: Vu Pham Signed-off-by: Saeed Mahameed --- .../net/ethernet/mellanox/mlx5/core/Makefile | 2 +- .../net/ethernet/mellanox/mlx5/core/devlink.c | 12 +++ drivers/net/ethernet/mellanox/mlx5/core/eq.c | 2 +- .../net/ethernet/mellanox/mlx5/core/main.c | 12 ++- .../ethernet/mellanox/mlx5/core/mlx5_core.h | 10 ++ .../net/ethernet/mellanox/mlx5/core/pci_irq.c | 20 ++++ .../ethernet/mellanox/mlx5/core/sf/dev/dev.c | 10 ++ .../ethernet/mellanox/mlx5/core/sf/dev/dev.h | 20 ++++ .../mellanox/mlx5/core/sf/dev/driver.c | 101 ++++++++++++++++++ include/linux/mlx5/driver.h | 4 +- 10 files changed, 187 insertions(+), 6 deletions(-) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index 2aefbca404c3..efa95d6dd112 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -88,4 +88,4 @@ mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o # # SF device # -mlx5_core-$(CONFIG_MLX5_SF) += sf/vhca_event.o sf/dev/dev.o +mlx5_core-$(CONFIG_MLX5_SF) += sf/vhca_event.o sf/dev/dev.o sf/dev/driver.o diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c index 3261d0dc1104..9afe918c5827 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c @@ -7,6 +7,7 @@ #include "fw_reset.h" #include "fs_core.h" #include "eswitch.h" +#include "sf/dev/dev.h" static int mlx5_devlink_flash_update(struct devlink *devlink, struct devlink_flash_update_params *params, @@ -127,6 +128,17 @@ static int mlx5_devlink_reload_down(struct devlink *devlink, bool netns_change, struct netlink_ext_ack *extack) { struct mlx5_core_dev *dev = devlink_priv(devlink); + bool sf_dev_allocated; + + sf_dev_allocated = mlx5_sf_dev_allocated(dev); + if (sf_dev_allocated) { + /* Reload results in deleting SF device which further results in + * unregistering devlink instance while holding devlink_mutext. + * Hence, do not support reload. + */ + NL_SET_ERR_MSG_MOD(extack, "reload is unsupported when SFs are allocated\n"); + return -EOPNOTSUPP; + } switch (action) { case DEVLINK_RELOAD_ACTION_DRIVER_REINIT: diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c index 421febebc658..174dfbc996c6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c @@ -467,7 +467,7 @@ int mlx5_eq_table_init(struct mlx5_core_dev *dev) for (i = 0; i < MLX5_EVENT_TYPE_MAX; i++) ATOMIC_INIT_NOTIFIER_HEAD(&eq_table->nh[i]); - eq_table->irq_table = dev->priv.irq_table; + eq_table->irq_table = mlx5_irq_table_get(dev); return 0; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index 292c30e71d7f..932a280a56a5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -84,7 +84,6 @@ unsigned int mlx5_core_debug_mask; module_param_named(debug_mask, mlx5_core_debug_mask, uint, 0644); MODULE_PARM_DESC(debug_mask, "debug mask: 1 = dump cmd data, 2 = dump cmd exec time, 3 = both. Default=0"); -#define MLX5_DEFAULT_PROF 2 static unsigned int prof_sel = MLX5_DEFAULT_PROF; module_param_named(prof_sel, prof_sel, uint, 0444); MODULE_PARM_DESC(prof_sel, "profile selector. Valid range 0 - 2"); @@ -1303,7 +1302,7 @@ void mlx5_unload_one(struct mlx5_core_dev *dev, bool cleanup) mutex_unlock(&dev->intf_state_mutex); } -static int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx) +int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx) { struct mlx5_priv *priv = &dev->priv; int err; @@ -1353,7 +1352,7 @@ static int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx) return err; } -static void mlx5_mdev_uninit(struct mlx5_core_dev *dev) +void mlx5_mdev_uninit(struct mlx5_core_dev *dev) { struct mlx5_priv *priv = &dev->priv; @@ -1693,6 +1692,10 @@ static int __init init(void) if (err) goto err_debug; + err = mlx5_sf_driver_register(); + if (err) + goto err_sf; + #ifdef CONFIG_MLX5_CORE_EN err = mlx5e_init(); if (err) { @@ -1703,6 +1706,8 @@ static int __init init(void) return 0; +err_sf: + pci_unregister_driver(&mlx5_core_driver); err_debug: mlx5_unregister_debugfs(); return err; @@ -1713,6 +1718,7 @@ static void __exit cleanup(void) #ifdef CONFIG_MLX5_CORE_EN mlx5e_cleanup(); #endif + mlx5_sf_driver_unregister(); pci_unregister_driver(&mlx5_core_driver); mlx5_unregister_debugfs(); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h index a33b7496d748..3754ef98554f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h @@ -117,6 +117,8 @@ enum mlx5_semaphore_space_address { MLX5_SEMAPHORE_SW_RESET = 0x20, }; +#define MLX5_DEFAULT_PROF 2 + int mlx5_query_hca_caps(struct mlx5_core_dev *dev); int mlx5_query_board_id(struct mlx5_core_dev *dev); int mlx5_cmd_init(struct mlx5_core_dev *dev); @@ -176,6 +178,7 @@ struct cpumask * mlx5_irq_get_affinity_mask(struct mlx5_irq_table *irq_table, int vecidx); struct cpu_rmap *mlx5_irq_get_rmap(struct mlx5_irq_table *table); int mlx5_irq_get_num_comp(struct mlx5_irq_table *table); +struct mlx5_irq_table *mlx5_irq_table_get(struct mlx5_core_dev *dev); int mlx5_events_init(struct mlx5_core_dev *dev); void mlx5_events_cleanup(struct mlx5_core_dev *dev); @@ -257,6 +260,13 @@ enum { u8 mlx5_get_nic_state(struct mlx5_core_dev *dev); void mlx5_set_nic_state(struct mlx5_core_dev *dev, u8 state); +static inline bool mlx5_core_is_sf(const struct mlx5_core_dev *dev) +{ + return dev->coredev_type == MLX5_COREDEV_SF; +} + +int mlx5_mdev_init(struct mlx5_core_dev *dev, int profile_idx); +void mlx5_mdev_uninit(struct mlx5_core_dev *dev); void mlx5_unload_one(struct mlx5_core_dev *dev, bool cleanup); int mlx5_load_one(struct mlx5_core_dev *dev, bool boot); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c index 6fd974920394..a61e09aff152 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c @@ -30,6 +30,9 @@ int mlx5_irq_table_init(struct mlx5_core_dev *dev) { struct mlx5_irq_table *irq_table; + if (mlx5_core_is_sf(dev)) + return 0; + irq_table = kvzalloc(sizeof(*irq_table), GFP_KERNEL); if (!irq_table) return -ENOMEM; @@ -40,6 +43,9 @@ int mlx5_irq_table_init(struct mlx5_core_dev *dev) void mlx5_irq_table_cleanup(struct mlx5_core_dev *dev) { + if (mlx5_core_is_sf(dev)) + return; + kvfree(dev->priv.irq_table); } @@ -268,6 +274,9 @@ int mlx5_irq_table_create(struct mlx5_core_dev *dev) int nvec; int err; + if (mlx5_core_is_sf(dev)) + return 0; + nvec = MLX5_CAP_GEN(dev, num_ports) * num_online_cpus() + MLX5_IRQ_VEC_COMP_BASE; nvec = min_t(int, nvec, num_eqs); @@ -319,6 +328,9 @@ void mlx5_irq_table_destroy(struct mlx5_core_dev *dev) struct mlx5_irq_table *table = dev->priv.irq_table; int i; + if (mlx5_core_is_sf(dev)) + return; + /* free_irq requires that affinity and rmap will be cleared * before calling it. This is why there is asymmetry with set_rmap * which should be called after alloc_irq but before request_irq. @@ -332,3 +344,11 @@ void mlx5_irq_table_destroy(struct mlx5_core_dev *dev) kfree(table->irq); } +struct mlx5_irq_table *mlx5_irq_table_get(struct mlx5_core_dev *dev) +{ +#ifdef CONFIG_MLX5_SF + if (mlx5_core_is_sf(dev)) + return dev->priv.parent_mdev->priv.irq_table; +#endif + return dev->priv.irq_table; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c index 6562bf63afaa..2675b85d202d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c @@ -24,6 +24,16 @@ static bool mlx5_sf_dev_supported(const struct mlx5_core_dev *dev) return MLX5_CAP_GEN(dev, sf) && mlx5_vhca_event_supported(dev); } +bool mlx5_sf_dev_allocated(const struct mlx5_core_dev *dev) +{ + struct mlx5_sf_dev_table *table = dev->priv.sf_dev_table; + + if (!mlx5_sf_dev_supported(dev)) + return false; + + return xa_empty(&table->devices); +} + static ssize_t sfnum_show(struct device *dev, struct device_attribute *attr, char *buf) { struct auxiliary_device *adev = container_of(dev, struct auxiliary_device, dev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h index a6fb7289ba2c..4de02902aef1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h @@ -13,6 +13,7 @@ struct mlx5_sf_dev { struct auxiliary_device adev; struct mlx5_core_dev *parent_mdev; + struct mlx5_core_dev *mdev; phys_addr_t bar_base_addr; u32 sfnum; }; @@ -20,6 +21,11 @@ struct mlx5_sf_dev { void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev); void mlx5_sf_dev_table_destroy(struct mlx5_core_dev *dev); +int mlx5_sf_driver_register(void); +void mlx5_sf_driver_unregister(void); + +bool mlx5_sf_dev_allocated(const struct mlx5_core_dev *dev); + #else static inline void mlx5_sf_dev_table_create(struct mlx5_core_dev *dev) @@ -30,6 +36,20 @@ static inline void mlx5_sf_dev_table_destroy(struct mlx5_core_dev *dev) { } +static inline int mlx5_sf_driver_register(void) +{ + return 0; +} + +static inline void mlx5_sf_driver_unregister(void) +{ +} + +static inline bool mlx5_sf_dev_allocated(const struct mlx5_core_dev *dev) +{ + return 0; +} + #endif #endif diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c new file mode 100644 index 000000000000..9a1ad331ce0a --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c @@ -0,0 +1,101 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#include +#include +#include "mlx5_core.h" +#include "dev.h" +#include "devlink.h" + +static int mlx5_sf_dev_probe(struct auxiliary_device *adev, const struct auxiliary_device_id *id) +{ + struct mlx5_sf_dev *sf_dev = container_of(adev, struct mlx5_sf_dev, adev); + struct mlx5_core_dev *mdev; + struct devlink *devlink; + int err; + + devlink = mlx5_devlink_alloc(); + if (!devlink) + return -ENOMEM; + + mdev = devlink_priv(devlink); + mdev->device = &adev->dev; + mdev->pdev = sf_dev->parent_mdev->pdev; + mdev->bar_addr = sf_dev->bar_base_addr; + mdev->iseg_base = sf_dev->bar_base_addr; + mdev->coredev_type = MLX5_COREDEV_SF; + mdev->priv.parent_mdev = sf_dev->parent_mdev; + mdev->priv.adev_idx = adev->id; + sf_dev->mdev = mdev; + + err = mlx5_mdev_init(mdev, MLX5_DEFAULT_PROF); + if (err) { + mlx5_core_warn(mdev, "mlx5_mdev_init on err=%d\n", err); + goto mdev_err; + } + + mdev->iseg = ioremap(mdev->iseg_base, sizeof(*mdev->iseg)); + if (!mdev->iseg) { + mlx5_core_warn(mdev, "remap error\n"); + goto remap_err; + } + + err = mlx5_load_one(mdev, true); + if (err) { + mlx5_core_warn(mdev, "mlx5_load_one err=%d\n", err); + goto load_one_err; + } + return 0; + +load_one_err: + iounmap(mdev->iseg); +remap_err: + mlx5_mdev_uninit(mdev); +mdev_err: + mlx5_devlink_free(devlink); + return err; +} + +static void mlx5_sf_dev_remove(struct auxiliary_device *adev) +{ + struct mlx5_sf_dev *sf_dev = container_of(adev, struct mlx5_sf_dev, adev); + struct devlink *devlink; + + devlink = priv_to_devlink(sf_dev->mdev); + mlx5_unload_one(sf_dev->mdev, true); + iounmap(sf_dev->mdev->iseg); + mlx5_mdev_uninit(sf_dev->mdev); + mlx5_devlink_free(devlink); +} + +static void mlx5_sf_dev_shutdown(struct auxiliary_device *adev) +{ + struct mlx5_sf_dev *sf_dev = container_of(adev, struct mlx5_sf_dev, adev); + + mlx5_unload_one(sf_dev->mdev, false); +} + +static const struct auxiliary_device_id mlx5_sf_dev_id_table[] = { + { .name = KBUILD_MODNAME "." MLX5_SF_DEV_ID_NAME, }, + { }, +}; + +MODULE_DEVICE_TABLE(auxiliary, mlx5_sf_dev_id_table); + +static struct auxiliary_driver mlx5_sf_driver = { + .name = KBUILD_MODNAME, + .probe = mlx5_sf_dev_probe, + .remove = mlx5_sf_dev_remove, + .shutdown = mlx5_sf_dev_shutdown, + .id_table = mlx5_sf_dev_id_table, +}; + +int mlx5_sf_driver_register(void) +{ + return auxiliary_driver_register(&mlx5_sf_driver); +} + +void mlx5_sf_driver_unregister(void) +{ + auxiliary_driver_unregister(&mlx5_sf_driver); +} diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h index 08e5fbe97df0..48e3638b1185 100644 --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -193,7 +193,8 @@ enum port_state_policy { enum mlx5_coredev_type { MLX5_COREDEV_PF, - MLX5_COREDEV_VF + MLX5_COREDEV_VF, + MLX5_COREDEV_SF, }; struct mlx5_field_desc { @@ -608,6 +609,7 @@ struct mlx5_priv { #ifdef CONFIG_MLX5_SF struct mlx5_vhca_state_notifier *vhca_state_notifier; struct mlx5_sf_dev_table *sf_dev_table; + struct mlx5_core_dev *parent_mdev; #endif }; From patchwork Tue Dec 15 09:03:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 344887 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3500DC2BBD5 for ; Tue, 15 Dec 2020 09:07:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E13C122225 for ; Tue, 15 Dec 2020 09:07:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727988AbgLOJHL (ORCPT ); Tue, 15 Dec 2020 04:07:11 -0500 Received: from mail.kernel.org ([198.145.29.99]:36476 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727850AbgLOJFh (ORCPT ); Tue, 15 Dec 2020 04:05:37 -0500 From: Saeed Mahameed Authentication-Results: mail.kernel.org; dkim=permerror (bad message/signature format) To: "David S. Miller" , Jakub Kicinski , Jason Gunthorpe Cc: Leon Romanovsky , netdev@vger.kernel.org, linux-rdma@vger.kernel.org, David Ahern , Jacob Keller , Sridhar Samudrala , david.m.ertman@intel.com, dan.j.williams@intel.com, kiran.patil@intel.com, gregkh@linuxfoundation.org, Vu Pham , Parav Pandit , Roi Dayan , Saeed Mahameed Subject: [net-next v5 09/15] net/mlx5: E-switch, Prepare eswitch to handle SF vport Date: Tue, 15 Dec 2020 01:03:52 -0800 Message-Id: <20201215090358.240365-10-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201215090358.240365-1-saeed@kernel.org> References: <20201215090358.240365-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Vu Pham Prepare eswitch to handle SF vport during (a) querying eswitch functions (b) egress ACL creation (c) account for SF vports in total vports calculation Assign a dedicated placeholder for SFs vports and their representors. They are placed after VFs vports and before ECPF vports as below: [PF,VF0,...,VFn,SF0,...SFm,ECPF,UPLINK]. Change functions to map SF's vport numbers to indices when accessing the vports or representors arrays, and vice versa. Signed-off-by: Vu Pham Signed-off-by: Parav Pandit Reviewed-by: Roi Dayan Signed-off-by: Saeed Mahameed --- .../net/ethernet/mellanox/mlx5/core/Kconfig | 10 ++++ .../mellanox/mlx5/core/esw/acl/egress_ofld.c | 2 +- .../net/ethernet/mellanox/mlx5/core/eswitch.c | 11 +++- .../net/ethernet/mellanox/mlx5/core/eswitch.h | 50 +++++++++++++++++++ .../mellanox/mlx5/core/eswitch_offloads.c | 11 ++++ .../net/ethernet/mellanox/mlx5/core/vport.c | 3 +- 6 files changed, 83 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig index d6c48582e7a8..ad45d20f9d44 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig +++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig @@ -212,3 +212,13 @@ config MLX5_SF Build support for subfuction device in the NIC. A Mellanox subfunction device can support RDMA, netdevice and vdpa device. It is similar to a SRIOV VF but it doesn't require SRIOV support. + +config MLX5_SF_MANAGER + bool + depends on MLX5_SF && MLX5_ESWITCH + default y + help + Build support for subfuction port in the NIC. A Mellanox subfunction + port is managed through devlink. A subfunction supports RDMA, netdevice + and vdpa device. It is similar to a SRIOV VF but it doesn't require + SRIOV support. diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c index 4c74e2690d57..26b37a0f8762 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c @@ -150,7 +150,7 @@ static void esw_acl_egress_ofld_groups_destroy(struct mlx5_vport *vport) static bool esw_acl_egress_needed(const struct mlx5_eswitch *esw, u16 vport_num) { - return mlx5_eswitch_is_vf_vport(esw, vport_num); + return mlx5_eswitch_is_vf_vport(esw, vport_num) || mlx5_esw_is_sf_vport(esw, vport_num); } int esw_acl_egress_ofld_setup(struct mlx5_eswitch *esw, struct mlx5_vport *vport) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c index da901e364656..d75247a8ce55 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c @@ -1366,9 +1366,15 @@ const u32 *mlx5_esw_query_functions(struct mlx5_core_dev *dev) { int outlen = MLX5_ST_SZ_BYTES(query_esw_functions_out); u32 in[MLX5_ST_SZ_DW(query_esw_functions_in)] = {}; + u16 max_sf_vports; u32 *out; int err; + max_sf_vports = mlx5_sf_max_functions(dev); + /* Device interface is array of 64-bits */ + if (max_sf_vports) + outlen += DIV_ROUND_UP(max_sf_vports, BITS_PER_TYPE(__be64)) * sizeof(__be64); + out = kvzalloc(outlen, GFP_KERNEL); if (!out) return ERR_PTR(-ENOMEM); @@ -1376,7 +1382,7 @@ const u32 *mlx5_esw_query_functions(struct mlx5_core_dev *dev) MLX5_SET(query_esw_functions_in, in, opcode, MLX5_CMD_OP_QUERY_ESW_FUNCTIONS); - err = mlx5_cmd_exec_inout(dev, query_esw_functions, in, out); + err = mlx5_cmd_exec(dev, in, sizeof(in), out, outlen); if (!err) return out; @@ -1899,7 +1905,8 @@ static bool is_port_function_supported(const struct mlx5_eswitch *esw, u16 vport_num) { return vport_num == MLX5_VPORT_PF || - mlx5_eswitch_is_vf_vport(esw, vport_num); + mlx5_eswitch_is_vf_vport(esw, vport_num) || + mlx5_esw_is_sf_vport(esw, vport_num); } int mlx5_devlink_port_function_hw_addr_get(struct devlink *devlink, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h index cf87de94418f..4e3ed878ff03 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h @@ -43,6 +43,7 @@ #include #include "lib/mpfs.h" #include "lib/fs_chains.h" +#include "sf/sf.h" #include "en/tc_ct.h" #ifdef CONFIG_MLX5_ESWITCH @@ -499,6 +500,40 @@ static inline u16 mlx5_eswitch_first_host_vport_num(struct mlx5_core_dev *dev) MLX5_VPORT_PF : MLX5_VPORT_FIRST_VF; } +static inline int mlx5_esw_sf_start_idx(const struct mlx5_eswitch *esw) +{ + /* PF and VF vports indices start from 0 to max_vfs */ + return MLX5_VPORT_PF_PLACEHOLDER + mlx5_core_max_vfs(esw->dev); +} + +static inline int mlx5_esw_sf_end_idx(const struct mlx5_eswitch *esw) +{ + return mlx5_esw_sf_start_idx(esw) + mlx5_sf_max_functions(esw->dev); +} + +static inline int +mlx5_esw_sf_vport_num_to_index(const struct mlx5_eswitch *esw, u16 vport_num) +{ + return vport_num - mlx5_sf_start_function_id(esw->dev) + + MLX5_VPORT_PF_PLACEHOLDER + mlx5_core_max_vfs(esw->dev); +} + +static inline u16 +mlx5_esw_sf_vport_index_to_num(const struct mlx5_eswitch *esw, int idx) +{ + return mlx5_sf_start_function_id(esw->dev) + idx - + (MLX5_VPORT_PF_PLACEHOLDER + mlx5_core_max_vfs(esw->dev)); +} + +static inline bool +mlx5_esw_is_sf_vport(const struct mlx5_eswitch *esw, u16 vport_num) +{ + return mlx5_sf_supported(esw->dev) && + vport_num >= mlx5_sf_start_function_id(esw->dev) && + (vport_num < (mlx5_sf_start_function_id(esw->dev) + + mlx5_sf_max_functions(esw->dev))); +} + static inline bool mlx5_eswitch_is_funcs_handler(const struct mlx5_core_dev *dev) { return mlx5_core_is_ecpf_esw_manager(dev); @@ -527,6 +562,10 @@ static inline int mlx5_eswitch_vport_num_to_index(struct mlx5_eswitch *esw, if (vport_num == MLX5_VPORT_UPLINK) return mlx5_eswitch_uplink_idx(esw); + if (mlx5_esw_is_sf_vport(esw, vport_num)) + return mlx5_esw_sf_vport_num_to_index(esw, vport_num); + + /* PF and VF vports start from 0 to max_vfs */ return vport_num; } @@ -540,6 +579,12 @@ static inline u16 mlx5_eswitch_index_to_vport_num(struct mlx5_eswitch *esw, if (index == mlx5_eswitch_uplink_idx(esw)) return MLX5_VPORT_UPLINK; + /* SF vports indices are after VFs and before ECPF */ + if (mlx5_sf_supported(esw->dev) && + index > mlx5_core_max_vfs(esw->dev)) + return mlx5_esw_sf_vport_index_to_num(esw, index); + + /* PF and VF vports start from 0 to max_vfs */ return index; } @@ -625,6 +670,11 @@ void mlx5e_tc_clean_fdb_peer_flows(struct mlx5_eswitch *esw); for ((vport) = (nvfs); \ (vport) >= (esw)->first_host_vport; (vport)--) +#define mlx5_esw_for_each_sf_rep(esw, i, rep) \ + for ((i) = mlx5_esw_sf_start_idx(esw); \ + (rep) = &(esw)->offloads.vport_reps[(i)], \ + (i) < mlx5_esw_sf_end_idx(esw); (i++)) + struct mlx5_eswitch *mlx5_devlink_eswitch_get(struct devlink *devlink); struct mlx5_vport *__must_check mlx5_eswitch_get_vport(struct mlx5_eswitch *esw, u16 vport_num); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 2f6a0ae20650..2d241f7351b5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -1800,11 +1800,22 @@ static void __esw_offloads_unload_rep(struct mlx5_eswitch *esw, esw->offloads.rep_ops[rep_type]->unload(rep); } +static void __unload_reps_sf_vport(struct mlx5_eswitch *esw, u8 rep_type) +{ + struct mlx5_eswitch_rep *rep; + int i; + + mlx5_esw_for_each_sf_rep(esw, i, rep) + __esw_offloads_unload_rep(esw, rep, rep_type); +} + static void __unload_reps_all_vport(struct mlx5_eswitch *esw, u8 rep_type) { struct mlx5_eswitch_rep *rep; int i; + __unload_reps_sf_vport(esw, rep_type); + mlx5_esw_for_each_vf_rep_reverse(esw, i, rep, esw->esw_funcs.num_vfs) __esw_offloads_unload_rep(esw, rep, rep_type); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/vport.c b/drivers/net/ethernet/mellanox/mlx5/core/vport.c index bdafc85fd874..ba78e0660523 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/vport.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/vport.c @@ -36,6 +36,7 @@ #include #include #include "mlx5_core.h" +#include "sf/sf.h" /* Mutex to hold while enabling or disabling RoCE */ static DEFINE_MUTEX(mlx5_roce_en_lock); @@ -1160,6 +1161,6 @@ EXPORT_SYMBOL_GPL(mlx5_query_nic_system_image_guid); */ u16 mlx5_eswitch_get_total_vports(const struct mlx5_core_dev *dev) { - return MLX5_SPECIAL_VPORTS(dev) + mlx5_core_max_vfs(dev); + return MLX5_SPECIAL_VPORTS(dev) + mlx5_core_max_vfs(dev) + mlx5_sf_max_functions(dev); } EXPORT_SYMBOL_GPL(mlx5_eswitch_get_total_vports); From patchwork Tue Dec 15 09:03:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 344323 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C64A7C4361B for ; Tue, 15 Dec 2020 09:08:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7743F22225 for ; Tue, 15 Dec 2020 09:08:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728094AbgLOJIQ (ORCPT ); Tue, 15 Dec 2020 04:08:16 -0500 Received: from mail.kernel.org ([198.145.29.99]:36478 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727082AbgLOJFg (ORCPT ); Tue, 15 Dec 2020 04:05:36 -0500 From: Saeed Mahameed Authentication-Results: mail.kernel.org; dkim=permerror (bad message/signature format) To: "David S. Miller" , Jakub Kicinski , Jason Gunthorpe Cc: Leon Romanovsky , netdev@vger.kernel.org, linux-rdma@vger.kernel.org, David Ahern , Jacob Keller , Sridhar Samudrala , david.m.ertman@intel.com, dan.j.williams@intel.com, kiran.patil@intel.com, gregkh@linuxfoundation.org, Parav Pandit , Vu Pham , Roi Dayan , Saeed Mahameed Subject: [net-next v5 10/15] net/mlx5: E-switch, Add eswitch helpers for SF vport Date: Tue, 15 Dec 2020 01:03:53 -0800 Message-Id: <20201215090358.240365-11-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201215090358.240365-1-saeed@kernel.org> References: <20201215090358.240365-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Parav Pandit Add helpers to enable/disable eswitch port, register its devlink port and load its representor. Signed-off-by: Vu Pham Signed-off-by: Parav Pandit Reviewed-by: Roi Dayan Signed-off-by: Saeed Mahameed --- .../mellanox/mlx5/core/esw/devlink_port.c | 41 +++++++++++++++++++ .../net/ethernet/mellanox/mlx5/core/eswitch.c | 12 +++--- .../net/ethernet/mellanox/mlx5/core/eswitch.h | 16 ++++++++ .../mellanox/mlx5/core/eswitch_offloads.c | 36 +++++++++++++++- 4 files changed, 97 insertions(+), 8 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c index ffff11baa3d0..4b7e9f783789 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c @@ -122,3 +122,44 @@ struct devlink_port *mlx5_esw_offloads_devlink_port(struct mlx5_eswitch *esw, u1 vport = mlx5_eswitch_get_vport(esw, vport_num); return vport->dl_port; } + +int mlx5_esw_devlink_sf_port_register(struct mlx5_eswitch *esw, struct devlink_port *dl_port, + u16 vport_num, u32 sfnum) +{ + struct mlx5_core_dev *dev = esw->dev; + struct netdev_phys_item_id ppid = {}; + unsigned int dl_port_index; + struct mlx5_vport *vport; + struct devlink *devlink; + u16 pfnum; + int err; + + vport = mlx5_eswitch_get_vport(esw, vport_num); + if (IS_ERR(vport)) + return PTR_ERR(vport); + + pfnum = PCI_FUNC(dev->pdev->devfn); + mlx5_esw_get_port_parent_id(dev, &ppid); + memcpy(dl_port->attrs.switch_id.id, &ppid.id[0], ppid.id_len); + dl_port->attrs.switch_id.id_len = ppid.id_len; + devlink_port_attrs_pci_sf_set(dl_port, 0, pfnum, sfnum, false); + devlink = priv_to_devlink(dev); + dl_port_index = mlx5_esw_vport_to_devlink_port_index(dev, vport_num); + err = devlink_port_register(devlink, dl_port, dl_port_index); + if (err) + return err; + + vport->dl_port = dl_port; + return 0; +} + +void mlx5_esw_devlink_sf_port_unregister(struct mlx5_eswitch *esw, u16 vport_num) +{ + struct mlx5_vport *vport; + + vport = mlx5_eswitch_get_vport(esw, vport_num); + if (IS_ERR(vport)) + return; + devlink_port_unregister(vport->dl_port); + vport->dl_port = NULL; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c index d75247a8ce55..d06e7a5f15de 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c @@ -1273,8 +1273,8 @@ static void esw_vport_cleanup(struct mlx5_eswitch *esw, struct mlx5_vport *vport esw_vport_cleanup_acl(esw, vport); } -static int esw_enable_vport(struct mlx5_eswitch *esw, u16 vport_num, - enum mlx5_eswitch_vport_event enabled_events) +int mlx5_esw_vport_enable(struct mlx5_eswitch *esw, u16 vport_num, + enum mlx5_eswitch_vport_event enabled_events) { struct mlx5_vport *vport; int ret; @@ -1310,7 +1310,7 @@ static int esw_enable_vport(struct mlx5_eswitch *esw, u16 vport_num, return ret; } -static void esw_disable_vport(struct mlx5_eswitch *esw, u16 vport_num) +void mlx5_esw_vport_disable(struct mlx5_eswitch *esw, u16 vport_num) { struct mlx5_vport *vport; @@ -1432,7 +1432,7 @@ int mlx5_eswitch_load_vport(struct mlx5_eswitch *esw, u16 vport_num, { int err; - err = esw_enable_vport(esw, vport_num, enabled_events); + err = mlx5_esw_vport_enable(esw, vport_num, enabled_events); if (err) return err; @@ -1443,14 +1443,14 @@ int mlx5_eswitch_load_vport(struct mlx5_eswitch *esw, u16 vport_num, return err; err_rep: - esw_disable_vport(esw, vport_num); + mlx5_esw_vport_disable(esw, vport_num); return err; } void mlx5_eswitch_unload_vport(struct mlx5_eswitch *esw, u16 vport_num) { esw_offloads_unload_rep(esw, vport_num); - esw_disable_vport(esw, vport_num); + mlx5_esw_vport_disable(esw, vport_num); } void mlx5_eswitch_unload_vf_vports(struct mlx5_eswitch *esw, u16 num_vfs) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h index 4e3ed878ff03..54514b04808d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h @@ -688,6 +688,10 @@ mlx5_eswitch_enable_pf_vf_vports(struct mlx5_eswitch *esw, enum mlx5_eswitch_vport_event enabled_events); void mlx5_eswitch_disable_pf_vf_vports(struct mlx5_eswitch *esw); +int mlx5_esw_vport_enable(struct mlx5_eswitch *esw, u16 vport_num, + enum mlx5_eswitch_vport_event enabled_events); +void mlx5_esw_vport_disable(struct mlx5_eswitch *esw, u16 vport_num); + int esw_vport_create_offloads_acl_tables(struct mlx5_eswitch *esw, struct mlx5_vport *vport); @@ -706,6 +710,9 @@ esw_get_max_restore_tag(struct mlx5_eswitch *esw); int esw_offloads_load_rep(struct mlx5_eswitch *esw, u16 vport_num); void esw_offloads_unload_rep(struct mlx5_eswitch *esw, u16 vport_num); +int mlx5_esw_offloads_rep_load(struct mlx5_eswitch *esw, u16 vport_num); +void mlx5_esw_offloads_rep_unload(struct mlx5_eswitch *esw, u16 vport_num); + int mlx5_eswitch_load_vport(struct mlx5_eswitch *esw, u16 vport_num, enum mlx5_eswitch_vport_event enabled_events); void mlx5_eswitch_unload_vport(struct mlx5_eswitch *esw, u16 vport_num); @@ -717,6 +724,15 @@ void mlx5_eswitch_unload_vf_vports(struct mlx5_eswitch *esw, u16 num_vfs); int mlx5_esw_offloads_devlink_port_register(struct mlx5_eswitch *esw, u16 vport_num); void mlx5_esw_offloads_devlink_port_unregister(struct mlx5_eswitch *esw, u16 vport_num); struct devlink_port *mlx5_esw_offloads_devlink_port(struct mlx5_eswitch *esw, u16 vport_num); + +int mlx5_esw_devlink_sf_port_register(struct mlx5_eswitch *esw, struct devlink_port *dl_port, + u16 vport_num, u32 sfnum); +void mlx5_esw_devlink_sf_port_unregister(struct mlx5_eswitch *esw, u16 vport_num); + +int mlx5_esw_offloads_sf_vport_enable(struct mlx5_eswitch *esw, struct devlink_port *dl_port, + u16 vport_num, u32 sfnum); +void mlx5_esw_offloads_sf_vport_disable(struct mlx5_eswitch *esw, u16 vport_num); + #else /* CONFIG_MLX5_ESWITCH */ /* eswitch API stubs */ static inline int mlx5_eswitch_init(struct mlx5_core_dev *dev) { return 0; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 2d241f7351b5..7f09f2bbf7c1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -1833,7 +1833,7 @@ static void __unload_reps_all_vport(struct mlx5_eswitch *esw, u8 rep_type) __esw_offloads_unload_rep(esw, rep, rep_type); } -static int mlx5_esw_offloads_rep_load(struct mlx5_eswitch *esw, u16 vport_num) +int mlx5_esw_offloads_rep_load(struct mlx5_eswitch *esw, u16 vport_num) { struct mlx5_eswitch_rep *rep; int rep_type; @@ -1857,7 +1857,7 @@ static int mlx5_esw_offloads_rep_load(struct mlx5_eswitch *esw, u16 vport_num) return err; } -static void mlx5_esw_offloads_rep_unload(struct mlx5_eswitch *esw, u16 vport_num) +void mlx5_esw_offloads_rep_unload(struct mlx5_eswitch *esw, u16 vport_num) { struct mlx5_eswitch_rep *rep; int rep_type; @@ -2835,3 +2835,35 @@ u32 mlx5_eswitch_get_vport_metadata_for_match(struct mlx5_eswitch *esw, return vport->metadata << (32 - ESW_SOURCE_PORT_METADATA_BITS); } EXPORT_SYMBOL(mlx5_eswitch_get_vport_metadata_for_match); + +int mlx5_esw_offloads_sf_vport_enable(struct mlx5_eswitch *esw, struct devlink_port *dl_port, + u16 vport_num, u32 sfnum) +{ + int err; + + err = mlx5_esw_vport_enable(esw, vport_num, MLX5_VPORT_UC_ADDR_CHANGE); + if (err) + return err; + + err = mlx5_esw_devlink_sf_port_register(esw, dl_port, vport_num, sfnum); + if (err) + goto devlink_err; + + err = mlx5_esw_offloads_rep_load(esw, vport_num); + if (err) + goto rep_err; + return 0; + +rep_err: + mlx5_esw_devlink_sf_port_unregister(esw, vport_num); +devlink_err: + mlx5_esw_vport_disable(esw, vport_num); + return err; +} + +void mlx5_esw_offloads_sf_vport_disable(struct mlx5_eswitch *esw, u16 vport_num) +{ + mlx5_esw_offloads_rep_unload(esw, vport_num); + mlx5_esw_devlink_sf_port_unregister(esw, vport_num); + mlx5_esw_vport_disable(esw, vport_num); +} From patchwork Tue Dec 15 09:03:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 344325 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B511FC2BBCD for ; Tue, 15 Dec 2020 09:07:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6963722255 for ; Tue, 15 Dec 2020 09:07:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727921AbgLOJGR (ORCPT ); Tue, 15 Dec 2020 04:06:17 -0500 Received: from mail.kernel.org ([198.145.29.99]:36480 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727855AbgLOJFi (ORCPT ); Tue, 15 Dec 2020 04:05:38 -0500 From: Saeed Mahameed Authentication-Results: mail.kernel.org; dkim=permerror (bad message/signature format) To: "David S. Miller" , Jakub Kicinski , Jason Gunthorpe Cc: Leon Romanovsky , netdev@vger.kernel.org, linux-rdma@vger.kernel.org, David Ahern , Jacob Keller , Sridhar Samudrala , david.m.ertman@intel.com, dan.j.williams@intel.com, kiran.patil@intel.com, gregkh@linuxfoundation.org, Parav Pandit , Vu Pham , Saeed Mahameed Subject: [net-next v5 11/15] net/mlx5: SF, Add port add delete functionality Date: Tue, 15 Dec 2020 01:03:54 -0800 Message-Id: <20201215090358.240365-12-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201215090358.240365-1-saeed@kernel.org> References: <20201215090358.240365-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Parav Pandit To handle SF port management outside of the eswitch as independent software layer, introduce eswitch notifier APIs so that upper layer who wish to support sf port management in switchdev mode can perform its task whenever eswitch mode is set to switchdev or before eswitch is disabled. Initialize sf port table on such eswitch event. Add SF port add and delete functionality in switchdev mode. Destroy all SF ports when eswitch is disabled. Expose SF port add and delete to user via devlink commands. $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev $ devlink port show pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical port 0 splittable false $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88 $ devlink port show ens2f0npf0sf88 pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false function: hw_addr 00:00:00:00:88:88 state inactive opstate detached $ devlink port show ens2f0npf0sf88 -jp { "port": { "pci/0000:06:00.0/32768": { "type": "eth", "netdev": "ens2f0npf0sf88", "flavour": "pcisf", "controller": 0, "pfnum": 0, "sfnum": 88, "external": false, "splittable": false, "function": { "hw_addr": "00:00:00:00:88:88", "state": "inactive", "opstate": "detached" } } } } Signed-off-by: Parav Pandit Reviewed-by: Vu Pham Signed-off-by: Saeed Mahameed --- .../net/ethernet/mellanox/mlx5/core/Makefile | 5 + drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 4 + .../net/ethernet/mellanox/mlx5/core/devlink.c | 5 + .../net/ethernet/mellanox/mlx5/core/eswitch.c | 25 ++ .../net/ethernet/mellanox/mlx5/core/eswitch.h | 12 + .../net/ethernet/mellanox/mlx5/core/main.c | 18 + .../net/ethernet/mellanox/mlx5/core/sf/cmd.c | 27 ++ .../ethernet/mellanox/mlx5/core/sf/devlink.c | 312 ++++++++++++++++++ .../ethernet/mellanox/mlx5/core/sf/hw_table.c | 125 +++++++ .../net/ethernet/mellanox/mlx5/core/sf/priv.h | 17 + .../net/ethernet/mellanox/mlx5/core/sf/sf.h | 28 ++ include/linux/mlx5/driver.h | 6 + 12 files changed, 584 insertions(+) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/cmd.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/priv.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index efa95d6dd112..957d5d9cfb36 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -89,3 +89,8 @@ mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o # SF device # mlx5_core-$(CONFIG_MLX5_SF) += sf/vhca_event.o sf/dev/dev.o sf/dev/driver.o + +# +# SF manager +# +mlx5_core-$(CONFIG_MLX5_SF_MANAGER) += sf/cmd.o sf/hw_table.o sf/devlink.o diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c index 47dcc3ac2cf0..e8cecd50558d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c @@ -333,6 +333,7 @@ static int mlx5_internal_err_ret_value(struct mlx5_core_dev *dev, u16 op, case MLX5_CMD_OP_DEALLOC_MEMIC: case MLX5_CMD_OP_PAGE_FAULT_RESUME: case MLX5_CMD_OP_QUERY_ESW_FUNCTIONS: + case MLX5_CMD_OP_DEALLOC_SF: return MLX5_CMD_STAT_OK; case MLX5_CMD_OP_QUERY_HCA_CAP: @@ -466,6 +467,7 @@ static int mlx5_internal_err_ret_value(struct mlx5_core_dev *dev, u16 op, case MLX5_CMD_OP_RELEASE_XRQ_ERROR: case MLX5_CMD_OP_QUERY_VHCA_STATE: case MLX5_CMD_OP_MODIFY_VHCA_STATE: + case MLX5_CMD_OP_ALLOC_SF: *status = MLX5_DRIVER_STATUS_ABORTED; *synd = MLX5_DRIVER_SYND; return -EIO; @@ -661,6 +663,8 @@ const char *mlx5_command_str(int command) MLX5_COMMAND_STR_CASE(MODIFY_XRQ); MLX5_COMMAND_STR_CASE(QUERY_VHCA_STATE); MLX5_COMMAND_STR_CASE(MODIFY_VHCA_STATE); + MLX5_COMMAND_STR_CASE(ALLOC_SF); + MLX5_COMMAND_STR_CASE(DEALLOC_SF); default: return "unknown command opcode"; } } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c index 9afe918c5827..d4c0cdf5edd9 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c @@ -8,6 +8,7 @@ #include "fs_core.h" #include "eswitch.h" #include "sf/dev/dev.h" +#include "sf/sf.h" static int mlx5_devlink_flash_update(struct devlink *devlink, struct devlink_flash_update_params *params, @@ -190,6 +191,10 @@ static const struct devlink_ops mlx5_devlink_ops = { .eswitch_encap_mode_get = mlx5_devlink_eswitch_encap_mode_get, .port_function_hw_addr_get = mlx5_devlink_port_function_hw_addr_get, .port_function_hw_addr_set = mlx5_devlink_port_function_hw_addr_set, +#endif +#ifdef CONFIG_MLX5_SF_MANAGER + .port_new = mlx5_devlink_sf_port_new, + .port_del = mlx5_devlink_sf_port_del, #endif .flash_update = mlx5_devlink_flash_update, .info_get = mlx5_devlink_info_get, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c index d06e7a5f15de..86e972c82af7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c @@ -1600,6 +1600,15 @@ mlx5_eswitch_update_num_of_vfs(struct mlx5_eswitch *esw, int num_vfs) kvfree(out); } +static void mlx5_esw_mode_change_notify(struct mlx5_eswitch *esw, u16 mode) +{ + struct mlx5_esw_event_info info = {}; + + info.new_mode = mode; + + blocking_notifier_call_chain(&esw->n_head, 0, &info); +} + /** * mlx5_eswitch_enable_locked - Enable eswitch * @esw: Pointer to eswitch @@ -1660,6 +1669,8 @@ int mlx5_eswitch_enable_locked(struct mlx5_eswitch *esw, int mode, int num_vfs) mode == MLX5_ESWITCH_LEGACY ? "LEGACY" : "OFFLOADS", esw->esw_funcs.num_vfs, esw->enabled_vports); + mlx5_esw_mode_change_notify(esw, mode); + return 0; abort: @@ -1716,6 +1727,11 @@ void mlx5_eswitch_disable_locked(struct mlx5_eswitch *esw, bool clear_vf) esw->mode == MLX5_ESWITCH_LEGACY ? "LEGACY" : "OFFLOADS", esw->esw_funcs.num_vfs, esw->enabled_vports); + /* Notify eswitch users that it is exiting from current mode. + * So that it can do necessary cleanup before the eswitch is disabled. + */ + mlx5_esw_mode_change_notify(esw, MLX5_ESWITCH_NONE); + mlx5_eswitch_event_handlers_unregister(esw); if (esw->mode == MLX5_ESWITCH_LEGACY) @@ -1816,6 +1832,7 @@ int mlx5_eswitch_init(struct mlx5_core_dev *dev) esw->offloads.inline_mode = MLX5_INLINE_MODE_NONE; dev->priv.eswitch = esw; + BLOCKING_INIT_NOTIFIER_HEAD(&esw->n_head); return 0; abort: if (esw->work_queue) @@ -2507,4 +2524,12 @@ bool mlx5_esw_multipath_prereq(struct mlx5_core_dev *dev0, dev1->priv.eswitch->mode == MLX5_ESWITCH_OFFLOADS); } +int mlx5_esw_event_notifier_register(struct mlx5_eswitch *esw, struct notifier_block *nb) +{ + return blocking_notifier_chain_register(&esw->n_head, nb); +} +void mlx5_esw_event_notifier_unregister(struct mlx5_eswitch *esw, struct notifier_block *nb) +{ + blocking_notifier_chain_unregister(&esw->n_head, nb); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h index 54514b04808d..479d2ac2cd85 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h @@ -278,6 +278,7 @@ struct mlx5_eswitch { struct { u32 large_group_num; } params; + struct blocking_notifier_head n_head; }; void esw_offloads_disable(struct mlx5_eswitch *esw); @@ -733,6 +734,17 @@ int mlx5_esw_offloads_sf_vport_enable(struct mlx5_eswitch *esw, struct devlink_p u16 vport_num, u32 sfnum); void mlx5_esw_offloads_sf_vport_disable(struct mlx5_eswitch *esw, u16 vport_num); +/** + * mlx5_esw_event_info - Indicates eswitch mode changed/changing. + * + * @new_mode: New mode of eswitch. + */ +struct mlx5_esw_event_info { + u16 new_mode; +}; + +int mlx5_esw_event_notifier_register(struct mlx5_eswitch *esw, struct notifier_block *n); +void mlx5_esw_event_notifier_unregister(struct mlx5_eswitch *esw, struct notifier_block *n); #else /* CONFIG_MLX5_ESWITCH */ /* eswitch API stubs */ static inline int mlx5_eswitch_init(struct mlx5_core_dev *dev) { return 0; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index 932a280a56a5..435323088ce0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -893,6 +893,18 @@ static int mlx5_init_once(struct mlx5_core_dev *dev) goto err_fpga_cleanup; } + err = mlx5_sf_hw_table_init(dev); + if (err) { + mlx5_core_err(dev, "Failed to init SF HW table %d\n", err); + goto err_sf_hw_table_cleanup; + } + + err = mlx5_sf_table_init(dev); + if (err) { + mlx5_core_err(dev, "Failed to init SF table %d\n", err); + goto err_sf_table_cleanup; + } + dev->dm = mlx5_dm_create(dev); if (IS_ERR(dev->dm)) mlx5_core_warn(dev, "Failed to init device memory%d\n", err); @@ -903,6 +915,10 @@ static int mlx5_init_once(struct mlx5_core_dev *dev) return 0; +err_sf_table_cleanup: + mlx5_sf_hw_table_cleanup(dev); +err_sf_hw_table_cleanup: + mlx5_vhca_event_cleanup(dev); err_fpga_cleanup: mlx5_fpga_cleanup(dev); err_eswitch_cleanup: @@ -936,6 +952,8 @@ static void mlx5_cleanup_once(struct mlx5_core_dev *dev) mlx5_hv_vhca_destroy(dev->hv_vhca); mlx5_fw_tracer_destroy(dev->tracer); mlx5_dm_cleanup(dev); + mlx5_sf_table_cleanup(dev); + mlx5_sf_hw_table_cleanup(dev); mlx5_vhca_event_cleanup(dev); mlx5_fpga_cleanup(dev); mlx5_eswitch_cleanup(dev->priv.eswitch); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/cmd.c new file mode 100644 index 000000000000..0bc3075f34fa --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/cmd.c @@ -0,0 +1,27 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#include +#include "priv.h" + +int mlx5_cmd_alloc_sf(struct mlx5_core_dev *dev, u16 function_id) +{ + u32 out[MLX5_ST_SZ_DW(alloc_sf_out)] = {}; + u32 in[MLX5_ST_SZ_DW(alloc_sf_in)] = {}; + + MLX5_SET(alloc_sf_in, in, opcode, MLX5_CMD_OP_ALLOC_SF); + MLX5_SET(alloc_sf_in, in, function_id, function_id); + + return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); +} + +int mlx5_cmd_dealloc_sf(struct mlx5_core_dev *dev, u16 function_id) +{ + u32 out[MLX5_ST_SZ_DW(dealloc_sf_out)] = {}; + u32 in[MLX5_ST_SZ_DW(dealloc_sf_in)] = {}; + + MLX5_SET(dealloc_sf_in, in, opcode, MLX5_CMD_OP_DEALLOC_SF); + MLX5_SET(dealloc_sf_in, in, function_id, function_id); + + return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c new file mode 100644 index 000000000000..09365f36a513 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c @@ -0,0 +1,312 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#include +#include "eswitch.h" +#include "priv.h" + +struct mlx5_sf { + struct devlink_port dl_port; + unsigned int port_index; + u16 id; +}; + +struct mlx5_sf_table { + struct mlx5_core_dev *dev; /* To refer from notifier context. */ + struct xarray port_indices; /* port index based lookup. */ + refcount_t refcount; + struct completion disable_complete; + struct notifier_block esw_nb; +}; + +static struct mlx5_sf * +mlx5_sf_lookup_by_index(struct mlx5_sf_table *table, unsigned int port_index) +{ + return xa_load(&table->port_indices, port_index); +} + +static int mlx5_sf_id_insert(struct mlx5_sf_table *table, struct mlx5_sf *sf) +{ + return xa_insert(&table->port_indices, sf->port_index, sf, GFP_KERNEL); +} + +static void mlx5_sf_id_erase(struct mlx5_sf_table *table, struct mlx5_sf *sf) +{ + xa_erase(&table->port_indices, sf->port_index); +} + +static struct mlx5_sf * +mlx5_sf_alloc(struct mlx5_sf_table *table, u32 sfnum, struct netlink_ext_ack *extack) +{ + unsigned int dl_port_index; + struct mlx5_sf *sf; + u16 hw_fn_id; + int id_err; + int err; + + id_err = mlx5_sf_hw_table_sf_alloc(table->dev, sfnum); + if (id_err < 0) { + err = id_err; + goto id_err; + } + + sf = kzalloc(sizeof(*sf), GFP_KERNEL); + if (!sf) { + err = -ENOMEM; + goto alloc_err; + } + sf->id = id_err; + hw_fn_id = mlx5_sf_sw_to_hw_id(table->dev, sf->id); + dl_port_index = mlx5_esw_vport_to_devlink_port_index(table->dev, hw_fn_id); + sf->port_index = dl_port_index; + + err = mlx5_sf_id_insert(table, sf); + if (err) + goto insert_err; + + return sf; + +insert_err: + kfree(sf); +alloc_err: + mlx5_sf_hw_table_sf_free(table->dev, id_err); +id_err: + if (err == -EEXIST) + NL_SET_ERR_MSG_MOD(extack, "SF already exist. Choose different sfnum"); + return ERR_PTR(err); +} + +static void mlx5_sf_free(struct mlx5_sf_table *table, struct mlx5_sf *sf) +{ + mlx5_sf_id_erase(table, sf); + mlx5_sf_hw_table_sf_free(table->dev, sf->id); + kfree(sf); +} + +static struct mlx5_sf_table *mlx5_sf_table_try_get(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_table *table = dev->priv.sf_table; + + if (!table) + return NULL; + + return refcount_inc_not_zero(&table->refcount) ? table : NULL; +} + +static void mlx5_sf_table_put(struct mlx5_sf_table *table) +{ + if (refcount_dec_and_test(&table->refcount)) + complete(&table->disable_complete); +} + +static int mlx5_sf_add(struct mlx5_core_dev *dev, struct mlx5_sf_table *table, + const struct devlink_port_new_attrs *new_attr, + struct netlink_ext_ack *extack) +{ + struct mlx5_eswitch *esw = dev->priv.eswitch; + struct mlx5_sf *sf; + u16 hw_fn_id; + int err; + + sf = mlx5_sf_alloc(table, new_attr->sfnum, extack); + if (IS_ERR(sf)) + return PTR_ERR(sf); + + hw_fn_id = mlx5_sf_sw_to_hw_id(dev, sf->id); + err = mlx5_esw_offloads_sf_vport_enable(esw, &sf->dl_port, hw_fn_id, new_attr->sfnum); + if (err) + goto esw_err; + return 0; + +esw_err: + mlx5_sf_free(table, sf); + return err; +} + +static void mlx5_sf_del(struct mlx5_core_dev *dev, struct mlx5_sf_table *table, struct mlx5_sf *sf) +{ + struct mlx5_eswitch *esw = dev->priv.eswitch; + u16 hw_fn_id; + + hw_fn_id = mlx5_sf_sw_to_hw_id(dev, sf->id); + mlx5_esw_offloads_sf_vport_disable(esw, hw_fn_id); + mlx5_sf_free(table, sf); +} + +static int +mlx5_sf_new_check_attr(struct mlx5_core_dev *dev, const struct devlink_port_new_attrs *new_attr, + struct netlink_ext_ack *extack) +{ + if (new_attr->flavour != DEVLINK_PORT_FLAVOUR_PCI_SF) { + NL_SET_ERR_MSG_MOD(extack, "Driver supports only SF port addition"); + return -EOPNOTSUPP; + } + if (new_attr->port_index_valid) { + NL_SET_ERR_MSG_MOD(extack, + "Driver does not support user defined port index assignment"); + return -EOPNOTSUPP; + } + if (!new_attr->sfnum_valid) { + NL_SET_ERR_MSG_MOD(extack, + "User must provide unique sfnum. Driver does not support auto assignment"); + return -EOPNOTSUPP; + } + if (new_attr->controller_valid && new_attr->controller) { + NL_SET_ERR_MSG_MOD(extack, "External controller is unsupported"); + return -EOPNOTSUPP; + } + if (new_attr->pfnum != PCI_FUNC(dev->pdev->devfn)) { + NL_SET_ERR_MSG_MOD(extack, "Invalid pfnum supplied"); + return -EOPNOTSUPP; + } + return 0; +} + +int mlx5_devlink_sf_port_new(struct devlink *devlink, const struct devlink_port_new_attrs *new_attr, + struct netlink_ext_ack *extack) +{ + struct mlx5_core_dev *dev = devlink_priv(devlink); + struct mlx5_sf_table *table; + int err; + + err = mlx5_sf_new_check_attr(dev, new_attr, extack); + if (err) + return err; + + table = mlx5_sf_table_try_get(dev); + if (!table) { + NL_SET_ERR_MSG_MOD(extack, + "Port add is only supported in eswitch switchdev mode or SF ports are disabled."); + return -EOPNOTSUPP; + } + err = mlx5_sf_add(dev, table, new_attr, extack); + mlx5_sf_table_put(table); + return err; +} + +int mlx5_devlink_sf_port_del(struct devlink *devlink, unsigned int port_index, + struct netlink_ext_ack *extack) +{ + struct mlx5_core_dev *dev = devlink_priv(devlink); + struct mlx5_sf_table *table; + struct mlx5_sf *sf; + int err = 0; + + table = mlx5_sf_table_try_get(dev); + if (!table) { + NL_SET_ERR_MSG_MOD(extack, + "Port del is only supported in eswitch switchdev mode or SF ports are disabled."); + return -EOPNOTSUPP; + } + sf = mlx5_sf_lookup_by_index(table, port_index); + if (!sf) { + err = -ENODEV; + goto sf_err; + } + + mlx5_sf_del(dev, table, sf); +sf_err: + mlx5_sf_table_put(table); + return err; +} + +static void mlx5_sf_destroy_all(struct mlx5_sf_table *table) +{ + struct mlx5_core_dev *dev = table->dev; + unsigned long index; + struct mlx5_sf *sf; + + xa_for_each(&table->port_indices, index, sf) + mlx5_sf_del(dev, table, sf); +} + +static void mlx5_sf_table_enable(struct mlx5_sf_table *table) +{ + if (!mlx5_sf_max_functions(table->dev)) + return; + + init_completion(&table->disable_complete); + refcount_set(&table->refcount, 1); +} + +static void mlx5_sf_table_disable(struct mlx5_sf_table *table) +{ + if (!mlx5_sf_max_functions(table->dev)) + return; + + if (!refcount_read(&table->refcount)) + return; + + /* Balances with refcount_set; drop the reference so that new user cmd cannot start. */ + mlx5_sf_table_put(table); + wait_for_completion(&table->disable_complete); + + /* At this point, no new user commands can start. + * It is safe to destroy all user created SFs. + */ + mlx5_sf_destroy_all(table); +} + +static int mlx5_sf_esw_event(struct notifier_block *nb, unsigned long event, void *data) +{ + struct mlx5_sf_table *table = container_of(nb, struct mlx5_sf_table, esw_nb); + const struct mlx5_esw_event_info *mode = data; + + switch (mode->new_mode) { + case MLX5_ESWITCH_OFFLOADS: + mlx5_sf_table_enable(table); + break; + case MLX5_ESWITCH_NONE: + mlx5_sf_table_disable(table); + break; + default: + break; + }; + + return 0; +} + +static bool mlx5_sf_table_supported(const struct mlx5_core_dev *dev) +{ + return dev->priv.eswitch && MLX5_ESWITCH_MANAGER(dev) && mlx5_sf_supported(dev); +} + +int mlx5_sf_table_init(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_table *table; + int err; + + if (!mlx5_sf_table_supported(dev)) + return 0; + + table = kzalloc(sizeof(*table), GFP_KERNEL); + if (!table) + return -ENOMEM; + + table->dev = dev; + xa_init(&table->port_indices); + dev->priv.sf_table = table; + table->esw_nb.notifier_call = mlx5_sf_esw_event; + err = mlx5_esw_event_notifier_register(dev->priv.eswitch, &table->esw_nb); + if (err) + goto reg_err; + return 0; + +reg_err: + kfree(table); + dev->priv.sf_table = NULL; + return err; +} + +void mlx5_sf_table_cleanup(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_table *table = dev->priv.sf_table; + + if (!table) + return; + + mlx5_esw_event_notifier_unregister(dev->priv.eswitch, &table->esw_nb); + WARN_ON(refcount_read(&table->refcount)); + WARN_ON(!xa_empty(&table->port_indices)); + kfree(table); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c new file mode 100644 index 000000000000..c7757f399e8a --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c @@ -0,0 +1,125 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2020 Mellanox Technologies Ltd */ +#include +#include "vhca_event.h" +#include "priv.h" +#include "sf.h" +#include "ecpf.h" + +struct mlx5_sf_hw { + u32 usr_sfnum; + u8 allocated: 1; +}; + +struct mlx5_sf_hw_table { + struct mlx5_core_dev *dev; + struct mlx5_sf_hw *sfs; + int max_local_functions; + u8 ecpu: 1; +}; + +u16 mlx5_sf_sw_to_hw_id(const struct mlx5_core_dev *dev, u16 sw_id) +{ + return sw_id + mlx5_sf_start_function_id(dev); +} + +int mlx5_sf_hw_table_sf_alloc(struct mlx5_core_dev *dev, u32 usr_sfnum) +{ + struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; + int sw_id = -ENOSPC; + u16 hw_fn_id; + int err; + int i; + + if (!table->max_local_functions) + return -EOPNOTSUPP; + + /* Check if sf with same sfnum already exists or not. */ + for (i = 0; i < table->max_local_functions; i++) { + if (table->sfs[i].allocated && table->sfs[i].usr_sfnum == usr_sfnum) + return -EEXIST; + } + + /* Find the free entry and allocate the entry from the array */ + for (i = 0; i < table->max_local_functions; i++) { + if (!table->sfs[i].allocated) { + table->sfs[i].usr_sfnum = usr_sfnum; + table->sfs[i].allocated = true; + sw_id = i; + break; + } + } + if (sw_id == -ENOSPC) { + err = -ENOSPC; + goto err; + } + + hw_fn_id = mlx5_sf_sw_to_hw_id(table->dev, sw_id); + err = mlx5_cmd_alloc_sf(table->dev, hw_fn_id); + if (err) + goto err; + + err = mlx5_modify_vhca_sw_id(dev, hw_fn_id, table->ecpu, usr_sfnum); + if (err) + goto vhca_err; + + return sw_id; + +vhca_err: + mlx5_cmd_dealloc_sf(table->dev, hw_fn_id); +err: + table->sfs[i].allocated = false; + return err; +} + +void mlx5_sf_hw_table_sf_free(struct mlx5_core_dev *dev, u16 id) +{ + struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; + u16 hw_fn_id; + + hw_fn_id = mlx5_sf_sw_to_hw_id(table->dev, id); + mlx5_cmd_dealloc_sf(table->dev, hw_fn_id); + table->sfs[id].allocated = false; +} + +int mlx5_sf_hw_table_init(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_hw_table *table; + struct mlx5_sf_hw *sfs; + int max_functions; + + if (!mlx5_sf_supported(dev)) + return 0; + + max_functions = mlx5_sf_max_functions(dev); + table = kzalloc(sizeof(*table), GFP_KERNEL); + if (!table) + return -ENOMEM; + + sfs = kcalloc(max_functions, sizeof(*sfs), GFP_KERNEL); + if (!sfs) + goto table_err; + + table->dev = dev; + table->sfs = sfs; + table->max_local_functions = max_functions; + table->ecpu = mlx5_read_embedded_cpu(dev); + dev->priv.sf_hw_table = table; + mlx5_core_dbg(dev, "SF HW table: max sfs = %d\n", max_functions); + return 0; + +table_err: + kfree(table); + return -ENOMEM; +} + +void mlx5_sf_hw_table_cleanup(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; + + if (!table) + return; + + kfree(table->sfs); + kfree(table); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/priv.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/priv.h new file mode 100644 index 000000000000..7f3622375a9c --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/priv.h @@ -0,0 +1,17 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2020 Mellanox Technologies Ltd */ + +#ifndef __MLX5_SF_PRIV_H__ +#define __MLX5_SF_PRIV_H__ + +#include + +int mlx5_cmd_alloc_sf(struct mlx5_core_dev *dev, u16 function_id); +int mlx5_cmd_dealloc_sf(struct mlx5_core_dev *dev, u16 function_id); + +u16 mlx5_sf_sw_to_hw_id(const struct mlx5_core_dev *dev, u16 sw_id); + +int mlx5_sf_hw_table_sf_alloc(struct mlx5_core_dev *dev, u32 usr_sfnum); +void mlx5_sf_hw_table_sf_free(struct mlx5_core_dev *dev, u16 id); + +#endif diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h index 623191679b49..dd23b6c2d887 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h @@ -28,6 +28,16 @@ static inline u16 mlx5_sf_max_functions(const struct mlx5_core_dev *dev) return 1 << MLX5_CAP_GEN(dev, log_max_sf); } +int mlx5_sf_hw_table_init(struct mlx5_core_dev *dev); +void mlx5_sf_hw_table_cleanup(struct mlx5_core_dev *dev); + +int mlx5_sf_table_init(struct mlx5_core_dev *dev); +void mlx5_sf_table_cleanup(struct mlx5_core_dev *dev); + +int mlx5_devlink_sf_port_new(struct devlink *devlink, const struct devlink_port_new_attrs *add_attr, + struct netlink_ext_ack *extack); +int mlx5_devlink_sf_port_del(struct devlink *devlink, unsigned int port_index, + struct netlink_ext_ack *extack); #else static inline bool mlx5_sf_supported(const struct mlx5_core_dev *dev) @@ -40,6 +50,24 @@ static inline u16 mlx5_sf_max_functions(const struct mlx5_core_dev *dev) return 0; } +static inline int mlx5_sf_hw_table_init(struct mlx5_core_dev *dev) +{ + return 0; +} + +static inline void mlx5_sf_hw_table_cleanup(struct mlx5_core_dev *dev) +{ +} + +static inline int mlx5_sf_table_init(struct mlx5_core_dev *dev) +{ + return 0; +} + +static inline void mlx5_sf_table_cleanup(struct mlx5_core_dev *dev) +{ +} + #endif #endif diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h index 48e3638b1185..7e357c7f0d5e 100644 --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -510,6 +510,8 @@ struct mlx5_eq_table; struct mlx5_irq_table; struct mlx5_vhca_state_notifier; struct mlx5_sf_dev_table; +struct mlx5_sf_hw_table; +struct mlx5_sf_table; struct mlx5_rate_limit { u32 rate; @@ -611,6 +613,10 @@ struct mlx5_priv { struct mlx5_sf_dev_table *sf_dev_table; struct mlx5_core_dev *parent_mdev; #endif +#ifdef CONFIG_MLX5_SF_MANAGER + struct mlx5_sf_hw_table *sf_hw_table; + struct mlx5_sf_table *sf_table; +#endif }; enum mlx5_device_state { From patchwork Tue Dec 15 09:03:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 344326 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 181B0C2BB48 for ; Tue, 15 Dec 2020 09:06:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C706722257 for ; Tue, 15 Dec 2020 09:06:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727903AbgLOJGG (ORCPT ); Tue, 15 Dec 2020 04:06:06 -0500 Received: from mail.kernel.org ([198.145.29.99]:36482 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727852AbgLOJFh (ORCPT ); Tue, 15 Dec 2020 04:05:37 -0500 From: Saeed Mahameed Authentication-Results: mail.kernel.org; dkim=permerror (bad message/signature format) To: "David S. Miller" , Jakub Kicinski , Jason Gunthorpe Cc: Leon Romanovsky , netdev@vger.kernel.org, linux-rdma@vger.kernel.org, David Ahern , Jacob Keller , Sridhar Samudrala , david.m.ertman@intel.com, dan.j.williams@intel.com, kiran.patil@intel.com, gregkh@linuxfoundation.org, Parav Pandit , Vu Pham , Saeed Mahameed Subject: [net-next v5 12/15] net/mlx5: SF, Port function state change support Date: Tue, 15 Dec 2020 01:03:55 -0800 Message-Id: <20201215090358.240365-13-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201215090358.240365-1-saeed@kernel.org> References: <20201215090358.240365-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Parav Pandit Support changing the state of the SF port's function through devlink. When activating the SF port's function, enable the hca in the device followed by adding its auxiliary device. When deactivating the SF port's function, delete its auxiliary device followed by disabling the vHCA. Port function attributes get/set callbacks are invoked with devlink instance lock held. Such callbacks need to synchronize with sf port table getting disabled either via sriov sysfs callback. Such callbacks synchronize with table disable context holding table refcount. $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev $ devlink port show pci/0000:06:00.0/65535: type eth netdev ens2f0np0 flavour physical port 0 splittable false $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88 $ devlink port show ens2f0npf0sf88 pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false function: hw_addr 00:00:00:00:88:88 state inactive opstate detached $ devlink port function set pci/0000:06:00.0/32768 hw_addr 00:00:00:00:88:88 state active $ devlink port show ens2f0npf0sf88 -jp { "port": { "pci/0000:06:00.0/32768": { "type": "eth", "netdev": "ens2f0npf0sf88", "flavour": "pcisf", "controller": 0, "pfnum": 0, "sfnum": 88, "external": false, "splittable": false, "function": { "hw_addr": "00:00:00:00:88:88", "state": "active", "opstate": "attached" } } } } On port function activation, an auxiliary device is created in below example. $ devlink dev show devlink dev show auxiliary/mlx5_core.sf.4 $ devlink port show auxiliary/mlx5_core.sf.4/1 auxiliary/mlx5_core.sf.4/1: type eth netdev p0sf88 flavour virtual port 0 splittable false Signed-off-by: Parav Pandit Reviewed-by: Vu Pham Signed-off-by: Saeed Mahameed --- .../net/ethernet/mellanox/mlx5/core/devlink.c | 2 + .../net/ethernet/mellanox/mlx5/core/main.c | 10 + .../net/ethernet/mellanox/mlx5/core/sf/cmd.c | 22 ++ .../ethernet/mellanox/mlx5/core/sf/devlink.c | 284 ++++++++++++++++-- .../ethernet/mellanox/mlx5/core/sf/hw_table.c | 116 ++++++- .../net/ethernet/mellanox/mlx5/core/sf/priv.h | 4 + .../net/ethernet/mellanox/mlx5/core/sf/sf.h | 19 ++ 7 files changed, 431 insertions(+), 26 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c index d4c0cdf5edd9..75d950d95fcf 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c @@ -195,6 +195,8 @@ static const struct devlink_ops mlx5_devlink_ops = { #ifdef CONFIG_MLX5_SF_MANAGER .port_new = mlx5_devlink_sf_port_new, .port_del = mlx5_devlink_sf_port_del, + .port_function_state_get = mlx5_devlink_sf_port_fn_state_get, + .port_function_state_set = mlx5_devlink_sf_port_fn_state_set, #endif .flash_update = mlx5_devlink_flash_update, .info_get = mlx5_devlink_info_get, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index 435323088ce0..f6b885fdd5c8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -75,6 +75,7 @@ #include "diag/rsc_dump.h" #include "sf/vhca_event.h" #include "sf/dev/dev.h" +#include "sf/sf.h" MODULE_AUTHOR("Eli Cohen "); MODULE_DESCRIPTION("Mellanox 5th generation network adapters (ConnectX series) core driver"); @@ -1161,6 +1162,12 @@ static int mlx5_load(struct mlx5_core_dev *dev) mlx5_vhca_event_start(dev); + err = mlx5_sf_hw_table_create(dev); + if (err) { + mlx5_core_err(dev, "sf table create failed %d\n", err); + goto err_vhca; + } + err = mlx5_ec_init(dev); if (err) { mlx5_core_err(dev, "Failed to init embedded CPU\n"); @@ -1180,6 +1187,8 @@ static int mlx5_load(struct mlx5_core_dev *dev) err_sriov: mlx5_ec_cleanup(dev); err_ec: + mlx5_sf_hw_table_destroy(dev); +err_vhca: mlx5_vhca_event_stop(dev); mlx5_cleanup_fs(dev); err_fs: @@ -1209,6 +1218,7 @@ static void mlx5_unload(struct mlx5_core_dev *dev) mlx5_sf_dev_table_destroy(dev); mlx5_sriov_detach(dev); mlx5_ec_cleanup(dev); + mlx5_sf_hw_table_destroy(dev); mlx5_vhca_event_stop(dev); mlx5_cleanup_fs(dev); mlx5_accel_ipsec_cleanup(dev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/cmd.c index 0bc3075f34fa..a8d75c2f0275 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/cmd.c @@ -25,3 +25,25 @@ int mlx5_cmd_dealloc_sf(struct mlx5_core_dev *dev, u16 function_id) return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); } + +int mlx5_cmd_sf_enable_hca(struct mlx5_core_dev *dev, u16 func_id) +{ + u32 out[MLX5_ST_SZ_DW(enable_hca_out)] = {}; + u32 in[MLX5_ST_SZ_DW(enable_hca_in)] = {}; + + MLX5_SET(enable_hca_in, in, opcode, MLX5_CMD_OP_ENABLE_HCA); + MLX5_SET(enable_hca_in, in, function_id, func_id); + MLX5_SET(enable_hca_in, in, embedded_cpu_function, 0); + return mlx5_cmd_exec(dev, &in, sizeof(in), &out, sizeof(out)); +} + +int mlx5_cmd_sf_disable_hca(struct mlx5_core_dev *dev, u16 func_id) +{ + u32 out[MLX5_ST_SZ_DW(disable_hca_out)] = {}; + u32 in[MLX5_ST_SZ_DW(disable_hca_in)] = {}; + + MLX5_SET(disable_hca_in, in, opcode, MLX5_CMD_OP_DISABLE_HCA); + MLX5_SET(disable_hca_in, in, function_id, func_id); + MLX5_SET(enable_hca_in, in, embedded_cpu_function, 0); + return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c index 09365f36a513..eb5c536ff1d5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c @@ -4,11 +4,17 @@ #include #include "eswitch.h" #include "priv.h" +#include "sf/dev/dev.h" +#include "mlx5_ifc_vhca_event.h" +#include "vhca_event.h" +#include "ecpf.h" struct mlx5_sf { struct devlink_port dl_port; unsigned int port_index; u16 id; + u16 hw_fn_id; + u16 hw_state; }; struct mlx5_sf_table { @@ -16,7 +22,10 @@ struct mlx5_sf_table { struct xarray port_indices; /* port index based lookup. */ refcount_t refcount; struct completion disable_complete; + struct mutex sf_state_lock; /* Serializes sf state among user cmds & vhca event handler. */ struct notifier_block esw_nb; + struct notifier_block vhca_nb; + u8 ecpu: 1; }; static struct mlx5_sf * @@ -25,6 +34,19 @@ mlx5_sf_lookup_by_index(struct mlx5_sf_table *table, unsigned int port_index) return xa_load(&table->port_indices, port_index); } +static struct mlx5_sf * +mlx5_sf_lookup_by_function_id(struct mlx5_sf_table *table, unsigned int fn_id) +{ + unsigned long index; + struct mlx5_sf *sf; + + xa_for_each(&table->port_indices, index, sf) { + if (sf->hw_fn_id == fn_id) + return sf; + } + return NULL; +} + static int mlx5_sf_id_insert(struct mlx5_sf_table *table, struct mlx5_sf *sf) { return xa_insert(&table->port_indices, sf->port_index, sf, GFP_KERNEL); @@ -59,6 +81,8 @@ mlx5_sf_alloc(struct mlx5_sf_table *table, u32 sfnum, struct netlink_ext_ack *ex hw_fn_id = mlx5_sf_sw_to_hw_id(table->dev, sf->id); dl_port_index = mlx5_esw_vport_to_devlink_port_index(table->dev, hw_fn_id); sf->port_index = dl_port_index; + sf->hw_fn_id = hw_fn_id; + sf->hw_state = MLX5_VHCA_STATE_ALLOCATED; err = mlx5_sf_id_insert(table, sf); if (err) @@ -99,6 +123,146 @@ static void mlx5_sf_table_put(struct mlx5_sf_table *table) complete(&table->disable_complete); } +static enum devlink_port_function_state mlx5_sf_to_devlink_state(u8 hw_state) +{ + switch (hw_state) { + case MLX5_VHCA_STATE_ACTIVE: + case MLX5_VHCA_STATE_IN_USE: + case MLX5_VHCA_STATE_TEARDOWN_REQUEST: + return DEVLINK_PORT_FUNCTION_STATE_ACTIVE; + case MLX5_VHCA_STATE_INVALID: + case MLX5_VHCA_STATE_ALLOCATED: + default: + return DEVLINK_PORT_FUNCTION_STATE_INACTIVE; + } +} + +static enum devlink_port_function_opstate mlx5_sf_to_devlink_opstate(u8 hw_state) +{ + switch (hw_state) { + case MLX5_VHCA_STATE_IN_USE: + case MLX5_VHCA_STATE_TEARDOWN_REQUEST: + return DEVLINK_PORT_FUNCTION_OPSTATE_ATTACHED; + case MLX5_VHCA_STATE_INVALID: + case MLX5_VHCA_STATE_ALLOCATED: + case MLX5_VHCA_STATE_ACTIVE: + default: + return DEVLINK_PORT_FUNCTION_OPSTATE_DETACHED; + } +} + +static bool mlx5_sf_is_active(const struct mlx5_sf *sf) +{ + return sf->hw_state == MLX5_VHCA_STATE_ACTIVE || sf->hw_state == MLX5_VHCA_STATE_IN_USE; +} + +int mlx5_devlink_sf_port_fn_state_get(struct devlink *devlink, struct devlink_port *dl_port, + enum devlink_port_function_state *state, + enum devlink_port_function_opstate *opstate, + struct netlink_ext_ack *extack) +{ + struct mlx5_core_dev *dev = devlink_priv(devlink); + struct mlx5_sf_table *table; + struct mlx5_sf *sf; + int err = 0; + + table = mlx5_sf_table_try_get(dev); + if (!table) + return -EOPNOTSUPP; + + sf = mlx5_sf_lookup_by_index(table, dl_port->index); + if (!sf) { + err = -EOPNOTSUPP; + goto sf_err; + } + mutex_lock(&table->sf_state_lock); + *state = mlx5_sf_to_devlink_state(sf->hw_state); + *opstate = mlx5_sf_to_devlink_opstate(sf->hw_state); + mutex_unlock(&table->sf_state_lock); +sf_err: + mlx5_sf_table_put(table); + return err; +} + +static int mlx5_sf_activate(struct mlx5_core_dev *dev, struct mlx5_sf *sf) +{ + int err; + + if (mlx5_sf_is_active(sf)) + return 0; + if (sf->hw_state != MLX5_VHCA_STATE_ALLOCATED) + return -EINVAL; + + err = mlx5_cmd_sf_enable_hca(dev, sf->hw_fn_id); + if (err) + return err; + + sf->hw_state = MLX5_VHCA_STATE_ACTIVE; + return 0; +} + +static int mlx5_sf_deactivate(struct mlx5_core_dev *dev, struct mlx5_sf *sf) +{ + int err; + + if (!mlx5_sf_is_active(sf)) + return 0; + + err = mlx5_cmd_sf_disable_hca(dev, sf->hw_fn_id); + if (err) + return err; + + sf->hw_state = MLX5_VHCA_STATE_TEARDOWN_REQUEST; + return 0; +} + +static int mlx5_sf_state_set(struct mlx5_core_dev *dev, struct mlx5_sf_table *table, + struct mlx5_sf *sf, + enum devlink_port_function_state state) +{ + int err = 0; + + mutex_lock(&table->sf_state_lock); + if (state == mlx5_sf_to_devlink_state(sf->hw_state)) + goto out; + if (state == DEVLINK_PORT_FUNCTION_STATE_ACTIVE) + err = mlx5_sf_activate(dev, sf); + else if (state == DEVLINK_PORT_FUNCTION_STATE_INACTIVE) + err = mlx5_sf_deactivate(dev, sf); + else + err = -EINVAL; +out: + mutex_unlock(&table->sf_state_lock); + return err; +} + +int mlx5_devlink_sf_port_fn_state_set(struct devlink *devlink, struct devlink_port *dl_port, + enum devlink_port_function_state state, + struct netlink_ext_ack *extack) +{ + struct mlx5_core_dev *dev = devlink_priv(devlink); + struct mlx5_sf_table *table; + struct mlx5_sf *sf; + int err; + + table = mlx5_sf_table_try_get(dev); + if (!table) { + NL_SET_ERR_MSG_MOD(extack, + "Port state set is only supported in eswitch switchdev mode or SF ports are disabled."); + return -EOPNOTSUPP; + } + sf = mlx5_sf_lookup_by_index(table, dl_port->index); + if (!sf) { + err = -ENODEV; + goto out; + } + + err = mlx5_sf_state_set(dev, table, sf, state); +out: + mlx5_sf_table_put(table); + return err; +} + static int mlx5_sf_add(struct mlx5_core_dev *dev, struct mlx5_sf_table *table, const struct devlink_port_new_attrs *new_attr, struct netlink_ext_ack *extack) @@ -123,16 +287,6 @@ static int mlx5_sf_add(struct mlx5_core_dev *dev, struct mlx5_sf_table *table, return err; } -static void mlx5_sf_del(struct mlx5_core_dev *dev, struct mlx5_sf_table *table, struct mlx5_sf *sf) -{ - struct mlx5_eswitch *esw = dev->priv.eswitch; - u16 hw_fn_id; - - hw_fn_id = mlx5_sf_sw_to_hw_id(dev, sf->id); - mlx5_esw_offloads_sf_vport_disable(esw, hw_fn_id); - mlx5_sf_free(table, sf); -} - static int mlx5_sf_new_check_attr(struct mlx5_core_dev *dev, const struct devlink_port_new_attrs *new_attr, struct netlink_ext_ack *extack) @@ -184,10 +338,30 @@ int mlx5_devlink_sf_port_new(struct devlink *devlink, const struct devlink_port_ return err; } +static void mlx5_sf_dealloc(struct mlx5_sf_table *table, struct mlx5_sf *sf) +{ + if (sf->hw_state == MLX5_VHCA_STATE_ALLOCATED) { + mlx5_sf_free(table, sf); + } else if (mlx5_sf_is_active(sf)) { + /* Even if its active, it is treated as in_use because by the time, + * it is disabled here, it may getting used. So it is safe to + * always look for the event to ensure that it is recycled only after + * firmware gives confirmation that it is detached by the driver. + */ + mlx5_cmd_sf_disable_hca(table->dev, sf->hw_fn_id); + mlx5_sf_hw_table_sf_deferred_free(table->dev, sf->id); + kfree(sf); + } else { + mlx5_sf_hw_table_sf_deferred_free(table->dev, sf->id); + kfree(sf); + } +} + int mlx5_devlink_sf_port_del(struct devlink *devlink, unsigned int port_index, struct netlink_ext_ack *extack) { struct mlx5_core_dev *dev = devlink_priv(devlink); + struct mlx5_eswitch *esw = dev->priv.eswitch; struct mlx5_sf_table *table; struct mlx5_sf *sf; int err = 0; @@ -204,20 +378,58 @@ int mlx5_devlink_sf_port_del(struct devlink *devlink, unsigned int port_index, goto sf_err; } - mlx5_sf_del(dev, table, sf); + mlx5_esw_offloads_sf_vport_disable(esw, sf->hw_fn_id); + mlx5_sf_id_erase(table, sf); + + mutex_lock(&table->sf_state_lock); + mlx5_sf_dealloc(table, sf); + mutex_unlock(&table->sf_state_lock); sf_err: mlx5_sf_table_put(table); return err; } -static void mlx5_sf_destroy_all(struct mlx5_sf_table *table) +static bool mlx5_sf_state_update_check(const struct mlx5_sf *sf, u8 new_state) { - struct mlx5_core_dev *dev = table->dev; - unsigned long index; + if (sf->hw_state == MLX5_VHCA_STATE_ACTIVE && new_state == MLX5_VHCA_STATE_IN_USE) + return true; + + if (sf->hw_state == MLX5_VHCA_STATE_IN_USE && new_state == MLX5_VHCA_STATE_ACTIVE) + return true; + + if (sf->hw_state == MLX5_VHCA_STATE_TEARDOWN_REQUEST && + new_state == MLX5_VHCA_STATE_ALLOCATED) + return true; + + return false; +} + +static int mlx5_sf_vhca_event(struct notifier_block *nb, unsigned long opcode, void *data) +{ + struct mlx5_sf_table *table = container_of(nb, struct mlx5_sf_table, vhca_nb); + const struct mlx5_vhca_state_event *event = data; + bool update = false; struct mlx5_sf *sf; - xa_for_each(&table->port_indices, index, sf) - mlx5_sf_del(dev, table, sf); + table = mlx5_sf_table_try_get(table->dev); + if (!table) + return 0; + + mutex_lock(&table->sf_state_lock); + sf = mlx5_sf_lookup_by_function_id(table, event->function_id); + if (!sf) + goto sf_err; + + /* When driver is attached or detached to a function, an event + * notifies such state change. + */ + update = mlx5_sf_state_update_check(sf, event->new_vhca_state); + if (update) + sf->hw_state = event->new_vhca_state; +sf_err: + mutex_unlock(&table->sf_state_lock); + mlx5_sf_table_put(table); + return 0; } static void mlx5_sf_table_enable(struct mlx5_sf_table *table) @@ -229,6 +441,22 @@ static void mlx5_sf_table_enable(struct mlx5_sf_table *table) refcount_set(&table->refcount, 1); } +static void mlx5_sf_deactivate_all(struct mlx5_sf_table *table) +{ + struct mlx5_eswitch *esw = table->dev->priv.eswitch; + unsigned long index; + struct mlx5_sf *sf; + + /* At this point, no new user commands can start and no vhca event can + * arrive. It is safe to destroy all user created SFs. + */ + xa_for_each(&table->port_indices, index, sf) { + mlx5_esw_offloads_sf_vport_disable(esw, sf->hw_fn_id); + mlx5_sf_id_erase(table, sf); + mlx5_sf_dealloc(table, sf); + } +} + static void mlx5_sf_table_disable(struct mlx5_sf_table *table) { if (!mlx5_sf_max_functions(table->dev)) @@ -237,14 +465,13 @@ static void mlx5_sf_table_disable(struct mlx5_sf_table *table) if (!refcount_read(&table->refcount)) return; - /* Balances with refcount_set; drop the reference so that new user cmd cannot start. */ + /* Balances with refcount_set; drop the reference so that new user cmd cannot start + * and new vhca event handler cannnot run. + */ mlx5_sf_table_put(table); wait_for_completion(&table->disable_complete); - /* At this point, no new user commands can start. - * It is safe to destroy all user created SFs. - */ - mlx5_sf_destroy_all(table); + mlx5_sf_deactivate_all(table); } static int mlx5_sf_esw_event(struct notifier_block *nb, unsigned long event, void *data) @@ -276,23 +503,34 @@ int mlx5_sf_table_init(struct mlx5_core_dev *dev) struct mlx5_sf_table *table; int err; - if (!mlx5_sf_table_supported(dev)) + if (!mlx5_sf_table_supported(dev) || !mlx5_vhca_event_supported(dev)) return 0; table = kzalloc(sizeof(*table), GFP_KERNEL); if (!table) return -ENOMEM; + mutex_init(&table->sf_state_lock); table->dev = dev; xa_init(&table->port_indices); dev->priv.sf_table = table; + refcount_set(&table->refcount, 0); table->esw_nb.notifier_call = mlx5_sf_esw_event; err = mlx5_esw_event_notifier_register(dev->priv.eswitch, &table->esw_nb); if (err) goto reg_err; + + table->vhca_nb.notifier_call = mlx5_sf_vhca_event; + err = mlx5_vhca_event_notifier_register(table->dev, &table->vhca_nb); + if (err) + goto vhca_err; + return 0; +vhca_err: + mlx5_esw_event_notifier_unregister(dev->priv.eswitch, &table->esw_nb); reg_err: + mutex_destroy(&table->sf_state_lock); kfree(table); dev->priv.sf_table = NULL; return err; @@ -305,8 +543,10 @@ void mlx5_sf_table_cleanup(struct mlx5_core_dev *dev) if (!table) return; + mlx5_vhca_event_notifier_unregister(table->dev, &table->vhca_nb); mlx5_esw_event_notifier_unregister(dev->priv.eswitch, &table->esw_nb); WARN_ON(refcount_read(&table->refcount)); + mutex_destroy(&table->sf_state_lock); WARN_ON(!xa_empty(&table->port_indices)); kfree(table); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c index c7757f399e8a..58b6be0b03d7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c @@ -4,11 +4,14 @@ #include "vhca_event.h" #include "priv.h" #include "sf.h" +#include "mlx5_ifc_vhca_event.h" +#include "vhca_event.h" #include "ecpf.h" struct mlx5_sf_hw { u32 usr_sfnum; u8 allocated: 1; + u8 pending_delete: 1; }; struct mlx5_sf_hw_table { @@ -16,6 +19,8 @@ struct mlx5_sf_hw_table { struct mlx5_sf_hw *sfs; int max_local_functions; u8 ecpu: 1; + struct mutex table_lock; /* Serializes sf deletion and vhca state change handler. */ + struct notifier_block vhca_nb; }; u16 mlx5_sf_sw_to_hw_id(const struct mlx5_core_dev *dev, u16 sw_id) @@ -23,6 +28,11 @@ u16 mlx5_sf_sw_to_hw_id(const struct mlx5_core_dev *dev, u16 sw_id) return sw_id + mlx5_sf_start_function_id(dev); } +static u16 mlx5_sf_hw_to_sw_id(const struct mlx5_core_dev *dev, u16 hw_id) +{ + return hw_id - mlx5_sf_start_function_id(dev); +} + int mlx5_sf_hw_table_sf_alloc(struct mlx5_core_dev *dev, u32 usr_sfnum) { struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; @@ -34,10 +44,13 @@ int mlx5_sf_hw_table_sf_alloc(struct mlx5_core_dev *dev, u32 usr_sfnum) if (!table->max_local_functions) return -EOPNOTSUPP; + mutex_lock(&table->table_lock); /* Check if sf with same sfnum already exists or not. */ for (i = 0; i < table->max_local_functions; i++) { - if (table->sfs[i].allocated && table->sfs[i].usr_sfnum == usr_sfnum) - return -EEXIST; + if (table->sfs[i].allocated && table->sfs[i].usr_sfnum == usr_sfnum) { + err = -EEXIST; + goto exist_err; + } } /* Find the free entry and allocate the entry from the array */ @@ -63,16 +76,19 @@ int mlx5_sf_hw_table_sf_alloc(struct mlx5_core_dev *dev, u32 usr_sfnum) if (err) goto vhca_err; + mutex_unlock(&table->table_lock); return sw_id; vhca_err: mlx5_cmd_dealloc_sf(table->dev, hw_fn_id); err: table->sfs[i].allocated = false; +exist_err: + mutex_unlock(&table->table_lock); return err; } -void mlx5_sf_hw_table_sf_free(struct mlx5_core_dev *dev, u16 id) +static void _mlx5_sf_hw_id_free(struct mlx5_core_dev *dev, u16 id) { struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; u16 hw_fn_id; @@ -80,6 +96,50 @@ void mlx5_sf_hw_table_sf_free(struct mlx5_core_dev *dev, u16 id) hw_fn_id = mlx5_sf_sw_to_hw_id(table->dev, id); mlx5_cmd_dealloc_sf(table->dev, hw_fn_id); table->sfs[id].allocated = false; + table->sfs[id].pending_delete = false; +} + +void mlx5_sf_hw_table_sf_free(struct mlx5_core_dev *dev, u16 id) +{ + struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; + + mutex_lock(&table->table_lock); + _mlx5_sf_hw_id_free(dev, id); + mutex_unlock(&table->table_lock); +} + +void mlx5_sf_hw_table_sf_deferred_free(struct mlx5_core_dev *dev, u16 id) +{ + struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; + u32 out[MLX5_ST_SZ_DW(query_vhca_state_out)] = {}; + u16 hw_fn_id; + u8 state; + int err; + + hw_fn_id = mlx5_sf_sw_to_hw_id(dev, id); + mutex_lock(&table->table_lock); + err = mlx5_cmd_query_vhca_state(dev, hw_fn_id, table->ecpu, out, sizeof(out)); + if (err) + goto err; + state = MLX5_GET(query_vhca_state_out, out, vhca_state_context.vhca_state); + if (state == MLX5_VHCA_STATE_ALLOCATED) { + mlx5_cmd_dealloc_sf(table->dev, hw_fn_id); + table->sfs[id].allocated = false; + } else { + table->sfs[id].pending_delete = true; + } +err: + mutex_unlock(&table->table_lock); +} + +static void mlx5_sf_hw_dealloc_all(struct mlx5_sf_hw_table *table) +{ + int i; + + for (i = 0; i < table->max_local_functions; i++) { + if (table->sfs[i].allocated) + _mlx5_sf_hw_id_free(table->dev, i); + } } int mlx5_sf_hw_table_init(struct mlx5_core_dev *dev) @@ -88,7 +148,7 @@ int mlx5_sf_hw_table_init(struct mlx5_core_dev *dev) struct mlx5_sf_hw *sfs; int max_functions; - if (!mlx5_sf_supported(dev)) + if (!mlx5_sf_supported(dev) || !mlx5_vhca_event_supported(dev)) return 0; max_functions = mlx5_sf_max_functions(dev); @@ -100,6 +160,7 @@ int mlx5_sf_hw_table_init(struct mlx5_core_dev *dev) if (!sfs) goto table_err; + mutex_init(&table->table_lock); table->dev = dev; table->sfs = sfs; table->max_local_functions = max_functions; @@ -120,6 +181,53 @@ void mlx5_sf_hw_table_cleanup(struct mlx5_core_dev *dev) if (!table) return; + mutex_destroy(&table->table_lock); kfree(table->sfs); kfree(table); } + +static int mlx5_sf_hw_vhca_event(struct notifier_block *nb, unsigned long opcode, void *data) +{ + struct mlx5_sf_hw_table *table = container_of(nb, struct mlx5_sf_hw_table, vhca_nb); + const struct mlx5_vhca_state_event *event = data; + struct mlx5_sf_hw *sf_hw; + u16 sw_id; + + if (event->new_vhca_state != MLX5_VHCA_STATE_ALLOCATED) + return 0; + + sw_id = mlx5_sf_hw_to_sw_id(table->dev, event->function_id); + sf_hw = &table->sfs[sw_id]; + + mutex_lock(&table->table_lock); + /* SF driver notified through firmware that SF is finally detached. + * Hence recycle the sf hardware id for reuse. + */ + if (sf_hw->allocated && sf_hw->pending_delete) + _mlx5_sf_hw_id_free(table->dev, sw_id); + mutex_unlock(&table->table_lock); + return 0; +} + +int mlx5_sf_hw_table_create(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; + + if (!table) + return 0; + + table->vhca_nb.notifier_call = mlx5_sf_hw_vhca_event; + return mlx5_vhca_event_notifier_register(table->dev, &table->vhca_nb); +} + +void mlx5_sf_hw_table_destroy(struct mlx5_core_dev *dev) +{ + struct mlx5_sf_hw_table *table = dev->priv.sf_hw_table; + + if (!table) + return; + + mlx5_vhca_event_notifier_unregister(table->dev, &table->vhca_nb); + /* Dealloc SFs whose firmware event has been missed. */ + mlx5_sf_hw_dealloc_all(table); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/priv.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/priv.h index 7f3622375a9c..cb02a51d0986 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/priv.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/priv.h @@ -9,9 +9,13 @@ int mlx5_cmd_alloc_sf(struct mlx5_core_dev *dev, u16 function_id); int mlx5_cmd_dealloc_sf(struct mlx5_core_dev *dev, u16 function_id); +int mlx5_cmd_sf_enable_hca(struct mlx5_core_dev *dev, u16 func_id); +int mlx5_cmd_sf_disable_hca(struct mlx5_core_dev *dev, u16 func_id); + u16 mlx5_sf_sw_to_hw_id(const struct mlx5_core_dev *dev, u16 sw_id); int mlx5_sf_hw_table_sf_alloc(struct mlx5_core_dev *dev, u32 usr_sfnum); void mlx5_sf_hw_table_sf_free(struct mlx5_core_dev *dev, u16 id); +void mlx5_sf_hw_table_sf_deferred_free(struct mlx5_core_dev *dev, u16 id); #endif diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h b/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h index dd23b6c2d887..296fd070617e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h @@ -31,6 +31,9 @@ static inline u16 mlx5_sf_max_functions(const struct mlx5_core_dev *dev) int mlx5_sf_hw_table_init(struct mlx5_core_dev *dev); void mlx5_sf_hw_table_cleanup(struct mlx5_core_dev *dev); +int mlx5_sf_hw_table_create(struct mlx5_core_dev *dev); +void mlx5_sf_hw_table_destroy(struct mlx5_core_dev *dev); + int mlx5_sf_table_init(struct mlx5_core_dev *dev); void mlx5_sf_table_cleanup(struct mlx5_core_dev *dev); @@ -38,6 +41,13 @@ int mlx5_devlink_sf_port_new(struct devlink *devlink, const struct devlink_port_ struct netlink_ext_ack *extack); int mlx5_devlink_sf_port_del(struct devlink *devlink, unsigned int port_index, struct netlink_ext_ack *extack); +int mlx5_devlink_sf_port_fn_state_get(struct devlink *devlink, struct devlink_port *dl_port, + enum devlink_port_function_state *state, + enum devlink_port_function_opstate *opstate, + struct netlink_ext_ack *extack); +int mlx5_devlink_sf_port_fn_state_set(struct devlink *devlink, struct devlink_port *dl_port, + enum devlink_port_function_state state, + struct netlink_ext_ack *extack); #else static inline bool mlx5_sf_supported(const struct mlx5_core_dev *dev) @@ -59,6 +69,15 @@ static inline void mlx5_sf_hw_table_cleanup(struct mlx5_core_dev *dev) { } +static inline int mlx5_sf_hw_table_create(struct mlx5_core_dev *dev) +{ + return 0; +} + +static inline void mlx5_sf_hw_table_destroy(struct mlx5_core_dev *dev) +{ +} + static inline int mlx5_sf_table_init(struct mlx5_core_dev *dev) { return 0; From patchwork Tue Dec 15 09:03:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 344886 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95C36C4361B for ; Tue, 15 Dec 2020 09:07:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4FF7722225 for ; Tue, 15 Dec 2020 09:07:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728013AbgLOJHf (ORCPT ); Tue, 15 Dec 2020 04:07:35 -0500 Received: from mail.kernel.org ([198.145.29.99]:36484 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727084AbgLOJFg (ORCPT ); Tue, 15 Dec 2020 04:05:36 -0500 From: Saeed Mahameed Authentication-Results: mail.kernel.org; dkim=permerror (bad message/signature format) To: "David S. Miller" , Jakub Kicinski , Jason Gunthorpe Cc: Leon Romanovsky , netdev@vger.kernel.org, linux-rdma@vger.kernel.org, David Ahern , Jacob Keller , Sridhar Samudrala , david.m.ertman@intel.com, dan.j.williams@intel.com, kiran.patil@intel.com, gregkh@linuxfoundation.org, Parav Pandit , Jiri Pirko , Saeed Mahameed Subject: [net-next v5 13/15] devlink: Add devlink port documentation Date: Tue, 15 Dec 2020 01:03:56 -0800 Message-Id: <20201215090358.240365-14-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201215090358.240365-1-saeed@kernel.org> References: <20201215090358.240365-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Parav Pandit Added documentation for devlink port and port function related commands. Signed-off-by: Parav Pandit Reviewed-by: Jiri Pirko Reviewed-by: Jacob Keller Signed-off-by: Saeed Mahameed --- .../networking/devlink/devlink-port.rst | 118 ++++++++++++++++++ Documentation/networking/devlink/index.rst | 1 + 2 files changed, 119 insertions(+) create mode 100644 Documentation/networking/devlink/devlink-port.rst diff --git a/Documentation/networking/devlink/devlink-port.rst b/Documentation/networking/devlink/devlink-port.rst new file mode 100644 index 000000000000..4c910dbb01ca --- /dev/null +++ b/Documentation/networking/devlink/devlink-port.rst @@ -0,0 +1,118 @@ +.. SPDX-License-Identifier: GPL-2.0 + +.. _devlink_port: + +============ +Devlink Port +============ + +``devlink-port`` is a port that exists on the device. It has a logically +separate ingress/egress point of the device. A devlink port can be any one +of many flavours. A devlink port flavour along with port attributes +describe what a port represents. + +A device driver that intends to publish a devlink port sets the +devlink port attributes and registers the devlink port. + +Devlink port flavours are described below. + +.. list-table:: List of devlink port flavours + :widths: 33 90 + + * - Flavour + - Description + * - ``DEVLINK_PORT_FLAVOUR_PHYSICAL`` + - Any kind of physical port. This can be an eswitch physical port or any + other physical port on the device. + * - ``DEVLINK_PORT_FLAVOUR_DSA`` + - This indicates a DSA interconnect port. + * - ``DEVLINK_PORT_FLAVOUR_CPU`` + - This indicates a CPU port applicable only to DSA. + * - ``DEVLINK_PORT_FLAVOUR_PCI_PF`` + - This indicates an eswitch port representing a port of PCI + physical function (PF). + * - ``DEVLINK_PORT_FLAVOUR_PCI_VF`` + - This indicates an eswitch port representing a port of PCI + virtual function (VF). + * - ``DEVLINK_PORT_FLAVOUR_VIRTUAL`` + - This indicates a virtual port for the PCI virtual function. + +Devlink port can have a different type based on the link layer described below. + +.. list-table:: List of devlink port types + :widths: 23 90 + + * - Type + - Description + * - ``DEVLINK_PORT_TYPE_ETH`` + - Driver should set this port type when a link layer of the port is + Ethernet. + * - ``DEVLINK_PORT_TYPE_IB`` + - Driver should set this port type when a link layer of the port is + InfiniBand. + * - ``DEVLINK_PORT_TYPE_AUTO`` + - This type is indicated by the user when driver should detect the port + type automatically. + +PCI controllers +--------------- +In most cases a PCI device has only one controller. A controller consists of +potentially multiple physical and virtual functions. Such PCI function consists +of one or more ports. This port of the function is represented by the devlink +eswitch port. + +A PCI Device connected to multiple CPUs or multiple PCI root complexes or +SmartNIC, however, may have multiple controllers. For a device with multiple +controllers, each controller is distinguished by a unique controller number. +An eswitch on the PCI device support ports of multiple controllers. + +An example view of a system with two controllers:: + + --------------------------------------------------------- + | | + | --------- --------- ------- ------- | + ----------- | | vf(s) | | sf(s) | |vf(s)| |sf(s)| | + | server | | ------- ----/---- ---/----- ------- ---/--- ---/--- | + | pci rc |=== | pf0 |______/________/ | pf1 |___/_______/ | + | connect | | ------- ------- | + ----------- | | controller_num=1 (no eswitch) | + ------|-------------------------------------------------- + (internal wire) + | + --------------------------------------------------------- + | devlink eswitch ports and reps | + | ----------------------------------------------------- | + | |ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 |ctrl-0 | | + | |pf0 | pf0vfN | pf0sfN | pf1 | pf1vfN |pf1sfN | | + | ----------------------------------------------------- | + | |ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 |ctrl-1 | | + | |pf0 | pf0vfN | pf0sfN | pf1 | pf1vfN |pf1sfN | | + | ----------------------------------------------------- | + | | + | | + ----------- | --------- --------- ------- ------- | + | smartNIC| | | vf(s) | | sf(s) | |vf(s)| |sf(s)| | + | pci rc |==| ------- ----/---- ---/----- ------- ---/--- ---/--- | + | connect | | | pf0 |______/________/ | pf1 |___/_______/ | + ----------- | ------- ------- | + | | + | local controller_num=0 (eswitch) | + --------------------------------------------------------- + +In above example, external controller (identified by controller number = 1) +doesn't have eswitch. Local controller (identified by controller number = 0) +has the eswitch. Devlink instance on local controller has eswitch devlink +ports representing ports for both the controllers. + +Port function configuration +=========================== + +A user can configure the port function attribute before enumerating the +PCI function. Usually it means, user should configure port function attribute +before a bus specific device for the function is created. However, when +SRIOV is enabled, virtual function devices are created on the PCI bus. +Hence, function attribute should be configured before binding virtual +function device to the driver. + +User may set the hardware address of the function represented by the devlink +port function. For Ethernet port function this means a MAC address. diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst index d82874760ae2..aab79667f97b 100644 --- a/Documentation/networking/devlink/index.rst +++ b/Documentation/networking/devlink/index.rst @@ -18,6 +18,7 @@ general. devlink-info devlink-flash devlink-params + devlink-port devlink-region devlink-resource devlink-reload From patchwork Tue Dec 15 09:03:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 344324 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9D837C2BBCA for ; Tue, 15 Dec 2020 09:07:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4414822225 for ; Tue, 15 Dec 2020 09:07:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728012AbgLOJH2 (ORCPT ); Tue, 15 Dec 2020 04:07:28 -0500 Received: from mail.kernel.org ([198.145.29.99]:36486 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727846AbgLOJFg (ORCPT ); Tue, 15 Dec 2020 04:05:36 -0500 From: Saeed Mahameed Authentication-Results: mail.kernel.org; dkim=permerror (bad message/signature format) To: "David S. Miller" , Jakub Kicinski , Jason Gunthorpe Cc: Leon Romanovsky , netdev@vger.kernel.org, linux-rdma@vger.kernel.org, David Ahern , Jacob Keller , Sridhar Samudrala , david.m.ertman@intel.com, dan.j.williams@intel.com, kiran.patil@intel.com, gregkh@linuxfoundation.org, Parav Pandit , Saeed Mahameed Subject: [net-next v5 14/15] devlink: Extend devlink port documentation for subfunctions Date: Tue, 15 Dec 2020 01:03:57 -0800 Message-Id: <20201215090358.240365-15-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201215090358.240365-1-saeed@kernel.org> References: <20201215090358.240365-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Parav Pandit Add devlink port documentation for subfunction management. Signed-off-by: Parav Pandit Signed-off-by: Saeed Mahameed --- Documentation/driver-api/auxiliary_bus.rst | 2 + .../networking/devlink/devlink-port.rst | 89 ++++++++++++++++++- 2 files changed, 87 insertions(+), 4 deletions(-) diff --git a/Documentation/driver-api/auxiliary_bus.rst b/Documentation/driver-api/auxiliary_bus.rst index 2312506b0674..fff96c7ba7a8 100644 --- a/Documentation/driver-api/auxiliary_bus.rst +++ b/Documentation/driver-api/auxiliary_bus.rst @@ -1,5 +1,7 @@ .. SPDX-License-Identifier: GPL-2.0-only +.. _auxiliary_bus: + ============= Auxiliary Bus ============= diff --git a/Documentation/networking/devlink/devlink-port.rst b/Documentation/networking/devlink/devlink-port.rst index 4c910dbb01ca..c6924e7a341e 100644 --- a/Documentation/networking/devlink/devlink-port.rst +++ b/Documentation/networking/devlink/devlink-port.rst @@ -34,6 +34,9 @@ Devlink port flavours are described below. * - ``DEVLINK_PORT_FLAVOUR_PCI_VF`` - This indicates an eswitch port representing a port of PCI virtual function (VF). + * - ``DEVLINK_PORT_FLAVOUR_PCI_SF`` + - This indicates an eswitch port representing a port of PCI + subfunction (SF). * - ``DEVLINK_PORT_FLAVOUR_VIRTUAL`` - This indicates a virtual port for the PCI virtual function. @@ -57,9 +60,9 @@ Devlink port can have a different type based on the link layer described below. PCI controllers --------------- In most cases a PCI device has only one controller. A controller consists of -potentially multiple physical and virtual functions. Such PCI function consists -of one or more ports. This port of the function is represented by the devlink -eswitch port. +potentially multiple physical functions, virtual functions and subfunctions. +Such PCI function consists of one or more ports. This port of the function +is represented by the devlink eswitch port. A PCI Device connected to multiple CPUs or multiple PCI root complexes or SmartNIC, however, may have multiple controllers. For a device with multiple @@ -112,7 +115,85 @@ PCI function. Usually it means, user should configure port function attribute before a bus specific device for the function is created. However, when SRIOV is enabled, virtual function devices are created on the PCI bus. Hence, function attribute should be configured before binding virtual -function device to the driver. +function device to the driver. For subfunctions, this means user should +configure port function attribute before activating the port function. User may set the hardware address of the function represented by the devlink port function. For Ethernet port function this means a MAC address. + +Subfunctions +============ + +Subfunctions are lightweight functions that has parent PCI function on which +it is deployed. Subfunctions are created and deployed in unit of 1. Unlike +SRIOV VFs, they don't require their own PCI virtual function. They communicate +with the hardware through the parent PCI function. Subfunctions can possibly +scale better. + +To use a subfunction, 3 steps setup sequence is followed. +(1) create - create a subfunction; +(2) configure - configure subfunction attributes; +(3) deploy - deploy the subfunction; + +Subfunction management is done using devlink port user interface. +User performs setup on the subfunction management device. + +(1) Create +---------- +A subfunction is created using a devlink port interface. User adds the +subfunction by adding a devlink port of subfunction flavour. The devlink +kernel code calls down to subfunction management driver (devlink op) and asks +it to create a subfunction devlink port. Driver then instantiates the +subfunction port and any associated objects such as health reporters and +representor netdevice. + +(2) Configure +------------- +Subfunction devlink port is created but it is not active yet. That means the +entities are created on devlink side, the e-switch port representor is created, +but the subfunction device itself it not created. User might use e-switch port +representor to do settings, putting it into bridge, adding TC rules, etc. User +might as well configure the hardware address (such as MAC address) of the +subfunction while subfunction is inactive. + +(3) Deploy +---------- +Once subfunction is configured, user must activate it to use it. Upon +activation, subfunction management driver asks the subfunction management +device to instantiate the actual subfunction device on particular PCI function. +A subfunction device is created on the :ref:`Documentation/driver-api/auxiliary_bus.rst `. At this point matching +subfunction driver binds to the subfunction's auxiliary device. + +Terms and Definitions +===================== + +.. list-table:: Terms and Definitions + :widths: 22 90 + + * - Term + - Definitions + * - ``PCI device`` + - A physical PCI device having one or more PCI bus consists of one or + more PCI controllers. + * - ``PCI controller`` + - A controller consists of potentially multiple physical functions, + virtual functions and subfunctions. + * - ``Port function`` + - An object to manage the function of a port. + * - ``Subfunction`` + - A lightweight function that has parent PCI function on which it is + deployed. + * - ``Subfunction device`` + - A bus device of the subfunction, usually on a auxiliary bus. + * - ``Subfunction driver`` + - A device driver for the subfunction auxiliary device. + * - ``Subfunction management device`` + - A PCI physical function that supports subfunction management. + * - ``Subfunction management driver`` + - A device driver for PCI physical function that supports + subfunction management using devlink port interface. + * - ``Subfunction host driver`` + - A device driver for PCI physical function that host subfunction + devices. In most cases it is same as subfunction management driver. When + subfunction is used on external controller, subfunction management and + host drivers are different. From patchwork Tue Dec 15 09:03:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 344888 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 391AAC4361B for ; Tue, 15 Dec 2020 09:06:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E5A6422225 for ; Tue, 15 Dec 2020 09:06:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727945AbgLOJG0 (ORCPT ); Tue, 15 Dec 2020 04:06:26 -0500 Received: from mail.kernel.org ([198.145.29.99]:36404 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727881AbgLOJFv (ORCPT ); Tue, 15 Dec 2020 04:05:51 -0500 From: Saeed Mahameed Authentication-Results: mail.kernel.org; dkim=permerror (bad message/signature format) To: "David S. Miller" , Jakub Kicinski , Jason Gunthorpe Cc: Leon Romanovsky , netdev@vger.kernel.org, linux-rdma@vger.kernel.org, David Ahern , Jacob Keller , Sridhar Samudrala , david.m.ertman@intel.com, dan.j.williams@intel.com, kiran.patil@intel.com, gregkh@linuxfoundation.org, Parav Pandit , Saeed Mahameed Subject: [net-next v5 15/15] net/mlx5: Add devlink subfunction port documentation Date: Tue, 15 Dec 2020 01:03:58 -0800 Message-Id: <20201215090358.240365-16-saeed@kernel.org> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201215090358.240365-1-saeed@kernel.org> References: <20201215090358.240365-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Parav Pandit Add documentation for subfunction management using devlink port. Signed-off-by: Parav Pandit Signed-off-by: Saeed Mahameed --- .../device_drivers/ethernet/mellanox/mlx5.rst | 204 ++++++++++++++++++ 1 file changed, 204 insertions(+) diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst index a5eb22793bb9..07e38c044355 100644 --- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst +++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst @@ -12,6 +12,8 @@ Contents - `Enabling the driver and kconfig options`_ - `Devlink info`_ - `Devlink parameters`_ +- `mlx5 subfunction`_ +- `mlx5 port function`_ - `Devlink health reporters`_ - `mlx5 tracepoints`_ @@ -181,6 +183,208 @@ User command examples: values: cmode driverinit value true +mlx5 subfunction +================ +mlx5 supports subfunctions management using devlink port (see :ref:`Documentation/networking/devlink/devlink-port.rst `) interface. + +A Subfunction has its own function capabilities and its own resources. This +means a subfunction has its own dedicated queues(txq, rxq, cq, eq). These queues +are neither shared nor stolen from the parent PCI function. + +When subfunction is RDMA capable, it has its own QP1, GID table and rdma +resources neither shared nor stolen from the parent PCI function. + +A subfunction has dedicated window in PCI BAR space that is not shared +with the other subfunctions or parent PCI function. This ensures that all +class devices of the subfunction accesses only assigned PCI BAR space. + +A Subfunction supports eswitch representation through which it supports tc +offloads. User must configure eswitch to send/receive packets from/to +subfunction port. + +Subfunctions share PCI level resources such as PCI MSI-X IRQs with +the other subfunctions and/or with its parent PCI function. + +Example mlx5 software, system and device view:: + + _______ + | admin | + | user |---------- + |_______| | + | | + ____|____ __|______ _________________ + | | | | | | + | devlink | | tc tool | | user | + | tool | |_________| | applications | + |_________| | |_________________| + | | | | + | | | | Userspace + +---------|-------------|-------------------|----------|--------------------+ + | | +----------+ +----------+ Kernel + | | | netdev | | rdma dev | + | | +----------+ +----------+ + (devlink port add/del | ^ ^ + port function set) | | | + | | +---------------| + _____|___ | | _______|_______ + | | | | | mlx5 class | + | devlink | +------------+ | | drivers | + | kernel | | rep netdev | | |(mlx5_core,ib) | + |_________| +------------+ | |_______________| + | | | ^ + (devlink ops) | | (probe/remove) + _________|________ | | ____|________ + | subfunction | | +---------------+ | subfunction | + | management driver|----- | subfunction |---| driver | + | (mlx5_core) | | auxiliary dev | | (mlx5_core) | + |__________________| +---------------+ |_____________| + | ^ + (sf add/del, vhca events) | + | (device add/del) + _____|____ ____|________ + | | | subfunction | + | PCI NIC |---- activate/deactive events---->| host driver | + |__________| | (mlx5_core) | + |_____________| + +Subfunction is created using devlink port interface. + +- Change device to switchdev mode:: + + $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev + +- Add a devlink port of subfunction flavour:: + + $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88 + +- Show a devlink port of the subfunction:: + + $ devlink port show pci/0000:06:00.0/32768 + pci/0000:06:00.0/32768: type eth netdev enp6s0pf0sf88 flavour pcisf pfnum 0 sfnum 88 + function: + hw_addr 00:00:00:00:00:00 + +- Delete a devlink port of subfunction after use:: + + $ devlink port del pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88 + +mlx5 port function +================== +mlx5 driver provides mechanism to setup PCI VF/SF port function +attributes in unified way for smartnic and non-smartnic NICs. + +This is supported only when eswitch mode is set to switchdev. Port function +configuration of the PCI VF/SF is supported through devlink eswitch port. + +Port function attributes should be set before PCI VF/SF is enumerated by the +driver. + +MAC address setup +----------------- +mlx5 driver provides mechanism to setup the MAC address of the PCI VF/SF. + +Configured MAC address of the PCI VF/SF will be used by netdevice and rdma +device created for the PCI VF/SF. + +- Get MAC address of the VF identified by its unique devlink port index:: + + $ devlink port show pci/0000:06:00.0/2 + pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 + function: + hw_addr 00:00:00:00:00:00 + +- Set MAC address of the VF identified by its unique devlink port index:: + + $ devlink port function set pci/0000:06:00.0/2 hw_addr 00:11:22:33:44:55 + + $ devlink port show pci/0000:06:00.0/2 + pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 + function: + hw_addr 00:11:22:33:44:55 + +- Get MAC address of the SF identified by its unique devlink port index:: + + $ devlink port show pci/0000:06:00.0/32768 + pci/0000:06:00.0/32768: type eth netdev enp6s0pf0sf88 flavour pcisf pfnum 0 sfnum 88 + function: + hw_addr 00:00:00:00:00:00 + +- Set MAC address of the VF identified by its unique devlink port index:: + + $ devlink port function set pci/0000:06:00.0/32768 hw_addr 00:00:00:00:88:88 + + $ devlink port show pci/0000:06:00.0/32768 + pci/0000:06:00.0/32768: type eth netdev enp6s0pf0sf88 flavour pcivf pfnum 0 sfnum 88 + function: + hw_addr 00:00:00:00:88:88 + +SF state setup +-------------- +To use the SF, user must active the SF using SF port function state attribute. + +- Get state of the SF identified by its unique devlink port index:: + + $ devlink port show ens2f0npf0sf88 + pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false + function: + hw_addr 00:00:00:00:88:88 state inactive opstate detached + +- Activate the function and verify its state is active:: + + $ devlink port function set ens2f0npf0sf88 state active + + $ devlink port show ens2f0npf0sf88 + pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false + function: + hw_addr 00:00:00:00:88:88 state active opstate detached + +Upon function activation, PF driver instance gets the event from the device that +particular SF was activated. It's the cue to put the device on bus, probe it and +instantiate devlink instance and class specific auxiliary devices for it. + +- Show the auxiliary device and port of the subfunction:: + + $ devlink dev show + devlink dev show auxiliary/mlx5_core.sf.4 + + $ devlink port show auxiliary/mlx5_core.sf.4/1 + auxiliary/mlx5_core.sf.4/1: type eth netdev p0sf88 flavour virtual port 0 splittable false + + $ rdma link show mlx5_0/1 + link mlx5_0/1 state ACTIVE physical_state LINK_UP netdev p0sf88 + + $ rdma dev show + 8: rocep6s0f1: node_type ca fw 16.29.0550 node_guid 248a:0703:00b3:d113 sys_image_guid 248a:0703:00b3:d112 + 13: mlx5_0: node_type ca fw 16.29.0550 node_guid 0000:00ff:fe00:8888 sys_image_guid 248a:0703:00b3:d112 + +- Subfunction auxiliary device and class device hierarchy:: + + mlx5_core.sf.4 + (subfunction auxiliary device) + /\ + / \ + / \ + / \ + / \ + mlx5_core.eth.4 mlx5_core.rdma.4 + (sf eth aux dev) (sf rdma aux dev) + | | + | | + p0sf88 mlx5_0 + (sf netdev) (sf rdma device) + +Additionally SF port also gets the event when the driver attaches to the +auxiliary device of the subfunction. This results in changing the operational +state of the function. This provides visibility to user to decide when it is +safe to delete the SF port for graceful termination of the subfunction. + +- Show the SF port operational state:: + + $ devlink port show ens2f0npf0sf88 + pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false + function: + hw_addr 00:00:00:00:88:88 state active opstate attached + Devlink health reporters ========================