From patchwork Mon Dec 7 21:06:35 2020
X-Patchwork-Submitter: Boris Pismenny
X-Patchwork-Id: 339518
From: Boris Pismenny
To: kuba@kernel.org, davem@davemloft.net, saeedm@nvidia.com, hch@lst.de, sagi@grimberg.me, axboe@fb.com, kbusch@kernel.org, viro@zeniv.linux.org.uk, edumazet@google.com
Cc: boris.pismenny@gmail.com, linux-nvme@lists.infradead.org, netdev@vger.kernel.org, benishay@nvidia.com, ogerlitz@nvidia.com, yorayz@nvidia.com, Ben Ben-Ishay, Or Gerlitz, Yoray Zack
Subject: [PATCH v1 net-next 01/15] iov_iter: Skip copy in memcpy_to_page if src==dst
Date: Mon, 7 Dec 2020 23:06:35 +0200
Message-Id: <20201207210649.19194-2-borisp@mellanox.com>
In-Reply-To: <20201207210649.19194-1-borisp@mellanox.com>
References: <20201207210649.19194-1-borisp@mellanox.com>
X-Mailing-List: netdev@vger.kernel.org

When using direct data placement, the NIC writes some of the payload directly to the destination buffer and constructs the SKB such that it points to this data. As a result, the skb_copy_datagram_iter call would copy data that is already in place. This patch adds a check to skip that copy, and a static_key to enable the check only when TCP direct data placement is possible.
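
For illustration only (not part of this patch): the key is exported but never armed here, so a later consumer is expected to flip it. Because static_branch_inc()/static_branch_dec() are reference counted, a NIC driver that owns DDP queues could keep the branch enabled only while at least one offloaded queue exists, so non-offloaded workloads never pay the extra pointer comparison in memcpy_to_page(). The hook names below are hypothetical:

#include <linux/jump_label.h>
#include <linux/uio.h>		/* skip_copy_enabled, added by this patch */

/* Called when the first/any DDP queue is created. */
static void example_ddp_queue_added(void)
{
	static_branch_inc(&skip_copy_enabled);
}

/* Called when a DDP queue is torn down; branch goes back to
 * "false" once the last queue is gone.
 */
static void example_ddp_queue_removed(void)
{
	static_branch_dec(&skip_copy_enabled);
}
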
Signed-off-by: Boris Pismenny Signed-off-by: Ben Ben-Ishay Signed-off-by: Or Gerlitz Signed-off-by: Yoray Zack --- include/linux/uio.h | 2 ++ lib/iov_iter.c | 11 ++++++++++- 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/include/linux/uio.h b/include/linux/uio.h index 72d88566694e..05573d848ff5 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -282,4 +282,6 @@ int iov_iter_for_each_range(struct iov_iter *i, size_t bytes, int (*f)(struct kvec *vec, void *context), void *context); +extern struct static_key_false skip_copy_enabled; + #endif diff --git a/lib/iov_iter.c b/lib/iov_iter.c index 1635111c5bd2..206edb051135 100644 --- a/lib/iov_iter.c +++ b/lib/iov_iter.c @@ -15,6 +15,9 @@ #define PIPE_PARANOIA /* for now */ +DEFINE_STATIC_KEY_FALSE(skip_copy_enabled); +EXPORT_SYMBOL_GPL(skip_copy_enabled); + #define iterate_iovec(i, n, __v, __p, skip, STEP) { \ size_t left; \ size_t wanted = n; \ @@ -476,7 +479,13 @@ static void memcpy_from_page(char *to, struct page *page, size_t offset, size_t static void memcpy_to_page(struct page *page, size_t offset, const char *from, size_t len) { char *to = kmap_atomic(page); - memcpy(to + offset, from, len); + + if (static_branch_unlikely(&skip_copy_enabled)) { + if (to + offset != from) + memcpy(to + offset, from, len); + } else { + memcpy(to + offset, from, len); + } kunmap_atomic(to); } From patchwork Mon Dec 7 21:06:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Pismenny X-Patchwork-Id: 339520 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C9B37C4361B for ; Mon, 7 Dec 2020 21:08:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8CB7B238EC for ; Mon, 7 Dec 2020 21:08:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727431AbgLGVHv (ORCPT ); Mon, 7 Dec 2020 16:07:51 -0500 Received: from mail-il-dmz.mellanox.com ([193.47.165.129]:45704 "EHLO mellanox.co.il" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727007AbgLGVHt (ORCPT ); Mon, 7 Dec 2020 16:07:49 -0500 Received: from Internal Mail-Server by MTLPINE1 (envelope-from borisp@mellanox.com) with SMTP; 7 Dec 2020 23:06:53 +0200 Received: from gen-l-vrt-133.mtl.labs.mlnx. 
(gen-l-vrt-133.mtl.labs.mlnx [10.237.11.160]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 0B7L6qIC029788; Mon, 7 Dec 2020 23:06:53 +0200 From: Boris Pismenny To: kuba@kernel.org, davem@davemloft.net, saeedm@nvidia.com, hch@lst.de, sagi@grimberg.me, axboe@fb.com, kbusch@kernel.org, viro@zeniv.linux.org.uk, edumazet@google.com Cc: boris.pismenny@gmail.com, linux-nvme@lists.infradead.org, netdev@vger.kernel.org, benishay@nvidia.com, ogerlitz@nvidia.com, yorayz@nvidia.com Subject: [PATCH v1 net-next 04/15] net/tls: expose get_netdev_for_sock Date: Mon, 7 Dec 2020 23:06:38 +0200 Message-Id: <20201207210649.19194-5-borisp@mellanox.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20201207210649.19194-1-borisp@mellanox.com> References: <20201207210649.19194-1-borisp@mellanox.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org get_netdev_for_sock is a utility that is used to obtain the net_device structure from a connected socket. Later patches will use this for nvme-tcp DDP and DDP CRC offloads. Signed-off-by: Boris Pismenny Reviewed-by: Sagi Grimberg --- include/net/sock.h | 17 +++++++++++++++++ net/tls/tls_device.c | 20 ++------------------ 2 files changed, 19 insertions(+), 18 deletions(-) diff --git a/include/net/sock.h b/include/net/sock.h index 093b51719c69..a8f7393ea433 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2711,4 +2711,21 @@ void sock_set_sndtimeo(struct sock *sk, s64 secs); int sock_bind_add(struct sock *sk, struct sockaddr *addr, int addr_len); +/* Assume that the socket is already connected */ +static inline struct net_device *get_netdev_for_sock(struct sock *sk, bool hold) +{ + struct dst_entry *dst = sk_dst_get(sk); + struct net_device *netdev = NULL; + + if (likely(dst)) { + netdev = dst->dev; + if (hold) + dev_hold(netdev); + } + + dst_release(dst); + + return netdev; +} + #endif /* _SOCK_H */ diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c index a3ab2d3d4e4e..8c3bc8705efb 100644 --- a/net/tls/tls_device.c +++ b/net/tls/tls_device.c @@ -106,22 +106,6 @@ static void tls_device_queue_ctx_destruction(struct tls_context *ctx) spin_unlock_irqrestore(&tls_device_lock, flags); } -/* We assume that the socket is already connected */ -static struct net_device *get_netdev_for_sock(struct sock *sk) -{ - struct dst_entry *dst = sk_dst_get(sk); - struct net_device *netdev = NULL; - - if (likely(dst)) { - netdev = dst->dev; - dev_hold(netdev); - } - - dst_release(dst); - - return netdev; -} - static void destroy_record(struct tls_record_info *record) { int i; @@ -1104,7 +1088,7 @@ int tls_set_device_offload(struct sock *sk, struct tls_context *ctx) if (skb) TCP_SKB_CB(skb)->eor = 1; - netdev = get_netdev_for_sock(sk); + netdev = get_netdev_for_sock(sk, true); if (!netdev) { pr_err_ratelimited("%s: netdev not found\n", __func__); rc = -EINVAL; @@ -1180,7 +1164,7 @@ int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx) if (ctx->crypto_recv.info.version != TLS_1_2_VERSION) return -EOPNOTSUPP; - netdev = get_netdev_for_sock(sk); + netdev = get_netdev_for_sock(sk, true); if (!netdev) { pr_err_ratelimited("%s: netdev not found\n", __func__); return -EINVAL; From patchwork Mon Dec 7 21:06:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Pismenny X-Patchwork-Id: 339517 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 
From: Boris Pismenny
To: kuba@kernel.org, davem@davemloft.net, saeedm@nvidia.com, hch@lst.de, sagi@grimberg.me, axboe@fb.com, kbusch@kernel.org, viro@zeniv.linux.org.uk, edumazet@google.com
Cc: boris.pismenny@gmail.com, linux-nvme@lists.infradead.org, netdev@vger.kernel.org, benishay@nvidia.com, ogerlitz@nvidia.com, yorayz@nvidia.com, Ben Ben-Ishay, Or Gerlitz, Yoray Zack
Subject: [PATCH v1 net-next 06/15] nvme-tcp: Add DDP data-path
Date: Mon, 7 Dec 2020 23:06:40 +0200
Message-Id: <20201207210649.19194-7-borisp@mellanox.com>
In-Reply-To: <20201207210649.19194-1-borisp@mellanox.com>
References: <20201207210649.19194-1-borisp@mellanox.com>
X-Mailing-List: netdev@vger.kernel.org

Introduce the NVMe-TCP DDP data-path offload. Using this interface, the NIC hardware will scatter TCP payload directly to the BIO pages according to the command_id in the PDU. To maintain the correctness of the network stack, the driver is expected to construct SKBs that point to the BIO pages. The data-path interface contains two routines: tcp_ddp_setup/teardown. The setup provides the mapping from command_id to the request buffers, while the teardown removes this mapping. For efficiency, we introduce an asynchronous nvme completion, which is split between NVMe-TCP and the NIC driver as follows: NVMe-TCP performs the specific completion, while the NIC driver performs the generic blk-mq completion.
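
To make the ordering concrete, here is a hedged sketch of the life of one offloaded READ using the helpers introduced below; the wrapper function itself is illustrative, not code from this patch:

/* Illustrative flow of one offloaded READ command. */
static void example_offloaded_read_flow(struct nvme_tcp_queue *queue,
					struct request *rq, u16 command_id)
{
	/* 1. Before the command PDU is sent: map command_id to the
	 *    request's BIO pages so the NIC can place payload directly.
	 */
	nvme_tcp_setup_ddp(queue, command_id, rq);

	/* 2. ... the command goes out and data lands in rq's pages ... */

	/* 3. When the NVMe CQE arrives, the request is NOT completed yet;
	 *    status/result are parked in the request and the mapping is
	 *    invalidated first.
	 */
	nvme_tcp_teardown_ddp(queue, command_id, rq);

	/* 4. The NIC driver later invokes .ddp_teardown_done(rq), where
	 *    nvme_tcp_ddp_teardown_done() runs the generic blk-mq
	 *    completion via nvme_try_complete_req()/nvme_complete_rq().
	 */
}
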
Signed-off-by: Boris Pismenny Signed-off-by: Ben Ben-Ishay Signed-off-by: Or Gerlitz Signed-off-by: Yoray Zack --- drivers/nvme/host/tcp.c | 119 ++++++++++++++++++++++++++++++++++++++-- 1 file changed, 115 insertions(+), 4 deletions(-) diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index ef96e4a02bbd..534fd5c00f33 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -57,6 +57,11 @@ struct nvme_tcp_request { size_t offset; size_t data_sent; enum nvme_tcp_send_state state; + + bool offloaded; + struct tcp_ddp_io ddp; + __le16 status; + union nvme_result result; }; enum nvme_tcp_queue_flags { @@ -231,10 +236,74 @@ static inline size_t nvme_tcp_pdu_last_send(struct nvme_tcp_request *req, #ifdef CONFIG_TCP_DDP bool nvme_tcp_resync_request(struct sock *sk, u32 seq, u32 flags); +void nvme_tcp_ddp_teardown_done(void *ddp_ctx); const struct tcp_ddp_ulp_ops nvme_tcp_ddp_ulp_ops = { .resync_request = nvme_tcp_resync_request, + .ddp_teardown_done = nvme_tcp_ddp_teardown_done, }; +static +int nvme_tcp_teardown_ddp(struct nvme_tcp_queue *queue, + u16 command_id, + struct request *rq) +{ + struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq); + struct net_device *netdev = queue->ctrl->offloading_netdev; + int ret; + + if (unlikely(!netdev)) { + pr_info_ratelimited("%s: netdev not found\n", __func__); + return -EINVAL; + } + + ret = netdev->tcp_ddp_ops->tcp_ddp_teardown(netdev, queue->sock->sk, + &req->ddp, rq); + sg_free_table_chained(&req->ddp.sg_table, SG_CHUNK_SIZE); + req->offloaded = false; + return ret; +} + +void nvme_tcp_ddp_teardown_done(void *ddp_ctx) +{ + struct request *rq = ddp_ctx; + struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq); + + if (!nvme_try_complete_req(rq, cpu_to_le16(req->status << 1), req->result)) + nvme_complete_rq(rq); +} + +static +int nvme_tcp_setup_ddp(struct nvme_tcp_queue *queue, + u16 command_id, + struct request *rq) +{ + struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq); + struct net_device *netdev = queue->ctrl->offloading_netdev; + int ret; + + req->offloaded = false; + + if (unlikely(!netdev)) { + pr_info_ratelimited("%s: netdev not found\n", __func__); + return -EINVAL; + } + + req->ddp.command_id = command_id; + req->ddp.sg_table.sgl = req->ddp.first_sgl; + ret = sg_alloc_table_chained(&req->ddp.sg_table, blk_rq_nr_phys_segments(rq), + req->ddp.sg_table.sgl, SG_CHUNK_SIZE); + if (ret) + return -ENOMEM; + req->ddp.nents = blk_rq_map_sg(rq->q, rq, req->ddp.sg_table.sgl); + + ret = netdev->tcp_ddp_ops->tcp_ddp_setup(netdev, + queue->sock->sk, + &req->ddp); + if (!ret) + req->offloaded = true; + return ret; +} + static int nvme_tcp_offload_socket(struct nvme_tcp_queue *queue) { @@ -374,6 +443,25 @@ bool nvme_tcp_resync_request(struct sock *sk, u32 seq, u32 flags) #else +static +int nvme_tcp_setup_ddp(struct nvme_tcp_queue *queue, + u16 command_id, + struct request *rq) +{ + return -EINVAL; +} + +static +int nvme_tcp_teardown_ddp(struct nvme_tcp_queue *queue, + u16 command_id, + struct request *rq) +{ + return -EINVAL; +} + +void nvme_tcp_ddp_teardown_done(void *ddp_ctx) +{} + static int nvme_tcp_offload_socket(struct nvme_tcp_queue *queue) { @@ -651,6 +739,7 @@ static void nvme_tcp_error_recovery(struct nvme_ctrl *ctrl) static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue, struct nvme_completion *cqe) { + struct nvme_tcp_request *req; struct request *rq; rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), cqe->command_id); @@ -662,8 +751,15 @@ static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue, return 
-EINVAL; } - if (!nvme_try_complete_req(rq, cqe->status, cqe->result)) - nvme_complete_rq(rq); + req = blk_mq_rq_to_pdu(rq); + if (req->offloaded) { + req->status = cqe->status; + req->result = cqe->result; + nvme_tcp_teardown_ddp(queue, cqe->command_id, rq); + } else { + if (!nvme_try_complete_req(rq, cqe->status, cqe->result)) + nvme_complete_rq(rq); + } queue->nr_cqe++; return 0; @@ -857,9 +953,18 @@ static int nvme_tcp_recv_pdu(struct nvme_tcp_queue *queue, struct sk_buff *skb, static inline void nvme_tcp_end_request(struct request *rq, u16 status) { union nvme_result res = {}; + struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq); + struct nvme_tcp_queue *queue = req->queue; + struct nvme_tcp_data_pdu *pdu = (void *)queue->pdu; - if (!nvme_try_complete_req(rq, cpu_to_le16(status << 1), res)) - nvme_complete_rq(rq); + if (req->offloaded) { + req->status = cpu_to_le16(status << 1); + req->result = res; + nvme_tcp_teardown_ddp(queue, pdu->command_id, rq); + } else { + if (!nvme_try_complete_req(rq, cpu_to_le16(status << 1), res)) + nvme_complete_rq(rq); + } } static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb, @@ -1135,6 +1240,7 @@ static int nvme_tcp_try_send_cmd_pdu(struct nvme_tcp_request *req) bool inline_data = nvme_tcp_has_inline_data(req); u8 hdgst = nvme_tcp_hdgst_len(queue); int len = sizeof(*pdu) + hdgst - req->offset; + struct request *rq = blk_mq_rq_from_pdu(req); int flags = MSG_DONTWAIT; int ret; @@ -1143,6 +1249,10 @@ static int nvme_tcp_try_send_cmd_pdu(struct nvme_tcp_request *req) else flags |= MSG_EOR; + if (test_bit(NVME_TCP_Q_OFFLOADS, &queue->flags) && + blk_rq_nr_phys_segments(rq) && rq_data_dir(rq) == READ) + nvme_tcp_setup_ddp(queue, pdu->cmd.common.command_id, rq); + if (queue->hdr_digest && !req->offset) nvme_tcp_hdgst(queue->snd_hash, pdu, sizeof(*pdu)); @@ -2445,6 +2555,7 @@ static blk_status_t nvme_tcp_setup_cmd_pdu(struct nvme_ns *ns, req->data_len = blk_rq_nr_phys_segments(rq) ? 
blk_rq_payload_bytes(rq) : 0; req->curr_bio = rq->bio; + req->offloaded = false; if (rq_data_dir(rq) == WRITE && req->data_len <= nvme_tcp_inline_data_size(queue))
From patchwork Mon Dec 7 21:06:42 2020
X-Patchwork-Submitter: Boris Pismenny
X-Patchwork-Id: 339519
From: Boris Pismenny
To: kuba@kernel.org, davem@davemloft.net, saeedm@nvidia.com, hch@lst.de, sagi@grimberg.me, axboe@fb.com, kbusch@kernel.org, viro@zeniv.linux.org.uk, edumazet@google.com
Cc: boris.pismenny@gmail.com, linux-nvme@lists.infradead.org, netdev@vger.kernel.org, benishay@nvidia.com, ogerlitz@nvidia.com, yorayz@nvidia.com, Or Gerlitz, Ben Ben-Ishay, Yoray Zack
Subject: [PATCH v1 net-next 08/15] nvme-tcp: Deal with netdevice DOWN events
Date: Mon, 7 Dec 2020 23:06:42 +0200
Message-Id: <20201207210649.19194-9-borisp@mellanox.com>
In-Reply-To: <20201207210649.19194-1-borisp@mellanox.com>
References: <20201207210649.19194-1-borisp@mellanox.com>
X-Mailing-List: netdev@vger.kernel.org

From: Or Gerlitz

For DDP setup/teardown and resync, the offloading logic uses HW resources at the NIC driver, such as the SQ and CQ. These resources are destroyed when the netdevice goes down, so we must stop using them before the NIC driver destroys them. Use a netdevice notifier for that: offloaded connections are stopped before the stack goes on to call the NIC driver's close ndo. We use the existing error-recovery flow, which has the advantage of resuming the offload once the connection is re-established. This also gives us proper handling of the UNREGISTER event, because our offloading starts in the UP state and a DOWN event always occurs between UP and UNREGISTER.
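
For reference, that ordering assumption can be written out as follows. This is a generic, hypothetical notifier sketch, not the hook added below (the actual handler acts only on NETDEV_GOING_DOWN and then flushes nvme_reset_wq so teardown finishes before the close ndo runs):

#include <linux/netdevice.h>
#include <linux/notifier.h>

static int example_netdev_event(struct notifier_block *nb,
				unsigned long event, void *ptr)
{
	struct net_device *ndev = netdev_notifier_info_to_dev(ptr);

	pr_debug("netdev %s: event %lu\n", netdev_name(ndev), event);

	switch (event) {
	case NETDEV_GOING_DOWN:
		/* ndev's SQ/CQ resources are about to be destroyed:
		 * kick error recovery for every controller offloaded on
		 * ndev and wait for it to quiesce before returning.
		 */
		break;
	case NETDEV_UNREGISTER:
		/* Covered implicitly: an interface is always brought DOWN
		 * before it is unregistered, so the GOING_DOWN handling
		 * above has already stopped the offload.
		 */
		break;
	}
	return NOTIFY_DONE;
}
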
Signed-off-by: Or Gerlitz Signed-off-by: Boris Pismenny Signed-off-by: Ben Ben-Ishay Signed-off-by: Yoray Zack --- drivers/nvme/host/tcp.c | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 3c10c8876036..35c898fc833f 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -145,6 +145,7 @@ struct nvme_tcp_ctrl { static LIST_HEAD(nvme_tcp_ctrl_list); static DEFINE_MUTEX(nvme_tcp_ctrl_mutex); +static struct notifier_block nvme_tcp_netdevice_nb; static struct workqueue_struct *nvme_tcp_wq; static const struct blk_mq_ops nvme_tcp_mq_ops; static const struct blk_mq_ops nvme_tcp_admin_mq_ops; @@ -2900,6 +2901,27 @@ static struct nvme_ctrl *nvme_tcp_create_ctrl(struct device *dev, return ERR_PTR(ret); } +static int nvme_tcp_netdev_event(struct notifier_block *this, + unsigned long event, void *ptr) +{ + struct net_device *ndev = netdev_notifier_info_to_dev(ptr); + struct nvme_tcp_ctrl *ctrl; + + switch (event) { + case NETDEV_GOING_DOWN: + mutex_lock(&nvme_tcp_ctrl_mutex); + list_for_each_entry(ctrl, &nvme_tcp_ctrl_list, list) { + if (ndev != ctrl->offloading_netdev) + continue; + nvme_tcp_error_recovery(&ctrl->ctrl); + } + mutex_unlock(&nvme_tcp_ctrl_mutex); + flush_workqueue(nvme_reset_wq); + /* we assume that the going down part of error recovery is over */ + } + return NOTIFY_DONE; +} + static struct nvmf_transport_ops nvme_tcp_transport = { .name = "tcp", .module = THIS_MODULE, @@ -2914,13 +2936,26 @@ static struct nvmf_transport_ops nvme_tcp_transport = { static int __init nvme_tcp_init_module(void) { + int ret; + nvme_tcp_wq = alloc_workqueue("nvme_tcp_wq", WQ_MEM_RECLAIM | WQ_HIGHPRI, 0); if (!nvme_tcp_wq) return -ENOMEM; + nvme_tcp_netdevice_nb.notifier_call = nvme_tcp_netdev_event; + ret = register_netdevice_notifier(&nvme_tcp_netdevice_nb); + if (ret) { + pr_err("failed to register netdev notifier\n"); + goto out_err_reg_notifier; + } + nvmf_register_transport(&nvme_tcp_transport); return 0; + +out_err_reg_notifier: + destroy_workqueue(nvme_tcp_wq); + return ret; } static void __exit nvme_tcp_cleanup_module(void) @@ -2928,6 +2963,7 @@ static void __exit nvme_tcp_cleanup_module(void) struct nvme_tcp_ctrl *ctrl; nvmf_unregister_transport(&nvme_tcp_transport); + unregister_netdevice_notifier(&nvme_tcp_netdevice_nb); mutex_lock(&nvme_tcp_ctrl_mutex); list_for_each_entry(ctrl, &nvme_tcp_ctrl_list, list) From patchwork Mon Dec 7 21:06:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Pismenny X-Patchwork-Id: 339522 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 51828C433FE for ; Mon, 7 Dec 2020 21:07:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2387D238EC for ; Mon, 7 Dec 2020 21:07:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727534AbgLGVHy (ORCPT ); Mon, 7 Dec 2020 16:07:54 -0500 Received: from mail-il-dmz.mellanox.com 
([193.47.165.129]:45781 "EHLO mellanox.co.il" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727062AbgLGVHv (ORCPT ); Mon, 7 Dec 2020 16:07:51 -0500 Received: from Internal Mail-Server by MTLPINE1 (envelope-from borisp@mellanox.com) with SMTP; 7 Dec 2020 23:06:53 +0200 Received: from gen-l-vrt-133.mtl.labs.mlnx. (gen-l-vrt-133.mtl.labs.mlnx [10.237.11.160]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 0B7L6qIH029788; Mon, 7 Dec 2020 23:06:53 +0200 From: Boris Pismenny To: kuba@kernel.org, davem@davemloft.net, saeedm@nvidia.com, hch@lst.de, sagi@grimberg.me, axboe@fb.com, kbusch@kernel.org, viro@zeniv.linux.org.uk, edumazet@google.com Cc: boris.pismenny@gmail.com, linux-nvme@lists.infradead.org, netdev@vger.kernel.org, benishay@nvidia.com, ogerlitz@nvidia.com, yorayz@nvidia.com, Or Gerlitz , Yoray Zack Subject: [PATCH v1 net-next 09/15] net/mlx5: Header file changes for nvme-tcp offload Date: Mon, 7 Dec 2020 23:06:43 +0200 Message-Id: <20201207210649.19194-10-borisp@mellanox.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20201207210649.19194-1-borisp@mellanox.com> References: <20201207210649.19194-1-borisp@mellanox.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Ben Ben-ishay Add the necessary infrastructure for NVMEoTCP offload: - Add nvmeocp_en + nvmeotcp_crc_en bit to the TIR for identify NVMEoTCP offload flow And tag_buffer_id that will be used by the connected nvmeotcp_queues - Add new CQE field that will be used to pass scattered data information to SW - Add new capability to HCA_CAP that represnts the NVMEoTCP offload ability Signed-off-by: Ben Ben-ishay Signed-off-by: Boris Pismenny Signed-off-by: Or Gerlitz Signed-off-by: Yoray Zack --- include/linux/mlx5/device.h | 8 +++ include/linux/mlx5/mlx5_ifc.h | 104 +++++++++++++++++++++++++++++++++- include/linux/mlx5/qp.h | 1 + 3 files changed, 110 insertions(+), 3 deletions(-) diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h index 00057eae89ab..4433ff213d32 100644 --- a/include/linux/mlx5/device.h +++ b/include/linux/mlx5/device.h @@ -263,6 +263,7 @@ enum { enum { MLX5_MKEY_MASK_LEN = 1ull << 0, MLX5_MKEY_MASK_PAGE_SIZE = 1ull << 1, + MLX5_MKEY_MASK_XLT_OCT_SIZE = 1ull << 2, MLX5_MKEY_MASK_START_ADDR = 1ull << 6, MLX5_MKEY_MASK_PD = 1ull << 7, MLX5_MKEY_MASK_EN_RINVAL = 1ull << 8, @@ -1171,6 +1172,7 @@ enum mlx5_cap_type { MLX5_CAP_VDPA_EMULATION = 0x13, MLX5_CAP_DEV_EVENT = 0x14, MLX5_CAP_IPSEC, + MLX5_CAP_DEV_NVMEOTCP = 0x19, /* NUM OF CAP Types */ MLX5_CAP_NUM }; @@ -1391,6 +1393,12 @@ enum mlx5_qcam_feature_groups { #define MLX5_CAP_IPSEC(mdev, cap)\ MLX5_GET(ipsec_cap, (mdev)->caps.hca_cur[MLX5_CAP_IPSEC], cap) +#define MLX5_CAP_DEV_NVMEOTCP(mdev, cap)\ + MLX5_GET(nvmeotcp_cap, mdev->caps.hca_cur[MLX5_CAP_DEV_NVMEOTCP], cap) + +#define MLX5_CAP64_NVMEOTCP(mdev, cap)\ + MLX5_GET64(nvmeotcp_cap, mdev->caps.hca_cur[MLX5_CAP_DEV_NVMEOTCP], cap) + enum { MLX5_CMD_STAT_OK = 0x0, MLX5_CMD_STAT_INT_ERR = 0x1, diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h index a3510e81ab3b..e2b75c580c37 100644 --- a/include/linux/mlx5/mlx5_ifc.h +++ b/include/linux/mlx5/mlx5_ifc.h @@ -1271,7 +1271,9 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 log_max_srq_sz[0x8]; u8 log_max_qp_sz[0x8]; u8 event_cap[0x1]; - u8 reserved_at_91[0x7]; + u8 reserved_at_91[0x5]; + u8 nvmeotcp[0x1]; + u8 reserved_at_97[0x1]; u8 prio_tag_required[0x1]; u8 reserved_at_99[0x2]; u8 log_max_qp[0x5]; @@ -3020,6 +3022,21 @@ struct mlx5_ifc_roce_addr_layout_bits { u8 
reserved_at_e0[0x20]; }; +struct mlx5_ifc_nvmeotcp_cap_bits { + u8 zerocopy[0x1]; + u8 crc_rx[0x1]; + u8 crc_tx[0x1]; + u8 reserved_at_3[0x15]; + u8 version[0x8]; + + u8 reserved_at_20[0x13]; + u8 log_max_nvmeotcp_tag_buffer_table[0x5]; + u8 reserved_at_38[0x3]; + u8 log_max_nvmeotcp_tag_buffer_size[0x5]; + + u8 reserved_at_40[0x7c0]; +}; + union mlx5_ifc_hca_cap_union_bits { struct mlx5_ifc_cmd_hca_cap_bits cmd_hca_cap; struct mlx5_ifc_odp_cap_bits odp_cap; @@ -3036,6 +3053,7 @@ union mlx5_ifc_hca_cap_union_bits { struct mlx5_ifc_tls_cap_bits tls_cap; struct mlx5_ifc_device_mem_cap_bits device_mem_cap; struct mlx5_ifc_virtio_emulation_cap_bits virtio_emulation_cap; + struct mlx5_ifc_nvmeotcp_cap_bits nvmeotcp_cap; u8 reserved_at_0[0x8000]; }; @@ -3230,7 +3248,9 @@ struct mlx5_ifc_tirc_bits { u8 disp_type[0x4]; u8 tls_en[0x1]; - u8 reserved_at_25[0x1b]; + u8 nvmeotcp_zero_copy_en[0x1]; + u8 nvmeotcp_crc_en[0x1]; + u8 reserved_at_27[0x19]; u8 reserved_at_40[0x40]; @@ -3261,7 +3281,8 @@ struct mlx5_ifc_tirc_bits { struct mlx5_ifc_rx_hash_field_select_bits rx_hash_field_selector_inner; - u8 reserved_at_2c0[0x4c0]; + u8 nvmeotcp_tag_buffer_table_id[0x20]; + u8 reserved_at_2e0[0x4a0]; }; enum { @@ -10716,12 +10737,14 @@ enum { MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_ENCRYPTION_KEY = BIT(0xc), MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_IPSEC = BIT(0x13), MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_SAMPLER = BIT(0x20), + MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_NVMEOTCP_TAG_BUFFER_TABLE = BIT(0x21), }; enum { MLX5_GENERAL_OBJECT_TYPES_ENCRYPTION_KEY = 0xc, MLX5_GENERAL_OBJECT_TYPES_IPSEC = 0x13, MLX5_GENERAL_OBJECT_TYPES_SAMPLER = 0x20, + MLX5_GENERAL_OBJECT_TYPES_NVMEOTCP_TAG_BUFFER_TABLE = 0x21 }; enum { @@ -10823,6 +10846,20 @@ struct mlx5_ifc_create_sampler_obj_in_bits { struct mlx5_ifc_sampler_obj_bits sampler_object; }; +struct mlx5_ifc_nvmeotcp_tag_buf_table_obj_bits { + u8 modify_field_select[0x40]; + + u8 reserved_at_20[0x20]; + + u8 reserved_at_40[0x1b]; + u8 log_tag_buffer_table_size[0x5]; +}; + +struct mlx5_ifc_create_nvmeotcp_tag_buf_table_in_bits { + struct mlx5_ifc_general_obj_in_cmd_hdr_bits general_obj_in_cmd_hdr; + struct mlx5_ifc_nvmeotcp_tag_buf_table_obj_bits nvmeotcp_tag_buf_table_obj; +}; + enum { MLX5_GENERAL_OBJECT_TYPE_ENCRYPTION_KEY_KEY_SIZE_128 = 0x0, MLX5_GENERAL_OBJECT_TYPE_ENCRYPTION_KEY_KEY_SIZE_256 = 0x1, @@ -10833,6 +10870,18 @@ enum { MLX5_GENERAL_OBJECT_TYPE_ENCRYPTION_KEY_TYPE_IPSEC = 0x2, }; +enum { + MLX5_TRANSPORT_STATIC_PARAMS_ACC_TYPE_XTS = 0x0, + MLX5_TRANSPORT_STATIC_PARAMS_ACC_TYPE_TLS = 0x1, + MLX5_TRANSPORT_STATIC_PARAMS_ACC_TYPE_NVMETCP = 0x2, + MLX5_TRANSPORT_STATIC_PARAMS_ACC_TYPE_NVMETCP_WITH_TLS = 0x3, +}; + +enum { + MLX5_TRANSPORT_STATIC_PARAMS_TI_INITIATOR = 0x0, + MLX5_TRANSPORT_STATIC_PARAMS_TI_TARGET = 0x1, +}; + struct mlx5_ifc_tls_static_params_bits { u8 const_2[0x2]; u8 tls_version[0x4]; @@ -10873,4 +10922,53 @@ enum { MLX5_MTT_PERM_RW = MLX5_MTT_PERM_READ | MLX5_MTT_PERM_WRITE, }; +struct mlx5_ifc_nvmeotcp_progress_params_bits { + u8 valid[0x1]; + u8 reserved_at_1[0x7]; + u8 pd[0x18]; + + u8 next_pdu_tcp_sn[0x20]; + + u8 hw_resync_tcp_sn[0x20]; + + u8 pdu_tracker_state[0x2]; + u8 offloading_state[0x2]; + u8 reserved_at_64[0xc]; + u8 cccid_ttag[0x10]; +}; + +struct mlx5_ifc_transport_static_params_bits { + u8 const_2[0x2]; + u8 tls_version[0x4]; + u8 const_1[0x2]; + u8 reserved_at_8[0x14]; + u8 acc_type[0x4]; + + u8 reserved_at_20[0x20]; + + u8 initial_record_number[0x40]; + + u8 resync_tcp_sn[0x20]; + + u8 gcm_iv[0x20]; + + u8 implicit_iv[0x40]; + + u8 
reserved_at_100[0x8]; + u8 dek_index[0x18]; + + u8 reserved_at_120[0x15]; + u8 ti[0x1]; + u8 zero_copy_en[0x1]; + u8 ddgst_offload_en[0x1]; + u8 hdgst_offload_en[0x1]; + u8 ddgst_en[0x1]; + u8 hddgst_en[0x1]; + u8 pda[0x5]; + + u8 nvme_resync_tcp_sn[0x20]; + + u8 reserved_at_160[0xa0]; +}; + #endif /* MLX5_IFC_H */ diff --git a/include/linux/mlx5/qp.h b/include/linux/mlx5/qp.h index d75ef8aa8fac..5fa8b82c9edb 100644 --- a/include/linux/mlx5/qp.h +++ b/include/linux/mlx5/qp.h @@ -220,6 +220,7 @@ struct mlx5_wqe_ctrl_seg { #define MLX5_WQE_CTRL_OPCODE_MASK 0xff #define MLX5_WQE_CTRL_WQE_INDEX_MASK 0x00ffff00 #define MLX5_WQE_CTRL_WQE_INDEX_SHIFT 8 +#define MLX5_WQE_CTRL_TIR_TIS_INDEX_SHIFT 8 enum { MLX5_ETH_WQE_L3_INNER_CSUM = 1 << 4, From patchwork Mon Dec 7 21:06:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Pismenny X-Patchwork-Id: 339523 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CDA31C433FE for ; Mon, 7 Dec 2020 21:07:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A768E238EE for ; Mon, 7 Dec 2020 21:07:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727459AbgLGVHv (ORCPT ); Mon, 7 Dec 2020 16:07:51 -0500 Received: from mail-il-dmz.mellanox.com ([193.47.165.129]:45838 "EHLO mellanox.co.il" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727073AbgLGVHt (ORCPT ); Mon, 7 Dec 2020 16:07:49 -0500 Received: from Internal Mail-Server by MTLPINE1 (envelope-from borisp@mellanox.com) with SMTP; 7 Dec 2020 23:06:53 +0200 Received: from gen-l-vrt-133.mtl.labs.mlnx. (gen-l-vrt-133.mtl.labs.mlnx [10.237.11.160]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 0B7L6qIJ029788; Mon, 7 Dec 2020 23:06:53 +0200 From: Boris Pismenny To: kuba@kernel.org, davem@davemloft.net, saeedm@nvidia.com, hch@lst.de, sagi@grimberg.me, axboe@fb.com, kbusch@kernel.org, viro@zeniv.linux.org.uk, edumazet@google.com Cc: boris.pismenny@gmail.com, linux-nvme@lists.infradead.org, netdev@vger.kernel.org, benishay@nvidia.com, ogerlitz@nvidia.com, yorayz@nvidia.com, Ben Ben-Ishay , Or Gerlitz , Yoray Zack Subject: [PATCH v1 net-next 11/15] net/mlx5e: TCP flow steering for nvme-tcp Date: Mon, 7 Dec 2020 23:06:45 +0200 Message-Id: <20201207210649.19194-12-borisp@mellanox.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20201207210649.19194-1-borisp@mellanox.com> References: <20201207210649.19194-1-borisp@mellanox.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Both nvme-tcp and tls require tcp flow steering. Compile it for both of them. Additionally, use reference counting to allocate/free TCP flow steering. 
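
The refcount makes the create/destroy pair idempotent across the two users. A rough usage sketch, assuming the callers are serialized as in the existing accel init paths (the wrapper function below is hypothetical):

/* #include "en_accel/fs_tcp.h" -- declares the two helpers used here */
static int example_two_users(struct mlx5e_priv *priv)
{
	int err;

	err = mlx5e_accel_fs_tcp_create(priv);	/* first user (e.g. TLS): tables allocated, ref = 1 */
	if (err)
		return err;

	err = mlx5e_accel_fs_tcp_create(priv);	/* second user (e.g. NVMEoTCP): ref -> 2, no re-allocation */
	if (err)
		goto err_release_first;

	/* ... both offloads run; each tears down independently ... */

	mlx5e_accel_fs_tcp_destroy(priv);	/* ref -> 1, tables kept */
	mlx5e_accel_fs_tcp_destroy(priv);	/* ref -> 0, tables freed */
	return 0;

err_release_first:
	mlx5e_accel_fs_tcp_destroy(priv);
	return err;
}
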
Signed-off-by: Boris Pismenny Signed-off-by: Ben Ben-Ishay Signed-off-by: Or Gerlitz Signed-off-by: Yoray Zack --- drivers/net/ethernet/mellanox/mlx5/core/en/fs.h | 4 ++-- .../net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c | 10 ++++++++++ .../net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.h | 2 +- 3 files changed, 13 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h b/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h index a16297e7e2ac..a7fe3a6358ea 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/fs.h @@ -137,7 +137,7 @@ enum { MLX5E_L2_FT_LEVEL, MLX5E_TTC_FT_LEVEL, MLX5E_INNER_TTC_FT_LEVEL, -#ifdef CONFIG_MLX5_EN_TLS +#if defined(CONFIG_MLX5_EN_TLS) || defined(CONFIG_MLX5_EN_NVMEOTCP) MLX5E_ACCEL_FS_TCP_FT_LEVEL, #endif #ifdef CONFIG_MLX5_EN_ARFS @@ -256,7 +256,7 @@ struct mlx5e_flow_steering { #ifdef CONFIG_MLX5_EN_ARFS struct mlx5e_arfs_tables arfs; #endif -#ifdef CONFIG_MLX5_EN_TLS +#if defined(CONFIG_MLX5_EN_TLS) || defined(CONFIG_MLX5_EN_NVMEOTCP) struct mlx5e_accel_fs_tcp *accel_tcp; #endif }; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c index 97f1594cee11..feded6c8cca1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.c @@ -14,6 +14,7 @@ enum accel_fs_tcp_type { struct mlx5e_accel_fs_tcp { struct mlx5e_flow_table tables[ACCEL_FS_TCP_NUM_TYPES]; struct mlx5_flow_handle *default_rules[ACCEL_FS_TCP_NUM_TYPES]; + refcount_t ref_count; }; static enum mlx5e_traffic_types fs_accel2tt(enum accel_fs_tcp_type i) @@ -335,6 +336,7 @@ static int accel_fs_tcp_enable(struct mlx5e_priv *priv) return err; } } + refcount_set(&priv->fs.accel_tcp->ref_count, 1); return 0; } @@ -358,6 +360,9 @@ void mlx5e_accel_fs_tcp_destroy(struct mlx5e_priv *priv) if (!priv->fs.accel_tcp) return; + if (!refcount_dec_and_test(&priv->fs.accel_tcp->ref_count)) + return; + accel_fs_tcp_disable(priv); for (i = 0; i < ACCEL_FS_TCP_NUM_TYPES; i++) @@ -374,6 +379,11 @@ int mlx5e_accel_fs_tcp_create(struct mlx5e_priv *priv) if (!MLX5_CAP_FLOWTABLE_NIC_RX(priv->mdev, ft_field_support.outer_ip_version)) return -EOPNOTSUPP; + if (priv->fs.accel_tcp) { + refcount_inc(&priv->fs.accel_tcp->ref_count); + return 0; + } + priv->fs.accel_tcp = kzalloc(sizeof(*priv->fs.accel_tcp), GFP_KERNEL); if (!priv->fs.accel_tcp) return -ENOMEM; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.h index 589235824543..8aff9298183c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.h @@ -6,7 +6,7 @@ #include "en.h" -#ifdef CONFIG_MLX5_EN_TLS +#if defined(CONFIG_MLX5_EN_TLS) || defined(CONFIG_MLX5_EN_NVMEOTCP) int mlx5e_accel_fs_tcp_create(struct mlx5e_priv *priv); void mlx5e_accel_fs_tcp_destroy(struct mlx5e_priv *priv); struct mlx5_flow_handle *mlx5e_accel_fs_add_sk(struct mlx5e_priv *priv, From patchwork Mon Dec 7 21:06:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Boris Pismenny X-Patchwork-Id: 339521 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, 
From: Boris Pismenny
To: kuba@kernel.org, davem@davemloft.net, saeedm@nvidia.com, hch@lst.de, sagi@grimberg.me, axboe@fb.com, kbusch@kernel.org, viro@zeniv.linux.org.uk, edumazet@google.com
Cc: boris.pismenny@gmail.com, linux-nvme@lists.infradead.org, netdev@vger.kernel.org, benishay@nvidia.com, ogerlitz@nvidia.com, yorayz@nvidia.com, Ben Ben-Ishay, Or Gerlitz, Yoray Zack
Subject: [PATCH v1 net-next 14/15] net/mlx5e: NVMEoTCP statistics
Date: Mon, 7 Dec 2020 23:06:48 +0200
Message-Id: <20201207210649.19194-15-borisp@mellanox.com>
In-Reply-To: <20201207210649.19194-1-borisp@mellanox.com>
References: <20201207210649.19194-1-borisp@mellanox.com>
X-Mailing-List: netdev@vger.kernel.org

From: Ben Ben-ishay

The NVMEoTCP offload statistics cover both the control path and the data path: counters for the ndo operations (queue init/teardown, DDP setup/teardown), offloaded packets and bytes, dropped packets, and resync operations.
Signed-off-by: Boris Pismenny Signed-off-by: Ben Ben-Ishay Signed-off-by: Or Gerlitz Signed-off-by: Yoray Zack --- .../mellanox/mlx5/core/en_accel/nvmeotcp.c | 17 +++++++++ .../mlx5/core/en_accel/nvmeotcp_rxtx.c | 16 ++++++++ .../ethernet/mellanox/mlx5/core/en_stats.c | 37 +++++++++++++++++++ .../ethernet/mellanox/mlx5/core/en_stats.h | 24 ++++++++++++ 4 files changed, 94 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c index 843e653699e9..756decf53930 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c @@ -651,6 +651,7 @@ mlx5e_nvmeotcp_queue_init(struct net_device *netdev, struct mlx5_core_dev *mdev = priv->mdev; struct mlx5e_nvmeotcp_queue *queue; int max_wqe_sz_cap, queue_id, err; + struct mlx5e_rq_stats *stats; if (tconfig->type != TCP_DDP_NVME) { err = -EOPNOTSUPP; @@ -700,6 +701,8 @@ mlx5e_nvmeotcp_queue_init(struct net_device *netdev, if (err) goto destroy_rx; + stats = &priv->channel_stats[queue->channel_ix].rq; + stats->nvmeotcp_queue_init++; write_lock_bh(&sk->sk_callback_lock); rcu_assign_pointer(inet_csk(sk)->icsk_ulp_ddp_data, queue); write_unlock_bh(&sk->sk_callback_lock); @@ -714,6 +717,7 @@ mlx5e_nvmeotcp_queue_init(struct net_device *netdev, free_queue: kfree(queue); out: + stats->nvmeotcp_queue_init_fail++; return err; } @@ -724,11 +728,15 @@ mlx5e_nvmeotcp_queue_teardown(struct net_device *netdev, struct mlx5e_priv *priv = netdev_priv(netdev); struct mlx5_core_dev *mdev = priv->mdev; struct mlx5e_nvmeotcp_queue *queue; + struct mlx5e_rq_stats *stats; queue = (struct mlx5e_nvmeotcp_queue *)tcp_ddp_get_ctx(sk); napi_synchronize(&priv->channels.c[queue->channel_ix]->napi); + stats = &priv->channel_stats[queue->channel_ix].rq; + stats->nvmeotcp_queue_teardown++; + WARN_ON(refcount_read(&queue->ref_count) != 1); if (queue->zerocopy | queue->crc_rx) mlx5e_nvmeotcp_destroy_rx(queue, mdev, queue->zerocopy); @@ -750,6 +758,7 @@ mlx5e_nvmeotcp_ddp_setup(struct net_device *netdev, struct mlx5e_priv *priv = netdev_priv(netdev); struct scatterlist *sg = ddp->sg_table.sgl; struct mlx5e_nvmeotcp_queue *queue; + struct mlx5e_rq_stats *stats; struct mlx5_core_dev *mdev; int count = 0; @@ -767,6 +776,11 @@ mlx5e_nvmeotcp_ddp_setup(struct net_device *netdev, queue->ccid_table[ddp->command_id].ccid_gen++; queue->ccid_table[ddp->command_id].sgl_length = count; + stats = &priv->channel_stats[queue->channel_ix].rq; + stats->nvmeotcp_ddp_setup++; + if (unlikely(mlx5e_nvmeotcp_post_klm_wqe(queue, KLM_UMR, ddp->command_id, count))) + stats->nvmeotcp_ddp_setup_fail++; + return 0; } @@ -808,6 +822,7 @@ mlx5e_nvmeotcp_ddp_teardown(struct net_device *netdev, (struct mlx5e_nvmeotcp_queue *)tcp_ddp_get_ctx(sk); struct mlx5e_priv *priv = netdev_priv(netdev); struct nvmeotcp_queue_entry *q_entry; + struct mlx5e_rq_stats *stats; q_entry = &queue->ccid_table[ddp->command_id]; WARN_ON(q_entry->sgl_length == 0); @@ -816,6 +831,8 @@ mlx5e_nvmeotcp_ddp_teardown(struct net_device *netdev, q_entry->queue = queue; mlx5e_nvmeotcp_post_klm_wqe(queue, KLM_UMR, ddp->command_id, 0); + stats = &priv->channel_stats[queue->channel_ix].rq; + stats->nvmeotcp_ddp_teardown++; return 0; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c index be5111b66cc9..298558ae2dcd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c 
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c @@ -10,12 +10,16 @@ static void nvmeotcp_update_resync(struct mlx5e_nvmeotcp_queue *queue, struct mlx5e_cqe128 *cqe128) { const struct tcp_ddp_ulp_ops *ulp_ops; + struct mlx5e_rq_stats *stats; u32 seq; seq = be32_to_cpu(cqe128->resync_tcp_sn); ulp_ops = inet_csk(queue->sk)->icsk_ulp_ddp_ops; if (ulp_ops && ulp_ops->resync_request) ulp_ops->resync_request(queue->sk, seq, TCP_DDP_RESYNC_REQ); + + stats = queue->priv->channels.c[queue->channel_ix]->rq.stats; + stats->nvmeotcp_resync++; } static void mlx5e_nvmeotcp_advance_sgl_iter(struct mlx5e_nvmeotcp_queue *queue) @@ -61,10 +65,13 @@ mlx5_nvmeotcp_add_tail_nonlinear(struct mlx5e_nvmeotcp_queue *queue, int org_nr_frags, int frag_index) { struct mlx5e_priv *priv = queue->priv; + struct mlx5e_rq_stats *stats; while (org_nr_frags != frag_index) { if (skb_shinfo(skb)->nr_frags >= MAX_SKB_FRAGS) { dev_kfree_skb_any(skb); + stats = priv->channels.c[queue->channel_ix]->rq.stats; + stats->nvmeotcp_drop++; return NULL; } skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, @@ -83,9 +90,12 @@ mlx5_nvmeotcp_add_tail(struct mlx5e_nvmeotcp_queue *queue, struct sk_buff *skb, int offset, int len) { struct mlx5e_priv *priv = queue->priv; + struct mlx5e_rq_stats *stats; if (skb_shinfo(skb)->nr_frags >= MAX_SKB_FRAGS) { dev_kfree_skb_any(skb); + stats = priv->channels.c[queue->channel_ix]->rq.stats; + stats->nvmeotcp_drop++; return NULL; } skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, @@ -146,6 +156,7 @@ mlx5e_nvmeotcp_handle_rx_skb(struct net_device *netdev, struct sk_buff *skb, skb_frag_t org_frags[MAX_SKB_FRAGS]; struct mlx5e_nvmeotcp_queue *queue; struct nvmeotcp_queue_entry *nqe; + struct mlx5e_rq_stats *stats; int org_nr_frags, frag_index; struct mlx5e_cqe128 *cqe128; u32 queue_id; @@ -164,6 +175,8 @@ mlx5e_nvmeotcp_handle_rx_skb(struct net_device *netdev, struct sk_buff *skb, return skb; } + stats = priv->channels.c[queue->channel_ix]->rq.stats; + /* cc ddp from cqe */ ccid = be16_to_cpu(cqe128->ccid); ccoff = be32_to_cpu(cqe128->ccoff); @@ -206,6 +219,7 @@ mlx5e_nvmeotcp_handle_rx_skb(struct net_device *netdev, struct sk_buff *skb, while (to_copy < cclen) { if (skb_shinfo(skb)->nr_frags >= MAX_SKB_FRAGS) { dev_kfree_skb_any(skb); + stats->nvmeotcp_drop++; mlx5e_nvmeotcp_put_queue(queue); return NULL; } @@ -235,6 +249,8 @@ mlx5e_nvmeotcp_handle_rx_skb(struct net_device *netdev, struct sk_buff *skb, frag_index); } + stats->nvmeotcp_offload_packets++; + stats->nvmeotcp_offload_bytes += cclen; mlx5e_nvmeotcp_put_queue(queue); return skb; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c index 2cf2042b37c7..ca7d2cb5099f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c @@ -34,6 +34,7 @@ #include "en.h" #include "en_accel/tls.h" #include "en_accel/en_accel.h" +#include "en_accel/nvmeotcp.h" static unsigned int stats_grps_num(struct mlx5e_priv *priv) { @@ -189,6 +190,18 @@ static const struct counter_desc sw_stats_desc[] = { { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_tls_resync_res_ok) }, { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_tls_resync_res_skip) }, { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_tls_err) }, +#endif +#ifdef CONFIG_MLX5_EN_NVMEOTCP + { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_nvmeotcp_queue_init) }, + { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_nvmeotcp_queue_init_fail) }, + { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, 
rx_nvmeotcp_queue_teardown) }, + { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_nvmeotcp_ddp_setup) }, + { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_nvmeotcp_ddp_setup_fail) }, + { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_nvmeotcp_ddp_teardown) }, + { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_nvmeotcp_drop) }, + { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_nvmeotcp_resync) }, + { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_nvmeotcp_offload_packets) }, + { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_nvmeotcp_offload_bytes) }, #endif { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_events) }, { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, ch_poll) }, @@ -352,6 +365,18 @@ static void mlx5e_stats_grp_sw_update_stats_rq_stats(struct mlx5e_sw_stats *s, s->rx_tls_resync_res_skip += rq_stats->tls_resync_res_skip; s->rx_tls_err += rq_stats->tls_err; #endif +#ifdef CONFIG_MLX5_EN_NVMEOTCP + s->rx_nvmeotcp_queue_init += rq_stats->nvmeotcp_queue_init; + s->rx_nvmeotcp_queue_init_fail += rq_stats->nvmeotcp_queue_init_fail; + s->rx_nvmeotcp_queue_teardown += rq_stats->nvmeotcp_queue_teardown; + s->rx_nvmeotcp_ddp_setup += rq_stats->nvmeotcp_ddp_setup; + s->rx_nvmeotcp_ddp_setup_fail += rq_stats->nvmeotcp_ddp_setup_fail; + s->rx_nvmeotcp_ddp_teardown += rq_stats->nvmeotcp_ddp_teardown; + s->rx_nvmeotcp_drop += rq_stats->nvmeotcp_drop; + s->rx_nvmeotcp_resync += rq_stats->nvmeotcp_resync; + s->rx_nvmeotcp_offload_packets += rq_stats->nvmeotcp_offload_packets; + s->rx_nvmeotcp_offload_bytes += rq_stats->nvmeotcp_offload_bytes; +#endif } static void mlx5e_stats_grp_sw_update_stats_ch_stats(struct mlx5e_sw_stats *s, @@ -1612,6 +1637,18 @@ static const struct counter_desc rq_stats_desc[] = { { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, tls_resync_res_skip) }, { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, tls_err) }, #endif +#ifdef CONFIG_MLX5_EN_NVMEOTCP + { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, nvmeotcp_queue_init) }, + { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, nvmeotcp_queue_init_fail) }, + { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, nvmeotcp_queue_teardown) }, + { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, nvmeotcp_ddp_setup) }, + { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, nvmeotcp_ddp_setup_fail) }, + { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, nvmeotcp_ddp_teardown) }, + { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, nvmeotcp_drop) }, + { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, nvmeotcp_resync) }, + { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, nvmeotcp_offload_packets) }, + { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, nvmeotcp_offload_bytes) }, +#endif }; static const struct counter_desc sq_stats_desc[] = { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h index e41fc11f2ce7..a5cf8ec4295b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h @@ -179,6 +179,18 @@ struct mlx5e_sw_stats { u64 rx_congst_umr; u64 rx_arfs_err; u64 rx_recover; +#ifdef CONFIG_MLX5_EN_NVMEOTCP + u64 rx_nvmeotcp_queue_init; + u64 rx_nvmeotcp_queue_init_fail; + u64 rx_nvmeotcp_queue_teardown; + u64 rx_nvmeotcp_ddp_setup; + u64 rx_nvmeotcp_ddp_setup_fail; + u64 rx_nvmeotcp_ddp_teardown; + u64 rx_nvmeotcp_drop; + u64 rx_nvmeotcp_resync; + u64 rx_nvmeotcp_offload_packets; + u64 rx_nvmeotcp_offload_bytes; +#endif u64 ch_events; u64 ch_poll; u64 ch_arm; @@ -342,6 +354,18 @@ struct mlx5e_rq_stats { u64 tls_resync_res_skip; u64 tls_err; #endif +#ifdef 
CONFIG_MLX5_EN_NVMEOTCP + u64 nvmeotcp_queue_init; + u64 nvmeotcp_queue_init_fail; + u64 nvmeotcp_queue_teardown; + u64 nvmeotcp_ddp_setup; + u64 nvmeotcp_ddp_setup_fail; + u64 nvmeotcp_ddp_teardown; + u64 nvmeotcp_drop; + u64 nvmeotcp_resync; + u64 nvmeotcp_offload_packets; + u64 nvmeotcp_offload_bytes; +#endif }; struct mlx5e_sq_stats {
From patchwork Mon Dec 7 21:06:49 2020
X-Patchwork-Submitter: Boris Pismenny
X-Patchwork-Id: 339516
From: Boris Pismenny
To: kuba@kernel.org, davem@davemloft.net, saeedm@nvidia.com, hch@lst.de, sagi@grimberg.me, axboe@fb.com, kbusch@kernel.org, viro@zeniv.linux.org.uk, edumazet@google.com
Cc: boris.pismenny@gmail.com, linux-nvme@lists.infradead.org, netdev@vger.kernel.org, benishay@nvidia.com, ogerlitz@nvidia.com, yorayz@nvidia.com, Ben Ben-Ishay, Or Gerlitz
Subject: [PATCH v1 net-next 15/15] net/mlx5e: NVMEoTCP workaround CRC after resync
Date: Mon, 7 Dec 2020 23:06:49 +0200
Message-Id: <20201207210649.19194-16-borisp@mellanox.com>
In-Reply-To: <20201207210649.19194-1-borisp@mellanox.com>
References: <20201207210649.19194-1-borisp@mellanox.com>
X-Mailing-List: netdev@vger.kernel.org

From: Yoray Zack

The nvme-tcp CRC computed over the first packet after a resync may provide the wrong signal when the packet contains multiple PDUs. We work around that by ignoring the cqe->nvmeotcp_crc signal for the first packet after a resync.
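
Masking the flag is safe because skb->ddp_crc only tells the consumer that the HW already validated the digest; when it is clear, the receiver falls back to software verification. A hypothetical ULP-side check (not code from this series) might look like:

#include <linux/skbuff.h>

/* Returns true if the data digest for this skb can be trusted. */
static bool example_data_digest_ok(struct sk_buff *skb, bool sw_result)
{
#ifdef CONFIG_TCP_DDP_CRC
	/* HW already verified the data digest for this skb. */
	if (skb->ddp_crc)
		return true;
#endif
	/* First packet after a resync (or no offload): rely on the
	 * software-computed digest only.
	 */
	return sw_result;
}
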
Signed-off-by: Yoray Zack Signed-off-by: Boris Pismenny Signed-off-by: Ben Ben-Ishay Signed-off-by: Or Gerlitz --- .../mellanox/mlx5/core/en_accel/nvmeotcp.c | 1 + .../mellanox/mlx5/core/en_accel/nvmeotcp.h | 3 +++ .../mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c | 14 ++++++++++++++ drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 12 ++++-------- include/linux/mlx5/device.h | 4 ++-- 5 files changed, 24 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c index 756decf53930..e9f7f8b17c92 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.c @@ -844,6 +844,7 @@ mlx5e_nvmeotcp_dev_resync(struct net_device *netdev, struct mlx5e_nvmeotcp_queue *queue = (struct mlx5e_nvmeotcp_queue *)tcp_ddp_get_ctx(sk); + queue->after_resync_cqe = 1; mlx5e_nvmeotcp_rx_post_static_params_wqe(queue, seq); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h index 5be300d8299e..a309971e11b1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp.h @@ -50,6 +50,7 @@ struct mlx5e_nvmeotcp_sq { * @ccoff_inner: Current offset within the @ccsglidx element * @priv: mlx5e netdev priv * @inv_done: invalidate callback of the nvme tcp driver + * @after_resync_cqe: indicate if resync occurred */ struct mlx5e_nvmeotcp_queue { struct tcp_ddp_ctx tcp_ddp_ctx; @@ -82,6 +83,8 @@ struct mlx5e_nvmeotcp_queue { /* for flow_steering flow */ struct completion done; + /* for MASK HW resync cqe */ + bool after_resync_cqe; }; struct mlx5e_nvmeotcp { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c index 298558ae2dcd..4b813de592be 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/nvmeotcp_rxtx.c @@ -175,6 +175,20 @@ mlx5e_nvmeotcp_handle_rx_skb(struct net_device *netdev, struct sk_buff *skb, return skb; } +#ifdef CONFIG_TCP_DDP_CRC + /* If a resync occurred in the previous cqe, + * the current cqe.crcvalid bit may not be valid, + * so we will treat it as 0 + */ + skb->ddp_crc = queue->after_resync_cqe ? 
0 : + cqe_is_nvmeotcp_crcvalid(cqe); + queue->after_resync_cqe = 0; +#endif + if (!cqe_is_nvmeotcp_zc(cqe)) { + mlx5e_nvmeotcp_put_queue(queue); + return skb; + } + stats = priv->channels.c[queue->channel_ix]->rq.stats; /* cc ddp from cqe */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c index 2688396d21f8..960aee0d5f0c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -1079,10 +1079,6 @@ static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe, if (unlikely(mlx5_ipsec_is_rx_flow(cqe))) mlx5e_ipsec_offload_handle_rx_skb(netdev, skb, cqe); -#if defined(CONFIG_TCP_DDP_CRC) && defined(CONFIG_MLX5_EN_NVMEOTCP) - skb->ddp_crc = cqe_is_nvmeotcp_crcvalid(cqe); -#endif - if (lro_num_seg > 1) { mlx5e_lro_update_hdr(skb, cqe, cqe_bcnt); skb_shinfo(skb)->gso_size = DIV_ROUND_UP(cqe_bcnt, lro_num_seg); @@ -1197,7 +1193,7 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe, page_ref_inc(di->page); #if defined(CONFIG_TCP_DDP) && defined(CONFIG_MLX5_EN_NVMEOTCP) - if (cqe_is_nvmeotcp_zc_or_resync(cqe)) + if (cqe_is_nvmeotcp(cqe)) skb = mlx5e_nvmeotcp_handle_rx_skb(rq->netdev, skb, cqe, cqe_bcnt, true); #endif @@ -1253,7 +1249,7 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe, skb->len += headlen; #if defined(CONFIG_TCP_DDP) && defined(CONFIG_MLX5_EN_NVMEOTCP) - if (cqe_is_nvmeotcp_zc_or_resync(cqe)) + if (cqe_is_nvmeotcp(cqe)) skb = mlx5e_nvmeotcp_handle_rx_skb(rq->netdev, skb, cqe, cqe_bcnt, false); #endif @@ -1486,7 +1482,7 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w skb->len += headlen; #if defined(CONFIG_TCP_DDP) && defined(CONFIG_MLX5_EN_NVMEOTCP) - if (cqe_is_nvmeotcp_zc_or_resync(cqe)) + if (cqe_is_nvmeotcp(cqe)) skb = mlx5e_nvmeotcp_handle_rx_skb(rq->netdev, skb, cqe, cqe_bcnt, false); #endif @@ -1539,7 +1535,7 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, page_ref_inc(di->page); #if defined(CONFIG_TCP_DDP) && defined(CONFIG_MLX5_EN_NVMEOTCP) - if (cqe_is_nvmeotcp_zc_or_resync(cqe)) + if (cqe_is_nvmeotcp(cqe)) skb = mlx5e_nvmeotcp_handle_rx_skb(rq->netdev, skb, cqe, cqe_bcnt, true); #endif diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h index ea4d158e8329..ae879576e371 100644 --- a/include/linux/mlx5/device.h +++ b/include/linux/mlx5/device.h @@ -882,9 +882,9 @@ static inline bool cqe_is_nvmeotcp_zc(struct mlx5_cqe64 *cqe) return ((cqe->nvmetcp >> 4) & 0x1); } -static inline bool cqe_is_nvmeotcp_zc_or_resync(struct mlx5_cqe64 *cqe) +static inline bool cqe_is_nvmeotcp(struct mlx5_cqe64 *cqe) { - return ((cqe->nvmetcp >> 4) & 0x5); + return ((cqe->nvmetcp >> 4) & 0x7); } static inline u8 mlx5_get_cqe_format(struct mlx5_cqe64 *cqe)