From patchwork Sat Jan 15 23:02:33 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: mike.marciniszyn@cornelisnetworks.com
X-Patchwork-Id: 532431
From: mike.marciniszyn@cornelisnetworks.com
To: jgg@ziepe.ca
Cc: linux-rdma@vger.kernel.org, Mike Marciniszyn, stable@vger.kernel.org
Subject: [PATCH for-rc 1/4] IB/hfi1: Fix panic with larger ipoib send_queue_size
Date: Sat, 15 Jan 2022 18:02:33 -0500
Message-Id: <1642287756-182313-2-git-send-email-mike.marciniszyn@cornelisnetworks.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1642287756-182313-1-git-send-email-mike.marciniszyn@cornelisnetworks.com>
References: <1642287756-182313-1-git-send-email-mike.marciniszyn@cornelisnetworks.com>
Precedence: bulk
List-ID:
X-Mailing-List: stable@vger.kernel.org

From: Mike Marciniszyn

When the ipoib send_queue_size is increased from the default the following
panic happens:

[ 219.242960] RIP: 0010:hfi1_ipoib_drain_tx_ring+0x45/0xf0 [hfi1]
[ 219.250708] Code: 31 e4 eb 0f 8b 85 c8 02 00 00 41 83 c4 01 44 39 e0 76 60 8b 8d cc 02 00 00 44 89 e3 be 01 00 00 00 d3 e3 48 03 9d c0 02 00 00 83 18 01 00 00 00 00 00 00 48 8b bb 30 01 00 00 e8 25 af a7 e0
[ 219.273764] RSP: 0018:ffffc9000798f4a0 EFLAGS: 00010286
[ 219.280740] RAX: 0000000000008000 RBX: ffffc9000aa0f000 RCX: 000000000000000f
[ 219.289842] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
[ 219.298864] RBP: ffff88810ff08000 R08: ffff88889476d900 R09: 0000000000000101
[ 219.307907] R10: 0000000000000000 R11: ffffc90006590ff8 R12: 0000000000000200
[ 219.317016] R13: ffffc9000798fba8 R14: 0000000000000000 R15: 0000000000000001
[ 219.326100] FS: 00007fd0f79cc3c0(0000) GS:ffff88885fb00000(0000) knlGS:0000000000000000
[ 219.336171] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 219.343639] CR2: ffffc9000aa0f118 CR3: 0000000889c84001 CR4: 00000000001706e0
[ 219.352589] Call Trace:
[ 219.356340]
[ 219.359804] hfi1_ipoib_napi_tx_disable+0x45/0x60 [hfi1]
[ 219.366887] hfi1_ipoib_dev_stop+0x18/0x80 [hfi1]
[ 219.373313] ipoib_ib_dev_stop+0x1d/0x40 [ib_ipoib]
[ 219.379814] ipoib_stop+0x48/0xc0 [ib_ipoib]
[ 219.385604] __dev_close_many+0x9e/0x110
[ 219.391001] __dev_change_flags+0xd9/0x210
[ 219.396618] dev_change_flags+0x21/0x60
[ 219.401878] do_setlink+0x31c/0x10f0
[ 219.406841] ? __nla_validate_parse+0x12d/0x1a0
[ 219.412902] ? __nla_parse+0x21/0x30
[ 219.417844] ? inet6_validate_link_af+0x5e/0xf0
[ 219.423913] ? cpumask_next+0x1f/0x20
[ 219.428914] ? __snmp6_fill_stats64.isra.53+0xbb/0x140
[ 219.435648] ? __nla_validate_parse+0x47/0x1a0
[ 219.441564] __rtnl_newlink+0x530/0x910
[ 219.446818] ? pskb_expand_head+0x73/0x300
[ 219.452198] ? __kmalloc_node_track_caller+0x109/0x280
[ 219.458999] ? __nla_put+0xc/0x20
[ 219.463733] ? cpumask_next_and+0x20/0x30
[ 219.469166] ? update_sd_lb_stats.constprop.144+0xd3/0x820
[ 219.476325] ? _raw_spin_unlock_irqrestore+0x25/0x37
[ 219.482815] ? __wake_up_common_lock+0x87/0xc0
[ 219.488761] ? kmem_cache_alloc_trace+0x3d/0x3d0
[ 219.494917] rtnl_newlink+0x43/0x60

The issue happens when the shift that should have been a function of the txq
item size mistakenly used the ring size.

Fix by using the item size.

Fixes: d47dfc2b00e6 ("IB/hfi1: Remove cache and embed txreq in ring")
Cc: stable@vger.kernel.org
Reviewed-by: Dennis Dalessandro
Signed-off-by: Mike Marciniszyn
---
 drivers/infiniband/hw/hfi1/ipoib_tx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hfi1/ipoib_tx.c b/drivers/infiniband/hw/hfi1/ipoib_tx.c
index f401089..bf62956 100644
--- a/drivers/infiniband/hw/hfi1/ipoib_tx.c
+++ b/drivers/infiniband/hw/hfi1/ipoib_tx.c
@@ -731,7 +731,7 @@ int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
 			goto free_txqs;
 
 		txq->tx_ring.max_items = tx_ring_size;
-		txq->tx_ring.shift = ilog2(tx_ring_size);
+		txq->tx_ring.shift = ilog2(tx_item_size);
 		txq->tx_ring.avail = hfi1_ipoib_ring_hwat(txq);
 
 		netif_tx_napi_add(dev, &txq->napi,

From patchwork Sat Jan 15 23:02:34 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: mike.marciniszyn@cornelisnetworks.com
X-Patchwork-Id: 532560
X-Spam-Checker-Version: SpamAssassin
 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
From: mike.marciniszyn@cornelisnetworks.com
To: jgg@ziepe.ca
Cc: linux-rdma@vger.kernel.org, Mike Marciniszyn, stable@vger.kernel.org
Subject: [PATCH for-rc 2/4] IB/hfi1: Fix alloc failure with larger txqueuelen
Date: Sat, 15 Jan 2022 18:02:34 -0500
Message-Id: <1642287756-182313-3-git-send-email-mike.marciniszyn@cornelisnetworks.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1642287756-182313-1-git-send-email-mike.marciniszyn@cornelisnetworks.com>
References: <1642287756-182313-1-git-send-email-mike.marciniszyn@cornelisnetworks.com>
Precedence: bulk
List-ID:
X-Mailing-List: stable@vger.kernel.org

From: Mike Marciniszyn

Allocating the tx ring with a large txqueuelen results in the following
warning:

[ 136.166367] Call Trace:
[ 136.169661] __alloc_pages_nodemask+0x283/0x2c0
[ 136.175273] kmalloc_large_node+0x3c/0xa0
[ 136.180289] __kmalloc_node+0x22a/0x2f0
[ 136.185110] ? __kmalloc_node+0x22a/0x2f0
[ 136.190169] hfi1_ipoib_txreq_init+0x19f/0x330 [hfi1]
[ 136.196453] hfi1_ipoib_setup_rn+0xd3/0x1a0 [hfi1]
[ 136.202396] rdma_init_netdev+0x5a/0x80 [ib_core]
[ 136.208210] ? hfi1_ipoib_set_id+0x30/0x30 [hfi1]
[ 136.213995] ipoib_intf_init+0x6c/0x350 [ib_ipoib]
[ 136.219873] ipoib_intf_alloc+0x5c/0xc0 [ib_ipoib]
[ 136.225751] ipoib_add_one+0xbe/0x300 [ib_ipoib]
[ 136.231563] add_client_context+0x12c/0x1a0 [ib_core]
[ 136.237739] ib_register_client+0x147/0x190 [ib_core]
[ 136.243906] ? 0xffffffffc0570000
[ 136.248123] ipoib_init_module+0xdd/0x132 [ib_ipoib]
[ 136.254212] do_one_initcall+0x46/0x1c3
[ 136.259136] ? do_init_module+0x22/0x220
[ 136.264043] ? kmem_cache_alloc_trace+0x131/0x270
[ 136.269813] do_init_module+0x5a/0x220
[ 136.274547] load_module+0x14c5/0x17f0
[ 136.279246] ? __do_sys_init_module+0x13b/0x180
[ 136.284810] __do_sys_init_module+0x13b/0x180
[ 136.290295] do_syscall_64+0x5b/0x1a0
[ 136.294914] entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 136.301070] RIP: 0033:0x7f3eacd0d80e

For ipoib, the txqueuelen is modified with the module parameter
send_queue_size.

Fix by switching to the kv versions of the same allocator to handle the
large allocations.

The ring items embed a hdr struct that is DMA mapped, and kv memory may be
vmalloc backed, so change that struct to a pointer to a separately
kzalloc'ed struct.

Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Cc: stable@vger.kernel.org
Reviewed-by: Dennis Dalessandro
Signed-off-by: Mike Marciniszyn
---
 drivers/infiniband/hw/hfi1/ipoib.h | 2 +-
 drivers/infiniband/hw/hfi1/ipoib_tx.c | 36 ++++++++++++++++++++++++-----------
 2 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/ipoib.h b/drivers/infiniband/hw/hfi1/ipoib.h
index 9091229..aec60d4 100644
--- a/drivers/infiniband/hw/hfi1/ipoib.h
+++ b/drivers/infiniband/hw/hfi1/ipoib.h
@@ -55,7 +55,7 @@
  */
 struct ipoib_txreq {
 	struct sdma_txreq txreq;
-	struct hfi1_sdma_header sdma_hdr;
+	struct hfi1_sdma_header *sdma_hdr;
 	int sdma_status;
 	int complete;
 	struct hfi1_ipoib_dev_priv *priv;
diff --git a/drivers/infiniband/hw/hfi1/ipoib_tx.c b/drivers/infiniband/hw/hfi1/ipoib_tx.c
index bf62956..d6bbdb8 100644
--- a/drivers/infiniband/hw/hfi1/ipoib_tx.c
+++ b/drivers/infiniband/hw/hfi1/ipoib_tx.c
@@ -122,7 +122,7 @@ static void hfi1_ipoib_free_tx(struct ipoib_txreq *tx, int budget)
 		dd_dev_warn(priv->dd,
 			    "%s: Status = 0x%x pbc 0x%llx txq = %d sde = %d\n",
 			    __func__, tx->sdma_status,
-			    le64_to_cpu(tx->sdma_hdr.pbc), tx->txq->q_idx,
+			    le64_to_cpu(tx->sdma_hdr->pbc), tx->txq->q_idx,
 			    tx->txq->sde->this_idx);
 	}
 
@@ -231,7 +231,7 @@ static int hfi1_ipoib_build_tx_desc(struct ipoib_txreq *tx,
 {
 	struct hfi1_devdata *dd = txp->dd;
 	struct sdma_txreq *txreq = &tx->txreq;
-	struct hfi1_sdma_header *sdma_hdr = &tx->sdma_hdr;
+	struct hfi1_sdma_header *sdma_hdr = tx->sdma_hdr;
 	u16 pkt_bytes =
 		sizeof(sdma_hdr->pbc) + (txp->hdr_dwords << 2) + tx->skb->len;
 	int ret;
@@ -256,7 +256,7 @@ static void hfi1_ipoib_build_ib_tx_headers(struct ipoib_txreq *tx,
 					   struct ipoib_txparms *txp)
 {
 	struct hfi1_ipoib_dev_priv *priv = tx->txq->priv;
-	struct hfi1_sdma_header *sdma_hdr = &tx->sdma_hdr;
+	struct hfi1_sdma_header *sdma_hdr = tx->sdma_hdr;
 	struct sk_buff *skb = tx->skb;
 	struct hfi1_pportdata *ppd = ppd_from_ibp(txp->ibp);
 	struct rdma_ah_attr *ah_attr = txp->ah_attr;
@@ -483,7 +483,7 @@ static int hfi1_ipoib_send_dma_single(struct net_device *dev,
 	if (likely(!ret)) {
 tx_ok:
 		trace_sdma_output_ibhdr(txq->priv->dd,
-					&tx->sdma_hdr.hdr,
+					&tx->sdma_hdr->hdr,
 					ib_is_sc5(txp->flow.sc5));
 		hfi1_ipoib_check_queue_depth(txq);
 		return NETDEV_TX_OK;
@@ -547,7 +547,7 @@ static int hfi1_ipoib_send_dma_list(struct net_device *dev,
 	hfi1_ipoib_check_queue_depth(txq);
 
 	trace_sdma_output_ibhdr(txq->priv->dd,
-				&tx->sdma_hdr.hdr,
+				&tx->sdma_hdr->hdr,
 				ib_is_sc5(txp->flow.sc5));
 
 	if (!netdev_xmit_more())
@@ -683,7 +683,8 @@ int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
 {
 	struct net_device *dev = priv->netdev;
 	u32 tx_ring_size, tx_item_size;
-	int i;
+	struct hfi1_ipoib_circ_buf *tx_ring;
+	int i, j;
 
 	/*
 	 * Ring holds 1 less than tx_ring_size
@@ -701,7 +702,9 @@ int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
 
 	for (i = 0; i < dev->num_tx_queues; i++) {
 		struct hfi1_ipoib_txq *txq = &priv->txqs[i];
+		struct ipoib_txreq *tx;
 
+		tx_ring = &txq->tx_ring;
 		iowait_init(&txq->wait,
 			    0,
 			    hfi1_ipoib_flush_txq,
@@ -725,14 +728,19 @@ int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
 					     priv->dd->node);
 
 		txq->tx_ring.items =
-			kcalloc_node(tx_ring_size, tx_item_size,
-				     GFP_KERNEL, priv->dd->node);
+			kvzalloc_node(array_size(tx_ring_size, tx_item_size),
+				      GFP_KERNEL, priv->dd->node);
 		if (!txq->tx_ring.items)
 			goto free_txqs;
 
 		txq->tx_ring.max_items = tx_ring_size;
 		txq->tx_ring.shift = ilog2(tx_item_size);
 		txq->tx_ring.avail = hfi1_ipoib_ring_hwat(txq);
+		tx_ring = &txq->tx_ring;
+		for (j = 0; j < tx_ring_size; j++)
+			hfi1_txreq_from_idx(tx_ring, j)->sdma_hdr =
+				kzalloc_node(sizeof(*tx->sdma_hdr),
+					     GFP_KERNEL, priv->dd->node);
 
 		netif_tx_napi_add(dev, &txq->napi,
 				  hfi1_ipoib_poll_tx_ring,
@@ -746,7 +754,10 @@ int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
 		struct hfi1_ipoib_txq *txq = &priv->txqs[i];
 
 		netif_napi_del(&txq->napi);
-		kfree(txq->tx_ring.items);
+		tx_ring = &txq->tx_ring;
+		for (j = 0; j < tx_ring_size; j++)
+			kfree(hfi1_txreq_from_idx(tx_ring, j)->sdma_hdr);
+		kvfree(tx_ring->items);
 	}
 
 	kfree(priv->txqs);
@@ -780,17 +791,20 @@ static void hfi1_ipoib_drain_tx_list(struct hfi1_ipoib_txq *txq)
 
 void hfi1_ipoib_txreq_deinit(struct hfi1_ipoib_dev_priv *priv)
 {
-	int i;
+	int i, j;
 
 	for (i = 0; i < priv->netdev->num_tx_queues; i++) {
 		struct hfi1_ipoib_txq *txq = &priv->txqs[i];
+		struct hfi1_ipoib_circ_buf *tx_ring = &txq->tx_ring;
 
 		iowait_cancel_work(&txq->wait);
 		iowait_sdma_drain(&txq->wait);
 		hfi1_ipoib_drain_tx_list(txq);
 		netif_napi_del(&txq->napi);
 		hfi1_ipoib_drain_tx_ring(txq);
-		kfree(txq->tx_ring.items);
+		for (j = 0; j < tx_ring->max_items; j++)
+			kfree(hfi1_txreq_from_idx(tx_ring, j)->sdma_hdr);
+		kvfree(tx_ring->items);
 	}
 
 	kfree(priv->txqs);

From patchwork Sat Jan 15 23:02:35 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: mike.marciniszyn@cornelisnetworks.com
X-Patchwork-Id: 532430
ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;
 b=W/FUfbdvSPh9Kea2BDGKT8MI72/0fEY+bh29jeDx3G2gO1V/S3TisPSyr5bRjzXJ2oBkI9LjZNxr7jNTAmnmGYnM3txIOLQxR+z7xZuPaEC6kM8POn5JwwI2FmZ+/ANOBuo8WotHmtCppI8BTA45W8fGELxbeXNbE/0Vos1jJ2lnV92SZrnHNHWTdtcrTImK1NuQ48SMC6kTe3gzE59RYQCqYBpj8dxJTibqWWX9Xh90PIff0Rv5Qkh1fmoDJfRb40QTdOrPBPK57UsCI9CIKOsmcaK2bgKXnePNaQbsXHqNuWvoXbNBJuetyuSgqqFrwP7q2SGcvFYITEEmE1QBag==
From: mike.marciniszyn@cornelisnetworks.com
To: jgg@ziepe.ca
Cc: linux-rdma@vger.kernel.org, Mike Marciniszyn, stable@vger.kernel.org
Subject: [PATCH for-rc 3/4] IB/hfi1: Fix AIP early init panic
Date: Sat, 15 Jan 2022 18:02:35 -0500
Message-Id: <1642287756-182313-4-git-send-email-mike.marciniszyn@cornelisnetworks.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1642287756-182313-1-git-send-email-mike.marciniszyn@cornelisnetworks.com>
References: <1642287756-182313-1-git-send-email-mike.marciniszyn@cornelisnetworks.com>
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 15 Jan 2022 23:02:59.7430 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-CrossTenant-Id: 4dbdb7da-74ee-4b45-8747-ef5ce5ebe68a
X-MS-Exchange-CrossTenant-MailboxType: HOSTED
X-MS-Exchange-CrossTenant-UserPrincipalName: iRZ4YIOSyv0uc+6v5SUZ698w7cxT/q//ZGpI+ZH6xQpQiYBseHdnXPGkJTTo7BM962ECZogRCevYSzdhiNp4TfuGWXdyG9E1msBeWfsPCEdbTqGuOHMWYtWVJ2G1VRdv
X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA0PR01MB6140
Precedence: bulk
List-ID:
X-Mailing-List: stable@vger.kernel.org

From: Mike Marciniszyn

An early failure in hfi1_ipoib_setup_rn() can lead to the following panic:

[ 355.625765] BUG: unable to handle kernel NULL pointer dereference at 00000000000001b0
[ 355.634188] PGD 0 P4D 0
[ 355.636731] Oops: 0002 [#1] SMP NOPTI
[ 355.659994] Workqueue: events work_for_cpu_fn
[ 355.664371] RIP: 0010:try_to_grab_pending+0x2b/0x140
[ 355.669361] Code: 1f 44 00 00 41 55 41 54 55 48 89 d5 53 48 89 fb 9c 58 0f 1f 44 00 00 48 89 c2 fa 66 0f 1f 44 00 00 48 89 55 00 40 84 f6 75 77 48 0f ba 2b 00 72 09 31 c0 5b 5d 41 5c 41 5d c3 48 89 df e8 6c
[ 355.688238] RSP: 0018:ffffb6b3cf7cfa48 EFLAGS: 00010046
[ 355.693491] RAX: 0000000000000246 RBX: 00000000000001b0 RCX: 0000000000000000
[ 355.700664] RDX: 0000000000000246 RSI: 0000000000000000 RDI: 00000000000001b0
[ 355.707836] RBP: ffffb6b3cf7cfa70 R08: 0000000000000f09 R09: 0000000000000001
[ 355.715007] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[ 355.722178] R13: ffffb6b3cf7cfa90 R14: ffffffff9b2fbfc0 R15: ffff8a4fdf244690
[ 355.729351] FS: 0000000000000000(0000) GS:ffff8a527f400000(0000) knlGS:0000000000000000
[ 355.737485] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 355.743260] CR2: 00000000000001b0 CR3: 00000017e2410003 CR4: 00000000007706f0
[ 355.750434] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 355.757607] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 355.764780] PKRU: 55555554
[ 355.767497] Call Trace:
[ 355.769954] __cancel_work_timer+0x42/0x190
[ 355.774159] ? dev_printk_emit+0x4e/0x70
[ 355.778115] iowait_cancel_work+0x15/0x30 [hfi1]
[ 355.782768] hfi1_ipoib_txreq_deinit+0x5a/0x220 [hfi1]
[ 355.787933] ? dev_err+0x6c/0x90
[ 355.791188] hfi1_ipoib_netdev_dtor+0x15/0x30 [hfi1]
[ 355.796188] hfi1_ipoib_setup_rn+0x10e/0x150 [hfi1]
[ 355.801094] rdma_init_netdev+0x5a/0x80 [ib_core]
[ 355.805832] ? hfi1_ipoib_free_rdma_netdev+0x20/0x20 [hfi1]
[ 355.811434] ipoib_intf_init+0x6c/0x350 [ib_ipoib]
[ 355.816251] ipoib_intf_alloc+0x5c/0xc0 [ib_ipoib]
[ 355.821068] ipoib_add_one+0xbe/0x300 [ib_ipoib]
[ 355.825712] add_client_context+0x12c/0x1a0 [ib_core]
[ 355.830794] enable_device_and_get+0xdc/0x1d0 [ib_core]
[ 355.836049] ib_register_device+0x572/0x6b0 [ib_core]
[ 355.841128] rvt_register_device+0x11b/0x220 [rdmavt]
[ 355.846219] hfi1_register_ib_device+0x6b4/0x770 [hfi1]
[ 355.851486] do_init_one.isra.20+0x3e3/0x680 [hfi1]
[ 355.856389] local_pci_probe+0x41/0x90
[ 355.860154] work_for_cpu_fn+0x16/0x20
[ 355.863921] process_one_work+0x1a7/0x360
[ 355.867948] ? create_worker+0x1a0/0x1a0
[ 355.871888] worker_thread+0x1cf/0x390
[ 355.875655] ? create_worker+0x1a0/0x1a0
[ 355.879594] kthread+0x116/0x130
[ 355.882838] ? kthread_flush_work_fn+0x10/0x10
[ 355.887302] ret_from_fork+0x1f/0x40
[ 355.890893] Modules linked in: rpcrdma sunrpc rdma_ucm ib_srpt ib_isert acpi_cpufreq(-) iscsi_target_mod target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib intel_rapl_msr intel_rapl_common isst_if_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass mlx5_core crct10dif_pclmul crc32_pclmul hfi1(OE+) tls ghash_clmulni_intel rdmavt(OE) mgag200 drm_kms_helper mlxfw mei_me syscopyarea sysfillrect ib_uverbs sysimgblt fb_sys_fops rapl ioatdma intel_cstate tg3 i2c_algo_bit mei hpwdt ses drm ib_core pci_hyperv_intf uas enclosure hpilo pcspkr intel_uncore wmi lpc_ich dca acpi_tad ipmi_ssif acpi_power_meter binfmt_misc xpmem(O) numatools(O) fuse ip_tables dm_mod xfs libcrc32c vfat fat ext4 mbcache jbd2 sd_mod t10_pi sg smartpqi ipmi_si scsi_transport_sas usb_storage ipmi_devintf ipmi_msghandler crc32c_intel [last unloaded: mlxfw]
[ 355.970226] CR2: 00000000000001b0
[ 355.973583]

The panic happens in hfi1_ipoib_txreq_deinit() because there is a NULL
deref when hfi1_ipoib_netdev_dtor() is called in this error case.

hfi1_ipoib_txreq_init() and hfi1_ipoib_rxq_init() are self-unwinding, so
fix by adjusting the error paths accordingly.

Other changes:
- hfi1_ipoib_free_rdma_netdev() is deleted, including the free_netdev(),
  since the netdev core code calls free_netdev()
- The switch to the accelerated entrances is moved to the success path.

Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Cc: stable@vger.kernel.org
Reviewed-by: Dennis Dalessandro
Signed-off-by: Mike Marciniszyn
---
 drivers/infiniband/hw/hfi1/ipoib_main.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/ipoib_main.c b/drivers/infiniband/hw/hfi1/ipoib_main.c
index e1a2b02..8306ed5 100644
--- a/drivers/infiniband/hw/hfi1/ipoib_main.c
+++ b/drivers/infiniband/hw/hfi1/ipoib_main.c
@@ -168,12 +168,6 @@ static void hfi1_ipoib_netdev_dtor(struct net_device *dev)
 	free_percpu(dev->tstats);
 }
 
-static void hfi1_ipoib_free_rdma_netdev(struct net_device *dev)
-{
-	hfi1_ipoib_netdev_dtor(dev);
-	free_netdev(dev);
-}
-
 static void hfi1_ipoib_set_id(struct net_device *dev, int id)
 {
 	struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
@@ -211,24 +205,23 @@ static int hfi1_ipoib_setup_rn(struct ib_device *device,
 	priv->port_num = port_num;
 	priv->netdev_ops = netdev->netdev_ops;
 
-	netdev->netdev_ops = &hfi1_ipoib_netdev_ops;
-
 	ib_query_pkey(device, port_num, priv->pkey_index, &priv->pkey);
 
 	rc = hfi1_ipoib_txreq_init(priv);
 	if (rc) {
 		dd_dev_err(dd, "IPoIB netdev TX init - failed(%d)\n", rc);
-		hfi1_ipoib_free_rdma_netdev(netdev);
 		return rc;
 	}
 
 	rc = hfi1_ipoib_rxq_init(netdev);
 	if (rc) {
 		dd_dev_err(dd, "IPoIB netdev RX init - failed(%d)\n", rc);
-		hfi1_ipoib_free_rdma_netdev(netdev);
+		hfi1_ipoib_txreq_deinit(priv);
 		return rc;
 	}
 
+	netdev->netdev_ops = &hfi1_ipoib_netdev_ops;
+
 	netdev->priv_destructor = hfi1_ipoib_netdev_dtor;
 	netdev->needs_free_netdev = true;