From patchwork Fri Aug 6 04:20:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paul Davey X-Patchwork-Id: 493416 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 401CFC432BE for ; Fri, 6 Aug 2021 04:20:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1CD2B611C6 for ; Fri, 6 Aug 2021 04:20:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231337AbhHFEUl (ORCPT ); Fri, 6 Aug 2021 00:20:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60916 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230513AbhHFEUl (ORCPT ); Fri, 6 Aug 2021 00:20:41 -0400 Received: from gate2.alliedtelesis.co.nz (gate2.alliedtelesis.co.nz [IPv6:2001:df5:b000:5::4]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3785BC061798 for ; Thu, 5 Aug 2021 21:20:25 -0700 (PDT) Received: from svr-chch-seg1.atlnz.lc (mmarshal3.atlnz.lc [10.32.18.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by gate2.alliedtelesis.co.nz (Postfix) with ESMTPS id 94B48806B5 for ; Fri, 6 Aug 2021 16:20:20 +1200 (NZST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=alliedtelesis.co.nz; s=mail181024; t=1628223620; bh=GmIG1g254fWl3us7arRBDuxbp9RwD/O2JOa93HVpKG0=; h=From:To:Subject:Date; b=HdZASylq2TMk2rEDXMko0rri0f6VbOLgLKmAM62gYGyjgM3lDqVfa0e07wki3awc+ 7E8AOdVNeE0AvUlL3yuIBfvAonAbl8Tdo5UnmqH0nrRhT7yHE18VntlQUQHtgBP9FG N+wCk5LsJ7PUKA0fAhybt7FGXdby4onTnR1HRMZ6qVWUV/Qw1PAahK7IiAlR+es2fn D2eZhvt+YawXuYjnJe6MB6UXuV04dY9Y38sqbh2MjlwcDAfPqc5c04bhDM0KkHwNA5 Kw16hef3pfEDeGS9JSI/wBKhCvWECrcH20lJH6zV933NE7ez5XgynJLpFe7sJyUho4 /qczGFtoN2tJw== Received: from svr-chch-ex1.atlnz.lc (Not Verified[2001:df5:b000:bc8::77]) by svr-chch-seg1.atlnz.lc with Trustwave SEG (v8, 2, 6, 11305) id ; Fri, 06 Aug 2021 16:20:20 +1200 Received: from svr-chch-ex1.atlnz.lc (2001:df5:b000:bc8::77) by svr-chch-ex1.atlnz.lc (2001:df5:b000:bc8::77) with Microsoft SMTP Server (TLS) id 15.0.1497.23; Fri, 6 Aug 2021 16:20:20 +1200 Received: from svr-chch-ex1.atlnz.lc ([fe80::409d:36f5:8899:92e8]) by svr-chch-ex1.atlnz.lc ([fe80::409d:36f5:8899:92e8%12]) with mapi id 15.00.1497.023; Fri, 6 Aug 2021 16:20:20 +1200 From: Paul Davey To: "linux-arm-msm@vger.kernel.org" Subject: bus: mhi: parse_xfer_event running transfer completion callbacks more than once for a given buffer Thread-Topic: mhi: parse_xfer_event running transfer completion callbacks more than once for a given buffer Thread-Index: AQHXinpdrX7hotkcVUS4ayEOxKCytg== Date: Fri, 6 Aug 2021 04:20:19 +0000 Message-ID: <9a6a00acc60c676f39f89a8ce2989416bed1b24d.camel@alliedtelesis.co.nz> Accept-Language: en-NZ, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-exchange-messagesentrepresentingtype: 1 x-ms-exchange-transport-fromentityheader: Hosted x-originating-ip: [2001:df5:b000:23:902a:2b83:fa41:f783] Content-ID: MIME-Version: 1.0 X-SEG-SpamProfiler-Analysis: v=2.3 cv=dvql9Go4 c=1 sm=1 tr=0 a=Xf/6aR1Nyvzi7BryhOrcLQ==:117 a=xqWC_Br6kY4A:10 a=IkcTkHD0fZMA:10 a=MhDmnRu9jo8A:10 a=sBgCyHtZn0QAu0gE2X4A:9 a=QEXdDO2ut3YA:10 X-SEG-SpamProfiler-Score: 0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org Hi linux-arm-msm list, We have been using the mhi driver with a Sierra EM9191 5G modem module and have seen an occasional issue where the kernel would crash with messages showing "BUG: Bad page state" which we debugged further and found it was due to mhi_net_ul_callback freeing the same skb multiple times, further debugging tracked this down to a case where parse_xfer_event computed a dev_rp from the passed event's ev_tre which does not lie within the region of valid "in flight" transfers for the tre_ring. See the patch below for how this was detected. I believe that receiving such an event results in the loop which runs completion events for the transfers to re-run some completion callbacks as it walks all the way around the ring again to reach the invalid dev_rp position. What could cause parse_xfer_event to receive a transfer event with a tre pointer which would be outside the range of in-flight transfers? For example receiving events where the tre pointers do not only increase or receive a second event of types MHI_EV_CC_OVERFLOW, MHI_EV_CC_EOB, or MHI_EV_CC_EOT for a previous tre. The existing mhi driver code appears to assume that transfer events are received strictly in order such that you can never receive a transfer completion event for a transfer descriptor outside the current set of "in flight" transfers in the tre ring (those between the read pointer and write pointer). Thanks, Paul Davey ----8<---- >From bf3e3821a90b89758a30cefed662d32a78dcb766 Mon Sep 17 00:00:00 2001 From: Paul Davey Date: Fri, 6 Aug 2021 15:36:05 +1200 Subject: [PATCH] bus: mhi: Detect invalid xfer event dev_rp Signed-off-by: Paul Davey --- drivers/bus/mhi/core/main.c | 10 ++++++++++ 1 file changed, 10 insertions(+) -- 2.32.0 diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c index c67fd001ded1..238689a5dfc7 100644 --- a/drivers/bus/mhi/core/main.c +++ b/drivers/bus/mhi/core/main.c @@ -596,6 +596,7 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl, void *dev_rp; struct mhi_buf_info *buf_info; u16 xfer_len; + bool rp_valid; if (!is_valid_ring_ptr(tre_ring, ptr)) { dev_err(&mhi_cntrl->mhi_dev->dev, @@ -609,6 +610,15 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl, if (dev_rp >= (tre_ring->base + tre_ring->len)) dev_rp = tre_ring->base; + if (tre_ring->rp < tre_ring->wp) + rp_valid = (dev_rp <= tre_ring->wp && dev_rp > tre_ring->rp); + else + rp_valid = !(dev_rp <= tre_ring->rp && dev_rp > tre_ring->wp); + + WARN_ON(!rp_valid); + if (!rp_valid) + goto end_process_tx_event; + result.dir = mhi_chan->dir; local_rp = tre_ring->rp;