From patchwork Thu Oct 12 19:25:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Doug Anderson X-Patchwork-Id: 733186 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5D2563AC1F for ; Thu, 12 Oct 2023 19:30:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=chromium.org header.i=@chromium.org header.b="ZKyBR9cU" Received: from mail-pf1-x432.google.com (mail-pf1-x432.google.com [IPv6:2607:f8b0:4864:20::432]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9B2F9BB for ; Thu, 12 Oct 2023 12:30:18 -0700 (PDT) Received: by mail-pf1-x432.google.com with SMTP id d2e1a72fcca58-6b20577ef7bso31186b3a.3 for ; Thu, 12 Oct 2023 12:30:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; t=1697139018; x=1697743818; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=NbvzQqP0LAVibFSc2uogz4x5ZKRmyFttXMYHpoYS+M4=; b=ZKyBR9cUkEwA/LMbDou152prn3jqubQEGGjLUARcLLnm0F8vWYtVANBYVDiLdrH8kG wCV+DLkaWoxHy22+6i+Bb9rILNZ5hv1WD+JfwFf8IYmoRpWTjBvGEHIFFTCjYKxy7eaR BGLxcG6cYjXPbEmAW6PKG1gByySvPV2cmro6k= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697139018; x=1697743818; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=NbvzQqP0LAVibFSc2uogz4x5ZKRmyFttXMYHpoYS+M4=; b=Kz/T4fdXe1n1NgyJCpiOmB/AU+6OJX5Zu0HCb9zVXCKUT5ZepRm7Iz4i99FekJFFrh 3Fxa1f4hUCkFz1EdjzLuvMwjWw9n2lQJDMLjeUH0MeN39+e9k5K8yaPaOiF1Kue8WGQ9 
rXJgb02Od0BdWZqGtt+PYr8CFPXf5dGe1U8AlKaU8B7pcURmOW5+F+ZnKuUUwXFuddCZ MpfTmccweR3uCRII4mE1DMYCO/LLe3H9d4jnrBAcFUnzXlXdNDEsxpemroVKRdWl82S3 Py7vjxsZKo/9DWT7HCnH4iKQMM6uWGyUBnkqWUp+P5+FR0+vnYPlcfAuaJXV0vHQQRzN RnXQ== X-Gm-Message-State: AOJu0Yxd4JQiXay/KmipNlKbT1hIhM2vjraDtJtrTlXBjz4R6Zr4Sh7C GUr8w9ZX9MniBQospsHG62WZLA== X-Google-Smtp-Source: AGHT+IHX+LAFkl+UTb4hMHk5hMbenBt98/XVUPofnKxCfpJ1RLXBg+q7crQtvdtlRe1yoANKkXPevg== X-Received: by 2002:a05:6a21:789c:b0:14b:8b82:867f with SMTP id bf28-20020a056a21789c00b0014b8b82867fmr26000299pzc.50.1697139017857; Thu, 12 Oct 2023 12:30:17 -0700 (PDT) Received: from tictac2.mtv.corp.google.com ([2620:15c:9d:2:7c85:4a99:f03e:6f30]) by smtp.gmail.com with ESMTPSA id b3-20020a639303000000b0057c25885fcfsm2075720pge.10.2023.10.12.12.30.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 12 Oct 2023 12:30:17 -0700 (PDT) From: Douglas Anderson To: Jakub Kicinski , Hayes Wang , "David S . Miller" Cc: Alan Stern , Simon Horman , Edward Hill , Laura Nao , linux-usb@vger.kernel.org, Grant Grundler , Douglas Anderson , =?utf-8?q?Bj=C3=B8rn_Mork?= , Eric Dumazet , Paolo Abeni , linux-kernel@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH v3 1/5] r8152: Increase USB control msg timeout to 5000ms as per spec Date: Thu, 12 Oct 2023 12:25:00 -0700 Message-ID: <20231012122458.v3.1.I6e4fb5ae61b4c6ab32058cb12228fd5bd32da676@changeid> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012192552.3900360-1-dianders@chromium.org> References: <20231012192552.3900360-1-dianders@chromium.org> Precedence: bulk X-Mailing-List: linux-usb@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net 
According to the comment next to USB_CTRL_GET_TIMEOUT and
USB_CTRL_SET_TIMEOUT, although sending/receiving control messages is
usually quite fast, the spec allows them to take up to 5 seconds.
Let's increase the timeout in the Realtek driver from 500ms to 5000ms
(using the #defines) to account for this.

This is not just a theoretical change. The need for the longer timeout
was seen in testing. Specifically, if you drop a sc7180-trogdor based
Chromebook into the kdb debugger and then "go" again after sitting in
the debugger for a while, the next USB control message takes a long
time. Out of ~40 tests the slowest USB control message was 4.5
seconds.

While dropping into kdb is not exactly an end-user scenario, the above
is similar to what could happen due to a temporary interrupt storm,
what could happen if there was a host controller (HW or SW) issue, or
what could happen if the Realtek device got into a confused state and
needed time to recover.

This change is fairly critical since the r8152 driver in Linux doesn't
expect register reads/writes (which are backed by USB control
messages) to fail.
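To make the effect of the timeout concrete, here is a minimal user-space sketch (not driver code). The function `sim_control_msg()` and the `SKETCH_*` constants are hypothetical stand-ins invented for illustration; only the 500 ms, 5000 ms and 4.5 s figures come from the description above.

```c
#include <errno.h>

/* Hypothetical stand-ins for the old hard-coded value and the 5 s value. */
#define SKETCH_OLD_TIMEOUT_MS 500
#define SKETCH_NEW_TIMEOUT_MS 5000

/*
 * Model a control transfer that needs latency_ms to complete.
 * Returns 0 on success, or -ETIMEDOUT if the budget runs out first.
 */
static int sim_control_msg(int latency_ms, int timeout_ms)
{
	if (latency_ms > timeout_ms)
		return -ETIMEDOUT;
	return 0;
}
```

Under this model, the 4.5 second transfer seen in testing fails with the 500 ms budget and succeeds with the 5000 ms one, while a typical fast transfer succeeds either way.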
Fixes: ac718b69301c ("net/usb: new driver for RTL8152") Suggested-by: Hayes Wang Signed-off-by: Douglas Anderson --- (no changes since v1) drivers/net/usb/r8152.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index 0c13d9950cd8..482957beae66 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -1212,7 +1212,7 @@ int get_registers(struct r8152 *tp, u16 value, u16 index, u16 size, void *data) ret = usb_control_msg(tp->udev, tp->pipe_ctrl_in, RTL8152_REQ_GET_REGS, RTL8152_REQT_READ, - value, index, tmp, size, 500); + value, index, tmp, size, USB_CTRL_GET_TIMEOUT); if (ret < 0) memset(data, 0xff, size); else @@ -1235,7 +1235,7 @@ int set_registers(struct r8152 *tp, u16 value, u16 index, u16 size, void *data) ret = usb_control_msg(tp->udev, tp->pipe_ctrl_out, RTL8152_REQ_SET_REGS, RTL8152_REQT_WRITE, - value, index, tmp, size, 500); + value, index, tmp, size, USB_CTRL_SET_TIMEOUT); kfree(tmp); @@ -9494,7 +9494,8 @@ static u8 __rtl_get_hw_ver(struct usb_device *udev) ret = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0), RTL8152_REQ_GET_REGS, RTL8152_REQT_READ, - PLA_TCR0, MCU_TYPE_PLA, tmp, sizeof(*tmp), 500); + PLA_TCR0, MCU_TYPE_PLA, tmp, sizeof(*tmp), + USB_CTRL_GET_TIMEOUT); if (ret > 0) ocp_data = (__le32_to_cpu(*tmp) >> 16) & VERSION_MASK; From patchwork Thu Oct 12 19:25:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Doug Anderson X-Patchwork-Id: 733185 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B07BD3AC3E for ; Thu, 12 Oct 2023 19:30:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=chromium.org header.i=@chromium.org header.b="Lo4hCg8M" Received: from 
mail-pg1-x52a.google.com (mail-pg1-x52a.google.com [IPv6:2607:f8b0:4864:20::52a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3A5C4D9 for ; Thu, 12 Oct 2023 12:30:22 -0700 (PDT) Received: by mail-pg1-x52a.google.com with SMTP id 41be03b00d2f7-578d0d94986so997337a12.2 for ; Thu, 12 Oct 2023 12:30:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; t=1697139022; x=1697743822; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=wIBdQHfHPM91+b5JonQNYSwOG95FePEogJzP0FDSRh4=; b=Lo4hCg8MjbZ3rjVctuJwOHfdaEbxxPjqg2YWMAVzsSb3nN11ID0NyQsfSnrivseRfd qlus7Z5tDflBYlebtyxxR1gWj9UesmXyTFDKydhLvmZO+OoC5H5tQ8MqFTrSJ/wSizjl BnxHR+d9dsvcIDl/L6uwd59u7d/uhHtknxc8g= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697139022; x=1697743822; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=wIBdQHfHPM91+b5JonQNYSwOG95FePEogJzP0FDSRh4=; b=bKj3Gem5UeTW7Rx4Qyg94hO1GYODzDxcHs+a8wDO/1nXVyXNuCJCiKpOyzA4AZ8H+z xugyzq3bkjCXMtRE31Ww1a1un6VQLEMXoQuM1FhC9MSt3wnMhBtT0NTCIrmfvVJWFd3F bALUkA2GGDAAMbNzvdW20jE6ybzTaptkuNn2+nbaLeU7mszGVIJyUH7EIzZreH4B5Ccw jhaVcPK0RMB1HjQznniryeO6XCYTqUMY6YKOgP26KO32I98m8E4r9s324X9DjnvsqIiW O/wsCrMbO0LbTNoWV0B3Xo/BLGPu2oPtwW0cAgCSDBT81EMHuzXUc/RKEZXTPgRQcFaI xKHw== X-Gm-Message-State: AOJu0Yw01+8YVVfB8yuPf4hqQmiG58J7mVneD7MY0/L3VrA9oC+PHuG2 fpbB7cHFtkP96bHtUvfM0HVmWA== X-Google-Smtp-Source: AGHT+IGVCqHhyWjPHboIFX68T+H6J7OFRjv+HwCMKpgqYEXTLWlEh1BjUo3hAYNJygg/UPMwH3OV1A== X-Received: by 2002:a05:6a20:9385:b0:161:3120:e840 with SMTP id x5-20020a056a20938500b001613120e840mr31028866pzh.2.1697139021715; Thu, 12 Oct 2023 12:30:21 -0700 (PDT) Received: from tictac2.mtv.corp.google.com ([2620:15c:9d:2:7c85:4a99:f03e:6f30]) by smtp.gmail.com with ESMTPSA id 
b3-20020a639303000000b0057c25885fcfsm2075720pge.10.2023.10.12.12.30.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 12 Oct 2023 12:30:21 -0700 (PDT) From: Douglas Anderson To: Jakub Kicinski , Hayes Wang , "David S . Miller" Cc: Alan Stern , Simon Horman , Edward Hill , Laura Nao , linux-usb@vger.kernel.org, Grant Grundler , Douglas Anderson , =?utf-8?q?Bj=C3=B8rn_Mork?= , Eric Dumazet , Paolo Abeni , linux-kernel@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH v3 3/5] r8152: Check for unplug in r8153b_ups_en() / r8153c_ups_en() Date: Thu, 12 Oct 2023 12:25:02 -0700 Message-ID: <20231012122458.v3.3.I6405b1587446c157c6d6263957571f2b11f330a7@changeid> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012192552.3900360-1-dianders@chromium.org> References: <20231012192552.3900360-1-dianders@chromium.org> Precedence: bulk X-Mailing-List: linux-usb@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net If the adapter is unplugged while we're looping in r8153b_ups_en() / r8153c_ups_en() we could end up looping for 10 seconds (20 ms * 500 loops). Add code similar to what's done in other places in the driver to check for unplug and bail. Signed-off-by: Douglas Anderson --- (no changes since v2) Changes in v2: - ("Check for unplug in r8153b_ups_en() / r8153c_ups_en()") new for v2. 
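The worst-case arithmetic above (20 ms * 500 loops = 10 seconds) and the effect of the early bail-out can be sketched in user space. This is only an illustrative model: `struct sim_dev`, `sim_wait_autoload()` and the `SKETCH_*` constants are hypothetical names, and the sleep is simulated by adding to a counter rather than actually waiting.

```c
#include <stdbool.h>

#define SKETCH_POLL_ITERATIONS 500
#define SKETCH_POLL_DELAY_MS   20

/* Hypothetical model of the adapter state the loop cares about. */
struct sim_dev {
	bool unplugged;      /* stands in for the unplug flag */
	bool autoload_done;  /* stands in for the AUTOLOAD_DONE bit */
};

/*
 * Poll until the device reports "autoload done" or we run out of
 * iterations. Returns the simulated number of milliseconds spent waiting.
 * An unplugged device never reports done, so without the unplug check
 * the loop runs to completion.
 */
static int sim_wait_autoload(const struct sim_dev *dev, bool check_unplug)
{
	int waited_ms = 0;

	for (int i = 0; i < SKETCH_POLL_ITERATIONS; i++) {
		if (check_unplug && dev->unplugged)
			return waited_ms;
		if (!dev->unplugged && dev->autoload_done)
			break;
		waited_ms += SKETCH_POLL_DELAY_MS;
	}
	return waited_ms;
}
```

With an unplugged device, the model waits the full 10000 ms when the check is disabled and returns immediately when it is enabled.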
drivers/net/usb/r8152.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index fff2f9e67b5f..888d3884821e 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -3663,6 +3663,8 @@ static void r8153b_ups_en(struct r8152 *tp, bool enable) int i; for (i = 0; i < 500; i++) { + if (test_bit(RTL8152_UNPLUG, &tp->flags)) + return; if (ocp_read_word(tp, MCU_TYPE_PLA, PLA_BOOT_CTRL) & AUTOLOAD_DONE) break; @@ -3703,6 +3705,8 @@ static void r8153c_ups_en(struct r8152 *tp, bool enable) int i; for (i = 0; i < 500; i++) { + if (test_bit(RTL8152_UNPLUG, &tp->flags)) + return; if (ocp_read_word(tp, MCU_TYPE_PLA, PLA_BOOT_CTRL) & AUTOLOAD_DONE) break; From patchwork Thu Oct 12 19:25:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Doug Anderson X-Patchwork-Id: 733184 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7B3EC3B292 for ; Thu, 12 Oct 2023 19:30:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=chromium.org header.i=@chromium.org header.b="K/clFkMd" Received: from mail-oo1-xc35.google.com (mail-oo1-xc35.google.com [IPv6:2607:f8b0:4864:20::c35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 960BFE7 for ; Thu, 12 Oct 2023 12:30:26 -0700 (PDT) Received: by mail-oo1-xc35.google.com with SMTP id 006d021491bc7-57bab4e9e1aso773910eaf.3 for ; Thu, 12 Oct 2023 12:30:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; t=1697139026; x=1697743826; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; 
bh=Gcf2QHQ6TL/y4ldUWtXd1iY7HrL3wUwUVRkcf9G8sDo=; b=K/clFkMd+XPolG0iAYTBF2zlEmF/6bHk3q/JPxGQIz+FKYm3hXETmsvhsawRxfavV+ BU8s10SGfdnAmqllT7Z87ZQ6ITKyrPt1p6wrQv2fluQHgdB7cPV8T2ME5jKVa2mTIHay faYbRwVQvYjXa9ppr3+iBX4CTS9mGEQzi96qw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697139026; x=1697743826; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Gcf2QHQ6TL/y4ldUWtXd1iY7HrL3wUwUVRkcf9G8sDo=; b=xEOHUtd/mvdOBqicpS09GT5v8MaHNr4WOrGDicFNuFbYxA5gshECIWiwAl0kaeijjC Hd9h1imZ8eoBRAV9yeGjnO3i4l4KQC+9/5JBa4YW2NrfkCXzR5ys7zmuSnzenHchDLv3 otUdiB3iy5CG4cnHB1jc1JGld4qrZX+y6q8lhv5XXbtOLoKwQvsewI2OlqkH/UeD/i2W cUXH8XgsXhh5DxBZyBIcrl8FXWN1CED90RSe7/SDmg9bh1O8ps2mr34vgrCFGJ09XsfC ev/L/7QqiwG4ZsnjPS+i2VdDo9QAKGCEHPn0b9uY5AtQdsrVf8E0WBys2yFmaUZyt78g sJPA== X-Gm-Message-State: AOJu0Yz5U8WQkyX1VyxM9EAziJe9XULVeIRjrwrT1xxOJLSJVI2kimoD 3FxX7BxJqGMzI8tZDePKC7wczA== X-Google-Smtp-Source: AGHT+IGVMIagzcDJihVo3iwTpp5taRYV/4JiAT53a8LGf/J75FI9rrmlMqCXjag7LSIhKeXB0lCN3w== X-Received: by 2002:a05:6358:7e07:b0:134:e301:2c21 with SMTP id o7-20020a0563587e0700b00134e3012c21mr27381283rwm.15.1697139025586; Thu, 12 Oct 2023 12:30:25 -0700 (PDT) Received: from tictac2.mtv.corp.google.com ([2620:15c:9d:2:7c85:4a99:f03e:6f30]) by smtp.gmail.com with ESMTPSA id b3-20020a639303000000b0057c25885fcfsm2075720pge.10.2023.10.12.12.30.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 12 Oct 2023 12:30:25 -0700 (PDT) From: Douglas Anderson To: Jakub Kicinski , Hayes Wang , "David S . 
Miller" Cc: Alan Stern , Simon Horman , Edward Hill , Laura Nao , linux-usb@vger.kernel.org, Grant Grundler , Douglas Anderson , =?utf-8?q?Bj=C3=B8rn_Mork?= , Eric Dumazet , Paolo Abeni , linux-kernel@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH v3 5/5] r8152: Block future register access if register access fails Date: Thu, 12 Oct 2023 12:25:04 -0700 Message-ID: <20231012122458.v3.5.Ib2affdbfdc2527aaeef9b46d4f23f7c04147faeb@changeid> X-Mailer: git-send-email 2.42.0.655.g421f12c284-goog In-Reply-To: <20231012192552.3900360-1-dianders@chromium.org> References: <20231012192552.3900360-1-dianders@chromium.org> Precedence: bulk X-Mailing-List: linux-usb@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Even though the functions to read/write registers can fail, most of the places in the r8152 driver that read/write register values don't check error codes. The lack of error code checking is problematic in at least two ways. The first problem is that the r8152 driver often uses code patterns similar to this: x = read_register() x = x | SOME_BIT; write_register(x); ...with the above pattern, if the read_register() fails and returns garbage then we'll end up trying to write modified garbage back to the Realtek adapter. If the write_register() succeeds that's bad. Note that as of commit f53a7ad18959 ("r8152: Set memory to all 0xFFs on failed reg reads") the "garbage" returned by read_register() will at least be consistent garbage, but it is still garbage. It turns out that this problem is very serious. 
Writing garbage to some of the hardware registers on the Ethernet
adapter can put the adapter in such a bad state that it needs to be
power cycled (fully unplugged and plugged in again) before it can
enumerate again.

The second problem is that the r8152 driver generally has functions
that are long sequences of register writes. Assuming everything will
be OK if a random register write fails in the middle isn't a great
assumption.

One might wonder if the above two problems are real. You could ask if
we would really have a successful write after a failed read. It turns
out that the answer appears to be "yes, this can happen". In fact,
we've seen at least two distinct failure modes where this happens.

On a sc7180-trogdor Chromebook if you drop into kdb for a while and
then resume, you can see:
1. We get a "Tx timeout".
2. The "Tx timeout" queues up a USB reset.
3. In rtl8152_pre_reset() we try to reinit the hardware.
4. The first several (2-9) register accesses fail with a timeout, then
   things recover.

The above test case was actually fixed by the patch ("r8152: Increase
USB control msg timeout to 5000ms as per spec") but at least shows
that we really can see successful calls after failed ones.

On a different (AMD) based Chromebook with a particular adapter, we
found that during reboot tests we'd also sometimes get a transitory
failure. In this case we saw -EPIPE being returned sometimes. Retrying
worked, but retrying is not always safe for all register accesses
since reading/writing some registers might have side effects (like
registers that clear on read).

Let's fully lock out all register access if a register access fails.
When we do this, we'll try to queue up a USB reset and try to unlock
register access after the reset. This is slightly trickier than it
sounds since the r8152 driver has an optimized reset sequence that
only works reliably after probe happens. In order to handle this, we
avoid the optimized reset if probe didn't finish.
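The read/modify/write hazard and the lock-out described above can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the driver's code: `struct sim_adapter` and the `sim_*` helpers are hypothetical, the adapter is reduced to a single 16-bit register, and the reset that would later unlock access is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical one-register adapter with injectable transfer failures. */
struct sim_adapter {
	uint16_t reg;       /* the single hardware register we model */
	int fail_next;      /* number of upcoming transfers that will fail */
	bool lockout;       /* whether a failure blocks later accesses */
	bool inaccessible;  /* latched on the first failure when lockout is on */
};

/* Model one control transfer: 0 on success, negative errno on failure. */
static int sim_transfer(struct sim_adapter *a)
{
	if (a->lockout && a->inaccessible)
		return -ENODEV;
	if (a->fail_next > 0) {
		a->fail_next--;
		if (a->lockout)
			a->inaccessible = true;
		return -EPIPE;
	}
	return 0;
}

/* A failed read yields all-ones, like the 0xff fill mentioned above. */
static uint16_t sim_read(struct sim_adapter *a)
{
	if (sim_transfer(a) < 0)
		return 0xffff;
	return a->reg;
}

/* The register only changes if the write transfer succeeds. */
static void sim_write(struct sim_adapter *a, uint16_t val)
{
	if (sim_transfer(a) == 0)
		a->reg = val;
}

/* The unchecked read/modify/write pattern from the description. */
static void sim_set_bit(struct sim_adapter *a, uint16_t bit)
{
	uint16_t x = sim_read(a);

	x |= bit;
	sim_write(a, x);
}
```

In this model a single transient read failure followed by a successful write overwrites the register with 0xffff when `lockout` is off. With `lockout` on, the write is refused and the register keeps its previous value.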
When locking out access, we'll use the existing infrastructure that
the driver was using when it detected we were unplugged. This keeps us
from getting stuck in delay loops in some parts of the driver.

Signed-off-by: Douglas Anderson
Reviewed-by: Grant Grundler
---
Originally when looking at this problem I thought that the obvious
solution was to "just" add better error handling to the driver. This
_sounds_ appealing, but it's a massive change and touches a
significant portion of the lines in this driver. It's also not always
obvious what the driver should be doing to handle errors.

If you feel like you need to be convinced and to see what it looked
like to add better error handling, I put up my "work in progress"
patch when I was investigating this at: https://crrev.com/c/4937290

There is still some active debate between the two approaches, though,
so it would be interesting to hear if anyone had any opinions.

Changes in v3:
- Fixed v2 changelog ending up in the commit message.
- farmework -> framework in comments.

Changes in v2:
- Reset patch no longer based on retry patch, since that was dropped.
- Reset patch should be robust even if failures happen in probe.
- Switched booleans to bits in the "flags" variable.
- Check for -ENODEV instead of "udev->state == USB_STATE_NOTATTACHED" drivers/net/usb/r8152.c | 176 ++++++++++++++++++++++++++++++++++++---- 1 file changed, 159 insertions(+), 17 deletions(-) diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index 151c3c383080..fce7c58f8142 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -773,6 +773,8 @@ enum rtl8152_flags { SCHEDULE_TASKLET, GREEN_ETHERNET, RX_EPROTO, + IN_PRE_RESET, + PROBED_WITH_NO_ERRORS, }; #define DEVICE_ID_LENOVO_USB_C_TRAVEL_HUB 0x721e @@ -953,6 +955,8 @@ struct r8152 { u8 version; u8 duplex; u8 autoneg; + + unsigned int reg_access_reset_count; }; /** @@ -1200,6 +1204,91 @@ static unsigned int agg_buf_sz = 16384; #define RTL_LIMITED_TSO_SIZE (size_to_mtu(agg_buf_sz) - sizeof(struct tx_desc)) +/* If register access fails then we block access and issue a reset. If this + * happens too many times in a row without a successful access then we stop + * trying to reset and just leave access blocked. + */ +#define REGISTER_ACCESS_MAX_RESETS 3 + +static void rtl_set_inaccessible(struct r8152 *tp) +{ + set_bit(RTL8152_INACCESSIBLE, &tp->flags); + smp_mb__after_atomic(); +} + +static void rtl_set_accessible(struct r8152 *tp) +{ + clear_bit(RTL8152_INACCESSIBLE, &tp->flags); + smp_mb__after_atomic(); +} + +static +int r8152_control_msg(struct r8152 *tp, unsigned int pipe, __u8 request, + __u8 requesttype, __u16 value, __u16 index, void *data, + __u16 size, const char *msg_tag) +{ + struct usb_device *udev = tp->udev; + int ret; + + if (test_bit(RTL8152_INACCESSIBLE, &tp->flags)) + return -ENODEV; + + ret = usb_control_msg(udev, pipe, request, requesttype, + value, index, data, size, + USB_CTRL_GET_TIMEOUT); + + /* No need to issue a reset report an error if the USB device got + * unplugged; just return immediately. 
+	 */
+	if (ret == -ENODEV)
+		return ret;
+
+	/* If the transfer was successful then we're done */
+	if (ret >= 0) {
+		tp->reg_access_reset_count = 0;
+		return ret;
+	}
+
+	dev_err(&udev->dev,
+		"Failed to %s %d bytes at %#06x/%#06x (%d)\n",
+		msg_tag, size, value, index, ret);
+
+	/* Block all future register access until we reset. Much of the code
+	 * in the driver doesn't check for errors. Notably, many parts of the
+	 * driver do a read/modify/write of a register value without
+	 * confirming that the read succeeded. Writing back modified garbage
+	 * like this can fully wedge the adapter, requiring a power cycle.
+	 */
+	rtl_set_inaccessible(tp);
+
+	/* Failing to access registers in pre-reset is not surprising since we
+	 * wouldn't be resetting if things were behaving normally. The register
+	 * access we do in pre-reset isn't truly mandatory--we're just reusing
+	 * the disable() function and trying to be nice by powering the
+	 * adapter down before resetting it. Thus, if we're in pre-reset,
+	 * we'll return right away and not try to queue up yet another reset.
+	 * We know the post-reset is already coming.
+	 *
+	 * We'll also return right away if we haven't finished probe. At the
+	 * end of probe we'll queue the reset just to make sure it doesn't
+	 * time out.
+ */ + if (test_bit(IN_PRE_RESET, &tp->flags) || + !test_bit(PROBED_WITH_NO_ERRORS, &tp->flags)) + return ret; + + if (tp->reg_access_reset_count < REGISTER_ACCESS_MAX_RESETS) { + usb_queue_reset_device(tp->intf); + tp->reg_access_reset_count++; + } else if (tp->reg_access_reset_count == REGISTER_ACCESS_MAX_RESETS) { + dev_err(&udev->dev, + "Tried to reset %d times; giving up.\n", + REGISTER_ACCESS_MAX_RESETS); + } + + return ret; +} + static int get_registers(struct r8152 *tp, u16 value, u16 index, u16 size, void *data) { @@ -1210,9 +1299,10 @@ int get_registers(struct r8152 *tp, u16 value, u16 index, u16 size, void *data) if (!tmp) return -ENOMEM; - ret = usb_control_msg(tp->udev, tp->pipe_ctrl_in, - RTL8152_REQ_GET_REGS, RTL8152_REQT_READ, - value, index, tmp, size, USB_CTRL_GET_TIMEOUT); + ret = r8152_control_msg(tp, tp->pipe_ctrl_in, + RTL8152_REQ_GET_REGS, RTL8152_REQT_READ, + value, index, tmp, size, "read"); + if (ret < 0) memset(data, 0xff, size); else @@ -1233,9 +1323,9 @@ int set_registers(struct r8152 *tp, u16 value, u16 index, u16 size, void *data) if (!tmp) return -ENOMEM; - ret = usb_control_msg(tp->udev, tp->pipe_ctrl_out, - RTL8152_REQ_SET_REGS, RTL8152_REQT_WRITE, - value, index, tmp, size, USB_CTRL_SET_TIMEOUT); + ret = r8152_control_msg(tp, tp->pipe_ctrl_out, + RTL8152_REQ_SET_REGS, RTL8152_REQT_WRITE, + value, index, tmp, size, "write"); kfree(tmp); @@ -1244,10 +1334,8 @@ int set_registers(struct r8152 *tp, u16 value, u16 index, u16 size, void *data) static void rtl_set_unplug(struct r8152 *tp) { - if (tp->udev->state == USB_STATE_NOTATTACHED) { - set_bit(RTL8152_INACCESSIBLE, &tp->flags); - smp_mb__after_atomic(); - } + if (tp->udev->state == USB_STATE_NOTATTACHED) + rtl_set_inaccessible(tp); } static int generic_ocp_read(struct r8152 *tp, u16 index, u16 size, @@ -8265,6 +8353,19 @@ static int rtl8152_pre_reset(struct usb_interface *intf) if (!tp) return 0; + /* We can only use the optimized reset if we made it to the end of + * probe without 
any register access fails, which sets + * `PROBED_WITH_NO_ERRORS` to true. If we didn't have that then return + * an error here which tells the USB framework to fully unbind/rebind + * our driver. + */ + mutex_lock(&tp->control); + if (!test_bit(PROBED_WITH_NO_ERRORS, &tp->flags)) { + mutex_unlock(&tp->control); + return -EIO; + } + mutex_unlock(&tp->control); + netdev = tp->netdev; if (!netif_running(netdev)) return 0; @@ -8277,7 +8378,9 @@ static int rtl8152_pre_reset(struct usb_interface *intf) napi_disable(&tp->napi); if (netif_carrier_ok(netdev)) { mutex_lock(&tp->control); + set_bit(IN_PRE_RESET, &tp->flags); tp->rtl_ops.disable(tp); + clear_bit(IN_PRE_RESET, &tp->flags); mutex_unlock(&tp->control); } @@ -8293,6 +8396,10 @@ static int rtl8152_post_reset(struct usb_interface *intf) if (!tp) return 0; + mutex_lock(&tp->control); + rtl_set_accessible(tp); + mutex_unlock(&tp->control); + /* reset the MAC address in case of policy change */ if (determine_ethernet_addr(tp, &sa) >= 0) { rtnl_lock(); @@ -9494,17 +9601,30 @@ static u8 __rtl_get_hw_ver(struct usb_device *udev) __le32 *tmp; u8 version; int ret; + int i; tmp = kmalloc(sizeof(*tmp), GFP_KERNEL); if (!tmp) return 0; - ret = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0), - RTL8152_REQ_GET_REGS, RTL8152_REQT_READ, - PLA_TCR0, MCU_TYPE_PLA, tmp, sizeof(*tmp), - USB_CTRL_GET_TIMEOUT); - if (ret > 0) - ocp_data = (__le32_to_cpu(*tmp) >> 16) & VERSION_MASK; + /* Retry up to 3 times in case there is a transitory error. We do this + * since retrying a read of the version is always safe and this + * function doesn't take advantage of r8152_control_msg() which would + * queue up a reset upon error. 
+ */ + for (i = 0; i < 3; i++) { + ret = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0), + RTL8152_REQ_GET_REGS, RTL8152_REQT_READ, + PLA_TCR0, MCU_TYPE_PLA, tmp, sizeof(*tmp), + USB_CTRL_GET_TIMEOUT); + if (ret > 0) { + ocp_data = (__le32_to_cpu(*tmp) >> 16) & VERSION_MASK; + break; + } + } + + if (i != 0 && ret > 0) + dev_warn(&udev->dev, "Needed %d retries to read version\n", i); kfree(tmp); @@ -9784,7 +9904,29 @@ static int rtl8152_probe(struct usb_interface *intf, else device_set_wakeup_enable(&udev->dev, false); - netif_info(tp, probe, netdev, "%s\n", DRIVER_VERSION); + mutex_lock(&tp->control); + if (test_bit(RTL8152_INACCESSIBLE, &tp->flags)) { + /* If the device is marked inaccessible before probe even + * finished then one of two things happened. Either we got a + * USB error during probe or the user already unplugged the + * device. + * + * If we got a USB error during probe then we skipped doing a + * reset in r8152_control_msg() and deferred it to here. This + * is because the queued reset will give up after 1 second + * (see usb_lock_device_for_reset()) and we want to make sure + * that we queue things up right before probe finishes. + * + * If the user already unplugged the device then the USB + * framework will call unbind right away for us. The extra + * reset we queue up here will be harmless. + */ + usb_queue_reset_device(tp->intf); + } else { + set_bit(PROBED_WITH_NO_ERRORS, &tp->flags); + netif_info(tp, probe, netdev, "%s\n", DRIVER_VERSION); + } + mutex_unlock(&tp->control); return 0;
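The reset throttling that the patch adds in r8152_control_msg() (at most REGISTER_ACCESS_MAX_RESETS queued resets in a row, with the count cleared by any successful access) can be modelled in isolation. The sketch below is a simplified user-space illustration: `struct sim_state`, `sim_access_ok()`, `sim_access_failed()` and `SKETCH_MAX_RESETS` are hypothetical names, and the pre-reset and unfinished-probe early returns are deliberately left out.

```c
#include <stdbool.h>

/* Mirrors the value of REGISTER_ACCESS_MAX_RESETS in the patch. */
#define SKETCH_MAX_RESETS 3

/* Hypothetical stand-in for the counter added to struct r8152. */
struct sim_state {
	unsigned int reset_count;
};

/* A successful access ends the current streak of failures. */
static void sim_access_ok(struct sim_state *s)
{
	s->reset_count = 0;
}

/*
 * A failed access queues a reset unless the cap has been reached.
 * Returns true if a reset would be queued, false if we give up.
 */
static bool sim_access_failed(struct sim_state *s)
{
	if (s->reset_count < SKETCH_MAX_RESETS) {
		s->reset_count++;
		return true;
	}
	return false;
}
```

Three consecutive failures each queue a reset in this model, a fourth does not, and a single successful access makes resets available again.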