From patchwork Wed Dec 16 22:41:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Doug Anderson X-Patchwork-Id: 344556 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 65CE1C1B0D8 for ; Wed, 16 Dec 2020 22:43:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2941C2342C for ; Wed, 16 Dec 2020 22:43:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730228AbgLPWnK (ORCPT ); Wed, 16 Dec 2020 17:43:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60194 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730199AbgLPWnK (ORCPT ); Wed, 16 Dec 2020 17:43:10 -0500 Received: from mail-pf1-x42d.google.com (mail-pf1-x42d.google.com [IPv6:2607:f8b0:4864:20::42d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 54EA9C061282 for ; Wed, 16 Dec 2020 14:42:30 -0800 (PST) Received: by mail-pf1-x42d.google.com with SMTP id x126so8746175pfc.7 for ; Wed, 16 Dec 2020 14:42:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=cbZxiG/JD9YQXv2sQzFlyfu39bBLP48QnsSoq2lBxGQ=; b=IB9ftPWqhkJ4XfQ0uT62WOhCnW1scqBjf/Vu/VT1F2BkMYqlYsaNMwYuz85LTtBYHa W//kvue5VoM869pvuCSbkQV6qMW3s2ZT7kRPssW5MJlk1Ih8KMcog/Evox8zJvC4DDus FOMzbBs4C5TZnId1htBYIrrYQ70J2TGXpUPWs= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=cbZxiG/JD9YQXv2sQzFlyfu39bBLP48QnsSoq2lBxGQ=; b=ptyugb+0N/OGwHqYI1RueHiCzaKCBoI+6O+AtyXwtPxITr1AYGmzjfduH4bfkoRjgj DlJqXvXZAREjXTs333hopSWIg/UvPJNowINnxm462CRBgssRS13AXFHqUyhNp+x49Y0R 4dorUSZFUZocIdhPHNcajwqLo0ScQWQQhvjxxpeIWn/t8z0f17NyYaJ5VQlNY1GkWJkw hPlfJ4q83oqPHIEd9b89LjrdGdH/1RvNnJUf6aIkFXas1CTSojpOdfFxxazpSv3E13xc JIwo6nCIR943NzLeBgD7Nf6RhWr0CFOT3tmhUZpMHZXepubAVp+BuulF1+j9z8ALIg0e cVJg== X-Gm-Message-State: AOAM530uasXwm0jHFC4sEMIx6i7+eJOG146mBR09+b7mWH6u/CwKpdmJ 1DvAAOX5iDKXjkL+PvFkssrA/g== X-Google-Smtp-Source: ABdhPJxxOPgyfgV1lE6ycXNrKq/uEGEznucC4Jaqs5lrvh2F69meDN9DL54tbwtdzGXgbgHJdVc2XQ== X-Received: by 2002:a65:5687:: with SMTP id v7mr30380250pgs.249.1608158549883; Wed, 16 Dec 2020 14:42:29 -0800 (PST) Received: from tictac2.mtv.corp.google.com ([2620:15c:202:1:42b0:34ff:fe3d:58e6]) by smtp.gmail.com with ESMTPSA id q26sm3561703pfl.219.2020.12.16.14.42.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 16 Dec 2020 14:42:29 -0800 (PST) From: Douglas Anderson To: Mark Brown Cc: msavaliy@qti.qualcomm.com, akashast@codeaurora.org, Stephen Boyd , Roja Rani Yarubandi , Douglas Anderson , Alok Chauhan , Andy Gross , Bjorn Andersson , Dilip Kota , Girish Mahadevan , linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-spi@vger.kernel.org Subject: [PATCH v2 1/4] spi: spi-geni-qcom: Fix geni_spi_isr() NULL dereference in timeout case Date: Wed, 16 Dec 2020 14:41:49 -0800 Message-Id: <20201216144114.v2.1.I99ee04f0cb823415df59bd4f550d6ff5756e43d6@changeid> X-Mailer: git-send-email 2.29.2.684.gfbc64c5ab5-goog MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org In commit 7ba9bdcb91f6 ("spi: spi-geni-qcom: Don't keep a local state variable") we changed handle_fifo_timeout() so that we set "mas->cur_xfer" to NULL to make absolutely sure that we don't mess with the buffers from the previous transfer in the timeout case. Unfortunately, this caused the IRQ handler to dereference NULL in some cases. One case: CPU0 CPU1 ---- ---- setup_fifo_xfer() geni_se_setup_m_cmd() ... handle_fifo_timeout() spin_lock_irq(mas->lock) mas->cur_xfer = NULL geni_se_cancel_m_cmd() spin_unlock_irq(mas->lock) geni_spi_isr() spin_lock(mas->lock) if (m_irq & M_RX_FIFO_WATERMARK_EN) geni_spi_handle_rx() mas->cur_xfer NULL dereference! tl;dr: Seriously delayed interrupts for RX/TX can lead to timeout handling setting mas->cur_xfer to NULL. Let's check for the NULL transfer in the TX and RX cases and reset the watermark or clear out the fifo respectively to put the hardware back into a sane state. NOTE: things still could get confused if we get timeouts all the way through handle_fifo_timeout() and then start a new transfer because interrupts from the old transfer / cancel / abort could still be pending. A future patch will help this corner case. Fixes: 561de45f72bd ("spi: spi-geni-qcom: Add SPI driver support for GENI based QUP") Signed-off-by: Douglas Anderson Reviewed-by: Stephen Boyd --- Changes in v2: - (ptr == NULL) => (!ptr). - Addressed loop nits in geni_spi_handle_rx(). - Commit message rewording from Stephen. drivers/spi/spi-geni-qcom.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c index 25810a7eef10..bf55abbd39f1 100644 --- a/drivers/spi/spi-geni-qcom.c +++ b/drivers/spi/spi-geni-qcom.c @@ -354,6 +354,12 @@ static bool geni_spi_handle_tx(struct spi_geni_master *mas) unsigned int bytes_per_fifo_word = geni_byte_per_fifo_word(mas); unsigned int i = 0; + /* Stop the watermark IRQ if nothing to send */ + if (!mas->cur_xfer) { + writel(0, se->base + SE_GENI_TX_WATERMARK_REG); + return false; + } + max_bytes = (mas->tx_fifo_depth - mas->tx_wm) * bytes_per_fifo_word; if (mas->tx_rem_bytes < max_bytes) max_bytes = mas->tx_rem_bytes; @@ -396,6 +402,14 @@ static void geni_spi_handle_rx(struct spi_geni_master *mas) if (rx_last_byte_valid && rx_last_byte_valid < 4) rx_bytes -= bytes_per_fifo_word - rx_last_byte_valid; } + + /* Clear out the FIFO and bail if nowhere to put it */ + if (mas->cur_xfer == NULL) { + while (i++ < DIV_ROUND_UP(rx_bytes, bytes_per_fifo_word)) + readl(se->base + SE_GENI_RX_FIFOn); + return; + } + if (mas->rx_rem_bytes < rx_bytes) rx_bytes = mas->rx_rem_bytes; From patchwork Wed Dec 16 22:41:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Doug Anderson X-Patchwork-Id: 344555 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5BD7BC2BB9A for ; Wed, 16 Dec 2020 22:43:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2AA1423609 for ; Wed, 16 Dec 2020 22:43:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730242AbgLPWnr (ORCPT ); Wed, 16 Dec 2020 17:43:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60290 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730240AbgLPWnr (ORCPT ); Wed, 16 Dec 2020 17:43:47 -0500 Received: from mail-pf1-x432.google.com (mail-pf1-x432.google.com [IPv6:2607:f8b0:4864:20::432]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2E6BDC061794 for ; Wed, 16 Dec 2020 14:42:32 -0800 (PST) Received: by mail-pf1-x432.google.com with SMTP id s21so17542505pfu.13 for ; Wed, 16 Dec 2020 14:42:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=UErCAYhJNV6xENTjEWyjhUKDhkm78BgKdEVy9iP5rpY=; b=aGYwi3nCHT8nh/xWKQW6YYDtzh0JvnpyDaEJXPUIrRybqWWGZh4rvft41syG6W0grW 2DpNBeSoHiOOySRCAsyZ5QZpQWTjpXqb8Fjy3XT2LUHz9i4+AtyMtS9A7O1d+jJDyoye b12DQaByTtamB9mNemAUUFCKhWDrl45BC91eI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=UErCAYhJNV6xENTjEWyjhUKDhkm78BgKdEVy9iP5rpY=; b=gSHF2PGk38qr+2OFvNl+WNFABh9r5bbP+9/sOBDETh0rBNqPdSpv+EfxlIPtg19j+g D9sFqdfQO/KgoDZXg+UAvWTILb1mA673thslwPph3kxp+k0c8RtJO/s7hCyd0lLWYXRR +THK5b+UXnaxydr8/IKYCw7V7S+V87cGKI/LcDEXTgz7HXoLazG8KhQPLEtX8JFMCKm4 1MqMoxh6oBkdczwioVkJqlbU/eD+C/JpCUdZdEFao03eP5wPGHRNIxmejnxe7qjNzWnP OeBQKf10gTA4zo9cwEKPxn3whAVQ1XjHcPrmIEZW7RB3noMCKAwCwiIsVOUxDa5Syf0U aYNg== X-Gm-Message-State: AOAM531qd7zqCppSCWiJPeJ7TRoISiiwTFOyfzwiQha+J5+GuxrMPbGq hoxrY/tjhEH0Me/O8DOOCXoprQ== X-Google-Smtp-Source: ABdhPJzBgFjwuLAhEPeoSpgqaZPjMc1mH2n4DifWFn9DUPtuN2Gbp7liVDv6JOveNwetKOvc6iHXhw== X-Received: by 2002:a62:1c16:0:b029:1a6:8b06:68e9 with SMTP id c22-20020a621c160000b02901a68b0668e9mr11441294pfc.45.1608158551710; Wed, 16 Dec 2020 14:42:31 -0800 (PST) Received: from tictac2.mtv.corp.google.com ([2620:15c:202:1:42b0:34ff:fe3d:58e6]) by smtp.gmail.com with ESMTPSA id q26sm3561703pfl.219.2020.12.16.14.42.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 16 Dec 2020 14:42:31 -0800 (PST) From: Douglas Anderson To: Mark Brown Cc: msavaliy@qti.qualcomm.com, akashast@codeaurora.org, Stephen Boyd , Roja Rani Yarubandi , Douglas Anderson , Alok Chauhan , Andy Gross , Bjorn Andersson , Dilip Kota , Girish Mahadevan , linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-spi@vger.kernel.org Subject: [PATCH v2 2/4] spi: spi-geni-qcom: Fail new xfers if xfer/cancel/abort pending Date: Wed, 16 Dec 2020 14:41:50 -0800 Message-Id: <20201216144114.v2.2.Ibade998ed587e070388b4bf58801f1107a40eb53@changeid> X-Mailer: git-send-email 2.29.2.684.gfbc64c5ab5-goog In-Reply-To: <20201216144114.v2.1.I99ee04f0cb823415df59bd4f550d6ff5756e43d6@changeid> References: <20201216144114.v2.1.I99ee04f0cb823415df59bd4f550d6ff5756e43d6@changeid> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org If we got a timeout when trying to send an abort command then it means that we just got 3 timeouts in a row: 1. The original timeout that caused handle_fifo_timeout() to be called. 2. A one second timeout waiting for the cancel command to finish. 3. A one second timeout waiting for the abort command to finish. SPI is clocked by the controller, so nothing (aside from a hardware fault or a totally broken sequencer) should be causing the actual commands to fail in hardware. However, even though the hardware itself is not expected to fail (and it'd be hard to predict how we should handle things if it did), it's easy to hit the timeout case by simply blocking our interrupt handler from running for a long period of time. Obviously the system is in pretty bad shape if a interrupt handler is blocked for > 2 seconds, but there are certainly bugs (even bugs in other unrelated drivers) that can make this happen. Let's make things a bit more robust against this case. If we fail to abort we'll set a flag and then we'll block all future transfers until we have no more interrupts pending. Fixes: 561de45f72bd ("spi: spi-geni-qcom: Add SPI driver support for GENI based QUP") Signed-off-by: Douglas Anderson --- Changes in v2: - Make this just about the failed abort. drivers/spi/spi-geni-qcom.c | 56 +++++++++++++++++++++++++++++++++++-- 1 file changed, 54 insertions(+), 2 deletions(-) diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c index bf55abbd39f1..d988463e606f 100644 --- a/drivers/spi/spi-geni-qcom.c +++ b/drivers/spi/spi-geni-qcom.c @@ -83,6 +83,7 @@ struct spi_geni_master { spinlock_t lock; int irq; bool cs_flag; + bool abort_failed; }; static int get_spi_clk_cfg(unsigned int speed_hz, @@ -141,8 +142,46 @@ static void handle_fifo_timeout(struct spi_master *spi, spin_unlock_irq(&mas->lock); time_left = wait_for_completion_timeout(&mas->abort_done, HZ); - if (!time_left) + if (!time_left) { dev_err(mas->dev, "Failed to cancel/abort m_cmd\n"); + + /* + * No need for a lock since SPI core has a lock and we never + * access this from an interrupt. + */ + mas->abort_failed = true; + } +} + +static bool spi_geni_is_abort_still_pending(struct spi_geni_master *mas) +{ + struct geni_se *se = &mas->se; + u32 m_irq, m_irq_en; + + if (!mas->abort_failed) + return false; + + /* + * The only known case where a transfer times out and then a cancel + * times out then an abort times out is if something is blocking our + * interrupt handler from running. Avoid starting any new transfers + * until that sorts itself out. + */ + m_irq = readl(se->base + SE_GENI_M_IRQ_STATUS); + m_irq_en = readl(se->base + SE_GENI_M_IRQ_EN); + if (m_irq & m_irq_en) { + dev_err(mas->dev, "Interrupts pending after abort: %#010x\n", + m_irq & m_irq_en); + return true; + } + + /* + * If we're here the problem resolved itself so no need to check more + * on future transfers. + */ + mas->abort_failed = false; + + return false; } static void spi_geni_set_cs(struct spi_device *slv, bool set_flag) @@ -158,9 +197,15 @@ static void spi_geni_set_cs(struct spi_device *slv, bool set_flag) if (set_flag == mas->cs_flag) return; + pm_runtime_get_sync(mas->dev); + + if (spi_geni_is_abort_still_pending(mas)) { + dev_err(mas->dev, "Can't set chip select\n"); + goto exit; + } + mas->cs_flag = set_flag; - pm_runtime_get_sync(mas->dev); spin_lock_irq(&mas->lock); reinit_completion(&mas->cs_done); if (set_flag) @@ -173,6 +218,7 @@ static void spi_geni_set_cs(struct spi_device *slv, bool set_flag) if (!time_left) handle_fifo_timeout(spi, NULL); +exit: pm_runtime_put(mas->dev); } @@ -280,6 +326,9 @@ static int spi_geni_prepare_message(struct spi_master *spi, int ret; struct spi_geni_master *mas = spi_master_get_devdata(spi); + if (spi_geni_is_abort_still_pending(mas)) + return -EBUSY; + ret = setup_fifo_params(spi_msg->spi, spi); if (ret) dev_err(mas->dev, "Couldn't select mode %d\n", ret); @@ -509,6 +558,9 @@ static int spi_geni_transfer_one(struct spi_master *spi, { struct spi_geni_master *mas = spi_master_get_devdata(spi); + if (spi_geni_is_abort_still_pending(mas)) + return -EBUSY; + /* Terminate and return success for 0 byte length transfer */ if (!xfer->len) return 0;