From patchwork Tue May 20 17:44:51 2025
X-Patchwork-Submitter: Connor Abbott
X-Patchwork-Id: 891344
From: Connor Abbott
Date: Tue, 20 May 2025 13:44:51 -0400
Subject: [PATCH v7 2/7] iommu/arm-smmu: Move handing of RESUME to the context fault handler
Message-Id: <20250520-msm-gpu-fault-fixes-next-v7-2-96cd1cc9ae05@gmail.com>
References: <20250520-msm-gpu-fault-fixes-next-v7-0-96cd1cc9ae05@gmail.com>
In-Reply-To: <20250520-msm-gpu-fault-fixes-next-v7-0-96cd1cc9ae05@gmail.com>
To: Rob Clark, Will Deacon, Robin Murphy, Joerg Roedel, Sean Paul,
 Konrad Dybcio, Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten
Cc: iommu@lists.linux.dev, linux-arm-msm@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, freedreno@lists.freedesktop.org,
 Connor Abbott

The upper layer fault handler is now expected to handle everything
required to retry the transaction or dump state related to it, since we
enable threaded IRQs. This means that we can take charge of writing
RESUME, making sure that we always write it after writing FSR, as
recommended by the specification.

The iommu fault handler should return -EAGAIN if a transaction needs to
be retried. This avoids tricky cross-tree changes in drm/msm, since it
never wants to retry the transaction and it already returns 0 from its
fault handler. Therefore it will continue to correctly terminate the
transaction without any changes required.

devcoredumps from drm/msm will temporarily be broken until it is fixed
to collect devcoredumps inside its fault handler, but fixing that first
would actually be worse, because MMU-500 ignores writes to RESUME unless
all fields of FSR (except SS, of course) are clear and raises an
interrupt when only SS is asserted.
Right now, things happen to work most of the time when we collect a
devcoredump, because RESUME is written asynchronously in the fault
worker after the fault handler clears FSR and finishes, although there
will be some spurious faults. If drm/msm were changed first, before this
commit fixes the FSR/RESUME write order, then SS would never be cleared,
the interrupt would never be cleared, and the whole system would hang
every time a fault happens. It will therefore help bisectability if this
commit goes first.

I've changed the TBU path to also accept -EAGAIN and do the same thing,
while keeping the old -EBUSY behavior. The old path was broken anyway:
you'd get a storm of interrupts from returning IRQ_NONE, which would
eventually result in the interrupt being disabled, and I think it was
dead code, so it should eventually be deleted. Note that drm/msm never
uses TBU, so this is untested.

Signed-off-by: Connor Abbott
---
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c |  9 +++++++++
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c       | 14 --------------
 drivers/iommu/arm/arm-smmu/arm-smmu.c            |  6 ++++++
 3 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c
index d03b2239baad48680eb6c3201c85f924ec4a0e07..65e0ef6539fe70aabffa0c8fbe444c34c620d367 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c
@@ -406,6 +406,12 @@ irqreturn_t qcom_smmu_context_fault(int irq, void *dev)
 			arm_smmu_print_context_fault_info(smmu, idx, &cfi);
 
 		arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_FSR, cfi.fsr);
+
+		if (cfi.fsr & ARM_SMMU_CB_FSR_SS) {
+			arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_RESUME,
+					  ret == -EAGAIN ? 0 : ARM_SMMU_RESUME_TERMINATE);
+		}
+
 		return IRQ_HANDLED;
 	}
 
@@ -416,6 +422,9 @@ irqreturn_t qcom_smmu_context_fault(int irq, void *dev)
 	if (!tmp || tmp == -EBUSY) {
 		ret = IRQ_HANDLED;
 		resume = ARM_SMMU_RESUME_TERMINATE;
+	} else if (tmp == -EAGAIN) {
+		ret = IRQ_HANDLED;
+		resume = 0;
 	} else {
 		phys_addr_t phys_atos = qcom_smmu_verify_fault(smmu_domain, cfi.iova,
 							       cfi.fsr);
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index 4d3b99babd3584ec971bef30cd533c35904fe7f5..c84730d33a30c013a37e603d10319fb83203eaa5 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -120,19 +120,6 @@ static void qcom_adreno_smmu_set_stall(const void *cookie, bool enabled)
 		qsmmu->stall_enabled &= ~BIT(cfg->cbndx);
 }
 
-static void qcom_adreno_smmu_resume_translation(const void *cookie, bool terminate)
-{
-	struct arm_smmu_domain *smmu_domain = (void *)cookie;
-	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-	struct arm_smmu_device *smmu = smmu_domain->smmu;
-	u32 reg = 0;
-
-	if (terminate)
-		reg |= ARM_SMMU_RESUME_TERMINATE;
-
-	arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg);
-}
-
 static void qcom_adreno_smmu_set_prr_bit(const void *cookie, bool set)
 {
 	struct arm_smmu_domain *smmu_domain = (void *)cookie;
@@ -337,7 +324,6 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
 	priv->set_ttbr0_cfg = qcom_adreno_smmu_set_ttbr0_cfg;
 	priv->get_fault_info = qcom_adreno_smmu_get_fault_info;
 	priv->set_stall = qcom_adreno_smmu_set_stall;
-	priv->resume_translation = qcom_adreno_smmu_resume_translation;
 	priv->set_prr_bit = NULL;
 	priv->set_prr_addr = NULL;
 
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 8f439c265a23f16bd11801a93dae12fd476ddfb2..8d95b14c7d5a4040bb8add56475e297beb16b162 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -474,6 +474,12 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 		arm_smmu_print_context_fault_info(smmu, idx, &cfi);
 
 	arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_FSR, cfi.fsr);
+
+	if (cfi.fsr & ARM_SMMU_CB_FSR_SS) {
+		arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_RESUME,
+				  ret == -EAGAIN ? 0 : ARM_SMMU_RESUME_TERMINATE);
+	}
+
 	return IRQ_HANDLED;
 }
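For upper-layer drivers, the resulting convention can be illustrated with a
minimal fault-handler sketch. my_fault_handler() and my_try_repair_mapping()
are hypothetical names, not part of this series; the registration API is the
existing iommu_set_fault_handler() / report_iommu_fault() path that the SMMU
driver already calls into:

#include <linux/iommu.h>

/* Hypothetical helper: try to fix up the mapping so the access can be retried. */
static bool my_try_repair_mapping(void *cookie, unsigned long iova, int flags)
{
	return false;	/* placeholder: nothing to repair in this sketch */
}

static int my_fault_handler(struct iommu_domain *domain, struct device *dev,
			    unsigned long iova, int flags, void *cookie)
{
	/*
	 * -EAGAIN: the SMMU driver writes RESUME = 0 and the stalled
	 * transaction is retried.  Any other value (drm/msm returns 0):
	 * RESUME = ARM_SMMU_RESUME_TERMINATE and the transaction is aborted.
	 * Either way, the driver now writes RESUME only after clearing FSR.
	 */
	if (my_try_repair_mapping(cookie, iova, flags))
		return -EAGAIN;

	return 0;
}

/* Registered once per domain, e.g.:
 *	iommu_set_fault_handler(domain, my_fault_handler, cookie);
 */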
From patchwork Tue May 20 17:44:53 2025
X-Patchwork-Submitter: Connor Abbott
X-Patchwork-Id: 891343

From: Connor Abbott
Date: Tue, 20 May 2025 13:44:53 -0400
Subject: [PATCH v7 4/7] drm/msm: Don't use a worker to capture fault devcoredump
Message-Id: <20250520-msm-gpu-fault-fixes-next-v7-4-96cd1cc9ae05@gmail.com>
References: <20250520-msm-gpu-fault-fixes-next-v7-0-96cd1cc9ae05@gmail.com>
In-Reply-To: <20250520-msm-gpu-fault-fixes-next-v7-0-96cd1cc9ae05@gmail.com>
To: Rob Clark, Will Deacon, Robin Murphy, Joerg Roedel, Sean Paul,
 Konrad Dybcio, Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten
Cc: iommu@lists.linux.dev, linux-arm-msm@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, freedreno@lists.freedesktop.org,
 Connor Abbott

Now that we use a threaded IRQ, it should be safe to do this in the
fault handler.

We can also remove fault_info from struct msm_gpu and just pass it
directly.
Signed-off-by: Connor Abbott
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 22 ++++++++--------------
 drivers/gpu/drm/msm/msm_gpu.c           | 20 +++++++++-----------
 drivers/gpu/drm/msm/msm_gpu.h           |  8 ++------
 3 files changed, 19 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 26db1f4b5fb90930bdbd2f17682bf47e35870936..4a6dc29ff7071940e440297f5fbbe4e2d06c3ffd 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -257,14 +257,6 @@ int adreno_fault_handler(struct msm_gpu *gpu, unsigned long iova, int flags,
 	const char *type = "UNKNOWN";
 	bool do_devcoredump = info && !READ_ONCE(gpu->crashstate);
 
-	/*
-	 * If we aren't going to be resuming later from fault_worker, then do
-	 * it now.
-	 */
-	if (!do_devcoredump) {
-		gpu->aspace->mmu->funcs->resume_translation(gpu->aspace->mmu);
-	}
-
 	/*
 	 * Print a default message if we couldn't get the data from the
 	 * adreno-smmu-priv
@@ -291,16 +283,18 @@ int adreno_fault_handler(struct msm_gpu *gpu, unsigned long iova, int flags,
 			scratch[0], scratch[1], scratch[2], scratch[3]);
 
 	if (do_devcoredump) {
+		struct msm_gpu_fault_info fault_info = {};
+
 		/* Turn off the hangcheck timer to keep it from bothering us */
 		timer_delete(&gpu->hangcheck_timer);
 
-		gpu->fault_info.ttbr0 = info->ttbr0;
-		gpu->fault_info.iova  = iova;
-		gpu->fault_info.flags = flags;
-		gpu->fault_info.type  = type;
-		gpu->fault_info.block = block;
+		fault_info.ttbr0 = info->ttbr0;
+		fault_info.iova  = iova;
+		fault_info.flags = flags;
+		fault_info.type  = type;
+		fault_info.block = block;
 
-		kthread_queue_work(gpu->worker, &gpu->fault_work);
+		msm_gpu_fault_crashstate_capture(gpu, &fault_info);
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index c380d9d9f5af10b90ef733b05f5b0295c0445f38..457f019d507e954daeb609c313d37ee64fd492f9 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -257,7 +257,8 @@ static void msm_gpu_crashstate_get_bo(struct msm_gpu_state *state,
 }
 
 static void msm_gpu_crashstate_capture(struct msm_gpu *gpu,
-		struct msm_gem_submit *submit, char *comm, char *cmd)
+		struct msm_gem_submit *submit, struct msm_gpu_fault_info *fault_info,
+		char *comm, char *cmd)
 {
 	struct msm_gpu_state *state;
 
@@ -276,7 +277,8 @@ static void msm_gpu_crashstate_capture(struct msm_gpu *gpu,
 	/* Fill in the additional crash state information */
 	state->comm = kstrdup(comm, GFP_KERNEL);
 	state->cmd = kstrdup(cmd, GFP_KERNEL);
-	state->fault_info = gpu->fault_info;
+	if (fault_info)
+		state->fault_info = *fault_info;
 
 	if (submit) {
 		int i;
@@ -308,7 +310,8 @@ static void msm_gpu_crashstate_capture(struct msm_gpu *gpu,
 }
 #else
 static void msm_gpu_crashstate_capture(struct msm_gpu *gpu,
-		struct msm_gem_submit *submit, char *comm, char *cmd)
+		struct msm_gem_submit *submit, struct msm_gpu_fault_info *fault_info,
+		char *comm, char *cmd)
 {
 }
 #endif
@@ -405,7 +408,7 @@ static void recover_worker(struct kthread_work *work)
 
 	/* Record the crash state */
 	pm_runtime_get_sync(&gpu->pdev->dev);
-	msm_gpu_crashstate_capture(gpu, submit, comm, cmd);
+	msm_gpu_crashstate_capture(gpu, submit, NULL, comm, cmd);
 
 	kfree(cmd);
 	kfree(comm);
@@ -459,9 +462,8 @@ static void recover_worker(struct kthread_work *work)
 	msm_gpu_retire(gpu);
 }
 
-static void fault_worker(struct kthread_work *work)
+void msm_gpu_fault_crashstate_capture(struct msm_gpu *gpu, struct msm_gpu_fault_info *fault_info)
 {
-	struct msm_gpu *gpu = container_of(work, struct msm_gpu, fault_work);
 	struct msm_gem_submit *submit;
 	struct msm_ringbuffer *cur_ring = gpu->funcs->active_ring(gpu);
 	char *comm = NULL, *cmd = NULL;
 
@@ -484,16 +486,13 @@ static void fault_worker(struct kthread_work *work)
 
 	/* Record the crash state */
 	pm_runtime_get_sync(&gpu->pdev->dev);
-	msm_gpu_crashstate_capture(gpu, submit, comm, cmd);
+	msm_gpu_crashstate_capture(gpu, submit, fault_info, comm, cmd);
 	pm_runtime_put_sync(&gpu->pdev->dev);
 
 	kfree(cmd);
 	kfree(comm);
 
 resume_smmu:
-	memset(&gpu->fault_info, 0, sizeof(gpu->fault_info));
-	gpu->aspace->mmu->funcs->resume_translation(gpu->aspace->mmu);
-
 	mutex_unlock(&gpu->lock);
 }
 
@@ -882,7 +881,6 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 	init_waitqueue_head(&gpu->retire_event);
 	kthread_init_work(&gpu->retire_work, retire_worker);
 	kthread_init_work(&gpu->recover_work, recover_worker);
-	kthread_init_work(&gpu->fault_work, fault_worker);
 
 	priv->hangcheck_period = DRM_MSM_HANGCHECK_DEFAULT_PERIOD;
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index e25009150579c08f7b98d4461a75757d1093734a..bed0692f5adb30e50d0448640a329158d1ffe5e5 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -253,12 +253,6 @@ struct msm_gpu {
 #define DRM_MSM_HANGCHECK_PROGRESS_RETRIES 3
 	struct timer_list hangcheck_timer;
 
-	/* Fault info for most recent iova fault: */
-	struct msm_gpu_fault_info fault_info;
-
-	/* work for handling GPU ioval faults: */
-	struct kthread_work fault_work;
-
 	/* work for handling GPU recovery: */
 	struct kthread_work recover_work;
 
@@ -705,6 +699,8 @@ static inline void msm_gpu_crashstate_put(struct msm_gpu *gpu)
 	mutex_unlock(&gpu->lock);
 }
 
+void msm_gpu_fault_crashstate_capture(struct msm_gpu *gpu, struct msm_gpu_fault_info *fault_info);
+
 /*
  * Simple macro to semi-cleanly add the MAP_PRIV flag for targets that can
  * support expanded privileges
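The resulting control flow can be condensed into the following sketch,
simplified from adreno_fault_handler() after this patch. fault_handler_sketch
is an illustrative name; fault decoding, printing and the .type field are
omitted:

#include <linux/adreno-smmu-priv.h>	/* struct adreno_smmu_fault_info */
#include "msm_gpu.h"			/* struct msm_gpu, msm_gpu_fault_crashstate_capture() */

static int fault_handler_sketch(struct msm_gpu *gpu, unsigned long iova,
				int flags, struct adreno_smmu_fault_info *info,
				const char *block)
{
	/* Only capture once; gpu->crashstate holds the previous dump, if any. */
	if (info && !READ_ONCE(gpu->crashstate)) {
		struct msm_gpu_fault_info fault_info = {
			.ttbr0 = info->ttbr0,
			.iova  = iova,
			.flags = flags,
			.block = block,
		};

		/* Keep the hangcheck timer from interfering while we dump. */
		timer_delete(&gpu->hangcheck_timer);

		/*
		 * May take gpu->lock and allocate memory: safe here only
		 * because the context-bank IRQ is threaded, which is what
		 * lets us drop the kthread worker indirection.
		 */
		msm_gpu_fault_crashstate_capture(gpu, &fault_info);
	}

	/* Returning 0 tells the SMMU driver to terminate the transaction. */
	return 0;
}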
From patchwork Tue May 20 17:44:55 2025
X-Patchwork-Submitter: Connor Abbott
X-Patchwork-Id: 891342
From: Connor Abbott
Date: Tue, 20 May 2025 13:44:55 -0400
Subject: [PATCH v7 6/7] drm/msm: Temporarily disable stall-on-fault after a page fault
Message-Id: <20250520-msm-gpu-fault-fixes-next-v7-6-96cd1cc9ae05@gmail.com>
References: <20250520-msm-gpu-fault-fixes-next-v7-0-96cd1cc9ae05@gmail.com>
In-Reply-To: <20250520-msm-gpu-fault-fixes-next-v7-0-96cd1cc9ae05@gmail.com>
To: Rob Clark, Will Deacon, Robin Murphy, Joerg Roedel, Sean Paul,
 Konrad Dybcio, Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten
Cc: iommu@lists.linux.dev, linux-arm-msm@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, freedreno@lists.freedesktop.org,
 Connor Abbott

When things go wrong, the GPU is capable of quickly generating millions
of faulting translation requests per second. When that happens, in the
stall-on-fault model each access will stall until it wins the race to
signal the fault and then the RESUME register is written. This slows
processing page faults to a crawl, as the GPU can generate faults much
faster than the CPU can acknowledge them. It also means that all
available resources in the SMMU are saturated waiting for the stalled
transactions, so that other transactions, such as those generated by the
GMU, which shares translation resources with the GPU, cannot proceed.
This causes a GMU watchdog timeout, which leads to a failed reset
(because GX cannot collapse while a transaction is pending) and a
permanently hung GPU.

On older platforms with qcom,smmu-v2, it seems that when one transaction
is stalled subsequent faulting transactions are terminated, which avoids
this problem, but the MMU-500 follows the spec here.

To work around these problems, disable stall-on-fault as soon as we get
a page fault, and keep it disabled until a cooldown period after page
faults stop. This allows the GMU some guaranteed time to continue
working. We only use stall-on-fault to halt the GPU while we collect a
devcoredump, and we always terminate the transaction afterward, so it's
fine to miss some subsequent page faults. We also keep it disabled so
long as the current devcoredump hasn't been deleted, because in that
case we likely won't capture another one if there's a fault.

After this commit, HFI messages still occasionally time out, because the
crashdump handler doesn't run fast enough to let the GMU resume, but the
driver seems to recover from it. This will probably go away once the HFI
timeout is increased.
Signed-off-by: Connor Abbott
Reviewed-by: Rob Clark
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c   |  2 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |  4 ++++
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 40 ++++++++++++++++++++++++++++++++-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 ++
 drivers/gpu/drm/msm/msm_debugfs.c       |  3 +++
 drivers/gpu/drm/msm/msm_drv.c           |  4 ++++
 drivers/gpu/drm/msm/msm_drv.h           | 23 +++++++++++++++++++
 drivers/gpu/drm/msm/msm_iommu.c         |  9 ++++++++
 drivers/gpu/drm/msm/msm_mmu.h           |  1 +
 9 files changed, 87 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 650e5bac225f372e819130b891f1d020b464f17f..60aef079623606bb1ae44ba59ac45e391595b0ba 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -131,6 +131,8 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	struct msm_ringbuffer *ring = submit->ring;
 	unsigned int i, ibs = 0;
 
+	adreno_check_and_reenable_stall(adreno_gpu);
+
 	if (IS_ENABLED(CONFIG_DRM_MSM_GPU_SUDO) && submit->in_rb) {
 		ring->cur_ctx_seqno = 0;
 		a5xx_submit_in_rb(gpu, submit);
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 06465bc2d0b4b128cddfcfcaf1fe4252632b6777..afa4626d58f577d5d47f47b494b26953adcf230f 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -212,6 +212,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	struct msm_ringbuffer *ring = submit->ring;
 	unsigned int i, ibs = 0;
 
+	adreno_check_and_reenable_stall(adreno_gpu);
+
 	a6xx_set_pagetable(a6xx_gpu, ring, submit);
 
 	get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP(0),
@@ -335,6 +337,8 @@ static void a7xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	struct msm_ringbuffer *ring = submit->ring;
 	unsigned int i, ibs = 0;
 
+	adreno_check_and_reenable_stall(adreno_gpu);
+
 	/*
 	 * Toggle concurrent binning for pagetable switch and set the thread to
 	 * BR since only it can execute the pagetable switch packets.
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 4a6dc29ff7071940e440297f5fbbe4e2d06c3ffd..0f8211641c318f1b619e1a72bb77f064fb78397b 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -246,16 +246,54 @@ u64 adreno_private_address_space_size(struct msm_gpu *gpu)
 	return SZ_4G;
 }
 
+void adreno_check_and_reenable_stall(struct adreno_gpu *adreno_gpu)
+{
+	struct msm_gpu *gpu = &adreno_gpu->base;
+	struct msm_drm_private *priv = gpu->dev->dev_private;
+	unsigned long flags;
+
+	/*
+	 * Wait until the cooldown period has passed and we would actually
+	 * collect a crashdump to re-enable stall-on-fault.
+	 */
+	spin_lock_irqsave(&priv->fault_stall_lock, flags);
+	if (!priv->stall_enabled &&
+	    ktime_after(ktime_get(), priv->stall_reenable_time) &&
+	    !READ_ONCE(gpu->crashstate)) {
+		priv->stall_enabled = true;
+
+		gpu->aspace->mmu->funcs->set_stall(gpu->aspace->mmu, true);
+	}
+	spin_unlock_irqrestore(&priv->fault_stall_lock, flags);
+}
+
 #define ARM_SMMU_FSR_TF                 BIT(1)
 #define ARM_SMMU_FSR_PF			BIT(3)
 #define ARM_SMMU_FSR_EF			BIT(4)
+#define ARM_SMMU_FSR_SS			BIT(30)
 
 int adreno_fault_handler(struct msm_gpu *gpu, unsigned long iova, int flags,
			 struct adreno_smmu_fault_info *info, const char *block,
			 u32 scratch[4])
 {
+	struct msm_drm_private *priv = gpu->dev->dev_private;
 	const char *type = "UNKNOWN";
-	bool do_devcoredump = info && !READ_ONCE(gpu->crashstate);
+	bool do_devcoredump = info && (info->fsr & ARM_SMMU_FSR_SS) &&
+			      !READ_ONCE(gpu->crashstate);
+	unsigned long irq_flags;
+
+	/*
+	 * In case there is a subsequent storm of pagefaults, disable
+	 * stall-on-fault for at least half a second.
+	 */
+	spin_lock_irqsave(&priv->fault_stall_lock, irq_flags);
+	if (priv->stall_enabled) {
+		priv->stall_enabled = false;
+
+		gpu->aspace->mmu->funcs->set_stall(gpu->aspace->mmu, false);
+	}
+	priv->stall_reenable_time = ktime_add_ms(ktime_get(), 500);
+	spin_unlock_irqrestore(&priv->fault_stall_lock, irq_flags);
 
 	/*
 	 * Print a default message if we couldn't get the data from the
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 92caba3584da0400b44a903e465814af165d40a3..6116f03e3d39bb208c7fa34f203931c563e029f9 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -634,6 +634,8 @@ int adreno_fault_handler(struct msm_gpu *gpu, unsigned long iova, int flags,
			 struct adreno_smmu_fault_info *info, const char *block,
			 u32 scratch[4]);
 
+void adreno_check_and_reenable_stall(struct adreno_gpu *gpu);
+
 int adreno_read_speedbin(struct device *dev, u32 *speedbin);
 
 /*
diff --git a/drivers/gpu/drm/msm/msm_debugfs.c b/drivers/gpu/drm/msm/msm_debugfs.c
index 7ab607252d183f78b99c3a8b878c949ed5f99fec..27952c60575eb308635e7cd9af9d6eb89fdef24d 100644
--- a/drivers/gpu/drm/msm/msm_debugfs.c
+++ b/drivers/gpu/drm/msm/msm_debugfs.c
@@ -319,6 +319,9 @@ static void msm_debugfs_gpu_init(struct drm_minor *minor)
 	debugfs_create_bool("disable_err_irq", 0600, minor->debugfs_root,
			    &priv->disable_err_irq);
 
+	debugfs_create_bool("stall_on_fault_enabled", 0400, minor->debugfs_root,
+			    &priv->stall_enabled);
+
 	gpu_devfreq = debugfs_create_dir("devfreq", minor->debugfs_root);
 
 	debugfs_create_bool("idle_clamp",0600, gpu_devfreq,
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index c3588dc9e53764a27efda1901b094724cec8928a..04a4bde2d33b03ae8fb06b2134ee1910debd774a 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -245,6 +245,10 @@ static int msm_drm_init(struct device *dev, const struct drm_driver *drv)
 	drm_gem_lru_init(&priv->lru.willneed, &priv->lru.lock);
 	drm_gem_lru_init(&priv->lru.dontneed, &priv->lru.lock);
 
+	/* Initialize stall-on-fault */
+	spin_lock_init(&priv->fault_stall_lock);
+	priv->stall_enabled = true;
+
 	/* Teach lockdep about lock ordering wrt. shrinker: */
 	fs_reclaim_acquire(GFP_KERNEL);
 	might_lock(&priv->lru.lock);
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index a65077855201746c37ee742364b61116565f3794..c8afb1ea6040b1ac94ac95a785e6fc366c8dbfd1 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -222,6 +222,29 @@ struct msm_drm_private {
 	 * the sw hangcheck mechanism.
 	 */
 	bool disable_err_irq;
+
+	/**
+	 * @fault_stall_lock:
+	 *
+	 * Serialize changes to stall-on-fault state.
+	 */
+	spinlock_t fault_stall_lock;
+
+	/**
+	 * @fault_stall_reenable_time:
+	 *
+	 * If stall_enabled is false, when to reenable stall-on-fault.
+	 * Protected by @fault_stall_lock.
+	 */
+	ktime_t stall_reenable_time;
+
+	/**
+	 * @stall_enabled:
+	 *
+	 * Whether stall-on-fault is currently enabled. Protected by
+	 * @fault_stall_lock.
+	 */
+	bool stall_enabled;
 };
 
 const struct msm_format *mdp_get_format(struct msm_kms *kms, uint32_t format, uint64_t modifier);
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index aae885d048d0d2fd617d7b2a16833da25f5e84cc..739ce2c283a4613e74df4542ca3b68f180aa8335 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -372,6 +372,14 @@ static int msm_disp_fault_handler(struct iommu_domain *domain, struct device *de
 	return -ENOSYS;
 }
 
+static void msm_iommu_set_stall(struct msm_mmu *mmu, bool enable)
+{
+	struct adreno_smmu_priv *adreno_smmu = dev_get_drvdata(mmu->dev);
+
+	if (adreno_smmu->set_stall)
+		adreno_smmu->set_stall(adreno_smmu->cookie, enable);
+}
+
 static void msm_iommu_detach(struct msm_mmu *mmu)
 {
 	struct msm_iommu *iommu = to_msm_iommu(mmu);
@@ -419,6 +427,7 @@ static const struct msm_mmu_funcs funcs = {
 		.map = msm_iommu_map,
 		.unmap = msm_iommu_unmap,
 		.destroy = msm_iommu_destroy,
+		.set_stall = msm_iommu_set_stall,
 };
 
 struct msm_mmu *msm_iommu_new(struct device *dev, unsigned long quirks)
diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h
index c3d17aae88b0a57b3c7d1df3351b39ec39bca60a..0c694907140d00bae86eb20411aed45650367e74 100644
--- a/drivers/gpu/drm/msm/msm_mmu.h
+++ b/drivers/gpu/drm/msm/msm_mmu.h
@@ -15,6 +15,7 @@ struct msm_mmu_funcs {
			size_t len, int prot);
 	int (*unmap)(struct msm_mmu *mmu, uint64_t iova, size_t len);
 	void (*destroy)(struct msm_mmu *mmu);
+	void (*set_stall)(struct msm_mmu *mmu, bool enable);
 };
 
 enum msm_mmu_type {
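The throttling added by this patch can be summarized in a self-contained
sketch. The field names follow the diff, but struct stall_state and the two
helpers are illustrative, and the actual set_stall() MMU plumbing is reduced
to comments:

#include <linux/ktime.h>
#include <linux/spinlock.h>
#include <linux/types.h>

struct stall_state {
	spinlock_t lock;		/* priv->fault_stall_lock */
	ktime_t reenable_time;		/* priv->stall_reenable_time */
	bool enabled;			/* priv->stall_enabled */
};

/* Called from the fault handler: disable stalling for at least 500 ms. */
static void stall_throttle_on_fault(struct stall_state *st)
{
	unsigned long flags;

	spin_lock_irqsave(&st->lock, flags);
	if (st->enabled)
		st->enabled = false;	/* set_stall(mmu, false) in the real code */
	st->reenable_time = ktime_add_ms(ktime_get(), 500);
	spin_unlock_irqrestore(&st->lock, flags);
}

/* Called on every submit: re-enable once the cooldown has expired and no
 * devcoredump is still being held. */
static void stall_maybe_reenable(struct stall_state *st, bool crashstate_held)
{
	unsigned long flags;

	spin_lock_irqsave(&st->lock, flags);
	if (!st->enabled && !crashstate_held &&
	    ktime_after(ktime_get(), st->reenable_time))
		st->enabled = true;	/* set_stall(mmu, true) in the real code */
	spin_unlock_irqrestore(&st->lock, flags);
}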