From patchwork Tue Mar 4 16:56:47 2025
X-Patchwork-Submitter: Connor Abbott
X-Patchwork-Id: 870185
GVkjJoMjvJB4VPVhLdqaBSXJX9LtV6PsbuXaH6KmoC4qJxeLeFsR X-Gm-Gg: ASbGncsX+o+C5hlZYE7ky2SzpRTDOWGv0eeGmcb/8iWdo5IlFai9quEHiKLfkzwIrK4 PDmmtfuDyD8xhgFSKJIKbiMq/3gjgGcPA37ntdxRteKaVEqZaottJ0/s3ihK6qfClCrEwL1WESH lUryGC1XQwtOm9bLX+Bh1L0a9WJU7QAiGQUV+u8g2/wX0AQu+JY/FCC/CVIgeYTWO1L4TYKk9Zg h0AoummR/X16u1bvDqLHjBIv3jIRV/fnattDMDwc/OwIxuLfm7tVE41nj5+TYb9aqONGPJxDUPj wFTWq386iGbayeqv8bsfrXReWLSX+N13vjqVz0yJEpqlt30AsDyDnv80Iqta46UNEEF9S4PEILf Tcyc= X-Google-Smtp-Source: AGHT+IGoupxASii7X7ZBE/b1vhvmyQma/ZYdh+Tbyfe+Zw9nzGDokCIyHSV+XP73s08QmxQbPXLd+Q== X-Received: by 2002:ad4:596c:0:b0:6d4:2db5:e585 with SMTP id 6a1803df08f44-6e8a0c8667amr101294476d6.1.1741107441643; Tue, 04 Mar 2025 08:57:21 -0800 (PST) Received: from [192.168.1.99] (ool-4355b0da.dyn.optonline.net. [67.85.176.218]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6e8976ec3b6sm68915966d6.125.2025.03.04.08.57.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Mar 2025 08:57:21 -0800 (PST) From: Connor Abbott Date: Tue, 04 Mar 2025 11:56:47 -0500 Subject: [PATCH v4 1/5] iommu/arm-smmu: Save additional information on context fault Precedence: bulk X-Mailing-List: linux-arm-msm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20250304-msm-gpu-fault-fixes-next-v4-1-be14be37f4c3@gmail.com> References: <20250304-msm-gpu-fault-fixes-next-v4-0-be14be37f4c3@gmail.com> In-Reply-To: <20250304-msm-gpu-fault-fixes-next-v4-0-be14be37f4c3@gmail.com> To: Rob Clark , Will Deacon , Robin Murphy , Joerg Roedel , Sean Paul , Konrad Dybcio , Abhinav Kumar , Dmitry Baryshkov , Marijn Suijten Cc: iommu@lists.linux.dev, linux-arm-msm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, freedreno@lists.freedesktop.org, Connor Abbott X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1741107439; l=5429; i=cwabbott0@gmail.com; s=20240426; h=from:subject:message-id; bh=0pLxdo5dIx3/jB5VJnMD05UBhNDf/ce6uNeBNzQSNFg=; b=Y1CairOsJ41mvDRtKq26OuucDooNFTjLoWPwbiBFIfoSNVkipQappptYF6gmvKXrHP1SoogsK PYXTL5Ae4q5CwQehnNi46RBvc1kI63x6VHaiJHQLf1iWq/3bMcW/0aU X-Developer-Key: i=cwabbott0@gmail.com; a=ed25519; pk=dkpOeRSXLzVgqhy0Idr3nsBr4ranyERLMnoAgR4cHmY= This will be used by drm/msm for GPU page faults, replacing the manual register reading it does. Signed-off-by: Connor Abbott --- drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c | 4 ++-- drivers/iommu/arm/arm-smmu/arm-smmu.c | 27 +++++++++++++----------- drivers/iommu/arm/arm-smmu/arm-smmu.h | 5 ++++- 3 files changed, 21 insertions(+), 15 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c index 548783f3f8e89fd978367afa65c473002f66e2e7..ae4fdbbce6ba80440f539557a39866a932360d4e 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c @@ -400,7 +400,7 @@ irqreturn_t qcom_smmu_context_fault(int irq, void *dev) if (list_empty(&tbu_list)) { ret = report_iommu_fault(&smmu_domain->domain, NULL, cfi.iova, - cfi.fsynr & ARM_SMMU_CB_FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ); + cfi.fsynr0 & ARM_SMMU_CB_FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ); if (ret == -ENOSYS) arm_smmu_print_context_fault_info(smmu, idx, &cfi); @@ -412,7 +412,7 @@ irqreturn_t qcom_smmu_context_fault(int irq, void *dev) phys_soft = ops->iova_to_phys(ops, cfi.iova); tmp = report_iommu_fault(&smmu_domain->domain, NULL, cfi.iova, - cfi.fsynr & ARM_SMMU_CB_FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ); + cfi.fsynr0 & ARM_SMMU_CB_FSYNR0_WNR ? 
IOMMU_FAULT_WRITE : IOMMU_FAULT_READ); if (!tmp || tmp == -EBUSY) { ret = IRQ_HANDLED; resume = ARM_SMMU_RESUME_TERMINATE; diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index ade4684c14c9b2724a71e2457288dbfaf7562c83..a9213e0f1579d1e3be0bfba75eea1d5de23117de 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -409,9 +409,12 @@ void arm_smmu_read_context_fault_info(struct arm_smmu_device *smmu, int idx, struct arm_smmu_context_fault_info *cfi) { cfi->iova = arm_smmu_cb_readq(smmu, idx, ARM_SMMU_CB_FAR); + cfi->ttbr0 = arm_smmu_cb_readq(smmu, idx, ARM_SMMU_CB_TTBR0); cfi->fsr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSR); - cfi->fsynr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSYNR0); + cfi->fsynr0 = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSYNR0); + cfi->fsynr1 = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_FSYNR1); cfi->cbfrsynra = arm_smmu_gr1_read(smmu, ARM_SMMU_GR1_CBFRSYNRA(idx)); + cfi->contextidr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_CONTEXTIDR); } void arm_smmu_print_context_fault_info(struct arm_smmu_device *smmu, int idx, @@ -419,7 +422,7 @@ void arm_smmu_print_context_fault_info(struct arm_smmu_device *smmu, int idx, { dev_err(smmu->dev, "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cbfrsynra=0x%x, cb=%d\n", - cfi->fsr, cfi->iova, cfi->fsynr, cfi->cbfrsynra, idx); + cfi->fsr, cfi->iova, cfi->fsynr0, cfi->cbfrsynra, idx); dev_err(smmu->dev, "FSR = %08x [%s%sFormat=%u%s%s%s%s%s%s%s%s], SID=0x%x\n", cfi->fsr, @@ -437,15 +440,15 @@ void arm_smmu_print_context_fault_info(struct arm_smmu_device *smmu, int idx, cfi->cbfrsynra); dev_err(smmu->dev, "FSYNR0 = %08x [S1CBNDX=%u%s%s%s%s%s%s PLVL=%u]\n", - cfi->fsynr, - (u32)FIELD_GET(ARM_SMMU_CB_FSYNR0_S1CBNDX, cfi->fsynr), - (cfi->fsynr & ARM_SMMU_CB_FSYNR0_AFR) ? " AFR" : "", - (cfi->fsynr & ARM_SMMU_CB_FSYNR0_PTWF) ? " PTWF" : "", - (cfi->fsynr & ARM_SMMU_CB_FSYNR0_NSATTR) ? " NSATTR" : "", - (cfi->fsynr & ARM_SMMU_CB_FSYNR0_IND) ? " IND" : "", - (cfi->fsynr & ARM_SMMU_CB_FSYNR0_PNU) ? " PNU" : "", - (cfi->fsynr & ARM_SMMU_CB_FSYNR0_WNR) ? " WNR" : "", - (u32)FIELD_GET(ARM_SMMU_CB_FSYNR0_PLVL, cfi->fsynr)); + cfi->fsynr0, + (u32)FIELD_GET(ARM_SMMU_CB_FSYNR0_S1CBNDX, cfi->fsynr0), + (cfi->fsynr0 & ARM_SMMU_CB_FSYNR0_AFR) ? " AFR" : "", + (cfi->fsynr0 & ARM_SMMU_CB_FSYNR0_PTWF) ? " PTWF" : "", + (cfi->fsynr0 & ARM_SMMU_CB_FSYNR0_NSATTR) ? " NSATTR" : "", + (cfi->fsynr0 & ARM_SMMU_CB_FSYNR0_IND) ? " IND" : "", + (cfi->fsynr0 & ARM_SMMU_CB_FSYNR0_PNU) ? " PNU" : "", + (cfi->fsynr0 & ARM_SMMU_CB_FSYNR0_WNR) ? " WNR" : "", + (u32)FIELD_GET(ARM_SMMU_CB_FSYNR0_PLVL, cfi->fsynr0)); } static irqreturn_t arm_smmu_context_fault(int irq, void *dev) @@ -464,7 +467,7 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev) return IRQ_NONE; ret = report_iommu_fault(&smmu_domain->domain, NULL, cfi.iova, - cfi.fsynr & ARM_SMMU_CB_FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ); + cfi.fsynr0 & ARM_SMMU_CB_FSYNR0_WNR ? 
IOMMU_FAULT_WRITE : IOMMU_FAULT_READ); if (ret == -ENOSYS && __ratelimit(&rs)) arm_smmu_print_context_fault_info(smmu, idx, &cfi); diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h index e2aeb511ae903302e3c15d2cf5f22e2a26ac2346..d3bc77dcd4d40f25bc70f3289616fb866649b022 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h @@ -543,9 +543,12 @@ int arm_mmu500_reset(struct arm_smmu_device *smmu); struct arm_smmu_context_fault_info { unsigned long iova; + u64 ttbr0; u32 fsr; - u32 fsynr; + u32 fsynr0; + u32 fsynr1; u32 cbfrsynra; + u32 contextidr; }; void arm_smmu_read_context_fault_info(struct arm_smmu_device *smmu, int idx, From patchwork Tue Mar 4 16:56:49 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Connor Abbott X-Patchwork-Id: 870184 Received: from mail-qv1-f46.google.com (mail-qv1-f46.google.com [209.85.219.46]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2A9C4283C81 for ; Tue, 4 Mar 2025 16:57:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.46 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741107446; cv=none; b=KpCbSWAB9tdSkEQxYuDHtXp5jLFpjZWK3pNg+QL0e4HwyMxM+OKLhcji9Coc+QkdQzsK8dd9QsXOouegI76Iw+AzjzGoBg+ZcEHX2+Ztd3uAVMmd1AGqNdU6PFxvAS3k1WXGnawDTsw8Ujl+u261jcQWbLjKhcCFTlXqzgToqzs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741107446; c=relaxed/simple; bh=lREsE9qBpuBOvxRn1Al6spv7zBw81cKds9oUzoZ9Rk0=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=VQCvB8vB0jUQbfWVP2d99K1lD+taLa4UnslNxTCBGGeJ9TEHlHV7K6Vil3L4zftWzcoQvvHyBA8H1qNAwkxvYb25wMRF23fuC0yH/eTzxft2qalczpT07/A9tYATywpnEGAOIC2IU8okPpyksWrlkJMLqUz1KdRbQmjgKg+dReI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=SXJkiDyk; arc=none smtp.client-ip=209.85.219.46 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="SXJkiDyk" Received: by mail-qv1-f46.google.com with SMTP id 6a1803df08f44-6e887999a66so3873566d6.3 for ; Tue, 04 Mar 2025 08:57:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1741107444; x=1741712244; darn=vger.kernel.org; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:from:to:cc:subject:date:message-id :reply-to; bh=tz8w4kDb+vNicTDLDyVXgyrGlaWBUBamCstFPGHL+6k=; b=SXJkiDykdAJ52N3VdumTX1dtR/pAv85r1aI/DGA7zWH2sF1dS8wFs8E4JuPW9vTQKO q9aQQZi8VFC4BRYYCiSPfVGpmgPnlqCXQUaA2iQ8blOJqdwjEEzGg3PwhIXEcNJGk+Jc 1h/o7DW7B1e1n7yscIBkuA5xNMmkOxefc53COuDqDz828gp9P5YVeZYA8LVmV44/mLlU rLo79YRa4JvXqyuGKAO/7jOO8ALVG+UvxbAj1O4QZ7rODmAhh4JiOh/r1LVvPCLpPjOp vAnszH7iUrVI9fXXlyNyklB0qZWztiSy1UOLO1TiVOt2PoH46VkyqCJ8sPZjYfZY8K82 V8bg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1741107444; x=1741712244; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding 
From: Connor Abbott
Date: Tue, 04 Mar 2025 11:56:49 -0500
Subject: [PATCH v4 3/5] iommu/arm-smmu: Fix spurious interrupts with stall-on-fault
Message-Id: <20250304-msm-gpu-fault-fixes-next-v4-3-be14be37f4c3@gmail.com>
References: <20250304-msm-gpu-fault-fixes-next-v4-0-be14be37f4c3@gmail.com>
In-Reply-To: <20250304-msm-gpu-fault-fixes-next-v4-0-be14be37f4c3@gmail.com>
To: Rob Clark, Will Deacon, Robin Murphy, Joerg Roedel, Sean Paul, Konrad Dybcio, Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten
Cc: iommu@lists.linux.dev, linux-arm-msm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, freedreno@lists.freedesktop.org, Connor Abbott

On some SMMUv2 implementations, including MMU-500, SMMU_CBn_FSR.SS asserts an interrupt. The only way to clear that bit is to resume the transaction by writing SMMU_CBn_RESUME, but typically resuming the transaction requires complex operations (copying in pages, etc.) that can't be done in IRQ context. drm/msm already has a problem, because its fault handler sometimes schedules a job to dump the GPU state and doesn't resume translation until this is complete.

Work around this by disabling context fault interrupts until after the transaction is resumed.
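A minimal sketch of what that workaround amounts to at the register level, using the driver's existing context-bank accessors; the helper name is illustrative only and is not part of this patch:

static void disable_context_fault_irq(struct arm_smmu_device *smmu, int idx)
{
	u32 sctlr = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_SCTLR);

	/* Clear SCTLR.CFIE so a stalled transaction stops re-asserting the IRQ */
	sctlr &= ~ARM_SMMU_SCTLR_CFIE;
	arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, sctlr);
}

The resume path then sets ARM_SMMU_SCTLR_CFIE again after writing SMMU_CBn_RESUME, as the qcom resume hook in this patch does.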
Because other context banks can share an IRQ line, we may still get an interrupt intended for another context bank, but in this case only SMMU_CBn_FSR.SS will be asserted and we can skip it assuming that interrupts are disabled which is accomplished by removing the bit from ARM_SMMU_CB_FSR_FAULT. SMMU_CBn_FSR.SS won't be asserted unless an external user enabled stall-on-fault, and they are expected to resume the translation and re-enable interrupts. Signed-off-by: Connor Abbott Reviewed-by Robin Murphy --- drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 15 ++++++++++- drivers/iommu/arm/arm-smmu/arm-smmu.c | 41 +++++++++++++++++++++++++++++- drivers/iommu/arm/arm-smmu/arm-smmu.h | 1 - 3 files changed, 54 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c index 186d6ad4fd1c990398df4dec53f4d58ada9e658c..a428e53add08d451fb2152e3ab80e0fba936e214 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c @@ -90,12 +90,25 @@ static void qcom_adreno_smmu_resume_translation(const void *cookie, bool termina struct arm_smmu_domain *smmu_domain = (void *)cookie; struct arm_smmu_cfg *cfg = &smmu_domain->cfg; struct arm_smmu_device *smmu = smmu_domain->smmu; - u32 reg = 0; + u32 reg = 0, sctlr; + unsigned long flags; if (terminate) reg |= ARM_SMMU_RESUME_TERMINATE; + spin_lock_irqsave(&smmu_domain->cb_lock, flags); + arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_RESUME, reg); + + /* + * Re-enable interrupts after they were disabled by + * arm_smmu_context_fault(). + */ + sctlr = arm_smmu_cb_read(smmu, cfg->cbndx, ARM_SMMU_CB_SCTLR); + sctlr |= ARM_SMMU_SCTLR_CFIE; + arm_smmu_cb_write(smmu, cfg->cbndx, ARM_SMMU_CB_SCTLR, sctlr); + + spin_unlock_irqrestore(&smmu_domain->cb_lock, flags); } #define QCOM_ADRENO_SMMU_GPU_SID 0 diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index 498b96e95cb4fdb67c246ef13de1eb8f40d68f7d..284079ef95cd2deeb71816a284850523897badd8 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -466,13 +466,52 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev) if (!(cfi->fsr & ARM_SMMU_CB_FSR_FAULT)) return IRQ_NONE; + /* + * On some implementations FSR.SS asserts a context fault + * interrupt. We do not want this behavior, because resolving the + * original context fault typically requires operations that cannot be + * performed in IRQ context but leaving the stall unacknowledged will + * immediately lead to another spurious interrupt as FSR.SS is still + * set. Work around this by disabling interrupts for this context bank. + * It's expected that interrupts are re-enabled after resuming the + * translation. + * + * We have to do this before report_iommu_fault() so that we don't + * leave interrupts disabled in case the downstream user decides the + * fault can be resolved inside its fault handler. + * + * There is a possible race if there are multiple context banks sharing + * the same interrupt and both signal an interrupt in between writing + * RESUME and SCTLR. We could disable interrupts here before we + * re-enable them in the resume handler, leaving interrupts enabled. + * Lock the write to serialize it with the resume handler. 
+ */ + if (cfi->fsr & ARM_SMMU_CB_FSR_SS) { + u32 val; + + spin_lock(&smmu_domain->cb_lock); + val = arm_smmu_cb_read(smmu, idx, ARM_SMMU_CB_SCTLR); + val &= ~ARM_SMMU_SCTLR_CFIE; + arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, val); + spin_unlock(&smmu_domain->cb_lock); + } + + /* + * The SMMUv2 architecture specification says that if stall-on-fault is + * enabled the correct sequence is to write to SMMU_CBn_FSR to clear + * the fault and then write to SMMU_CBn_RESUME. Clear the interrupt + * first before running the user's fault handler to make sure we follow + * this sequence. It should be ok if there is another fault in the + * meantime because we have already read the fault info. + */ + arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_FSR, cfi->fsr); + ret = report_iommu_fault(&smmu_domain->domain, NULL, cfi->iova, cfi->fsynr0 & ARM_SMMU_CB_FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ); if (ret == -ENOSYS && __ratelimit(&rs)) arm_smmu_print_context_fault_info(smmu, idx, cfi); - arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_FSR, cfi->fsr); return IRQ_HANDLED; } diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h index 411d807e0a7033833716635efb3968a0bd3ff237..4235b772c2cb032778816578c9e6644512543a5e 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.h +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h @@ -214,7 +214,6 @@ enum arm_smmu_cbar_type { ARM_SMMU_CB_FSR_TLBLKF) #define ARM_SMMU_CB_FSR_FAULT (ARM_SMMU_CB_FSR_MULTI | \ - ARM_SMMU_CB_FSR_SS | \ ARM_SMMU_CB_FSR_UUT | \ ARM_SMMU_CB_FSR_EF | \ ARM_SMMU_CB_FSR_PF | \ From patchwork Tue Mar 4 16:56:51 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Connor Abbott X-Patchwork-Id: 870183 Received: from mail-qv1-f53.google.com (mail-qv1-f53.google.com [209.85.219.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 94E6B283C9A for ; Tue, 4 Mar 2025 16:57:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.53 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741107450; cv=none; b=lDY7w18CP6Rov8J/LpWgHLo1NhzcvzkZC+aWexfuqxxZ6YXF1MoiZ55/BJtHmtWlz0TUWAqqQ9x64j0ma4PWY9qZS+8pda1JZ9z4a4HEL3mqECGs3wHkR4ueoT8DfDhAs4GFzVM65Z5z2Ez3clXm4i12VmJ8XqgEJGz+ikwf4gA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741107450; c=relaxed/simple; bh=1LEQ/SMNbwf2yI/kiLpEh729+NspG9MjxFycNdw8Mco=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=BoFPnpo+kxZ+bwPv6xbB+IUr05TORwwDdvb3Hz4LyiypWIr/CIsU3HDzDiZND2fukuy3ziuTa00fA2CENzmiptIeN6ZoW5hSr1vp+v911EhKSR2DEh0l4KqUAON8bmY6QgoF5C6aVvPSJeNMXpRe4Kc3U0pxGWuzl8He/R4msyw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=BfE7TIDQ; arc=none smtp.client-ip=209.85.219.53 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="BfE7TIDQ" Received: by mail-qv1-f53.google.com with SMTP id 6a1803df08f44-6e88ee9dd84so9052416d6.1 for ; Tue, 04 Mar 2025 08:57:27 -0800 (PST) 
From: Connor Abbott
Date: Tue, 04 Mar 2025 11:56:51 -0500
Subject: [PATCH v4 5/5] drm/msm: Temporarily disable stall-on-fault after a page fault
Message-Id: <20250304-msm-gpu-fault-fixes-next-v4-5-be14be37f4c3@gmail.com>
References: <20250304-msm-gpu-fault-fixes-next-v4-0-be14be37f4c3@gmail.com>
In-Reply-To: <20250304-msm-gpu-fault-fixes-next-v4-0-be14be37f4c3@gmail.com>
To: Rob Clark, Will Deacon, Robin Murphy, Joerg Roedel, Sean Paul, Konrad Dybcio, Abhinav Kumar, Dmitry Baryshkov, Marijn Suijten
Cc: iommu@lists.linux.dev, linux-arm-msm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, freedreno@lists.freedesktop.org, Connor Abbott

When things go wrong, the GPU is capable of quickly generating millions of faulting translation requests per second. When that happens, in the stall-on-fault model each access will stall until it wins the race to signal the fault and then the RESUME register is written. This slows processing of page faults to a crawl, as the GPU can generate faults much faster than the CPU can acknowledge them. It also means that all available resources in the SMMU are saturated waiting for the stalled transactions, so other transactions, such as those generated by the GMU (which shares a context bank with the GPU), cannot proceed. This causes a GMU watchdog timeout, which leads to a failed reset (GX cannot collapse while a transaction is pending) and a permanently hung GPU.

On older platforms with qcom,smmu-v2, it seems that when one transaction is stalled subsequent faulting transactions are terminated, which avoids this problem, but the MMU-500 follows the spec here.

To work around these problems, disable stall-on-fault as soon as we get a page fault and keep it disabled until a cooldown period after page faults stop. This allows the GMU some guaranteed time to continue working. We only use stall-on-fault to halt the GPU while we collect a devcoredump, and we always terminate the transaction afterward, so it's fine to miss some subsequent page faults. We also keep it disabled so long as the current devcoredump hasn't been deleted, because in that case we likely won't capture another one if there's a fault.

After this commit HFI messages still occasionally time out, because the crashdump handler doesn't run fast enough to let the GMU resume, but the driver seems to recover from it. This will probably go away after the HFI timeout is increased.
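In short, the cooldown is a flag plus a re-enable deadline guarded by a spinlock; a condensed sketch of the logic this patch adds, where the field names, helpers, and the 500 ms window all follow the diff below:

	/* Fault path (adreno_fault_handler): stop stalling, push the deadline out */
	spin_lock_irqsave(&adreno_gpu->fault_stall_lock, irq_flags);
	if (adreno_gpu->stall_enabled) {
		adreno_gpu->stall_enabled = false;
		gpu->aspace->mmu->funcs->set_stall(gpu->aspace->mmu, false);
	}
	adreno_gpu->stall_reenable_time = ktime_add_ms(ktime_get(), 500);
	spin_unlock_irqrestore(&adreno_gpu->fault_stall_lock, irq_flags);

	/* Submit path (adreno_check_and_reenable_stall): re-enable once the
	 * cooldown has expired and no devcoredump is pending */
	spin_lock_irqsave(&adreno_gpu->fault_stall_lock, flags);
	if (!adreno_gpu->stall_enabled &&
	    ktime_after(ktime_get(), adreno_gpu->stall_reenable_time) &&
	    !READ_ONCE(gpu->crashstate)) {
		adreno_gpu->stall_enabled = true;
		gpu->aspace->mmu->funcs->set_stall(gpu->aspace->mmu, true);
	}
	spin_unlock_irqrestore(&adreno_gpu->fault_stall_lock, flags);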
Signed-off-by: Connor Abbott --- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 2 ++ drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 4 ++++ drivers/gpu/drm/msm/adreno/adreno_gpu.c | 42 ++++++++++++++++++++++++++++++++- drivers/gpu/drm/msm/adreno/adreno_gpu.h | 24 +++++++++++++++++++ drivers/gpu/drm/msm/msm_iommu.c | 9 +++++++ drivers/gpu/drm/msm/msm_mmu.h | 1 + 6 files changed, 81 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index 71dca78cd7a5324e9ff5b14f173e2209fa42e196..670141531112c9d29cef8ef1fd51b74759fdd6d2 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -131,6 +131,8 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) struct msm_ringbuffer *ring = submit->ring; unsigned int i, ibs = 0; + adreno_check_and_reenable_stall(adreno_gpu); + if (IS_ENABLED(CONFIG_DRM_MSM_GPU_SUDO) && submit->in_rb) { ring->cur_ctx_seqno = 0; a5xx_submit_in_rb(gpu, submit); diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 0ae29a7c8a4d3f74236a35cc919f69d5c0a384a0..5a34cd2109a2d74c92841448a61ccb0d4f34e264 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -212,6 +212,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) struct msm_ringbuffer *ring = submit->ring; unsigned int i, ibs = 0; + adreno_check_and_reenable_stall(adreno_gpu); + a6xx_set_pagetable(a6xx_gpu, ring, submit); get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP(0), @@ -335,6 +337,8 @@ static void a7xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit) struct msm_ringbuffer *ring = submit->ring; unsigned int i, ibs = 0; + adreno_check_and_reenable_stall(adreno_gpu); + /* * Toggle concurrent binning for pagetable switch and set the thread to * BR since only it can execute the pagetable switch packets. diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index 1238f326597808eb28b4c6822cbd41a26e555eb9..bac586101dc0494f46b069a8440a45825dfe9b5e 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -246,16 +246,53 @@ u64 adreno_private_address_space_size(struct msm_gpu *gpu) return SZ_4G; } +void adreno_check_and_reenable_stall(struct adreno_gpu *adreno_gpu) +{ + struct msm_gpu *gpu = &adreno_gpu->base; + unsigned long flags; + + /* + * Wait until the cooldown period has passed and we would actually + * collect a crashdump to re-enable stall-on-fault. 
+ */ + spin_lock_irqsave(&adreno_gpu->fault_stall_lock, flags); + if (!adreno_gpu->stall_enabled && + ktime_after(ktime_get(), adreno_gpu->stall_reenable_time) && + !READ_ONCE(gpu->crashstate)) { + adreno_gpu->stall_enabled = true; + + gpu->aspace->mmu->funcs->set_stall(gpu->aspace->mmu, true); + } + spin_unlock_irqrestore(&adreno_gpu->fault_stall_lock, flags); +} + #define ARM_SMMU_FSR_TF BIT(1) #define ARM_SMMU_FSR_PF BIT(3) #define ARM_SMMU_FSR_EF BIT(4) +#define ARM_SMMU_FSR_SS BIT(30) int adreno_fault_handler(struct msm_gpu *gpu, unsigned long iova, int flags, struct adreno_smmu_fault_info *info, const char *block, u32 scratch[4]) { + struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu); const char *type = "UNKNOWN"; - bool do_devcoredump = info && !READ_ONCE(gpu->crashstate); + bool do_devcoredump = info && (info->fsr & ARM_SMMU_FSR_SS) && + !READ_ONCE(gpu->crashstate); + unsigned long irq_flags; + + /* + * In case there is a subsequent storm of pagefaults, disable + * stall-on-fault for at least half a second. + */ + spin_lock_irqsave(&adreno_gpu->fault_stall_lock, irq_flags); + if (adreno_gpu->stall_enabled) { + adreno_gpu->stall_enabled = false; + + gpu->aspace->mmu->funcs->set_stall(gpu->aspace->mmu, false); + } + adreno_gpu->stall_reenable_time = ktime_add_ms(ktime_get(), 500); + spin_unlock_irqrestore(&adreno_gpu->fault_stall_lock, irq_flags); /* * If we aren't going to be resuming later from fault_worker, then do @@ -1143,6 +1180,9 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev, adreno_gpu->info->inactive_period); pm_runtime_use_autosuspend(dev); + spin_lock_init(&adreno_gpu->fault_stall_lock); + adreno_gpu->stall_enabled = true; + return msm_gpu_init(drm, pdev, &adreno_gpu->base, &funcs->base, gpu_name, &adreno_gpu_config); } diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h index dcf454629ce037b2a8274a6699674ad754ce1f07..a528036b46216bd898f6d48c5fb0555c4c4b053b 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h @@ -205,6 +205,28 @@ struct adreno_gpu { /* firmware: */ const struct firmware *fw[ADRENO_FW_MAX]; + /** + * fault_stall_lock: + * + * Serialize changes to stall-on-fault state. + */ + spinlock_t fault_stall_lock; + + /** + * fault_stall_reenable_time: + * + * if stall_enabled is false, when to reenable stall-on-fault. + */ + ktime_t stall_reenable_time; + + /** + * stall_enabled: + * + * Whether stall-on-fault is currently enabled. 
+ */ + bool stall_enabled; + + struct { /** * @rgb565_predicator: Unknown, introduced with A650 family, @@ -629,6 +651,8 @@ int adreno_fault_handler(struct msm_gpu *gpu, unsigned long iova, int flags, struct adreno_smmu_fault_info *info, const char *block, u32 scratch[4]); +void adreno_check_and_reenable_stall(struct adreno_gpu *gpu); + int adreno_read_speedbin(struct device *dev, u32 *speedbin); /* diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c index 2a94e82316f95c5f9dcc37ef0a4664a29e3492b2..8d5380e6dcc217c7c209b51527bf15748b3ada71 100644 --- a/drivers/gpu/drm/msm/msm_iommu.c +++ b/drivers/gpu/drm/msm/msm_iommu.c @@ -351,6 +351,14 @@ static void msm_iommu_resume_translation(struct msm_mmu *mmu) adreno_smmu->resume_translation(adreno_smmu->cookie, true); } +static void msm_iommu_set_stall(struct msm_mmu *mmu, bool enable) +{ + struct adreno_smmu_priv *adreno_smmu = dev_get_drvdata(mmu->dev); + + if (adreno_smmu->set_stall) + adreno_smmu->set_stall(adreno_smmu->cookie, enable); +} + static void msm_iommu_detach(struct msm_mmu *mmu) { struct msm_iommu *iommu = to_msm_iommu(mmu); @@ -399,6 +407,7 @@ static const struct msm_mmu_funcs funcs = { .unmap = msm_iommu_unmap, .destroy = msm_iommu_destroy, .resume_translation = msm_iommu_resume_translation, + .set_stall = msm_iommu_set_stall, }; struct msm_mmu *msm_iommu_new(struct device *dev, unsigned long quirks) diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h index 88af4f490881f2a6789ae2d03e1c02d10046331a..2694a356a17904e7572b767b16ed0cee806406cf 100644 --- a/drivers/gpu/drm/msm/msm_mmu.h +++ b/drivers/gpu/drm/msm/msm_mmu.h @@ -16,6 +16,7 @@ struct msm_mmu_funcs { int (*unmap)(struct msm_mmu *mmu, uint64_t iova, size_t len); void (*destroy)(struct msm_mmu *mmu); void (*resume_translation)(struct msm_mmu *mmu); + void (*set_stall)(struct msm_mmu *mmu, bool enable); }; enum msm_mmu_type {
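For completeness, a caller-side sketch of the new set_stall hook added above, mirroring how the adreno code drives it through the msm_mmu function table; the wrapper function name here is illustrative only:

static void msm_gpu_set_stall(struct msm_gpu *gpu, bool enable)
{
	struct msm_mmu *mmu = gpu->aspace->mmu;

	/* msm_iommu_set_stall() forwards this to the SMMU's set_stall op */
	mmu->funcs->set_stall(mmu, enable);
}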