Message ID | 20240814-smmu-v1-1-3d6c27027d5b@freebox.fr |
---|---|
State | New |
Headers | show |
Series | Work around reserved SMMU context bank on msm8998 | expand |
On Wed, Aug 14, 2024 at 03:59:55PM +0200, Marc Gonzalez wrote: > On qcom msm8998, writing to the last context bank of lpass_q6_smmu > (base address 0x05100000) produces a system freeze & reboot. > > Specifically, here: > > qsmmu->bypass_cbndx = smmu->num_context_banks - 1; > arm_smmu_cb_write(smmu, qsmmu->bypass_cbndx, ARM_SMMU_CB_SCTLR, 0); > > and here: > > arm_smmu_write_context_bank(smmu, i); > arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_FSR, ARM_SMMU_CB_FSR_FAULT); > > It is likely that FW reserves the last context bank for its own use, > thus a simple work-around would be: DON'T USE IT in Linux. > > Signed-off-by: Marc Gonzalez <mgonzalez@freebox.fr> > --- > Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml > index 280b4e49f2191..f9b23aef351b0 100644 > --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml > +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml > @@ -204,6 +204,12 @@ properties: > access to SMMU configuration registers. In this case non-secure aliases of > secure registers have to be used during SMMU configuration. > > + qcom,last-ctx-bank-reserved: > + type: boolean > + description: > + FW reserves the last context bank of this SMMU for its own use. > + If Linux tries to use it, Linux gets nuked. How is this Qualcomm specific? Presumably any implementation could do this if there's no way to properly partition things. Robin? Also, this property isn't very flexible. What happens when it is not the last bank or more than 1 bank reserved? This should probably be a mask instead. Rob
On 18/08/2024 17:25, Rob Herring wrote: > On Wed, Aug 14, 2024 at 03:59:55PM +0200, Marc Gonzalez wrote: > >> On qcom msm8998, writing to the last context bank of lpass_q6_smmu >> (base address 0x05100000) produces a system freeze & reboot. >> >> Specifically, here: >> >> qsmmu->bypass_cbndx = smmu->num_context_banks - 1; >> arm_smmu_cb_write(smmu, qsmmu->bypass_cbndx, ARM_SMMU_CB_SCTLR, 0); >> >> and here: >> >> arm_smmu_write_context_bank(smmu, i); >> arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_FSR, ARM_SMMU_CB_FSR_FAULT); >> >> It is likely that FW reserves the last context bank for its own use, >> thus a simple work-around would be: DON'T USE IT in Linux. >> >> Signed-off-by: Marc Gonzalez <mgonzalez@freebox.fr> >> --- >> Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 6 ++++++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml >> index 280b4e49f2191..f9b23aef351b0 100644 >> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml >> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml >> @@ -204,6 +204,12 @@ properties: >> access to SMMU configuration registers. In this case non-secure aliases of >> secure registers have to be used during SMMU configuration. >> >> + qcom,last-ctx-bank-reserved: >> + type: boolean >> + description: >> + FW reserves the last context bank of this SMMU for its own use. >> + If Linux tries to use it, Linux gets nuked. > > How is this Qualcomm specific? Presumably any implementation could do > this if there's no way to properly partition things. Robin? Obviously, there is nothing Qualcomm specific about reserving an SMMU context bank for the FW / hypervisor, other than it appears that qcom is the first to do it; or at least the LPASS SMMU on qcom msm8998 is the first known SMMU where such a work-around is required. What is the correct nomenclature? Can we just drop the vendor prefix if a property is generic across vendors? But does it require a subsystem prefix like "iommu" in order to not clash with generic props in other subsystems? > Also, this property isn't very flexible. What happens when it is not the > last bank or more than 1 bank reserved? This should probably be a mask > instead. OK, I'm getting conflicting requests here. Bjorn has recommended dropping the property altogether: > It also seems, as the different SMMUs in this platform behave > differently it might be worth giving them further specific compatibles, > in which case we could just check if it's the qcom,msm8998-lpass-smmu, > instead of inventing a property for this quirk. I'll send a patch series in line with Bjorn's request. Regards
On 19/08/2024 12:37 pm, Marc Gonzalez wrote: > On 18/08/2024 17:25, Rob Herring wrote: > >> On Wed, Aug 14, 2024 at 03:59:55PM +0200, Marc Gonzalez wrote: >> >>> On qcom msm8998, writing to the last context bank of lpass_q6_smmu >>> (base address 0x05100000) produces a system freeze & reboot. >>> >>> Specifically, here: >>> >>> qsmmu->bypass_cbndx = smmu->num_context_banks - 1; >>> arm_smmu_cb_write(smmu, qsmmu->bypass_cbndx, ARM_SMMU_CB_SCTLR, 0); >>> >>> and here: >>> >>> arm_smmu_write_context_bank(smmu, i); >>> arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_FSR, ARM_SMMU_CB_FSR_FAULT); >>> >>> It is likely that FW reserves the last context bank for its own use, >>> thus a simple work-around would be: DON'T USE IT in Linux. >>> >>> Signed-off-by: Marc Gonzalez <mgonzalez@freebox.fr> >>> --- >>> Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 6 ++++++ >>> 1 file changed, 6 insertions(+) >>> >>> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml >>> index 280b4e49f2191..f9b23aef351b0 100644 >>> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml >>> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml >>> @@ -204,6 +204,12 @@ properties: >>> access to SMMU configuration registers. In this case non-secure aliases of >>> secure registers have to be used during SMMU configuration. >>> >>> + qcom,last-ctx-bank-reserved: >>> + type: boolean >>> + description: >>> + FW reserves the last context bank of this SMMU for its own use. >>> + If Linux tries to use it, Linux gets nuked. >> >> How is this Qualcomm specific? Presumably any implementation could do >> this if there's no way to properly partition things. Robin? > > Obviously, there is nothing Qualcomm specific about reserving > an SMMU context bank for the FW / hypervisor, other than it > appears that qcom is the first to do it; or at least the > LPASS SMMU on qcom msm8998 is the first known SMMU where such > a work-around is required. Yes, the Qualcomm-specific aspect is that it's Qualcomm's hypervisor which is broken and reporting a larger number in its emulated SMMU_IDR1.NUMCB than the number of context banks it's actually willing to emulate. > What is the correct nomenclature? > > Can we just drop the vendor prefix if a property is generic > across vendors? But does it require a subsystem prefix like > "iommu" in order to not clash with generic props in other subsystems? I guess if we *were* to consider a generic property to endorse violating the SMMU architecture, then it would logically be vendored to Arm as the owner of the SMMU architecture. However I am strongly against that idea, not only because I obviously don't want to normalise hypervisors emulating non-architectural behaviour which every DT-consuming OS will have to understand how to work around, but it's also less than great for the user to have a workaround that's not compatible with existing DTBs. Luckily, in this case it seems straightforward enough to be able to see that if we have a "qcom,msm8996-smmu-v2" with 13 context banks then we should just treat it as if it has 12 - it's also notable that it only reports NUMSMRG=12, so we couldn't use more than that many S1 context banks at once anyway. Thanks, Robin. >> Also, this property isn't very flexible. What happens when it is not the >> last bank or more than 1 bank reserved? This should probably be a mask >> instead. > > OK, I'm getting conflicting requests here. > > Bjorn has recommended dropping the property altogether: > >> It also seems, as the different SMMUs in this platform behave >> differently it might be worth giving them further specific compatibles, >> in which case we could just check if it's the qcom,msm8998-lpass-smmu, >> instead of inventing a property for this quirk. > > > I'll send a patch series in line with Bjorn's request. > > Regards >
On 19/08/2024 14:57, Robin Murphy wrote: > Luckily, in this case it seems straightforward enough to be able to see > that if we have a "qcom,msm8996-smmu-v2" with 13 context banks then we > should just treat it as if it has 12 - it's also notable that it only > reports NUMSMRG=12, so we couldn't use more than that many S1 context > banks at once anyway. This is what the hypervisor reports: [ 2.550974] arm-smmu 5100000.iommu: probing hardware configuration... [ 2.557309] arm-smmu 5100000.iommu: SMMUv2 with: [ 2.563815] arm-smmu 5100000.iommu: stage 1 translation [ 2.568494] arm-smmu 5100000.iommu: address translation ops [ 2.573791] arm-smmu 5100000.iommu: non-coherent table walk [ 2.579434] arm-smmu 5100000.iommu: (IDR0.CTTW overridden by FW configuration) [ 2.585088] arm-smmu 5100000.iommu: stream matching with 12 register groups [ 2.592132] arm-smmu 5100000.iommu: 13 context banks (0 stage-2 only) [ 2.619316] arm-smmu 5100000.iommu: Supported page sizes: 0x63315000 [ 2.626225] arm-smmu 5100000.iommu: Stage-1: 36-bit VA -> 36-bit IPA [ 2.632645] arm-smmu 5100000.iommu: preserved 0 boot mappings smmu->num_mapping_groups = 12 smmu->num_context_banks = 13 Are you saying that smmu->num_context_banks > smmu->num_mapping_groups does not make sense? Would a well-placed if (smmu->num_context_banks > smmu->num_mapping_groups) smmu->num_context_banks = smmu->num_mapping_groups; be a proper work-around? (Probably in qcom_smmu_cfg_probe() so as to not interfere with other platforms.) Maybe to limit the side effects even more: if (of_device_is_compatible(smmu->dev->of_node, "qcom,msm8998-smmu-v2") && smmu->num_context_banks > smmu->num_mapping_groups)) smmu->num_context_banks = smmu->num_mapping_groups; Neither work-around would require changing the binding. Is either work-around acceptable, Robin? Regards
On Mon, Aug 19, 2024 at 05:02:16PM +0200, Marc Gonzalez wrote: > On 19/08/2024 14:57, Robin Murphy wrote: > > > Luckily, in this case it seems straightforward enough to be able to see > > that if we have a "qcom,msm8996-smmu-v2" with 13 context banks then we > > should just treat it as if it has 12 - it's also notable that it only > > reports NUMSMRG=12, so we couldn't use more than that many S1 context > > banks at once anyway. > > This is what the hypervisor reports: > > [ 2.550974] arm-smmu 5100000.iommu: probing hardware configuration... > [ 2.557309] arm-smmu 5100000.iommu: SMMUv2 with: > [ 2.563815] arm-smmu 5100000.iommu: stage 1 translation > [ 2.568494] arm-smmu 5100000.iommu: address translation ops > [ 2.573791] arm-smmu 5100000.iommu: non-coherent table walk > [ 2.579434] arm-smmu 5100000.iommu: (IDR0.CTTW overridden by FW configuration) > [ 2.585088] arm-smmu 5100000.iommu: stream matching with 12 register groups > [ 2.592132] arm-smmu 5100000.iommu: 13 context banks (0 stage-2 only) > [ 2.619316] arm-smmu 5100000.iommu: Supported page sizes: 0x63315000 > [ 2.626225] arm-smmu 5100000.iommu: Stage-1: 36-bit VA -> 36-bit IPA > [ 2.632645] arm-smmu 5100000.iommu: preserved 0 boot mappings > > > smmu->num_mapping_groups = 12 Ignore num_mapping_groups, they are used to define which streams should be mapped to which context bank. But there's no relationship between these numbers. > smmu->num_context_banks = 13 > > > Are you saying that > > smmu->num_context_banks > smmu->num_mapping_groups > > does not make sense? > > > Would a well-placed > > if (smmu->num_context_banks > smmu->num_mapping_groups) > smmu->num_context_banks = smmu->num_mapping_groups; > > be a proper work-around? No, something like this would apply your quirk to other targets (and specifically it would be wrong, per above). > > (Probably in qcom_smmu_cfg_probe() so as to not interfere with other platforms.) > > > Maybe to limit the side effects even more: > > if (of_device_is_compatible(smmu->dev->of_node, "qcom,msm8998-smmu-v2") && > smmu->num_context_banks > smmu->num_mapping_groups)) > smmu->num_context_banks = smmu->num_mapping_groups; If we don't want to introduce a more specific compatible for this SMMU instance, then let's add this to qcom_smmu_cfg_probe(): /* MSM8998 LPASS SMMU reports 13 context banks, but only 12 are accessible */ if (of_device_is_compatible(smmu->dev->of_node, "qcom,msm8998-smmu-v2") && smmu->num_context_banks == 13) smmu->num_context_banks = 12; Regards, Bjorn > > > Neither work-around would require changing the binding. > > Is either work-around acceptable, Robin? > > Regards > >
diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml index 280b4e49f2191..f9b23aef351b0 100644 --- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml @@ -204,6 +204,12 @@ properties: access to SMMU configuration registers. In this case non-secure aliases of secure registers have to be used during SMMU configuration. + qcom,last-ctx-bank-reserved: + type: boolean + description: + FW reserves the last context bank of this SMMU for its own use. + If Linux tries to use it, Linux gets nuked. + stream-match-mask: $ref: /schemas/types.yaml#/definitions/uint32 description: |
On qcom msm8998, writing to the last context bank of lpass_q6_smmu (base address 0x05100000) produces a system freeze & reboot. Specifically, here: qsmmu->bypass_cbndx = smmu->num_context_banks - 1; arm_smmu_cb_write(smmu, qsmmu->bypass_cbndx, ARM_SMMU_CB_SCTLR, 0); and here: arm_smmu_write_context_bank(smmu, i); arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_FSR, ARM_SMMU_CB_FSR_FAULT); It is likely that FW reserves the last context bank for its own use, thus a simple work-around would be: DON'T USE IT in Linux. Signed-off-by: Marc Gonzalez <mgonzalez@freebox.fr> --- Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 6 ++++++ 1 file changed, 6 insertions(+)