[V6,5/5] arm64: dts: qcom: x1e80100: Enable cpufreq

Message ID	20240612124056.39230-6-quic_sibis@quicinc.com
State	New
Headers	Received: from mx0b-0031df01.pphosted.com (mx0b-0031df01.pphosted.com [205.220.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1E407172762; Wed, 12 Jun 2024 12:41:52 +0000 (UTC) From: Sibi Sankar <quic_sibis@quicinc.com> To: <sudeep.holla@arm.com>, <cristian.marussi@arm.com>, <andersson@kernel.org>, <konrad.dybcio@linaro.org>, <jassisinghbrar@gmail.com>, <robh+dt@kernel.org>, <krzysztof.kozlowski+dt@linaro.org>, <dmitry.baryshkov@linaro.org> CC: <linux-kernel@vger.kernel.org>, <linux-arm-msm@vger.kernel.org>, <devicetree@vger.kernel.org>, <quic_rgottimu@quicinc.com>, <quic_kshivnan@quicinc.com>, <quic_sibis@quicinc.com>, <conor+dt@kernel.org>, <quic_nkela@quicinc.com>, <quic_psodagud@quicinc.com>, <abel.vesa@linaro.org> Subject: [PATCH V6 5/5] arm64: dts: qcom: x1e80100: Enable cpufreq Date: Wed, 12 Jun 2024 18:10:56 +0530 Message-ID: <20240612124056.39230-6-quic_sibis@quicinc.com> In-Reply-To: <20240612124056.39230-1-quic_sibis@quicinc.com> References: <20240612124056.39230-1-quic_sibis@quicinc.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain
Series	qcom: x1e80100: Enable CPUFreq
[V6,0/5] qcom: x1e80100: Enable CPUFreq
[V6,1/5] dt-bindings: mailbox: qcom: Add CPUCP mailbox controller bindings
[V6,2/5] mailbox: Add support for QTI CPUCP mailbox controller
[V6,3/5] arm64: dts: qcom: x1e80100: Resize GIC Redistributor register region
[V6,4/5] arm64: dts: qcom: x1e80100: Add cpucp mailbox and sram nodes
[V6,5/5] arm64: dts: qcom: x1e80100: Enable cpufreq

Sibi Sankar June 12, 2024, 12:40 p.m. UTC

Enable cpufreq on X1E80100 SoCs through the SCMI perf protocol node.

Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
---
 arch/arm64/boot/dts/qcom/x1e80100.dtsi | 63 ++++++++++++++++----------
 1 file changed, 39 insertions(+), 24 deletions(-)

Johan Hovold July 2, 2024, 3:55 p.m. UTC | #1

On Wed, Jun 12, 2024 at 06:10:56PM +0530, Sibi Sankar wrote:
> Enable cpufreq on X1E80100 SoCs through the SCMI perf protocol node.
> 
> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
> ---
>  arch/arm64/boot/dts/qcom/x1e80100.dtsi | 63 ++++++++++++++++----------
>  1 file changed, 39 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/arm64/boot/dts/qcom/x1e80100.dtsi b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
> index 7b619db07694..d134dc4c7425 100644
> --- a/arch/arm64/boot/dts/qcom/x1e80100.dtsi
> +++ b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
> @@ -69,8 +69,8 @@ CPU0: cpu@0 {
>  			reg = <0x0 0x0>;
>  			enable-method = "psci";
>  			next-level-cache = <&L2_0>;
> -			power-domains = <&CPU_PD0>;
> -			power-domain-names = "psci";
> +			power-domains = <&CPU_PD0>, <&scmi_dvfs 0>;
> +			power-domain-names = "psci", "perf";
>  			cpu-idle-states = <&CLUSTER_C4>;

> +		scmi {
> +			compatible = "arm,scmi";
> +			mboxes = <&cpucp_mbox 0>, <&cpucp_mbox 2>;
> +			mbox-names = "tx", "rx";
> +			shmem = <&cpu_scp_lpri0>, <&cpu_scp_lpri1>;
> +
> +			#address-cells = <1>;
> +			#size-cells = <0>;
> +
> +			scmi_dvfs: protocol@13 {
> +				reg = <0x13>;
> +				#power-domain-cells = <1>;
> +			};
> +		};
>  	};

This series gives a nice performance boost on the x1e80100 CRD, but I'm
seeing a bunch of warnings and errors that need to be addressed:

[    9.533053] arm-scmi firmware:scmi: Failed to get FC for protocol 13 [MSG_ID:6 / RES_ID:0] - ret:-95. Using regular messaging.
[    9.549458] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
[    9.563925] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
[    9.572835] arm-scmi firmware:scmi: Failed to get FC for protocol 13 [MSG_ID:6 / RES_ID:1] - ret:-95. Using regular messaging.
[    9.609471] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
[    9.633341] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
[    9.650000] arm-scmi firmware:scmi: Failed to get FC for protocol 13 [MSG_ID:6 / RES_ID:2] - ret:-95. Using regular messaging.
[    9.727098] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
[    9.737157] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
[    9.875039] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
[    9.888428] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
[    9.913506] debugfs: Directory 'NCC' with parent 'pm_genpd' already present!
[    9.922198] debugfs: Directory 'NCC' with parent 'pm_genpd' already present!

Johan

Sibi Sankar July 2, 2024, 7:59 p.m. UTC | #2

On 7/2/24 21:25, Johan Hovold wrote:
> On Wed, Jun 12, 2024 at 06:10:56PM +0530, Sibi Sankar wrote:
>> Enable cpufreq on X1E80100 SoCs through the SCMI perf protocol node.
>>
>> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
>> ---
>>   arch/arm64/boot/dts/qcom/x1e80100.dtsi | 63 ++++++++++++++++----------
>>   1 file changed, 39 insertions(+), 24 deletions(-)
>>
>> diff --git a/arch/arm64/boot/dts/qcom/x1e80100.dtsi b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>> index 7b619db07694..d134dc4c7425 100644
>> --- a/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>> +++ b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>> @@ -69,8 +69,8 @@ CPU0: cpu@0 {
>>   			reg = <0x0 0x0>;
>>   			enable-method = "psci";
>>   			next-level-cache = <&L2_0>;
>> -			power-domains = <&CPU_PD0>;
>> -			power-domain-names = "psci";
>> +			power-domains = <&CPU_PD0>, <&scmi_dvfs 0>;
>> +			power-domain-names = "psci", "perf";
>>   			cpu-idle-states = <&CLUSTER_C4>;
> 
>> +		scmi {
>> +			compatible = "arm,scmi";
>> +			mboxes = <&cpucp_mbox 0>, <&cpucp_mbox 2>;
>> +			mbox-names = "tx", "rx";
>> +			shmem = <&cpu_scp_lpri0>, <&cpu_scp_lpri1>;
>> +
>> +			#address-cells = <1>;
>> +			#size-cells = <0>;
>> +
>> +			scmi_dvfs: protocol@13 {
>> +				reg = <0x13>;
>> +				#power-domain-cells = <1>;
>> +			};
>> +		};
>>   	};
> 

Hey Johan,

Thanks for trying out the series.

> This series gives a nice performance boost on the x1e80100 CRD, but I'm
> seeing a bunch of warnings and errors that need to be addressed:
> 
> [    9.533053] arm-scmi firmware:scmi: Failed to get FC for protocol 13 [MSG_ID:6 / RES_ID:0] - ret:-95. Using regular messaging.
> [    9.549458] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
> [    9.563925] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
> [    9.572835] arm-scmi firmware:scmi: Failed to get FC for protocol 13 [MSG_ID:6 / RES_ID:1] - ret:-95. Using regular messaging.
> [    9.609471] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
> [    9.633341] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
> [    9.650000] arm-scmi firmware:scmi: Failed to get FC for protocol 13 [MSG_ID:6 / RES_ID:2] - ret:-95. Using regular messaging.

X1E uses fast channels only for message-id: 7 (level set) and regular
channels for all the other messages. The spec doesn't mandate fast
channels for any of the supported message ids for the perf protocol.
So nothing to fix here.
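
For anyone decoding the "Failed to get FC" lines quoted above, here is a minimal Python sketch (not part of the patch) that extracts the message ID, resource ID and return code. The errno names assume the generic Linux numbering (95 = EOPNOTSUPP, 16 = EBUSY). Only "7 = level set" comes from this thread; the other message-ID names are assumed from the SCMI performance protocol, and the function and table names are made up for illustration.

```python
import re

# Generic Linux errno values (assumption: asm-generic numbering).
ERRNO_NAMES = {16: "EBUSY", 95: "EOPNOTSUPP"}

# SCMI performance protocol (0x13) message IDs. "7 = level set" is stated in
# the thread; the remaining names are assumed from the SCMI specification.
PERF_MSG_NAMES = {5: "LIMITS_SET", 6: "LIMITS_GET", 7: "LEVEL_SET", 8: "LEVEL_GET"}

# Matches the warning text as it appears in the quoted dmesg output.
FC_RE = re.compile(
    r"Failed to get FC for protocol (?P<proto>[0-9a-fA-F]+) "
    r"\[MSG_ID:(?P<msg>\d+) / RES_ID:(?P<res>\d+)\] - ret:(?P<ret>-?\d+)"
)


def decode_fc_warning(line):
    """Return a dict describing one fast-channel warning, or None if no match."""
    m = FC_RE.search(line)
    if m is None:
        return None
    msg = int(m.group("msg"))
    ret = int(m.group("ret"))
    return {
        "protocol": int(m.group("proto"), 16),  # the log prints the ID in hex
        "message": PERF_MSG_NAMES.get(msg, f"msg {msg}"),
        "resource": int(m.group("res")),
        "errno": ERRNO_NAMES.get(-ret, str(ret)),
    }
```

Under those assumptions, every warning in the log above decodes to LIMITS_GET returning EOPNOTSUPP, which fits the statement that only message ID 7 has a fast channel on X1E.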

> [    9.727098] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
> [    9.737157] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
> [    9.875039] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
> [    9.888428] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1

The duplicate entries reported by the perf protocol come directly from
the speed bins. I was told the duplicate entry with volt 0 is meant to
indicate a lower power way of achieving the said frequency at a lower
core count. We have no way of using it in the kernel and it gets safely
discarded. So again nothing to fix in the kernel.

> [    9.913506] debugfs: Directory 'NCC' with parent 'pm_genpd' already present!
> [    9.922198] debugfs: Directory 'NCC' with parent 'pm_genpd' already present!

Yeah I did notice ^^ during dev, the series isn't the one introducing it
so it shouldn't block the series acceptance. Meanwhile I'll spend some
cycles to get this warn fixed.

-Sibi

> 
> Johan
>

Nikunj Kela July 2, 2024, 8:13 p.m. UTC | #3

On 7/2/2024 12:59 PM, Sibi Sankar wrote:
>
>
> On 7/2/24 21:25, Johan Hovold wrote:
>> On Wed, Jun 12, 2024 at 06:10:56PM +0530, Sibi Sankar wrote:
>>> Enable cpufreq on X1E80100 SoCs through the SCMI perf protocol node.
>>>
>>> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
>>> ---
>>>   arch/arm64/boot/dts/qcom/x1e80100.dtsi | 63
>>> ++++++++++++++++----------
>>>   1 file changed, 39 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>> b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>> index 7b619db07694..d134dc4c7425 100644
>>> --- a/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>> +++ b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>> @@ -69,8 +69,8 @@ CPU0: cpu@0 {
>>>               reg = <0x0 0x0>;
>>>               enable-method = "psci";
>>>               next-level-cache = <&L2_0>;
>>> -            power-domains = <&CPU_PD0>;
>>> -            power-domain-names = "psci";
>>> +            power-domains = <&CPU_PD0>, <&scmi_dvfs 0>;
>>> +            power-domain-names = "psci", "perf";
>>>               cpu-idle-states = <&CLUSTER_C4>;
>>
>>> +        scmi {
>>> +            compatible = "arm,scmi";
>>> +            mboxes = <&cpucp_mbox 0>, <&cpucp_mbox 2>;
>>> +            mbox-names = "tx", "rx";
>>> +            shmem = <&cpu_scp_lpri0>, <&cpu_scp_lpri1>;
>>> +
>>> +            #address-cells = <1>;
>>> +            #size-cells = <0>;
>>> +
>>> +            scmi_dvfs: protocol@13 {
>>> +                reg = <0x13>;
>>> +                #power-domain-cells = <1>;
>>> +            };
>>> +        };
>>>       };
>>
>
> Hey Johan,
>
> Thanks for trying out the series.
>
>> This series gives a nice performance boost on the x1e80100 CRD, but I'm
>> seeing a bunch of warnings and errors that need to be addressed:
>>
>> [    9.533053] arm-scmi firmware:scmi: Failed to get FC for protocol
>> 13 [MSG_ID:6 / RES_ID:0] - ret:-95. Using regular messaging.
>> [    9.549458] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>> 3417600 for NCC - ret:-16
>> [    9.563925] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>> 3417600 for NCC - ret:-16
>> [    9.572835] arm-scmi firmware:scmi: Failed to get FC for protocol
>> 13 [MSG_ID:6 / RES_ID:1] - ret:-95. Using regular messaging.
>> [    9.609471] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>> 3417600 for NCC - ret:-16
>> [    9.633341] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>> 3417600 for NCC - ret:-16
>> [    9.650000] arm-scmi firmware:scmi: Failed to get FC for protocol
>> 13 [MSG_ID:6 / RES_ID:2] - ret:-95. Using regular messaging.
>
> X1E uses fast channels only for message-id: 7 (level set) and regular
> channels for all the other messages. The spec doesn't mandate fast
> channels for any of the supported message ids for the perf protocol.
> So nothing to fix here.
>
>> [    9.727098] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected.
>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>> 3417600000, volt: 0, enabled: 1
>> [    9.737157] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected.
>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>> 3417600000, volt: 0, enabled: 1
>> [    9.875039] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected.
>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>> 3417600000, volt: 0, enabled: 1
>> [    9.888428] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected.
>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>> 3417600000, volt: 0, enabled: 1
>
> The duplicate entries reported by the perf protocol come directly from
> the speed bins. I was told the duplicate entry with volt 0 is meant to
> indicate a lower power way of achieving the said frequency at a lower
> core count. We have no way of using it in the kernel and it gets safely
> discarded. So again nothing to fix in the kernel.

Hi Sibi,

Can you try increasing the max_msg_size to 256 bytes in the mailbox
transport? We saw the same issue, and it was resolved by increasing the
max_msg_size for the transport (I reduced max_msg to 10 to keep the
total shmem size the same). Even the opps_by_lvl warning went away
with this for us.
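
The shmem budget behind this suggestion can be sketched in a few lines of Python. This is illustrative arithmetic only: the 128-byte and 256-byte sizes and the reduced count of 10 come from this thread, while the starting count of 20 messages is an assumption inferred from "keep the total shmem size the same". The helper name is made up.

```python
def shmem_budget(max_msg, max_msg_size):
    """Total bytes needed if every in-flight message gets a full-size buffer."""
    return max_msg * max_msg_size


# Assumed starting point: 20 messages of 128 bytes each.
default_budget = shmem_budget(max_msg=20, max_msg_size=128)

# Suggested tweak from the thread: 10 messages of 256 bytes each.
tweaked_budget = shmem_budget(max_msg=10, max_msg_size=256)
```

With those numbers both configurations need the same total, so doubling the message size is paid for by halving the number of messages.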

Thanks,

-Nikunj

>
>> [    9.913506] debugfs: Directory 'NCC' with parent 'pm_genpd'
>> already present!
>> [    9.922198] debugfs: Directory 'NCC' with parent 'pm_genpd'
>> already present!
>
> Yeah I did notice ^^ during dev, the series isn't the one introducing it
> so it shouldn't block the series acceptance. Meanwhile I'll spend some
> cycles to get this warn fixed.
>
> -Sibi
>
>>
>> Johan
>>

Sibi Sankar July 3, 2024, 11:23 a.m. UTC | #4

On 7/3/24 01:43, Nikunj Kela wrote:
> 
> On 7/2/2024 12:59 PM, Sibi Sankar wrote:
>>
>>
>> On 7/2/24 21:25, Johan Hovold wrote:
>>> On Wed, Jun 12, 2024 at 06:10:56PM +0530, Sibi Sankar wrote:
>>>> Enable cpufreq on X1E80100 SoCs through the SCMI perf protocol node.
>>>>
>>>> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
>>>> ---
>>>>    arch/arm64/boot/dts/qcom/x1e80100.dtsi | 63
>>>> ++++++++++++++++----------
>>>>    1 file changed, 39 insertions(+), 24 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>>> b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>>> index 7b619db07694..d134dc4c7425 100644
>>>> --- a/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>>> +++ b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>>> @@ -69,8 +69,8 @@ CPU0: cpu@0 {
>>>>                reg = <0x0 0x0>;
>>>>                enable-method = "psci";
>>>>                next-level-cache = <&L2_0>;
>>>> -            power-domains = <&CPU_PD0>;
>>>> -            power-domain-names = "psci";
>>>> +            power-domains = <&CPU_PD0>, <&scmi_dvfs 0>;
>>>> +            power-domain-names = "psci", "perf";
>>>>                cpu-idle-states = <&CLUSTER_C4>;
>>>
>>>> +        scmi {
>>>> +            compatible = "arm,scmi";
>>>> +            mboxes = <&cpucp_mbox 0>, <&cpucp_mbox 2>;
>>>> +            mbox-names = "tx", "rx";
>>>> +            shmem = <&cpu_scp_lpri0>, <&cpu_scp_lpri1>;
>>>> +
>>>> +            #address-cells = <1>;
>>>> +            #size-cells = <0>;
>>>> +
>>>> +            scmi_dvfs: protocol@13 {
>>>> +                reg = <0x13>;
>>>> +                #power-domain-cells = <1>;
>>>> +            };
>>>> +        };
>>>>        };
>>>
>>
>> Hey Johan,
>>
>> Thanks for trying out the series.
>>
>>> This series gives a nice performance boost on the x1e80100 CRD, but I'm
>>> seeing a bunch of warnings and errors that need to be addressed:
>>>
>>> [    9.533053] arm-scmi firmware:scmi: Failed to get FC for protocol
>>> 13 [MSG_ID:6 / RES_ID:0] - ret:-95. Using regular messaging.
>>> [    9.549458] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>>> 3417600 for NCC - ret:-16
>>> [    9.563925] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>>> 3417600 for NCC - ret:-16
>>> [    9.572835] arm-scmi firmware:scmi: Failed to get FC for protocol
>>> 13 [MSG_ID:6 / RES_ID:1] - ret:-95. Using regular messaging.
>>> [    9.609471] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>>> 3417600 for NCC - ret:-16
>>> [    9.633341] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>>> 3417600 for NCC - ret:-16
>>> [    9.650000] arm-scmi firmware:scmi: Failed to get FC for protocol
>>> 13 [MSG_ID:6 / RES_ID:2] - ret:-95. Using regular messaging.
>>
>> X1E uses fast channels only for message-id: 7 (level set) and regular
>> channels for all the other messages. The spec doesn't mandate fast
>> channels for any of the supported message ids for the perf protocol.
>> So nothing to fix here.
>>
>>> [    9.727098] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected.
>>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>>> 3417600000, volt: 0, enabled: 1
>>> [    9.737157] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected.
>>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>>> 3417600000, volt: 0, enabled: 1
>>> [    9.875039] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected.
>>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>>> 3417600000, volt: 0, enabled: 1
>>> [    9.888428] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected.
>>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>>> 3417600000, volt: 0, enabled: 1
>>
>> The duplicate entries reported by the perf protocol come directly from
>> the speed bins. I was told the duplicate entry with volt 0 is meant to
>> indicate a lower power way of achieving the said frequency at a lower
>> core count. We have no way of using it in the kernel and it gets safely
>> discarded. So again nothing to fix in the kernel.
> 
> Hi Sibi,
> 
> Can you try increasing the max_msg_size to 256 bytes in mailbox
> transport? We saw the same issue but got resolved by increasing the
> max_msg_size for the transport(obviously, I reduced the max_msg to 10 to
> keep the total shmem size same). Even the opps_by_lvl warning went away
> with this for us.

Nikunj,
Thanks for taking time to review the series :)

Not sure we are talking about the same thing here. Are you
suggesting that tweaking max_msg_size will stop the SCMI
controller from reporting duplicate OPPs? Even if the warnings do go
away magically, wouldn't that mean you are dropping messages? Also,
opps_by_lvl failing with -16 and the duplicate OPPs detected in the OPP
core have the same root cause, i.e. duplicate entries reported by the
controller.

> 
> Thanks,
> 
> -Nikunj
> 
>>
>>> [    9.913506] debugfs: Directory 'NCC' with parent 'pm_genpd'
>>> already present!
>>> [    9.922198] debugfs: Directory 'NCC' with parent 'pm_genpd'
>>> already present!
>>
>> Yeah I did notice ^^ during dev, the series isn't the one introducing it
>> so it shouldn't block the series acceptance. Meanwhile I'll spend some
>> cycles to get this warn fixed.

Johan,

https://lore.kernel.org/lkml/20240703110741.2668800-1-quic_sibis@quicinc.com/

Posted a fix for the warn ^^

>>
>> -Sibi
>>
>>>
>>> Johan
>>>

Nikunj Kela July 3, 2024, 2:05 p.m. UTC | #5

On 7/3/2024 4:23 AM, Sibi Sankar wrote:
>
>
> On 7/3/24 01:43, Nikunj Kela wrote:
>>
>> On 7/2/2024 12:59 PM, Sibi Sankar wrote:
>>>
>>>
>>> On 7/2/24 21:25, Johan Hovold wrote:
>>>> On Wed, Jun 12, 2024 at 06:10:56PM +0530, Sibi Sankar wrote:
>>>>> Enable cpufreq on X1E80100 SoCs through the SCMI perf protocol node.
>>>>>
>>>>> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
>>>>> ---
>>>>>    arch/arm64/boot/dts/qcom/x1e80100.dtsi | 63
>>>>> ++++++++++++++++----------
>>>>>    1 file changed, 39 insertions(+), 24 deletions(-)
>>>>>
>>>>> diff --git a/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>>>> b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>>>> index 7b619db07694..d134dc4c7425 100644
>>>>> --- a/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>>>> +++ b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>>>> @@ -69,8 +69,8 @@ CPU0: cpu@0 {
>>>>>                reg = <0x0 0x0>;
>>>>>                enable-method = "psci";
>>>>>                next-level-cache = <&L2_0>;
>>>>> -            power-domains = <&CPU_PD0>;
>>>>> -            power-domain-names = "psci";
>>>>> +            power-domains = <&CPU_PD0>, <&scmi_dvfs 0>;
>>>>> +            power-domain-names = "psci", "perf";
>>>>>                cpu-idle-states = <&CLUSTER_C4>;
>>>>
>>>>> +        scmi {
>>>>> +            compatible = "arm,scmi";
>>>>> +            mboxes = <&cpucp_mbox 0>, <&cpucp_mbox 2>;
>>>>> +            mbox-names = "tx", "rx";
>>>>> +            shmem = <&cpu_scp_lpri0>, <&cpu_scp_lpri1>;
>>>>> +
>>>>> +            #address-cells = <1>;
>>>>> +            #size-cells = <0>;
>>>>> +
>>>>> +            scmi_dvfs: protocol@13 {
>>>>> +                reg = <0x13>;
>>>>> +                #power-domain-cells = <1>;
>>>>> +            };
>>>>> +        };
>>>>>        };
>>>>
>>>
>>> Hey Johan,
>>>
>>> Thanks for trying out the series.
>>>
>>>> This series gives a nice performance boost on the x1e80100 CRD, but
>>>> I'm
>>>> seeing a bunch of warnings and errors that need to be addressed:
>>>>
>>>> [    9.533053] arm-scmi firmware:scmi: Failed to get FC for protocol
>>>> 13 [MSG_ID:6 / RES_ID:0] - ret:-95. Using regular messaging.
>>>> [    9.549458] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>>>> 3417600 for NCC - ret:-16
>>>> [    9.563925] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>>>> 3417600 for NCC - ret:-16
>>>> [    9.572835] arm-scmi firmware:scmi: Failed to get FC for protocol
>>>> 13 [MSG_ID:6 / RES_ID:1] - ret:-95. Using regular messaging.
>>>> [    9.609471] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>>>> 3417600 for NCC - ret:-16
>>>> [    9.633341] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>>>> 3417600 for NCC - ret:-16
>>>> [    9.650000] arm-scmi firmware:scmi: Failed to get FC for protocol
>>>> 13 [MSG_ID:6 / RES_ID:2] - ret:-95. Using regular messaging.
>>>
>>> X1E uses fast channels only for message-id: 7 (level set) and regular
>>> channels for all the other messages. The spec doesn't mandate fast
>>> channels for any of the supported message ids for the perf protocol.
>>> So nothing to fix here.
>>>
>>>> [    9.727098] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected.
>>>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>>>> 3417600000, volt: 0, enabled: 1
>>>> [    9.737157] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected.
>>>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>>>> 3417600000, volt: 0, enabled: 1
>>>> [    9.875039] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected.
>>>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>>>> 3417600000, volt: 0, enabled: 1
>>>> [    9.888428] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected.
>>>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>>>> 3417600000, volt: 0, enabled: 1
>>>
>>> The duplicate entries reported by the perf protocol come directly from
>>> the speed bins. I was told the duplicate entry with volt 0 is meant to
>>> indicate a lower power way of achieving the said frequency at a lower
>>> core count. We have no way of using it in the kernel and it gets safely
>>> discarded. So again nothing to fix in the kernel.
>>
>> Hi Sibi,
>>
>> Can you try increasing the max_msg_size to 256 bytes in mailbox
>> transport? We saw the same issue but got resolved by increasing the
>> max_msg_size for the transport(obviously, I reduced the max_msg to 10 to
>> keep the total shmem size same). Even the opps_by_lvl warning went away
>> with this for us.
>
> Nikunj,
> Thanks for taking time to review the series :)
>
> Not sure if we are talking about the same things here, are you
> suggesting that tweaking with the max_msg size will stop the SCMI
> controller from reporting duplicate OPPs? Even if it does go away
> magically wouldn't it mean you are dropping messages? Also opps_by_lvl
> failing with -16 and duplicate opps detected in the opp core have the
> same root cause i.e. duplicate entries reported by the controller.


Sibi,

My observation was that only 12 OPPs could fit in a 128-byte msg_size and
our platform was sending 16 OPPs in one go. OPPs above 12 were getting
clobbered, so the duplicate warnings/errors were not genuine. You may need
to tweak the platform to send only 12 (or fewer) OPPs in one go.


>
>>
>> Thanks,
>>
>> -Nikunj
>>
>>>
>>>> [    9.913506] debugfs: Directory 'NCC' with parent 'pm_genpd'
>>>> already present!
>>>> [    9.922198] debugfs: Directory 'NCC' with parent 'pm_genpd'
>>>> already present!
>>>
>>> Yeah I did notice ^^ during dev, the series isn't the one
>>> introducing it
>>> so it shouldn't block the series acceptance. Meanwhile I'll spend some
>>> cycles to get this warn fixed.
>
> Johan,
>
> https://lore.kernel.org/lkml/20240703110741.2668800-1-quic_sibis@quicinc.com/
>
>
> Posted a fix for the warn ^^
>
>>>
>>> -Sibi
>>>
>>>>
>>>> Johan
>>>>

Sibi Sankar July 4, 2024, 10:22 a.m. UTC | #6

On 7/3/24 19:35, Nikunj Kela wrote:
> 
> On 7/3/2024 4:23 AM, Sibi Sankar wrote:
>>
>>
>> On 7/3/24 01:43, Nikunj Kela wrote:
>>>
>>> On 7/2/2024 12:59 PM, Sibi Sankar wrote:
>>>>
>>>>
>>>> On 7/2/24 21:25, Johan Hovold wrote:
>>>>> On Wed, Jun 12, 2024 at 06:10:56PM +0530, Sibi Sankar wrote:
>>>>>> Enable cpufreq on X1E80100 SoCs through the SCMI perf protocol node.
>>>>>>
>>>>>> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
>>>>>> ---
>>>>>>     arch/arm64/boot/dts/qcom/x1e80100.dtsi | 63
>>>>>> ++++++++++++++++----------
>>>>>>     1 file changed, 39 insertions(+), 24 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>>>>> b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>>>>> index 7b619db07694..d134dc4c7425 100644
>>>>>> --- a/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>>>>> +++ b/arch/arm64/boot/dts/qcom/x1e80100.dtsi
>>>>>> @@ -69,8 +69,8 @@ CPU0: cpu@0 {
>>>>>>                 reg = <0x0 0x0>;
>>>>>>                 enable-method = "psci";
>>>>>>                 next-level-cache = <&L2_0>;
>>>>>> -            power-domains = <&CPU_PD0>;
>>>>>> -            power-domain-names = "psci";
>>>>>> +            power-domains = <&CPU_PD0>, <&scmi_dvfs 0>;
>>>>>> +            power-domain-names = "psci", "perf";
>>>>>>                 cpu-idle-states = <&CLUSTER_C4>;
>>>>>
>>>>>> +        scmi {
>>>>>> +            compatible = "arm,scmi";
>>>>>> +            mboxes = <&cpucp_mbox 0>, <&cpucp_mbox 2>;
>>>>>> +            mbox-names = "tx", "rx";
>>>>>> +            shmem = <&cpu_scp_lpri0>, <&cpu_scp_lpri1>;
>>>>>> +
>>>>>> +            #address-cells = <1>;
>>>>>> +            #size-cells = <0>;
>>>>>> +
>>>>>> +            scmi_dvfs: protocol@13 {
>>>>>> +                reg = <0x13>;
>>>>>> +                #power-domain-cells = <1>;
>>>>>> +            };
>>>>>> +        };
>>>>>>         };
>>>>>
>>>>
>>>> Hey Johan,
>>>>
>>>> Thanks for trying out the series.
>>>>
>>>>> This series gives a nice performance boost on the x1e80100 CRD, but
>>>>> I'm
>>>>> seeing a bunch of warnings and errors that need to be addressed:
>>>>>
>>>>> [    9.533053] arm-scmi firmware:scmi: Failed to get FC for protocol
>>>>> 13 [MSG_ID:6 / RES_ID:0] - ret:-95. Using regular messaging.
>>>>> [    9.549458] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>>>>> 3417600 for NCC - ret:-16
>>>>> [    9.563925] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>>>>> 3417600 for NCC - ret:-16
>>>>> [    9.572835] arm-scmi firmware:scmi: Failed to get FC for protocol
>>>>> 13 [MSG_ID:6 / RES_ID:1] - ret:-95. Using regular messaging.
>>>>> [    9.609471] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>>>>> 3417600 for NCC - ret:-16
>>>>> [    9.633341] arm-scmi firmware:scmi: Failed to add opps_by_lvl at
>>>>> 3417600 for NCC - ret:-16
>>>>> [    9.650000] arm-scmi firmware:scmi: Failed to get FC for protocol
>>>>> 13 [MSG_ID:6 / RES_ID:2] - ret:-95. Using regular messaging.
>>>>
>>>> X1E uses fast channels only for message-id: 7 (level set) and regular
>>>> channels for all the other messages. The spec doesn't mandate fast
>>>> channels for any of the supported message ids for the perf protocol.
>>>> So nothing to fix here.
>>>>
>>>>> [    9.727098] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected.
>>>>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>>>>> 3417600000, volt: 0, enabled: 1
>>>>> [    9.737157] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected.
>>>>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>>>>> 3417600000, volt: 0, enabled: 1
>>>>> [    9.875039] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected.
>>>>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>>>>> 3417600000, volt: 0, enabled: 1
>>>>> [    9.888428] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected.
>>>>> Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq:
>>>>> 3417600000, volt: 0, enabled: 1
>>>>
>>>> The duplicate entries reported by the perf protocol come directly from
>>>> the speed bins. I was told the duplicate entry with volt 0 is meant to
>>>> indicate a lower power way of achieving the said frequency at a lower
>>>> core count. We have no way of using it in the kernel and it gets safely
>>>> discarded. So again nothing to fix in the kernel.
>>>
>>> Hi Sibi,
>>>
>>> Can you try increasing the max_msg_size to 256 bytes in mailbox
>>> transport? We saw the same issue but got resolved by increasing the
>>> max_msg_size for the transport(obviously, I reduced the max_msg to 10 to
>>> keep the total shmem size same). Even the opps_by_lvl warning went away
>>> with this for us.
>>
>> Nikunj,
>> Thanks for taking time to review the series :)
>>
>> Not sure if we are talking about the same things here, are you
>> suggesting that tweaking with the max_msg size will stop the SCMI
>> controller from reporting duplicate OPPs? Even if it does go away
>> magically wouldn't it mean you are dropping messages? Also opps_by_lvl
>> failing with -16 and duplicate opps detected in the opp core have the
>> same root cause i.e. duplicate entries reported by the controller.
> 
> 
> Sibi,
> 
> My observation was that only 12 OPPs could fit it 128bytes msg_size and
> our platform was sending 16 OPPs in one go. OPPs above 12 were getting
> clobbered so the duplicate warning/error were not genuine. You may need
> to tweak platform to send only 12(or less) OPPs in one go.

Nikunj,

The platform we are talking about in this thread is X1E, and the number
of performance levels returned per PERFORMANCE_DESCRIBE_LEVELS response
is just one. It relies on the skip_index and iterator ops to get
all the available levels. So the clobbering you are talking about
on whatever platform you are referring to does not apply here.
Please find the logs below for Domain 1. Hope this clears up
whatever misunderstanding you had about X1E.
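
The one-level-per-response iteration described here can be modelled with a small Python sketch. It is a simulation, not the kernel's iterator code: `describe_levels` is a hypothetical stand-in for a single PERFORMANCE_DESCRIBE_LEVELS exchange, and `per_call=1` mirrors the num_returned: 1 seen on X1E.

```python
def describe_levels(levels, skip_index, per_call=1):
    """Hypothetical stand-in for one PERFORMANCE_DESCRIBE_LEVELS exchange.

    Returns (levels in this response, number of levels still remaining).
    """
    chunk = levels[skip_index:skip_index + per_call]
    remaining = len(levels) - skip_index - len(chunk)
    return chunk, remaining


def fetch_all_levels(levels, per_call=1):
    """Keep advancing skip_index until the firmware reports nothing remaining."""
    collected = []
    skip_index = 0
    while True:
        chunk, remaining = describe_levels(levels, skip_index, per_call)
        collected.extend(chunk)
        skip_index += len(chunk)
        if remaining == 0 or not chunk:
            break
    return collected
```

With 16 levels and one level per response, the first call reports 15 remaining and the loop needs 16 exchanges, so no single response comes close to the message size limit.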

Logs for Domain 1:
arm-scmi: iter_perf_levels_update_state num_returned: 1 num_remaining: 15
arm-scmi firmware:scmi: Level 710400 Power 23243 Latency 30us Ifreq 710400 Index 0
...
[snip]
...
arm-scmi: iter_perf_levels_update_state num_returned: 1 num_remaining: 3
arm-scmi firmware:scmi: Level 3417600 Power 307141 Latency 30us Ifreq 3417600 Index 12
arm-scmi: iter_perf_levels_update_state num_returned: 1 num_remaining: 2
arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
arm-scmi firmware:scmi: Level 3417600 Power 307141 Latency 30us Ifreq 3417600 Index 13
arm-scmi: iter_perf_levels_update_state num_returned: 1 num_remaining: 1
arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
arm-scmi firmware:scmi: Level 3417600 Power 307141 Latency 30us Ifreq 3417600 Index 14
arm-scmi: iter_perf_levels_update_state num_returned: 1 num_remaining: 0
arm-scmi firmware:scmi: Level 4012800 Power 539962 Latency 30us Ifreq 4012800 Index 15
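
To cross-check the log above, a short Python sketch can count how often each level value is reported and compare that with the number of failed inserts. The sample text is the tail of the log with the wrapped lines joined; the regular expression and function names are made up for illustration and are not kernel code.

```python
import re
from collections import Counter

# Tail of the Domain 1 log from this thread, with wrapped lines joined.
LOG = """\
arm-scmi firmware:scmi: Level 3417600 Power 307141 Latency 30us Ifreq 3417600 Index 12
arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
arm-scmi firmware:scmi: Level 3417600 Power 307141 Latency 30us Ifreq 3417600 Index 13
arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
arm-scmi firmware:scmi: Level 3417600 Power 307141 Latency 30us Ifreq 3417600 Index 14
arm-scmi firmware:scmi: Level 4012800 Power 539962 Latency 30us Ifreq 4012800 Index 15
"""

LEVEL_RE = re.compile(r"Level (\d+) Power (\d+) Latency (\d+)us Ifreq (\d+) Index (\d+)")


def duplicate_levels(log):
    """Map each repeated level value to the number of extra copies reported."""
    counts = Counter(int(m.group(1)) for m in LEVEL_RE.finditer(log))
    return {level: n - 1 for level, n in counts.items() if n > 1}


def failed_inserts(log):
    """Count the opps_by_lvl insertion failures in the log."""
    return len(re.findall(r"Failed to add opps_by_lvl", log))
```

Level 3417600 is reported at indices 12, 13 and 14, which gives two extra copies and two -16 failures, in line with the point that the firmware itself reports the duplicate entries.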


-Sibi

> 
> 
>>
>>>
>>> Thanks,
>>>
>>> -Nikunj
>>>
>>>>
>>>>> [    9.913506] debugfs: Directory 'NCC' with parent 'pm_genpd'
>>>>> already present!
>>>>> [    9.922198] debugfs: Directory 'NCC' with parent 'pm_genpd'
>>>>> already present!
>>>>
>>>> Yeah I did notice ^^ during dev, the series isn't the one introducing it
>>>> so it shouldn't block the series acceptance. Meanwhile I'll spend some
>>>> cycles to get this warn fixed.
>>
>> Johan,
>>
>> https://lore.kernel.org/lkml/20240703110741.2668800-1-quic_sibis@quicinc.com/
>>
>>
>> Posted a fix for the warn ^^
>>
>>>>
>>>> -Sibi
>>>>
>>>>>
>>>>> Johan
>>>>>

Johan Hovold July 9, 2024, 9:13 a.m. UTC | #7

Hi Sibi,

On Wed, Jul 03, 2024 at 01:29:11AM +0530, Sibi Sankar wrote:
> On 7/2/24 21:25, Johan Hovold wrote:
> > On Wed, Jun 12, 2024 at 06:10:56PM +0530, Sibi Sankar wrote:
> >> Enable cpufreq on X1E80100 SoCs through the SCMI perf protocol node.

> > This series gives a nice performance boost on the x1e80100 CRD, but I'm
> > seeing a bunch of warnings and errors that need to be addressed:
> > 
> > [    9.533053] arm-scmi firmware:scmi: Failed to get FC for protocol 13 [MSG_ID:6 / RES_ID:0] - ret:-95. Using regular messaging.
> > [    9.549458] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
> > [    9.563925] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
> > [    9.572835] arm-scmi firmware:scmi: Failed to get FC for protocol 13 [MSG_ID:6 / RES_ID:1] - ret:-95. Using regular messaging.
> > [    9.609471] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
> > [    9.633341] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
> > [    9.650000] arm-scmi firmware:scmi: Failed to get FC for protocol 13 [MSG_ID:6 / RES_ID:2] - ret:-95. Using regular messaging.
> 
> X1E uses fast channels only for message-id: 7 (level set) and regular
> channels for all the other messages. The spec doesn't mandate fast
> channels for any of the supported message ids for the perf protocol.
> So nothing to fix here.

I didn't look at this in any detail, but if the firmware is spec
compliant you should not be spamming the logs with warnings. Not sure
how best to address that, but you could, for example, add a quirk for
qcom fw or at a minimum demote this mess to info level.
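
The two options mentioned here (keep falling back to regular messaging, but stop reporting it as a warning) can be sketched as a small model. It is only an illustration under stated assumptions: the function and parameter names are hypothetical, the message text is shortened from the log, and -95 is taken to be -EOPNOTSUPP. Message id 7 is level set, as stated in the quoted reply; id 6 is the one that fails in the log.

```python
EOPNOTSUPP = 95  # Linux errno value matching "ret:-95" in the log


def init_channels(msg_ids, fc_supported, quiet_unsupported=False):
    """Pick a channel per message id and collect the resulting log lines.

    msg_ids           -- message ids the driver would like fast channels for
    fc_supported      -- ids for which the firmware provides a fast channel
    quiet_unsupported -- hypothetical quirk: log the fallback at info level
    """
    channels = {}
    log = []
    for msg_id in msg_ids:
        if msg_id in fc_supported:
            channels[msg_id] = "fast"
            continue
        # No fast channel: fall back to regular messaging and report it.
        channels[msg_id] = "regular"
        severity = "info" if quiet_unsupported else "warn"
        text = (f"Failed to get FC for MSG_ID:{msg_id} - "
                f"ret:{-EOPNOTSUPP}. Using regular messaging.")
        log.append((severity, text))
    return channels, log


# Firmware that only offers a fast channel for message id 7.
channels, log = init_channels([6, 7], fc_supported={7})
quiet_channels, quiet_log = init_channels([6, 7], {7}, quiet_unsupported=True)
```

The channel selection is identical in both calls; only the severity attached to the fallback message changes.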

Also the failure to add opps_by_lvl appears to be a separate issue (e.g.
related to the duplicate entries).

> > [    9.727098] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
> > [    9.737157] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
> > [    9.875039] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
> > [    9.888428] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
> 
> The duplicate entries reported by the perf protocol come directly from
> the speed bins. I was told the duplicate entry with volt 0 is meant to
> indicate a lower power way of achieving the said frequency at a lower
> core count. We have no way of using it in the kernel and it gets safely
> discarded. So again nothing to fix in the kernel.

Again, you should not be spamming the logs with warnings for things that are
benign (e.g. as it may prevent people from noticing real issues).

Also these duplicate entries do not seem to get safely discarded as they
result in a bunch of operations failing loudly at boot (e.g. the
opps_by_lvl warning above) and similarly at resume as I recently
noticed:

[   42.690569] CPU4: Booted secondary processor 0x0000010000 [0x511f0011]
[   42.704360] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
[   42.737865] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
[   42.752943] debugfs: File 'cpu5' in directory 'opp' already present!
[   42.759956] debugfs: File 'cpu6' in directory 'opp' already present!
[   42.766641] debugfs: File 'cpu7' in directory 'opp' already present!
...
[   42.855520] CPU8: Booted secondary processor 0x0000020000 [0x511f0011]
[   42.865188] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
[   42.898494] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
[   42.913559] debugfs: File 'cpu9' in directory 'opp' already present!
[   42.920265] debugfs: File 'cpu10' in directory 'opp' already present!
[   42.927029] debugfs: File 'cpu11' in directory 'opp' already present!

Perhaps you can find some way to filter out the unused, duplicate
entries for qualcomm fw so that all of these issues go away.
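
One way to picture such a filter is the sketch below. It is a minimal illustration and not a proposed patch: the function name and the dictionary field names are hypothetical, the sample entries reuse the index, level and power values from the Domain 1 log earlier in the thread, and keeping the first entry per frequency is an assumption about which duplicate should survive.

```python
def drop_duplicate_levels(entries):
    """Keep the first entry for each frequency, preserving the original order."""
    seen = set()
    kept = []
    for entry in entries:
        freq = entry["freq_khz"]
        if freq in seen:
            continue  # a later entry with the same frequency is skipped
        seen.add(freq)
        kept.append(entry)
    return kept


# Entries modelled on the tail of the Domain 1 log (indices 12 to 15).
reported = [
    {"index": 12, "freq_khz": 3417600, "power": 307141},
    {"index": 13, "freq_khz": 3417600, "power": 307141},
    {"index": 14, "freq_khz": 3417600, "power": 307141},
    {"index": 15, "freq_khz": 4012800, "power": 539962},
]
filtered = drop_duplicate_levels(reported)
```

Filtering before the levels are registered means the later consumers never see the repeated frequency, so neither the insertion failure nor the duplicate-OPP message would be triggered in this model.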

> > [    9.913506] debugfs: Directory 'NCC' with parent 'pm_genpd' already present!
> > [    9.922198] debugfs: Directory 'NCC' with parent 'pm_genpd' already present!
> 
> Yeah I did notice ^^ during dev, the series isn't the one introducing it
> so it shouldn't block the series acceptance. Meanwhile I'll spend some
> cycles to get this warn fixed.

I didn't try to track down where this comes from, but figured it could
be related to the duplicate entries. Either way, these are actually
errors (not just warnings) that need to be addressed in some way.

Johan

Konrad Dybcio July 9, 2024, 9:39 a.m. UTC | #8

On 9.07.2024 11:13 AM, Johan Hovold wrote:
> Hi Sibi,
> 
> On Wed, Jul 03, 2024 at 01:29:11AM +0530, Sibi Sankar wrote:
>> On 7/2/24 21:25, Johan Hovold wrote:
>>> On Wed, Jun 12, 2024 at 06:10:56PM +0530, Sibi Sankar wrote:
>>>> Enable cpufreq on X1E80100 SoCs through the SCMI perf protocol node.
> 
>>> This series gives a nice performance boost on the x1e80100 CRD, but I'm
>>> seeing a bunch of warnings and errors that need to be addressed:
>>>
>>> [    9.533053] arm-scmi firmware:scmi: Failed to get FC for protocol 13 [MSG_ID:6 / RES_ID:0] - ret:-95. Using regular messaging.
>>> [    9.549458] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
>>> [    9.563925] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
>>> [    9.572835] arm-scmi firmware:scmi: Failed to get FC for protocol 13 [MSG_ID:6 / RES_ID:1] - ret:-95. Using regular messaging.
>>> [    9.609471] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
>>> [    9.633341] arm-scmi firmware:scmi: Failed to add opps_by_lvl at 3417600 for NCC - ret:-16
>>> [    9.650000] arm-scmi firmware:scmi: Failed to get FC for protocol 13 [MSG_ID:6 / RES_ID:2] - ret:-95. Using regular messaging.
>>
>> X1E uses fast channels only for message-id: 7 (level set) and regular
>> channels for all the other messages. The spec doesn't mandate fast
>> channels for any of the supported message ids for the perf protocol.
>> So nothing to fix here.
> 
> I didn't look at this in any detail, but if the firmware is spec
> compliant you should not be spamming the logs with warnings. Not sure
> how best to address that, but you could, for example, add a quirk for
> qcom fw or at a minimum demote this mess to info level.
> 
> Also the failure to add opps_by_lvl appears to be a separate issue (e.g.
> related to the duplicate entries).
> 
>>> [    9.727098] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
>>> [    9.737157] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
>>> [    9.875039] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
>>> [    9.888428] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
>>
>> The duplicate entries reported by the perf protocol come directly from
>> the speed bins. I was told the duplicate entry with volt 0 is meant to
>> indicate a lower power way of achieving the said frequency at a lower
>> core count. We have no way of using it in the kernel and it gets safely
>> discarded. So again nothing to fix in the kernel.
> 
> Again, you should not be spamming the logs with warnings for things that are
> benign (e.g. as it may prevent people from noticing real issues).
> 
> Also these duplicate entries do not seem to get safely discarded as they
> result in a bunch of operations failing loudly at boot (e.g. the
> opps_by_lvl warning above) and similarly at resume as I recently
> noticed:
> 
> [   42.690569] CPU4: Booted secondary processor 0x0000010000 [0x511f0011]
> [   42.704360] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
> [   42.737865] cpu cpu4: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
> [   42.752943] debugfs: File 'cpu5' in directory 'opp' already present!
> [   42.759956] debugfs: File 'cpu6' in directory 'opp' already present!
> [   42.766641] debugfs: File 'cpu7' in directory 'opp' already present!
> ...
> [   42.855520] CPU8: Booted secondary processor 0x0000020000 [0x511f0011]
> [   42.865188] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
> [   42.898494] cpu cpu8: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 3417600000, volt: 0, enabled: 1. New: freq: 3417600000, volt: 0, enabled: 1
> [   42.913559] debugfs: File 'cpu9' in directory 'opp' already present!
> [   42.920265] debugfs: File 'cpu10' in directory 'opp' already present!
> [   42.927029] debugfs: File 'cpu11' in directory 'opp' already present!
> 
> Perhaps you can find some way to filter out the unused, duplicate
> entries for qualcomm fw so that all of these issues go away.

I would say that the firmware should probably change the PSTATEs'
"enabled" state based on availability and report that to the OS..
Or the OS should know the conditions (enabled core count as you mentioned)
and decide whether it makes sense to shut down these cores, based on
workloads.. The latter sounds more sane..

The SCMI perf protocol already exposes power metrics (through opp->power)
for EAS purposes, so perhaps an additional field could be added (cpu mask /
cpu count, depending on whether the specific cores being off is meaningful)
so that the OS can make more educated choices here.. otherwise this almost
looks like a hack that made it into the firmware because there was no
time left or something..

You mentioned that "We have no way of using it in the kernel", but is that
actually true? Can you not set that OPP if the conditions are met?

Konrad

Konrad Dybcio July 22, 2024, 12:12 p.m. UTC | #9

On 16.07.2024 12:45 PM, Konrad Dybcio wrote:
> On 12.06.2024 2:40 PM, Sibi Sankar wrote:
>> Enable cpufreq on X1E80100 SoCs through the SCMI perf protocol node.
>>
>> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
>> ---
> 
> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>

Taking this back.. forgot about <ZoQjAWse2YxwyRJv@hovoldconsulting.com>

Konrad
