[v2,00/22] Restructure RPM SMD ICC

Message ID 20230526-topic-smd_icc-v2-0-e5934b07d813@linaro.org

Message

Konrad Dybcio June 9, 2023, 8:19 p.m. UTC
This series reshuffles things around, moving the management of SMD RPM
bus clocks to the interconnect framework where they belong. This helps
us solve a couple of issues:

1. We can work towards unused clk cleanup of RPMCC without worrying
   about it killing some NoC bus, resulting in the SoC dying.
   Deasserting actually unused RPM clocks (among other things) will
   let us achieve "true SoC-wide power collapse states", also known as
   VDD_LOW and VDD_MIN.

2. We no longer have to keep tons of quirky bus clock ifs in the icc
   driver. You either have an RPM clock and call "rpm set rate" or you
   have a single non-RPM clock (like AHB_CLK_SRC) or you don't have any.

3. There's less overhead - instead of going through layers and layers of
   the CCF, ratesetting comes down to calling max() and sending a single
   RPM message. ICC is very very dynamic so that's a big plus.
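
As a rough illustration of point 3, the sketch below models rate setting as "take the max of the per-node rates, then build one request". It is a userspace toy under stated assumptions: struct rpm_clk_req, bus_rate_max() and bus_rate_to_req() are hypothetical names made up for this example and are not the structures or helpers used by the series.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one RPM clock request (not the kernel's type). */
struct rpm_clk_req {
	int ctx;           /* context index, e.g. 0 = active, 1 = sleep here */
	uint64_t rate_khz; /* requested bus clock rate in kHz */
};

/* Return the largest rate any node on the bus asked for (0 if none). */
static uint64_t bus_rate_max(const uint64_t *node_rates_khz, size_t n)
{
	uint64_t rate = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (node_rates_khz[i] > rate)
			rate = node_rates_khz[i];

	return rate;
}

/* Build the single request that would be sent for one context. */
static struct rpm_clk_req bus_rate_to_req(int ctx, const uint64_t *rates,
					  size_t n)
{
	struct rpm_clk_req req = {
		.ctx = ctx,
		.rate_khz = bus_rate_max(rates, n),
	};

	return req;
}
```

The point of the shape is that nothing between the aggregation and the message needs to walk a clock tree: one comparison loop, one request per context.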

The clocks still need to be vaguely described in the clk-smd-rpm driver,
as it gives them an initial kickoff, before actually telling RPM to
enable DVFS scaling.  After RPM receives that command, all clocks that
have not been assigned a rate are considered unused and are shut down
in hardware, leading to the same issue as described in point 1.

We can consider marking them __initconst in the future, but this series
is very fat even without that.

Apart from that, it squashes a couple of bugs that really need fixing.

--- MERGING STRATEGY ---
If Stephen and Georgi agree, it would be best to take all of this through
the qcom tree, as it touches on heavily intertwined components and
introduces compile-time dependencies between icc and clk drivers.

Tested on SM6375 (OOT), MSM8998 (OOT), MSM8996.

MSM8974 conversion to common code and modernization will be handled separately.

Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
---
Changes in v2:
- Sort entries properly in "Add missing headers in icc-rpm.h"
- Fix the check for no clocks on a given provider
- Replace "Divide clk rate by src node bus width" with a proper fix
- Add "Set correct bandwidth through RPM bw req"
- Split "Add QCOM_SMD_RPM_STATE_NUM" into 2 logical changes
- Move "Separate out interconnect bus clocks" a bit later in the series
- Link to v1: https://lore.kernel.org/r/20230526-topic-smd_icc-v1-0-1bf8e6663c4e@linaro.org

---
Konrad Dybcio (22):
      soc: qcom: smd-rpm: Add QCOM_SMD_RPM_STATE_NUM
      soc: qcom: smd-rpm: Use tabs for defines
      clk: qcom: smd-rpm: Move some RPM resources to the common header
      clk: qcom: smd-rpm: Export clock scaling availability
      interconnect: qcom: icc-rpm: Introduce keep_alive
      interconnect: qcom: icc-rpm: Allow negative QoS offset
      interconnect: qcom: Fold smd-rpm.h into icc-rpm.h
      interconnect: qcom: smd-rpm: Add rpmcc handling skeleton code
      interconnect: qcom: Add missing headers in icc-rpm.h
      interconnect: qcom: Define RPM bus clocks
      interconnect: qcom: sdm660: Hook up RPM bus clk definitions
      interconnect: qcom: msm8996: Hook up RPM bus clk definitions
      interconnect: qcom: qcs404: Hook up RPM bus clk definitions
      interconnect: qcom: msm8939: Hook up RPM bus clk definitions
      interconnect: qcom: msm8916: Hook up RPM bus clk definitions
      interconnect: qcom: qcm2290: Hook up RPM bus clk definitions
      interconnect: qcom: icc-rpm: Control bus rpmcc from icc
      clk: qcom: smd-rpm: Separate out interconnect bus clocks
      interconnect: qcom: icc-rpm: Fix bucket number
      interconnect: qcom: icc-rpm: Set bandwidth on both contexts
      interconnect: qcom: icc-rpm: Set correct bandwidth through RPM bw req
      interconnect: qcom: icc-rpm: Fix bandwidth calculations

 drivers/clk/qcom/clk-smd-rpm.c             | 300 ++++++++++++-----------------
 drivers/interconnect/qcom/Makefile         |   2 +-
 drivers/interconnect/qcom/icc-rpm-clocks.c |  66 +++++++
 drivers/interconnect/qcom/icc-rpm.c        | 212 ++++++++++----------
 drivers/interconnect/qcom/icc-rpm.h        |  55 ++++--
 drivers/interconnect/qcom/msm8916.c        |   4 +-
 drivers/interconnect/qcom/msm8939.c        |   5 +-
 drivers/interconnect/qcom/msm8974.c        |   2 +-
 drivers/interconnect/qcom/msm8996.c        |   9 +-
 drivers/interconnect/qcom/qcm2290.c        |   7 +-
 drivers/interconnect/qcom/qcs404.c         |   4 +-
 drivers/interconnect/qcom/sdm660.c         |   7 +-
 drivers/interconnect/qcom/smd-rpm.c        |  39 +++-
 drivers/interconnect/qcom/smd-rpm.h        |  15 --
 include/linux/soc/qcom/smd-rpm.h           |  22 ++-
 15 files changed, 427 insertions(+), 322 deletions(-)
---
base-commit: 53ab6975c12d1ad86c599a8927e8c698b144d669
change-id: 20230526-topic-smd_icc-b8213948a5ed

Best regards,

Comments

Stephan Gerhold June 10, 2023, 11:35 a.m. UTC | #1
On Fri, Jun 09, 2023 at 10:19:09PM +0200, Konrad Dybcio wrote:
> Before we issue a call to RPM through clk_smd_rpm_enable_scaling() the
> clock rate requests will not be committed in hardware. This poses a
> race threat since we're accessing the bus clocks directly from within
> the interconnect framework.
> 
> Add a marker to indicate that we're good to go with sending new requests
> and export it so that it can be referenced from icc.
> 
> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
> ---
>  drivers/clk/qcom/clk-smd-rpm.c   | 9 +++++++++
>  include/linux/soc/qcom/smd-rpm.h | 2 ++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/drivers/clk/qcom/clk-smd-rpm.c b/drivers/clk/qcom/clk-smd-rpm.c
> index 937cb1515968..482fe30ee6f0 100644
> --- a/drivers/clk/qcom/clk-smd-rpm.c
> +++ b/drivers/clk/qcom/clk-smd-rpm.c
> @@ -151,6 +151,7 @@
>  #define to_clk_smd_rpm(_hw) container_of(_hw, struct clk_smd_rpm, hw)
>  
>  static struct qcom_smd_rpm *rpmcc_smd_rpm;
> +static bool smd_rpm_clk_scaling;
>  
>  struct clk_smd_rpm {
>  	const int rpm_res_type;
> @@ -385,6 +386,12 @@ static unsigned long clk_smd_rpm_recalc_rate(struct clk_hw *hw,
>  	return r->rate;
>  }
>  
> +bool qcom_smd_rpm_scaling_available(void)
> +{
> +	return smd_rpm_clk_scaling;
> +}
> +EXPORT_SYMBOL_GPL(qcom_smd_rpm_scaling_available);
> +
>  static int clk_smd_rpm_enable_scaling(void)
>  {
>  	int ret;
> @@ -410,6 +417,8 @@ static int clk_smd_rpm_enable_scaling(void)
>  		return ret;
>  	}
>  
> +	smd_rpm_clk_scaling = true;
> +

If you move the platform_device_register_data(&rpdev->dev,
"icc_smd_rpm", ...) from drivers/soc/qcom/smd-rpm.c to here you can
avoid the race completely and drop this API. I think that would be
cleaner. And it will likely probe much faster because probe deferral
is slow. :)

Thanks,
Stephan
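
For readers following along, the flag-based approach in the quoted patch amounts to the pattern sketched below: a consumer checks the exported marker and defers its probe until scaling has been enabled. This is a userspace model only. scaling_available() and enable_scaling() mirror the patch's qcom_smd_rpm_scaling_available() and the assignment in clk_smd_rpm_enable_scaling(), while consumer_probe() is a hypothetical consumer; how icc actually consumes the flag is an assumption here, not something shown in this patch.

```c
#include <stdbool.h>

#define EPROBE_DEFER 517 /* same numeric value as the kernel's errno */

/* Models the patch's smd_rpm_clk_scaling marker. */
static bool scaling_enabled;

/* Models qcom_smd_rpm_scaling_available(). */
static bool scaling_available(void)
{
	return scaling_enabled;
}

/* Models the tail of clk_smd_rpm_enable_scaling() on success. */
static void enable_scaling(void)
{
	scaling_enabled = true;
}

/* Hypothetical consumer probe: defer until rate requests can be committed. */
static int consumer_probe(void)
{
	if (!scaling_available())
		return -EPROBE_DEFER;

	return 0;
}
```

Registering the consumer device only after scaling is enabled, as suggested above, would make the deferral branch unreachable, which is why the exported marker could then be dropped.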

Stephan Gerhold June 10, 2023, 12:02 p.m. UTC | #2
On Fri, Jun 09, 2023 at 10:19:19PM +0200, Konrad Dybcio wrote:
> Assign the necessary definitions to migrate to the new bus clock
> handling mechanism.
> 
> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>

Reviewed-by: Stephan Gerhold <stephan@gerhold.net>

> ---
>  drivers/interconnect/qcom/msm8939.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/interconnect/qcom/msm8939.c b/drivers/interconnect/qcom/msm8939.c
> index 639566dce45a..94b11b590a8e 100644
> --- a/drivers/interconnect/qcom/msm8939.c
> +++ b/drivers/interconnect/qcom/msm8939.c
> @@ -1284,6 +1284,7 @@ static const struct qcom_icc_desc msm8939_snoc = {
>  	.type = QCOM_ICC_NOC,
>  	.nodes = msm8939_snoc_nodes,
>  	.num_nodes = ARRAY_SIZE(msm8939_snoc_nodes),
> +	.bus_clk_desc = &bus_1_clk,
>  	.regmap_cfg = &msm8939_snoc_regmap_config,
>  	.qos_offset = 0x7000,
>  };
> @@ -1304,6 +1305,7 @@ static const struct qcom_icc_desc msm8939_snoc_mm = {
>  	.type = QCOM_ICC_NOC,
>  	.nodes = msm8939_snoc_mm_nodes,
>  	.num_nodes = ARRAY_SIZE(msm8939_snoc_mm_nodes),
> +	.bus_clk_desc = &bus_2_clk,
>  	.regmap_cfg = &msm8939_snoc_regmap_config,
>  	.qos_offset = 0x7000,
>  };
> @@ -1332,6 +1334,7 @@ static const struct qcom_icc_desc msm8939_bimc = {
>  	.type = QCOM_ICC_BIMC,
>  	.nodes = msm8939_bimc_nodes,
>  	.num_nodes = ARRAY_SIZE(msm8939_bimc_nodes),
> +	.bus_clk_desc = &bimc_clk,
>  	.regmap_cfg = &msm8939_bimc_regmap_config,
>  	.qos_offset = 0x8000,
>  };
> @@ -1403,6 +1406,7 @@ static const struct qcom_icc_desc msm8939_pcnoc = {
>  	.type = QCOM_ICC_NOC,
>  	.nodes = msm8939_pcnoc_nodes,
>  	.num_nodes = ARRAY_SIZE(msm8939_pcnoc_nodes),
> +	.bus_clk_desc = &bus_0_clk,
>  	.regmap_cfg = &msm8939_pcnoc_regmap_config,
>  	.qos_offset = 0x7000,
>  };
> 
> -- 
> 2.41.0
>

Konrad Dybcio June 10, 2023, 12:15 p.m. UTC | #3
On 10.06.2023 13:35, Stephan Gerhold wrote:
> On Fri, Jun 09, 2023 at 10:19:09PM +0200, Konrad Dybcio wrote:
>> Before we issue a call to RPM through clk_smd_rpm_enable_scaling() the
>> clock rate requests will not be committed in hardware. This poses a
>> race threat since we're accessing the bus clocks directly from within
>> the interconnect framework.
>>
>> Add a marker to indicate that we're good to go with sending new requests
>> and export it so that it can be referenced from icc.
>>
>> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
>> ---
>>  drivers/clk/qcom/clk-smd-rpm.c   | 9 +++++++++
>>  include/linux/soc/qcom/smd-rpm.h | 2 ++
>>  2 files changed, 11 insertions(+)
>>
>> diff --git a/drivers/clk/qcom/clk-smd-rpm.c b/drivers/clk/qcom/clk-smd-rpm.c
>> index 937cb1515968..482fe30ee6f0 100644
>> --- a/drivers/clk/qcom/clk-smd-rpm.c
>> +++ b/drivers/clk/qcom/clk-smd-rpm.c
>> @@ -151,6 +151,7 @@
>>  #define to_clk_smd_rpm(_hw) container_of(_hw, struct clk_smd_rpm, hw)
>>  
>>  static struct qcom_smd_rpm *rpmcc_smd_rpm;
>> +static bool smd_rpm_clk_scaling;
>>  
>>  struct clk_smd_rpm {
>>  	const int rpm_res_type;
>> @@ -385,6 +386,12 @@ static unsigned long clk_smd_rpm_recalc_rate(struct clk_hw *hw,
>>  	return r->rate;
>>  }
>>  
>> +bool qcom_smd_rpm_scaling_available(void)
>> +{
>> +	return smd_rpm_clk_scaling;
>> +}
>> +EXPORT_SYMBOL_GPL(qcom_smd_rpm_scaling_available);
>> +
>>  static int clk_smd_rpm_enable_scaling(void)
>>  {
>>  	int ret;
>> @@ -410,6 +417,8 @@ static int clk_smd_rpm_enable_scaling(void)
>>  		return ret;
>>  	}
>>  
>> +	smd_rpm_clk_scaling = true;
>> +
> 
> If you move the platform_device_register_data(&rpdev->dev,
> "icc_smd_rpm", ...) from drivers/soc/qcom/smd-rpm.c to here you can
> avoid the race completely and drop this API. I think that would be
> cleaner. And it will likely probe much faster because probe deferral
> is slow. :)
Sounds like an idea.. especially since it's pretty much the only
dependency other than SMDRPM itself!

Konrad
> 
> Thanks,
> Stephan

Stephan Gerhold June 10, 2023, 6 p.m. UTC | #4
On Fri, Jun 09, 2023 at 10:19:25PM +0200, Konrad Dybcio wrote:
> Up until now, for some reason we've only been setting bandwidth values
> on the active-only context. That pretty much meant that RPM could lift
> all votes when entering sleep mode. Or never sleep at all.
> 
> That in turn could potentially break things like USB wakeup, as the
> connection between APSS and SNoC/PNoC would simply be dead.
> 

Nitpick: Apparently an "active" vote is applied during both active+sleep
until the first "sleep" vote is sent. It's documented only for
regulators [1] but I would expect the same applies to the bandwidths.
This means actual breakage shouldn't have been possible.

The patch itself is still the right thing to do to have the sleep state
correct during deep cpuidle/suspend.

[1]: https://git.codelinaro.org/clo/la/kernel/msm-3.10/-/blob/LA.BR.1.2.9.c26-04700-8x09.0/drivers/regulator/rpm-smd-regulator.c#L199-209
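
To make the nitpick concrete, here is a toy model of the semantics the regulator comment describes: the active vote covers both sets until a sleep vote has been sent at least once. Whether the same holds for bandwidth requests is only an expectation, and every name below (struct rpm_votes, vote(), effective_sleep_bw()) is made up for this sketch rather than taken from RPM firmware or the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

enum { CTX_ACTIVE, CTX_SLEEP, CTX_NUM };

/* Hypothetical per-resource vote state as seen by the RPM side. */
struct rpm_votes {
	uint64_t bw[CTX_NUM];
	bool sleep_voted; /* set once the first sleep vote arrives */
};

/* Record a vote for one context. */
static void vote(struct rpm_votes *v, int ctx, uint64_t bw)
{
	v->bw[ctx] = bw;
	if (ctx == CTX_SLEEP)
		v->sleep_voted = true;
}

/* Value applied in sleep: the active vote until a sleep vote exists. */
static uint64_t effective_sleep_bw(const struct rpm_votes *v)
{
	return v->sleep_voted ? v->bw[CTX_SLEEP] : v->bw[CTX_ACTIVE];
}
```

Under this model an active-only driver never drops its vote in sleep, so nothing breaks, but the sleep state is also never lower than the active one, which is what the patch corrects.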

> Set the values appropriately.
> 
> Fixes: 30c8fa3ec61a ("interconnect: qcom: Add MSM8916 interconnect provider driver")
> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
> ---
>  drivers/interconnect/qcom/icc-rpm.c | 54 +++++++++++++++++++------------------
>  1 file changed, 28 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/interconnect/qcom/icc-rpm.c b/drivers/interconnect/qcom/icc-rpm.c
> index 3ac47b818afe..ac719013077e 100644
> --- a/drivers/interconnect/qcom/icc-rpm.c
> +++ b/drivers/interconnect/qcom/icc-rpm.c
> @@ -205,34 +205,39 @@ static int qcom_icc_qos_set(struct icc_node *node)
>  	}
>  }
>  
> -static int qcom_icc_rpm_set(struct qcom_icc_node *qn, u64 sum_bw)
> +static int qcom_icc_rpm_set(struct qcom_icc_node *qn, u64 *bw)
>  {
> -	int ret = 0;
> +	int ret, rpm_ctx = 0;
> +	u64 bw_bps;
>  
>  	if (qn->qos.ap_owned)
>  		return 0;
>  
> -	if (qn->mas_rpm_id != -1) {
> -		ret = qcom_icc_rpm_smd_send(QCOM_SMD_RPM_ACTIVE_STATE,
> -					    RPM_BUS_MASTER_REQ,
> -					    qn->mas_rpm_id,
> -					    sum_bw);
> -		if (ret) {
> -			pr_err("qcom_icc_rpm_smd_send mas %d error %d\n",
> -			       qn->mas_rpm_id, ret);
> -			return ret;
> +	for (rpm_ctx = 0; rpm_ctx < QCOM_SMD_RPM_STATE_NUM; rpm_ctx++) {
> +		bw_bps = icc_units_to_bps(bw[rpm_ctx]);
> +
> +		if (qn->mas_rpm_id != -1) {
> +			ret = qcom_icc_rpm_smd_send(rpm_ctx,
> +						    RPM_BUS_MASTER_REQ,
> +						    qn->mas_rpm_id,
> +						    bw_bps);
> +			if (ret) {
> +				pr_err("qcom_icc_rpm_smd_send mas %d error %d\n",
> +				qn->mas_rpm_id, ret);
> +				return ret;
> +			}
>  		}
> -	}
>  
> -	if (qn->slv_rpm_id != -1) {
> -		ret = qcom_icc_rpm_smd_send(QCOM_SMD_RPM_ACTIVE_STATE,
> -					    RPM_BUS_SLAVE_REQ,
> -					    qn->slv_rpm_id,
> -					    sum_bw);
> -		if (ret) {
> -			pr_err("qcom_icc_rpm_smd_send slv %d error %d\n",
> -			       qn->slv_rpm_id, ret);
> -			return ret;
> +		if (qn->slv_rpm_id != -1) {
> +			ret = qcom_icc_rpm_smd_send(rpm_ctx,
> +						    RPM_BUS_SLAVE_REQ,
> +						    qn->slv_rpm_id,
> +						    bw_bps);
> +			if (ret) {
> +				pr_err("qcom_icc_rpm_smd_send slv %d error %d\n",
> +				qn->slv_rpm_id, ret);
> +				return ret;
> +			}
>  		}
>  	}
>  
> @@ -337,7 +342,6 @@ static int qcom_icc_set(struct icc_node *src, struct icc_node *dst)
>  	struct qcom_icc_provider *qp;
>  	struct qcom_icc_node *src_qn = NULL, *dst_qn = NULL;
>  	struct icc_provider *provider;
> -	u64 sum_bw;
>  	u64 active_rate, sleep_rate;
>  	u64 agg_avg[QCOM_SMD_RPM_STATE_NUM], agg_peak[QCOM_SMD_RPM_STATE_NUM];
>  	u64 max_agg_avg;
> @@ -351,14 +355,12 @@ static int qcom_icc_set(struct icc_node *src, struct icc_node *dst)
>  
>  	qcom_icc_bus_aggregate(provider, agg_avg, agg_peak, &max_agg_avg);
>  
> -	sum_bw = icc_units_to_bps(max_agg_avg);
> -
> -	ret = qcom_icc_rpm_set(src_qn, sum_bw);
> +	ret = qcom_icc_rpm_set(src_qn, agg_avg);
>  	if (ret)
>  		return ret;
>  
>  	if (dst_qn) {
> -		ret = qcom_icc_rpm_set(dst_qn, sum_bw);
> +		ret = qcom_icc_rpm_set(dst_qn, agg_avg);
>  		if (ret)
>  			return ret;
>  	}
> 
> -- 
> 2.41.0
>

Konrad Dybcio June 10, 2023, 6:28 p.m. UTC | #5
On 10.06.2023 20:00, Stephan Gerhold wrote:
> On Fri, Jun 09, 2023 at 10:19:25PM +0200, Konrad Dybcio wrote:
>> Up until now, for some reason we've only been setting bandwidth values
>> on the active-only context. That pretty much meant that RPM could lift
>> all votes when entering sleep mode. Or never sleep at all.
>>
>> That in turn could potentially break things like USB wakeup, as the
>> connection between APSS and SNoC/PNoC would simply be dead.
>>
> 
> Nitpick: Apparently an "active" vote is applied during both active+sleep
> until the first "sleep" vote is sent. It's documented only for
> regulators [1] but I would expect the same applies to the bandwidths.
> This means actual breakage shouldn't have been possible.
..unless some part of the boot chain voted for the sleep set!

I'm not sure whether the regulator comment also holds for bw, but I
also don't really have a great way to check it.. Would you want me to
alter this commit message somehow?

Konrad
> 
> The patch itself is still the right thing to do to have the sleep state
> correct during deep cpuidle/suspend.
> 
> [1]: https://git.codelinaro.org/clo/la/kernel/msm-3.10/-/blob/LA.BR.1.2.9.c26-04700-8x09.0/drivers/regulator/rpm-smd-regulator.c#L199-209
> 
>> Set the values appropriately.
>>
>> Fixes: 30c8fa3ec61a ("interconnect: qcom: Add MSM8916 interconnect provider driver")
>> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
>> ---
>>  drivers/interconnect/qcom/icc-rpm.c | 54 +++++++++++++++++++------------------
>>  1 file changed, 28 insertions(+), 26 deletions(-)
>>
>> diff --git a/drivers/interconnect/qcom/icc-rpm.c b/drivers/interconnect/qcom/icc-rpm.c
>> index 3ac47b818afe..ac719013077e 100644
>> --- a/drivers/interconnect/qcom/icc-rpm.c
>> +++ b/drivers/interconnect/qcom/icc-rpm.c
>> @@ -205,34 +205,39 @@ static int qcom_icc_qos_set(struct icc_node *node)
>>  	}
>>  }
>>  
>> -static int qcom_icc_rpm_set(struct qcom_icc_node *qn, u64 sum_bw)
>> +static int qcom_icc_rpm_set(struct qcom_icc_node *qn, u64 *bw)
>>  {
>> -	int ret = 0;
>> +	int ret, rpm_ctx = 0;
>> +	u64 bw_bps;
>>  
>>  	if (qn->qos.ap_owned)
>>  		return 0;
>>  
>> -	if (qn->mas_rpm_id != -1) {
>> -		ret = qcom_icc_rpm_smd_send(QCOM_SMD_RPM_ACTIVE_STATE,
>> -					    RPM_BUS_MASTER_REQ,
>> -					    qn->mas_rpm_id,
>> -					    sum_bw);
>> -		if (ret) {
>> -			pr_err("qcom_icc_rpm_smd_send mas %d error %d\n",
>> -			       qn->mas_rpm_id, ret);
>> -			return ret;
>> +	for (rpm_ctx = 0; rpm_ctx < QCOM_SMD_RPM_STATE_NUM; rpm_ctx++) {
>> +		bw_bps = icc_units_to_bps(bw[rpm_ctx]);
>> +
>> +		if (qn->mas_rpm_id != -1) {
>> +			ret = qcom_icc_rpm_smd_send(rpm_ctx,
>> +						    RPM_BUS_MASTER_REQ,
>> +						    qn->mas_rpm_id,
>> +						    bw_bps);
>> +			if (ret) {
>> +				pr_err("qcom_icc_rpm_smd_send mas %d error %d\n",
>> +				qn->mas_rpm_id, ret);
>> +				return ret;
>> +			}
>>  		}
>> -	}
>>  
>> -	if (qn->slv_rpm_id != -1) {
>> -		ret = qcom_icc_rpm_smd_send(QCOM_SMD_RPM_ACTIVE_STATE,
>> -					    RPM_BUS_SLAVE_REQ,
>> -					    qn->slv_rpm_id,
>> -					    sum_bw);
>> -		if (ret) {
>> -			pr_err("qcom_icc_rpm_smd_send slv %d error %d\n",
>> -			       qn->slv_rpm_id, ret);
>> -			return ret;
>> +		if (qn->slv_rpm_id != -1) {
>> +			ret = qcom_icc_rpm_smd_send(rpm_ctx,
>> +						    RPM_BUS_SLAVE_REQ,
>> +						    qn->slv_rpm_id,
>> +						    bw_bps);
>> +			if (ret) {
>> +				pr_err("qcom_icc_rpm_smd_send slv %d error %d\n",
>> +				qn->slv_rpm_id, ret);
>> +				return ret;
>> +			}
>>  		}
>>  	}
>>  
>> @@ -337,7 +342,6 @@ static int qcom_icc_set(struct icc_node *src, struct icc_node *dst)
>>  	struct qcom_icc_provider *qp;
>>  	struct qcom_icc_node *src_qn = NULL, *dst_qn = NULL;
>>  	struct icc_provider *provider;
>> -	u64 sum_bw;
>>  	u64 active_rate, sleep_rate;
>>  	u64 agg_avg[QCOM_SMD_RPM_STATE_NUM], agg_peak[QCOM_SMD_RPM_STATE_NUM];
>>  	u64 max_agg_avg;
>> @@ -351,14 +355,12 @@ static int qcom_icc_set(struct icc_node *src, struct icc_node *dst)
>>  
>>  	qcom_icc_bus_aggregate(provider, agg_avg, agg_peak, &max_agg_avg);
>>  
>> -	sum_bw = icc_units_to_bps(max_agg_avg);
>> -
>> -	ret = qcom_icc_rpm_set(src_qn, sum_bw);
>> +	ret = qcom_icc_rpm_set(src_qn, agg_avg);
>>  	if (ret)
>>  		return ret;
>>  
>>  	if (dst_qn) {
>> -		ret = qcom_icc_rpm_set(dst_qn, sum_bw);
>> +		ret = qcom_icc_rpm_set(dst_qn, agg_avg);
>>  		if (ret)
>>  			return ret;
>>  	}
>>
>> -- 
>> 2.41.0
>>

Stephan Gerhold June 10, 2023, 6:43 p.m. UTC | #6
On Sat, Jun 10, 2023 at 08:28:22PM +0200, Konrad Dybcio wrote:
> 
> 
> On 10.06.2023 20:00, Stephan Gerhold wrote:
> > On Fri, Jun 09, 2023 at 10:19:25PM +0200, Konrad Dybcio wrote:
> >> Up until now, for some reason we've only been setting bandwidth values
> >> on the active-only context. That pretty much meant that RPM could lift
> >> all votes when entering sleep mode. Or never sleep at all.
> >>
> >> That in turn could potentially break things like USB wakeup, as the
> >> connection between APSS and SNoC/PNoC would simply be dead.
> >>
> > 
> > Nitpick: Apparently an "active" vote is applied during both active+sleep
> > until the first "sleep" vote is sent. It's documented only for
> > regulators [1] but I would expect the same applies to the bandwidths.
> > This means actual breakage shouldn't have been possible.
> ..unless some part of the boot chain voted for the sleep set!
> 
> I'm not sure whether the regulator comment also holds for bw, but I
> also don't really have a great way to check it.. Would you want me to
> alter this commit message somehow?
> 

Hm. Well, on a second look you used "could" instead of "definitely does"
everywhere in your commit message. There is indeed a slight chance
so feel free to just keep it as-is. :D

Thanks,
Stephan

Konrad Dybcio June 10, 2023, 6:53 p.m. UTC | #7
On 10.06.2023 14:15, Konrad Dybcio wrote:
> 
> 
> On 10.06.2023 13:35, Stephan Gerhold wrote:
>> On Fri, Jun 09, 2023 at 10:19:09PM +0200, Konrad Dybcio wrote:
>>> Before we issue a call to RPM through clk_smd_rpm_enable_scaling() the
>>> clock rate requests will not be committed in hardware. This poses a
>>> race threat since we're accessing the bus clocks directly from within
>>> the interconnect framework.
>>>
>>> Add a marker to indicate that we're good to go with sending new requests
>>> and export it so that it can be referenced from icc.
>>>
>>> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
>>> ---
>>>  drivers/clk/qcom/clk-smd-rpm.c   | 9 +++++++++
>>>  include/linux/soc/qcom/smd-rpm.h | 2 ++
>>>  2 files changed, 11 insertions(+)
>>>
>>> diff --git a/drivers/clk/qcom/clk-smd-rpm.c b/drivers/clk/qcom/clk-smd-rpm.c
>>> index 937cb1515968..482fe30ee6f0 100644
>>> --- a/drivers/clk/qcom/clk-smd-rpm.c
>>> +++ b/drivers/clk/qcom/clk-smd-rpm.c
>>> @@ -151,6 +151,7 @@
>>>  #define to_clk_smd_rpm(_hw) container_of(_hw, struct clk_smd_rpm, hw)
>>>  
>>>  static struct qcom_smd_rpm *rpmcc_smd_rpm;
>>> +static bool smd_rpm_clk_scaling;
>>>  
>>>  struct clk_smd_rpm {
>>>  	const int rpm_res_type;
>>> @@ -385,6 +386,12 @@ static unsigned long clk_smd_rpm_recalc_rate(struct clk_hw *hw,
>>>  	return r->rate;
>>>  }
>>>  
>>> +bool qcom_smd_rpm_scaling_available(void)
>>> +{
>>> +	return smd_rpm_clk_scaling;
>>> +}
>>> +EXPORT_SYMBOL_GPL(qcom_smd_rpm_scaling_available);
>>> +
>>>  static int clk_smd_rpm_enable_scaling(void)
>>>  {
>>>  	int ret;
>>> @@ -410,6 +417,8 @@ static int clk_smd_rpm_enable_scaling(void)
>>>  		return ret;
>>>  	}
>>>  
>>> +	smd_rpm_clk_scaling = true;
>>> +
>>
>> If you move the platform_device_register_data(&rpdev->dev,
>> "icc_smd_rpm", ...) from drivers/soc/qcom/smd-rpm.c to here you can
>> avoid the race completely and drop this API. I think that would be
>> cleaner. And it will likely probe much faster because probe deferral
>> is slow. :)
> Sounds like an idea.. especially since it's pretty much the only
> dependency other than SMDRPM itself!
It sounds great, but to not break bisecting one has to:

1. change the registration in soc/smd-rpm to store rpm ptr in driver
   data, in addition to parent driver data

2. change icc/smd-rpm to use the device and not parent data

3. add a platform_device_register_data call in clk-smd-rpm that will
   always fail because the device is always registered

4. remove the registration from soc/smd-rpm


I know you'd love to see me break my [PATCH xx/42] record, but I'd say
that deserves its own series and *this* patch could be a good transition
middleground :P

Hopefully Stephen, Bjorn and Georgi won't send a hitman after me for
abusing atomic cross-subsystem merges..

Konrad

> 
> Konrad
>>
>> Thanks,
>> Stephan

Stephan Gerhold June 10, 2023, 7:06 p.m. UTC | #8
On Fri, Jun 09, 2023 at 10:19:27PM +0200, Konrad Dybcio wrote:
> Up until now, we've been aggregating the bandwidth values and only
> dividing them by the bus width of the source node. This was completely
> wrong, as different nodes on a given path may (and usually do) have
> varying bus widths.  That in turn, resulted in the calculated clock rates
> being completely bogus - usually they ended up being much higher, as
> NoC_A<->NoC_B links are very wide.
> 
> Since we're not using the aggregate bandwidth value for anything other
> than clock rate calculations, remodel qcom_icc_bus_aggregate() to
> calculate the per-context clock rate for a given provider, taking into
> account the bus width of every individual node.
> 
> Fixes: 30c8fa3ec61a ("interconnect: qcom: Add MSM8916 interconnect provider driver")
> Reported-by: Stephan Gerhold <stephan@gerhold.net>
> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
> ---
>  drivers/interconnect/qcom/icc-rpm.c | 59 ++++++++++++-------------------------
>  1 file changed, 19 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/interconnect/qcom/icc-rpm.c b/drivers/interconnect/qcom/icc-rpm.c
> index 1508233632f6..d177a76abe2a 100644
> --- a/drivers/interconnect/qcom/icc-rpm.c
> +++ b/drivers/interconnect/qcom/icc-rpm.c
> @@ -293,58 +293,44 @@ static int qcom_icc_bw_aggregate(struct icc_node *node, u32 tag, u32 avg_bw,
>  }
>  
>  /**
> - * qcom_icc_bus_aggregate - aggregate bandwidth by traversing all nodes
> + * qcom_icc_bus_aggregate - calculate bus clock rates by traversing all nodes
>   * @provider: generic interconnect provider
> - * @agg_avg: an array for aggregated average bandwidth of buckets
> - * @agg_peak: an array for aggregated peak bandwidth of buckets
> - * @max_agg_avg: pointer to max value of aggregated average bandwidth
> + * @agg_clk_rate: array containing the aggregated clock rates in kHz
>   */
> -static void qcom_icc_bus_aggregate(struct icc_provider *provider,
> -				   u64 *agg_avg, u64 *agg_peak,
> -				   u64 *max_agg_avg)
> +static void qcom_icc_bus_aggregate(struct icc_provider *provider, u64 *agg_clk_rate)
>  {
> -	struct icc_node *node;
> +	u64 agg_avg_rate, agg_rate;
>  	struct qcom_icc_node *qn;
> -	u64 sum_avg[QCOM_SMD_RPM_STATE_NUM];
> +	struct icc_node *node;
>  	int i;
>  
> -	/* Initialise aggregate values */
> -	for (i = 0; i < QCOM_SMD_RPM_STATE_NUM; i++) {
> -		agg_avg[i] = 0;
> -		agg_peak[i] = 0;
> -	}
> -
> -	*max_agg_avg = 0;
> -
>  	/*
> -	 * Iterate nodes on the interconnect and aggregate bandwidth
> -	 * requests for every bucket.
> +	 * Iterate nodes on the provider, aggregate bandwidth requests for
> +	 * every bucket and convert them into bus clock rates.
>  	 */
>  	list_for_each_entry(node, &provider->nodes, node_list) {
>  		qn = node->data;
>  		for (i = 0; i < QCOM_SMD_RPM_STATE_NUM; i++) {
>  			if (qn->channels)
> -				sum_avg[i] = div_u64(qn->sum_avg[i], qn->channels);
> +				agg_avg_rate = div_u64(qn->sum_avg[i], qn->channels);
>  			else
> -				sum_avg[i] = qn->sum_avg[i];
> -			agg_avg[i] += sum_avg[i];
> -			agg_peak[i] = max_t(u64, agg_peak[i], qn->max_peak[i]);
> +				agg_avg_rate = qn->sum_avg[i];
> +
> +			agg_rate = max_t(u64, agg_avg_rate, qn->max_peak[i]);
> +			do_div(agg_rate, qn->buswidth);
> +
> +			agg_clk_rate[i] = max_t(u64, agg_clk_rate[i], agg_rate);
>  		}
>  	}
> -
> -	/* Find maximum values across all buckets */
> -	for (i = 0; i < QCOM_SMD_RPM_STATE_NUM; i++)
> -		*max_agg_avg = max_t(u64, *max_agg_avg, agg_avg[i]);
>  }
>  
>  static int qcom_icc_set(struct icc_node *src, struct icc_node *dst)
>  {
> -	struct qcom_icc_provider *qp;
>  	struct qcom_icc_node *src_qn = NULL, *dst_qn = NULL;
> +	u64 agg_clk_rate[QCOM_SMD_RPM_STATE_NUM] = { 0 };
>  	struct icc_provider *provider;
> +	struct qcom_icc_provider *qp;
>  	u64 active_rate, sleep_rate;
> -	u64 agg_avg[QCOM_SMD_RPM_STATE_NUM], agg_peak[QCOM_SMD_RPM_STATE_NUM];
> -	u64 max_agg_avg;
>  	int ret;
>  
>  	src_qn = src->data;
> @@ -353,7 +339,9 @@ static int qcom_icc_set(struct icc_node *src, struct icc_node *dst)
>  	provider = src->provider;
>  	qp = to_qcom_provider(provider);
>  
> -	qcom_icc_bus_aggregate(provider, agg_avg, agg_peak, &max_agg_avg);
> +	qcom_icc_bus_aggregate(provider, agg_clk_rate);
> +	active_rate = agg_clk_rate[QCOM_SMD_RPM_ACTIVE_STATE];
> +	sleep_rate = agg_clk_rate[QCOM_SMD_RPM_SLEEP_STATE];
>  
>  	ret = qcom_icc_rpm_set(src_qn, src_qn->sum_avg);
>  	if (ret)
> @@ -369,15 +357,6 @@ static int qcom_icc_set(struct icc_node *src, struct icc_node *dst)
>  	if (!qp->bus_clk_desc && !qp->bus_clk)
>  		return 0;
>  
> -	/* Intentionally keep the rates in kHz as that's what RPM accepts */

I kind of liked this comment because otherwise it's not obvious why
you're not converting from "ICC units" anywhere.

Anyway:

Reviewed-by: Stephan Gerhold <stephan@gerhold.net>

Thanks for going through the giant maze to fix this!

Stephan
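
The per-node calculation in the patch quoted above can be followed with a small userspace model. It mirrors the loop in the new qcom_icc_bus_aggregate() for a single context: divide the summed average by the channel count (if any), take the max with the peak, divide by that node's own bus width, and keep the largest result across the provider. struct node and provider_clk_rate() are names invented for this sketch, and it assumes buswidth is never zero, just as the patch's do_div() call does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of one interconnect node. */
struct node {
	uint64_t sum_avg;  /* summed average bandwidth, in ICC units */
	uint64_t max_peak; /* largest peak bandwidth, in ICC units */
	uint32_t channels; /* 0 means "not specified" */
	uint32_t buswidth; /* bytes per clock cycle, assumed non-zero */
};

/* Bus clock rate for one context: max over nodes of bw / buswidth. */
static uint64_t provider_clk_rate(const struct node *nodes, size_t n)
{
	uint64_t rate = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t avg = nodes[i].sum_avg;
		uint64_t node_rate;

		if (nodes[i].channels)
			avg /= nodes[i].channels;

		/* The node needs whichever is larger: average or peak. */
		node_rate = avg > nodes[i].max_peak ? avg : nodes[i].max_peak;

		/* Each node is divided by its own width, not the source's. */
		node_rate /= nodes[i].buswidth;

		if (node_rate > rate)
			rate = node_rate;
	}

	return rate;
}
```

Dividing a bandwidth in kB/s by a width in bytes yields a rate in kHz directly, which is why no further unit conversion shows up before the value is handed to RPM.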

Konrad Dybcio June 10, 2023, 7:09 p.m. UTC | #9
On 10.06.2023 21:06, Stephan Gerhold wrote:
> On Fri, Jun 09, 2023 at 10:19:27PM +0200, Konrad Dybcio wrote:
>> Up until now, we've been aggregating the bandwidth values and only
>> dividing them by the bus width of the source node. This was completely
>> wrong, as different nodes on a given path may (and usually do) have
>> varying bus widths.  That in turn, resulted in the calculated clock rates
>> being completely bogus - usually they ended up being much higher, as
>> NoC_A<->NoC_B links are very wide.
>>
>> Since we're not using the aggregate bandwidth value for anything other
>> than clock rate calculations, remodel qcom_icc_bus_aggregate() to
>> calculate the per-context clock rate for a given provider, taking into
>> account the bus width of every individual node.
>>
>> Fixes: 30c8fa3ec61a ("interconnect: qcom: Add MSM8916 interconnect provider driver")
>> Reported-by: Stephan Gerhold <stephan@gerhold.net>
>> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
>> ---
>>  drivers/interconnect/qcom/icc-rpm.c | 59 ++++++++++++-------------------------
>>  1 file changed, 19 insertions(+), 40 deletions(-)
>>
>> diff --git a/drivers/interconnect/qcom/icc-rpm.c b/drivers/interconnect/qcom/icc-rpm.c
>> index 1508233632f6..d177a76abe2a 100644
>> --- a/drivers/interconnect/qcom/icc-rpm.c
>> +++ b/drivers/interconnect/qcom/icc-rpm.c
>> @@ -293,58 +293,44 @@ static int qcom_icc_bw_aggregate(struct icc_node *node, u32 tag, u32 avg_bw,
>>  }
>>  
>>  /**
>> - * qcom_icc_bus_aggregate - aggregate bandwidth by traversing all nodes
>> + * qcom_icc_bus_aggregate - calculate bus clock rates by traversing all nodes
>>   * @provider: generic interconnect provider
>> - * @agg_avg: an array for aggregated average bandwidth of buckets
>> - * @agg_peak: an array for aggregated peak bandwidth of buckets
>> - * @max_agg_avg: pointer to max value of aggregated average bandwidth
>> + * @agg_clk_rate: array containing the aggregated clock rates in kHz
>>   */
>> -static void qcom_icc_bus_aggregate(struct icc_provider *provider,
>> -				   u64 *agg_avg, u64 *agg_peak,
>> -				   u64 *max_agg_avg)
>> +static void qcom_icc_bus_aggregate(struct icc_provider *provider, u64 *agg_clk_rate)
>>  {
>> -	struct icc_node *node;
>> +	u64 agg_avg_rate, agg_rate;
>>  	struct qcom_icc_node *qn;
>> -	u64 sum_avg[QCOM_SMD_RPM_STATE_NUM];
>> +	struct icc_node *node;
>>  	int i;
>>  
>> -	/* Initialise aggregate values */
>> -	for (i = 0; i < QCOM_SMD_RPM_STATE_NUM; i++) {
>> -		agg_avg[i] = 0;
>> -		agg_peak[i] = 0;
>> -	}
>> -
>> -	*max_agg_avg = 0;
>> -
>>  	/*
>> -	 * Iterate nodes on the interconnect and aggregate bandwidth
>> -	 * requests for every bucket.
>> +	 * Iterate nodes on the provider, aggregate bandwidth requests for
>> +	 * every bucket and convert them into bus clock rates.
>>  	 */
>>  	list_for_each_entry(node, &provider->nodes, node_list) {
>>  		qn = node->data;
>>  		for (i = 0; i < QCOM_SMD_RPM_STATE_NUM; i++) {
>>  			if (qn->channels)
>> -				sum_avg[i] = div_u64(qn->sum_avg[i], qn->channels);
>> +				agg_avg_rate = div_u64(qn->sum_avg[i], qn->channels);
>>  			else
>> -				sum_avg[i] = qn->sum_avg[i];
>> -			agg_avg[i] += sum_avg[i];
>> -			agg_peak[i] = max_t(u64, agg_peak[i], qn->max_peak[i]);
>> +				agg_avg_rate = qn->sum_avg[i];
>> +
>> +			agg_rate = max_t(u64, agg_avg_rate, qn->max_peak[i]);
>> +			do_div(agg_rate, qn->buswidth);
>> +
>> +			agg_clk_rate[i] = max_t(u64, agg_clk_rate[i], agg_rate);
>>  		}
>>  	}
>> -
>> -	/* Find maximum values across all buckets */
>> -	for (i = 0; i < QCOM_SMD_RPM_STATE_NUM; i++)
>> -		*max_agg_avg = max_t(u64, *max_agg_avg, agg_avg[i]);
>>  }
>>  
>>  static int qcom_icc_set(struct icc_node *src, struct icc_node *dst)
>>  {
>> -	struct qcom_icc_provider *qp;
>>  	struct qcom_icc_node *src_qn = NULL, *dst_qn = NULL;
>> +	u64 agg_clk_rate[QCOM_SMD_RPM_STATE_NUM] = { 0 };
>>  	struct icc_provider *provider;
>> +	struct qcom_icc_provider *qp;
>>  	u64 active_rate, sleep_rate;
>> -	u64 agg_avg[QCOM_SMD_RPM_STATE_NUM], agg_peak[QCOM_SMD_RPM_STATE_NUM];
>> -	u64 max_agg_avg;
>>  	int ret;
>>  
>>  	src_qn = src->data;
>> @@ -353,7 +339,9 @@ static int qcom_icc_set(struct icc_node *src, struct icc_node *dst)
>>  	provider = src->provider;
>>  	qp = to_qcom_provider(provider);
>>  
>> -	qcom_icc_bus_aggregate(provider, agg_avg, agg_peak, &max_agg_avg);
>> +	qcom_icc_bus_aggregate(provider, agg_clk_rate);
>> +	active_rate = agg_clk_rate[QCOM_SMD_RPM_ACTIVE_STATE];
>> +	sleep_rate = agg_clk_rate[QCOM_SMD_RPM_SLEEP_STATE];
>>  
>>  	ret = qcom_icc_rpm_set(src_qn, src_qn->sum_avg);
>>  	if (ret)
>> @@ -369,15 +357,6 @@ static int qcom_icc_set(struct icc_node *src, struct icc_node *dst)
>>  	if (!qp->bus_clk_desc && !qp->bus_clk)
>>  		return 0;
>>  
>> -	/* Intentionally keep the rates in kHz as that's what RPM accepts */
> 
> I kind of liked this comment because otherwise it's not obvious why
> you're not converting from "ICC units" anywhere.
I figured the kerneldoc change would be enough:

@agg_clk_rate: array containing the aggregated clock rates in kHz

> 
> Anyway:
> 
> Reviewed-by: Stephan Gerhold <stephan@gerhold.net>
> 
> Thanks for going through the giant maze to fix this!
Thanks for being there along the way.. This spaghetti is far too much
for a single human..

Konrad

> 
> Stephan
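
For readers following along, the new aggregation in the hunk quoted above
can be sketched in plain user-space C. This is only an illustration under
stated assumptions, not the driver code: struct bus_node, bus_aggregate()
and the STATE_* names are hypothetical stand-ins for qcom_icc_node,
qcom_icc_bus_aggregate() and the QCOM_SMD_RPM_* states, and bandwidth is
assumed to be in kB/s so that dividing by the bus width in bytes yields kHz.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's state indices. */
enum { STATE_ACTIVE, STATE_SLEEP, STATE_NUM };

/* Hypothetical, reduced version of a per-node bookkeeping struct. */
struct bus_node {
	uint64_t sum_avg[STATE_NUM];	/* summed average bandwidth per bucket */
	uint64_t max_peak[STATE_NUM];	/* maximum peak bandwidth per bucket */
	uint32_t channels;		/* 0 means "not specified" */
	uint32_t buswidth;		/* bytes per cycle, must be non-zero */
};

static uint64_t max_u64(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/*
 * For every node and bucket: spread the average over the channels, take
 * the larger of average and peak, convert to a clock rate by dividing by
 * the bus width, and keep the maximum across all nodes.
 * The caller must zero-initialise agg_clk_rate, as the patch does.
 */
static void bus_aggregate(const struct bus_node *nodes, size_t n_nodes,
			  uint64_t agg_clk_rate[STATE_NUM])
{
	for (size_t n = 0; n < n_nodes; n++) {
		const struct bus_node *qn = &nodes[n];

		assert(qn->buswidth != 0);

		for (int i = 0; i < STATE_NUM; i++) {
			uint64_t avg = qn->channels ?
				qn->sum_avg[i] / qn->channels : qn->sum_avg[i];
			uint64_t rate = max_u64(avg, qn->max_peak[i]) / qn->buswidth;

			agg_clk_rate[i] = max_u64(agg_clk_rate[i], rate);
		}
	}
}
```

Because the result is already a per-bucket clock rate, the caller can pick
the active and sleep entries directly, with no further unit conversion.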
Stephan Gerhold June 10, 2023, 7:25 p.m. UTC | #10
On Sat, Jun 10, 2023 at 08:53:05PM +0200, Konrad Dybcio wrote:
> On 10.06.2023 14:15, Konrad Dybcio wrote:
> > On 10.06.2023 13:35, Stephan Gerhold wrote:
> >> On Fri, Jun 09, 2023 at 10:19:09PM +0200, Konrad Dybcio wrote:
> >>> Before we issue a call to RPM through clk_smd_rpm_enable_scaling() the
> >>> clock rate requests will not be committed in hardware. This poses a
> >>> race threat since we're accessing the bus clocks directly from within
> >>> the interconnect framework.
> >>>
> >>> Add a marker to indicate that we're good to go with sending new requests
> >>> and export it so that it can be referenced from icc.
> >>>
> >>> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
> >>> ---
> >>>  drivers/clk/qcom/clk-smd-rpm.c   | 9 +++++++++
> >>>  include/linux/soc/qcom/smd-rpm.h | 2 ++
> >>>  2 files changed, 11 insertions(+)
> >>>
> >>> diff --git a/drivers/clk/qcom/clk-smd-rpm.c b/drivers/clk/qcom/clk-smd-rpm.c
> >>> index 937cb1515968..482fe30ee6f0 100644
> >>> --- a/drivers/clk/qcom/clk-smd-rpm.c
> >>> +++ b/drivers/clk/qcom/clk-smd-rpm.c
> >>> @@ -151,6 +151,7 @@
> >>>  #define to_clk_smd_rpm(_hw) container_of(_hw, struct clk_smd_rpm, hw)
> >>>  
> >>>  static struct qcom_smd_rpm *rpmcc_smd_rpm;
> >>> +static bool smd_rpm_clk_scaling;
> >>>  
> >>>  struct clk_smd_rpm {
> >>>  	const int rpm_res_type;
> >>> @@ -385,6 +386,12 @@ static unsigned long clk_smd_rpm_recalc_rate(struct clk_hw *hw,
> >>>  	return r->rate;
> >>>  }
> >>>  
> >>> +bool qcom_smd_rpm_scaling_available(void)
> >>> +{
> >>> +	return smd_rpm_clk_scaling;
> >>> +}
> >>> +EXPORT_SYMBOL_GPL(qcom_smd_rpm_scaling_available);
> >>> +
> >>>  static int clk_smd_rpm_enable_scaling(void)
> >>>  {
> >>>  	int ret;
> >>> @@ -410,6 +417,8 @@ static int clk_smd_rpm_enable_scaling(void)
> >>>  		return ret;
> >>>  	}
> >>>  
> >>> +	smd_rpm_clk_scaling = true;
> >>> +
> >>
> >> If you move the platform_device_register_data(&rpdev->dev,
> >> "icc_smd_rpm", ...) from drivers/soc/qcom/smd-rpm.c to here you can
> >> avoid the race completely and drop this API. I think that would be
> >> cleaner. And it will likely probe much faster because probe deferral
> >> is slow. :)
> > Sounds like an idea.. especially since it's pretty much the only
> > dependency other than SMDRPM itself!
> It sounds great, but to not break bisecting one has to:
> 
> 1. change the registration in soc/smd-rpm to store rpm ptr in driver
>    data, in addition to parent driver data
> 
> 2. change icc/smd-rpm to use the device and not parent data
> 
> 3. add a platform_device_register_data call in clk-smd-rpm that will
>    always fail because the device is always registered
> 
> 4. remove the registration from soc/smd-rpm
> 

Logically the icc_smd_rpm device still fits better as child of
smd-rpm and not clk-smd-rpm. So I would probably just continue
registering it on the parent device from clk-smd-rpm.
Then there are no changes necessary in icc_smd_rpm.

You could use this. Both touched files are Bjorn-maintained so should be
manageable to have it in one commit. (note: compile-tested only)

Thanks,
Stephan
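
The two approaches discussed above can be contrasted with a small
user-space C model. It is a sketch under assumptions, not kernel code:
all function names are hypothetical, the flag stands in for
smd_rpm_clk_scaling, and it is assumed that a consumer relying on the
flag would return -EPROBE_DEFER until the flag is set.

```c
#include <stdbool.h>

/* Same numeric value the Linux kernel uses for deferred probing. */
#define EPROBE_DEFER 517

/* Stand-in for the "scaling has been enabled" marker. */
static bool scaling_enabled;

/* Hypothetical provider step that flips the marker once RPM is ready. */
static void provider_enable_scaling(void)
{
	scaling_enabled = true;
}

/*
 * Approach A (exported flag): the consumer may be probed at any time,
 * so it has to check the flag and ask to be retried later.
 */
static int consumer_probe_with_flag(void)
{
	if (!scaling_enabled)
		return -EPROBE_DEFER;

	return 0;
}

/*
 * Approach B (move the registration): the provider creates the consumer
 * only after scaling is enabled, so the consumer's probe cannot run
 * early and never needs to defer for this reason.
 */
static int provider_probe_then_register(int (*consumer_probe)(void))
{
	provider_enable_scaling();

	return consumer_probe();
}
```

In approach B the ordering is enforced by construction, which is why the
exported flag becomes unnecessary.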
Konrad Dybcio June 10, 2023, 7:39 p.m. UTC | #11
On 10.06.2023 21:25, Stephan Gerhold wrote:
> On Sat, Jun 10, 2023 at 08:53:05PM +0200, Konrad Dybcio wrote:
>> On 10.06.2023 14:15, Konrad Dybcio wrote:
>>> On 10.06.2023 13:35, Stephan Gerhold wrote:
>>>> On Fri, Jun 09, 2023 at 10:19:09PM +0200, Konrad Dybcio wrote:
>>>>> Before we issue a call to RPM through clk_smd_rpm_enable_scaling() the
>>>>> clock rate requests will not be committed in hardware. This poses a
>>>>> race threat since we're accessing the bus clocks directly from within
>>>>> the interconnect framework.
>>>>>
>>>>> Add a marker to indicate that we're good to go with sending new requests
>>>>> and export it so that it can be referenced from icc.
>>>>>
>>>>> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
>>>>> ---
>>>>>  drivers/clk/qcom/clk-smd-rpm.c   | 9 +++++++++
>>>>>  include/linux/soc/qcom/smd-rpm.h | 2 ++
>>>>>  2 files changed, 11 insertions(+)
>>>>>
>>>>> diff --git a/drivers/clk/qcom/clk-smd-rpm.c b/drivers/clk/qcom/clk-smd-rpm.c
>>>>> index 937cb1515968..482fe30ee6f0 100644
>>>>> --- a/drivers/clk/qcom/clk-smd-rpm.c
>>>>> +++ b/drivers/clk/qcom/clk-smd-rpm.c
>>>>> @@ -151,6 +151,7 @@
>>>>>  #define to_clk_smd_rpm(_hw) container_of(_hw, struct clk_smd_rpm, hw)
>>>>>  
>>>>>  static struct qcom_smd_rpm *rpmcc_smd_rpm;
>>>>> +static bool smd_rpm_clk_scaling;
>>>>>  
>>>>>  struct clk_smd_rpm {
>>>>>  	const int rpm_res_type;
>>>>> @@ -385,6 +386,12 @@ static unsigned long clk_smd_rpm_recalc_rate(struct clk_hw *hw,
>>>>>  	return r->rate;
>>>>>  }
>>>>>  
>>>>> +bool qcom_smd_rpm_scaling_available(void)
>>>>> +{
>>>>> +	return smd_rpm_clk_scaling;
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(qcom_smd_rpm_scaling_available);
>>>>> +
>>>>>  static int clk_smd_rpm_enable_scaling(void)
>>>>>  {
>>>>>  	int ret;
>>>>> @@ -410,6 +417,8 @@ static int clk_smd_rpm_enable_scaling(void)
>>>>>  		return ret;
>>>>>  	}
>>>>>  
>>>>> +	smd_rpm_clk_scaling = true;
>>>>> +
>>>>
>>>> If you move the platform_device_register_data(&rpdev->dev,
>>>> "icc_smd_rpm", ...) from drivers/soc/qcom/smd-rpm.c to here you can
>>>> avoid the race completely and drop this API. I think that would be
>>>> cleaner. And it will likely probe much faster because probe deferral
>>>> is slow. :)
>>> Sounds like an idea.. especially since it's pretty much the only
>>> dependency other than SMDRPM itself!
>> It sounds great, but to not break bisecting one has to:
>>
>> 1. change the registration in soc/smd-rpm to store rpm ptr in driver
>>    data, in addition to parent driver data
>>
>> 2. change icc/smd-rpm to use the device and not parent data
>>
>> 3. add a platform_device_register_data call in clk-smd-rpm that will
>>    always fail because the device is always registered
>>
>> 4. remove the registration from soc/smd-rpm
>>
> 
> Logically the icc_smd_rpm device still fits better as child of
> smd-rpm and not clk-smd-rpm. So I would probably just continue
> registering it on the parent device from clk-smd-rpm.
> Then there are no changes necessary in icc_smd_rpm.
> 
> You could use this. Both touched files are Bjorn-maintained so should be
> manageable to have it in one commit. (note: compile-tested only)
> 
> Thanks,
> Stephan
> 
> From a2610adb2551b01e76b9de8e4cbcc89853814a8f Mon Sep 17 00:00:00 2001
> From: Stephan Gerhold <stephan@gerhold.net>
> Date: Sat, 10 Jun 2023 21:19:48 +0200
> Subject: [PATCH] soc: qcom: smd-rpm: Move icc_smd_rpm registration to
>  clk-smd-rpm
> 
> icc_smd_rpm will do bus clock votes itself rather than taking the
> unnecessary detour through the clock subsystem. However, it can only
> do that after the clocks have been handed off and scaling has been
> enabled in the RPM in clk-smd-rpm.
> 
> Move the icc_smd_rpm registration from smd-rpm.c to clk-smd-rpm.c
> to avoid any possible races. icc_smd_rpm gets the driver data from
> the smd-rpm device, so still register the platform device on the
> smd-rpm parent device.
> 
> Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
> ---
Generally it looks good.. I'll give it a spin next week. One
thing below.

>  drivers/clk/qcom/clk-smd-rpm.c | 21 +++++++++++++++++++++
>  drivers/soc/qcom/smd-rpm.c     | 23 +----------------------
>  2 files changed, 22 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/clk/qcom/clk-smd-rpm.c b/drivers/clk/qcom/clk-smd-rpm.c
> index e4de74b68797..91adb16889b3 100644
> --- a/drivers/clk/qcom/clk-smd-rpm.c
> +++ b/drivers/clk/qcom/clk-smd-rpm.c
> @@ -1302,12 +1302,20 @@ static struct clk_hw *qcom_smdrpm_clk_hw_get(struct of_phandle_args *clkspec,
>  	return desc->clks[idx] ? &desc->clks[idx]->hw : ERR_PTR(-ENOENT);
>  }
>  
> +static void rpm_smd_unregister_icc(void *data)
> +{
> +	struct platform_device *icc_pdev = data;
> +
> +	platform_device_unregister(icc_pdev);
> +}
> +
>  static int rpm_smd_clk_probe(struct platform_device *pdev)
>  {
>  	int ret;
>  	size_t num_clks, i;
>  	struct clk_smd_rpm **rpm_smd_clks;
>  	const struct rpm_smd_clk_desc *desc;
> +	struct platform_device *icc_pdev;
>  
>  	rpmcc_smd_rpm = dev_get_drvdata(pdev->dev.parent);
>  	if (!rpmcc_smd_rpm) {
> @@ -1357,6 +1365,19 @@ static int rpm_smd_clk_probe(struct platform_device *pdev)
>  	if (ret)
>  		goto err;
>  
> +	icc_pdev = platform_device_register_data(pdev->dev.parent,
> +						 "icc_smd_rpm", -1, NULL, 0);
> +	if (IS_ERR(icc_pdev)) {
> +		dev_err(&pdev->dev, "Failed to register icc_smd_rpm device: %pE\n",
> +			icc_pdev);
> +		/* No need to unregister clocks because of this */
> +	} else {
> +		ret = devm_add_action_or_reset(&pdev->dev, rpm_smd_unregister_icc,
> +					       icc_pdev);
> +		if (ret)
> +			goto err;
> +	}
> +
>  	return 0;
>  err:
>  	dev_err(&pdev->dev, "Error registering SMD clock driver (%d)\n", ret);
> diff --git a/drivers/soc/qcom/smd-rpm.c b/drivers/soc/qcom/smd-rpm.c
> index 0c1aa809cc4e..427dd5392b82 100644
> --- a/drivers/soc/qcom/smd-rpm.c
> +++ b/drivers/soc/qcom/smd-rpm.c
> @@ -19,7 +19,6 @@
>  /**
>   * struct qcom_smd_rpm - state of the rpm device driver
>   * @rpm_channel:	reference to the smd channel
> - * @icc:		interconnect proxy device
>   * @dev:		rpm device
>   * @ack:		completion for acks
>   * @lock:		mutual exclusion around the send/complete pair
> @@ -27,7 +26,6 @@
>   */
>  struct qcom_smd_rpm {
>  	struct rpmsg_endpoint *rpm_channel;
> -	struct platform_device *icc;
>  	struct device *dev;
>  
>  	struct completion ack;
> @@ -197,7 +195,6 @@ static int qcom_smd_rpm_callback(struct rpmsg_device *rpdev,
>  static int qcom_smd_rpm_probe(struct rpmsg_device *rpdev)
>  {
>  	struct qcom_smd_rpm *rpm;
> -	int ret;
>  
>  	rpm = devm_kzalloc(&rpdev->dev, sizeof(*rpm), GFP_KERNEL);
>  	if (!rpm)
> @@ -210,24 +207,7 @@ static int qcom_smd_rpm_probe(struct rpmsg_device *rpdev)
>  	rpm->rpm_channel = rpdev->ept;
>  	dev_set_drvdata(&rpdev->dev, rpm);
>  
> -	rpm->icc = platform_device_register_data(&rpdev->dev, "icc_smd_rpm", -1,
> -						 NULL, 0);
> -	if (IS_ERR(rpm->icc))
> -		return PTR_ERR(rpm->icc);
> -
> -	ret = of_platform_populate(rpdev->dev.of_node, NULL, NULL, &rpdev->dev);
> -	if (ret)
> -		platform_device_unregister(rpm->icc);
> -
> -	return ret;
> -}
> -
> -static void qcom_smd_rpm_remove(struct rpmsg_device *rpdev)
> -{
> -	struct qcom_smd_rpm *rpm = dev_get_drvdata(&rpdev->dev);
> -
> -	platform_device_unregister(rpm->icc);
> -	of_platform_depopulate(&rpdev->dev);
> +	return devm_of_platform_populate(&rpdev->dev);
>  }
>  
>  static const struct of_device_id qcom_smd_rpm_of_match[] = {
> @@ -256,7 +236,6 @@ MODULE_DEVICE_TABLE(of, qcom_smd_rpm_of_match);
>  
>  static struct rpmsg_driver qcom_smd_rpm_driver = {
>  	.probe = qcom_smd_rpm_probe,
> -	.remove = qcom_smd_rpm_remove,
This goes beyond removing the icc registration; the depopulate
call should stay.

Konrad
>  	.callback = qcom_smd_rpm_callback,
>  	.drv  = {
>  		.name  = "qcom_smd_rpm",
Stephan Gerhold June 11, 2023, 9:20 a.m. UTC | #12
On Sat, Jun 10, 2023 at 09:39:00PM +0200, Konrad Dybcio wrote:
> On 10.06.2023 21:25, Stephan Gerhold wrote:
> > On Sat, Jun 10, 2023 at 08:53:05PM +0200, Konrad Dybcio wrote:
> >> On 10.06.2023 14:15, Konrad Dybcio wrote:
> >>> On 10.06.2023 13:35, Stephan Gerhold wrote:
> >>>> On Fri, Jun 09, 2023 at 10:19:09PM +0200, Konrad Dybcio wrote:
> >>>>> Before we issue a call to RPM through clk_smd_rpm_enable_scaling() the
> >>>>> clock rate requests will not be committed in hardware. This poses a
> >>>>> race threat since we're accessing the bus clocks directly from within
> >>>>> the interconnect framework.
> >>>>>
> >>>>> Add a marker to indicate that we're good to go with sending new requests
> >>>>> and export it so that it can be referenced from icc.
> >>>>>
> >>>>> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
> >>>>> ---
> >>>>>  drivers/clk/qcom/clk-smd-rpm.c   | 9 +++++++++
> >>>>>  include/linux/soc/qcom/smd-rpm.h | 2 ++
> >>>>>  2 files changed, 11 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/clk/qcom/clk-smd-rpm.c b/drivers/clk/qcom/clk-smd-rpm.c
> >>>>> index 937cb1515968..482fe30ee6f0 100644
> >>>>> --- a/drivers/clk/qcom/clk-smd-rpm.c
> >>>>> +++ b/drivers/clk/qcom/clk-smd-rpm.c
> >>>>> @@ -151,6 +151,7 @@
> >>>>>  #define to_clk_smd_rpm(_hw) container_of(_hw, struct clk_smd_rpm, hw)
> >>>>>  
> >>>>>  static struct qcom_smd_rpm *rpmcc_smd_rpm;
> >>>>> +static bool smd_rpm_clk_scaling;
> >>>>>  
> >>>>>  struct clk_smd_rpm {
> >>>>>  	const int rpm_res_type;
> >>>>> @@ -385,6 +386,12 @@ static unsigned long clk_smd_rpm_recalc_rate(struct clk_hw *hw,
> >>>>>  	return r->rate;
> >>>>>  }
> >>>>>  
> >>>>> +bool qcom_smd_rpm_scaling_available(void)
> >>>>> +{
> >>>>> +	return smd_rpm_clk_scaling;
> >>>>> +}
> >>>>> +EXPORT_SYMBOL_GPL(qcom_smd_rpm_scaling_available);
> >>>>> +
> >>>>>  static int clk_smd_rpm_enable_scaling(void)
> >>>>>  {
> >>>>>  	int ret;
> >>>>> @@ -410,6 +417,8 @@ static int clk_smd_rpm_enable_scaling(void)
> >>>>>  		return ret;
> >>>>>  	}
> >>>>>  
> >>>>> +	smd_rpm_clk_scaling = true;
> >>>>> +
> >>>>
> >>>> If you move the platform_device_register_data(&rpdev->dev,
> >>>> "icc_smd_rpm", ...) from drivers/soc/qcom/smd-rpm.c to here you can
> >>>> avoid the race completely and drop this API. I think that would be
> >>>> cleaner. And it will likely probe much faster because probe deferral
> >>>> is slow. :)
> >>> Sounds like an idea.. especially since it's pretty much the only
> >>> dependency other than SMDRPM itself!
> >> It sounds great, but to not break bisecting one has to:
> >>
> >> 1. change the registration in soc/smd-rpm to store rpm ptr in driver
> >>    data, in addition to parent driver data
> >>
> >> 2. change icc/smd-rpm to use the device and not parent data
> >>
> >> 3. add a platform_device_register_data call in clk-smd-rpm that will
> >>    always fail because the device is always registered
> >>
> >> 4. remove the registration from soc/smd-rpm
> >>
> > 
> > Logically the icc_smd_rpm device still fits better as child of
> > smd-rpm and not clk-smd-rpm. So I would probably just continue
> > registering it on the parent device from clk-smd-rpm.
> > Then there are no changes necessary in icc_smd_rpm.
> > 
> > You could use this. Both touched files are Bjorn-maintained so should be
> > manageable to have it in one commit. (note: compile-tested only)
> > 
> > Thanks,
> > Stephan
> > 
> > From a2610adb2551b01e76b9de8e4cbcc89853814a8f Mon Sep 17 00:00:00 2001
> > From: Stephan Gerhold <stephan@gerhold.net>
> > Date: Sat, 10 Jun 2023 21:19:48 +0200
> > Subject: [PATCH] soc: qcom: smd-rpm: Move icc_smd_rpm registration to
> >  clk-smd-rpm
> > 
> > icc_smd_rpm will do bus clock votes itself rather than taking the
> > unnecessary detour through the clock subsystem. However, it can only
> > do that after the clocks have been handed off and scaling has been
> > enabled in the RPM in clk-smd-rpm.
> > 
> > Move the icc_smd_rpm registration from smd-rpm.c to clk-smd-rpm.c
> > to avoid any possible races. icc_smd_rpm gets the driver data from
> > the smd-rpm device, so still register the platform device on the
> > smd-rpm parent device.
> > 
> > Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
> > ---
> Generally it looks good.. I'll give it a spin next week. One
> thing below.
> 
> >  drivers/clk/qcom/clk-smd-rpm.c | 21 +++++++++++++++++++++
> >  drivers/soc/qcom/smd-rpm.c     | 23 +----------------------
> >  2 files changed, 22 insertions(+), 22 deletions(-)
> > 
> > diff --git a/drivers/clk/qcom/clk-smd-rpm.c b/drivers/clk/qcom/clk-smd-rpm.c
> > index e4de74b68797..91adb16889b3 100644
> > --- a/drivers/clk/qcom/clk-smd-rpm.c
> > +++ b/drivers/clk/qcom/clk-smd-rpm.c
> > @@ -1302,12 +1302,20 @@ static struct clk_hw *qcom_smdrpm_clk_hw_get(struct of_phandle_args *clkspec,
> >  	return desc->clks[idx] ? &desc->clks[idx]->hw : ERR_PTR(-ENOENT);
> >  }
> >  
> > +static void rpm_smd_unregister_icc(void *data)
> > +{
> > +	struct platform_device *icc_pdev = data;
> > +
> > +	platform_device_unregister(icc_pdev);
> > +}
> > +
> >  static int rpm_smd_clk_probe(struct platform_device *pdev)
> >  {
> >  	int ret;
> >  	size_t num_clks, i;
> >  	struct clk_smd_rpm **rpm_smd_clks;
> >  	const struct rpm_smd_clk_desc *desc;
> > +	struct platform_device *icc_pdev;
> >  
> >  	rpmcc_smd_rpm = dev_get_drvdata(pdev->dev.parent);
> >  	if (!rpmcc_smd_rpm) {
> > @@ -1357,6 +1365,19 @@ static int rpm_smd_clk_probe(struct platform_device *pdev)
> >  	if (ret)
> >  		goto err;
> >  
> > +	icc_pdev = platform_device_register_data(pdev->dev.parent,
> > +						 "icc_smd_rpm", -1, NULL, 0);
> > +	if (IS_ERR(icc_pdev)) {
> > +		dev_err(&pdev->dev, "Failed to register icc_smd_rpm device: %pE\n",
> > +			icc_pdev);
> > +		/* No need to unregister clocks because of this */
> > +	} else {
> > +		ret = devm_add_action_or_reset(&pdev->dev, rpm_smd_unregister_icc,
> > +					       icc_pdev);
> > +		if (ret)
> > +			goto err;
> > +	}
> > +
> >  	return 0;
> >  err:
> >  	dev_err(&pdev->dev, "Error registering SMD clock driver (%d)\n", ret);
> > diff --git a/drivers/soc/qcom/smd-rpm.c b/drivers/soc/qcom/smd-rpm.c
> > index 0c1aa809cc4e..427dd5392b82 100644
> > --- a/drivers/soc/qcom/smd-rpm.c
> > +++ b/drivers/soc/qcom/smd-rpm.c
> > @@ -19,7 +19,6 @@
> >  /**
> >   * struct qcom_smd_rpm - state of the rpm device driver
> >   * @rpm_channel:	reference to the smd channel
> > - * @icc:		interconnect proxy device
> >   * @dev:		rpm device
> >   * @ack:		completion for acks
> >   * @lock:		mutual exclusion around the send/complete pair
> > @@ -27,7 +26,6 @@
> >   */
> >  struct qcom_smd_rpm {
> >  	struct rpmsg_endpoint *rpm_channel;
> > -	struct platform_device *icc;
> >  	struct device *dev;
> >  
> >  	struct completion ack;
> > @@ -197,7 +195,6 @@ static int qcom_smd_rpm_callback(struct rpmsg_device *rpdev,
> >  static int qcom_smd_rpm_probe(struct rpmsg_device *rpdev)
> >  {
> >  	struct qcom_smd_rpm *rpm;
> > -	int ret;
> >  
> >  	rpm = devm_kzalloc(&rpdev->dev, sizeof(*rpm), GFP_KERNEL);
> >  	if (!rpm)
> > @@ -210,24 +207,7 @@ static int qcom_smd_rpm_probe(struct rpmsg_device *rpdev)
> >  	rpm->rpm_channel = rpdev->ept;
> >  	dev_set_drvdata(&rpdev->dev, rpm);
> >  
> > -	rpm->icc = platform_device_register_data(&rpdev->dev, "icc_smd_rpm", -1,
> > -						 NULL, 0);
> > -	if (IS_ERR(rpm->icc))
> > -		return PTR_ERR(rpm->icc);
> > -
> > -	ret = of_platform_populate(rpdev->dev.of_node, NULL, NULL, &rpdev->dev);
> > -	if (ret)
> > -		platform_device_unregister(rpm->icc);
> > -
> > -	return ret;
> > -}
> > -
> > -static void qcom_smd_rpm_remove(struct rpmsg_device *rpdev)
> > -{
> > -	struct qcom_smd_rpm *rpm = dev_get_drvdata(&rpdev->dev);
> > -
> > -	platform_device_unregister(rpm->icc);
> > -	of_platform_depopulate(&rpdev->dev);
> > +	return devm_of_platform_populate(&rpdev->dev);
> >  }
> >  
> >  static const struct of_device_id qcom_smd_rpm_of_match[] = {
> > @@ -256,7 +236,6 @@ MODULE_DEVICE_TABLE(of, qcom_smd_rpm_of_match);
> >  
> >  static struct rpmsg_driver qcom_smd_rpm_driver = {
> >  	.probe = qcom_smd_rpm_probe,
> > -	.remove = qcom_smd_rpm_remove,
> This goes beyond removing the icc registration; the depopulate
> call should stay.
> 

I switched the of_platform_populate() to devm_of_platform_populate(),
that's why the remove callback is no longer necessary. It's a bit
hidden, perhaps it would be enough to add to the commit message:

"While at it, switch the remaining of_platform_populate() call to the
 devm variant and remove the remove callback."

Or maybe it should be split into two patches.

Thanks,
Stephan
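
The reason a devm variant makes the remove callback redundant can be shown
with a small user-space C model of managed cleanup. This is a sketch, not
the kernel's devres implementation: struct managed_dev and the managed_*
helpers are hypothetical names that only mimic the behaviour of
devm_add_action_or_reset(), where cleanup actions are recorded on the
device and run automatically in reverse order when it is unbound.

```c
#include <stddef.h>

#define MAX_ACTIONS 8

typedef void (*cleanup_fn)(void *data);

/* Hypothetical device with a stack of recorded cleanup actions. */
struct managed_dev {
	struct {
		cleanup_fn fn;
		void *data;
	} actions[MAX_ACTIONS];
	size_t n_actions;
};

/*
 * Record a cleanup action. If it cannot be recorded, run it right away
 * and report failure, so the caller never has to undo it by hand.
 */
static int managed_add_action_or_reset(struct managed_dev *dev,
				       cleanup_fn fn, void *data)
{
	if (dev->n_actions == MAX_ACTIONS) {
		fn(data);
		return -1;
	}

	dev->actions[dev->n_actions].fn = fn;
	dev->actions[dev->n_actions].data = data;
	dev->n_actions++;

	return 0;
}

/* Called on unbind or probe failure: release in reverse order. */
static void managed_release_all(struct managed_dev *dev)
{
	while (dev->n_actions > 0) {
		dev->n_actions--;
		dev->actions[dev->n_actions].fn(dev->actions[dev->n_actions].data);
	}
}

/* Example resource whose release is logged so the order is visible. */
struct release_log {
	int ids[MAX_ACTIONS];
	size_t len;
};

struct resource {
	int id;
	struct release_log *log;
};

static void resource_release(void *data)
{
	struct resource *r = data;

	if (r->log->len < MAX_ACTIONS)
		r->log->ids[r->log->len++] = r->id;
}
```

With every teardown step registered this way, an explicit remove callback
has nothing left to do, which matches the argument made above.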
Konrad Dybcio June 12, 2023, 5:03 p.m. UTC | #13
On 12.06.2023 14:51, Konrad Dybcio wrote:
> On 11.06.2023 11:20, Stephan Gerhold wrote:
>> On Sat, Jun 10, 2023 at 09:39:00PM +0200, Konrad Dybcio wrote:
>>> On 10.06.2023 21:25, Stephan Gerhold wrote:
>>>> On Sat, Jun 10, 2023 at 08:53:05PM +0200, Konrad Dybcio wrote:
>>>>> On 10.06.2023 14:15, Konrad Dybcio wrote:
>>>>>> On 10.06.2023 13:35, Stephan Gerhold wrote:
>>>>>>> On Fri, Jun 09, 2023 at 10:19:09PM +0200, Konrad Dybcio wrote:
>>>>>>>> Before we issue a call to RPM through clk_smd_rpm_enable_scaling() the
>>>>>>>> clock rate requests will not be committed in hardware. This poses a
>>>>>>>> race threat since we're accessing the bus clocks directly from within
>>>>>>>> the interconnect framework.
>>>>>>>>
>>>>>>>> Add a marker to indicate that we're good to go with sending new requests
>>>>>>>> and export it so that it can be referenced from icc.
>>>>>>>>
>>>>>>>> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
>>>>>>>> ---
>>>>>>>>  drivers/clk/qcom/clk-smd-rpm.c   | 9 +++++++++
>>>>>>>>  include/linux/soc/qcom/smd-rpm.h | 2 ++
>>>>>>>>  2 files changed, 11 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/clk/qcom/clk-smd-rpm.c b/drivers/clk/qcom/clk-smd-rpm.c
>>>>>>>> index 937cb1515968..482fe30ee6f0 100644
>>>>>>>> --- a/drivers/clk/qcom/clk-smd-rpm.c
>>>>>>>> +++ b/drivers/clk/qcom/clk-smd-rpm.c
>>>>>>>> @@ -151,6 +151,7 @@
>>>>>>>>  #define to_clk_smd_rpm(_hw) container_of(_hw, struct clk_smd_rpm, hw)
>>>>>>>>  
>>>>>>>>  static struct qcom_smd_rpm *rpmcc_smd_rpm;
>>>>>>>> +static bool smd_rpm_clk_scaling;
>>>>>>>>  
>>>>>>>>  struct clk_smd_rpm {
>>>>>>>>  	const int rpm_res_type;
>>>>>>>> @@ -385,6 +386,12 @@ static unsigned long clk_smd_rpm_recalc_rate(struct clk_hw *hw,
>>>>>>>>  	return r->rate;
>>>>>>>>  }
>>>>>>>>  
>>>>>>>> +bool qcom_smd_rpm_scaling_available(void)
>>>>>>>> +{
>>>>>>>> +	return smd_rpm_clk_scaling;
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(qcom_smd_rpm_scaling_available);
>>>>>>>> +
>>>>>>>>  static int clk_smd_rpm_enable_scaling(void)
>>>>>>>>  {
>>>>>>>>  	int ret;
>>>>>>>> @@ -410,6 +417,8 @@ static int clk_smd_rpm_enable_scaling(void)
>>>>>>>>  		return ret;
>>>>>>>>  	}
>>>>>>>>  
>>>>>>>> +	smd_rpm_clk_scaling = true;
>>>>>>>> +
>>>>>>>
>>>>>>> If you move the platform_device_register_data(&rpdev->dev,
>>>>>>> "icc_smd_rpm", ...) from drivers/soc/qcom/smd-rpm.c to here you can
>>>>>>> avoid the race completely and drop this API. I think that would be
>>>>>>> cleaner. And it will likely probe much faster because probe deferral
>>>>>>> is slow. :)
>>>>>> Sounds like an idea.. especially since it's pretty much the only
>>>>>> dependency other than SMDRPM itself!
>>>>> It sounds great, but to not break bisecting one has to:
>>>>>
>>>>> 1. change the registration in soc/smd-rpm to store rpm ptr in driver
>>>>>    data, in addition to parent driver data
>>>>>
>>>>> 2. change icc/smd-rpm to use the device and not parent data
>>>>>
>>>>> 3. add a platform_device_register_data call in clk-smd-rpm that will
>>>>>    always fail because the device is always registered
>>>>>
>>>>> 4. remove the registration from soc/smd-rpm
>>>>>
>>>>
>>>> Logically the icc_smd_rpm device still fits better as child of
>>>> smd-rpm and not clk-smd-rpm. So I would probably just continue
>>>> registering it on the parent device from clk-smd-rpm.
>>>> Then there are no changes necessary in icc_smd_rpm.
>>>>
>>>> You could use this. Both touched files are Bjorn-maintained so should be
>>>> manageable to have it in one commit. (note: compile-tested only)
>>>>
>>>> Thanks,
>>>> Stephan
>>>>
>>>> From a2610adb2551b01e76b9de8e4cbcc89853814a8f Mon Sep 17 00:00:00 2001
>>>> From: Stephan Gerhold <stephan@gerhold.net>
>>>> Date: Sat, 10 Jun 2023 21:19:48 +0200
>>>> Subject: [PATCH] soc: qcom: smd-rpm: Move icc_smd_rpm registration to
>>>>  clk-smd-rpm
>>>>
>>>> icc_smd_rpm will do bus clock votes itself rather than taking the
>>>> unnecessary detour through the clock subsystem. However, it can only
>>>> do that after the clocks have been handed off and scaling has been
>>>> enabled in the RPM in clk-smd-rpm.
>>>>
>>>> Move the icc_smd_rpm registration from smd-rpm.c to clk-smd-rpm.c
>>>> to avoid any possible races. icc_smd_rpm gets the driver data from
>>>> the smd-rpm device, so still register the platform device on the
>>>> smd-rpm parent device.
>>>>
>>>> Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
>>>> ---
>>> Generally it looks good.. I'll give it a spin next week. One
>>> thing below.
>>>
>>>>  drivers/clk/qcom/clk-smd-rpm.c | 21 +++++++++++++++++++++
>>>>  drivers/soc/qcom/smd-rpm.c     | 23 +----------------------
>>>>  2 files changed, 22 insertions(+), 22 deletions(-)
>>>>
>>>> diff --git a/drivers/clk/qcom/clk-smd-rpm.c b/drivers/clk/qcom/clk-smd-rpm.c
>>>> index e4de74b68797..91adb16889b3 100644
>>>> --- a/drivers/clk/qcom/clk-smd-rpm.c
>>>> +++ b/drivers/clk/qcom/clk-smd-rpm.c
>>>> @@ -1302,12 +1302,20 @@ static struct clk_hw *qcom_smdrpm_clk_hw_get(struct of_phandle_args *clkspec,
>>>>  	return desc->clks[idx] ? &desc->clks[idx]->hw : ERR_PTR(-ENOENT);
>>>>  }
>>>>  
>>>> +static void rpm_smd_unregister_icc(void *data)
>>>> +{
>>>> +	struct platform_device *icc_pdev = data;
>>>> +
>>>> +	platform_device_unregister(icc_pdev);
>>>> +}
>>>> +
>>>>  static int rpm_smd_clk_probe(struct platform_device *pdev)
>>>>  {
>>>>  	int ret;
>>>>  	size_t num_clks, i;
>>>>  	struct clk_smd_rpm **rpm_smd_clks;
>>>>  	const struct rpm_smd_clk_desc *desc;
>>>> +	struct platform_device *icc_pdev;
>>>>  
>>>>  	rpmcc_smd_rpm = dev_get_drvdata(pdev->dev.parent);
>>>>  	if (!rpmcc_smd_rpm) {
>>>> @@ -1357,6 +1365,19 @@ static int rpm_smd_clk_probe(struct platform_device *pdev)
>>>>  	if (ret)
>>>>  		goto err;
>>>>  
>>>> +	icc_pdev = platform_device_register_data(pdev->dev.parent,
>>>> +						 "icc_smd_rpm", -1, NULL, 0);
>>>> +	if (IS_ERR(icc_pdev)) {
>>>> +		dev_err(&pdev->dev, "Failed to register icc_smd_rpm device: %pE\n",
>>>> +			icc_pdev);
>>>> +		/* No need to unregister clocks because of this */
>>>> +	} else {
>>>> +		ret = devm_add_action_or_reset(&pdev->dev, rpm_smd_unregister_icc,
>>>> +					       icc_pdev);
>>>> +		if (ret)
>>>> +			goto err;
>>>> +	}
>>>> +
>>>>  	return 0;
>>>>  err:
>>>>  	dev_err(&pdev->dev, "Error registering SMD clock driver (%d)\n", ret);
>>>> diff --git a/drivers/soc/qcom/smd-rpm.c b/drivers/soc/qcom/smd-rpm.c
>>>> index 0c1aa809cc4e..427dd5392b82 100644
>>>> --- a/drivers/soc/qcom/smd-rpm.c
>>>> +++ b/drivers/soc/qcom/smd-rpm.c
>>>> @@ -19,7 +19,6 @@
>>>>  /**
>>>>   * struct qcom_smd_rpm - state of the rpm device driver
>>>>   * @rpm_channel:	reference to the smd channel
>>>> - * @icc:		interconnect proxy device
>>>>   * @dev:		rpm device
>>>>   * @ack:		completion for acks
>>>>   * @lock:		mutual exclusion around the send/complete pair
>>>> @@ -27,7 +26,6 @@
>>>>   */
>>>>  struct qcom_smd_rpm {
>>>>  	struct rpmsg_endpoint *rpm_channel;
>>>> -	struct platform_device *icc;
>>>>  	struct device *dev;
>>>>  
>>>>  	struct completion ack;
>>>> @@ -197,7 +195,6 @@ static int qcom_smd_rpm_callback(struct rpmsg_device *rpdev,
>>>>  static int qcom_smd_rpm_probe(struct rpmsg_device *rpdev)
>>>>  {
>>>>  	struct qcom_smd_rpm *rpm;
>>>> -	int ret;
>>>>  
>>>>  	rpm = devm_kzalloc(&rpdev->dev, sizeof(*rpm), GFP_KERNEL);
>>>>  	if (!rpm)
>>>> @@ -210,24 +207,7 @@ static int qcom_smd_rpm_probe(struct rpmsg_device *rpdev)
>>>>  	rpm->rpm_channel = rpdev->ept;
>>>>  	dev_set_drvdata(&rpdev->dev, rpm);
>>>>  
>>>> -	rpm->icc = platform_device_register_data(&rpdev->dev, "icc_smd_rpm", -1,
>>>> -						 NULL, 0);
>>>> -	if (IS_ERR(rpm->icc))
>>>> -		return PTR_ERR(rpm->icc);
>>>> -
>>>> -	ret = of_platform_populate(rpdev->dev.of_node, NULL, NULL, &rpdev->dev);
>>>> -	if (ret)
>>>> -		platform_device_unregister(rpm->icc);
>>>> -
>>>> -	return ret;
>>>> -}
>>>> -
>>>> -static void qcom_smd_rpm_remove(struct rpmsg_device *rpdev)
>>>> -{
>>>> -	struct qcom_smd_rpm *rpm = dev_get_drvdata(&rpdev->dev);
>>>> -
>>>> -	platform_device_unregister(rpm->icc);
>>>> -	of_platform_depopulate(&rpdev->dev);
>>>> +	return devm_of_platform_populate(&rpdev->dev);
>>>>  }
>>>>  
>>>>  static const struct of_device_id qcom_smd_rpm_of_match[] = {
>>>> @@ -256,7 +236,6 @@ MODULE_DEVICE_TABLE(of, qcom_smd_rpm_of_match);
>>>>  
>>>>  static struct rpmsg_driver qcom_smd_rpm_driver = {
>>>>  	.probe = qcom_smd_rpm_probe,
>>>> -	.remove = qcom_smd_rpm_remove,
>>> This goes beyond removing the icc registration; the depopulate
>>> call should stay.
>>>
>>
>> I switched the of_platform_populate() to devm_of_platform_populate(),
>> that's why the remove callback is no longer necessary. It's a bit
>> hidden, perhaps it would be enough to add to the commit message:
>>
>> "While at it, switch the remaining of_platform_populate() call to the
>>  devm variant and remove the remove callback."
>>
>> Or maybe it should be split into two patches.
> Gave it a spin, I think it ends up being worse if an IPA rpm clock is
> consumed by one of the icc providers, and that's sadly the case
> for almost all platforms (or supposed to be).. :/ Only qcm2290 doesn't
> seem to care if we poke the network interface units with half the soc
> off :P
Actually it's more complex and that rings the same tons-of-probe-deferrals
bell that we have to get rid of.. The icc_smd_rpm driver actually does
probe faster with your patch, but that's not the case with the SoC icc
provider driver which does probe later :/

Don't think either solution is perfect, but I'm willing to take your
solution, as the one I proposed is a bit of a dead end for future
improvements.

Konrad
> 
> Konrad
>>
>> Thanks,
>> Stephan