diff mbox series

[V2,1/2] opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER

Message ID 24ff92dd1b0ee1b802b45698520f2937418f8094.1598260050.git.viresh.kumar@linaro.org
State New
Headers show
Series [V2,1/2] opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER | expand

Commit Message

Viresh Kumar Aug. 24, 2020, 9:09 a.m. UTC
From: Stephan Gerhold <stephan@gerhold.net>

The OPP core manages various resources, e.g. clocks or interconnect paths.
These resources are looked up when the OPP table is allocated once
dev_pm_opp_get_opp_table() is called the first time (either directly
or indirectly through one of the many helper functions).

At this point, the resources may not be available yet, i.e. looking them
up will result in -EPROBE_DEFER. Unfortunately, dev_pm_opp_get_opp_table()
is currently unable to propagate this error code since it only returns
the allocated OPP table or NULL.

This means that all consumers of the OPP core are required to make sure
that all necessary resources are available. Usually this happens by
requesting them, checking the result and releasing them immediately after.

For example, we have added "dev_pm_opp_of_find_icc_paths(dev, NULL)" to
several drivers now just to make sure the interconnect providers are
ready before the OPP table is allocated. If this call is missing,
the OPP core will only warn about this and then attempt to continue
without interconnect. This will eventually fail horribly, e.g.:

    cpu cpu0: _allocate_opp_table: Error finding interconnect paths: -517
    ... later ...
    of: _read_bw: Mismatch between opp-peak-kBps and paths (1 0)
    cpu cpu0: _opp_add_static_v2: opp key field not found
    cpu cpu0: _of_add_opp_table_v2: Failed to add OPP, -22

This example happens when trying to use interconnects for a CPU OPP
table together with qcom-cpufreq-nvmem.c. qcom-cpufreq-nvmem calls
dev_pm_opp_set_supported_hw(), which ends up allocating the OPP table
early. To fix the problem with the current approach we would need to add
yet another call to dev_pm_opp_of_find_icc_paths(dev, NULL).
But actually qcom-cpufreq-nvmem.c has nothing to do with interconnects...

This commit attempts to make this more robust by allowing
dev_pm_opp_get_opp_table() to return an error pointer. Fixing all
the usages is trivial because the function is usually used indirectly
through another helper (e.g. dev_pm_opp_set_supported_hw() above).
These other helpers already return an error pointer.

The example above then works correctly because set_supported_hw() will
return -EPROBE_DEFER, and qcom-cpufreq-nvmem.c already propagates that
error. It should also be possible to remove the remaining usages of
"dev_pm_opp_of_find_icc_paths(dev, NULL)" from other drivers as well.

Note that this commit currently only handles -EPROBE_DEFER for the
clock/interconnects within _allocate_opp_table(). Other errors are just
ignored as before. Eventually those should be propagated as well.

Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
[ Viresh: skip checking return value of dev_pm_opp_get_opp_table() for
	  EPROBE_DEFER in domain.c, fix NULL return value and reorder
	  code a bit in core.c, and update exynos-asv.c ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
Stephan, I have made some changes to the code. Please try it again and
lemme know if it works fine.

 drivers/base/power/domain.c      | 14 +++++----
 drivers/opp/core.c               | 53 +++++++++++++++++++-------------
 drivers/opp/of.c                 |  8 ++---
 drivers/soc/samsung/exynos-asv.c |  2 +-
 4 files changed, 44 insertions(+), 33 deletions(-)

Comments

Marek Szyprowski Sept. 1, 2020, 8:57 a.m. UTC | #1
Hi Viresh

On 24.08.2020 11:09, Viresh Kumar wrote:
> From: Stephan Gerhold <stephan@gerhold.net>
>
> cpufreq-dt is currently unable to handle -EPROBE_DEFER properly
> because the error code is not propagated for the cpufreq_driver->init()
> callback. Instead, it attempts to avoid the situation by temporarily
> requesting all resources within resources_available() and releasing them
> again immediately after. This has several disadvantages:
>
>    - Whenever we add something like interconnect handling to the OPP core
>      we need to patch cpufreq-dt to request these resources early.
>
>    - resources_available() is only run for CPU0, but other clusters may
>      eventually depend on other resources that are not available yet.
>      (See FIXME comment removed by this commit...)
>
>    - All resources need to be looked up several times.
>
> Now that the OPP core can propagate -EPROBE_DEFER during initialization,
> it would be nice to avoid all that trouble and just propagate its error
> code when necessary.
>
> This commit refactors the cpufreq-dt driver to initialize private_data
> before registering the cpufreq driver. We do this by iterating over
> all possible CPUs and ensure that all resources are initialized:
>
>    1. dev_pm_opp_get_opp_table() ensures the OPP table is allocated
>       and initialized with clock and interconnects.
>
>    2. dev_pm_opp_set_regulators() requests the regulators and assigns
>       them to the OPP table.
>
>    3. We call dev_pm_opp_of_get_sharing_cpus() early so that we only
>       initialize the OPP table once for each shared policy.
>
> With these changes, we actually end up saving a few lines of code,
> the resources are no longer looked up multiple times and everything
> should be much more robust.
>
> Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
> [ Viresh: Use list_head structure for maintaining the list and minor
> 	  changes ]
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

This patch landed in linux-next about a week ago. It introduces a 
following warning on Samsung Exnyos3250 SoC:

cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 
1000000000, volt: 1150000, enabled: 1. New: freq: 1000000000, volt: 
1150000, enabled: 1
cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 
900000000, volt: 1112500, enabled: 1. New: freq: 900000000, volt: 
1112500, enabled: 1
cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 
800000000, volt: 1075000, enabled: 1. New: freq: 800000000, volt: 
1075000, enabled: 1
cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 
700000000, volt: 1037500, enabled: 1. New: freq: 700000000, volt: 
1037500, enabled: 1
cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 
600000000, volt: 1000000, enabled: 1. New: freq: 600000000, volt: 
1000000, enabled: 1
cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 
500000000, volt: 962500, enabled: 1. New: freq: 500000000, volt: 962500, 
enabled: 1
cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 
400000000, volt: 925000, enabled: 1. New: freq: 400000000, volt: 925000, 
enabled: 1
cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 
300000000, volt: 887500, enabled: 1. New: freq: 300000000, volt: 887500, 
enabled: 1
cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 
200000000, volt: 850000, enabled: 1. New: freq: 200000000, volt: 850000, 
enabled: 1
cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 
100000000, volt: 850000, enabled: 1. New: freq: 100000000, volt: 850000, 
enabled: 1

I've checked a bit and this is related to the fact that Exynos3250 SoC 
use OPP-v1 table. Is this intentional? It is not a problem to convert it 
to OPP-v2 and mark OPP table as shared, but this is a kind of a regression.

> ...

Best regards
Marek Szyprowski Sept. 1, 2020, 10:05 a.m. UTC | #2
Hi Viresh,

On 01.09.2020 11:45, Viresh Kumar wrote:
> On 01-09-20, 10:57, Marek Szyprowski wrote:
>> This patch landed in linux-next about a week ago. It introduces a
>> following warning on Samsung Exnyos3250 SoC:
>>
>> cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq:
>> 1000000000, volt: 1150000, enabled: 1. New: freq: 1000000000, volt:
>> 1150000, enabled: 1
>> cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq:
>> 900000000, volt: 1112500, enabled: 1. New: freq: 900000000, volt:
>> 1112500, enabled: 1
>> cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq:
>> 800000000, volt: 1075000, enabled: 1. New: freq: 800000000, volt:
>> 1075000, enabled: 1
>> cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq:
>> 700000000, volt: 1037500, enabled: 1. New: freq: 700000000, volt:
>> 1037500, enabled: 1
>> cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq:
>> 600000000, volt: 1000000, enabled: 1. New: freq: 600000000, volt:
>> 1000000, enabled: 1
>> cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq:
>> 500000000, volt: 962500, enabled: 1. New: freq: 500000000, volt: 962500,
>> enabled: 1
>> cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq:
>> 400000000, volt: 925000, enabled: 1. New: freq: 400000000, volt: 925000,
>> enabled: 1
>> cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq:
>> 300000000, volt: 887500, enabled: 1. New: freq: 300000000, volt: 887500,
>> enabled: 1
>> cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq:
>> 200000000, volt: 850000, enabled: 1. New: freq: 200000000, volt: 850000,
>> enabled: 1
>> cpu cpu1: _opp_is_duplicate: duplicate OPPs detected. Existing: freq:
>> 100000000, volt: 850000, enabled: 1. New: freq: 100000000, volt: 850000,
>> enabled: 1
>>
>> I've checked a bit and this is related to the fact that Exynos3250 SoC
>> use OPP-v1 table. Is this intentional? It is not a problem to convert it
>> to OPP-v2 and mark OPP table as shared, but this is a kind of a regression.
> It took me 20 minutes for me to see "where has my patch gone" :(
>
> I wrote a small patch for that to work without any issues, but not
> sure how I missed or abandoned it. Anyway, here is the diff again and
> I will send it out again once you confirm it fixes the issue. Can you
> please also test your driver as a module with multiple insertion/removals ?

Indeed, this patch seems to fix/hide that warning. Feel free to add:

Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>

Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>

> diff --git a/drivers/opp/of.c b/drivers/opp/of.c
> index 5dac8bffd68c..e72753be7dc7 100644
> --- a/drivers/opp/of.c
> +++ b/drivers/opp/of.c
> @@ -905,6 +905,16 @@ static int _of_add_opp_table_v1(struct device *dev, struct opp_table *opp_table)
>          const __be32 *val;
>          int nr, ret = 0;
>   
> +       mutex_lock(&opp_table->lock);
> +       if (opp_table->parsed_static_opps) {
> +               opp_table->parsed_static_opps++;
> +               mutex_unlock(&opp_table->lock);
> +               return 0;
> +       }
> +
> +       opp_table->parsed_static_opps = 1;
> +       mutex_unlock(&opp_table->lock);
> +
>          prop = of_find_property(dev->of_node, "operating-points", NULL);
>          if (!prop)
>                  return -ENODEV;
> @@ -921,10 +931,6 @@ static int _of_add_opp_table_v1(struct device *dev, struct opp_table *opp_table)
>                  return -EINVAL;
>          }
>   
> -       mutex_lock(&opp_table->lock);
> -       opp_table->parsed_static_opps = 1;
> -       mutex_unlock(&opp_table->lock);
> -
>          val = prop->value;
>          while (nr) {
>                  unsigned long freq = be32_to_cpup(val++) * 1000;
>
Best regards
Viresh Kumar Sept. 1, 2020, 10:12 a.m. UTC | #3
On 01-09-20, 12:05, Marek Szyprowski wrote:
> Indeed, this patch seems to fix/hide that warning. Feel free to add:

s/hide/really-fix/ :)

I assumed that this problem is going to be there, when I applied the
other patch and so was trying to write a fix but somehow skipped doing
that :(
Sudeep Holla Oct. 15, 2020, 6:05 p.m. UTC | #4
On Mon, Aug 24, 2020 at 02:39:32PM +0530, Viresh Kumar wrote:
> From: Stephan Gerhold <stephan@gerhold.net>

> 

> The OPP core manages various resources, e.g. clocks or interconnect paths.

> These resources are looked up when the OPP table is allocated once

> dev_pm_opp_get_opp_table() is called the first time (either directly

> or indirectly through one of the many helper functions).

> 

> At this point, the resources may not be available yet, i.e. looking them

> up will result in -EPROBE_DEFER. Unfortunately, dev_pm_opp_get_opp_table()

> is currently unable to propagate this error code since it only returns

> the allocated OPP table or NULL.

>


OK, this breaks with SCMI which doesn't provide clocks but manage OPPs
directly. Before this change clk_get(dev..) was allowed to fail and
--EPROBE_DEFER was not an error. We use dev_pm_opp_add to add OPPs
read from the firmware and this change is preventing that.

Sorry for checking this so late, but noticed only when this hit mainline.

-- 
Regards,
Sudeep
Viresh Kumar Oct. 16, 2020, 4:24 a.m. UTC | #5
On 15-10-20, 19:05, Sudeep Holla wrote:
> OK, this breaks with SCMI which doesn't provide clocks but manage OPPs
> directly. Before this change clk_get(dev..) was allowed to fail and
> --EPROBE_DEFER was not an error.

I think the change in itself is fine. We should be returning from
there if we get EPROBE_DEFER. The question is rather why are you
getting EPROBE_DEFER here ?

> We use dev_pm_opp_add to add OPPs
> read from the firmware and this change is preventing that.
> 
> Sorry for checking this so late, but noticed only when this hit mainline.
> 
> -- 
> Regards,
> Sudeep
Sudeep Holla Oct. 16, 2020, 6 a.m. UTC | #6
On Fri, Oct 16, 2020 at 09:54:34AM +0530, Viresh Kumar wrote:
> On 15-10-20, 19:05, Sudeep Holla wrote:

> > OK, this breaks with SCMI which doesn't provide clocks but manage OPPs

> > directly. Before this change clk_get(dev..) was allowed to fail and

> > --EPROBE_DEFER was not an error.

> 

> I think the change in itself is fine. We should be returning from

> there if we get EPROBE_DEFER. The question is rather why are you

> getting EPROBE_DEFER here ?

> 


Ah OK, I didn't spend too much time, saw -EPROBE_DEFER, just reverted
this patch and it worked. I need to check it in detail yet.

-- 
Regards,
Sudeep
Sudeep Holla Oct. 16, 2020, 11:12 a.m. UTC | #7
On Fri, Oct 16, 2020 at 07:00:21AM +0100, Sudeep Holla wrote:
> On Fri, Oct 16, 2020 at 09:54:34AM +0530, Viresh Kumar wrote:

> > On 15-10-20, 19:05, Sudeep Holla wrote:

> > > OK, this breaks with SCMI which doesn't provide clocks but manage OPPs

> > > directly. Before this change clk_get(dev..) was allowed to fail and

> > > --EPROBE_DEFER was not an error.

> >

> > I think the change in itself is fine. We should be returning from

> > there if we get EPROBE_DEFER. The question is rather why are you

> > getting EPROBE_DEFER here ?

> >

>

> Ah OK, I didn't spend too much time, saw -EPROBE_DEFER, just reverted

> this patch and it worked. I need to check it in detail yet.

>


You confused me earlier. As I said there will be no clock provider
registered for SCMI CPU/Dev DVFS.
	opp_table->clk = clk_get(dev, NULL);
will always return -EPROBE_DEFER as there is no clock provider for dev.
But this change now propagates that error to caller of dev_pm_opp_add
which means we can't add opp to a device if there are no clock providers.
This breaks for DVFS which don't operate separately with clocks and
regulators.

--
Regards,
Sudeep
Sudeep Holla Oct. 19, 2020, 9:17 a.m. UTC | #8
On Mon, Oct 19, 2020 at 10:28:27AM +0530, Viresh Kumar wrote:
> On 16-10-20, 12:12, Sudeep Holla wrote:
> > On Fri, Oct 16, 2020 at 07:00:21AM +0100, Sudeep Holla wrote:
> > > On Fri, Oct 16, 2020 at 09:54:34AM +0530, Viresh Kumar wrote:
> > > > On 15-10-20, 19:05, Sudeep Holla wrote:
> > > > > OK, this breaks with SCMI which doesn't provide clocks but manage OPPs
> > > > > directly. Before this change clk_get(dev..) was allowed to fail and
> > > > > --EPROBE_DEFER was not an error.
> > > >
> > > > I think the change in itself is fine. We should be returning from
> > > > there if we get EPROBE_DEFER. The question is rather why are you
> > > > getting EPROBE_DEFER here ?
> > > >
> > >
> > > Ah OK, I didn't spend too much time, saw -EPROBE_DEFER, just reverted
> > > this patch and it worked. I need to check it in detail yet.
> > >
> > 
> > You confused me earlier. As I said there will be no clock provider
> > registered for SCMI CPU/Dev DVFS.
> > 	opp_table->clk = clk_get(dev, NULL);
> > will always return -EPROBE_DEFER as there is no clock provider for dev.
> > But this change now propagates that error to caller of dev_pm_opp_add
> > which means we can't add opp to a device if there are no clock providers.
> > This breaks for DVFS which don't operate separately with clocks and
> > regulators.
>
> The CPUs DT node shouldn't have a clock property in such a case and I
> would expect an error instead of EPROBE_DEFER then. Isn't it ?

Ideally yes, but for legacy reasons clocks property has been used for
providing OPP/DVFS handle too. While we can change and add new property
for that, it will still break old bindings.

--
Regards,
Sudeep
Viresh Kumar Oct. 19, 2020, 9:24 a.m. UTC | #9
On 19-10-20, 10:17, Sudeep Holla wrote:
> On Mon, Oct 19, 2020 at 10:28:27AM +0530, Viresh Kumar wrote:

> > On 16-10-20, 12:12, Sudeep Holla wrote:

> > > On Fri, Oct 16, 2020 at 07:00:21AM +0100, Sudeep Holla wrote:

> > > > On Fri, Oct 16, 2020 at 09:54:34AM +0530, Viresh Kumar wrote:

> > > > > On 15-10-20, 19:05, Sudeep Holla wrote:

> > > > > > OK, this breaks with SCMI which doesn't provide clocks but manage OPPs

> > > > > > directly. Before this change clk_get(dev..) was allowed to fail and

> > > > > > --EPROBE_DEFER was not an error.

> > > > >

> > > > > I think the change in itself is fine. We should be returning from

> > > > > there if we get EPROBE_DEFER. The question is rather why are you

> > > > > getting EPROBE_DEFER here ?

> > > > >

> > > >

> > > > Ah OK, I didn't spend too much time, saw -EPROBE_DEFER, just reverted

> > > > this patch and it worked. I need to check it in detail yet.

> > > >

> > > 

> > > You confused me earlier. As I said there will be no clock provider

> > > registered for SCMI CPU/Dev DVFS.

> > > 	opp_table->clk = clk_get(dev, NULL);

> > > will always return -EPROBE_DEFER as there is no clock provider for dev.

> > > But this change now propagates that error to caller of dev_pm_opp_add

> > > which means we can't add opp to a device if there are no clock providers.

> > > This breaks for DVFS which don't operate separately with clocks and

> > > regulators.

> >

> > The CPUs DT node shouldn't have a clock property in such a case and I

> > would expect an error instead of EPROBE_DEFER then. Isn't it ?

> 

> Ideally yes, but for legacy reasons clocks property has been used for

> providing OPP/DVFS handle too. While we can change and add new property

> for that, it will still break old bindings.


I am not sure I understood it all. So does your platform have the
clock-names property or not for the CPUs ? And how will something
break here ?

-- 
viresh
Sudeep Holla Oct. 19, 2020, 10:12 a.m. UTC | #10
On Mon, Oct 19, 2020 at 02:54:11PM +0530, Viresh Kumar wrote:
> On 19-10-20, 10:17, Sudeep Holla wrote:
> > On Mon, Oct 19, 2020 at 10:28:27AM +0530, Viresh Kumar wrote:
> > > On 16-10-20, 12:12, Sudeep Holla wrote:
> > > > On Fri, Oct 16, 2020 at 07:00:21AM +0100, Sudeep Holla wrote:
> > > > > On Fri, Oct 16, 2020 at 09:54:34AM +0530, Viresh Kumar wrote:
> > > > > > On 15-10-20, 19:05, Sudeep Holla wrote:
> > > > > > > OK, this breaks with SCMI which doesn't provide clocks but manage OPPs
> > > > > > > directly. Before this change clk_get(dev..) was allowed to fail and
> > > > > > > --EPROBE_DEFER was not an error.
> > > > > >
> > > > > > I think the change in itself is fine. We should be returning from
> > > > > > there if we get EPROBE_DEFER. The question is rather why are you
> > > > > > getting EPROBE_DEFER here ?
> > > > > >
> > > > >
> > > > > Ah OK, I didn't spend too much time, saw -EPROBE_DEFER, just reverted
> > > > > this patch and it worked. I need to check it in detail yet.
> > > > >
> > > > 
> > > > You confused me earlier. As I said there will be no clock provider
> > > > registered for SCMI CPU/Dev DVFS.
> > > > 	opp_table->clk = clk_get(dev, NULL);
> > > > will always return -EPROBE_DEFER as there is no clock provider for dev.
> > > > But this change now propagates that error to caller of dev_pm_opp_add
> > > > which means we can't add opp to a device if there are no clock providers.
> > > > This breaks for DVFS which don't operate separately with clocks and
> > > > regulators.
> > >
> > > The CPUs DT node shouldn't have a clock property in such a case and I
> > > would expect an error instead of EPROBE_DEFER then. Isn't it ?
> > 
> > Ideally yes, but for legacy reasons clocks property has been used for
> > providing OPP/DVFS handle too. While we can change and add new property
> > for that, it will still break old bindings.
> 
> I am not sure I understood it all. So does your platform have the
> clock-names property or not for the CPUs ? And how will something
> break here ?
> 

Yes it has clocks property but used by SCMI(for CPUFreq/DevFreq) and not
by any clock provider driver. E.g. the issue you will see if "clocks"
property is used instead of "qcom,freq-domain" on Qcom parts.

On SCMI, we have used clocks property to represent perf domain which I
understand is not ideal but it is there 🙁.
Viresh Kumar Oct. 19, 2020, 10:35 a.m. UTC | #11
On 19-10-20, 11:12, Sudeep Holla wrote:
> Yes it has clocks property but used by SCMI(for CPUFreq/DevFreq) and not
> by any clock provider driver. E.g. the issue you will see if "clocks"
> property is used instead of "qcom,freq-domain" on Qcom parts.

Okay, I understand. But what I still don't understand is why it fails
for you. You have a clocks property in DT for the CPU, the OPP core
tries to get it and will get deferred-probed, which will try probing
at a later point of time and it shall work then. Isn't it ?
Sudeep Holla Oct. 19, 2020, 2:10 p.m. UTC | #12
On Mon, Oct 19, 2020 at 04:05:35PM +0530, Viresh Kumar wrote:
> On 19-10-20, 11:12, Sudeep Holla wrote:
> > Yes it has clocks property but used by SCMI(for CPUFreq/DevFreq) and not
> > by any clock provider driver. E.g. the issue you will see if "clocks"
> > property is used instead of "qcom,freq-domain" on Qcom parts.
> 
> Okay, I understand. But what I still don't understand is why it fails
> for you. You have a clocks property in DT for the CPU, the OPP core
> tries to get it and will get deferred-probed, which will try probing
> at a later point of time and it shall work then. Isn't it ?
>

Nope unfortunately. We don't have clock provider, so clk_get will
never succeed and always return -EPROBE_DEFER.
Viresh Kumar Oct. 20, 2020, 5:05 a.m. UTC | #13
On 19-10-20, 15:10, Sudeep Holla wrote:
> On Mon, Oct 19, 2020 at 04:05:35PM +0530, Viresh Kumar wrote:
> > On 19-10-20, 11:12, Sudeep Holla wrote:
> > > Yes it has clocks property but used by SCMI(for CPUFreq/DevFreq) and not
> > > by any clock provider driver. E.g. the issue you will see if "clocks"
> > > property is used instead of "qcom,freq-domain" on Qcom parts.
> > 
> > Okay, I understand. But what I still don't understand is why it fails
> > for you. You have a clocks property in DT for the CPU, the OPP core
> > tries to get it and will get deferred-probed, which will try probing
> > at a later point of time and it shall work then. Isn't it ?
> >
> 
> Nope unfortunately. We don't have clock provider, so clk_get will
> never succeed and always return -EPROBE_DEFER.

Now this is really bad, you have a fake clocks property, how is the
OPP core supposed to know it ? Damn.
Viresh Kumar Oct. 20, 2020, 5:54 a.m. UTC | #14
On 20-10-20, 10:35, Viresh Kumar wrote:
> On 19-10-20, 15:10, Sudeep Holla wrote:

> > On Mon, Oct 19, 2020 at 04:05:35PM +0530, Viresh Kumar wrote:

> > > On 19-10-20, 11:12, Sudeep Holla wrote:

> > > > Yes it has clocks property but used by SCMI(for CPUFreq/DevFreq) and not

> > > > by any clock provider driver. E.g. the issue you will see if "clocks"

> > > > property is used instead of "qcom,freq-domain" on Qcom parts.

> > > 

> > > Okay, I understand. But what I still don't understand is why it fails

> > > for you. You have a clocks property in DT for the CPU, the OPP core

> > > tries to get it and will get deferred-probed, which will try probing

> > > at a later point of time and it shall work then. Isn't it ?

> > >

> > 

> > Nope unfortunately. We don't have clock provider, so clk_get will

> > never succeed and always return -EPROBE_DEFER.

> 

> Now this is really bad, you have a fake clocks property, how is the

> OPP core supposed to know it ? Damn.


What about instead of fixing the OPP core, which really is doing the
right thing, we fix your driver (as you can't fix the DT) and add a
dummy CPU clk to make it all work ?

-- 
viresh
Sudeep Holla Oct. 20, 2020, 9:37 a.m. UTC | #15
On Tue, Oct 20, 2020 at 11:24:32AM +0530, Viresh Kumar wrote:
> On 20-10-20, 10:35, Viresh Kumar wrote:

> > On 19-10-20, 15:10, Sudeep Holla wrote:

> > > On Mon, Oct 19, 2020 at 04:05:35PM +0530, Viresh Kumar wrote:

> > > > On 19-10-20, 11:12, Sudeep Holla wrote:

> > > > > Yes it has clocks property but used by SCMI(for CPUFreq/DevFreq) and not

> > > > > by any clock provider driver. E.g. the issue you will see if "clocks"

> > > > > property is used instead of "qcom,freq-domain" on Qcom parts.

> > > > 

> > > > Okay, I understand. But what I still don't understand is why it fails

> > > > for you. You have a clocks property in DT for the CPU, the OPP core

> > > > tries to get it and will get deferred-probed, which will try probing

> > > > at a later point of time and it shall work then. Isn't it ?

> > > >

> > > 

> > > Nope unfortunately. We don't have clock provider, so clk_get will

> > > never succeed and always return -EPROBE_DEFER.

> > 

> > Now this is really bad, you have a fake clocks property, how is the

> > OPP core supposed to know it ? Damn.

> 

> What about instead of fixing the OPP core, which really is doing the

> right thing, we fix your driver (as you can't fix the DT) and add a

> dummy CPU clk to make it all work ?

>


I really would avoid that. I would rather change the binding as there is
no single official users of that binding in the upstream tree.

-- 
Regards,
Sudeep
Viresh Kumar Oct. 20, 2020, 9:41 a.m. UTC | #16
On 20-10-20, 10:37, Sudeep Holla wrote:
> On Tue, Oct 20, 2020 at 11:24:32AM +0530, Viresh Kumar wrote:
> > On 20-10-20, 10:35, Viresh Kumar wrote:
> > > On 19-10-20, 15:10, Sudeep Holla wrote:
> > > > On Mon, Oct 19, 2020 at 04:05:35PM +0530, Viresh Kumar wrote:
> > > > > On 19-10-20, 11:12, Sudeep Holla wrote:
> > > > > > Yes it has clocks property but used by SCMI(for CPUFreq/DevFreq) and not
> > > > > > by any clock provider driver. E.g. the issue you will see if "clocks"
> > > > > > property is used instead of "qcom,freq-domain" on Qcom parts.
> > > > > 
> > > > > Okay, I understand. But what I still don't understand is why it fails
> > > > > for you. You have a clocks property in DT for the CPU, the OPP core
> > > > > tries to get it and will get deferred-probed, which will try probing
> > > > > at a later point of time and it shall work then. Isn't it ?
> > > > >
> > > > 
> > > > Nope unfortunately. We don't have clock provider, so clk_get will
> > > > never succeed and always return -EPROBE_DEFER.
> > > 
> > > Now this is really bad, you have a fake clocks property, how is the
> > > OPP core supposed to know it ? Damn.
> > 
> > What about instead of fixing the OPP core, which really is doing the
> > right thing, we fix your driver (as you can't fix the DT) and add a
> > dummy CPU clk to make it all work ?
> >
> 
> I really would avoid that. I would rather change the binding as there is
> no single official users of that binding in the upstream tree.

But how will you solve backward compatibility thing then ?
Viresh Kumar Oct. 20, 2020, 9:59 a.m. UTC | #17
On 20-10-20, 10:52, Sudeep Holla wrote:
> On Tue, Oct 20, 2020 at 03:11:34PM +0530, Viresh Kumar wrote:
> > On 20-10-20, 10:37, Sudeep Holla wrote:
> > > On Tue, Oct 20, 2020 at 11:24:32AM +0530, Viresh Kumar wrote:
> > > > On 20-10-20, 10:35, Viresh Kumar wrote:
> > > > > On 19-10-20, 15:10, Sudeep Holla wrote:
> > > > > > On Mon, Oct 19, 2020 at 04:05:35PM +0530, Viresh Kumar wrote:
> > > > > > > On 19-10-20, 11:12, Sudeep Holla wrote:
> > > > > > > > Yes it has clocks property but used by SCMI(for CPUFreq/DevFreq) and not
> > > > > > > > by any clock provider driver. E.g. the issue you will see if "clocks"
> > > > > > > > property is used instead of "qcom,freq-domain" on Qcom parts.
> > > > > > > 
> > > > > > > Okay, I understand. But what I still don't understand is why it fails
> > > > > > > for you. You have a clocks property in DT for the CPU, the OPP core
> > > > > > > tries to get it and will get deferred-probed, which will try probing
> > > > > > > at a later point of time and it shall work then. Isn't it ?
> > > > > > >
> > > > > > 
> > > > > > Nope unfortunately. We don't have clock provider, so clk_get will
> > > > > > never succeed and always return -EPROBE_DEFER.
> > > > > 
> > > > > Now this is really bad, you have a fake clocks property, how is the
> > > > > OPP core supposed to know it ? Damn.
> > > > 
> > > > What about instead of fixing the OPP core, which really is doing the
> > > > right thing, we fix your driver (as you can't fix the DT) and add a
> > > > dummy CPU clk to make it all work ?
> > > >
> > > 
> > > I really would avoid that. I would rather change the binding as there is
> > > no single official users of that binding in the upstream tree.
> > 
> > But how will you solve backward compatibility thing then ?
> > 
> 
> I am just betting on the fact that no users upstream means no backward
> compatibility needed. If someone raises issue we need to add backward
> compatibility with dummy clk as you suggested.

Okay. I would have done a change in the OPP core to fix the issue, but
the current code looks correct and we shouldn't change it to satisfy
buggy users. I hope that makes sense.
diff mbox series

Patch

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 2cb5e04cf86c..b92bb61550d3 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -2044,8 +2044,9 @@  int of_genpd_add_provider_simple(struct device_node *np,
 	if (genpd->set_performance_state) {
 		ret = dev_pm_opp_of_add_table(&genpd->dev);
 		if (ret) {
-			dev_err(&genpd->dev, "Failed to add OPP table: %d\n",
-				ret);
+			if (ret != -EPROBE_DEFER)
+				dev_err(&genpd->dev, "Failed to add OPP table: %d\n",
+					ret);
 			goto unlock;
 		}
 
@@ -2054,7 +2055,7 @@  int of_genpd_add_provider_simple(struct device_node *np,
 		 * state.
 		 */
 		genpd->opp_table = dev_pm_opp_get_opp_table(&genpd->dev);
-		WARN_ON(!genpd->opp_table);
+		WARN_ON(IS_ERR(genpd->opp_table));
 	}
 
 	ret = genpd_add_provider(np, genpd_xlate_simple, genpd);
@@ -2111,8 +2112,9 @@  int of_genpd_add_provider_onecell(struct device_node *np,
 		if (genpd->set_performance_state) {
 			ret = dev_pm_opp_of_add_table_indexed(&genpd->dev, i);
 			if (ret) {
-				dev_err(&genpd->dev, "Failed to add OPP table for index %d: %d\n",
-					i, ret);
+				if (ret != -EPROBE_DEFER)
+					dev_err(&genpd->dev, "Failed to add OPP table for index %d: %d\n",
+						i, ret);
 				goto error;
 			}
 
@@ -2121,7 +2123,7 @@  int of_genpd_add_provider_onecell(struct device_node *np,
 			 * performance state.
 			 */
 			genpd->opp_table = dev_pm_opp_get_opp_table_indexed(&genpd->dev, i);
-			WARN_ON(!genpd->opp_table);
+			WARN_ON(IS_ERR(genpd->opp_table));
 		}
 
 		genpd->provider = &np->fwnode;
diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 6978b9218c6e..8c69a764d0a4 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -1068,7 +1068,7 @@  static struct opp_table *_allocate_opp_table(struct device *dev, int index)
 	 */
 	opp_table = kzalloc(sizeof(*opp_table), GFP_KERNEL);
 	if (!opp_table)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	mutex_init(&opp_table->lock);
 	mutex_init(&opp_table->genpd_virt_dev_lock);
@@ -1079,8 +1079,8 @@  static struct opp_table *_allocate_opp_table(struct device *dev, int index)
 
 	opp_dev = _add_opp_dev(dev, opp_table);
 	if (!opp_dev) {
-		kfree(opp_table);
-		return NULL;
+		ret = -ENOMEM;
+		goto err;
 	}
 
 	_of_init_opp_table(opp_table, dev, index);
@@ -1089,16 +1089,21 @@  static struct opp_table *_allocate_opp_table(struct device *dev, int index)
 	opp_table->clk = clk_get(dev, NULL);
 	if (IS_ERR(opp_table->clk)) {
 		ret = PTR_ERR(opp_table->clk);
-		if (ret != -EPROBE_DEFER)
-			dev_dbg(dev, "%s: Couldn't find clock: %d\n", __func__,
-				ret);
+		if (ret == -EPROBE_DEFER)
+			goto err;
+
+		dev_dbg(dev, "%s: Couldn't find clock: %d\n", __func__, ret);
 	}
 
 	/* Find interconnect path(s) for the device */
 	ret = dev_pm_opp_of_find_icc_paths(dev, opp_table);
-	if (ret)
+	if (ret) {
+		if (ret == -EPROBE_DEFER)
+			goto err;
+
 		dev_warn(dev, "%s: Error finding interconnect paths: %d\n",
 			 __func__, ret);
+	}
 
 	BLOCKING_INIT_NOTIFIER_HEAD(&opp_table->head);
 	INIT_LIST_HEAD(&opp_table->opp_list);
@@ -1107,6 +1112,10 @@  static struct opp_table *_allocate_opp_table(struct device *dev, int index)
 	/* Secure the device table modification */
 	list_add(&opp_table->node, &opp_tables);
 	return opp_table;
+
+err:
+	kfree(opp_table);
+	return ERR_PTR(ret);
 }
 
 void _get_opp_table_kref(struct opp_table *opp_table)
@@ -1129,7 +1138,7 @@  static struct opp_table *_opp_get_opp_table(struct device *dev, int index)
 	if (opp_table) {
 		if (!_add_opp_dev_unlocked(dev, opp_table)) {
 			dev_pm_opp_put_opp_table(opp_table);
-			opp_table = NULL;
+			opp_table = ERR_PTR(-ENOMEM);
 		}
 		goto unlock;
 	}
@@ -1573,8 +1582,8 @@  struct opp_table *dev_pm_opp_set_supported_hw(struct device *dev,
 	struct opp_table *opp_table;
 
 	opp_table = dev_pm_opp_get_opp_table(dev);
-	if (!opp_table)
-		return ERR_PTR(-ENOMEM);
+	if (IS_ERR(opp_table))
+		return opp_table;
 
 	/* Make sure there are no concurrent readers while updating opp_table */
 	WARN_ON(!list_empty(&opp_table->opp_list));
@@ -1632,8 +1641,8 @@  struct opp_table *dev_pm_opp_set_prop_name(struct device *dev, const char *name)
 	struct opp_table *opp_table;
 
 	opp_table = dev_pm_opp_get_opp_table(dev);
-	if (!opp_table)
-		return ERR_PTR(-ENOMEM);
+	if (IS_ERR(opp_table))
+		return opp_table;
 
 	/* Make sure there are no concurrent readers while updating opp_table */
 	WARN_ON(!list_empty(&opp_table->opp_list));
@@ -1725,8 +1734,8 @@  struct opp_table *dev_pm_opp_set_regulators(struct device *dev,
 	int ret, i;
 
 	opp_table = dev_pm_opp_get_opp_table(dev);
-	if (!opp_table)
-		return ERR_PTR(-ENOMEM);
+	if (IS_ERR(opp_table))
+		return opp_table;
 
 	/* This should be called before OPPs are initialized */
 	if (WARN_ON(!list_empty(&opp_table->opp_list))) {
@@ -1833,8 +1842,8 @@  struct opp_table *dev_pm_opp_set_clkname(struct device *dev, const char *name)
 	int ret;
 
 	opp_table = dev_pm_opp_get_opp_table(dev);
-	if (!opp_table)
-		return ERR_PTR(-ENOMEM);
+	if (IS_ERR(opp_table))
+		return opp_table;
 
 	/* This should be called before OPPs are initialized */
 	if (WARN_ON(!list_empty(&opp_table->opp_list))) {
@@ -1901,8 +1910,8 @@  struct opp_table *dev_pm_opp_register_set_opp_helper(struct device *dev,
 		return ERR_PTR(-EINVAL);
 
 	opp_table = dev_pm_opp_get_opp_table(dev);
-	if (!opp_table)
-		return ERR_PTR(-ENOMEM);
+	if (!IS_ERR(opp_table))
+		return opp_table;
 
 	/* This should be called before OPPs are initialized */
 	if (WARN_ON(!list_empty(&opp_table->opp_list))) {
@@ -1982,8 +1991,8 @@  struct opp_table *dev_pm_opp_attach_genpd(struct device *dev,
 	const char **name = names;
 
 	opp_table = dev_pm_opp_get_opp_table(dev);
-	if (!opp_table)
-		return ERR_PTR(-ENOMEM);
+	if (IS_ERR(opp_table))
+		return opp_table;
 
 	/*
 	 * If the genpd's OPP table isn't already initialized, parsing of the
@@ -2153,8 +2162,8 @@  int dev_pm_opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
 	int ret;
 
 	opp_table = dev_pm_opp_get_opp_table(dev);
-	if (!opp_table)
-		return -ENOMEM;
+	if (IS_ERR(opp_table))
+		return PTR_ERR(opp_table);
 
 	/* Fix regulator count for dynamic OPPs */
 	opp_table->regulator_count = 1;
diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index 7d9d4455a59e..e39ddcc779af 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -947,8 +947,8 @@  int dev_pm_opp_of_add_table(struct device *dev)
 	int ret;
 
 	opp_table = dev_pm_opp_get_opp_table_indexed(dev, 0);
-	if (!opp_table)
-		return -ENOMEM;
+	if (IS_ERR(opp_table))
+		return PTR_ERR(opp_table);
 
 	/*
 	 * OPPs have two version of bindings now. Also try the old (v1)
@@ -1002,8 +1002,8 @@  int dev_pm_opp_of_add_table_indexed(struct device *dev, int index)
 	}
 
 	opp_table = dev_pm_opp_get_opp_table_indexed(dev, index);
-	if (!opp_table)
-		return -ENOMEM;
+	if (IS_ERR(opp_table))
+		return PTR_ERR(opp_table);
 
 	ret = _of_add_opp_table_v2(dev, opp_table);
 	if (ret)
diff --git a/drivers/soc/samsung/exynos-asv.c b/drivers/soc/samsung/exynos-asv.c
index 30bb7b7cc769..8abf4dfaa5c5 100644
--- a/drivers/soc/samsung/exynos-asv.c
+++ b/drivers/soc/samsung/exynos-asv.c
@@ -93,7 +93,7 @@  static int exynos_asv_update_opps(struct exynos_asv *asv)
 			continue;
 
 		opp_table = dev_pm_opp_get_opp_table(cpu);
-		if (IS_ERR_OR_NULL(opp_table))
+		if (IS_ERR(opp_table))
 			continue;
 
 		if (!last_opp_table || opp_table != last_opp_table) {