diff mbox series

memory: samsung: exynos5422-dmc: propagate error from exynos5_counters_get()

Message ID 20200804061210.5415-1-m.szyprowski@samsung.com
State New
Headers show
Series memory: samsung: exynos5422-dmc: propagate error from exynos5_counters_get() | expand

Commit Message

Marek Szyprowski Aug. 4, 2020, 6:12 a.m. UTC
exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for
devfreq event counter is not yet probed. Propagate that error value to
the caller to ensure that the exynos5422-dmc driver will be probed again
when devfreq event contuner is available.

This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are
compiled as modules.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>

---
 drivers/memory/samsung/exynos5422-dmc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.17.1

Comments

Krzysztof Kozlowski Aug. 4, 2020, 6:42 a.m. UTC | #1
On Tue, Aug 04, 2020 at 08:12:10AM +0200, Marek Szyprowski wrote:
> exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for

> devfreq event counter is not yet probed. Propagate that error value to

> the caller to ensure that the exynos5422-dmc driver will be probed again

> when devfreq event contuner is available.

> 

> This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are

> compiled as modules.

> 

> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>

> ---

>  drivers/memory/samsung/exynos5422-dmc.c | 2 +-

>  1 file changed, 1 insertion(+), 1 deletion(-)


Thanks, looks good. I'll apply it to fixes after merge window.

Best regards,
Krzysztof
Lukasz Luba Aug. 4, 2020, 9:11 a.m. UTC | #2
Hi Marek,

On 8/4/20 7:12 AM, Marek Szyprowski wrote:
> exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for
> devfreq event counter is not yet probed. Propagate that error value to
> the caller to ensure that the exynos5422-dmc driver will be probed again
> when devfreq event contuner is available.
> 
> This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are
> compiled as modules.
> 
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> ---
>   drivers/memory/samsung/exynos5422-dmc.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/memory/samsung/exynos5422-dmc.c b/drivers/memory/samsung/exynos5422-dmc.c
> index b9c7956e5031..639811a3eecb 100644
> --- a/drivers/memory/samsung/exynos5422-dmc.c
> +++ b/drivers/memory/samsung/exynos5422-dmc.c
> @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device *dev,
>   	} else {
>   		ret = exynos5_counters_get(dmc, &load, &total);
>   		if (ret < 0)
> -			return -EINVAL;
> +			return ret;
>   
>   		/* To protect from overflow, divide by 1024 */
>   		stat->busy_time = load >> 10;
> 

Thank you for the patch, LGTM.
Some questions are still there, though. The function
exynos5_performance_counters_init() should capture that the counters
couldn't be enabled or set. So the functions:
exynos5_counters_enable_edev() and exynos5_counters_set_event()
must pass gently because devfreq device is registered.
Then devfreq checks device status, and reaches the state when
counters 'get' function returns that they are not ready...

If that is a kind of 2-stage initialization, maybe we should add
another 'check' in the exynos5_performance_counters_init() and call
the devfreq_event_get_event() to make sure that we are ready to go,
otherwise return ret from that function (which is probably EPROBE_DEFER)
and not register the devfreq device.

Marek do want to submit such patch or I should bake it and submit on top
of this patch?

Could you tell me how I can reproduce this? Do you simply load one
module after another (exynos-ppmu than exynos5422-dmc) or in parallel?

Regards,
Lukasz
Marek Szyprowski Aug. 4, 2020, 12:19 p.m. UTC | #3
Hi Lukasz,

On 04.08.2020 11:11, Lukasz Luba wrote:
> Hi Marek,
>
> On 8/4/20 7:12 AM, Marek Szyprowski wrote:
>> exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for
>> devfreq event counter is not yet probed. Propagate that error value to
>> the caller to ensure that the exynos5422-dmc driver will be probed again
>> when devfreq event contuner is available.
>>
>> This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are
>> compiled as modules.
>>
>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
>> ---
>>   drivers/memory/samsung/exynos5422-dmc.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/memory/samsung/exynos5422-dmc.c 
>> b/drivers/memory/samsung/exynos5422-dmc.c
>> index b9c7956e5031..639811a3eecb 100644
>> --- a/drivers/memory/samsung/exynos5422-dmc.c
>> +++ b/drivers/memory/samsung/exynos5422-dmc.c
>> @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device 
>> *dev,
>>       } else {
>>           ret = exynos5_counters_get(dmc, &load, &total);
>>           if (ret < 0)
>> -            return -EINVAL;
>> +            return ret;
>>             /* To protect from overflow, divide by 1024 */
>>           stat->busy_time = load >> 10;
>>
>
> Thank you for the patch, LGTM.
> Some questions are still there, though. The function
> exynos5_performance_counters_init() should capture that the counters
> couldn't be enabled or set. So the functions:
> exynos5_counters_enable_edev() and exynos5_counters_set_event()
> must pass gently because devfreq device is registered.
> Then devfreq checks device status, and reaches the state when
> counters 'get' function returns that they are not ready...
>
> If that is a kind of 2-stage initialization, maybe we should add
> another 'check' in the exynos5_performance_counters_init() and call
> the devfreq_event_get_event() to make sure that we are ready to go,
> otherwise return ret from that function (which is probably EPROBE_DEFER)
> and not register the devfreq device.

I've finally investigated this further and it turned out that the issue 
is elsewhere. The $subject patch can be discarded, as it doesn't fix 
anything. The -EPROBE_DEFER is properly returned by 
exynos5_performance_counters_init(), which redirects exynos5_dmc_probe() 
to remove_clocks label. This causes disabling mout_bpll/fout_bpll clocks 
what in turn *sometimes* causes boot hang. This random behavior mislead 
me that the $subject patch fixes the issue, but then longer tests 
revealed that it didn't change anything.

It looks that the proper fix would be to keep fout_bpll enabled all the 
time.

>
> Marek do want to submit such patch or I should bake it and submit on top
> of this patch?
>
> Could you tell me how I can reproduce this? Do you simply load one
> module after another (exynos-ppmu than exynos5422-dmc) or in parallel?

I've just boot zImage built from multi_v7_defconfig with modules 
installed. Modules are automatically loaded by udev during boot.

Best regards
Lukasz Luba Aug. 4, 2020, 12:38 p.m. UTC | #4
On 8/4/20 1:19 PM, Marek Szyprowski wrote:
> Hi Lukasz,

> 

> On 04.08.2020 11:11, Lukasz Luba wrote:

>> Hi Marek,

>>

>> On 8/4/20 7:12 AM, Marek Szyprowski wrote:

>>> exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for

>>> devfreq event counter is not yet probed. Propagate that error value to

>>> the caller to ensure that the exynos5422-dmc driver will be probed again

>>> when devfreq event contuner is available.

>>>

>>> This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are

>>> compiled as modules.

>>>

>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>

>>> ---

>>>    drivers/memory/samsung/exynos5422-dmc.c | 2 +-

>>>    1 file changed, 1 insertion(+), 1 deletion(-)

>>>

>>> diff --git a/drivers/memory/samsung/exynos5422-dmc.c

>>> b/drivers/memory/samsung/exynos5422-dmc.c

>>> index b9c7956e5031..639811a3eecb 100644

>>> --- a/drivers/memory/samsung/exynos5422-dmc.c

>>> +++ b/drivers/memory/samsung/exynos5422-dmc.c

>>> @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device

>>> *dev,

>>>        } else {

>>>            ret = exynos5_counters_get(dmc, &load, &total);

>>>            if (ret < 0)

>>> -            return -EINVAL;

>>> +            return ret;

>>>              /* To protect from overflow, divide by 1024 */

>>>            stat->busy_time = load >> 10;

>>>

>>

>> Thank you for the patch, LGTM.

>> Some questions are still there, though. The function

>> exynos5_performance_counters_init() should capture that the counters

>> couldn't be enabled or set. So the functions:

>> exynos5_counters_enable_edev() and exynos5_counters_set_event()

>> must pass gently because devfreq device is registered.

>> Then devfreq checks device status, and reaches the state when

>> counters 'get' function returns that they are not ready...

>>

>> If that is a kind of 2-stage initialization, maybe we should add

>> another 'check' in the exynos5_performance_counters_init() and call

>> the devfreq_event_get_event() to make sure that we are ready to go,

>> otherwise return ret from that function (which is probably EPROBE_DEFER)

>> and not register the devfreq device.

> 

> I've finally investigated this further and it turned out that the issue

> is elsewhere. The $subject patch can be discarded, as it doesn't fix

> anything. The -EPROBE_DEFER is properly returned by

> exynos5_performance_counters_init(), which redirects exynos5_dmc_probe()

> to remove_clocks label. This causes disabling mout_bpll/fout_bpll clocks

> what in turn *sometimes* causes boot hang. This random behavior mislead

> me that the $subject patch fixes the issue, but then longer tests

> revealed that it didn't change anything.


Really good investigation, great work Marek!

> 

> It looks that the proper fix would be to keep fout_bpll enabled all the

> time.


Yes, I agree. I am looking for your next patch to test it then.

> 

>>

>> Marek do want to submit such patch or I should bake it and submit on top

>> of this patch?

>>

>> Could you tell me how I can reproduce this? Do you simply load one

>> module after another (exynos-ppmu than exynos5422-dmc) or in parallel?

> 

> I've just boot zImage built from multi_v7_defconfig with modules

> installed. Modules are automatically loaded by udev during boot.


Thank you sharing this test procedure.

Regards,
Lukasz
Krzysztof Kozlowski Aug. 17, 2020, 12:07 p.m. UTC | #5
On Tue, Aug 04, 2020 at 01:38:11PM +0100, Lukasz Luba wrote:
> 
> 
> On 8/4/20 1:19 PM, Marek Szyprowski wrote:
> > Hi Lukasz,
> > 
> > On 04.08.2020 11:11, Lukasz Luba wrote:
> > > Hi Marek,
> > > 
> > > On 8/4/20 7:12 AM, Marek Szyprowski wrote:
> > > > exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for
> > > > devfreq event counter is not yet probed. Propagate that error value to
> > > > the caller to ensure that the exynos5422-dmc driver will be probed again
> > > > when devfreq event contuner is available.
> > > > 
> > > > This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are
> > > > compiled as modules.
> > > > 
> > > > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> > > > ---
> > > >    drivers/memory/samsung/exynos5422-dmc.c | 2 +-
> > > >    1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/memory/samsung/exynos5422-dmc.c
> > > > b/drivers/memory/samsung/exynos5422-dmc.c
> > > > index b9c7956e5031..639811a3eecb 100644
> > > > --- a/drivers/memory/samsung/exynos5422-dmc.c
> > > > +++ b/drivers/memory/samsung/exynos5422-dmc.c
> > > > @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device
> > > > *dev,
> > > >        } else {
> > > >            ret = exynos5_counters_get(dmc, &load, &total);
> > > >            if (ret < 0)
> > > > -            return -EINVAL;
> > > > +            return ret;
> > > >              /* To protect from overflow, divide by 1024 */
> > > >            stat->busy_time = load >> 10;
> > > > 
> > > 
> > > Thank you for the patch, LGTM.
> > > Some questions are still there, though. The function
> > > exynos5_performance_counters_init() should capture that the counters
> > > couldn't be enabled or set. So the functions:
> > > exynos5_counters_enable_edev() and exynos5_counters_set_event()
> > > must pass gently because devfreq device is registered.
> > > Then devfreq checks device status, and reaches the state when
> > > counters 'get' function returns that they are not ready...
> > > 
> > > If that is a kind of 2-stage initialization, maybe we should add
> > > another 'check' in the exynos5_performance_counters_init() and call
> > > the devfreq_event_get_event() to make sure that we are ready to go,
> > > otherwise return ret from that function (which is probably EPROBE_DEFER)
> > > and not register the devfreq device.
> > 
> > I've finally investigated this further and it turned out that the issue
> > is elsewhere. The $subject patch can be discarded, as it doesn't fix
> > anything. The -EPROBE_DEFER is properly returned by
> > exynos5_performance_counters_init(), which redirects exynos5_dmc_probe()
> > to remove_clocks label. This causes disabling mout_bpll/fout_bpll clocks
> > what in turn *sometimes* causes boot hang. This random behavior mislead
> > me that the $subject patch fixes the issue, but then longer tests
> > revealed that it didn't change anything.
> 
> Really good investigation, great work Marek!
> 
> > 
> > It looks that the proper fix would be to keep fout_bpll enabled all the
> > time.
> 
> Yes, I agree. I am looking for your next patch to test it then.

Hi all,

Is the patch still useful then? Shall I apply it?

Best regards,
Krzysztof
Marek Szyprowski Aug. 24, 2020, 7:44 a.m. UTC | #6
Hi,

On 17.08.2020 14:27, Lukasz Luba wrote:
> On 8/17/20 1:07 PM, Krzysztof Kozlowski wrote:

>> On Tue, Aug 04, 2020 at 01:38:11PM +0100, Lukasz Luba wrote:

>>> On 8/4/20 1:19 PM, Marek Szyprowski wrote:

>>>> On 04.08.2020 11:11, Lukasz Luba wrote:

>>>>> On 8/4/20 7:12 AM, Marek Szyprowski wrote:

>>>>>> exynos5_counters_get() might fail with -EPROBE_DEFER if the 

>>>>>> driver for

>>>>>> devfreq event counter is not yet probed. Propagate that error 

>>>>>> value to

>>>>>> the caller to ensure that the exynos5422-dmc driver will be 

>>>>>> probed again

>>>>>> when devfreq event contuner is available.

>>>>>>

>>>>>> This fixes boot hang if both exynos5422-dmc and exynos-ppmu 

>>>>>> drivers are

>>>>>> compiled as modules.

>>>>>>

>>>>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>

>>>>>> ---

>>>>>>     drivers/memory/samsung/exynos5422-dmc.c | 2 +-

>>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)

>>>>>>

>>>>>> diff --git a/drivers/memory/samsung/exynos5422-dmc.c

>>>>>> b/drivers/memory/samsung/exynos5422-dmc.c

>>>>>> index b9c7956e5031..639811a3eecb 100644

>>>>>> --- a/drivers/memory/samsung/exynos5422-dmc.c

>>>>>> +++ b/drivers/memory/samsung/exynos5422-dmc.c

>>>>>> @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device

>>>>>> *dev,

>>>>>>         } else {

>>>>>>             ret = exynos5_counters_get(dmc, &load, &total);

>>>>>>             if (ret < 0)

>>>>>> -            return -EINVAL;

>>>>>> +            return ret;

>>>>>>               /* To protect from overflow, divide by 1024 */

>>>>>>             stat->busy_time = load >> 10;

>>>>>>

>>>>>

>>>>> Thank you for the patch, LGTM.

>>>>> Some questions are still there, though. The function

>>>>> exynos5_performance_counters_init() should capture that the counters

>>>>> couldn't be enabled or set. So the functions:

>>>>> exynos5_counters_enable_edev() and exynos5_counters_set_event()

>>>>> must pass gently because devfreq device is registered.

>>>>> Then devfreq checks device status, and reaches the state when

>>>>> counters 'get' function returns that they are not ready...

>>>>>

>>>>> If that is a kind of 2-stage initialization, maybe we should add

>>>>> another 'check' in the exynos5_performance_counters_init() and call

>>>>> the devfreq_event_get_event() to make sure that we are ready to go,

>>>>> otherwise return ret from that function (which is probably 

>>>>> EPROBE_DEFER)

>>>>> and not register the devfreq device.

>>>>

>>>> I've finally investigated this further and it turned out that the 

>>>> issue

>>>> is elsewhere. The $subject patch can be discarded, as it doesn't fix

>>>> anything. The -EPROBE_DEFER is properly returned by

>>>> exynos5_performance_counters_init(), which redirects 

>>>> exynos5_dmc_probe()

>>>> to remove_clocks label. This causes disabling mout_bpll/fout_bpll 

>>>> clocks

>>>> what in turn *sometimes* causes boot hang. This random behavior 

>>>> mislead

>>>> me that the $subject patch fixes the issue, but then longer tests

>>>> revealed that it didn't change anything.

>>>

>>> Really good investigation, great work Marek!

>>>

>>>>

>>>> It looks that the proper fix would be to keep fout_bpll enabled all 

>>>> the

>>>> time.

>>>

>>> Yes, I agree. I am looking for your next patch to test it then.

>>

>> Hi all,

>>

>> Is the patch still useful then? Shall I apply it?

>

> Marek has created different patch for it, which fixes the clock:

> https://lore.kernel.org/linux-clk/20200807133143.22748-1-m.szyprowski@samsung.com/ 

>

>

> So you don't have to apply this one IMO.


Indeed, you can drop this one.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland
diff mbox series

Patch

diff --git a/drivers/memory/samsung/exynos5422-dmc.c b/drivers/memory/samsung/exynos5422-dmc.c
index b9c7956e5031..639811a3eecb 100644
--- a/drivers/memory/samsung/exynos5422-dmc.c
+++ b/drivers/memory/samsung/exynos5422-dmc.c
@@ -914,7 +914,7 @@  static int exynos5_dmc_get_status(struct device *dev,
 	} else {
 		ret = exynos5_counters_get(dmc, &load, &total);
 		if (ret < 0)
-			return -EINVAL;
+			return ret;
 
 		/* To protect from overflow, divide by 1024 */
 		stat->busy_time = load >> 10;