memory: samsung: exynos5422-dmc: propagate error from exynos5_counters_get()

Message ID 20200804061210.5415-1-m.szyprowski@samsung.com
State New
Headers show
Series
  • memory: samsung: exynos5422-dmc: propagate error from exynos5_counters_get()
Related show

Commit Message

Marek Szyprowski Aug. 4, 2020, 6:12 a.m.
exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for
devfreq event counter is not yet probed. Propagate that error value to
the caller to ensure that the exynos5422-dmc driver will be probed again
when devfreq event contuner is available.

This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are
compiled as modules.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>

---
 drivers/memory/samsung/exynos5422-dmc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.17.1

Comments

Krzysztof Kozlowski Aug. 4, 2020, 6:42 a.m. | #1
On Tue, Aug 04, 2020 at 08:12:10AM +0200, Marek Szyprowski wrote:
> exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for

> devfreq event counter is not yet probed. Propagate that error value to

> the caller to ensure that the exynos5422-dmc driver will be probed again

> when devfreq event contuner is available.

> 

> This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are

> compiled as modules.

> 

> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>

> ---

>  drivers/memory/samsung/exynos5422-dmc.c | 2 +-

>  1 file changed, 1 insertion(+), 1 deletion(-)


Thanks, looks good. I'll apply it to fixes after merge window.

Best regards,
Krzysztof
Lukasz Luba Aug. 4, 2020, 12:38 p.m. | #2
On 8/4/20 1:19 PM, Marek Szyprowski wrote:
> Hi Lukasz,

> 

> On 04.08.2020 11:11, Lukasz Luba wrote:

>> Hi Marek,

>>

>> On 8/4/20 7:12 AM, Marek Szyprowski wrote:

>>> exynos5_counters_get() might fail with -EPROBE_DEFER if the driver for

>>> devfreq event counter is not yet probed. Propagate that error value to

>>> the caller to ensure that the exynos5422-dmc driver will be probed again

>>> when devfreq event contuner is available.

>>>

>>> This fixes boot hang if both exynos5422-dmc and exynos-ppmu drivers are

>>> compiled as modules.

>>>

>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>

>>> ---

>>>    drivers/memory/samsung/exynos5422-dmc.c | 2 +-

>>>    1 file changed, 1 insertion(+), 1 deletion(-)

>>>

>>> diff --git a/drivers/memory/samsung/exynos5422-dmc.c

>>> b/drivers/memory/samsung/exynos5422-dmc.c

>>> index b9c7956e5031..639811a3eecb 100644

>>> --- a/drivers/memory/samsung/exynos5422-dmc.c

>>> +++ b/drivers/memory/samsung/exynos5422-dmc.c

>>> @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device

>>> *dev,

>>>        } else {

>>>            ret = exynos5_counters_get(dmc, &load, &total);

>>>            if (ret < 0)

>>> -            return -EINVAL;

>>> +            return ret;

>>>              /* To protect from overflow, divide by 1024 */

>>>            stat->busy_time = load >> 10;

>>>

>>

>> Thank you for the patch, LGTM.

>> Some questions are still there, though. The function

>> exynos5_performance_counters_init() should capture that the counters

>> couldn't be enabled or set. So the functions:

>> exynos5_counters_enable_edev() and exynos5_counters_set_event()

>> must pass gently because devfreq device is registered.

>> Then devfreq checks device status, and reaches the state when

>> counters 'get' function returns that they are not ready...

>>

>> If that is a kind of 2-stage initialization, maybe we should add

>> another 'check' in the exynos5_performance_counters_init() and call

>> the devfreq_event_get_event() to make sure that we are ready to go,

>> otherwise return ret from that function (which is probably EPROBE_DEFER)

>> and not register the devfreq device.

> 

> I've finally investigated this further and it turned out that the issue

> is elsewhere. The $subject patch can be discarded, as it doesn't fix

> anything. The -EPROBE_DEFER is properly returned by

> exynos5_performance_counters_init(), which redirects exynos5_dmc_probe()

> to remove_clocks label. This causes disabling mout_bpll/fout_bpll clocks

> what in turn *sometimes* causes boot hang. This random behavior mislead

> me that the $subject patch fixes the issue, but then longer tests

> revealed that it didn't change anything.


Really good investigation, great work Marek!

> 

> It looks that the proper fix would be to keep fout_bpll enabled all the

> time.


Yes, I agree. I am looking for your next patch to test it then.

> 

>>

>> Marek do want to submit such patch or I should bake it and submit on top

>> of this patch?

>>

>> Could you tell me how I can reproduce this? Do you simply load one

>> module after another (exynos-ppmu than exynos5422-dmc) or in parallel?

> 

> I've just boot zImage built from multi_v7_defconfig with modules

> installed. Modules are automatically loaded by udev during boot.


Thank you sharing this test procedure.

Regards,
Lukasz
Marek Szyprowski Aug. 24, 2020, 7:44 a.m. | #3
Hi,

On 17.08.2020 14:27, Lukasz Luba wrote:
> On 8/17/20 1:07 PM, Krzysztof Kozlowski wrote:

>> On Tue, Aug 04, 2020 at 01:38:11PM +0100, Lukasz Luba wrote:

>>> On 8/4/20 1:19 PM, Marek Szyprowski wrote:

>>>> On 04.08.2020 11:11, Lukasz Luba wrote:

>>>>> On 8/4/20 7:12 AM, Marek Szyprowski wrote:

>>>>>> exynos5_counters_get() might fail with -EPROBE_DEFER if the 

>>>>>> driver for

>>>>>> devfreq event counter is not yet probed. Propagate that error 

>>>>>> value to

>>>>>> the caller to ensure that the exynos5422-dmc driver will be 

>>>>>> probed again

>>>>>> when devfreq event contuner is available.

>>>>>>

>>>>>> This fixes boot hang if both exynos5422-dmc and exynos-ppmu 

>>>>>> drivers are

>>>>>> compiled as modules.

>>>>>>

>>>>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>

>>>>>> ---

>>>>>>     drivers/memory/samsung/exynos5422-dmc.c | 2 +-

>>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)

>>>>>>

>>>>>> diff --git a/drivers/memory/samsung/exynos5422-dmc.c

>>>>>> b/drivers/memory/samsung/exynos5422-dmc.c

>>>>>> index b9c7956e5031..639811a3eecb 100644

>>>>>> --- a/drivers/memory/samsung/exynos5422-dmc.c

>>>>>> +++ b/drivers/memory/samsung/exynos5422-dmc.c

>>>>>> @@ -914,7 +914,7 @@ static int exynos5_dmc_get_status(struct device

>>>>>> *dev,

>>>>>>         } else {

>>>>>>             ret = exynos5_counters_get(dmc, &load, &total);

>>>>>>             if (ret < 0)

>>>>>> -            return -EINVAL;

>>>>>> +            return ret;

>>>>>>               /* To protect from overflow, divide by 1024 */

>>>>>>             stat->busy_time = load >> 10;

>>>>>>

>>>>>

>>>>> Thank you for the patch, LGTM.

>>>>> Some questions are still there, though. The function

>>>>> exynos5_performance_counters_init() should capture that the counters

>>>>> couldn't be enabled or set. So the functions:

>>>>> exynos5_counters_enable_edev() and exynos5_counters_set_event()

>>>>> must pass gently because devfreq device is registered.

>>>>> Then devfreq checks device status, and reaches the state when

>>>>> counters 'get' function returns that they are not ready...

>>>>>

>>>>> If that is a kind of 2-stage initialization, maybe we should add

>>>>> another 'check' in the exynos5_performance_counters_init() and call

>>>>> the devfreq_event_get_event() to make sure that we are ready to go,

>>>>> otherwise return ret from that function (which is probably 

>>>>> EPROBE_DEFER)

>>>>> and not register the devfreq device.

>>>>

>>>> I've finally investigated this further and it turned out that the 

>>>> issue

>>>> is elsewhere. The $subject patch can be discarded, as it doesn't fix

>>>> anything. The -EPROBE_DEFER is properly returned by

>>>> exynos5_performance_counters_init(), which redirects 

>>>> exynos5_dmc_probe()

>>>> to remove_clocks label. This causes disabling mout_bpll/fout_bpll 

>>>> clocks

>>>> what in turn *sometimes* causes boot hang. This random behavior 

>>>> mislead

>>>> me that the $subject patch fixes the issue, but then longer tests

>>>> revealed that it didn't change anything.

>>>

>>> Really good investigation, great work Marek!

>>>

>>>>

>>>> It looks that the proper fix would be to keep fout_bpll enabled all 

>>>> the

>>>> time.

>>>

>>> Yes, I agree. I am looking for your next patch to test it then.

>>

>> Hi all,

>>

>> Is the patch still useful then? Shall I apply it?

>

> Marek has created different patch for it, which fixes the clock:

> https://lore.kernel.org/linux-clk/20200807133143.22748-1-m.szyprowski@samsung.com/ 

>

>

> So you don't have to apply this one IMO.


Indeed, you can drop this one.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

Patch

diff --git a/drivers/memory/samsung/exynos5422-dmc.c b/drivers/memory/samsung/exynos5422-dmc.c
index b9c7956e5031..639811a3eecb 100644
--- a/drivers/memory/samsung/exynos5422-dmc.c
+++ b/drivers/memory/samsung/exynos5422-dmc.c
@@ -914,7 +914,7 @@  static int exynos5_dmc_get_status(struct device *dev,
 	} else {
 		ret = exynos5_counters_get(dmc, &load, &total);
 		if (ret < 0)
-			return -EINVAL;
+			return ret;
 
 		/* To protect from overflow, divide by 1024 */
 		stat->busy_time = load >> 10;