diff mbox series

mhi: pci_generic: Remove WQ_MEM_RECLAIM flag from state workqueue

Message ID 1614161930-8513-1-git-send-email-loic.poulain@linaro.org
State Accepted
Commit 0fccbf0a3b690b162f53b13ed8bc442ea33437dc
Series mhi: pci_generic: Remove WQ_MEM_RECLAIM flag from state workqueue

Commit Message

Loic Poulain Feb. 24, 2021, 10:18 a.m. UTC
A recent change created a dedicated workqueue for the state-change work
with the WQ_HIGHPRI (no strong reason for that) and WQ_MEM_RECLAIM flags.
However, the state-change work (mhi_pm_st_worker) does not guarantee
forward progress under memory pressure, and will even wait on various
memory allocations, e.g. when creating devices or loading firmware. The
work is therefore not part of a memory reclaim path.

Moreover, this causes a warning in check_flush_dependency() since we end
up in code that flushes a non-reclaim workqueue:

[   40.969601] workqueue: WQ_MEM_RECLAIM mhi_hiprio_wq:mhi_pm_st_worker [mhi] is flushing !WQ_MEM_RECLAIM events_highpri:flush_backlog
[   40.969612] WARNING: CPU: 4 PID: 158 at kernel/workqueue.c:2607 check_flush_dependency+0x11c/0x140
[   40.969733] Call Trace:
[   40.969740]  __flush_work+0x97/0x1d0
[   40.969745]  ? wake_up_process+0x15/0x20
[   40.969749]  ? insert_work+0x70/0x80
[   40.969750]  ? __queue_work+0x14a/0x3e0
[   40.969753]  flush_work+0x10/0x20
[   40.969756]  rollback_registered_many+0x1c9/0x510
[   40.969759]  unregister_netdevice_queue+0x94/0x120
[   40.969761]  unregister_netdev+0x1d/0x30
[   40.969765]  mhi_net_remove+0x1a/0x40 [mhi_net]
[   40.969770]  mhi_driver_remove+0x124/0x250 [mhi]
[   40.969776]  device_release_driver_internal+0xf0/0x1d0
[   40.969778]  device_release_driver+0x12/0x20
[   40.969782]  bus_remove_device+0xe1/0x150
[   40.969786]  device_del+0x17b/0x3e0
[   40.969791]  mhi_destroy_device+0x9a/0x100 [mhi]
[   40.969796]  ? mhi_unmap_single_use_bb+0x50/0x50 [mhi]
[   40.969799]  device_for_each_child+0x5e/0xa0
[   40.969804]  mhi_pm_st_worker+0x921/0xf50 [mhi]

Fixes: 8f7039787687 ("bus: mhi: core: Move to using high priority workqueue")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>

---
 drivers/bus/mhi/core/init.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

-- 
2.7.4

Comments

Manivannan Sadhasivam Feb. 24, 2021, 10:25 a.m. UTC | #1
On Wed, Feb 24, 2021 at 11:18:50AM +0100, Loic Poulain wrote:
> A recent change created a dedicated workqueue for the state-change work
> with WQ_HIGHPRI (no strong reason for that) and WQ_MEM_RECLAIM flags,
> but the state-change work (mhi_pm_st_worker) does not guarantee forward
> progress under memory pressure, and will even wait on various memory
> allocations when e.g. creating devices, loading firmware, etc... The
> work is then not part of a memory reclaim path...
>
> Moreover, this causes a warning in check_flush_dependency() since we end
> up in code that flushes a non-reclaim workqueue:
>
> [   40.969601] workqueue: WQ_MEM_RECLAIM mhi_hiprio_wq:mhi_pm_st_worker [mhi] is flushing !WQ_MEM_RECLAIM events_highpri:flush_backlog
> [   40.969612] WARNING: CPU: 4 PID: 158 at kernel/workqueue.c:2607 check_flush_dependency+0x11c/0x140
> [   40.969733] Call Trace:
> [   40.969740]  __flush_work+0x97/0x1d0
> [   40.969745]  ? wake_up_process+0x15/0x20
> [   40.969749]  ? insert_work+0x70/0x80
> [   40.969750]  ? __queue_work+0x14a/0x3e0
> [   40.969753]  flush_work+0x10/0x20
> [   40.969756]  rollback_registered_many+0x1c9/0x510
> [   40.969759]  unregister_netdevice_queue+0x94/0x120
> [   40.969761]  unregister_netdev+0x1d/0x30
> [   40.969765]  mhi_net_remove+0x1a/0x40 [mhi_net]
> [   40.969770]  mhi_driver_remove+0x124/0x250 [mhi]
> [   40.969776]  device_release_driver_internal+0xf0/0x1d0
> [   40.969778]  device_release_driver+0x12/0x20
> [   40.969782]  bus_remove_device+0xe1/0x150
> [   40.969786]  device_del+0x17b/0x3e0
> [   40.969791]  mhi_destroy_device+0x9a/0x100 [mhi]
> [   40.969796]  ? mhi_unmap_single_use_bb+0x50/0x50 [mhi]
> [   40.969799]  device_for_each_child+0x5e/0xa0
> [   40.969804]  mhi_pm_st_worker+0x921/0xf50 [mhi]
>
> Fixes: 8f7039787687 ("bus: mhi: core: Move to using high priority workqueue")
> Signed-off-by: Loic Poulain <loic.poulain@linaro.org>

Fix looks good to me but I want Bhaumik to share his review since he
authored the offending patch.

Thanks,
Mani

> ---
>  drivers/bus/mhi/core/init.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c
> index 32eb90f..03ddd6e 100644
> --- a/drivers/bus/mhi/core/init.c
> +++ b/drivers/bus/mhi/core/init.c
> @@ -890,8 +890,7 @@ int mhi_register_controller(struct mhi_controller *mhi_cntrl,
>  	INIT_WORK(&mhi_cntrl->st_worker, mhi_pm_st_worker);
>  	init_waitqueue_head(&mhi_cntrl->state_event);
>  
> -	mhi_cntrl->hiprio_wq = alloc_ordered_workqueue
> -				("mhi_hiprio_wq", WQ_MEM_RECLAIM | WQ_HIGHPRI);
> +	mhi_cntrl->hiprio_wq = alloc_ordered_workqueue("mhi_hiprio_wq", WQ_HIGHPRI);
>  	if (!mhi_cntrl->hiprio_wq) {
>  		dev_err(mhi_cntrl->cntrl_dev, "Failed to allocate workqueue\n");
>  		ret = -ENOMEM;
> -- 
> 2.7.4
>
Bhaumik Bhatt Feb. 24, 2021, 5:46 p.m. UTC | #2
On 2021-02-24 02:25 AM, Manivannan Sadhasivam wrote:
> On Wed, Feb 24, 2021 at 11:18:50AM +0100, Loic Poulain wrote:
>> A recent change created a dedicated workqueue for the state-change work
>> with WQ_HIGHPRI (no strong reason for that) and WQ_MEM_RECLAIM flags,
>> but the state-change work (mhi_pm_st_worker) does not guarantee forward
>> progress under memory pressure, and will even wait on various memory
>> allocations when e.g. creating devices, loading firmware, etc... The
>> work is then not part of a memory reclaim path...
>>
>> Moreover, this causes a warning in check_flush_dependency() since we end
>> up in code that flushes a non-reclaim workqueue:
>>
>> [   40.969601] workqueue: WQ_MEM_RECLAIM mhi_hiprio_wq:mhi_pm_st_worker [mhi] is flushing !WQ_MEM_RECLAIM events_highpri:flush_backlog
>> [   40.969612] WARNING: CPU: 4 PID: 158 at kernel/workqueue.c:2607 check_flush_dependency+0x11c/0x140
>> [   40.969733] Call Trace:
>> [   40.969740]  __flush_work+0x97/0x1d0
>> [   40.969745]  ? wake_up_process+0x15/0x20
>> [   40.969749]  ? insert_work+0x70/0x80
>> [   40.969750]  ? __queue_work+0x14a/0x3e0
>> [   40.969753]  flush_work+0x10/0x20
>> [   40.969756]  rollback_registered_many+0x1c9/0x510
>> [   40.969759]  unregister_netdevice_queue+0x94/0x120
>> [   40.969761]  unregister_netdev+0x1d/0x30
>> [   40.969765]  mhi_net_remove+0x1a/0x40 [mhi_net]
>> [   40.969770]  mhi_driver_remove+0x124/0x250 [mhi]
>> [   40.969776]  device_release_driver_internal+0xf0/0x1d0
>> [   40.969778]  device_release_driver+0x12/0x20
>> [   40.969782]  bus_remove_device+0xe1/0x150
>> [   40.969786]  device_del+0x17b/0x3e0
>> [   40.969791]  mhi_destroy_device+0x9a/0x100 [mhi]
>> [   40.969796]  ? mhi_unmap_single_use_bb+0x50/0x50 [mhi]
>> [   40.969799]  device_for_each_child+0x5e/0xa0
>> [   40.969804]  mhi_pm_st_worker+0x921/0xf50 [mhi]
>>
>> Fixes: 8f7039787687 ("bus: mhi: core: Move to using high priority workqueue")
>> Signed-off-by: Loic Poulain <loic.poulain@linaro.org>


Reviewed-by: Bhaumik Bhatt <bbhatt@codeaurora.org>


>
> Fix looks good to me but I want Bhaumik to share his review since he
> authored the offending patch.
>

We have seen this internally as well. I agree this patch needs to go in.

We had previously seen issues using the global workqueue, hence decided
to move to a dedicated one with WQ_HIGHPRI in order to speed up execution
of the worker when a certain task is queued. For example, handling SBL or
power down needs to be done promptly.

> Thanks,
> Mani
>
>> ---
>>  drivers/bus/mhi/core/init.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c
>> index 32eb90f..03ddd6e 100644
>> --- a/drivers/bus/mhi/core/init.c
>> +++ b/drivers/bus/mhi/core/init.c
>> @@ -890,8 +890,7 @@ int mhi_register_controller(struct mhi_controller *mhi_cntrl,
>>  	INIT_WORK(&mhi_cntrl->st_worker, mhi_pm_st_worker);
>>  	init_waitqueue_head(&mhi_cntrl->state_event);
>>  
>> -	mhi_cntrl->hiprio_wq = alloc_ordered_workqueue
>> -				("mhi_hiprio_wq", WQ_MEM_RECLAIM | WQ_HIGHPRI);
>> +	mhi_cntrl->hiprio_wq = alloc_ordered_workqueue("mhi_hiprio_wq", WQ_HIGHPRI);
>>  	if (!mhi_cntrl->hiprio_wq) {
>>  		dev_err(mhi_cntrl->cntrl_dev, "Failed to allocate workqueue\n");
>>  		ret = -ENOMEM;
>> --
>> 2.7.4
>>


Thanks,
Bhaumik
---
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Manivannan Sadhasivam March 10, 2021, 1:41 p.m. UTC | #3
On Wed, Feb 24, 2021 at 11:18:50AM +0100, Loic Poulain wrote:
> A recent change created a dedicated workqueue for the state-change work
> with WQ_HIGHPRI (no strong reason for that) and WQ_MEM_RECLAIM flags,
> but the state-change work (mhi_pm_st_worker) does not guarantee forward
> progress under memory pressure, and will even wait on various memory
> allocations when e.g. creating devices, loading firmware, etc... The
> work is then not part of a memory reclaim path...
>
> Moreover, this causes a warning in check_flush_dependency() since we end
> up in code that flushes a non-reclaim workqueue:
>
> [   40.969601] workqueue: WQ_MEM_RECLAIM mhi_hiprio_wq:mhi_pm_st_worker [mhi] is flushing !WQ_MEM_RECLAIM events_highpri:flush_backlog
> [   40.969612] WARNING: CPU: 4 PID: 158 at kernel/workqueue.c:2607 check_flush_dependency+0x11c/0x140
> [   40.969733] Call Trace:
> [   40.969740]  __flush_work+0x97/0x1d0
> [   40.969745]  ? wake_up_process+0x15/0x20
> [   40.969749]  ? insert_work+0x70/0x80
> [   40.969750]  ? __queue_work+0x14a/0x3e0
> [   40.969753]  flush_work+0x10/0x20
> [   40.969756]  rollback_registered_many+0x1c9/0x510
> [   40.969759]  unregister_netdevice_queue+0x94/0x120
> [   40.969761]  unregister_netdev+0x1d/0x30
> [   40.969765]  mhi_net_remove+0x1a/0x40 [mhi_net]
> [   40.969770]  mhi_driver_remove+0x124/0x250 [mhi]
> [   40.969776]  device_release_driver_internal+0xf0/0x1d0
> [   40.969778]  device_release_driver+0x12/0x20
> [   40.969782]  bus_remove_device+0xe1/0x150
> [   40.969786]  device_del+0x17b/0x3e0
> [   40.969791]  mhi_destroy_device+0x9a/0x100 [mhi]
> [   40.969796]  ? mhi_unmap_single_use_bb+0x50/0x50 [mhi]
> [   40.969799]  device_for_each_child+0x5e/0xa0
> [   40.969804]  mhi_pm_st_worker+0x921/0xf50 [mhi]
>
> Fixes: 8f7039787687 ("bus: mhi: core: Move to using high priority workqueue")
> Signed-off-by: Loic Poulain <loic.poulain@linaro.org>


Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>


Thanks,
Mani

> ---
>  drivers/bus/mhi/core/init.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c
> index 32eb90f..03ddd6e 100644
> --- a/drivers/bus/mhi/core/init.c
> +++ b/drivers/bus/mhi/core/init.c
> @@ -890,8 +890,7 @@ int mhi_register_controller(struct mhi_controller *mhi_cntrl,
>  	INIT_WORK(&mhi_cntrl->st_worker, mhi_pm_st_worker);
>  	init_waitqueue_head(&mhi_cntrl->state_event);
>  
> -	mhi_cntrl->hiprio_wq = alloc_ordered_workqueue
> -				("mhi_hiprio_wq", WQ_MEM_RECLAIM | WQ_HIGHPRI);
> +	mhi_cntrl->hiprio_wq = alloc_ordered_workqueue("mhi_hiprio_wq", WQ_HIGHPRI);
>  	if (!mhi_cntrl->hiprio_wq) {
>  		dev_err(mhi_cntrl->cntrl_dev, "Failed to allocate workqueue\n");
>  		ret = -ENOMEM;
> -- 
> 2.7.4
>
Manivannan Sadhasivam March 10, 2021, 1:43 p.m. UTC | #4
On Wed, Feb 24, 2021 at 11:18:50AM +0100, Loic Poulain wrote:
> A recent change created a dedicated workqueue for the state-change work
> with WQ_HIGHPRI (no strong reason for that) and WQ_MEM_RECLAIM flags,
> but the state-change work (mhi_pm_st_worker) does not guarantee forward
> progress under memory pressure, and will even wait on various memory
> allocations when e.g. creating devices, loading firmware, etc... The
> work is then not part of a memory reclaim path...
>
> Moreover, this causes a warning in check_flush_dependency() since we end
> up in code that flushes a non-reclaim workqueue:
>
> [   40.969601] workqueue: WQ_MEM_RECLAIM mhi_hiprio_wq:mhi_pm_st_worker [mhi] is flushing !WQ_MEM_RECLAIM events_highpri:flush_backlog
> [   40.969612] WARNING: CPU: 4 PID: 158 at kernel/workqueue.c:2607 check_flush_dependency+0x11c/0x140
> [   40.969733] Call Trace:
> [   40.969740]  __flush_work+0x97/0x1d0
> [   40.969745]  ? wake_up_process+0x15/0x20
> [   40.969749]  ? insert_work+0x70/0x80
> [   40.969750]  ? __queue_work+0x14a/0x3e0
> [   40.969753]  flush_work+0x10/0x20
> [   40.969756]  rollback_registered_many+0x1c9/0x510
> [   40.969759]  unregister_netdevice_queue+0x94/0x120
> [   40.969761]  unregister_netdev+0x1d/0x30
> [   40.969765]  mhi_net_remove+0x1a/0x40 [mhi_net]
> [   40.969770]  mhi_driver_remove+0x124/0x250 [mhi]
> [   40.969776]  device_release_driver_internal+0xf0/0x1d0
> [   40.969778]  device_release_driver+0x12/0x20
> [   40.969782]  bus_remove_device+0xe1/0x150
> [   40.969786]  device_del+0x17b/0x3e0
> [   40.969791]  mhi_destroy_device+0x9a/0x100 [mhi]
> [   40.969796]  ? mhi_unmap_single_use_bb+0x50/0x50 [mhi]
> [   40.969799]  device_for_each_child+0x5e/0xa0
> [   40.969804]  mhi_pm_st_worker+0x921/0xf50 [mhi]
>
> Fixes: 8f7039787687 ("bus: mhi: core: Move to using high priority workqueue")
> Signed-off-by: Loic Poulain <loic.poulain@linaro.org>


Applied to mhi-next!

Thanks,
Mani

> ---
>  drivers/bus/mhi/core/init.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c
> index 32eb90f..03ddd6e 100644
> --- a/drivers/bus/mhi/core/init.c
> +++ b/drivers/bus/mhi/core/init.c
> @@ -890,8 +890,7 @@ int mhi_register_controller(struct mhi_controller *mhi_cntrl,
>  	INIT_WORK(&mhi_cntrl->st_worker, mhi_pm_st_worker);
>  	init_waitqueue_head(&mhi_cntrl->state_event);
>  
> -	mhi_cntrl->hiprio_wq = alloc_ordered_workqueue
> -				("mhi_hiprio_wq", WQ_MEM_RECLAIM | WQ_HIGHPRI);
> +	mhi_cntrl->hiprio_wq = alloc_ordered_workqueue("mhi_hiprio_wq", WQ_HIGHPRI);
>  	if (!mhi_cntrl->hiprio_wq) {
>  		dev_err(mhi_cntrl->cntrl_dev, "Failed to allocate workqueue\n");
>  		ret = -ENOMEM;
> -- 
> 2.7.4
>
patchwork-bot+linux-arm-msm@kernel.org May 26, 2021, 7:03 p.m. UTC | #5
Hello:

This patch was applied to qcom/linux.git (refs/heads/for-next):

On Wed, 24 Feb 2021 11:18:50 +0100 you wrote:
> A recent change created a dedicated workqueue for the state-change work
> with WQ_HIGHPRI (no strong reason for that) and WQ_MEM_RECLAIM flags,
> but the state-change work (mhi_pm_st_worker) does not guarantee forward
> progress under memory pressure, and will even wait on various memory
> allocations when e.g. creating devices, loading firmware, etc... The
> work is then not part of a memory reclaim path...
>
> [...]


Here is the summary with links:
  - mhi: pci_generic: Remove WQ_MEM_RECLAIM flag from state workqueue
    https://git.kernel.org/qcom/c/0fccbf0a3b69

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

Patch

diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c
index 32eb90f..03ddd6e 100644
--- a/drivers/bus/mhi/core/init.c
+++ b/drivers/bus/mhi/core/init.c
@@ -890,8 +890,7 @@ int mhi_register_controller(struct mhi_controller *mhi_cntrl,
 	INIT_WORK(&mhi_cntrl->st_worker, mhi_pm_st_worker);
 	init_waitqueue_head(&mhi_cntrl->state_event);
 
-	mhi_cntrl->hiprio_wq = alloc_ordered_workqueue
-				("mhi_hiprio_wq", WQ_MEM_RECLAIM | WQ_HIGHPRI);
+	mhi_cntrl->hiprio_wq = alloc_ordered_workqueue("mhi_hiprio_wq", WQ_HIGHPRI);
 	if (!mhi_cntrl->hiprio_wq) {
 		dev_err(mhi_cntrl->cntrl_dev, "Failed to allocate workqueue\n");
 		ret = -ENOMEM;
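
For readers unfamiliar with the warning quoted in the commit message: it
fires when a work item running on a WQ_MEM_RECLAIM workqueue flushes work
on a workqueue that lacks that flag. The following is a minimal user-space
sketch of that rule in C, not the kernel's implementation. The names
model_wq, MODEL_WQ_* and flush_would_warn() are hypothetical, the flag
values are made up for the model, and the real check_flush_dependency()
has further conditions that are not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical flag values for this user-space model only. The kernel's
 * real WQ_* constants are defined in include/linux/workqueue.h.
 */
enum {
	MODEL_WQ_MEM_RECLAIM = 1 << 0,
	MODEL_WQ_HIGHPRI     = 1 << 1,
};

/* Hypothetical stand-in for a workqueue: just a name and its flags. */
struct model_wq {
	const char *name;
	unsigned int flags;
};

/*
 * Simplified model of the rule behind the warning in the commit message.
 *
 * current_wq: the workqueue whose worker is doing the flush, or NULL if
 *             the caller is not a workqueue worker at all.
 * target:     the workqueue that owns the work being flushed.
 *
 * Returns true when a reclaim workqueue would end up waiting on a
 * non-reclaim one, which is the dependency the kernel warns about.
 */
static bool flush_would_warn(const struct model_wq *current_wq,
			     const struct model_wq *target)
{
	/* Flushing a reclaim-safe target is always fine. */
	if (target->flags & MODEL_WQ_MEM_RECLAIM)
		return false;

	/* Only a flusher that itself claims WQ_MEM_RECLAIM is a problem. */
	return current_wq && (current_wq->flags & MODEL_WQ_MEM_RECLAIM);
}
```

With the flags used before the patch (MODEL_WQ_MEM_RECLAIM |
MODEL_WQ_HIGHPRI) as the current queue and a target without the reclaim
flag, such as events_highpri in the trace, flush_would_warn() returns
true. With only MODEL_WQ_HIGHPRI, as after the patch, it returns false.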