Message ID | 20230218075956.1563118-1-zyytlz.wz@163.com |
---|---|
State | New |
Headers | show |
Series | mwifiex: Fix use-after-free bug due to race condition between main thread thread and timer thread | expand |
On Sat, Feb 18, 2023 at 03:59:56PM +0800, Zheng Wang wrote: > Note that, this bug is found by static analysis, it could be wrong. We > could discuss that before writing the fix. Yeah, please don't accept this patch. It deserves an "RFC" in the title at best. Sure, it's an identified race condition, but the cure here (deleting all possible recovery from firmware crashes) is worse than the disease. There's no real attempt at analyzing the race or providing solutions, so there's not much to discuss yet. Brian
Brian Norris <briannorris@chromium.org> 于2023年2月22日周三 11:31写道: > > On Sat, Feb 18, 2023 at 03:59:56PM +0800, Zheng Wang wrote: > > Note that, this bug is found by static analysis, it could be wrong. We > > could discuss that before writing the fix. > > Yeah, please don't accept this patch. It deserves an "RFC" in the title > at best. Sure, it's an identified race condition, but the cure here > (deleting all possible recovery from firmware crashes) is worse than the > disease. > Hello Brain, Thanks for your reply. I do need add "RFC" in the title. Sorry about that. The invoking chain is as descriped in the original mail. There is some place which may confuse you. In mwifiex_exec_next_cmd function, the adapter is got from cmd_node->priv->adapter. You might think if this is the same adapter in outer function. In mwifiex_register function, it inits the priv_arrary with adapter->priv[i]->adapter = adapter, Then use mwifiex_init_lock_list to init the list. When it fetch adapter like calling mwifiex_cfg80211_get_tx_power, it travers the array to find the target priv. So all the adapter paramter is pointed to the same adapter. The UAF of it is vulnerable. In summary, after the firmware is downloaded, it will start the timer function, which is known as mwifiex_cmd_timeout_func. The if_ops.card_reset function pointer is assigned with mwifiex_sdio_card_reset, which will schedule_work the card->work. It finally pass the check becauese card->work_flags has MWIFIEX_IFACE_WORK_CARD_RESET. Finnaly, in _mwifiex_fw_dp, if init is failed, it will call mwifiex_free_adapter and free the adapter. > There's no real attempt at analyzing the race or providing solutions, so > there's not much to discuss yet. Yes, I did't figure out a good plan to fix. As I say, it free the adapter in the error handling path. Could you please provide some advice about the fix? Best regards, Zheng
On Wed, Feb 22, 2023 at 12:17:21PM +0800, Zheng Hacker wrote:
> Could you please provide some advice about the fix?
This entire driver's locking patterns (or lack
thereof) need rewritten. This driver was probably written by someone
that doesn't really understand concurrent programming. It really only
works because the bulk of normal operation is sequentialized into the
main loop (mwifiex_main_process()). Any time you get outside that,
you're likely to find bugs.
But now that I've looked a little further, I'm not confident you pointed
out a real bug. How does mwifiex_sdio_card_reset_work() get past
mwifiex_shutdown_sw() -> wait_for_completion(adapter->fw_done) ? That
should ensure that _mwifiex_fw_dpc() is finished, and so we can't hit
the race you point out.
Note to self: ignore most "static analysis" reports of race conditions,
unless they have thorough analysis or a runtime reproduction.
Brian
Hello Brain, Thanks for your detailed review. Sorry we missed something. As you say, We are lacking some consideration, the pointed race path could not happen. But after diving deep into the code, we found another path. Here is the possible path. In summary, if the execution of CPU1 is stuck by fw_done, the cpu0 will continue executing and free the adapter gets into error-path. The cpu1 did not notice that and UAF happened. If there is something else we didn't know, please feel free to let us know. CPU0 CPU1 mwifiex_sdio_probe mwifiex_add_card mwifiex_init_hw_fw request_firmware_nowait mwifiex_fw_dpc _mwifiex_fw_dpc mwifiex_init_fw mwifiex_main_process mwifiex_exec_next_cmd mwifiex_dnld_cmd_to_fw mod_timer(&adapter->cmd_timer,..) mwifiex_cmd_timeout_func if_ops.card_reset(adapter) mwifiex_sdio_card_reset [*] stuck in mwifiex_shutdown_sw return 0 in mwifiex_dnld_cmd_to_fw return 0 in mwifiex_exec_next_cmd return 0 in mwifiex_main_process return -EINPROGRESS in mwifiex_init_fw get into error path, mwifiex_free_adapter // free adapter complete_all(fw_done); [*]Go on Use adapter->fw_done Best regards, Zheng Brian Norris <briannorris@chromium.org> 于2023年2月23日周四 05:20写道: > > On Wed, Feb 22, 2023 at 12:17:21PM +0800, Zheng Hacker wrote: > > Could you please provide some advice about the fix? > > This entire driver's locking patterns (or lack > thereof) need rewritten. This driver was probably written by someone > that doesn't really understand concurrent programming. It really only > works because the bulk of normal operation is sequentialized into the > main loop (mwifiex_main_process()). Any time you get outside that, > you're likely to find bugs. > > But now that I've looked a little further, I'm not confident you pointed > out a real bug. How does mwifiex_sdio_card_reset_work() get past > mwifiex_shutdown_sw() -> wait_for_completion(adapter->fw_done) ? That > should ensure that _mwifiex_fw_dpc() is finished, and so we can't hit > the race you point out. > > Note to self: ignore most "static analysis" reports of race conditions, > unless they have thorough analysis or a runtime reproduction. > > Brian
This email is broken for the statement is too long, Here is the newest email. Hello Brain, Thanks for your detailed review. Sorry we missed something. As you say, We are lacking some consideration, the pointed race path could not happen. But after diving deep into the code, we found another path. Here is the possible path. In summary, if the execution of CPU1 is stuck by fw_done, the cpu0 will continue executing and free the adapter gets into error-path. The cpu1 did not notice that and UAF happened. If there is something else we didn't know, please feel free to let us know. CPU0 CPU1 mwifiex_sdio_probe mwifiex_add_card mwifiex_init_hw_fw request_firmware_nowait mwifiex_fw_dpc _mwifiex_fw_dpc mwifiex_init_fw mwifiex_main_process mwifiex_exec_next_cmd mwifiex_dnld_cmd_to_fw mod_timer(&adapter->cmd_timer,..) mwifiex_cmd_timeout_func if_ops.card_reset(adapter) mwifiex_sdio_card_reset [*] stuck in mwifiex_shutdown_sw retn 0 in mwifiex_dnld_cmd_to_fw retn 0 in mwifiex_exec_next_cmd retn 0 in mwifiex_main_process retn -EINPROGRESS in mwifiex_init_fw mwifiex_free_adapter when in error // free adapter complete_all(fw_done); [*]Go on Use adapter->fw_done Best regards, Zheng Zheng Hacker <hackerzheng666@gmail.com> 于2023年2月24日周五 13:37写道: > > Hello Brain, > > Thanks for your detailed review. Sorry we missed something. As you say, We are > lacking some consideration, the pointed race path could not happen. But after > diving deep into the code, we found another path. > > Here is the possible path. In summary, if the execution of CPU1 is > stuck by fw_done, > the cpu0 will continue executing and free the adapter gets into > error-path. The cpu1 > did not notice that and UAF happened. > > If there is something else we didn't know, please feel free to let us know. > > CPU0 CPU1 > mwifiex_sdio_probe > mwifiex_add_card > mwifiex_init_hw_fw > request_firmware_nowait > mwifiex_fw_dpc > _mwifiex_fw_dpc > mwifiex_init_fw > mwifiex_main_process > mwifiex_exec_next_cmd > mwifiex_dnld_cmd_to_fw > mod_timer(&adapter->cmd_timer,..) > mwifiex_cmd_timeout_func > > if_ops.card_reset(adapter) > > mwifiex_sdio_card_reset > [*] stuck > in mwifiex_shutdown_sw > return 0 in mwifiex_dnld_cmd_to_fw > return 0 in mwifiex_exec_next_cmd > return 0 in mwifiex_main_process > return -EINPROGRESS in mwifiex_init_fw > get into error path, mwifiex_free_adapter > // free adapter > complete_all(fw_done); > [*]Go on > Use > adapter->fw_done > > > Best regards, > Zheng > > > Brian Norris <briannorris@chromium.org> 于2023年2月23日周四 05:20写道: > > > > On Wed, Feb 22, 2023 at 12:17:21PM +0800, Zheng Hacker wrote: > > > Could you please provide some advice about the fix? > > > > This entire driver's locking patterns (or lack > > thereof) need rewritten. This driver was probably written by someone > > that doesn't really understand concurrent programming. It really only > > works because the bulk of normal operation is sequentialized into the > > main loop (mwifiex_main_process()). Any time you get outside that, > > you're likely to find bugs. > > > > But now that I've looked a little further, I'm not confident you pointed > > out a real bug. How does mwifiex_sdio_card_reset_work() get past > > mwifiex_shutdown_sw() -> wait_for_completion(adapter->fw_done) ? That > > should ensure that _mwifiex_fw_dpc() is finished, and so we can't hit > > the race you point out. > > > > Note to self: ignore most "static analysis" reports of race conditions, > > unless they have thorough analysis or a runtime reproduction. > > > > Brian
On Fri, Feb 24, 2023 at 02:17:59PM +0800, Zheng Hacker wrote: > This email is broken for the statement is too long, Here is the newest email. It still wraps a bit weird, but it's good enough I suppose. > retn -EINPROGRESS in mwifiex_init_fw > mwifiex_free_adapter when in error These two statements don't connect. _mwifiex_fw_dpc() only treats -1 as a true error; -EINPROGRESS is treated as success, such that we continue to wait for the command response. Now, we might hang here if that response doesn't come, but that's a different problem... I'm sure there are true bugs in here somewhere, but I've spent enough time reading your incorrect reports and don't plan to spend more. (If you're lucky, maybe you can pique my curiosity again, but don't count on it.) Regards, Brian
Brian Norris <briannorris@chromium.org> 于2023年2月25日周六 05:39写道: > > On Fri, Feb 24, 2023 at 02:17:59PM +0800, Zheng Hacker wrote: > > This email is broken for the statement is too long, Here is the newest email. > > It still wraps a bit weird, but it's good enough I suppose. > > > retn -EINPROGRESS in mwifiex_init_fw > > mwifiex_free_adapter when in error > > These two statements don't connect. _mwifiex_fw_dpc() only treats -1 as > a true error; -EINPROGRESS is treated as success, such that we continue > to wait for the command response. Now, we might hang here if that > response doesn't come, but that's a different problem... > Hi Brain, Sorry for my unclear description. As you say, they don't have any connection. What I really want to express is after mwifiex_init_fw return -EINPROGRESS to its invoker, which is _mwifiex_fw_dpc. It will pass the check of return value, as the following code. ```cpp ret = mwifiex_init_fw(adapter); if (ret == -1) { goto err_init_fw; } else if (!ret) { adapter->hw_status = MWIFIEX_HW_STATUS_READY; goto done; } ``` it continues executing in _mwifiex_fw_dpc. Then in some unexpected situation,, it'll get into error path like mwifiex_init_channel_scan_gap return non-zero code if there is no more memory to use. It'll then get into err_init_chan_scan label and call mwifiex_free_adapte finally. The other thread may USE it afterwards. ```cpp if (mwifiex_init_channel_scan_gap(adapter)) { mwifiex_dbg(adapter, ERROR, "could not init channel stats table\n"); goto err_init_chan_scan; } ``` > I'm sure there are true bugs in here somewhere, but I've spent enough > time reading your incorrect reports and don't plan to spend more. (If > you're lucky, maybe you can pique my curiosity again, but don't count on > it.) > BUT after reviewing the code carefully, I found this might not happen due to some exclusive condition. So yes, I also think there is still some problem here but I'm kind of busy nowadays. I promise to attach a clearer report next time. So sorry to bother you so many times. And appreciate for your precious time spending on this report. Best regards, Zheng
diff --git a/drivers/net/wireless/marvell/mwifiex/cmdevt.c b/drivers/net/wireless/marvell/mwifiex/cmdevt.c index d3339d67e7a0..688dd451aba9 100644 --- a/drivers/net/wireless/marvell/mwifiex/cmdevt.c +++ b/drivers/net/wireless/marvell/mwifiex/cmdevt.c @@ -1016,8 +1016,6 @@ mwifiex_cmd_timeout_func(struct timer_list *t) if (adapter->if_ops.device_dump) adapter->if_ops.device_dump(adapter); - if (adapter->if_ops.card_reset) - adapter->if_ops.card_reset(adapter); } void diff --git a/drivers/net/wireless/marvell/mwifiex/init.c b/drivers/net/wireless/marvell/mwifiex/init.c index 7dddb4b5dea1..ff2d447c1de3 100644 --- a/drivers/net/wireless/marvell/mwifiex/init.c +++ b/drivers/net/wireless/marvell/mwifiex/init.c @@ -47,8 +47,6 @@ static void wakeup_timer_fn(struct timer_list *t) adapter->hw_status = MWIFIEX_HW_STATUS_RESET; mwifiex_cancel_all_pending_cmd(adapter); - if (adapter->if_ops.card_reset) - adapter->if_ops.card_reset(adapter); } static void fw_dump_work(struct work_struct *work)
This is a potential race condition by executing the following order. In summary, the adapter could be freed in timer function and be used after that. The race condition needs 10s window which could be extended by the paper : https://www.usenix.org/system/files/sec21-lee-yoochan.pdf And the function in wakeup_timer_fn may have the same problem. I dont't really know how to fix that, so I just removed the reset call, which is totally wrong. If you know anything abouth the fix, plz free to let me know. Note that, this bug is found by static analysis, it could be wrong. We could discuss that before writing the fix. CPU0 CPU1 mwifiex_sdio_probe mwifiex_add_card mwifiex_init_hw_fw request_firmware_nowait mwifiex_fw_dpc _mwifiex_fw_dpc mwifiex_init_fw mwifiex_main_process mwifiex_exec_next_cmd mwifiex_dnld_cmd_to_fw mod_timer(&adapter->cmd_timer,..) mwifiex_cmd_timeout_func if_ops.card_reset(adapter) mwifiex_sdio_card_reset schedule_work(&card->work) mwifiex_sdio_work mwifiex_sdio_card_reset_work mwifiex_reinit_sw _mwifiex_fw_dpc mwifiex_free_adapter mwifiex_unregister kfree(adapter) //free adapter mwifiex_get_priv // Use adapter Signed-off-by: Zheng Wang <zyytlz.wz@163.com> --- drivers/net/wireless/marvell/mwifiex/cmdevt.c | 2 -- drivers/net/wireless/marvell/mwifiex/init.c | 2 -- 2 files changed, 4 deletions(-)