diff mbox series

[RFC] iwlwifi: iwl-dbg: Use del_timer_sync() before freeing

Message ID 20220406153410.1899768-1-linux@roeck-us.net
State New
Headers show
Series [RFC] iwlwifi: iwl-dbg: Use del_timer_sync() before freeing | expand

Commit Message

Guenter Roeck April 6, 2022, 3:34 p.m. UTC
In Chrome OS, a large number of crashes is observed due to corrupted timer
lists. Steven Rostedt pointed out that this usually happens when a timer
is freed while still active, and that the problem is often triggered
by code calling del_timer() instead of del_timer_sync() just before
freeing.

Steven also identified the iwlwifi driver as one of the possible culprits
since it does exactly that.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Cc: Johannes Berg <johannes.berg@intel.com>
Fixes: 60e8abd9d3e91 ("iwlwifi: dbg_ini: add periodic trigger new API support")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
RFC:
    Maybe there was a reason to use del_timer() instead of del_timer_sync().
    Also, I am not sure if the change is sufficient since I don't see any
    obvious locking that would prevent timers from being added and then
    modified in iwl_dbg_tlv_set_periodic_trigs() while being removed in
    iwl_dbg_tlv_del_timers().

 drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Coelho, Luciano April 8, 2022, 5:20 a.m. UTC | #1
On Thu, 2022-04-07 at 12:50 -0700, Guenter Roeck wrote:
> Hi,
> 
> On 4/6/22 08:34, Guenter Roeck wrote:
> > In Chrome OS, a large number of crashes is observed due to corrupted timer
> > lists. Steven Rostedt pointed out that this usually happens when a timer
> > is freed while still active, and that the problem is often triggered
> > by code calling del_timer() instead of del_timer_sync() just before
> > freeing.
> > 
> > Steven also identified the iwlwifi driver as one of the possible culprits
> > since it does exactly that.
> > 
> > Reported-by: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Shahar S Matityahu <shahar.s.matityahu@intel.com>
> > Cc: Johannes Berg <johannes.berg@intel.com>
> > Fixes: 60e8abd9d3e91 ("iwlwifi: dbg_ini: add periodic trigger new API support")
> > Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> > ---
> > RFC:
> >      Maybe there was a reason to use del_timer() instead of del_timer_sync().
> >      Also, I am not sure if the change is sufficient since I don't see any
> >      obvious locking that would prevent timers from being added and then
> >      modified in iwl_dbg_tlv_set_periodic_trigs() while being removed in
> >      iwl_dbg_tlv_del_timers().
> > 
> 
> I prepared a new version of this patch, introducing a mutex to protect changes
> to periodic_trig_list. I'd like to get some feedback before sending it out,
> though, so I'll wait until next week before sending it.
> 
> If you have any feedback/thoughts/comments, please let me know.

Hi Guenter,

Thanks for your proposal!

I recently moved from the Intel WiFi team to the Graphics team, so I'm
adding Gregory, who has taken over my duties, to the discussion.

I don't recall any specific reasons for using del_timer() instead of
del_timer_sync() here.  So your patch does look correct to me.

--
Cheers,
Luca.
Guenter Roeck April 8, 2022, 10:30 p.m. UTC | #2
Hi Luca,

On 4/7/22 22:20, Coelho, Luciano wrote:
> On Thu, 2022-04-07 at 12:50 -0700, Guenter Roeck wrote:
>> Hi,
>>
>> On 4/6/22 08:34, Guenter Roeck wrote:
>>> In Chrome OS, a large number of crashes is observed due to corrupted timer
>>> lists. Steven Rostedt pointed out that this usually happens when a timer
>>> is freed while still active, and that the problem is often triggered
>>> by code calling del_timer() instead of del_timer_sync() just before
>>> freeing.
>>>
>>> Steven also identified the iwlwifi driver as one of the possible culprits
>>> since it does exactly that.
>>>
>>> Reported-by: Steven Rostedt <rostedt@goodmis.org>
>>> Cc: Steven Rostedt <rostedt@goodmis.org>
>>> Cc: Shahar S Matityahu <shahar.s.matityahu@intel.com>
>>> Cc: Johannes Berg <johannes.berg@intel.com>
>>> Fixes: 60e8abd9d3e91 ("iwlwifi: dbg_ini: add periodic trigger new API support")
>>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
>>> ---
>>> RFC:
>>>       Maybe there was a reason to use del_timer() instead of del_timer_sync().
>>>       Also, I am not sure if the change is sufficient since I don't see any
>>>       obvious locking that would prevent timers from being added and then
>>>       modified in iwl_dbg_tlv_set_periodic_trigs() while being removed in
>>>       iwl_dbg_tlv_del_timers().
>>>
>>
>> I prepared a new version of this patch, introducing a mutex to protect changes
>> to periodic_trig_list. I'd like to get some feedback before sending it out,
>> though, so I'll wait until next week before sending it.
>>
>> If you have any feedback/thoughts/comments, please let me know.
> 
> Hi Guenter,
> 
> Thanks for your proposal!
> 
> I recently moved from the Intel WiFi team to the Graphics team, so I'm
> adding Gregory, who has taken over my duties, to the discussion.
> 
> I don't recall any specific reasons for using del_timer() instead of
> del_timer_sync() here.  So your patch does look correct to me.
> 

Thanks a lot for the feedback. I spent some time trying to determine
if a mutex to protect the periodic timer list is needed, but concluded
that it is not necessary because the code adding the timer list and
the code removing it are never executed in parallel. Of course,
I may be missing something, so I'd be happy to be corrected.

Thanks,
Guenter
diff mbox series

Patch

diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.c b/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.c
index 866a33f49915..3237d4b528b5 100644
--- a/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.c
+++ b/drivers/net/wireless/intel/iwlwifi/iwl-dbg-tlv.c
@@ -371,7 +371,7 @@  void iwl_dbg_tlv_del_timers(struct iwl_trans *trans)
 	struct iwl_dbg_tlv_timer_node *node, *tmp;
 
 	list_for_each_entry_safe(node, tmp, timer_list, list) {
-		del_timer(&node->timer);
+		del_timer_sync(&node->timer);
 		list_del(&node->list);
 		kfree(node);
 	}