Message ID | 9cc9b81d-75a3-3925-b612-9d0ad3cab82b@I-love.SAKURA.ne.jp |
---|---|
State | New |
Headers | show |
Series | [v2] wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop() | expand |
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> writes: > lockdep complains use of uninitialized spinlock at ieee80211_do_stop() [1], > for commit f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif > that is being stopped") guards clear_bit() using fq.lock even before > fq_init() from ieee80211_txq_setup_flows() initializes this spinlock. > > According to discussion [2], Toke was not happy with expanding usage of > fq.lock. Since __ieee80211_wake_txqs() is called under RCU read lock, we > can instead use synchronize_rcu() for flushing ieee80211_wake_txqs(). Ah, that's a neat solution! :) Acked-by: Toke Høiland-Jørgensen <toke@kernel.org>
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote: > lockdep complains use of uninitialized spinlock at ieee80211_do_stop() [1], > for commit f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif > that is being stopped") guards clear_bit() using fq.lock even before > fq_init() from ieee80211_txq_setup_flows() initializes this spinlock. > > According to discussion [2], Toke was not happy with expanding usage of > fq.lock. Since __ieee80211_wake_txqs() is called under RCU read lock, we > can instead use synchronize_rcu() for flushing ieee80211_wake_txqs(). > > Link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6 [1] > Link: https://lkml.kernel.org/r/874k0zowh2.fsf@toke.dk [2] > Reported-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com> > Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > Fixes: f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped") > Tested-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com> > Acked-by: Toke Høiland-Jørgensen <toke@kernel.org> Patch applied to wireless-next.git, thanks. 3598cb6e1862 wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()
Since this patch fixes a regression introduced in 5.19-rc7, can this patch go to 5.19-final ? syzbot is failing to test linux.git for 12 days due to this regression. syzbot will fail to bisect new bugs found in the upcoming merge window if unable to test v5.19 due to this regression. On 2022/07/18 21:01, Kalle Valo wrote: > Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote: > >> lockdep complains use of uninitialized spinlock at ieee80211_do_stop() [1], >> for commit f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif >> that is being stopped") guards clear_bit() using fq.lock even before >> fq_init() from ieee80211_txq_setup_flows() initializes this spinlock. >> >> According to discussion [2], Toke was not happy with expanding usage of >> fq.lock. Since __ieee80211_wake_txqs() is called under RCU read lock, we >> can instead use synchronize_rcu() for flushing ieee80211_wake_txqs(). >> >> Link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6 [1] >> Link: https://lkml.kernel.org/r/874k0zowh2.fsf@toke.dk [2] >> Reported-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com> >> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> >> Fixes: f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped") >> Tested-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com> >> Acked-by: Toke Høiland-Jørgensen <toke@kernel.org> > > Patch applied to wireless-next.git, thanks. > > 3598cb6e1862 wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop() >
(please don't top post, I manually fixed that) Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> writes: > On 2022/07/18 21:01, Kalle Valo wrote: >> Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote: >> >>> lockdep complains use of uninitialized spinlock at ieee80211_do_stop() [1], >>> for commit f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif >>> that is being stopped") guards clear_bit() using fq.lock even before >>> fq_init() from ieee80211_txq_setup_flows() initializes this spinlock. >>> >>> According to discussion [2], Toke was not happy with expanding usage of >>> fq.lock. Since __ieee80211_wake_txqs() is called under RCU read lock, we >>> can instead use synchronize_rcu() for flushing ieee80211_wake_txqs(). >>> >>> Link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6 [1] >>> Link: https://lkml.kernel.org/r/874k0zowh2.fsf@toke.dk [2] >>> Reported-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com> >>> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> >>> Fixes: f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped") >>> Tested-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com> >>> Acked-by: Toke Høiland-Jørgensen <toke@kernel.org> >> >> Patch applied to wireless-next.git, thanks. >> >> 3598cb6e1862 wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop() > > Since this patch fixes a regression introduced in 5.19-rc7, can this patch go to 5.19-final ? > > syzbot is failing to test linux.git for 12 days due to this regression. > syzbot will fail to bisect new bugs found in the upcoming merge window > if unable to test v5.19 due to this regression. I took this to wireless-next as I didn't think there's enough time to get this to v5.19 (and I only heard Linus' -rc8 plans after the fact). So this will be in v5.20-rc1 and I recommend pushing this to a v5.19 stable release.
On 7/26/22 7:38 AM, Kalle Valo wrote: > (please don't top post, I manually fixed that) > > Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> writes: > >> On 2022/07/18 21:01, Kalle Valo wrote: >>> Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote: >>> >>>> lockdep complains use of uninitialized spinlock at ieee80211_do_stop() [1], >>>> for commit f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif >>>> that is being stopped") guards clear_bit() using fq.lock even before >>>> fq_init() from ieee80211_txq_setup_flows() initializes this spinlock. >>>> >>>> According to discussion [2], Toke was not happy with expanding usage of >>>> fq.lock. Since __ieee80211_wake_txqs() is called under RCU read lock, we >>>> can instead use synchronize_rcu() for flushing ieee80211_wake_txqs(). >>>> >>>> Link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6 [1] >>>> Link: https://lkml.kernel.org/r/874k0zowh2.fsf@toke.dk [2] >>>> Reported-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com> >>>> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> >>>> Fixes: f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped") >>>> Tested-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com> >>>> Acked-by: Toke Høiland-Jørgensen <toke@kernel.org> >>> >>> Patch applied to wireless-next.git, thanks. >>> >>> 3598cb6e1862 wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop() >> >> Since this patch fixes a regression introduced in 5.19-rc7, can this patch go to 5.19-final ? >> >> syzbot is failing to test linux.git for 12 days due to this regression. >> syzbot will fail to bisect new bugs found in the upcoming merge window >> if unable to test v5.19 due to this regression. > > I took this to wireless-next as I didn't think there's enough time to > get this to v5.19 (and I only heard Linus' -rc8 plans after the fact). > So this will be in v5.20-rc1 and I recommend pushing this to a v5.19 > stable release. Would it be worth reverting the patch that broke things until the first stable 5.19.x tree then? Seems lame to ship an official kernel with a known bug like this. Thanks, Ben
On Tue, 26 Jul 2022 08:05:12 -0700 Ben Greear wrote: > >> Since this patch fixes a regression introduced in 5.19-rc7, can this patch go to 5.19-final ? > >> > >> syzbot is failing to test linux.git for 12 days due to this regression. > >> syzbot will fail to bisect new bugs found in the upcoming merge window > >> if unable to test v5.19 due to this regression. > > > > I took this to wireless-next as I didn't think there's enough time to > > get this to v5.19 (and I only heard Linus' -rc8 plans after the fact). > > So this will be in v5.20-rc1 and I recommend pushing this to a v5.19 > > stable release. > > Would it be worth reverting the patch that broke things until the first stable 5.19.x > tree then? Seems lame to ship an official kernel with a known bug like this. I cherry-picked the fix across the trees after talking to Kalle and DaveM. Let's see how that goes...
On 2022/07/27 6:38, Jakub Kicinski wrote: > On Tue, 26 Jul 2022 08:05:12 -0700 Ben Greear wrote: >>>> Since this patch fixes a regression introduced in 5.19-rc7, can this patch go to 5.19-final ? >>>> >>>> syzbot is failing to test linux.git for 12 days due to this regression. >>>> syzbot will fail to bisect new bugs found in the upcoming merge window >>>> if unable to test v5.19 due to this regression. >>> >>> I took this to wireless-next as I didn't think there's enough time to >>> get this to v5.19 (and I only heard Linus' -rc8 plans after the fact). >>> So this will be in v5.20-rc1 and I recommend pushing this to a v5.19 >>> stable release. >> >> Would it be worth reverting the patch that broke things until the first stable 5.19.x >> tree then? Seems lame to ship an official kernel with a known bug like this. > > I cherry-picked the fix across the trees after talking to Kalle and > DaveM. Let's see how that goes... This patch successfully arrived at linux.git, in time for 5.19-final. Thank you.
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c index 15a73b7fdd75..1a9ada411879 100644 --- a/net/mac80211/iface.c +++ b/net/mac80211/iface.c @@ -377,9 +377,8 @@ static void ieee80211_do_stop(struct ieee80211_sub_if_data *sdata, bool going_do bool cancel_scan; struct cfg80211_nan_func *func; - spin_lock_bh(&local->fq.lock); clear_bit(SDATA_STATE_RUNNING, &sdata->state); - spin_unlock_bh(&local->fq.lock); + synchronize_rcu(); /* flush _ieee80211_wake_txqs() */ cancel_scan = rcu_access_pointer(local->scan_sdata) == sdata; if (cancel_scan)