Message ID | 983b24a6a176e0800c01aedcd74480d9b551cb13.1716046653.git.repk@triplefau.lt |
---|---|
State | New |
Headers | show |
Series | [v2] wifi: cfg80211: Lock wiphy in cfg80211_get_station | expand |
On Sat May 18, 2024 at 5:50 PM CEST, Remi Pommarel wrote: > Wiphy should be locked before calling rdev_get_station() (see lockdep > assert in ieee80211_get_station()). > > This fixes the following kernel NULL dereference: > > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050 > Mem abort info: > ESR = 0x0000000096000006 > EC = 0x25: DABT (current EL), IL = 32 bits > SET = 0, FnV = 0 > EA = 0, S1PTW = 0 > FSC = 0x06: level 2 translation fault > Data abort info: > ISV = 0, ISS = 0x00000006 > CM = 0, WnR = 0 > user pgtable: 4k pages, 48-bit VAs, pgdp=0000000003001000 > [0000000000000050] pgd=0800000002dca003, p4d=0800000002dca003, pud=08000000028e9003, pmd=0000000000000000 > Internal error: Oops: 0000000096000006 [#1] SMP > Modules linked in: netconsole dwc3_meson_g12a dwc3_of_simple dwc3 ip_gre gre ath10k_pci ath10k_core ath9k ath9k_common ath9k_hw ath > CPU: 0 PID: 1091 Comm: kworker/u8:0 Not tainted 6.4.0-02144-g565f9a3a7911-dirty #705 > Hardware name: RPT (r1) (DT) > Workqueue: bat_events batadv_v_elp_throughput_metric_update > pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) > pc : ath10k_sta_statistics+0x10/0x2dc [ath10k_core] > lr : sta_set_sinfo+0xcc/0xbd4 > sp : ffff000007b43ad0 > x29: ffff000007b43ad0 x28: ffff0000071fa900 x27: ffff00000294ca98 > x26: ffff000006830880 x25: ffff000006830880 x24: ffff00000294c000 > x23: 0000000000000001 x22: ffff000007b43c90 x21: ffff800008898acc > x20: ffff00000294c6e8 x19: ffff000007b43c90 x18: 0000000000000000 > x17: 445946354d552d78 x16: 62661f7200000000 x15: 57464f445946354d > x14: 0000000000000000 x13: 00000000000000e3 x12: d5f0acbcebea978e > x11: 00000000000000e3 x10: 000000010048fe41 x9 : 0000000000000000 > x8 : ffff000007b43d90 x7 : 000000007a1e2125 x6 : 0000000000000000 > x5 : ffff0000024e0900 x4 : ffff800000a0250c x3 : ffff000007b43c90 > x2 : ffff00000294ca98 x1 : ffff000006831920 x0 : 0000000000000000 > Call trace: > ath10k_sta_statistics+0x10/0x2dc [ath10k_core] > sta_set_sinfo+0xcc/0xbd4 > ieee80211_get_station+0x2c/0x44 > cfg80211_get_station+0x80/0x154 > batadv_v_elp_get_throughput+0x138/0x1fc > batadv_v_elp_throughput_metric_update+0x1c/0xa4 > process_one_work+0x1ec/0x414 > worker_thread+0x70/0x46c > kthread+0xdc/0xe0 > ret_from_fork+0x10/0x20 > Code: a9bb7bfd 910003fd a90153f3 f9411c40 (f9402814) > > Fixes: 7406353d43c8 ("cfg80211: implement cfg80211_get_station cfg80211 API") > Signed-off-by: Remi Pommarel <repk@triplefau.lt> Reviewed-by: Nicolas Escande <nico.escande@gmail.com>
On Sat, 2024-05-18 at 17:50 +0200, Remi Pommarel wrote: > Wiphy should be locked before calling rdev_get_station() (see lockdep > assert in ieee80211_get_station()). > > This fixes the following kernel NULL dereference: How do you get a NULL pointer dereference from a locking issue? Was something else removing the station simultaneously? johannes
Hi, On 18/05/2024 17:50, Remi Pommarel wrote: > Wiphy should be locked before calling rdev_get_station() (see lockdep > assert in ieee80211_get_station()). Adding the lock is fine as nowadays it is taken in pre_doit and released in post_doit (with some exceptions). Therefore when invoking .get_station from a side path the lock should be taken too. It was actually a05829a7222e9d10c416dd2dbbf3929fe6646b89 that introduced this requirement AFAICS. > > This fixes the following kernel NULL dereference: As already said by Johannes, I am not sure it truly fixes this NULL dereference though. Have you checked where in ath10k_sta_statistics this is exactly happening? Do you think some sta was partly released and thus fields were NULLified? Regards,
Hi, On 21/05/2024 14:15, Remi Pommarel wrote: > On Tue, May 21, 2024 at 09:43:56AM +0200, Antonio Quartulli wrote: >> Hi, >> >> On 18/05/2024 17:50, Remi Pommarel wrote: >>> Wiphy should be locked before calling rdev_get_station() (see lockdep >>> assert in ieee80211_get_station()). >> >> Adding the lock is fine as nowadays it is taken in pre_doit and released in >> post_doit (with some exceptions). Therefore when invoking .get_station from >> a side path the lock should be taken too. >> >> It was actually a05829a7222e9d10c416dd2dbbf3929fe6646b89 that introduced >> this requirement AFAICS. > > IIUC lock requirement was already there before this commit, only it was on > rtnl lock to be taken instead of wiphy one. You're right. > >> >>> >>> This fixes the following kernel NULL dereference: >> >> As already said by Johannes, I am not sure it truly fixes this NULL >> dereference though. >> >> Have you checked where in ath10k_sta_statistics this is exactly happening? >> Do you think some sta was partly released and thus fields were NULLified? > > ath10k_sta_statistics+0x10 is associated with the arsta->arvif->ar > statement. It crashes because arsta->arvif is NULL. > > Here is a scenario that explains the crash where the same sta > disconnects then reconnects quickly (e.g. OPEN network): > > > CPU0: CPU1: > > batadv_v_elp_periodic_work() > queue_work(batadv_v_elp_get_throughput) > > ieee80211_del_station() > ieee80211_add_station() > sta_info_insert() > list_add_tail_rcu() > ath10k_sta_state() > memset(arsta, 0, sizeof(arsta)) > batadv_v_elp_get_throughput() > cfg80211_get_station() > ieee80211_get_station() <-- Find sta with its MAC in list > ath10k_sta_statistics() > arsta->arvif->ar <-- arsta is still zeroed > > > In other words if the same sta has enough time to disconnect then > reconnect before batadv_v_elp_get_throughput get scheduled, sta could be > partially initialized when ath10k_sta_statistics() is called. Locking > the wiphy ensure sta is fully initialized if found. > We were just wondering how you could get the arvif being NULL and your explanation makes sense. This said, holding the lock is required when invoking get_station via netlink, therefore it's meaningful to hold it even when side invoking it from another module. Acked-by: Antonio Quartulli <a@unstable.cc>
diff --git a/net/wireless/util.c b/net/wireless/util.c index 2bde8a354631..082c6f9c5416 100644 --- a/net/wireless/util.c +++ b/net/wireless/util.c @@ -2549,6 +2549,7 @@ int cfg80211_get_station(struct net_device *dev, const u8 *mac_addr, { struct cfg80211_registered_device *rdev; struct wireless_dev *wdev; + int ret; wdev = dev->ieee80211_ptr; if (!wdev) @@ -2560,7 +2561,11 @@ int cfg80211_get_station(struct net_device *dev, const u8 *mac_addr, memset(sinfo, 0, sizeof(*sinfo)); - return rdev_get_station(rdev, dev, mac_addr, sinfo); + wiphy_lock(&rdev->wiphy); + ret = rdev_get_station(rdev, dev, mac_addr, sinfo); + wiphy_unlock(&rdev->wiphy); + + return ret; } EXPORT_SYMBOL(cfg80211_get_station);