diff mbox series

wifi: cfg80211: Correctly handle Medium Synchronization Delay

Message ID 20241030025515.16004-1-quic_lingbok@quicinc.com
State New
Headers show
Series wifi: cfg80211: Correctly handle Medium Synchronization Delay | expand

Commit Message

Lingbo Kong Oct. 30, 2024, 2:55 a.m. UTC
Currently, when the driver attempts to connect to an AP MLD with multiple
APs, the cfg80211_mlme_check_mlo_compat() function requires the Medium
Synchronization Delay values from different APs of the same AP MLD to be
equal, which may result in connection failures.

This is because when the driver receives a multi-link probe response from
an AP MLD with multiple APs, cfg80211 updates the Elements for each AP
based on the multi-link probe response. If the Medium Synchronization Delay
is set in the multi-link probe response, the Elements for each AP belonging
to the same AP MLD will have the Medium Synchronization Delay set
simultaneously. If non-multi-link probe responses are received from
different APs of the same MLD AP, cfg80211 will still update the Elements
based on the non-multi-link probe response. Since the non-multi-link probe
response does not set the Medium Synchronization Delay
(IEEE 802.11be-2024-35.3.4.4), if the Elements from a non-multi-link probe
response overwrite those from a multi-link probe response that has set the
Medium Synchronization Delay, the Medium Synchronization Delay values for
APs belonging to the same AP MLD will not be equal. This discrepancy causes
the cfg80211_mlme_check_mlo_compat() function to fail, leading to
connection failures. Commit ccb964b4ab16
("wifi: cfg80211: validate MLO connections better") did not take this into
account.

To address this issue, add a u16 type member to the struct cfg80211_bss to
store the Medium Synchronization Delay. When the driver receives a probe
response with the Medium Synchronization Delay set, update this member’s
value; otherwise, do not update it. Additionally, modify the parameter list
of cfg80211_mlme_check_mlo_compat() so that this member variable can be
directly accessed within the cfg80211_mlme_check_mlo_compat() function viaq
a pointer to the struct cfg80211_bss. This will allow checking whether the
Medium Synchronization Delay values from different APs of the same AP MLD
are equal.

Fixes: ccb964b4ab16 ("wifi: cfg80211: validate MLO connections better")
Signed-off-by: Lingbo Kong <quic_lingbok@quicinc.com>
---
 include/net/cfg80211.h |  4 ++++
 net/wireless/mlme.c    | 11 +++++++----
 net/wireless/scan.c    | 15 +++++++++++++++
 3 files changed, 26 insertions(+), 4 deletions(-)


base-commit: e7e2957f403ba4655199f2ba9920c1a015a7be44

Comments

Lingbo Kong Oct. 31, 2024, 5:23 a.m. UTC | #1
On 2024/10/30 18:23, Benjamin Berg wrote:
> Hi,
> 
> On Wed, 2024-10-30 at 10:55 +0800, Lingbo Kong wrote:
>> Currently, when the driver attempts to connect to an AP MLD with multiple
>> APs, the cfg80211_mlme_check_mlo_compat() function requires the Medium
>> Synchronization Delay values from different APs of the same AP MLD to be
>> equal, which may result in connection failures.
>>
>> This is because when the driver receives a multi-link probe response from
>> an AP MLD with multiple APs, cfg80211 updates the Elements for each AP
>> based on the multi-link probe response. If the Medium Synchronization Delay
>> is set in the multi-link probe response, the Elements for each AP belonging
>> to the same AP MLD will have the Medium Synchronization Delay set
>> simultaneously. If non-multi-link probe responses are received from
>> different APs of the same MLD AP, cfg80211 will still update the Elements
>> based on the non-multi-link probe response. Since the non-multi-link probe
>> response does not set the Medium Synchronization Delay
>> (IEEE 802.11be-2024-35.3.4.4), if the Elements from a non-multi-link probe
>> response overwrite those from a multi-link probe response that has set the
>> Medium Synchronization Delay, the Medium Synchronization Delay values for
>> APs belonging to the same AP MLD will not be equal. This discrepancy causes
>> the cfg80211_mlme_check_mlo_compat() function to fail, leading to
>> connection failures. Commit ccb964b4ab16
>> ("wifi: cfg80211: validate MLO connections better") did not take this into
>> account.
> 
> The specification confuses me a bit and I am probably missing
> something.
> 
>> To address this issue, add a u16 type member to the struct cfg80211_bss to
>> store the Medium Synchronization Delay. When the driver receives a probe
>> response with the Medium Synchronization Delay set, update this member’s
>> value; otherwise, do not update it. Additionally, modify the parameter list
>> of cfg80211_mlme_check_mlo_compat() so that this member variable can be
>> directly accessed within the cfg80211_mlme_check_mlo_compat() function viaq
>> a pointer to the struct cfg80211_bss. This will allow checking whether the
>> Medium Synchronization Delay values from different APs of the same AP MLD
>> are equal.
> 
> This feels like a recipe for failures. Whether or not an ML-Probe
> Response is seen should not generally affect the validity of a BSS.
> Which means that the validity check itself may be incorrect.
> 
> Maybe we should only compare the values if the subfield has been
> included?
> 
> 
> That said, do you understand when exactly the subfield should be
> included? It seems that it may be carried in the "(Re)Association
> Response", in which case it would be consistent anyway.
> 
> The ML probe response part seems a bit specific. Could it be that this
> is intended to be used by NSTR APs only, in which case there would not
> be a beacon on the secondary AP. If that is the case, then there
> wouldn't be much of an inconsistency (though we would still need to be
> a bit careful).
> 
> Benjamin
> 

hi,benjamin,

regarding your comment, I believe removing this validity check might be 
the easiest solution. As for the scenarios where this subfield would be 
included, I referred to the IEEE 802.11be specification. According to 
section 35.3.16.8, it states:

A STA shall not start a MediumSyncDelay timer unless the STA is one of 
the following:
— a non-AP STA affiliated with a non-AP MLD operating on an NSTR link 
pair or
— a non-AP STA affiliated with a non-AP MLD operating on an EMLSR link or
— a non-AP STA affiliated with a non-AP MLD operating on an EMLMR link or
— an AP affiliated with an NSTR mobile AP MLD operating on the 
nonprimary link of an NSTR link pair.

Regarding the statement “Could it be that this is intended to be used by 
NSTR APs only,” I did not find any explicit indication in the spec that 
the multi-link probe response carrying the medium sync delay subfield is 
solely applicable to NSTR APs.

Additionally, do you have any suggestions on how to address this issue?🙂

/lingbok

>>
>> Fixes: ccb964b4ab16 ("wifi: cfg80211: validate MLO connections better")
>> Signed-off-by: Lingbo Kong <quic_lingbok@quicinc.com>
>> ---
>>   include/net/cfg80211.h |  4 ++++
>>   net/wireless/mlme.c    | 11 +++++++----
>>   net/wireless/scan.c    | 15 +++++++++++++++
>>   3 files changed, 26 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
>> index 27acf1292a5c..ebeb305c1d0c 100644
>> --- a/include/net/cfg80211.h
>> +++ b/include/net/cfg80211.h
>> @@ -2956,6 +2956,8 @@ struct cfg80211_bss_ies {
>>    * @cannot_use_reasons: the reasons (bitmap) for not being able to connect,
>>    *	if @restrict_use is set and @use_for is zero (empty); may be 0 for
>>    *	unspecified reasons; see &enum nl80211_bss_cannot_use_reasons
>> + * @med_sync_delay: Medium Synchronization delay as described in
>> + *	IEEE 802.11be-2024 Figure 9-1074q.
>>    * @priv: private area for driver use, has at least wiphy->bss_priv_size bytes
>>    */
>>   struct cfg80211_bss {
>> @@ -2986,6 +2988,8 @@ struct cfg80211_bss {
>>   	u8 use_for;
>>   	u8 cannot_use_reasons;
>>   
>> +	u16 med_sync_delay;
>> +
>>   	u8 priv[] __aligned(sizeof(void *));
>>   };
>>   
>> diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c
>> index 4dac81854721..ae00c36779d2 100644
>> --- a/net/wireless/mlme.c
>> +++ b/net/wireless/mlme.c
>> @@ -326,7 +326,9 @@ void cfg80211_oper_and_vht_capa(struct ieee80211_vht_cap *vht_capa,
>>   }
>>   
>>   static int
>> -cfg80211_mlme_check_mlo_compat(const struct ieee80211_multi_link_elem *mle_a,
>> +cfg80211_mlme_check_mlo_compat(const struct cfg80211_bss *bss_a,
>> +			       const struct cfg80211_bss *bss_b,
>> +			       const struct ieee80211_multi_link_elem *mle_a,
>>   			       const struct ieee80211_multi_link_elem *mle_b,
>>   			       struct netlink_ext_ack *extack)
>>   {
>> @@ -340,8 +342,7 @@ cfg80211_mlme_check_mlo_compat(const struct ieee80211_multi_link_elem *mle_a,
>>   		return -EINVAL;
>>   	}
>>   
>> -	if (ieee80211_mle_get_eml_med_sync_delay((const u8 *)mle_a) !=
>> -	    ieee80211_mle_get_eml_med_sync_delay((const u8 *)mle_b)) {
>> +	if (bss_a->med_sync_delay != bss_b->med_sync_delay) {
>>   		NL_SET_ERR_MSG(extack, "link EML medium sync delay mismatch");
>>   		return -EINVAL;
>>   	}
>> @@ -426,7 +427,9 @@ static int cfg80211_mlme_check_mlo(struct net_device *dev,
>>   		if (WARN_ON(!mles[i]))
>>   			goto error;
>>   
>> -		if (cfg80211_mlme_check_mlo_compat(mles[req->link_id], mles[i],
>> +		if (cfg80211_mlme_check_mlo_compat(req->links[req->link_id].bss,
>> +						   req->links[i].bss,
>> +						   mles[req->link_id], mles[i],
>>   						   extack)) {
>>   			req->links[i].error = -EINVAL;
>>   			goto error;
>> diff --git a/net/wireless/scan.c b/net/wireless/scan.c
>> index 1c6fd45aa809..6ad935475484 100644
>> --- a/net/wireless/scan.c
>> +++ b/net/wireless/scan.c
>> @@ -1899,6 +1899,13 @@ cfg80211_update_known_bss(struct cfg80211_registered_device *rdev,
>>   	 */
>>   	if (signal_valid)
>>   		known->pub.signal = new->pub.signal;
>> +
>> +	/* update medium synchronization delay when its value is not
>> +	 * the default.
>> +	 */
>> +	if (new->pub.med_sync_delay != IEEE80211_MED_SYNC_DELAY_DEFAULT)
>> +		known->pub.med_sync_delay = new->pub.med_sync_delay;
>> +
>>   	known->pub.capability = new->pub.capability;
>>   	known->ts = new->ts;
>>   	known->ts_boottime = new->ts_boottime;
>> @@ -2224,6 +2231,7 @@ cfg80211_inform_single_bss_data(struct wiphy *wiphy,
>>   	int bss_type;
>>   	bool signal_valid;
>>   	unsigned long ts;
>> +	const struct element *ml;
>>   
>>   	if (WARN_ON(!wiphy))
>>   		return NULL;
>> @@ -2267,6 +2275,7 @@ cfg80211_inform_single_bss_data(struct wiphy *wiphy,
>>   	tmp.pub.use_for = data->use_for;
>>   	tmp.pub.cannot_use_reasons = data->cannot_use_reasons;
>>   	tmp.bss_source = data->bss_source;
>> +	tmp.pub.med_sync_delay = IEEE80211_MED_SYNC_DELAY_DEFAULT;
>>   
>>   	switch (data->bss_source) {
>>   	case BSS_SOURCE_MBSSID:
>> @@ -2321,6 +2330,12 @@ cfg80211_inform_single_bss_data(struct wiphy *wiphy,
>>   		break;
>>   	case CFG80211_BSS_FTYPE_PRESP:
>>   		rcu_assign_pointer(tmp.pub.proberesp_ies, ies);
>> +		ml = cfg80211_find_ext_elem(WLAN_EID_EXT_EHT_MULTI_LINK,
>> +					    ies->data, ies->len);
>> +		if (ml)
>> +			tmp.pub.med_sync_delay =
>> +				ieee80211_mle_get_eml_med_sync_delay(ml->data + 1);
>> +
>>   		break;
>>   	}
>>   	rcu_assign_pointer(tmp.pub.ies, ies);
>>
>> base-commit: e7e2957f403ba4655199f2ba9920c1a015a7be44
>
Johannes Berg Oct. 31, 2024, 10:47 a.m. UTC | #2
On Thu, 2024-10-31 at 10:25 +0100, Benjamin Berg wrote:
> 
> Right, which does seem to imply that a non-NSTR AP can very well set a
> Medium Delay Synchronization.

Yes, we also pass it to the firmware in iwlwifi, so definitely.

> > Regarding the statement “Could it be that this is intended to be used by 
> > NSTR APs only,” I did not find any explicit indication in the spec that 
> > the multi-link probe response carrying the medium sync delay subfield is 
> > solely applicable to NSTR APs.
> > 
> > Additionally, do you have any suggestions on how to address this issue?🙂
> 
> Not really. I don't grasp the intention of the specification here. But
> if an AP may include the field in an ML Probe Response, then we
> obviously cannot rely on the value to be consistent.
> 
> Seems to me that the option to remove the check is still the best way
> forward right now.

Agree, let's just remove it - Lingbo do you want to send a patch?

We should probably compare "extended MLD capabilities and ops" though,
haha.

johannes
diff mbox series

Patch

diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index 27acf1292a5c..ebeb305c1d0c 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -2956,6 +2956,8 @@  struct cfg80211_bss_ies {
  * @cannot_use_reasons: the reasons (bitmap) for not being able to connect,
  *	if @restrict_use is set and @use_for is zero (empty); may be 0 for
  *	unspecified reasons; see &enum nl80211_bss_cannot_use_reasons
+ * @med_sync_delay: Medium Synchronization delay as described in
+ *	IEEE 802.11be-2024 Figure 9-1074q.
  * @priv: private area for driver use, has at least wiphy->bss_priv_size bytes
  */
 struct cfg80211_bss {
@@ -2986,6 +2988,8 @@  struct cfg80211_bss {
 	u8 use_for;
 	u8 cannot_use_reasons;
 
+	u16 med_sync_delay;
+
 	u8 priv[] __aligned(sizeof(void *));
 };
 
diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c
index 4dac81854721..ae00c36779d2 100644
--- a/net/wireless/mlme.c
+++ b/net/wireless/mlme.c
@@ -326,7 +326,9 @@  void cfg80211_oper_and_vht_capa(struct ieee80211_vht_cap *vht_capa,
 }
 
 static int
-cfg80211_mlme_check_mlo_compat(const struct ieee80211_multi_link_elem *mle_a,
+cfg80211_mlme_check_mlo_compat(const struct cfg80211_bss *bss_a,
+			       const struct cfg80211_bss *bss_b,
+			       const struct ieee80211_multi_link_elem *mle_a,
 			       const struct ieee80211_multi_link_elem *mle_b,
 			       struct netlink_ext_ack *extack)
 {
@@ -340,8 +342,7 @@  cfg80211_mlme_check_mlo_compat(const struct ieee80211_multi_link_elem *mle_a,
 		return -EINVAL;
 	}
 
-	if (ieee80211_mle_get_eml_med_sync_delay((const u8 *)mle_a) !=
-	    ieee80211_mle_get_eml_med_sync_delay((const u8 *)mle_b)) {
+	if (bss_a->med_sync_delay != bss_b->med_sync_delay) {
 		NL_SET_ERR_MSG(extack, "link EML medium sync delay mismatch");
 		return -EINVAL;
 	}
@@ -426,7 +427,9 @@  static int cfg80211_mlme_check_mlo(struct net_device *dev,
 		if (WARN_ON(!mles[i]))
 			goto error;
 
-		if (cfg80211_mlme_check_mlo_compat(mles[req->link_id], mles[i],
+		if (cfg80211_mlme_check_mlo_compat(req->links[req->link_id].bss,
+						   req->links[i].bss,
+						   mles[req->link_id], mles[i],
 						   extack)) {
 			req->links[i].error = -EINVAL;
 			goto error;
diff --git a/net/wireless/scan.c b/net/wireless/scan.c
index 1c6fd45aa809..6ad935475484 100644
--- a/net/wireless/scan.c
+++ b/net/wireless/scan.c
@@ -1899,6 +1899,13 @@  cfg80211_update_known_bss(struct cfg80211_registered_device *rdev,
 	 */
 	if (signal_valid)
 		known->pub.signal = new->pub.signal;
+
+	/* update medium synchronization delay when its value is not
+	 * the default.
+	 */
+	if (new->pub.med_sync_delay != IEEE80211_MED_SYNC_DELAY_DEFAULT)
+		known->pub.med_sync_delay = new->pub.med_sync_delay;
+
 	known->pub.capability = new->pub.capability;
 	known->ts = new->ts;
 	known->ts_boottime = new->ts_boottime;
@@ -2224,6 +2231,7 @@  cfg80211_inform_single_bss_data(struct wiphy *wiphy,
 	int bss_type;
 	bool signal_valid;
 	unsigned long ts;
+	const struct element *ml;
 
 	if (WARN_ON(!wiphy))
 		return NULL;
@@ -2267,6 +2275,7 @@  cfg80211_inform_single_bss_data(struct wiphy *wiphy,
 	tmp.pub.use_for = data->use_for;
 	tmp.pub.cannot_use_reasons = data->cannot_use_reasons;
 	tmp.bss_source = data->bss_source;
+	tmp.pub.med_sync_delay = IEEE80211_MED_SYNC_DELAY_DEFAULT;
 
 	switch (data->bss_source) {
 	case BSS_SOURCE_MBSSID:
@@ -2321,6 +2330,12 @@  cfg80211_inform_single_bss_data(struct wiphy *wiphy,
 		break;
 	case CFG80211_BSS_FTYPE_PRESP:
 		rcu_assign_pointer(tmp.pub.proberesp_ies, ies);
+		ml = cfg80211_find_ext_elem(WLAN_EID_EXT_EHT_MULTI_LINK,
+					    ies->data, ies->len);
+		if (ml)
+			tmp.pub.med_sync_delay =
+				ieee80211_mle_get_eml_med_sync_delay(ml->data + 1);
+
 		break;
 	}
 	rcu_assign_pointer(tmp.pub.ies, ies);