diff mbox series

[RFC,net-next] sfc: replace in_interrupt() usage

Message ID e45d9556-2759-6f33-01a0-d1739ce5760d@solarflare.com
State New
Headers show
Series [RFC,net-next] sfc: replace in_interrupt() usage | expand

Commit Message

Edward Cree Sept. 28, 2020, 8:05 p.m. UTC
efx_ef10_try_update_nic_stats_vf() used in_interrupt() to figure out
 whether it is safe to sleep (for MCDI) or not.
The only caller from which it was not is efx_net_stats(), which can be
 invoked under dev_base_lock from net-sysfs::netstat_show().
So add a new update_stats_atomic() method to struct efx_nic_type, and
 call it from efx_net_stats(), removing the need for
 efx_ef10_try_update_nic_stats_vf() to behave differently for this case
 (which it wasn't doing correctly anyway).
For all nic_types other than EF10 VF, this method is NULL and so we
 call the regular update_stats() methods, which are happy with being
 called from atomic contexts.

Fixes: f00bf2305cab ("sfc: don't update stats on VF when called in atomic context")
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Edward Cree <ecree@solarflare.com>
---
Only compile-tested so far, because I'm waiting for my kernel to
 finish rebuilding with CONFIG_DEBUG_ATOMIC_SLEEP which I'm hoping
 is the right thing to detect the bug in the existing code.
I also wasn't quite sure how to give credit to the thorough analysis
 in the commit message of Sebastian's patch.  I don't think we have
 a Whatever-by: tag to cover that, do we?
And this doesn't include your GFP_KERNEL change, which should
 probably go in separately if you take this.

 drivers/net/ethernet/sfc/ef10.c       | 22 +++++++++++++---------
 drivers/net/ethernet/sfc/efx_common.c |  2 +-
 drivers/net/ethernet/sfc/net_driver.h |  5 +++++
 drivers/net/ethernet/sfc/nic_common.h |  7 +++++++
 4 files changed, 26 insertions(+), 10 deletions(-)

Comments

Thomas Gleixner Sept. 28, 2020, 8:24 p.m. UTC | #1
On Mon, Sep 28 2020 at 21:05, Edward Cree wrote:
> efx_ef10_try_update_nic_stats_vf() used in_interrupt() to figure out
>  whether it is safe to sleep (for MCDI) or not.
> The only caller from which it was not is efx_net_stats(), which can be
>  invoked under dev_base_lock from net-sysfs::netstat_show().
> So add a new update_stats_atomic() method to struct efx_nic_type, and
>  call it from efx_net_stats(), removing the need for
>  efx_ef10_try_update_nic_stats_vf() to behave differently for this case
>  (which it wasn't doing correctly anyway).
> For all nic_types other than EF10 VF, this method is NULL and so we
>  call the regular update_stats() methods, which are happy with being
>  called from atomic contexts.
>
> Fixes: f00bf2305cab ("sfc: don't update stats on VF when called in atomic context")
> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Edward Cree <ecree@solarflare.com>

That's much nicer.

> ---
> Only compile-tested so far, because I'm waiting for my kernel to
>  finish rebuilding with CONFIG_DEBUG_ATOMIC_SLEEP which I'm hoping
>  is the right thing to detect the bug in the existing code.
> I also wasn't quite sure how to give credit to the thorough analysis
>  in the commit message of Sebastian's patch.  I don't think we have
>  a Whatever-by: tag to cover that, do we?

Sebastian did the analysis and I did some word polishing, but the credit
surely goes to him.

Thanks,

        tglx
Martin Habets Sept. 29, 2020, 9:39 a.m. UTC | #2
On Mon, Sep 28, 2020 at 09:05:52PM +0100, Edward Cree wrote:
> efx_ef10_try_update_nic_stats_vf() used in_interrupt() to figure out
>  whether it is safe to sleep (for MCDI) or not.
> The only caller from which it was not is efx_net_stats(), which can be
>  invoked under dev_base_lock from net-sysfs::netstat_show().
> So add a new update_stats_atomic() method to struct efx_nic_type, and
>  call it from efx_net_stats(), removing the need for
>  efx_ef10_try_update_nic_stats_vf() to behave differently for this case
>  (which it wasn't doing correctly anyway).
> For all nic_types other than EF10 VF, this method is NULL and so we
>  call the regular update_stats() methods, which are happy with being
>  called from atomic contexts.
> 
> Fixes: f00bf2305cab ("sfc: don't update stats on VF when called in atomic context")
> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Edward Cree <ecree@solarflare.com>

Reviewed-by: Martin Habets <mhabets@solarflare.com>

> ---
> Only compile-tested so far, because I'm waiting for my kernel to
>  finish rebuilding with CONFIG_DEBUG_ATOMIC_SLEEP which I'm hoping
>  is the right thing to detect the bug in the existing code.
> I also wasn't quite sure how to give credit to the thorough analysis
>  in the commit message of Sebastian's patch.  I don't think we have
>  a Whatever-by: tag to cover that, do we?
> And this doesn't include your GFP_KERNEL change, which should
>  probably go in separately if you take this.
> 
>  drivers/net/ethernet/sfc/ef10.c       | 22 +++++++++++++---------
>  drivers/net/ethernet/sfc/efx_common.c |  2 +-
>  drivers/net/ethernet/sfc/net_driver.h |  5 +++++
>  drivers/net/ethernet/sfc/nic_common.h |  7 +++++++
>  4 files changed, 26 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
> index c9df2e96ebe4..b702ba5986dc 100644
> --- a/drivers/net/ethernet/sfc/ef10.c
> +++ b/drivers/net/ethernet/sfc/ef10.c
> @@ -1871,15 +1871,6 @@ static int efx_ef10_try_update_nic_stats_vf(struct efx_nic *efx)
>  
>  	spin_unlock_bh(&efx->stats_lock);
>  
> -	if (in_interrupt()) {
> -		/* If in atomic context, cannot update stats.  Just update the
> -		 * software stats and return so the caller can continue.
> -		 */
> -		spin_lock_bh(&efx->stats_lock);
> -		efx_update_sw_stats(efx, stats);
> -		return 0;
> -	}
> -
>  	efx_ef10_get_stat_mask(efx, mask);
>  
>  	rc = efx_nic_alloc_buffer(efx, &stats_buf, dma_len, GFP_ATOMIC);
> @@ -1938,6 +1929,18 @@ static size_t efx_ef10_update_stats_vf(struct efx_nic *efx, u64 *full_stats,
>  	return efx_ef10_update_stats_common(efx, full_stats, core_stats);
>  }
>  
> +static size_t efx_ef10_update_stats_atomic_vf(struct efx_nic *efx, u64 *full_stats,
> +					      struct rtnl_link_stats64 *core_stats)
> +{
> +	struct efx_ef10_nic_data *nic_data = efx->nic_data;
> +
> +	/* In atomic context, cannot update HW stats.  Just update the
> +	 * software stats and return so the caller can continue.
> +	 */
> +	efx_update_sw_stats(efx, nic_data->stats);
> +	return efx_ef10_update_stats_common(efx, full_stats, core_stats);
> +}
> +
>  static void efx_ef10_push_irq_moderation(struct efx_channel *channel)
>  {
>  	struct efx_nic *efx = channel->efx;
> @@ -3998,6 +4001,7 @@ const struct efx_nic_type efx_hunt_a0_vf_nic_type = {
>  	.finish_flr = efx_port_dummy_op_void,
>  	.describe_stats = efx_ef10_describe_stats,
>  	.update_stats = efx_ef10_update_stats_vf,
> +	.update_stats_atomic = efx_ef10_update_stats_atomic_vf,
>  	.start_stats = efx_port_dummy_op_void,
>  	.pull_stats = efx_port_dummy_op_void,
>  	.stop_stats = efx_port_dummy_op_void,
> diff --git a/drivers/net/ethernet/sfc/efx_common.c b/drivers/net/ethernet/sfc/efx_common.c
> index c256db241570..72a3f0e09f52 100644
> --- a/drivers/net/ethernet/sfc/efx_common.c
> +++ b/drivers/net/ethernet/sfc/efx_common.c
> @@ -602,7 +602,7 @@ void efx_net_stats(struct net_device *net_dev, struct rtnl_link_stats64 *stats)
>  	struct efx_nic *efx = netdev_priv(net_dev);
>  
>  	spin_lock_bh(&efx->stats_lock);
> -	efx->type->update_stats(efx, NULL, stats);
> +	efx_nic_update_stats_atomic(efx, NULL, stats);
>  	spin_unlock_bh(&efx->stats_lock);
>  }
>  
> diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
> index 47aa753e64bd..9f7dfdf708cf 100644
> --- a/drivers/net/ethernet/sfc/net_driver.h
> +++ b/drivers/net/ethernet/sfc/net_driver.h
> @@ -1172,6 +1172,9 @@ struct efx_udp_tunnel {
>   * @describe_stats: Describe statistics for ethtool
>   * @update_stats: Update statistics not provided by event handling.
>   *	Either argument may be %NULL.
> + * @update_stats_atomic: Update statistics while in atomic context, if that
> + *	is more limiting than @update_stats.  Otherwise, leave %NULL and
> + *	driver core will call @update_stats.
>   * @start_stats: Start the regular fetching of statistics
>   * @pull_stats: Pull stats from the NIC and wait until they arrive.
>   * @stop_stats: Stop the regular fetching of statistics
> @@ -1316,6 +1319,8 @@ struct efx_nic_type {
>  	size_t (*describe_stats)(struct efx_nic *efx, u8 *names);
>  	size_t (*update_stats)(struct efx_nic *efx, u64 *full_stats,
>  			       struct rtnl_link_stats64 *core_stats);
> +	size_t (*update_stats_atomic)(struct efx_nic *efx, u64 *full_stats,
> +				      struct rtnl_link_stats64 *core_stats);
>  	void (*start_stats)(struct efx_nic *efx);
>  	void (*pull_stats)(struct efx_nic *efx);
>  	void (*stop_stats)(struct efx_nic *efx);
> diff --git a/drivers/net/ethernet/sfc/nic_common.h b/drivers/net/ethernet/sfc/nic_common.h
> index 82271f0b8627..b9cafe9cd568 100644
> --- a/drivers/net/ethernet/sfc/nic_common.h
> +++ b/drivers/net/ethernet/sfc/nic_common.h
> @@ -244,6 +244,13 @@ void efx_nic_update_stats(const struct efx_hw_stat_desc *desc, size_t count,
>  			  const unsigned long *mask, u64 *stats,
>  			  const void *dma_buf, bool accumulate);
>  void efx_nic_fix_nodesc_drop_stat(struct efx_nic *efx, u64 *stat);
> +static inline size_t efx_nic_update_stats_atomic(struct efx_nic *efx, u64 *full_stats,
> +						 struct rtnl_link_stats64 *core_stats)
> +{
> +	if (efx->type->update_stats_atomic)
> +		return efx->type->update_stats_atomic(efx, full_stats, core_stats);
> +	return efx->type->update_stats(efx, full_stats, core_stats);
> +}
>  
>  #define EFX_MAX_FLUSH_TIME 5000
>
Edward Cree Sept. 29, 2020, 3:15 p.m. UTC | #3
> On Mon, Sep 28 2020 at 21:05, Edward Cree wrote:
>> Only compile-tested so far, because I'm waiting for my kernel to
>>  finish rebuilding with CONFIG_DEBUG_ATOMIC_SLEEP

I've now tested and confirmed that the might_sleep warning goes
 away with this patch.

Thomas, do you want to pull it into v2 of your series, or should
 I submit it separately to David?

-ed
Thomas Gleixner Sept. 29, 2020, 7:27 p.m. UTC | #4
On Tue, Sep 29 2020 at 16:15, Edward Cree wrote:
>> On Mon, Sep 28 2020 at 21:05, Edward Cree wrote:
>>> Only compile-tested so far, because I'm waiting for my kernel to
>>>  finish rebuilding with CONFIG_DEBUG_ATOMIC_SLEEP
>
> I've now tested and confirmed that the might_sleep warning goes
>  away with this patch.
>
> Thomas, do you want to pull it into v2 of your series, or should
>  I submit it separately to David?

I have it already, but if Dave applies it right away, that's fine.

Thanks,

        tglx
diff mbox series

Patch

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index c9df2e96ebe4..b702ba5986dc 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -1871,15 +1871,6 @@  static int efx_ef10_try_update_nic_stats_vf(struct efx_nic *efx)
 
 	spin_unlock_bh(&efx->stats_lock);
 
-	if (in_interrupt()) {
-		/* If in atomic context, cannot update stats.  Just update the
-		 * software stats and return so the caller can continue.
-		 */
-		spin_lock_bh(&efx->stats_lock);
-		efx_update_sw_stats(efx, stats);
-		return 0;
-	}
-
 	efx_ef10_get_stat_mask(efx, mask);
 
 	rc = efx_nic_alloc_buffer(efx, &stats_buf, dma_len, GFP_ATOMIC);
@@ -1938,6 +1929,18 @@  static size_t efx_ef10_update_stats_vf(struct efx_nic *efx, u64 *full_stats,
 	return efx_ef10_update_stats_common(efx, full_stats, core_stats);
 }
 
+static size_t efx_ef10_update_stats_atomic_vf(struct efx_nic *efx, u64 *full_stats,
+					      struct rtnl_link_stats64 *core_stats)
+{
+	struct efx_ef10_nic_data *nic_data = efx->nic_data;
+
+	/* In atomic context, cannot update HW stats.  Just update the
+	 * software stats and return so the caller can continue.
+	 */
+	efx_update_sw_stats(efx, nic_data->stats);
+	return efx_ef10_update_stats_common(efx, full_stats, core_stats);
+}
+
 static void efx_ef10_push_irq_moderation(struct efx_channel *channel)
 {
 	struct efx_nic *efx = channel->efx;
@@ -3998,6 +4001,7 @@  const struct efx_nic_type efx_hunt_a0_vf_nic_type = {
 	.finish_flr = efx_port_dummy_op_void,
 	.describe_stats = efx_ef10_describe_stats,
 	.update_stats = efx_ef10_update_stats_vf,
+	.update_stats_atomic = efx_ef10_update_stats_atomic_vf,
 	.start_stats = efx_port_dummy_op_void,
 	.pull_stats = efx_port_dummy_op_void,
 	.stop_stats = efx_port_dummy_op_void,
diff --git a/drivers/net/ethernet/sfc/efx_common.c b/drivers/net/ethernet/sfc/efx_common.c
index c256db241570..72a3f0e09f52 100644
--- a/drivers/net/ethernet/sfc/efx_common.c
+++ b/drivers/net/ethernet/sfc/efx_common.c
@@ -602,7 +602,7 @@  void efx_net_stats(struct net_device *net_dev, struct rtnl_link_stats64 *stats)
 	struct efx_nic *efx = netdev_priv(net_dev);
 
 	spin_lock_bh(&efx->stats_lock);
-	efx->type->update_stats(efx, NULL, stats);
+	efx_nic_update_stats_atomic(efx, NULL, stats);
 	spin_unlock_bh(&efx->stats_lock);
 }
 
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index 47aa753e64bd..9f7dfdf708cf 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -1172,6 +1172,9 @@  struct efx_udp_tunnel {
  * @describe_stats: Describe statistics for ethtool
  * @update_stats: Update statistics not provided by event handling.
  *	Either argument may be %NULL.
+ * @update_stats_atomic: Update statistics while in atomic context, if that
+ *	is more limiting than @update_stats.  Otherwise, leave %NULL and
+ *	driver core will call @update_stats.
  * @start_stats: Start the regular fetching of statistics
  * @pull_stats: Pull stats from the NIC and wait until they arrive.
  * @stop_stats: Stop the regular fetching of statistics
@@ -1316,6 +1319,8 @@  struct efx_nic_type {
 	size_t (*describe_stats)(struct efx_nic *efx, u8 *names);
 	size_t (*update_stats)(struct efx_nic *efx, u64 *full_stats,
 			       struct rtnl_link_stats64 *core_stats);
+	size_t (*update_stats_atomic)(struct efx_nic *efx, u64 *full_stats,
+				      struct rtnl_link_stats64 *core_stats);
 	void (*start_stats)(struct efx_nic *efx);
 	void (*pull_stats)(struct efx_nic *efx);
 	void (*stop_stats)(struct efx_nic *efx);
diff --git a/drivers/net/ethernet/sfc/nic_common.h b/drivers/net/ethernet/sfc/nic_common.h
index 82271f0b8627..b9cafe9cd568 100644
--- a/drivers/net/ethernet/sfc/nic_common.h
+++ b/drivers/net/ethernet/sfc/nic_common.h
@@ -244,6 +244,13 @@  void efx_nic_update_stats(const struct efx_hw_stat_desc *desc, size_t count,
 			  const unsigned long *mask, u64 *stats,
 			  const void *dma_buf, bool accumulate);
 void efx_nic_fix_nodesc_drop_stat(struct efx_nic *efx, u64 *stat);
+static inline size_t efx_nic_update_stats_atomic(struct efx_nic *efx, u64 *full_stats,
+						 struct rtnl_link_stats64 *core_stats)
+{
+	if (efx->type->update_stats_atomic)
+		return efx->type->update_stats_atomic(efx, full_stats, core_stats);
+	return efx->type->update_stats(efx, full_stats, core_stats);
+}
 
 #define EFX_MAX_FLUSH_TIME 5000