[net] ice: fix memory leak of aRFS after resuming from suspend

Message ID 20210318081507.36287-1-yongxin.liu@windriver.com
State New

Commit Message

Yongxin Liu March 18, 2021, 8:15 a.m. UTC
In ice_suspend(), ice_clear_interrupt_scheme() is called, and
irq_free_descs() is eventually called to free each irq and its descriptor.

In ice_resume(), ice_init_interrupt_scheme() is called to allocate new irqs.
However, in ice_rebuild_arfs(), struct irq_glue and struct cpu_rmap may
not be freed if the irqs released in ice_suspend() were reassigned to
other devices, because the irq descriptor's affinity_notify is then lost.

So call ice_remove_arfs() before ice_clear_interrupt_scheme(), which
ensures that all irq_glue and cpu_rmap objects are released before the
corresponding irqs and descriptors are freed.

This fixes the following memory leak:

unreferenced object 0xffff95bd951afc00 (size 512):
  comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s)
  hex dump (first 32 bytes):
    18 00 00 00 18 00 18 00 70 fc 1a 95 bd 95 ff ff  ........p.......
    00 00 ff ff 01 00 ff ff 02 00 ff ff 03 00 ff ff  ................
  backtrace:
    [<0000000072e4b914>] __kmalloc+0x336/0x540
    [<0000000054642a87>] alloc_cpu_rmap+0x3b/0xb0
    [<00000000f220deec>] ice_set_cpu_rx_rmap+0x6a/0x110 [ice]
    [<000000002370a632>] ice_probe+0x941/0x1180 [ice]
    [<00000000d692edba>] local_pci_probe+0x47/0xa0
    [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30
    [<00000000555a9e4a>] process_one_work+0x1dd/0x410
    [<000000002c4b414a>] worker_thread+0x221/0x3f0
    [<00000000bb2b556b>] kthread+0x14c/0x170
    [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30
unreferenced object 0xffff95bd81b0a2a0 (size 96):
  comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s)
  hex dump (first 32 bytes):
    38 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00  8...............
    b0 a2 b0 81 bd 95 ff ff b0 a2 b0 81 bd 95 ff ff  ................
  backtrace:
    [<00000000582dd5c5>] kmem_cache_alloc_trace+0x31f/0x4c0
    [<000000002659850d>] irq_cpu_rmap_add+0x25/0xe0
    [<00000000495a3055>] ice_set_cpu_rx_rmap+0xb4/0x110 [ice]
    [<000000002370a632>] ice_probe+0x941/0x1180 [ice]
    [<00000000d692edba>] local_pci_probe+0x47/0xa0
    [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30
    [<00000000555a9e4a>] process_one_work+0x1dd/0x410
    [<000000002c4b414a>] worker_thread+0x221/0x3f0
    [<00000000bb2b556b>] kthread+0x14c/0x170
    [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30

Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com>
---
 drivers/net/ethernet/intel/ice/ice_arfs.c | 1 -
 drivers/net/ethernet/intel/ice/ice_main.c | 3 +++
 2 files changed, 3 insertions(+), 1 deletion(-)

Comments

Yongxin Liu March 19, 2021, 2:33 a.m. UTC | #1
> -----Original Message-----
> From: Creeley, Brett <brett.creeley@intel.com>
> Sent: Friday, March 19, 2021 06:20
> To: Liu, Yongxin <Yongxin.Liu@windriver.com>; jeffrey.t.kirsher@intel.com;
> Chittim, Madhu <madhu.chittim@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; andrewx.bowers@intel.com
> Cc: netdev@vger.kernel.org
> Subject: Re: [PATCH net] ice: fix memory leak of aRFS after resuming from
> suspend
>
> On Thu, 2021-03-18 at 16:15 +0800, Yongxin Liu wrote:
> > In ice_suspend(), ice_clear_interrupt_scheme() is called, and then
> > irq_free_descs() will be eventually called to free irq and its
> > descriptor.
> >
> > In ice_resume(), ice_init_interrupt_scheme() is called to allocate new
> > irqs.
> > However, in ice_rebuild_arfs(), struct irq_glue and struct cpu_rmap
> > maybe cannot be freed, if the irqs that released in ice_suspend() were
> > reassigned to other devices, which makes irq descriptor's
> > affinity_notify lost.
> >
> > So move ice_remove_arfs() before ice_clear_interrupt_scheme(), which
> > can make sure all irq_glue and cpu_rmap can be correctly released
> > before corresponding irq and descriptor are released.
> >
> > Fix the following memeory leak.
>
> s/memeory/memory
>
> <snip>
>
> > diff --git a/drivers/net/ethernet/intel/ice/ice_arfs.c
> > b/drivers/net/ethernet/intel/ice/ice_arfs.c
> > index 6560acd76c94..c748d0a5c7d4 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_arfs.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_arfs.c
> > @@ -654,7 +654,6 @@ void ice_rebuild_arfs(struct ice_pf *pf)
> >       if (!pf_vsi)
> >               return;
> >
> > -     ice_remove_arfs(pf);
>
> This should not be removed. Removing this would break the reset flows
> outside of the suspend/remove case.
>
> >       if (ice_set_cpu_rx_rmap(pf_vsi)) {
> >               dev_err(ice_pf_to_dev(pf), "Failed to rebuild aRFS\n");
> >               return;
> > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c
> > b/drivers/net/ethernet/intel/ice/ice_main.c
> > index 2c23c8f468a5..dba901bf2b9b 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_main.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> > @@ -4568,6 +4568,9 @@ static int __maybe_unused ice_suspend(struct
> > device *dev)
> >                       continue;
> >               ice_vsi_free_q_vectors(pf->vsi[v]);
> >       }
> > +     if (test_bit(ICE_FLAG_FD_ENA, pf->flags)) {
> > +             ice_remove_arfs(pf);
> > +     }
>
> Braces aren't needed around a single if statement like this.
>
> Also, I don't think this is the right solution. I think a better approach
> would be to call ice_free_rx_cpu_map() here. With this, it seems like no
> other changes are necessary. It also isn't necessary to check the
> ICE_FLAG_FD_ENA bit with this change.

Thanks for your valuable review. I will send V2.

--Yongxin

>
> >       ice_clear_interrupt_scheme(pf);
> >
> >       pci_save_state(pdev);
Patch

diff --git a/drivers/net/ethernet/intel/ice/ice_arfs.c b/drivers/net/ethernet/intel/ice/ice_arfs.c
index 6560acd76c94..c748d0a5c7d4 100644
--- a/drivers/net/ethernet/intel/ice/ice_arfs.c
+++ b/drivers/net/ethernet/intel/ice/ice_arfs.c
@@ -654,7 +654,6 @@  void ice_rebuild_arfs(struct ice_pf *pf)
 	if (!pf_vsi)
 		return;
 
-	ice_remove_arfs(pf);
 	if (ice_set_cpu_rx_rmap(pf_vsi)) {
 		dev_err(ice_pf_to_dev(pf), "Failed to rebuild aRFS\n");
 		return;
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 2c23c8f468a5..dba901bf2b9b 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -4568,6 +4568,9 @@  static int __maybe_unused ice_suspend(struct device *dev)
 			continue;
 		ice_vsi_free_q_vectors(pf->vsi[v]);
 	}
+	if (test_bit(ICE_FLAG_FD_ENA, pf->flags)) {
+		ice_remove_arfs(pf);
+	}
 	ice_clear_interrupt_scheme(pf);
 
 	pci_save_state(pdev);