[net] macvlan: Fix the bug when destroying the macvlan dev port

Message ID 53EC7D8C.6060908@huawei.com
State New

Commit Message

Ding Tianhong Aug. 14, 2014, 9:12 a.m.
The bug was reported by Keller, Jacob E and was introduced by commit a188a54d1162
("macvlan: simplify the structure port").

--------------------------------------------------------------------------

[   80.643286] BUG: unable to handle kernel NULL pointer dereference at 0000000000000878
[   80.670103] IP: [<ffffffff810832e4>] try_to_grab_pending+0x64/0x1f0
[   80.691289] PGD 22c102067 PUD 235bf0067 PMD 0
[   80.706611] Oops: 0002 [#1] SMP
[   80.717836] Modules linked in: macvlan nfsd lockd nfs_acl exportfs auth_rpcgss sunrpc oid_registry ioatdma ixgbe(-) mdio igb dca
[   80.757935] CPU: 37 PID: 6724 Comm: rmmod Not tainted 3.16.0-net-next-08-12-2014-FCoE+ #1
[   80.785688] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.0003.041920141333 04/19/2014
[   80.820310] task: ffff880235a9eae0 ti: ffff88022e844000 task.ti: ffff88022e844000
[   80.845770] RIP: 0010:[<ffffffff810832e4>]  [<ffffffff810832e4>] try_to_grab_pending+0x64/0x1f0
[   80.875326] RSP: 0018:ffff88022e847b28  EFLAGS: 00010046
[   80.893251] RAX: 0000000000037a6a RBX: 0000000000000878 RCX: 0000000000000000
[   80.917187] RDX: ffff880235a9eae0 RSI: 0000000000000001 RDI: ffffffff810832db
[   80.941125] RBP: ffff88022e847b58 R08: 0000000000000000 R09: 0000000000000000
[   80.965056] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022e847b70
[   80.988994] R13: 0000000000000000 R14: ffff88022e847be8 R15: ffffffff81ebe440
[   81.012929] FS:  00007fab90b07700(0000) GS:ffff88043f7a0000(0000) knlGS:0000000000000000
[   81.040400] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   81.059757] CR2: 0000000000000878 CR3: 0000000235a42000 CR4: 00000000001407e0
[   81.083689] Stack:
[   81.090739]  ffff880235a9eae0 0000000000000878 ffff88022e847b70 0000000000000000
[   81.116253]  ffff88022e847be8 ffffffff81ebe440 ffff88022e847b98 ffffffff810847f1
[   81.141766]  ffff88022e847b78 0000000000000286 ffff880234200000 0000000000000000
[   81.167282] Call Trace:
[   81.175768]  [<ffffffff810847f1>] __cancel_work_timer+0x31/0x170
[   81.195985]  [<ffffffff8108494b>] cancel_work_sync+0xb/0x10
[   81.214769]  [<ffffffffa015ae68>] macvlan_port_destroy+0x28/0x60 [macvlan]
[   81.237844]  [<ffffffffa015b930>] macvlan_uninit+0x40/0x50 [macvlan]
[   81.259209]  [<ffffffff816bf6e2>] rollback_registered_many+0x1a2/0x2c0
[   81.281140]  [<ffffffff816bf81a>] unregister_netdevice_many+0x1a/0xb0
[   81.302786]  [<ffffffffa015a4ff>] macvlan_device_event+0x1ef/0x240 [macvlan]
[   81.326439]  [<ffffffff8108a13d>] notifier_call_chain+0x4d/0x70
[   81.346366]  [<ffffffff8108a201>] raw_notifier_call_chain+0x11/0x20
[   81.367439]  [<ffffffff816bf25b>] call_netdevice_notifiers_info+0x3b/0x70
[   81.390228]  [<ffffffff816bf2a1>] call_netdevice_notifiers+0x11/0x20
[   81.411587]  [<ffffffff816bf6bd>] rollback_registered_many+0x17d/0x2c0
[   81.433518]  [<ffffffff816bf925>] unregister_netdevice_queue+0x75/0x110
[   81.455735]  [<ffffffff816bfb2b>] unregister_netdev+0x1b/0x30
[   81.475094]  [<ffffffffa0039b50>] ixgbe_remove+0x170/0x1d0 [ixgbe]
[   81.495886]  [<ffffffff813512a2>] pci_device_remove+0x32/0x60
[   81.515246]  [<ffffffff814c75c4>] __device_release_driver+0x64/0xd0
[   81.536321]  [<ffffffff814c76f8>] driver_detach+0xc8/0xd0
[   81.554530]  [<ffffffff814c656e>] bus_remove_driver+0x4e/0xa0
[   81.573888]  [<ffffffff814c828b>] driver_unregister+0x2b/0x60
[   81.593246]  [<ffffffff8135143e>] pci_unregister_driver+0x1e/0xa0
[   81.613749]  [<ffffffffa005db18>] ixgbe_exit_module+0x1c/0x2e [ixgbe]
[   81.635401]  [<ffffffff810e738b>] SyS_delete_module+0x15b/0x1e0
[   81.655334]  [<ffffffff8187a395>] ? sysret_check+0x22/0x5d
[   81.673833]  [<ffffffff810abd2d>] ? trace_hardirqs_on_caller+0x11d/0x1e0
[   81.696339]  [<ffffffff8132bfde>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[   81.717985]  [<ffffffff8187a369>] system_call_fastpath+0x16/0x1b
[   81.738199] Code: 00 48 83 3d 6e bb da 00 00 48 89 c2 0f 84 67 01 00 00 fa 66 0f 1f 44 00 00 49 89 14 24 e8 b5 4b 02 00 45 84 ed 0f 85 ac 00 00 00 <f0> 0f ba 2b 00 72 1d 31 c0 48 8b 5d d8 4c 8b 65 e0 4c 8b 6d e8
[   81.807026] RIP  [<ffffffff810832e4>] try_to_grab_pending+0x64/0x1f0
[   81.828468]  RSP <ffff88022e847b28>
[   81.840384] CR2: 0000000000000878
[   81.851731] ---[ end trace 9f6c7232e3464e11 ]---

------------------------------------------------------------

This bug can be triggered by the following steps:

modprobe ixgbe ; modprobe macvlan
ip link add link p96p1 address 00:1B:21:6E:06:00 macvlan0 type macvlan
ip link add link p96p1 address 00:1B:21:6E:06:01 macvlan1 type macvlan
ip link add link p96p1 address 00:1B:21:6E:06:02 macvlan2 type macvlan
ip link add link p96p1 address 00:1B:21:6E:06:03 macvlan3 type macvlan
rmmod ixgbe

The reason is that when the ixgbe driver is removed, macvlan_uninit() is
called multiple times because the lowerdev is being destroyed. The first
call to macvlan_uninit() already destroys the port for the macvlan, so when
the later macvlan_uninit() calls run, the port no longer exists and the
kernel panics.

To fix this problem, we need to check whether the port exists when calling
macvlan_port_destroy(), and do nothing if the port has already been destroyed.

Reported-by: "Keller, Jacob E" <jacob.e.keller@intel.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 drivers/net/macvlan.c | 3 +++
 1 file changed, 3 insertions(+)
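
The failure described in the commit message can be modelled in user space.
This is only a sketch under stated assumptions, not the driver code:
struct model_port, its fields and the model_*() helpers are hypothetical
names, and the model assumes that every macvlan device is taken off the
port's list before any uninit call runs (which is consistent with the
unregister_netdevice_many() frame in the trace above). Instead of
dereferencing a NULL port, the model counts the destroy attempts that arrive
after the port is already gone.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-lowerdev port state. */
struct model_port {
	int listed;	/* devices still on the port's list */
	bool alive;	/* false once the port has been torn down */
	int bad_calls;	/* destroy attempts made after teardown */
};

static void model_port_destroy(struct model_port *p)
{
	if (!p->alive) {
		/* In the report, this is where the NULL port was used. */
		p->bad_calls++;
		return;
	}
	p->alive = false;
}

/* Models dellink: the device leaves the port's list. */
static void model_dellink(struct model_port *p)
{
	p->listed--;
}

/* Models uninit when "list is empty" decides whether to destroy the port. */
static void model_uninit(struct model_port *p)
{
	if (p->listed == 0)
		model_port_destroy(p);
}

/* Returns how many destroy attempts hit an already destroyed port. */
int simulate_lowerdev_removal(int ndevs)
{
	struct model_port p = { .listed = ndevs, .alive = true, .bad_calls = 0 };
	int i;

	assert(ndevs > 0);
	for (i = 0; i < ndevs; i++)	/* all devices are unlinked first */
		model_dellink(&p);
	for (i = 0; i < ndevs; i++)	/* then each device is uninitialised */
		model_uninit(&p);
	return p.bad_calls;
}
```

With four devices, as in the reproduction steps, the first uninit destroys
the port and the remaining three each reach the destroy path again. With a
single device there is no second call, so the problem does not show up.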

Comments

Cong Wang Aug. 14, 2014, 4:58 p.m. | #1
On Thu, Aug 14, 2014 at 2:12 AM, Ding Tianhong <dingtianhong@huawei.com> wrote:
>
> To fix this problem, we need to check whether the port exists when calling
> macvlan_port_destroy(), and do nothing if the port has already been destroyed.
>

As I said, this will make the first call of macvlan_port_destroy()
free the port,
which was freed by the last one before.

Why not just revert your change? It doesn't give any benefit.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Aug. 14, 2014, 5:38 p.m. | #2
On Thu, 2014-08-14 at 09:58 -0700, Cong Wang wrote:
> On Thu, Aug 14, 2014 at 2:12 AM, Ding Tianhong <dingtianhong@huawei.com> wrote:
> >
> > To fix this problem, we need to check whether the port exists when calling
> > macvlan_port_destroy(), and do nothing if the port has already been destroyed.
> >
> 
> As I said, this will make the first call of macvlan_port_destroy()
> free the port,
> which was freed by the last one before.
> 
> Why not just revert your change? It doesn't give any benefit.
> --

Yes please, I think patch should be reverted.


Keller, Jacob E Aug. 14, 2014, 8:32 p.m. | #3
On Thu, 2014-08-14 at 10:38 -0700, Eric Dumazet wrote:
> On Thu, 2014-08-14 at 09:58 -0700, Cong Wang wrote:
> > On Thu, Aug 14, 2014 at 2:12 AM, Ding Tianhong <dingtianhong@huawei.com> wrote:
> > >
> > > To fix this problem, we need to check whether the port exists when calling
> > > macvlan_port_destroy(), and do nothing if the port has already been destroyed.
> > >
> >
> > As I said, this will make the first call of macvlan_port_destroy()
> > free the port,
> > which was freed by the last one before.
> >
> > Why not just revert your change? It doesn't give any benefit.
> > --
>
> Yes please, I think patch should be reverted.
>

As far as I can tell from the original patch, the only advantage it has
is the removal of a single int from a structure size. I'm not sure
that's very advantageous.

The other fix I saw that might work is to move the list_del so that vlans is
emptied inside uninit instead of dellink. This would be the same place
that count was decremented. However, I think this has many potential
side effects, and is not worth the saving of a single integer's worth of
space from a structure.

Thanks,
Jake
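
The counter-based teardown that a revert would restore, as described above,
can be sketched in the same user-space style. This is an illustration only:
struct counted_port, its fields and the counted_*() helpers are hypothetical
names, and the only behaviour taken from the thread is that the count is
decremented in uninit and the port is destroyed when it reaches zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a port that carries an explicit counter. */
struct counted_port {
	int count;		/* devices initialised and not yet uninitialised */
	bool alive;		/* false once the port has been torn down */
	int destroy_calls;	/* how many times teardown actually ran */
};

/* Models init: each device attached to the port takes one count. */
static void counted_init(struct counted_port *p)
{
	p->count++;
}

/* Models uninit: only the last device to leave destroys the port. */
static void counted_uninit(struct counted_port *p)
{
	assert(p->count > 0);
	if (--p->count == 0) {
		p->alive = false;
		p->destroy_calls++;
	}
}

/* Returns how many times the port was destroyed for ndevs devices. */
int destroy_calls_with_counter(int ndevs)
{
	struct counted_port p = { .count = 0, .alive = true, .destroy_calls = 0 };
	int i;

	assert(ndevs > 0);
	for (i = 0; i < ndevs; i++)
		counted_init(&p);
	for (i = 0; i < ndevs; i++)
		counted_uninit(&p);
	return p.destroy_calls;
}
```

Because the counter does not depend on when a device leaves the list, the
port in this model is destroyed exactly once, by the last uninit, however
many devices were attached.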
David Miller Aug. 14, 2014, 9:32 p.m. | #4
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 14 Aug 2014 10:38:30 -0700

> On Thu, 2014-08-14 at 09:58 -0700, Cong Wang wrote:
>> On Thu, Aug 14, 2014 at 2:12 AM, Ding Tianhong <dingtianhong@huawei.com> wrote:
>> >
>> > To fix this problem, we need to check whether the port exists when calling
>> > macvlan_port_destroy(), and do nothing if the port has already been destroyed.
>> >
>> 
>> As I said, this will make the first call of macvlan_port_destroy()
>> free the port,
>> which was freed by the last one before.
>> 
>> Why not just revert your change? It doesn't give any benefit.
>> --
> 
> Yes please, I think patch should be reverted.

I agree, this should be reverted, I'm going to do it myself.

 
Ding Tianhong Aug. 15, 2014, 1:41 a.m. | #5
On 2014/8/15 5:32, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Thu, 14 Aug 2014 10:38:30 -0700
> 
>> On Thu, 2014-08-14 at 09:58 -0700, Cong Wang wrote:
>>> On Thu, Aug 14, 2014 at 2:12 AM, Ding Tianhong <dingtianhong@huawei.com> wrote:
>>>>
>>>> To fix this problem, we need to check whether the port exists when calling
>>>> macvlan_port_destroy(), and do nothing if the port has already been destroyed.
>>>>
>>>
>>> As I said, this will make the first call of macvlan_port_destroy()
>>> free the port,
>>> which was freed by the last one before.
>>>
>>> Why not just revert your change? It doesn't give any benefit.
>>> --
>>
>> Yes please, I think patch should be reverted.
> 
> I agree, this should be reverted, I'm going to do it myself.
> 
>  

Ok, sorry to trouble you.

> 
> .
> 



Patch

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index ef8a5c2..182f31b 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -933,6 +933,9 @@  static void macvlan_port_destroy(struct net_device *dev)
 {
 	struct macvlan_port *port = macvlan_port_get_rtnl(dev);
 
+	if (!macvlan_port_exists(dev))
+		return;
+
 	cancel_work_sync(&port->bc_work);
 	dev->priv_flags &= ~IFF_MACVLAN_PORT;
 	netdev_rx_handler_unregister(dev);