[1/1] net: Allow all multicast packets to be received on an interface.

Message ID 20210617095020.28628-2-callum.sinclair@alliedtelesis.co.nz
State New

Commit Message

Callum Sinclair June 17, 2021, 9:50 a.m. UTC
To receive IGMP or MLD packets on an IP socket on any interface, the
multicast group needs to be explicitly joined. This works well when
the multicast group the user is interested in is known, but does not
provide an easy way to snoop all packets in the 224.0.0.0/8 or the
FF00::/8 range.

Define a new sysctl to allow a given interface to become an IGMP or MLD
snooper. When set, the interface will allow any IGMP or MLD packet to be
received on sockets bound to that interface.
---
 Documentation/networking/ip-sysctl.rst |  8 ++++++++
 include/linux/inetdevice.h             |  1 +
 include/linux/ipv6.h                   |  1 +
 include/uapi/linux/ip.h                |  1 +
 include/uapi/linux/ipv6.h              |  1 +
 include/uapi/linux/netconf.h           |  1 +
 include/uapi/linux/sysctl.h            |  1 +
 net/ipv4/devinet.c                     |  7 +++++++
 net/ipv4/igmp.c                        |  5 +++++
 net/ipv6/addrconf.c                    | 14 ++++++++++++++
 net/ipv6/mcast.c                       |  5 +++++
 11 files changed, 45 insertions(+)

Comments

Linus Lüssing June 17, 2021, 12:33 p.m. UTC | #1
Hi Callum,

On Thu, Jun 17, 2021 at 09:50:20PM +1200, Callum Sinclair wrote:
> +mc_snooping - BOOLEAN
> +	Enable multicast snooping on the interface. This allows any given
> +	multicast group to be received without explicitly being joined.
> +	The kernel needs to be compiled with CONFIG_MROUTE and/or
> +	CONFIG_IPV6_MROUTE.
> +	conf/all/mc_snooping must also be set to TRUE to enable multicast
> +	snooping for the interface.
> +

Generally this sounds like a useful feature. One note: when
snooping bridges/switches are involved, you might not receive all
multicast packets, because without IGMP/MLD reports from your host
the snooping switches won't forward the traffic to you.

In that case, to conform to RFC4541 you would also need to become
the selected IGMP/MLD querier and send IGMP/MLD query messages. Or
better, you'd need to send Multicast Router Advertisements
(RFC4286). The latter is the recommended, more flexible way but
might not be supported by all multicast snooping switches yet.
The Linux bridge supports this.

There is a userspace tool called mrdisc that you can use for MRD
Advertisements, though: https://github.com/troglobit/mrdisc. So there is
no need to implement MRD Advertisements in the kernel with this patch
(though I could imagine that having MRD support out-of-the-box with this
option might be a useful feature). For now, it might be nice to note in
the mc_snooping description that an IGMP/MLD querier or MRD
Advertisements are needed when IGMP/MLD snooping switches are involved,
to avoid potential confusion later.


I'm also wondering if it could be useful to configure it via
setsockopt() per application instead of per device via sysctl, either by
adding a new socket option or by allowing the "any" IP address
0.0.0.0 / :: with IP_ADD_MEMBERSHIP/IPV6_JOIN_GROUP, so that you
could use this, for instance:

$ socat -u UDP6-RECV:1234,reuseaddr,ipv6-join-group="[::]:eth0" -
(currently :: fails with "Invalid argument")
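
For comparison, the explicit per-group join that is required today can be
sketched as below. This is a minimal illustration and not part of the
patch: fill_mreqn() and join_group() are hypothetical helper names, and
the sketch assumes glibc on Linux, where struct ip_mreqn is available.
Note that it rejects 0.0.0.0, since only a real multicast group can be
joined this way.

```c
#define _GNU_SOURCE            /* for struct ip_mreqn on glibc */
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill an ip_mreqn for `group` on interface `ifindex`.
 * Returns 0 on success, -1 if `group` is not an IPv4 multicast address. */
int fill_mreqn(struct ip_mreqn *mreq, const char *group, int ifindex)
{
	assert(mreq != NULL && group != NULL);
	memset(mreq, 0, sizeof(*mreq));
	if (inet_pton(AF_INET, group, &mreq->imr_multiaddr) != 1)
		return -1;
	/* IN_MULTICAST() expects the address in host byte order */
	if (!IN_MULTICAST(ntohl(mreq->imr_multiaddr.s_addr)))
		return -1;
	mreq->imr_address.s_addr = htonl(INADDR_ANY); /* select device by index */
	mreq->imr_ifindex = ifindex;
	return 0;
}

/* Hypothetical helper: join one group on an existing AF_INET socket.
 * Returns the result of setsockopt(), or -1 for an invalid group. */
int join_group(int fd, const char *group, int ifindex)
{
	struct ip_mreqn mreq;

	if (fill_mreqn(&mreq, group, ifindex) < 0)
		return -1;
	return setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP,
			  &mreq, sizeof(mreq));
}
```

A caller would pass the index returned by if_nametoindex("eth0") and one
call per group of interest, which is exactly the limitation the patch
description is about.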

I'm not sure however what the requirements for adding or extending
socket options are, if there are some POSIX standards that'd need
to be followed for compatibility with other OSes, for instance.


Hm, actually, I just noticed that there seem to be some multicast
related setsockopt()s already:

- PACKET_MR_PROMISC
- PACKET_MR_MULTICAST
- PACKET_MR_ALLMULTI

The last one seems to be what you are looking for, I think, the
manpage here says:

"PACKET_MR_ALLMULTI sets the socket up to receive all multicast
packets arriving at the interface"
https://www.man7.org/linux/man-pages/man7/packet.7.html

Or would you prefer to be able to use a layer 3 IP instead of
a layer 2 packet socket?
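
A minimal sketch of the packet-socket route, following packet(7): the
helper names fill_allmulti() and open_allmulti_socket() are made up for
illustration, and opening the socket is assumed to require CAP_NET_RAW.
A socket opened with ETH_P_ALL also sees non-multicast frames, so a real
program would still filter what it reads.

```c
#include <arpa/inet.h>        /* htons */
#include <assert.h>
#include <linux/if_packet.h>  /* struct packet_mreq, PACKET_MR_ALLMULTI */
#include <net/ethernet.h>     /* ETH_P_ALL */
#include <net/if.h>           /* if_nametoindex */
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: request "all multicast" on one interface. */
void fill_allmulti(struct packet_mreq *mr, int ifindex)
{
	assert(mr != NULL);
	memset(mr, 0, sizeof(*mr));
	mr->mr_ifindex = ifindex;
	mr->mr_type = PACKET_MR_ALLMULTI;
}

/* Hypothetical helper: open a layer 2 socket and enable PACKET_MR_ALLMULTI
 * on `ifname`. Returns the socket, or -1 on any failure. */
int open_allmulti_socket(const char *ifname)
{
	struct packet_mreq mr;
	unsigned int ifindex = if_nametoindex(ifname);
	int fd;

	if (ifindex == 0)
		return -1;              /* no such interface */
	fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
	if (fd < 0)
		return -1;              /* typically EPERM without CAP_NET_RAW */
	fill_allmulti(&mr, (int)ifindex);
	if (setsockopt(fd, SOL_PACKET, PACKET_ADD_MEMBERSHIP,
		       &mr, sizeof(mr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```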

Regards, Linus

Andrew Lunn June 17, 2021, 2:18 p.m. UTC | #2
On Thu, Jun 17, 2021 at 09:50:20PM +1200, Callum Sinclair wrote:
> To receive IGMP or MLD packets on a IP socket on any interface the
> multicast group needs to be explicitly joined. This works well for when
> the multicast group the user is interested in is known, but does not
> provide an easy way to snoop all packets in the 224.0.0.0/8 or the
> FF00::/8 range.
> 
> Define a new sysctl to allow a given interface to become a IGMP or MLD
> snooper. When set the interface will allow any IGMP or MLD packet to be
> received on sockets bound to these devices.

Hi Callum

What is the big picture here? Are you trying to move the snooping
algorithm into user space? User space will then add/remove Multicast
FIB entries to the bridge to control where multicast frames are sent?

In the past I have written a multicast routing daemon. It is a similar
problem. You need access to all the joins/leaves. But the stack does
provide them if you bind to the multicast routing socket. Why not use
that mechanism? Look in the mrouted sources for an example.
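
A rough sketch of that mechanism, in the style multicast routing daemons
use: a raw IGMP socket that is registered as the multicast routing
socket with MRT_INIT. The helper names are hypothetical, and the sketch
assumes a kernel built with CONFIG_IP_MROUTE and a caller with
CAP_NET_ADMIN. Without either, the setsockopt() call fails and the
helper returns -1.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/mroute.h>     /* MRT_INIT, MRT_DONE */

/* Hypothetical helper: open a raw IGMP socket and register it as the
 * IPv4 multicast routing socket. Returns the socket, or -1 on failure. */
int open_mroute_socket(void)
{
	int one = 1;
	int fd = socket(AF_INET, SOCK_RAW, IPPROTO_IGMP);

	if (fd < 0)
		return -1;      /* raw sockets need privileges */
	if (setsockopt(fd, IPPROTO_IP, MRT_INIT, &one, sizeof(one)) < 0) {
		close(fd);      /* e.g. no CONFIG_IP_MROUTE, or already in use */
		return -1;
	}
	return fd;
}

/* Hypothetical helper: unregister the routing socket and close it. */
void close_mroute_socket(int fd)
{
	assert(fd >= 0);
	setsockopt(fd, IPPROTO_IP, MRT_DONE, NULL, 0);
	close(fd);
}
```

The IGMP messages would then be read from the returned socket with
recvmsg(), as the daemon sources referenced above do.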

     Andrew

kernel test robot June 17, 2021, 8:17 p.m. UTC | #3
Hi Callum,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.13-rc6 next-20210617]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Callum-Sinclair/Create-multicast-snooping-sysctl-option/20210617-175212
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 70585216fe7730d9fb5453d3e2804e149d0fe201
config: x86_64-rhel-8.3-kselftests (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/4220b6837f4315ff557bd44f7aada23b69e181b6
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Callum-Sinclair/Create-multicast-snooping-sysctl-option/20210617-175212
        git checkout 4220b6837f4315ff557bd44f7aada23b69e181b6
        # save the attached .config to linux build tree
        make W=1 ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from net/ipv4/devinet.c:47:
   net/ipv4/devinet.c: In function 'inet_netconf_fill_devconf':
>> include/linux/inetdevice.h:53:45: error: 'IPV4_DEVCONF_NETCONFA_MC_SNOOPING' undeclared (first use in this function); did you mean 'IPV4_DEVCONF_MC_SNOOPING'?
      53 | #define IPV4_DEVCONF(cnf, attr) ((cnf).data[IPV4_DEVCONF_ ## attr - 1])
         |                                             ^~~~~~~~~~~~~
   net/ipv4/devinet.c:2069:4: note: in expansion of macro 'IPV4_DEVCONF'
    2069 |    IPV4_DEVCONF(*devconf, NETCONFA_MC_SNOOPING)) < 0)
         |    ^~~~~~~~~~~~
   include/linux/inetdevice.h:53:45: note: each undeclared identifier is reported only once for each function it appears in
      53 | #define IPV4_DEVCONF(cnf, attr) ((cnf).data[IPV4_DEVCONF_ ## attr - 1])
         |                                             ^~~~~~~~~~~~~
   net/ipv4/devinet.c:2069:4: note: in expansion of macro 'IPV4_DEVCONF'
    2069 |    IPV4_DEVCONF(*devconf, NETCONFA_MC_SNOOPING)) < 0)
         |    ^~~~~~~~~~~~


vim +53 include/linux/inetdevice.h

^1da177e4c3f41 Linus Torvalds    2005-04-16  52  
02291680ffba92 Eric W. Biederman 2010-02-14 @53  #define IPV4_DEVCONF(cnf, attr) ((cnf).data[IPV4_DEVCONF_ ## attr - 1])
586f12115264b7 Pavel Emelyanov   2007-12-16  54  #define IPV4_DEVCONF_ALL(net, attr) \
586f12115264b7 Pavel Emelyanov   2007-12-16  55  	IPV4_DEVCONF((*(net)->ipv4.devconf_all), attr)
42f811b8bcdf66 Herbert Xu        2007-06-04  56  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

kernel test robot June 17, 2021, 9:14 p.m. UTC | #4
Hi Callum,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.13-rc6 next-20210617]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Callum-Sinclair/Create-multicast-snooping-sysctl-option/20210617-175212
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 70585216fe7730d9fb5453d3e2804e149d0fe201
config: m68k-allmodconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/4220b6837f4315ff557bd44f7aada23b69e181b6
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Callum-Sinclair/Create-multicast-snooping-sysctl-option/20210617-175212
        git checkout 4220b6837f4315ff557bd44f7aada23b69e181b6
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=m68k 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from net/ipv4/devinet.c:47:
   net/ipv4/devinet.c: In function 'inet_netconf_fill_devconf':
>> include/linux/inetdevice.h:53:45: error: 'IPV4_DEVCONF_NETCONFA_MC_SNOOPING' undeclared (first use in this function); did you mean 'IPV4_DEVCONF_MC_SNOOPING'?
      53 | #define IPV4_DEVCONF(cnf, attr) ((cnf).data[IPV4_DEVCONF_ ## attr - 1])
         |                                             ^~~~~~~~~~~~~
   net/ipv4/devinet.c:2069:4: note: in expansion of macro 'IPV4_DEVCONF'
    2069 |    IPV4_DEVCONF(*devconf, NETCONFA_MC_SNOOPING)) < 0)
         |    ^~~~~~~~~~~~
   include/linux/inetdevice.h:53:45: note: each undeclared identifier is reported only once for each function it appears in
      53 | #define IPV4_DEVCONF(cnf, attr) ((cnf).data[IPV4_DEVCONF_ ## attr - 1])
         |                                             ^~~~~~~~~~~~~
   net/ipv4/devinet.c:2069:4: note: in expansion of macro 'IPV4_DEVCONF'
    2069 |    IPV4_DEVCONF(*devconf, NETCONFA_MC_SNOOPING)) < 0)
         |    ^~~~~~~~~~~~


vim +53 include/linux/inetdevice.h

^1da177e4c3f41 Linus Torvalds    2005-04-16  52  
02291680ffba92 Eric W. Biederman 2010-02-14 @53  #define IPV4_DEVCONF(cnf, attr) ((cnf).data[IPV4_DEVCONF_ ## attr - 1])
586f12115264b7 Pavel Emelyanov   2007-12-16  54  #define IPV4_DEVCONF_ALL(net, attr) \
586f12115264b7 Pavel Emelyanov   2007-12-16  55  	IPV4_DEVCONF((*(net)->ipv4.devconf_all), attr)
42f811b8bcdf66 Herbert Xu        2007-06-04  56  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

Callum Sinclair June 18, 2021, 12:07 a.m. UTC | #5
Hi Linus 

> I'm also wondering if it could be useful to configure it via
> setsockopt() per application instead of per device via sysctl. Either by
> adding a new socket option. Or by allowing the any IP address
> 0.0.0.0 / :: with IP_ADD_MEMBERSHIP/IPV6_JOIN_GROUP. So that you
> could use this for instance:

Yes, perhaps this would be a better way to get multicast snooping working with the existing
options. I can see that with a multicast routing IP socket I will receive all IGMP and MLD
data. I was just not creating the socket as a multicast routing socket.

> Or would you prefer to be able to use a layer 3 IP instead of
> a layer 2 packet socket?

Yes, I would prefer to use an L3 IP socket instead of an L2 packet socket. This is to have
access to the additional data from IP_PKTINFO.
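
To illustrate what IP_PKTINFO adds on an L3 socket, here is a minimal
sketch that enables the option and extracts the struct in_pktinfo
(receiving interface index and destination address) from the control
data returned by recvmsg(). The helper names are hypothetical and the
sketch assumes glibc on Linux.

```c
#define _GNU_SOURCE            /* for struct in_pktinfo on glibc */
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: ask the kernel to attach an IP_PKTINFO control
 * message to every datagram received on `fd`. */
int enable_pktinfo(int fd)
{
	int on = 1;

	return setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on));
}

/* Hypothetical helper: copy the IP_PKTINFO payload out of a msghdr that
 * recvmsg() has filled in. Returns 0 if found, -1 otherwise. */
int find_pktinfo(struct msghdr *msg, struct in_pktinfo *out)
{
	struct cmsghdr *cmsg;

	assert(msg != NULL && out != NULL);
	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level == IPPROTO_IP &&
		    cmsg->cmsg_type == IP_PKTINFO) {
			/* memcpy avoids alignment assumptions on CMSG_DATA */
			memcpy(out, CMSG_DATA(cmsg), sizeof(*out));
			return 0;
		}
	}
	return -1;
}
```

After a successful lookup, out->ipi_ifindex names the interface the
packet arrived on, which a layer 2 packet socket would report through a
different interface (its sockaddr_ll) rather than through IP_PKTINFO.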

Cheers
Callum

Patch

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index a5c250044500..12f82da52684 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1357,6 +1357,14 @@  mc_forwarding - BOOLEAN
 	conf/all/mc_forwarding must also be set to TRUE to enable multicast
 	routing	for the interface
 
+mc_snooping - BOOLEAN
+	Enable multicast snooping on the interface. This allows any given
+	multicast group to be received without explicitly being joined.
+	The kernel needs to be compiled with CONFIG_MROUTE and/or
+	CONFIG_IPV6_MROUTE.
+	conf/all/mc_snooping must also be set to TRUE to enable multicast
+	snooping for the interface.
+
 medium_id - INTEGER
 	Integer value used to differentiate the devices by the medium they
 	are attached to. Two devices can have different id values when
diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
index 53aa0343bf69..071edf7d4f9c 100644
--- a/include/linux/inetdevice.h
+++ b/include/linux/inetdevice.h
@@ -95,6 +95,7 @@  static inline void ipv4_devconf_setall(struct in_device *in_dev)
 
 #define IN_DEV_FORWARD(in_dev)		IN_DEV_CONF_GET((in_dev), FORWARDING)
 #define IN_DEV_MFORWARD(in_dev)		IN_DEV_ANDCONF((in_dev), MC_FORWARDING)
+#define IN_DEV_MSNOOPING(in_dev)	IN_DEV_ANDCONF((in_dev), MC_SNOOPING)
 #define IN_DEV_BFORWARD(in_dev)		IN_DEV_ANDCONF((in_dev), BC_FORWARDING)
 #define IN_DEV_RPFILTER(in_dev)		IN_DEV_MAXCONF((in_dev), RP_FILTER)
 #define IN_DEV_SRC_VMARK(in_dev)    	IN_DEV_ORCONF((in_dev), SRC_VMARK)
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 70b2ad3b9884..d88c34b1b3ae 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -52,6 +52,7 @@  struct ipv6_devconf {
 #endif
 #ifdef CONFIG_IPV6_MROUTE
 	__s32		mc_forwarding;
+	__s32		mc_snooping;
 #endif
 	__s32		disable_ipv6;
 	__s32		drop_unicast_in_l2_multicast;
diff --git a/include/uapi/linux/ip.h b/include/uapi/linux/ip.h
index e42d13b55cf3..07956b4613d0 100644
--- a/include/uapi/linux/ip.h
+++ b/include/uapi/linux/ip.h
@@ -169,6 +169,7 @@  enum
 	IPV4_DEVCONF_DROP_UNICAST_IN_L2_MULTICAST,
 	IPV4_DEVCONF_DROP_GRATUITOUS_ARP,
 	IPV4_DEVCONF_BC_FORWARDING,
+	IPV4_DEVCONF_MC_SNOOPING,
 	__IPV4_DEVCONF_MAX
 };
 
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index 70603775fe91..aa9389e1c1fd 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -190,6 +190,7 @@  enum {
 	DEVCONF_NDISC_TCLASS,
 	DEVCONF_RPL_SEG_ENABLED,
 	DEVCONF_RA_DEFRTR_METRIC,
+	DEVCONF_MC_SNOOPING,
 	DEVCONF_MAX
 };
 
diff --git a/include/uapi/linux/netconf.h b/include/uapi/linux/netconf.h
index fac4edd55379..5259742a700b 100644
--- a/include/uapi/linux/netconf.h
+++ b/include/uapi/linux/netconf.h
@@ -19,6 +19,7 @@  enum {
 	NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN,
 	NETCONFA_INPUT,
 	NETCONFA_BC_FORWARDING,
+	NETCONFA_MC_SNOOPING,
 	__NETCONFA_MAX
 };
 #define NETCONFA_MAX	(__NETCONFA_MAX - 1)
diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
index 1e05d3caa712..1b7be9dc78de 100644
--- a/include/uapi/linux/sysctl.h
+++ b/include/uapi/linux/sysctl.h
@@ -482,6 +482,7 @@  enum
 	NET_IPV4_CONF_PROMOTE_SECONDARIES=20,
 	NET_IPV4_CONF_ARP_ACCEPT=21,
 	NET_IPV4_CONF_ARP_NOTIFY=22,
+	NET_IPV4_CONF_MC_SNOOPING=23,
 };
 
 /* /proc/sys/net/ipv4/netfilter */
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 50deeff48c8b..3e4ac6aead9d 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -2014,6 +2014,8 @@  static int inet_netconf_msgsize_devconf(int type)
 		size += nla_total_size(4);
 	if (all || type == NETCONFA_MC_FORWARDING)
 		size += nla_total_size(4);
+	if (all || type == NETCONFA_MC_SNOOPING)
+		size += nla_total_size(4);
 	if (all || type == NETCONFA_BC_FORWARDING)
 		size += nla_total_size(4);
 	if (all || type == NETCONFA_PROXY_NEIGH)
@@ -2062,6 +2064,10 @@  static int inet_netconf_fill_devconf(struct sk_buff *skb, int ifindex,
 	    nla_put_s32(skb, NETCONFA_MC_FORWARDING,
 			IPV4_DEVCONF(*devconf, MC_FORWARDING)) < 0)
 		goto nla_put_failure;
+	if ((all || type == NETCONFA_MC_SNOOPING) &&
+	    nla_put_s32(skb, NETCONFA_MC_SNOOPING,
+			IPV4_DEVCONF(*devconf, NETCONFA_MC_SNOOPING)) < 0)
+		goto nla_put_failure;
 	if ((all || type == NETCONFA_BC_FORWARDING) &&
 	    nla_put_s32(skb, NETCONFA_BC_FORWARDING,
 			IPV4_DEVCONF(*devconf, BC_FORWARDING)) < 0)
@@ -2506,6 +2512,7 @@  static struct devinet_sysctl_table {
 		DEVINET_SYSCTL_COMPLEX_ENTRY(FORWARDING, "forwarding",
 					     devinet_sysctl_forward),
 		DEVINET_SYSCTL_RO_ENTRY(MC_FORWARDING, "mc_forwarding"),
+		DEVINET_SYSCTL_RW_ENTRY(MC_SNOOPING, "mc_snooping"),
 		DEVINET_SYSCTL_RW_ENTRY(BC_FORWARDING, "bc_forwarding"),
 
 		DEVINET_SYSCTL_RW_ENTRY(ACCEPT_REDIRECTS, "accept_redirects"),
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 7b272bbed2b4..cd5a837dfb0c 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -2692,6 +2692,11 @@  int ip_check_mc_rcu(struct in_device *in_dev, __be32 mc_addr, __be32 src_addr, u
 	struct ip_sf_list *psf;
 	int rv = 0;
 
+#ifdef CONFIG_IP_MROUTE
+	if (IN_DEV_MSNOOPING(in_dev))
+		return 1;
+#endif /* CONFIG_IP_MROUTE */
+
 	mc_hash = rcu_dereference(in_dev->mc_hash);
 	if (mc_hash) {
 		u32 hash = hash_32((__force u32)mc_addr, MC_HASH_SZ_LOG);
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 048570900fdf..b92ac4e8f37d 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -502,6 +502,8 @@  static int inet6_netconf_msgsize_devconf(int type)
 #ifdef CONFIG_IPV6_MROUTE
 	if (all || type == NETCONFA_MC_FORWARDING)
 		size += nla_total_size(4);
+	if (all || type == NETCONFA_MC_SNOOPING)
+		size += nla_total_size(4);
 #endif
 	if (all || type == NETCONFA_PROXY_NEIGH)
 		size += nla_total_size(4);
@@ -546,6 +548,10 @@  static int inet6_netconf_fill_devconf(struct sk_buff *skb, int ifindex,
 	    nla_put_s32(skb, NETCONFA_MC_FORWARDING,
 			devconf->mc_forwarding) < 0)
 		goto nla_put_failure;
+	if ((all || type == NETCONFA_MC_SNOOPING) &&
+	    nla_put_s32(skb, NETCONFA_MC_SNOOPING,
+			devconf->mc_snooping) < 0)
+		goto nla_put_failure;
 #endif
 	if ((all || type == NETCONFA_PROXY_NEIGH) &&
 	    nla_put_s32(skb, NETCONFA_PROXY_NEIGH, devconf->proxy_ndp) < 0)
@@ -5503,6 +5509,7 @@  static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
 #endif
 #ifdef CONFIG_IPV6_MROUTE
 	array[DEVCONF_MC_FORWARDING] = cnf->mc_forwarding;
+	array[DEVCONF_MC_SNOOPING] = cnf->mc_snooping;
 #endif
 	array[DEVCONF_DISABLE_IPV6] = cnf->disable_ipv6;
 	array[DEVCONF_ACCEPT_DAD] = cnf->accept_dad;
@@ -6786,6 +6793,13 @@  static const struct ctl_table addrconf_sysctl[] = {
 		.mode		= 0444,
 		.proc_handler	= proc_dointvec,
 	},
+	{
+		.procname	= "mc_snooping",
+		.data		= &ipv6_devconf.mc_snooping,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
 #endif
 	{
 		.procname	= "disable_ipv6",
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 54ec163fbafa..25046ee8276f 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -1013,6 +1013,11 @@  bool ipv6_chk_mcast_addr(struct net_device *dev, const struct in6_addr *group,
 	struct ifmcaddr6 *mc;
 	bool rv = false;
 
+#ifdef CONFIG_IPV6_MROUTE
+	if (dev_net(dev)->ipv6.devconf_all->mc_snooping)
+		return true;
+#endif
+
 	rcu_read_lock();
 	idev = __in6_dev_get(dev);
 	if (idev) {