[V2,net-next] net: hns: Fix to conditionally convey RX checksum flag to stack

Message ID 20161129130945.919372-1-salil.mehta@huawei.com
State New
Headers show

Commit Message

Salil Mehta Nov. 29, 2016, 1:09 p.m.
This patch introduces the RX checksum function to check the
status of the hardware calculated checksum and its error and
appropriately convey status to the upper stack in skb->ip_summed
field.

We only support checksum for IPv4, UDP(over IPv4 or IPv6),
TCP(over IPv4 or IPv6) and SCTP but we support many L3(IPv4, IPv6,
MPLS, PPPoE etc) and L4(TCP, UDP, GRE, SCTP, IGMP, ICMP etc.)
protocols. We want to filter out L3 and L4 protocols early on for
which checksum is not supported.

Our present hardware RX Descriptor lacks L3/L4 "Checksum
Status & Error" bit (indicating whether checksum was calculated
and if there was an error encountered) for the supported protocol
received in the packet. Therefore, we do the following:
1. Filter the protocols for which checksum is not supported.
2. Check if there were any errors encountered in L3 or L4
   protocols. These errors might not just be Checksum errors but
   could be related to version, length of IPv4, UDP, TCP etc.
   2a. If L3 Errors amd L4 Errors exists, then return as our RX
       descriptor lacks Status-and-Error bits for checksum so
       cannot identify specifically if error was because of
       checksum error or other error for this packet.
   2b. If above errors do not exists, then we set checksum
       un-necessary for upper layers.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>

---
Change Log:
Patch V2: Addressed the comment by David Miller
          Link: https://www.spinics.net/lists/netdev/msg406697.html
Patch V1: This patch is a result of the comments given by
          David Miller <davem@davemloft.net>
          Link: https://lkml.org/lkml/2016/6/15/27
---
 drivers/net/ethernet/hisilicon/hns/hnae.h     |    2 +
 drivers/net/ethernet/hisilicon/hns/hns_enet.c |   71 ++++++++++++++++++++++---
 2 files changed, 66 insertions(+), 7 deletions(-)

-- 
1.7.9.5

Comments

Salil Mehta Dec. 1, 2016, 4:59 p.m. | #1
> -----Original Message-----

> From: Salil Mehta

> Sent: Thursday, December 01, 2016 12:09 PM

> To: 'David Miller'

> Cc: Zhuangyuzeng (Yisen); mehta.salil.lnk@gmail.com;

> netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Linuxarm

> Subject: RE: [PATCH V2 net-next] net: hns: Fix to conditionally convey

> RX checksum flag to stack

> 

> > -----Original Message-----

> > From: David Miller [mailto:davem@davemloft.net]

> > Sent: Wednesday, November 30, 2016 7:26 PM

> > To: Salil Mehta

> > Cc: Zhuangyuzeng (Yisen); mehta.salil.lnk@gmail.com;

> > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Linuxarm

> > Subject: Re: [PATCH V2 net-next] net: hns: Fix to conditionally

> convey

> > RX checksum flag to stack

> >

> > From: Salil Mehta <salil.mehta@huawei.com>

> > Date: Tue, 29 Nov 2016 13:09:45 +0000

> >

> > > +	/* We only support checksum for IPv4,UDP(over IPv4 or IPv6),

> > TCP(over

> > > +	 * IPv4 or IPv6) and SCTP but we support many L3(IPv4, IPv6,

> > MPLS,

> > > +	 * PPPoE etc) and L4(TCP, UDP, GRE, SCTP, IGMP, ICMP etc.)

> > protocols.

> > > +	 * We want to filter out L3 and L4 protocols early on for which

> > checksum

> > > +	 * is not supported.

> >  ...

> > > +	 */

> > > +	l3id = hnae_get_field(flag, HNS_RXD_L3ID_M, HNS_RXD_L3ID_S);

> > > +	l4id = hnae_get_field(flag, HNS_RXD_L4ID_M, HNS_RXD_L4ID_S);

> > > +	if ((l3id != HNS_RX_FLAG_L3ID_IPV4) &&

> > > +	    ((l3id != HNS_RX_FLAG_L3ID_IPV6) ||

> > > +	     (l4id != HNS_RX_FLAG_L4ID_UDP)) &&

> > > +	    ((l3id != HNS_RX_FLAG_L3ID_IPV6) ||

> > > +	     (l4id != HNS_RX_FLAG_L4ID_TCP)) &&

> > > +	    (l4id != HNS_RX_FLAG_L4ID_SCTP))

> > > +		return;

> >

> > I have a hard time understanding this seemingly overcomplicated

> > check.

> >

> > It looks like if L3 is IPV4 it will accept any underlying L4

> protocol,

> > but is that what is really intended?  That doesn't match what this

> new

> > comment states.

> I agree that it is bit difficult to read. Earlier, I was banking on the

> register(mistakenly, its hardware implementation err ) to de-multiplex

> the checksum error type. The register supported indication of just IPv4

> Header Checksum Error as well (which meant it could carry any L4

> protocol).

> IPv6 does not have similar Header checksum error. Therefore, to check

> if it is just IPv4 Header checksum error or any supported L4 transport

> (UDP or TCP) error over IPv4 or IPv6 I had to bank upon above complex

> check

> 

> Below suggested solution check would have been insufficient for

> example, if packet had IPv4/IGMP and there was a checksum error in IPv4

> header.

> 

> Comment states:

> " We only support checksum for IPv4, UDP(over IPv4 or IPv6),

>   TCP(over IPv4 or IPv6) and SCTP"

>    1) Checksum of IPv4 (IPv4 header)

>    2) UDP(over IPv4 or IPv6)

>    3) TCP(over IPv4 or IPv6)

>    4) SCTP (over IPv4 or IPv6)*

> 

> (*) I should have put IPv4/IPv6 check for SCTP in the code

>     and made it clear in the comment as well?

> 

> >

> > My understanding is that the chip supports checksums for:

> >

> > 	UDP/IPV4

> > 	UDP/IPV6

> > 	TCP/IPV4

> > 	TCP/IPV6

> > 	SCTP/IPV4

> > 	SCTP/IPV6

> 

> Hardware also supports checksum of IPv4 Header.

> 

> >

> > So the simplest thing is to validate each level one at a time:

> >

> > 	if (l3 != IPV4 && l3 != IPV6)

> > 		return;

> > 	if (l4 != UDP && l4 != TCP && l4 != SCTP)

> > 		return;

> I guess above check will fail to catch cases like IPv4/IGMP, when there

> is a bad checksum in The IPv4 header.

> 

> But maybe now since we don't have any method to de-multiplex the kind

> of

> checksum error (cannot depend upon register) we can have below code

> re-arrangement:

> 

> hns_nic_rx_checksum() {

>       /* check supported L3 protocol */

> 	if (l3 != IPV4 && l3 != IPV6)

> 		return;

>       /* check if L3 protocols error */

>       if (l3e)

> 	 	return;

> 

>       /* check if the packets are fragmented */

> 	If (l3frags)

> 		Return;

> 

>       /* check supported L4 protocol */

>  	if (l4 != UDP && l4 != TCP && l4 != SCTP)

>  		return;

Hi David,
This logic might not do as well since IGMP packet with valid IP
Checksum will be bypassed here. We would want to set
CHECKSUM_UNNECESSARY for such IP packets as well.

It looks to me the cumbersome check in the PATCH V2 should
be retained.

>       /* check if any L4 protocol error */

>       if (l3e)

> 	 	return;

> 

>       /* packet with valid checksum - covey to stack */

>       skb->ip_summed = CHECKSUM_UNNECESSARY

> }


> 

> Hope I am not missing something here. Please correct my understanding

> if there is any gap here. Thanks!

> 

> Best regards

> Salil

Patch hide | download patch | download mbox

diff --git a/drivers/net/ethernet/hisilicon/hns/hnae.h b/drivers/net/ethernet/hisilicon/hns/hnae.h
index 09602f1..8016854 100644
--- a/drivers/net/ethernet/hisilicon/hns/hnae.h
+++ b/drivers/net/ethernet/hisilicon/hns/hnae.h
@@ -99,6 +99,8 @@  enum hnae_led_state {
 #define HNS_RX_FLAG_L3ID_IPV6 0x1
 #define HNS_RX_FLAG_L4ID_UDP 0x0
 #define HNS_RX_FLAG_L4ID_TCP 0x1
+#define HNS_RX_FLAG_L4ID_SCTP 0x3
+
 
 #define HNS_TXD_ASID_S 0
 #define HNS_TXD_ASID_M (0xff << HNS_TXD_ASID_S)
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
index 255fede..6450d0e 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
@@ -565,6 +565,66 @@  static void get_rx_desc_bnum(u32 bnum_flag, int *out_bnum)
 				   HNS_RXD_BUFNUM_M, HNS_RXD_BUFNUM_S);
 }
 
+static void hns_nic_rx_checksum(struct hns_nic_ring_data *ring_data,
+				struct sk_buff *skb, u32 flag)
+{
+	struct net_device *netdev = ring_data->napi.dev;
+	u32 l3id;
+	u32 l4id;
+
+	/* check if RX checksum offload is enabled */
+	if (unlikely(!(netdev->features & NETIF_F_RXCSUM)))
+		return;
+
+	/* We only support checksum for IPv4,UDP(over IPv4 or IPv6), TCP(over
+	 * IPv4 or IPv6) and SCTP but we support many L3(IPv4, IPv6, MPLS,
+	 * PPPoE etc) and L4(TCP, UDP, GRE, SCTP, IGMP, ICMP etc.) protocols.
+	 * We want to filter out L3 and L4 protocols early on for which checksum
+	 * is not supported.
+	 *
+	 * Our present hardware RX Descriptor lacks L3/L4 "Checksum Status &
+	 * Error" bit (indicating whether checksum was calculated and if there
+	 * was an error encountered) for the supported protocol received in the
+	 * packet. Therefore, we do the following:
+	 * 1. Filter the protocols for which checksum is not supported.
+	 * 2. Check if there were any errors encountered in L3 or L4 protocols.
+	 *    These errors might not just be Checksum errors but could be
+	 *    related to version, length of IPv4, UDP, TCP etc.
+	 *    2a. If L3 Errors amd L4 Errors exists, then return as our RX
+	 *        descriptor lacks Status-and-Error bits for checksum so cannot
+	 *        identify specifically if error was because of checksum error
+	 *        or other error for this packet.
+	 *    2b. If above errors do not exists, then we set checksum
+	 *        un-necessary for upper layers.
+	 */
+	l3id = hnae_get_field(flag, HNS_RXD_L3ID_M, HNS_RXD_L3ID_S);
+	l4id = hnae_get_field(flag, HNS_RXD_L4ID_M, HNS_RXD_L4ID_S);
+	if ((l3id != HNS_RX_FLAG_L3ID_IPV4) &&
+	    ((l3id != HNS_RX_FLAG_L3ID_IPV6) ||
+	     (l4id != HNS_RX_FLAG_L4ID_UDP)) &&
+	    ((l3id != HNS_RX_FLAG_L3ID_IPV6) ||
+	     (l4id != HNS_RX_FLAG_L4ID_TCP)) &&
+	    (l4id != HNS_RX_FLAG_L4ID_SCTP))
+		return;
+
+	/* We do not support checksum of fragmented packets */
+	if (unlikely(hnae_get_bit(flag, HNS_RXD_FRAG_B)))
+		return;
+
+	/* Check if there are any L3/L4 errors encountered during RX checksum */
+	if (unlikely(hnae_get_bit(flag, HNS_RXD_L3E_B) ||
+		     hnae_get_bit(flag, HNS_RXD_L4E_B))) {
+		/* we dont have any way to detect if this is checksum error or
+		 * any specific protocol error. Therefore, return no checksum
+		 * all protocol errors.
+		 */
+		return;
+	}
+
+	/* Now, this is a packet with valid RX checksum */
+	skb->ip_summed = CHECKSUM_UNNECESSARY;
+}
+
 static int hns_nic_poll_rx_skb(struct hns_nic_ring_data *ring_data,
 			       struct sk_buff **out_skb, int *out_bnum)
 {
@@ -683,13 +743,10 @@  static int hns_nic_poll_rx_skb(struct hns_nic_ring_data *ring_data,
 	ring->stats.rx_pkts++;
 	ring->stats.rx_bytes += skb->len;
 
-	if (unlikely(hnae_get_bit(bnum_flag, HNS_RXD_L3E_B) ||
-		     hnae_get_bit(bnum_flag, HNS_RXD_L4E_B))) {
-		ring->stats.l3l4_csum_err++;
-		return 0;
-	}
-
-	skb->ip_summed = CHECKSUM_UNNECESSARY;
+	/* indicate to upper stack if our hardware has already calculated
+	 * the RX checksum
+	 */
+	hns_nic_rx_checksum(ring_data, skb, bnum_flag);
 
 	return 0;
 }