diff mbox series

[net-next,2/6] igb: Add double-check MTA_REGISTER for i210 and i211

Message ID 20210416204500.2012073-3-anthony.l.nguyen@intel.com
State New
Headers show
Series 1GbE Intel Wired LAN Driver Updates 2021-04-16 | expand

Commit Message

Tony Nguyen April 16, 2021, 8:44 p.m. UTC
From: Grzegorz Siwik <grzegorz.siwik@intel.com>

Add new function which checks MTA_REGISTER if its filled correctly.
If not then writes again to same register.
There is possibility that i210 and i211 could not accept
MTA_REGISTER settings, specially when you add and remove
many of multicast addresses in short time.
Without this patch there is possibility that multicast settings will be
not always set correctly in hardware.

Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
---
 drivers/net/ethernet/intel/igb/e1000_mac.c | 27 ++++++++++++++++++++++
 1 file changed, 27 insertions(+)

Comments

Tony Nguyen April 20, 2021, 4:22 p.m. UTC | #1
On Fri, 2021-04-16 at 14:12 -0700, Jakub Kicinski wrote:
> On Fri, 16 Apr 2021 13:44:56 -0700 Tony Nguyen wrote:

> > +	bool is_failed;

> > +	int i;

> > +

> > +	do {

> > +		is_failed = false;

> > +		for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {

> > +			if (array_rd32(E1000_MTA, i) != hw-

> > >mac.mta_shadow[i]) {

> > +				is_failed = true;

> > +				array_wr32(E1000_MTA, i, hw-

> > >mac.mta_shadow[i]);

> > +				wrfl();

> > +				break;

> > +			}

> > +		}

> > +	} while (is_failed);

> 

> Looks like a potential infinite loop on persistent failure.

> Also you don't need "is_failed", you can use while (i >= 0), or

> assign i = hw->mac.mta_reg_count, or consider using a goto. 


We will make a follow on patch to address these issues.

Thanks,
Tony
Nick Lowe May 10, 2021, 1:43 p.m. UTC | #2
Hi all,

> > Looks like a potential infinite loop on persistent failure.

> > Also you don't need "is_failed", you can use while (i >= 0), or

> > assign i = hw->mac.mta_reg_count, or consider using a goto.

>

> We will make a follow on patch to address these issues.

>

> Thanks,

> Tony


The patch for this that has been queued is as follows:

+     int failed_cnt = 3;
+     bool is_failed;
+     int i;
+
+     do {
+          is_failed = false;
+          for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
+               if (array_rd32(E1000_MTA, i) != hw->mac.mta_shadow[i]) {
+                    is_failed = true;
+                    array_wr32(E1000_MTA, i, hw->mac.mta_shadow[i]);
+                    wrfl();
+               }
+          }
+          if (is_failed && --failed_cnt <= 0) {
+               hw_dbg("Failed to update MTA_REGISTER, too many retries");
+               break;
+          }
+     } while (is_failed);

https://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue.git/commit/?h=dev-queue&id=9db33b54fb98525e323d0d3f16b01778f17b9493

This will not reset the counter when checking each register and it
will not debug output which register failed, this does not seem
optimal.

Could it make more sense to instead do something like this? (Untested)

+     int i;
+     int attempt;
+     for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
+          for (attempt = 3; attempt >= 1; attempt--) {
+               if (array_rd32(E1000_MTA, i) != hw->mac.mta_shadow[i]) {
+                    array_wr32(E1000_MTA, i, hw->mac.mta_shadow[i]);
+                    wrfl();
+
+                    if (attempt == 1 && array_rd32(E1000_MTA, i) !=
hw->mac.mta_shadow[i]) {
+                         hw_dbg("Failed to update MTA_REGISTER %d,
too many retries\n", i);
+                    }
+               }
+          }
+     }

Best,

Nick
Nick Lowe May 10, 2021, 2:01 p.m. UTC | #3
Or better, make attempt increment instead of the original decrement:

+     int i;
+     int attempt;
+     for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
+          for (attempt = 1; attempt <= 3; attempt++) {
+               if (array_rd32(E1000_MTA, i) != hw->mac.mta_shadow[i]) {
+                    array_wr32(E1000_MTA, i, hw->mac.mta_shadow[i]);
+                    wrfl();
+
+                    if (attempt == 3 && array_rd32(E1000_MTA, i) !=
hw->mac.mta_shadow[i]) {
+                         hw_dbg("Failed to update MTA_REGISTER %d,
too many retries\n", i);
+                    }
+               }
+          }
+     }
Siwik, Grzegorz May 13, 2021, 12:19 p.m. UTC | #4
Hi all,

> > > Looks like a potential infinite loop on persistent failure.

> > > Also you don't need "is_failed", you can use while (i >= 0), or 

> > > assign i = hw->mac.mta_reg_count, or consider using a goto.

> >

> > We will make a follow on patch to address these issues.

> >

> > Thanks,

> > Tony


> The patch for this that has been queued is as follows:


+     int failed_cnt = 3;
+     bool is_failed;
+     int i;
+
+     do {
+          is_failed = false;
+          for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
+               if (array_rd32(E1000_MTA, i) != hw->mac.mta_shadow[i]) {
+                    is_failed = true;
+                    array_wr32(E1000_MTA, i, hw->mac.mta_shadow[i]);
+                    wrfl();
+               }
+          }
+          if (is_failed && --failed_cnt <= 0) {
+               hw_dbg("Failed to update MTA_REGISTER, too many retries");
+               break;
+          }
+     } while (is_failed);

> https://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue.git/commit/?h=dev-queue&id=9db33b54fb98525e323d0d3f16b01778f17b9493


> This will not reset the counter when checking each register and it will not debug output which register failed, this does not seem optimal.


> Could it make more sense to instead do something like this? (Untested)


I cannot agree for this part. In your solution we are checking every register 3 times. 
Entire MTA_ARRAY you will check MTA_REG_COUNT*3 times.
In my code this is worst case scenario - in best scenario I'm checking every MTA only one time.
Please remember that performance is also really important
Also the problem is that i21x devices could not always accept MTA_REGISTER setting. My code has been tested and verified as working.
In my opinion we don't have to know which E1000_MTA register has failed, but we should know that there is problem with entire MTA_REGISTER.
When I was checking this with test script for over 11M iterations this issue never happened more than one time in row. 

> +     int i;

> +     int attempt;

> +     for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {

> +          for (attempt = 3; attempt >= 1; attempt--) {

> +               if (array_rd32(E1000_MTA, i) != hw->mac.mta_shadow[i]) {

> +                    array_wr32(E1000_MTA, i, hw->mac.mta_shadow[i]);

> +                    wrfl();

> +

> +                    if (attempt == 1 && array_rd32(E1000_MTA, i) !=

> hw->mac.mta_shadow[i]) {

> +                         hw_dbg("Failed to update MTA_REGISTER %d,

> too many retries\n", i);

> +                    }

> +               }

> +          }

> +     }


> Best,


> Nick


Best Regards,
Grzegorz
diff mbox series

Patch

diff --git a/drivers/net/ethernet/intel/igb/e1000_mac.c b/drivers/net/ethernet/intel/igb/e1000_mac.c
index fd8eb2f9ab9d..e63ee3cca5ea 100644
--- a/drivers/net/ethernet/intel/igb/e1000_mac.c
+++ b/drivers/net/ethernet/intel/igb/e1000_mac.c
@@ -483,6 +483,31 @@  static u32 igb_hash_mc_addr(struct e1000_hw *hw, u8 *mc_addr)
 	return hash_value;
 }
 
+/**
+ * igb_i21x_hw_doublecheck - double checks potential HW issue in i21X
+ * @hw: pointer to the HW structure
+ *
+ * Checks if multicast array is wrote correctly
+ * If not then rewrites again to register
+ **/
+static void igb_i21x_hw_doublecheck(struct e1000_hw *hw)
+{
+	bool is_failed;
+	int i;
+
+	do {
+		is_failed = false;
+		for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) {
+			if (array_rd32(E1000_MTA, i) != hw->mac.mta_shadow[i]) {
+				is_failed = true;
+				array_wr32(E1000_MTA, i, hw->mac.mta_shadow[i]);
+				wrfl();
+				break;
+			}
+		}
+	} while (is_failed);
+}
+
 /**
  *  igb_update_mc_addr_list - Update Multicast addresses
  *  @hw: pointer to the HW structure
@@ -516,6 +541,8 @@  void igb_update_mc_addr_list(struct e1000_hw *hw,
 	for (i = hw->mac.mta_reg_count - 1; i >= 0; i--)
 		array_wr32(E1000_MTA, i, hw->mac.mta_shadow[i]);
 	wrfl();
+	if (hw->mac.type == e1000_i210 || hw->mac.type == e1000_i211)
+		igb_i21x_hw_doublecheck(hw);
 }
 
 /**