diff mbox series

[v2,02/15] lpfc: Fix auto sli_mode and its effect on CONFIG_PORT for SLI3

Message ID 20210104180240.46824-3-jsmart2021@gmail.com
State New
Headers show
Series lpfc: Update lpfc to revision 12.8.0.7 | expand

Commit Message

James Smart Jan. 4, 2021, 6:02 p.m. UTC
A very long time ago, there was a feature: auto sli mode. It gave the
user the ability to auto select the SLI mode (SLI2 or SLI3) to run the
port in, or even force SLI2 mode if configured.  Because of the
convoluted logic, the CONFIG_PORT mbox command ends up being called 2 or
3 times. It should have been called only once.  Additionally, the driver
no longer supports SLI-2, so only SLI-3 mode should be allowed.

The following changes were made:
- Force module parameter to SLI3 only.
- Rip out redundant CONFIG_PORT mbox commands.
- Force CONFIG_PORT mbox command to be in beginning of enable ISR routine.
- Added changes for offline to online behavior

Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
---
 drivers/scsi/lpfc/lpfc.h      |  1 +
 drivers/scsi/lpfc/lpfc_attr.c |  7 ++----
 drivers/scsi/lpfc/lpfc_init.c | 20 ++++++++-------
 drivers/scsi/lpfc/lpfc_sli.c  | 46 +++++++++--------------------------
 4 files changed, 26 insertions(+), 48 deletions(-)

Comments

Daniel Wagner June 7, 2021, 11:06 a.m. UTC | #1
Hi James,

On Mon, Jan 04, 2021 at 10:02:27AM -0800, James Smart wrote:
> A very long time ago, there was a feature: auto sli mode. It gave the

> user the ability to auto select the SLI mode (SLI2 or SLI3) to run the

> port in, or even force SLI2 mode if configured.  Because of the

> convoluted logic, the CONFIG_PORT mbox command ends up being called 2 or

> 3 times. It should have been called only once.  Additionally, the driver

> no longer supports SLI-2, so only SLI-3 mode should be allowed.

> 

> The following changes were made:

> - Force module parameter to SLI3 only.

> - Rip out redundant CONFIG_PORT mbox commands.

> - Force CONFIG_PORT mbox command to be in beginning of enable ISR routine.

> - Added changes for offline to online behavior


We got a regression report for this patch. The problem seems to be
related with older Emulex HBAs. The symptom is in this case one port is
not enabled. A revert of this patch fixed the problem. This was
observed with:

  Emulex LPe11000 FV2.72X2 DV12.8.0.7 HN:FR2AS6AP2-0001 OS:Linux

Here some ramblings from my debugging:

In the logs I found:

> 0000:0b:00.0: 0:0431 Failed to enable interrupt.

> 0000:0b:00.0: 0:0431 Failed to enable interrupt.

> 0000:0b:00.0: 0:0431 Failed to enable interrupt.


cfg_sli_mode used to be 0 (auto) and the config port setup
used to try first mode = 3 and then fall back to mode = 2

> -       rc = lpfc_sli_config_port(phba, mode);

> -

> -       if (rc && phba->cfg_sli_mode == 3)

> -               lpfc_printf_log(phba, KERN_ERR, LOG_TRACE_EVENT,

> -                               "1820 Unable to select SLI-3.  "

> -                               "Not supported by adapter.\n");

> -       if (rc && mode != 2)

> -               rc = lpfc_sli_config_port(phba, 2);


the port config is now in lpfc_sli_enable_intr which is hardcoded
to LPFC_SLI_REV3 and I think this fails and the HBA_NEEDS_CFG_PORT
flag is not resetted, hence in lpfc_sli_hba_setup() the new
code tries to enable the port again with:

> +       /* Enable ISR already does config_port because of config_msi mbx */

> +       if (phba->hba_flag & HBA_NEEDS_CFG_PORT) {

> +               rc = lpfc_sli_config_port(phba, LPFC_SLI_REV3);

> +               if (rc)

> +                       return -EIO;

> +               phba->hba_flag &= ~HBA_NEEDS_CFG_PORT;


Though I think this should something like

   lpfc_sli_config_port(phba, LPFC_SLI_REV2);

for the specific case.

HTH!

Thanks,
Daniel
James Smart June 7, 2021, 3:12 p.m. UTC | #2
On 6/7/2021 4:06 AM, Daniel Wagner wrote:
> Hi James,

> 

> On Mon, Jan 04, 2021 at 10:02:27AM -0800, James Smart wrote:

>> A very long time ago, there was a feature: auto sli mode. It gave the

>> user the ability to auto select the SLI mode (SLI2 or SLI3) to run the

>> port in, or even force SLI2 mode if configured.  Because of the

>> convoluted logic, the CONFIG_PORT mbox command ends up being called 2 or

>> 3 times. It should have been called only once.  Additionally, the driver

>> no longer supports SLI-2, so only SLI-3 mode should be allowed.

>>

>> The following changes were made:

>> - Force module parameter to SLI3 only.

>> - Rip out redundant CONFIG_PORT mbox commands.

>> - Force CONFIG_PORT mbox command to be in beginning of enable ISR routine.

>> - Added changes for offline to online behavior

> 

> We got a regression report for this patch. The problem seems to be

> related with older Emulex HBAs. The symptom is in this case one port is

> not enabled. A revert of this patch fixed the problem. This was

> observed with:

> 

>    Emulex LPe11000 FV2.72X2 DV12.8.0.7 HN:FR2AS6AP2-0001 OS:Linux

> 

> Here some ramblings from my debugging:

> 

> In the logs I found:

> 

>> 0000:0b:00.0: 0:0431 Failed to enable interrupt.

>> 0000:0b:00.0: 0:0431 Failed to enable interrupt.

>> 0000:0b:00.0: 0:0431 Failed to enable interrupt.

> 

> cfg_sli_mode used to be 0 (auto) and the config port setup

> used to try first mode = 3 and then fall back to mode = 2

> 

>> -       rc = lpfc_sli_config_port(phba, mode);

>> -

>> -       if (rc && phba->cfg_sli_mode == 3)

>> -               lpfc_printf_log(phba, KERN_ERR, LOG_TRACE_EVENT,

>> -                               "1820 Unable to select SLI-3.  "

>> -                               "Not supported by adapter.\n");

>> -       if (rc && mode != 2)

>> -               rc = lpfc_sli_config_port(phba, 2);

> 

> the port config is now in lpfc_sli_enable_intr which is hardcoded

> to LPFC_SLI_REV3 and I think this fails and the HBA_NEEDS_CFG_PORT

> flag is not resetted, hence in lpfc_sli_hba_setup() the new

> code tries to enable the port again with:

> 

>> +       /* Enable ISR already does config_port because of config_msi mbx */

>> +       if (phba->hba_flag & HBA_NEEDS_CFG_PORT) {

>> +               rc = lpfc_sli_config_port(phba, LPFC_SLI_REV3);

>> +               if (rc)

>> +                       return -EIO;

>> +               phba->hba_flag &= ~HBA_NEEDS_CFG_PORT;

> 

> Though I think this should something like

> 

>     lpfc_sli_config_port(phba, LPFC_SLI_REV2);

> 

> for the specific case.

> 

> HTH!

> 

> Thanks,

> Daniel

> 


ouch - What you are describing is likely true, but sli-2 firmware is 
*extremely* old - 2 decades or more. If a change wont work first shot, 
it likely won't be worth the effort to try to fix it. Other 
functionality may be hanging on by a thread.  That adapter certainly 
runs SLI-3 (even that is 10-15 yrs old), so the best solution is a fw 
upgrade that picks up the sli3 interface. Is that possible?

Given that the error message you quoted was a failure of interrupt, that 
may be a clue. It may well be the adapter has sli3 firmware and it's 
failing on setting the interrupt vector type.  The older adapters 
supported MSI and INTx. SLI-2 may have been limited to INTx only. There 
used to be hiccups in some platforms with MSI support (platform said it 
did, but was broken) which is why the driver had "set it, test it, 
revert it" logic. I believe the driver has a lpfc_use_msi module 
parameter that when set to 0 should use only INTx, which may be what the 
sli2 downgrade is effectively doing. Try setting that and seeing if the 
card loads the sli3 image and runs.


-- james
Daniel Wagner June 15, 2021, 12:45 p.m. UTC | #3
Hi James,

On Mon, Jun 07, 2021 at 08:12:22AM -0700, James Smart wrote:
> ouch - What you are describing is likely true, but sli-2 firmware is

> *extremely* old - 2 decades or more. If a change wont work first shot, it

> likely won't be worth the effort to try to fix it. Other functionality may

> be hanging on by a thread.  That adapter certainly runs SLI-3 (even that is

> 10-15 yrs old), so the best solution is a fw upgrade that picks up the sli3

> interface. Is that possible?


I forwarded the info.

> Given that the error message you quoted was a failure of interrupt, that may

> be a clue. It may well be the adapter has sli3 firmware and it's failing on

> setting the interrupt vector type.  The older adapters supported MSI and

> INTx. SLI-2 may have been limited to INTx only. There used to be hiccups in

> some platforms with MSI support (platform said it did, but was broken) which

> is why the driver had "set it, test it, revert it" logic. I believe the

> driver has a lpfc_use_msi module parameter that when set to 0 should use

> only INTx, which may be what the sli2 downgrade is effectively doing. Try

> setting that and seeing if the card loads the sli3 image and runs.


I haven't heard back yet if the lpfc_use_msi=0 setting fixes the problem
(waiting for the next maintenance window for the experiment).

Thanks,
Daniel
Daniel Wagner June 18, 2021, 8:52 a.m. UTC | #4
Hi James,

On Tue, Jun 15, 2021 at 02:45:02PM +0200, Daniel Wagner wrote:
> I haven't heard back yet if the lpfc_use_msi=0 setting fixes the problem

> (waiting for the next maintenance window for the experiment).


lpfc_use_msi=0 did not help in this case. Only a complete revert of the
patch fixes the issue.

Anyway, I was thinking about adding something a workaround to our
downstream version. I figured that sli_mode is unused and we could abuse
it to select SLI version for lpfc_sli_config_port(). Something like:


diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 1ad1beb2a8a8..cf8538ca8402 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -5355,7 +5355,10 @@ lpfc_sli_hba_setup(struct lpfc_hba *phba)
 
        /* Enable ISR already does config_port because of config_msi mbx */
        if (phba->hba_flag & HBA_NEEDS_CFG_PORT) {
-               rc = lpfc_sli_config_port(phba, LPFC_SLI_REV3);
+               if (phba->cfg_sli_mode == 2)
+                       rc = lpfc_sli_config_port(phba, LPFC_SLI_REV2);
+               else
+                       rc = lpfc_sli_config_port(phba, LPFC_SLI_REV3);
                if (rc)
                        return -EIO;
                phba->hba_flag &= ~HBA_NEEDS_CFG_PORT;


Thanks,
Daniel
diff mbox series

Patch

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index a54c8da30273..7875552c07d3 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -779,6 +779,7 @@  struct lpfc_hba {
 					 */
 #define HBA_FLOGI_ISSUED	0x100000 /* FLOGI was issued */
 #define HBA_DEFER_FLOGI		0x800000 /* Defer FLOGI till read_sparm cmpl */
+#define HBA_NEEDS_CFG_PORT	0x2000000 /* SLI3 - needs a CONFIG_PORT mbox */
 
 	uint32_t fcp_ring_in_use; /* When polling test if intr-hndlr active*/
 	struct lpfc_dmabuf slim2p;
diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 4528166dee36..f8bb6a4f780c 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -3441,11 +3441,8 @@  unsigned long lpfc_no_hba_reset[MAX_HBAS_NO_RESET] = {
 module_param_array(lpfc_no_hba_reset, ulong, &lpfc_no_hba_reset_cnt, 0444);
 MODULE_PARM_DESC(lpfc_no_hba_reset, "WWPN of HBAs that should not be reset");
 
-LPFC_ATTR(sli_mode, 0, 0, 3,
-	"SLI mode selector:"
-	" 0 - auto (SLI-3 if supported),"
-	" 2 - select SLI-2 even on SLI-3 capable HBAs,"
-	" 3 - select SLI-3");
+LPFC_ATTR(sli_mode, 3, 3, 3,
+	"SLI mode selector: 3 - select SLI-3");
 
 LPFC_ATTR_R(enable_npiv, 1, 0, 1,
 	"Enable NPIV functionality");
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index ac67f420ec26..1f0a62ecfad8 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -10728,17 +10728,19 @@  lpfc_sli_enable_intr(struct lpfc_hba *phba, uint32_t cfg_mode)
 	uint32_t intr_mode = LPFC_INTR_ERROR;
 	int retval;
 
+	/* Need to issue conf_port mbox cmd before conf_msi mbox cmd */
+	retval = lpfc_sli_config_port(phba, LPFC_SLI_REV3);
+	if (retval)
+		return intr_mode;
+	phba->hba_flag &= ~HBA_NEEDS_CFG_PORT;
+
 	if (cfg_mode == 2) {
-		/* Need to issue conf_port mbox cmd before conf_msi mbox cmd */
-		retval = lpfc_sli_config_port(phba, LPFC_SLI_REV3);
+		/* Now, try to enable MSI-X interrupt mode */
+		retval = lpfc_sli_enable_msix(phba);
 		if (!retval) {
-			/* Now, try to enable MSI-X interrupt mode */
-			retval = lpfc_sli_enable_msix(phba);
-			if (!retval) {
-				/* Indicate initialization to MSI-X mode */
-				phba->intr_type = MSIX;
-				intr_mode = 2;
-			}
+			/* Indicate initialization to MSI-X mode */
+			phba->intr_type = MSIX;
+			intr_mode = 2;
 		}
 	}
 
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 95caad764fb7..735fa1d484eb 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -4359,6 +4359,8 @@  lpfc_sli_brdready_s3(struct lpfc_hba *phba, uint32_t mask)
 	if (lpfc_readl(phba->HSregaddr, &status))
 		return 1;
 
+	phba->hba_flag |= HBA_NEEDS_CFG_PORT;
+
 	/*
 	 * Check status register every 100ms for 5 retries, then every
 	 * 500ms for 5, then every 2.5 sec for 5, then reset board and
@@ -4687,6 +4689,7 @@  lpfc_sli_brdreset(struct lpfc_hba *phba)
 	/* perform board reset */
 	phba->fc_eventTag = 0;
 	phba->link_events = 0;
+	phba->hba_flag |= HBA_NEEDS_CFG_PORT;
 	if (phba->pport) {
 		phba->pport->fc_myDID = 0;
 		phba->pport->fc_prevDID = 0;
@@ -5020,6 +5023,8 @@  lpfc_sli_chipset_init(struct lpfc_hba *phba)
 		return -EIO;
 	}
 
+	phba->hba_flag |= HBA_NEEDS_CFG_PORT;
+
 	/* Clear all interrupt enable conditions */
 	writel(0, phba->HCregaddr);
 	readl(phba->HCregaddr); /* flush */
@@ -5316,45 +5321,18 @@  int
 lpfc_sli_hba_setup(struct lpfc_hba *phba)
 {
 	uint32_t rc;
-	int  mode = 3, i;
+	int  i;
 	int longs;
 
-	switch (phba->cfg_sli_mode) {
-	case 2:
-		if (phba->cfg_enable_npiv) {
-			lpfc_printf_log(phba, KERN_ERR, LOG_TRACE_EVENT,
-				"1824 NPIV enabled: Override sli_mode "
-				"parameter (%d) to auto (0).\n",
-				phba->cfg_sli_mode);
-			break;
-		}
-		mode = 2;
-		break;
-	case 0:
-	case 3:
-		break;
-	default:
-		lpfc_printf_log(phba, KERN_ERR, LOG_TRACE_EVENT,
-				"1819 Unrecognized sli_mode parameter: %d.\n",
-				phba->cfg_sli_mode);
-
-		break;
+	/* Enable ISR already does config_port because of config_msi mbx */
+	if (phba->hba_flag & HBA_NEEDS_CFG_PORT) {
+		rc = lpfc_sli_config_port(phba, LPFC_SLI_REV3);
+		if (rc)
+			return -EIO;
+		phba->hba_flag &= ~HBA_NEEDS_CFG_PORT;
 	}
 	phba->fcp_embed_io = 0;	/* SLI4 FC support only */
 
-	rc = lpfc_sli_config_port(phba, mode);
-
-	if (rc && phba->cfg_sli_mode == 3)
-		lpfc_printf_log(phba, KERN_ERR, LOG_TRACE_EVENT,
-				"1820 Unable to select SLI-3.  "
-				"Not supported by adapter.\n");
-	if (rc && mode != 2)
-		rc = lpfc_sli_config_port(phba, 2);
-	else if (rc && mode == 2)
-		rc = lpfc_sli_config_port(phba, 3);
-	if (rc)
-		goto lpfc_sli_hba_setup_error;
-
 	/* Enable PCIe device Advanced Error Reporting (AER) if configured */
 	if (phba->cfg_aer_support == 1 && !(phba->hba_flag & HBA_AER_ENABLED)) {
 		rc = pci_enable_pcie_error_reporting(phba->pcidev);