diff mbox series

[v4] acpi,pci: warn about duplicate IRQ routing entries returned from _PRT

Message ID 20231226124254.66102-1-mat.jonczyk@o2.pl
State New
Headers show
Series [v4] acpi,pci: warn about duplicate IRQ routing entries returned from _PRT | expand

Commit Message

Mateusz Jończyk Dec. 26, 2023, 12:42 p.m. UTC
On some platforms, the ACPI _PRT function returns duplicate interrupt
routing entries. Linux uses the first matching entry, but sometimes the
second matching entry contains the correct interrupt vector.

As a debugging aid, print a warning to dmesg if duplicate interrupt
routing entries are present. This way, we could check how many models
are affected.

This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
SMBus controller. This controller is nonfunctional unless its interrupt
usage is disabled (using the "disable_features=0x10" module parameter).

After investigation, it turned out that the driver was using an
incorrect interrupt vector: in lspci output for this device there was:
        Interrupt: pin B routed to IRQ 19
but after running i2cdetect (without using any i2c-i801 module
parameters) the following was logged to dmesg:

        [...]
        i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
        i801_smbus 0000:00:1f.3: Transaction timeout
        i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
        i801_smbus 0000:00:1f.3: Transaction timeout
        irq 17: nobody cared (try booting with the "irqpoll" option)

Existence of duplicate entries in a table returned by the _PRT method
was confirmed by disassembling the ACPI DSDT table.

Windows XP is using IRQ3 (as reported by HWiNFO32 and in the Device
Manager), which is neither of the two vectors returned by _PRT.
As HWiNFO32 decoded contents of the SPD EEPROMs, the i2c-i801 device is
working under Windows. It appears that Windows has reconfigured the
chipset independently to use another interrupt vector for the device.
This is possible, according to the chipset datasheet [1], page 436 for
example (PIRQ[n]_ROUT—PIRQ[A,B,C,D] Routing Control Register).

[1] https://www.intel.com/content/dam/doc/datasheet/io-controller-hub-9-datasheet.pdf

Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jean Delvare <jdelvare@suse.de>
Previously-reviewed-by: Jean Delvare <jdelvare@suse.de>
Previously-tested-by: Jean Delvare <jdelvare@suse.de>

---
Hello,

I'm resurrecting an older patch that was discussed back in January:

https://lore.kernel.org/lkml/20230121153314.6109-1-mat.jonczyk@o2.pl/T/#u

To consider: should we print a warning or an error in case of duplicate
entries? This may not be serious enough to disturb the user with an
error message at boot.

I'm also looking into modifying the i2c-i801 driver to disable its usage
of interrupts if one did not fire.

v2: - add a newline at the end of the kernel log message,
    - replace: "if (match == NULL)" -> "if (!match)"
    - patch description tweaks.
v3: - fix C style issues pointed by Jean Delvare,
    - switch severity from warning to error.
v3 RESEND: retested on top of v6.2-rc4
v4: - rebase and retest on top of v6.7-rc7
    - switch severity back to warning,
    - change pr_err() to dev_warn() and simplify the code,
    - modify patch description (describe Windows behaviour etc.)
---
 drivers/acpi/pci_irq.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)


base-commit: 861deac3b092f37b2c5e6871732f3e11486f7082

Comments

Jean Delvare Feb. 13, 2024, 9:39 p.m. UTC | #1
Hi Mateusz,

On Tue, 26 Dec 2023 13:42:54 +0100, Mateusz Jończyk wrote:
> On some platforms, the ACPI _PRT function returns duplicate interrupt
> routing entries. Linux uses the first matching entry, but sometimes the
> second matching entry contains the correct interrupt vector.
> 
> As a debugging aid, print a warning to dmesg if duplicate interrupt
> routing entries are present. This way, we could check how many models
> are affected.
> 
> This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
> SMBus controller. This controller is nonfunctional unless its interrupt
> usage is disabled (using the "disable_features=0x10" module parameter).
> 
> After investigation, it turned out that the driver was using an
> incorrect interrupt vector: in lspci output for this device there was:
>         Interrupt: pin B routed to IRQ 19
> but after running i2cdetect (without using any i2c-i801 module
> parameters) the following was logged to dmesg:
> 
>         [...]
>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         i801_smbus 0000:00:1f.3: Transaction timeout
>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         i801_smbus 0000:00:1f.3: Transaction timeout
>         irq 17: nobody cared (try booting with the "irqpoll" option)
> 
> Existence of duplicate entries in a table returned by the _PRT method
> was confirmed by disassembling the ACPI DSDT table.
> 
> Windows XP is using IRQ3 (as reported by HWiNFO32 and in the Device
> Manager), which is neither of the two vectors returned by _PRT.
> As HWiNFO32 decoded contents of the SPD EEPROMs, the i2c-i801 device is
> working under Windows. It appears that Windows has reconfigured the
> chipset independently to use another interrupt vector for the device.
> This is possible, according to the chipset datasheet [1], page 436 for
> example (PIRQ[n]_ROUT—PIRQ[A,B,C,D] Routing Control Register).
> 
> [1] https://www.intel.com/content/dam/doc/datasheet/io-controller-hub-9-datasheet.pdf
> 
> Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Jean Delvare <jdelvare@suse.de>
> Previously-reviewed-by: Jean Delvare <jdelvare@suse.de>
> Previously-tested-by: Jean Delvare <jdelvare@suse.de>

I'm still happy with this patch, so you can change that back to:

Reviewed-by: Jean Delvare <jdelvare@suse.de>
Tested-by: Jean Delvare <jdelvare@suse.de>

Thanks,
Rafael J. Wysocki Feb. 16, 2024, 6:26 p.m. UTC | #2
On Tue, Dec 26, 2023 at 1:50 PM Mateusz Jończyk <mat.jonczyk@o2.pl> wrote:
>
> On some platforms, the ACPI _PRT function returns duplicate interrupt
> routing entries. Linux uses the first matching entry, but sometimes the
> second matching entry contains the correct interrupt vector.
>
> As a debugging aid, print a warning to dmesg if duplicate interrupt
> routing entries are present. This way, we could check how many models
> are affected.
>
> This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
> SMBus controller. This controller is nonfunctional unless its interrupt
> usage is disabled (using the "disable_features=0x10" module parameter).
>
> After investigation, it turned out that the driver was using an
> incorrect interrupt vector: in lspci output for this device there was:
>         Interrupt: pin B routed to IRQ 19
> but after running i2cdetect (without using any i2c-i801 module
> parameters) the following was logged to dmesg:
>
>         [...]
>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         i801_smbus 0000:00:1f.3: Transaction timeout
>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         i801_smbus 0000:00:1f.3: Transaction timeout
>         irq 17: nobody cared (try booting with the "irqpoll" option)
>
> Existence of duplicate entries in a table returned by the _PRT method
> was confirmed by disassembling the ACPI DSDT table.
>
> Windows XP is using IRQ3 (as reported by HWiNFO32 and in the Device
> Manager), which is neither of the two vectors returned by _PRT.
> As HWiNFO32 decoded contents of the SPD EEPROMs, the i2c-i801 device is
> working under Windows. It appears that Windows has reconfigured the
> chipset independently to use another interrupt vector for the device.
> This is possible, according to the chipset datasheet [1], page 436 for
> example (PIRQ[n]_ROUT—PIRQ[A,B,C,D] Routing Control Register).
>
> [1] https://www.intel.com/content/dam/doc/datasheet/io-controller-hub-9-datasheet.pdf
>
> Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Jean Delvare <jdelvare@suse.de>
> Previously-reviewed-by: Jean Delvare <jdelvare@suse.de>
> Previously-tested-by: Jean Delvare <jdelvare@suse.de>
>
> ---
> Hello,
>
> I'm resurrecting an older patch that was discussed back in January:
>
> https://lore.kernel.org/lkml/20230121153314.6109-1-mat.jonczyk@o2.pl/T/#u
>
> To consider: should we print a warning or an error in case of duplicate
> entries? This may not be serious enough to disturb the user with an
> error message at boot.
>
> I'm also looking into modifying the i2c-i801 driver to disable its usage
> of interrupts if one did not fire.
>
> v2: - add a newline at the end of the kernel log message,
>     - replace: "if (match == NULL)" -> "if (!match)"
>     - patch description tweaks.
> v3: - fix C style issues pointed by Jean Delvare,
>     - switch severity from warning to error.
> v3 RESEND: retested on top of v6.2-rc4
> v4: - rebase and retest on top of v6.7-rc7
>     - switch severity back to warning,
>     - change pr_err() to dev_warn() and simplify the code,
>     - modify patch description (describe Windows behaviour etc.)
> ---
>  drivers/acpi/pci_irq.c | 25 ++++++++++++++++++++++---
>  1 file changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
> index ff30ceca2203..1fcf72e335b0 100644
> --- a/drivers/acpi/pci_irq.c
> +++ b/drivers/acpi/pci_irq.c
> @@ -203,6 +203,8 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
>         struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
>         struct acpi_pci_routing_table *entry;
>         acpi_handle handle = NULL;
> +       struct acpi_prt_entry *match = NULL;
> +       const char *match_int_source = NULL;
>
>         if (dev->bus->bridge)
>                 handle = ACPI_HANDLE(dev->bus->bridge);
> @@ -219,13 +221,30 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
>
>         entry = buffer.pointer;
>         while (entry && (entry->length > 0)) {
> -               if (!acpi_pci_irq_check_entry(handle, dev, pin,
> -                                                entry, entry_ptr))
> -                       break;
> +               struct acpi_prt_entry *curr;
> +
> +               if (!acpi_pci_irq_check_entry(handle, dev, pin, entry, &curr)) {
> +                       if (!match) {
> +                               match = curr;
> +                               match_int_source = entry->source;
> +                       } else {
> +                               dev_warn(&dev->dev, FW_BUG

dev_info() would be sufficient here IMV.

> +                                      "ACPI _PRT returned duplicate IRQ routing entries for INT%c: %s[%d] and %s[%d]\n",
> +                                      pin_name(curr->pin),
> +                                      match_int_source, match->index,
> +                                      entry->source, curr->index);
> +                               /* We use the first matching entry nonetheless,
> +                                * for compatibility with older kernels.
> +                                */
> +                       }
> +               }
> +
>                 entry = (struct acpi_pci_routing_table *)
>                     ((unsigned long)entry + entry->length);
>         }
>
> +       *entry_ptr = match;
> +
>         kfree(buffer.pointer);
>         return 0;
>  }
>
> base-commit: 861deac3b092f37b2c5e6871732f3e11486f7082
> --

Bjorn, any concerns regarding this one?
Andy Shevchenko Feb. 20, 2024, 10:12 p.m. UTC | #3
Tue, Dec 26, 2023 at 01:42:54PM +0100, Mateusz Jończyk kirjoitti:
> On some platforms, the ACPI _PRT function returns duplicate interrupt
> routing entries. Linux uses the first matching entry, but sometimes the
> second matching entry contains the correct interrupt vector.
> 
> As a debugging aid, print a warning to dmesg if duplicate interrupt
> routing entries are present. This way, we could check how many models
> are affected.
> 
> This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
> SMBus controller. This controller is nonfunctional unless its interrupt
> usage is disabled (using the "disable_features=0x10" module parameter).
> 
> After investigation, it turned out that the driver was using an
> incorrect interrupt vector: in lspci output for this device there was:
>         Interrupt: pin B routed to IRQ 19
> but after running i2cdetect (without using any i2c-i801 module
> parameters) the following was logged to dmesg:
> 
>         [...]
>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         i801_smbus 0000:00:1f.3: Transaction timeout
>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         i801_smbus 0000:00:1f.3: Transaction timeout
>         irq 17: nobody cared (try booting with the "irqpoll" option)
> 
> Existence of duplicate entries in a table returned by the _PRT method
> was confirmed by disassembling the ACPI DSDT table.
> 
> Windows XP is using IRQ3 (as reported by HWiNFO32 and in the Device
> Manager), which is neither of the two vectors returned by _PRT.
> As HWiNFO32 decoded contents of the SPD EEPROMs, the i2c-i801 device is
> working under Windows. It appears that Windows has reconfigured the
> chipset independently to use another interrupt vector for the device.
> This is possible, according to the chipset datasheet [1], page 436 for
> example (PIRQ[n]_ROUT—PIRQ[A,B,C,D] Routing Control Register).

> [1] https://www.intel.com/content/dam/doc/datasheet/io-controller-hub-9-datasheet.pdf
> 

Can you convert this to be a Link tag?

Link: ...URL... # [1]
Signed-off-by: ...

> Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>

> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Jean Delvare <jdelvare@suse.de>

Please, move these (Cc lines) down after --- cutter line.

> Previously-reviewed-by: Jean Delvare <jdelvare@suse.de>
> Previously-tested-by: Jean Delvare <jdelvare@suse.de>

This shouldn't be in the commit message, just use the comment block
(after --- line) for this.

...

>  	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
>  	struct acpi_pci_routing_table *entry;
>  	acpi_handle handle = NULL;
> +	struct acpi_prt_entry *match = NULL;
> +	const char *match_int_source = NULL;

Can you preserve reversed xmas tree ordering?

...

>  	while (entry && (entry->length > 0)) {
> -		if (!acpi_pci_irq_check_entry(handle, dev, pin,
> -						 entry, entry_ptr))
> -			break;
> +		struct acpi_prt_entry *curr;
> +
> +		if (!acpi_pci_irq_check_entry(handle, dev, pin, entry, &curr)) {

> +			if (!match) {

Why not positive condition?

> +				match = curr;
> +				match_int_source = entry->source;
> +			} else {
> +				dev_warn(&dev->dev, FW_BUG
> +				       "ACPI _PRT returned duplicate IRQ routing entries for INT%c: %s[%d] and %s[%d]\n",
> +				       pin_name(curr->pin),
> +				       match_int_source, match->index,
> +				       entry->source, curr->index);
> +				/* We use the first matching entry nonetheless,
> +				 * for compatibility with older kernels.
> +				 */
> +			}
> +		}
diff mbox series

Patch

diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
index ff30ceca2203..1fcf72e335b0 100644
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -203,6 +203,8 @@  static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
 	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
 	struct acpi_pci_routing_table *entry;
 	acpi_handle handle = NULL;
+	struct acpi_prt_entry *match = NULL;
+	const char *match_int_source = NULL;
 
 	if (dev->bus->bridge)
 		handle = ACPI_HANDLE(dev->bus->bridge);
@@ -219,13 +221,30 @@  static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
 
 	entry = buffer.pointer;
 	while (entry && (entry->length > 0)) {
-		if (!acpi_pci_irq_check_entry(handle, dev, pin,
-						 entry, entry_ptr))
-			break;
+		struct acpi_prt_entry *curr;
+
+		if (!acpi_pci_irq_check_entry(handle, dev, pin, entry, &curr)) {
+			if (!match) {
+				match = curr;
+				match_int_source = entry->source;
+			} else {
+				dev_warn(&dev->dev, FW_BUG
+				       "ACPI _PRT returned duplicate IRQ routing entries for INT%c: %s[%d] and %s[%d]\n",
+				       pin_name(curr->pin),
+				       match_int_source, match->index,
+				       entry->source, curr->index);
+				/* We use the first matching entry nonetheless,
+				 * for compatibility with older kernels.
+				 */
+			}
+		}
+
 		entry = (struct acpi_pci_routing_table *)
 		    ((unsigned long)entry + entry->length);
 	}
 
+	*entry_ptr = match;
+
 	kfree(buffer.pointer);
 	return 0;
 }