diff mbox series

[v2] powerpc/eeh: Fix EEH handling for hugepages in ioremap space.

Message ID 161821396263.48361.2796709239866588652.stgit@jupiter
State Accepted
Commit 5ae5bc12d0728db60a0aa9b62160ffc038875f1a
Headers show
Series [v2] powerpc/eeh: Fix EEH handling for hugepages in ioremap space. | expand

Commit Message

Mahesh J Salgaonkar April 12, 2021, 7:52 a.m. UTC
During the EEH MMIO error checking, the current implementation fails to map
the (virtual) MMIO address back to the pci device on radix with hugepage
mappings for I/O. This results into failure to dispatch EEH event with no
recovery even when EEH capability has been enabled on the device.

eeh_check_failure(token)		# token = virtual MMIO address
  addr = eeh_token_to_phys(token);
  edev = eeh_addr_cache_get_dev(addr);
  if (!edev)
	return 0;
  eeh_dev_check_failure(edev);	<= Dispatch the EEH event

In case of hugepage mappings, eeh_token_to_phys() has a bug in virt -> phys
translation that results in wrong physical address, which is then passed to
eeh_addr_cache_get_dev() to match it against cached pci I/O address ranges
to get to a PCI device. Hence, it fails to find a match and the EEH event
never gets dispatched leaving the device in failed state.

The commit 33439620680be ("powerpc/eeh: Handle hugepages in ioremap space")
introduced following logic to translate virt to phys for hugepage mappings:

eeh_token_to_phys():
+	pa = pte_pfn(*ptep);
+
+	/* On radix we can do hugepage mappings for io, so handle that */
+       if (hugepage_shift) {
+               pa <<= hugepage_shift;			<= This is wrong
+               pa |= token & ((1ul << hugepage_shift) - 1);
+       }

This patch fixes the virt -> phys translation in eeh_token_to_phys()
function.

$ cat /sys/kernel/debug/powerpc/eeh_address_cache
mem addr range [0x0000040080000000-0x00000400807fffff]: 0030:01:00.1
mem addr range [0x0000040080800000-0x0000040080ffffff]: 0030:01:00.1
mem addr range [0x0000040081000000-0x00000400817fffff]: 0030:01:00.0
mem addr range [0x0000040081800000-0x0000040081ffffff]: 0030:01:00.0
mem addr range [0x0000040082000000-0x000004008207ffff]: 0030:01:00.1
mem addr range [0x0000040082080000-0x00000400820fffff]: 0030:01:00.0
mem addr range [0x0000040082100000-0x000004008210ffff]: 0030:01:00.1
mem addr range [0x0000040082110000-0x000004008211ffff]: 0030:01:00.0

Above is the list of cached io address ranges of pci 0030:01:00.<fn>.

Before this patch:

Tracing 'arg1' of function eeh_addr_cache_get_dev() during error injection
clearly shows that 'addr=' contains wrong physical address:

   kworker/u16:0-7       [001] ....   108.883775: eeh_addr_cache_get_dev:
	   (eeh_addr_cache_get_dev+0xc/0xf0) addr=0x80103000a510

dmesg shows no EEH recovery messages:

[  108.563768] bnx2x: [bnx2x_timer:5801(eth2)]MFW seems hanged: drv_pulse (0x9ae) != mcp_pulse (0x7fff)
[  108.563788] bnx2x: [bnx2x_hw_stats_update:870(eth2)]NIG timer max (4294967295)
[  108.883788] bnx2x: [bnx2x_acquire_hw_lock:2013(eth1)]lock_status 0xffffffff  resource_bit 0x1
[  108.884407] bnx2x 0030:01:00.0 eth1: MDC/MDIO access timeout
[  108.884976] bnx2x 0030:01:00.0 eth1: MDC/MDIO access timeout
<..>

After this patch:

eeh_addr_cache_get_dev() trace shows correct physical address:

  <idle>-0       [001] ..s.  1043.123828: eeh_addr_cache_get_dev:
	  (eeh_addr_cache_get_dev+0xc/0xf0) addr=0x40080bc7cd8

dmesg logs shows EEH recovery getting triggerred:

[  964.323980] bnx2x: [bnx2x_timer:5801(eth2)]MFW seems hanged: drv_pulse (0x746f) != mcp_pulse (0x7fff)
[  964.323991] EEH: Recovering PHB#30-PE#10000
[  964.324002] EEH: PE location: N/A, PHB location: N/A
[  964.324006] EEH: Frozen PHB#30-PE#10000 detected
<..>

Cc: stable@vger.kernel.org
Fixes: 33439620680be ("powerpc/eeh: Handle hugepages in ioremap space")
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: Dominic DeMarco <ddemarc@us.ibm.com>
---
Change in v2:
- Fixed error reported by checkpatch.pl.
---
 arch/powerpc/kernel/eeh.c |   11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

Comments

Mahesh J Salgaonkar April 14, 2021, 8:31 a.m. UTC | #1
On 2021-04-13 10:53:39 Tue, Oliver O'Halloran wrote:
> On Mon, Apr 12, 2021 at 5:52 PM Mahesh Salgaonkar <mahesh@linux.ibm.com> wrote:

> >

> > During the EEH MMIO error checking, the current implementation fails to map

> > the (virtual) MMIO address back to the pci device on radix with hugepage

> > mappings for I/O. This results into failure to dispatch EEH event with no

> > recovery even when EEH capability has been enabled on the device.

> >

> > eeh_check_failure(token)                # token = virtual MMIO address

> >   addr = eeh_token_to_phys(token);

> >   edev = eeh_addr_cache_get_dev(addr);

> >   if (!edev)

> >         return 0;

> >   eeh_dev_check_failure(edev);  <= Dispatch the EEH event

> >

> > In case of hugepage mappings, eeh_token_to_phys() has a bug in virt -> phys

> > translation that results in wrong physical address, which is then passed to

> > eeh_addr_cache_get_dev() to match it against cached pci I/O address ranges

> > to get to a PCI device. Hence, it fails to find a match and the EEH event

> > never gets dispatched leaving the device in failed state.

> >

> > The commit 33439620680be ("powerpc/eeh: Handle hugepages in ioremap space")

> > introduced following logic to translate virt to phys for hugepage mappings:

> >

> > eeh_token_to_phys():

> > +       pa = pte_pfn(*ptep);

> > +

> > +       /* On radix we can do hugepage mappings for io, so handle that */

> > +       if (hugepage_shift) {

> > +               pa <<= hugepage_shift;                  <= This is wrong

> > +               pa |= token & ((1ul << hugepage_shift) - 1);

> > +       }

> 

> I think I vaguely remember thinking "is this right?" at the time.

> Apparently not!

> 

> Reviewed-by: Oliver O'Halloran <oohall@gmail.com>

> 


Thanks for the review.

> 

> It would probably be a good idea to add a debugfs interface to help

> with testing the address translation. Maybe something like:

> 

> echo <mmio addr> > /sys/kernel/debug/powerpc/eeh_addr_check

> 

> Then in the kernel:

> 

> struct resource *r = lookup_resource(mmio_addr);

> void *virt = ioremap_resource(r);

> ret = eeh_check_failure(virt);

> iounmap(virt)

> 

> return ret;

> 

> A little tedious, but then you can write a selftest :)


Sure, will give a try.

Thanks,
-Mahesh.

-- 
Mahesh J Salgaonkar
Michael Ellerman April 19, 2021, 3:59 a.m. UTC | #2
On Mon, 12 Apr 2021 13:22:50 +0530, Mahesh Salgaonkar wrote:
> During the EEH MMIO error checking, the current implementation fails to map

> the (virtual) MMIO address back to the pci device on radix with hugepage

> mappings for I/O. This results into failure to dispatch EEH event with no

> recovery even when EEH capability has been enabled on the device.

> 

> eeh_check_failure(token)		# token = virtual MMIO address

>   addr = eeh_token_to_phys(token);

>   edev = eeh_addr_cache_get_dev(addr);

>   if (!edev)

> 	return 0;

>   eeh_dev_check_failure(edev);	<= Dispatch the EEH event

> 

> [...]


Applied to powerpc/next.

[1/1] powerpc/eeh: Fix EEH handling for hugepages in ioremap space.
      https://git.kernel.org/powerpc/c/5ae5bc12d0728db60a0aa9b62160ffc038875f1a

cheers
diff mbox series

Patch

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index cd60bc1c87011..7040e430a1249 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -362,14 +362,11 @@  static inline unsigned long eeh_token_to_phys(unsigned long token)
 	pa = pte_pfn(*ptep);
 
 	/* On radix we can do hugepage mappings for io, so handle that */
-	if (hugepage_shift) {
-		pa <<= hugepage_shift;
-		pa |= token & ((1ul << hugepage_shift) - 1);
-	} else {
-		pa <<= PAGE_SHIFT;
-		pa |= token & (PAGE_SIZE - 1);
-	}
+	if (!hugepage_shift)
+		hugepage_shift = PAGE_SHIFT;
 
+	pa <<= PAGE_SHIFT;
+	pa |= token & ((1ul << hugepage_shift) - 1);
 	return pa;
 }