diff mbox

Marvell Phy (1510) issue since v4.7 kernel

Message ID 587410EF.5090607@ti.com
State New
Headers show

Commit Message

Murali Karicheri Jan. 9, 2017, 10:38 p.m. UTC
Hello Charles-Antoine, Andrew,

We have recently upgraded our kernel to v4.9 and started seeing an issue
on our Keystone EVMs (K2E/L) that uses Marvel Phy 1510. The issue is that
when we do a reboot command from the Linux console or do a SoC reset,
the DHCP times out at U-Boot due to Phy auto negotiation failure. This works
fine when the board is power cycled.

This is traced back to v4.7 where following commits were introduced:-

3758be3dc162b56ea Marvell phy: add functions to suspend and resume both interfaces: fiber and copper links.
78301ebe9b5a2645d Marvell phy: add configuration of autonegociation for fiber link.
2170fef78a400ca3c Marvell phy: add field to get errors from fiber link.
6cfb3bcc064110995 Marvell phy: check link status in case of fiber link 

Specifically the commit 6cfb3bcc0641109951a124019cd2e0623107d18d which changed
the marvell_read_status() behavior

Once we add the below HACK, this works fine.

commit e5bd8bfe7f544df03772c094331bb27e1a5a5600
Author: Murali Karicheri <m-karicheri2@ti.com>
Date:   Fri Jan 6 12:22:13 2017 -0500

    TEMP: work around in marvel Phy driver for u-boot dhcp timeout
    
    Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>




Also we see comments/logic in the code that contradicts with each other

For example in marvell_read_status(), it says "Check the fiber mode first and
uses the logic 
     if (phydev->supported & SUPPORTED_FIBRE)

Where as in  marvell_resume() comment says 'Resume the fiber mode first' and
uses the logic 

     if (!(phydev->supported & SUPPORTED_FIBRE))

Both are not symmetrical. Could you please explain?

-- 
Murali Karicheri
Linux Kernel, Keystone

Comments

Andrew Lunn Jan. 9, 2017, 10:56 p.m. UTC | #1
On Mon, Jan 09, 2017 at 05:38:39PM -0500, Murali Karicheri wrote:
> Hello Charles-Antoine, Andrew,

> 

> We have recently upgraded our kernel to v4.9 and started seeing an issue

> on our Keystone EVMs (K2E/L) that uses Marvel Phy 1510. The issue is that

> when we do a reboot command from the Linux console or do a SoC reset,

> the DHCP times out at U-Boot due to Phy auto negotiation failure. This works

> fine when the board is power cycled.


I've seen similar before. What version of kernel are you upgrading
from?

The problem i've had is that on shutdown/reboot, linux powers the PHYs
off. The uboot i have does not power them on again. Clearly a uboot
issue, it should not assume the PHYs are powered on.

I work around this with a uboot command:

mii write 1 0 0x3100

where the PHY is at address 1 on the bus. This writes to register 0,
and the import bit which needs setting to 0 is bit 11.

If mii read shows bit 11 is set, you know there uboot did not power up
the PHY.

    Andrew
Kwok, WingMan Jan. 9, 2017, 11:48 p.m. UTC | #2
Hi Andrew,


> -----Original Message-----

> From: Andrew Lunn [mailto:andrew@lunn.ch]

> Sent: Monday, January 09, 2017 5:57 PM

> To: Karicheri, Muralidharan

> Cc: netdev@vger.kernel.org; Kwok, WingMan

> Subject: Re: Marvell Phy (1510) issue since v4.7 kernel

> 

> On Mon, Jan 09, 2017 at 05:38:39PM -0500, Murali Karicheri wrote:

> > Hello Charles-Antoine, Andrew,

> >

> > We have recently upgraded our kernel to v4.9 and started seeing an

> issue

> > on our Keystone EVMs (K2E/L) that uses Marvel Phy 1510. The issue is

> that

> > when we do a reboot command from the Linux console or do a SoC

> reset,

> > the DHCP times out at U-Boot due to Phy auto negotiation failure.

> This works

> > fine when the board is power cycled.

> 

> I've seen similar before. What version of kernel are you upgrading

> from?

> 

> The problem i've had is that on shutdown/reboot, linux powers the PHYs

> off. The uboot i have does not power them on again. Clearly a uboot

> issue, it should not assume the PHYs are powered on.

> 

> I work around this with a uboot command:

> 

> mii write 1 0 0x3100

> 

> where the PHY is at address 1 on the bus. This writes to register 0,

> and the import bit which needs setting to 0 is bit 11.

> 

> If mii read shows bit 11 is set, you know there uboot did not power up

> the PHY.

> 

>     Andrew


Thanks for the suggestion.

But when kernel is reboot into u-boot, "mii read 0 0" shows
0x1000, ie. bit 11 is 0, and still the phy auto-nego times out.

The "mii read 0 0" shows the same value 0x1000 if we power cycle
board and the phy auto-nego works fine.

Could there be some other registers that need to be re-configured?

We are upgrading from kernel v4.4 to v4.9. And the u-boot is 2016.05.

Regards,
WingMan
Andrew Lunn Jan. 9, 2017, 11:55 p.m. UTC | #3
> But when kernel is reboot into u-boot, "mii read 0 0" shows

> 0x1000, ie. bit 11 is 0, and still the phy auto-nego times out.


O.K, not so simple then.

I suggest you dump all the registers, in both the good and bad state,
and see how they differ.

    Andrew
Kwok, WingMan Jan. 10, 2017, 1:37 a.m. UTC | #4
> -----Original Message-----

> From: Andrew Lunn [mailto:andrew@lunn.ch]

> Sent: Monday, January 09, 2017 6:56 PM

> To: Kwok, WingMan

> Cc: Karicheri, Muralidharan; netdev@vger.kernel.org

> Subject: Re: Marvell Phy (1510) issue since v4.7 kernel

> 

> > But when kernel is reboot into u-boot, "mii read 0 0" shows

> > 0x1000, ie. bit 11 is 0, and still the phy auto-nego times out.

> 

> O.K, not so simple then.

> 

> I suggest you dump all the registers, in both the good and bad state,

> and see how they differ.

> 

>     Andrew


From Marvell's brief description http://www.marvell.com/transceivers/alaska-gbe/,
it seems that 88E1510/1518 don't support fiber. Only 88E1512 does. In
that case, the fiber support patch is not applicable to 88E1510/1518.

I investigate a little more of the u-boot auto-nego time out
problem after reboot and found that the problem is caused by
marvell_read_status() which leaves the fiber page out because the
phydev->link is TRUE. But if fiber is not applicable to 88E1510/1518,
this is an invalid check.

Any suggestion?

WingMan
Andrew Lunn Jan. 10, 2017, 1:53 a.m. UTC | #5
> From Marvell's brief description http://www.marvell.com/transceivers/alaska-gbe/,

> it seems that 88E1510/1518 don't support fiber. Only 88E1512 does. In

> that case, the fiber support patch is not applicable to 88E1510/1518.


O.K. That makes it easier.

Please add the relevant IDs to include/linux/marvell.h, and add
entries to driver/net/phy/marvell.c for 1512 with SUPPORTED_FIBRE and
1510 and 1518 without.

     Andrew
Kwok, WingMan Jan. 11, 2017, 10:18 p.m. UTC | #6
> -----Original Message-----

> From: Andrew Lunn [mailto:andrew@lunn.ch]

> Sent: Monday, January 09, 2017 8:54 PM

> To: Kwok, WingMan

> Cc: Karicheri, Muralidharan; netdev@vger.kernel.org

> Subject: Re: Marvell Phy (1510) issue since v4.7 kernel

> 

> > From Marvell's brief description

> http://www.marvell.com/transceivers/alaska-gbe/,

> > it seems that 88E1510/1518 don't support fiber. Only 88E1512 does.

> In

> > that case, the fiber support patch is not applicable to

> 88E1510/1518.

> 

> O.K. That makes it easier.

> 

> Please add the relevant IDs to include/linux/marvell.h, and add

> entries to driver/net/phy/marvell.c for 1512 with SUPPORTED_FIBRE and

> 1510 and 1518 without.

> 

>      Andrew


By any chance you have the ID of 1512?

Thanks,
WingMan
Andrew Lunn Jan. 11, 2017, 10:27 p.m. UTC | #7
On Wed, Jan 11, 2017 at 10:18:02PM +0000, Kwok, WingMan wrote:
> 

> 

> > -----Original Message-----

> > From: Andrew Lunn [mailto:andrew@lunn.ch]

> > Sent: Monday, January 09, 2017 8:54 PM

> > To: Kwok, WingMan

> > Cc: Karicheri, Muralidharan; netdev@vger.kernel.org

> > Subject: Re: Marvell Phy (1510) issue since v4.7 kernel

> > 

> > > From Marvell's brief description

> > http://www.marvell.com/transceivers/alaska-gbe/,

> > > it seems that 88E1510/1518 don't support fiber. Only 88E1512 does.

> > In

> > > that case, the fiber support patch is not applicable to

> > 88E1510/1518.

> > 

> > O.K. That makes it easier.

> > 

> > Please add the relevant IDs to include/linux/marvell.h, and add

> > entries to driver/net/phy/marvell.c for 1512 with SUPPORTED_FIBRE and

> > 1510 and 1518 without.

> > 

> >      Andrew

> 

> By any chance you have the ID of 1512?


Nope, sorry.

Try Russell King.

    Andrew
Kwok, WingMan Jan. 11, 2017, 10:32 p.m. UTC | #8
> -----Original Message-----

> From: Andrew Lunn [mailto:andrew@lunn.ch]

> Sent: Wednesday, January 11, 2017 5:28 PM

> To: Kwok, WingMan

> Cc: Karicheri, Muralidharan; netdev@vger.kernel.org

> Subject: Re: Marvell Phy (1510) issue since v4.7 kernel

> 

> On Wed, Jan 11, 2017 at 10:18:02PM +0000, Kwok, WingMan wrote:

> >

> >

> > > -----Original Message-----

> > > From: Andrew Lunn [mailto:andrew@lunn.ch]

> > > Sent: Monday, January 09, 2017 8:54 PM

> > > To: Kwok, WingMan

> > > Cc: Karicheri, Muralidharan; netdev@vger.kernel.org

> > > Subject: Re: Marvell Phy (1510) issue since v4.7 kernel

> > >

> > > > From Marvell's brief description

> > > http://www.marvell.com/transceivers/alaska-gbe/,

> > > > it seems that 88E1510/1518 don't support fiber. Only 88E1512

> does.

> > > In

> > > > that case, the fiber support patch is not applicable to

> > > 88E1510/1518.

> > >

> > > O.K. That makes it easier.

> > >

> > > Please add the relevant IDs to include/linux/marvell.h, and add

> > > entries to driver/net/phy/marvell.c for 1512 with SUPPORTED_FIBRE

> and

> > > 1510 and 1518 without.

> > >

> > >      Andrew

> >

> > By any chance you have the ID of 1512?

> 

> Nope, sorry.

> 

> Try Russell King.

> 

>     Andrew


Russell,

Do you have the ID of Marvell PHY 88E1512?

Thanks,
WingMan
Kwok, WingMan Jan. 11, 2017, 10:41 p.m. UTC | #9
> -----Original Message-----

> From: Kwok, WingMan

> Sent: Wednesday, January 11, 2017 5:32 PM

> To: 'Andrew Lunn'; 'rmk+kernel@arm.linux.org.uk'

> Cc: Karicheri, Muralidharan; netdev@vger.kernel.org

> Subject: RE: Marvell Phy (1510) issue since v4.7 kernel

> 

> 

> 

> > -----Original Message-----

> > From: Andrew Lunn [mailto:andrew@lunn.ch]

> > Sent: Wednesday, January 11, 2017 5:28 PM

> > To: Kwok, WingMan

> > Cc: Karicheri, Muralidharan; netdev@vger.kernel.org

> > Subject: Re: Marvell Phy (1510) issue since v4.7 kernel

> >

> > On Wed, Jan 11, 2017 at 10:18:02PM +0000, Kwok, WingMan wrote:

> > >

> > >

> > > > -----Original Message-----

> > > > From: Andrew Lunn [mailto:andrew@lunn.ch]

> > > > Sent: Monday, January 09, 2017 8:54 PM

> > > > To: Kwok, WingMan

> > > > Cc: Karicheri, Muralidharan; netdev@vger.kernel.org

> > > > Subject: Re: Marvell Phy (1510) issue since v4.7 kernel

> > > >

> > > > > From Marvell's brief description

> > > > http://www.marvell.com/transceivers/alaska-gbe/,

> > > > > it seems that 88E1510/1518 don't support fiber. Only 88E1512

> > does.

> > > > In

> > > > > that case, the fiber support patch is not applicable to

> > > > 88E1510/1518.

> > > >

> > > > O.K. That makes it easier.

> > > >

> > > > Please add the relevant IDs to include/linux/marvell.h, and add

> > > > entries to driver/net/phy/marvell.c for 1512 with

> SUPPORTED_FIBRE

> > and

> > > > 1510 and 1518 without.

> > > >

> > > >      Andrew

> > >

> > > By any chance you have the ID of 1512?

> >

> > Nope, sorry.

> >

> > Try Russell King.

> >

> >     Andrew

> 

> Russell,

> 

> Do you have the ID of Marvell PHY 88E1512?

> 

> Thanks,

> WingMan

> 


looping in Charles-Antoine (author of patch
with commit id 6cfb3bcc)

Charles-Antoine,

Do you have the ID of Marvell PHY 88E1512?

Thanks,
WingMan
Andrew Lunn Jan. 11, 2017, 10:59 p.m. UTC | #10
> looping in Charles-Antoine (author of patch

> with commit id 6cfb3bcc)

> 

> Charles-Antoine,

> 

> Do you have the ID of Marvell PHY 88E1512?


I suspect that is the wrong question to ask.

The Marvell driver is being loaded, so it must be using on of the IDs
in the driver. There is no ID in the driver specifically for the
88E1512. It seems like the 88E1512 uses the 88E1510 ID.

So i think the correct question should be, how can we tell the 88E1512
from the 88E1510 if they have the same ID in register 3.

It appears that for the 88E1512, page 0 are the copper registers and
page 1 is the fibre registers. Maybe the 88E1512 has an ID in page 1
register 3? Maybe the 88E1510 does not have an ID in page 1 register
3?

	Andrew
Kwok, WingMan Jan. 11, 2017, 11:09 p.m. UTC | #11
> -----Original Message-----

> From: Andrew Lunn [mailto:andrew@lunn.ch]

> Sent: Wednesday, January 11, 2017 6:00 PM

> To: Kwok, WingMan

> Cc: rmk+kernel@arm.linux.org.uk; charles-antoine.couret@nexvision.fr;

> Karicheri, Muralidharan; netdev@vger.kernel.org

> Subject: Re: Marvell Phy (1510) issue since v4.7 kernel

> 

> > looping in Charles-Antoine (author of patch

> > with commit id 6cfb3bcc)

> >

> > Charles-Antoine,

> >

> > Do you have the ID of Marvell PHY 88E1512?

> 

> I suspect that is the wrong question to ask.

> 

> The Marvell driver is being loaded, so it must be using on of the IDs

> in the driver. There is no ID in the driver specifically for the

> 88E1512. It seems like the 88E1512 uses the 88E1510 ID.

> 

> So i think the correct question should be, how can we tell the 88E1512

> from the 88E1510 if they have the same ID in register 3.

> 

> It appears that for the 88E1512, page 0 are the copper registers and

> page 1 is the fibre registers. Maybe the 88E1512 has an ID in page 1

> register 3? Maybe the 88E1510 does not have an ID in page 1 register

> 3?

> 

> 	Andrew


Andrew,

Would Charles-Antoine be the better person to submit a patch
to fix the original problem then, since he tested the original
fiber support patch with 1512? I unfortunately don't have the
datasheet for 1512, and it does not seem to be available publicly.

Regards,
WingMan
Andrew Lunn Jan. 11, 2017, 11:22 p.m. UTC | #12
> > So i think the correct question should be, how can we tell the 88E1512

> > from the 88E1510 if they have the same ID in register 3.


Hi WingMan

Do you know you really have a 1510? You can see it written on the
chip?

Please can you read page 18, register 30 and let us know what value
you get.

    Andrew
Kwok, WingMan Jan. 12, 2017, 12:31 a.m. UTC | #13
> Hi WingMan

> 

> Do you know you really have a 1510? You can see it written on the

> chip?

> 

> Please can you read page 18, register 30 and let us know what value

> you get.

> 

>     Andrew


Removed Charles-Antonie from the list since the email
address is no longer valid.

Hi Andrew,

I double checked that ours is actually a 88E1514. According
to Marvell's brief description, it seems that it does support
fiber.

Reading page 18 reg 30 returns 6.

Thanks,
WingMan
Andrew Lunn Jan. 12, 2017, 1:06 a.m. UTC | #14
> Hi Andrew,

> 

> I double checked that ours is actually a 88E1514. According

> to Marvell's brief description, it seems that it does support

> fiber.

> 

> Reading page 18 reg 30 returns 6.


O.K, that is consistent. Looking at the Marvell SDK:

phy/Include/madApiDefs.h:#define    MAD_SUB_88E1510  4   /* 88E1510 */
phy/Include/madApiDefs.h:#define    MAD_SUB_88E1518  5   /* 88E1518 */
phy/Include/madApiDefs.h:#define    MAD_SUB_88E1512  6   /* 88E1512/88E1514 */

I took another look at the patch adding Fibre support. The
suspend/resume functions are wrong.

        /* Suspend the fiber mode first */
        if (!(phydev->supported & SUPPORTED_FIBRE)) {
 
The ! should not be there. Does removing them both fix your problem?

    Andrew
Kwok, WingMan Jan. 12, 2017, 9:31 p.m. UTC | #15
> > I double checked that ours is actually a 88E1514. According

> > to Marvell's brief description, it seems that it does support

> > fiber.

> >

> > Reading page 18 reg 30 returns 6.

> 

> O.K, that is consistent. Looking at the Marvell SDK:

> 

> phy/Include/madApiDefs.h:#define    MAD_SUB_88E1510  4   /* 88E1510 */

> phy/Include/madApiDefs.h:#define    MAD_SUB_88E1518  5   /* 88E1518 */

> phy/Include/madApiDefs.h:#define    MAD_SUB_88E1512  6   /*

> 88E1512/88E1514 */

> 

> I took another look at the patch adding Fibre support. The

> suspend/resume functions are wrong.

> 

>         /* Suspend the fiber mode first */

>         if (!(phydev->supported & SUPPORTED_FIBRE)) {

> 

> The ! should not be there. Does removing them both fix your problem?

> 


Hi Andrew,

Thanks for taking the time to look into those patches. Yes we notice
the error in the suspend/resume functions also.

But our problem is caused by the read_status function:

	if ((phydev->supported & SUPPORTED_FIBRE)) {
		err = phy_write(phydev, MII_MARVELL_PHY_PAGE, MII_M1111_FIBER);
		if (err < 0)
			goto error;
		
		err = marvell_read_status_page(phydev, MII_M1111_FIBER);
		if (err < 0)
			goto error;
		
		/* If the fiber link is up, it is the selected and used link.
		* In this case, we need to stay in the fiber page.
		* Please to be careful about that, avoid to restore Copper page
		* in other functions which could break the behaviour
		* for some fiber phy like 88E1512.
		* */
		if (phydev->link)
			return 0;

which keeps the fiber page if phydev->link is true (for some
reason this is the case even though we are not using fiber)

However, this causes a problem in kernel reboot because neither
the suspend/resume is called to restore the copper page and
u-boot marvell phy driver does not support 1510 fiber, which
will then result in writing to the wrong phy regs and causes
a sgmii auto-nego time out.

In addition to fixing the ! in suspend/resume, my suggestion
would be to change also the read_status function to
always restore the copper page after doing the fiber stuffs:

	if ((phydev->supported & SUPPORTED_FIBRE)) {
		err = phy_write(phydev, MII_MARVELL_PHY_PAGE, MII_M1111_FIBER);
		if (err < 0)
			goto error;
		
		err = marvell_read_status_page(phydev, MII_M1111_FIBER);
		if (err < 0)
			goto error;
		
		err = phy_write(phydev, MII_MARVELL_PHY_PAGE, MII_M1111_COPPER);
		if (err < 0)
			goto error;

		/* If the fiber link is up, it is the selected and used link.
		* In this case, we need to stay in the fiber page.
		* Please to be careful about that, avoid to restore Copper page
		* in other functions which could break the behaviour
		* for some fiber phy like 88E1512.
		* */
		if (phydev->link)
			return 0;

This makes sure that it is always the copper page brought up
and whenever it needs to do some fiber stuffs, the fiber page
needs to be brought out explicitly, which is already the case
currently.

Another issue is that, as of now, FIBER is enabled regardless
of the specific 88e151x. But I believe there is 88e151x chip(s)
that does not support fiber. Should fiber be enabled only for
those that do support fiber?

WingMan
Andrew Lunn Jan. 12, 2017, 9:49 p.m. UTC | #16
> But our problem is caused by the read_status function:

> 

> 	if ((phydev->supported & SUPPORTED_FIBRE)) {

> 		err = phy_write(phydev, MII_MARVELL_PHY_PAGE, MII_M1111_FIBER);

> 		if (err < 0)

> 			goto error;

> 		

> 		err = marvell_read_status_page(phydev, MII_M1111_FIBER);

> 		if (err < 0)

> 			goto error;

> 		

> 		/* If the fiber link is up, it is the selected and used link.

> 		* In this case, we need to stay in the fiber page.

> 		* Please to be careful about that, avoid to restore Copper page

> 		* in other functions which could break the behaviour

> 		* for some fiber phy like 88E1512.

> 		* */

> 		if (phydev->link)

> 			return 0;

> 

> which keeps the fiber page if phydev->link is true (for some

> reason this is the case even though we are not using fiber)


How are you using the PHY. What phy-mode do you have set?  Do you
happen to be using it as an RGMII to SERDES/SGMII bridge? This is what
Russell King is doing, i think.

Have you tried the patch Russell submitted recently.

Author: Russell King <rmk+kernel@armlinux.org.uk>
Date:   Tue Jan 10 23:13:45 2017 +0000

    net: phy: marvell: fix Marvell 88E1512 used in SGMII mode
    
    When an Marvell 88E1512 PHY is connected to a nic in SGMII mode, the
    fiber page is used for the SGMII host-side connection.  The PHY driver
    notices that SUPPORTED_FIBRE is set, so it tries reading the fiber page
    for the link status, and ends up reading the MAC-side status instead of
    the outgoing (copper) link.  This leads to incorrect results reported
    via ethtool.
    
    If the PHY is connected via SGMII to the host, ignore the fiber page.
    However, continue to allow the existing power management code to
    suspend and resume the fiber page.

> However, this causes a problem in kernel reboot because neither

> the suspend/resume is called to restore the copper page and

> u-boot marvell phy driver does not support 1510 fiber, which

> will then result in writing to the wrong phy regs and causes

> a sgmii auto-nego time out.


This is still a u-boot bug. It should not assume the PHY is in a sane
state. It should reset it and configure it as needed. So far, i don't
think you have reported any issues with Linux usage of the PHY. There
clearly are bugs, but your real problem is u-boot.

> In addition to fixing the ! in suspend/resume, my suggestion

> would be to change also the read_status function to

> always restore the copper page after doing the fiber stuffs:


Nope. This is done deliberately, as the comment suggests:

> 		/* If the fiber link is up, it is the selected and used link.

> 		* In this case, we need to stay in the fiber page.

> 		* Please to be careful about that, avoid to restore Copper page

> 		* in other functions which could break the behaviour

> 		* for some fiber phy like 88E1512.

> 		* */

> 		if (phydev->link)

> 			return 0;


The point is, the phylib will continue polling the PHY registers,
reading them. If the FIBRE is up, we want to read the FIBRE values,
not the copper.

> Another issue is that, as of now, FIBER is enabled regardless

> of the specific 88e151x. But I believe there is 88e151x chip(s)

> that does not support fiber. Should fiber be enabled only for

> those that do support fiber?


Yes, we should look at register 30, page 18 any set SUPPORTED_FIBRE
based on that.

      Andrew
Russell King (Oracle) Jan. 12, 2017, 10:31 p.m. UTC | #17
On Thu, Jan 12, 2017 at 10:49:41PM +0100, Andrew Lunn wrote:
> How are you using the PHY. What phy-mode do you have set?  Do you

> happen to be using it as an RGMII to SERDES/SGMII bridge? This is what

> Russell King is doing, i think.


No - the 88E1512 here is connected to the ethernet host via SGMII.

	Armada 8040 ----SGMII----> 88E1512 ----> RJ45

The problem that my commit fixes is that when the 88E1512 is used in
this mode, the copper page reflects the status of the RJ45-side link,
and the fiber page reflects the Armada 8040 side of the link.

Without my patch, the 88E1512 read_status function spots that the fiber
page reports its link is up (because the Armada 8040 side is indeed up)
and decides to report that as the link settings, rather than the
(correct) copper side.

> The point is, the phylib will continue polling the PHY registers,

> reading them. If the FIBRE is up, we want to read the FIBRE values,

> not the copper.


... but only if we aren't connected in SGMII mode to the ethernet host!

The 88E1512 has the RGMII pins, and it has the Serdes pins.  The Serdes
pins are used for the optical transceiver, or they're used for a SGMII
connection to an ethernet host.  So they can't do both.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
Kwok, WingMan Jan. 12, 2017, 10:40 p.m. UTC | #18
> -----Original Message-----

> From: Russell King - ARM Linux [mailto:linux@armlinux.org.uk]

> Sent: Thursday, January 12, 2017 4:42 PM

> To: Kwok, WingMan

> Cc: Andrew Lunn; Karicheri, Muralidharan; netdev@vger.kernel.org

> Subject: Re: Marvell Phy (1510) issue since v4.7 kernel

> 

> On Thu, Jan 12, 2017 at 09:31:27PM +0000, Kwok, WingMan wrote:

> >

> > > > I double checked that ours is actually a 88E1514. According

> > > > to Marvell's brief description, it seems that it does support

> > > > fiber.

> > > >

> > > > Reading page 18 reg 30 returns 6.

> > >

> > > O.K, that is consistent. Looking at the Marvell SDK:

> > >

> > > phy/Include/madApiDefs.h:#define    MAD_SUB_88E1510  4   /*

> 88E1510 */

> > > phy/Include/madApiDefs.h:#define    MAD_SUB_88E1518  5   /*

> 88E1518 */

> > > phy/Include/madApiDefs.h:#define    MAD_SUB_88E1512  6   /*

> > > 88E1512/88E1514 */

> > >

> > > I took another look at the patch adding Fibre support. The

> > > suspend/resume functions are wrong.

> > >

> > >         /* Suspend the fiber mode first */

> > >         if (!(phydev->supported & SUPPORTED_FIBRE)) {

> > >

> > > The ! should not be there. Does removing them both fix your

> problem?

> > >

> >

> > Hi Andrew,

> >

> > Thanks for taking the time to look into those patches. Yes we notice

> > the error in the suspend/resume functions also.

> >

> > But our problem is caused by the read_status function:

> >

> > 	if ((phydev->supported & SUPPORTED_FIBRE)) {

> > 		err = phy_write(phydev, MII_MARVELL_PHY_PAGE,

> MII_M1111_FIBER);

> > 		if (err < 0)

> > 			goto error;

> >

> > 		err = marvell_read_status_page(phydev, MII_M1111_FIBER);

> > 		if (err < 0)

> > 			goto error;

> >

> > 		/* If the fiber link is up, it is the selected and used

> link.

> > 		* In this case, we need to stay in the fiber page.

> > 		* Please to be careful about that, avoid to restore Copper

> page

> > 		* in other functions which could break the behaviour

> > 		* for some fiber phy like 88E1512.

> > 		* */

> > 		if (phydev->link)

> > 			return 0;

> >

> > which keeps the fiber page if phydev->link is true (for some

> > reason this is the case even though we are not using fiber)

> 

> See below.

> 

> > However, this causes a problem in kernel reboot because neither

> > the suspend/resume is called to restore the copper page and

> > u-boot marvell phy driver does not support 1510 fiber, which

> > will then result in writing to the wrong phy regs and causes

> > a sgmii auto-nego time out.

> 

> So what we need to do is to have a .shutdown function installed - but

> it's not that simple...

> 

> 	.mdiodrv = {

> 		.driver = {

> 			.shutdown = mv88e1512_shutdown,

> 		},

> 	},

> 

> in the appropriate phy_driver array element, and then have:

> 

> static void mv88e1512_shutdown(struct device *dev)

> {

> 	struct phy_device *phydev = to_phy_device(dev);

> 

> 	phy_write(phydev, MII_MARVELL_PHY_PAGE, MII_M1111_COPPER);

> }

> 

> which will restore the copper page on non-emergency reboots.  For

> emergency reboots, we're out of luck... the real fix would be for

> uboot to ensure that when it detects this PHY, it ensures that the

> right page is selected.

> 

> Now, that all said, there's a bug in this code, which is fixed by

> a patch I recently submitted.  If you have an 88E151x connected via

> SGMII, then the read_status function incorrectly believes it should

> be reading from the fiber page - when in fact the fiber page is

> reporting the status of the host-side link.  So, before trying the

> above modification, please try this patch - it's already been merged

> into the net tree, so should hit -rc soon.

> 

> My guess from what you've said above is that your 88E151x is connected

> via SGMII, so you need the patch below rather than trying to install

> a shutdown function.

> 

> 8<====

> From: Russell King <rmk+kernel@armlinux.org.uk>

> Subject: [PATCH] net: phy: marvell: fix Marvell 88E1512 used in SGMII

> mode

> 


Hi Russell,

Thanks for the explanations. Yes, our 88e1514 is connected to the
host via sgmii and your patch does provide a solution to our problem.

Regards,
WingMan
Kwok, WingMan Jan. 12, 2017, 10:49 p.m. UTC | #19
Hi Andrew,

> -----Original Message-----

> From: Andrew Lunn [mailto:andrew@lunn.ch]

> Sent: Thursday, January 12, 2017 4:50 PM

> To: Kwok, WingMan

> Cc: rmk+kernel@arm.linux.org.uk; Karicheri, Muralidharan;

> netdev@vger.kernel.org

> Subject: Re: Marvell Phy (1510) issue since v4.7 kernel

> 

> > But our problem is caused by the read_status function:

> >

> > 	if ((phydev->supported & SUPPORTED_FIBRE)) {

> > 		err = phy_write(phydev, MII_MARVELL_PHY_PAGE,

> MII_M1111_FIBER);

> > 		if (err < 0)

> > 			goto error;

> >

> > 		err = marvell_read_status_page(phydev, MII_M1111_FIBER);

> > 		if (err < 0)

> > 			goto error;

> >

> > 		/* If the fiber link is up, it is the selected and used

> link.

> > 		* In this case, we need to stay in the fiber page.

> > 		* Please to be careful about that, avoid to restore Copper

> page

> > 		* in other functions which could break the behaviour

> > 		* for some fiber phy like 88E1512.

> > 		* */

> > 		if (phydev->link)

> > 			return 0;

> >

> > which keeps the fiber page if phydev->link is true (for some

> > reason this is the case even though we are not using fiber)

> 

> How are you using the PHY. What phy-mode do you have set?  Do you

> happen to be using it as an RGMII to SERDES/SGMII bridge? This is what

> Russell King is doing, i think.

> 


our 88e1514 is connected to the host via sgmii.

> Have you tried the patch Russell submitted recently.

> 

> Author: Russell King <rmk+kernel@armlinux.org.uk>

> Date:   Tue Jan 10 23:13:45 2017 +0000

> 

>     net: phy: marvell: fix Marvell 88E1512 used in SGMII mode

> 

>     When an Marvell 88E1512 PHY is connected to a nic in SGMII mode,

> the

>     fiber page is used for the SGMII host-side connection.  The PHY

> driver

>     notices that SUPPORTED_FIBRE is set, so it tries reading the fiber

> page

>     for the link status, and ends up reading the MAC-side status

> instead of

>     the outgoing (copper) link.  This leads to incorrect results

> reported

>     via ethtool.

> 

>     If the PHY is connected via SGMII to the host, ignore the fiber

> page.

>     However, continue to allow the existing power management code to

>     suspend and resume the fiber page.

> 


Thanks for pointer. It does fix the problem.

> > However, this causes a problem in kernel reboot because neither

> > the suspend/resume is called to restore the copper page and

> > u-boot marvell phy driver does not support 1510 fiber, which

> > will then result in writing to the wrong phy regs and causes

> > a sgmii auto-nego time out.

> 

> This is still a u-boot bug. It should not assume the PHY is in a sane

> state. It should reset it and configure it as needed. So far, i don't

> think you have reported any issues with Linux usage of the PHY. There

> clearly are bugs, but your real problem is u-boot.

> 


Yes. Agree.

> > In addition to fixing the ! in suspend/resume, my suggestion

> > would be to change also the read_status function to

> > always restore the copper page after doing the fiber stuffs:

> 

> Nope. This is done deliberately, as the comment suggests:

> 

> > 		/* If the fiber link is up, it is the selected and used

> link.

> > 		* In this case, we need to stay in the fiber page.

> > 		* Please to be careful about that, avoid to restore Copper

> page

> > 		* in other functions which could break the behaviour

> > 		* for some fiber phy like 88E1512.

> > 		* */

> > 		if (phydev->link)

> > 			return 0;

> 

> The point is, the phylib will continue polling the PHY registers,

> reading them. If the FIBRE is up, we want to read the FIBRE values,

> not the copper.

> 


Thanks for the explanations.

> > Another issue is that, as of now, FIBER is enabled regardless

> > of the specific 88e151x. But I believe there is 88e151x chip(s)

> > that does not support fiber. Should fiber be enabled only for

> > those that do support fiber?

> 

> Yes, we should look at register 30, page 18 any set SUPPORTED_FIBRE

> based on that.

> 


Thanks for the suggestions.

>       Andrew


Just want to know if there is already a patch in the net tree
fixing the incorrect ! in the suspend/resume functions also?

WingMan
diff mbox

Patch

diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index c2dcf02..e91bdd3 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -1194,7 +1194,7 @@  static int marvell_read_status(struct phy_device *phydev)
        int err;
 
        /* Check the fiber mode first */
-       if (phydev->supported & SUPPORTED_FIBRE) {
+       if (!(phydev->supported & SUPPORTED_FIBRE)) {
                err = phy_write(phydev, MII_MARVELL_PHY_PAGE, MII_M1111_FIBER);
                if (err < 0)
                        goto error;