
[net-next,1/4] ixgbe: sparc: rename the ARCH_WANT_RELAX_ORDER to IXGBE_ALLOW_RELAXED_ORDER

Message ID 1491031554-19516-2-git-send-email-dingtianhong@huawei.com
State New
Series ixgbe: enable Relaxed Order for ARM64

Commit Message

Ding Tianhong April 1, 2017, 7:25 a.m. UTC
Until now, only the Intel ixgbe driver has supported enabling
Relaxed Ordering for particular architectures, but
ARCH_WANT_RELAX_ORDER looks like a general name that applies
to every architecture, so renaming it to something specific to
the Intel card seems more appropriate.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>

---
 arch/Kconfig                                    | 2 +-
 arch/sparc/Kconfig                              | 2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

-- 
1.9.0

Comments

David Miller April 1, 2017, 6:26 p.m. UTC | #1
From: Ding Tianhong <dingtianhong@huawei.com>

Date: Sat, 1 Apr 2017 15:25:51 +0800

> Till now only the Intel ixgbe could support enable

> Relaxed ordering in the drivers for special architecture,

> but the ARCH_WANT_RELAX_ORDER is looks like a general name

> for all arch, so rename to a specific name for intel

> card looks more appropriate.

> 

> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>


This is not a driver specific facility.

Any driver can test this symbol and act accordingly.

Just because IXGBE is the first and only user, doesn't mean
the facility is driver specific.

Thank you.
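
As a rough illustration of what "test this symbol" means in practice, here is a minimal user-space sketch. It assumes the Kconfig option surfaces to C code as the preprocessor macro CONFIG_ARCH_WANT_RELAX_ORDER (the usual Kconfig convention). The function name is hypothetical and is not taken from ixgbe.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: reports whether this build was configured with the
 * arch-selected symbol. Any driver could make the same compile-time test.
 */
bool build_wants_relaxed_ordering(void)
{
#ifdef CONFIG_ARCH_WANT_RELAX_ORDER
	return true;	/* the architecture selected the symbol */
#else
	return false;	/* symbol not set for this build */
#endif
}
```

Because the test is resolved by the preprocessor, the answer is fixed when the kernel is built and is the same for every driver that checks it.
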
Ding Tianhong April 2, 2017, 6:49 a.m. UTC | #2
On 2017/4/2 2:26, David Miller wrote:
> From: Ding Tianhong <dingtianhong@huawei.com>

> Date: Sat, 1 Apr 2017 15:25:51 +0800

> 

>> Till now only the Intel ixgbe could support enable

>> Relaxed ordering in the drivers for special architecture,

>> but the ARCH_WANT_RELAX_ORDER is looks like a general name

>> for all arch, so rename to a specific name for intel

>> card looks more appropriate.

>>

>> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>

> 

> This is not a driver specific facility.

> 

> Any driver can test this symbol and act accordingly.

> 

> Just because IXGBE is the first and only user, doesn't mean

> the facility is driver specific.

> 


Understood, but ARCH_WANT_RELAX_ORDER is really too generic and simple;
it is misleading because it looks like hack code for some architecture.
What do you think of ETHERNET_ALLOW_RELAXED_ORDER under drivers/net/ethernet/*?
It would only affect Ethernet drivers, and not just ixgbe.

Thanks
Ding


> Thank you.

> 

> .

>
John Garry April 5, 2017, 1:05 p.m. UTC | #3
On 02/04/2017 07:49, Ding Tianhong wrote:
>

>

> On 2017/4/2 2:26, David Miller wrote:

>> From: Ding Tianhong <dingtianhong@huawei.com>

>> Date: Sat, 1 Apr 2017 15:25:51 +0800

>>

>>> Till now only the Intel ixgbe could support enable

>>> Relaxed ordering in the drivers for special architecture,

>>> but the ARCH_WANT_RELAX_ORDER is looks like a general name

>>> for all arch, so rename to a specific name for intel

>>> card looks more appropriate.

>>>

>>> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>

>>

>> This is not a driver specific facility.

>>

>> Any driver can test this symbol and act accordingly.

>>

>> Just because IXGBE is the first and only user, doesn't mean

>> the facility is driver specific.

>>

>

> Understand clearly,but the ARCH_WANT_RELAX_ORDER is really too generic and simple,

> cause much misleading to indicate that it looks like the hack code for some architecture.

> what do you think of the ETHERNET_ALLOW_RELAXED_ORDER in the drivers/net/ethernet/*,

> it will only affect ethernet and not only for Ixgbe.

>


Hi Ding,

I think the actual original config ARCH_WANT_RELAX_ORDER is quite 
dubious, and it does not really tell us which feature(s) of the 
architecture supports this (if indeed it is arch specific).

According to the original commit, 1a8b6d76dc5b net:add one common config 
ARCH_WANT_RELAX_ORDER to support relax ordering, this is specific to 
SPARC only:
"Currently it only supports one special cpu architecture(SPARC) in 82599 
driver to enable RO feature, this is not very common for other cpu 
architecture which really needs RO feature".

This sounds wooly.

So I think that we need to know which specific architecture, memory 
model, or PCI host/port/EP features, or combination of them, allows this 
so called relaxed ordering.

And a config option is probably not even the proper check.

John

> Thanks

> Ding

>

>

>> Thank you.

>>

>> .

>>

>

>

> .

>
Ding Tianhong April 6, 2017, 11:28 a.m. UTC | #4
On 2017/4/5 21:05, John Garry wrote:
> On 02/04/2017 07:49, Ding Tianhong wrote:

>>

>>

>> On 2017/4/2 2:26, David Miller wrote:

>>> From: Ding Tianhong <dingtianhong@huawei.com>

>>> Date: Sat, 1 Apr 2017 15:25:51 +0800

>>>

>>>> Till now only the Intel ixgbe could support enable

>>>> Relaxed ordering in the drivers for special architecture,

>>>> but the ARCH_WANT_RELAX_ORDER is looks like a general name

>>>> for all arch, so rename to a specific name for intel

>>>> card looks more appropriate.

>>>>

>>>> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>

>>>

>>> This is not a driver specific facility.

>>>

>>> Any driver can test this symbol and act accordingly.

>>>

>>> Just because IXGBE is the first and only user, doesn't mean

>>> the facility is driver specific.

>>>

>>

>> Understand clearly,but the ARCH_WANT_RELAX_ORDER is really too generic and simple,

>> cause much misleading to indicate that it looks like the hack code for some architecture.

>> what do you think of the ETHERNET_ALLOW_RELAXED_ORDER in the drivers/net/ethernet/*,

>> it will only affect ethernet and not only for Ixgbe.

>>

> 


Hi John:

> Hi Ding,

> 

> I think the actual original config ARCH_WANT_RELAX_ORDER is quite dubious, and it does not really tell us which feature(s) of the architecture supports this (if indeed it is arch specific).

> 


Agree.

> According to the original commit, 1a8b6d76dc5b net:add one common config ARCH_WANT_RELAX_ORDER to support relax ordering, this is specific to SPARC only:

> "Currently it only supports one special cpu architecture(SPARC) in 82599 driver to enable RO feature, this is not very common for other cpu architecture which really needs RO feature".

> 


Relaxed Ordering is a general setting, the counterpart to SO for a PCIe controller. If the driver supports it, the architecture can choose to enable it; of course, the feature is not supported on
every arch.

> This sounds wooly.

> 

> So I think that we need to know which specific architecture, memory model, or PCI host/port/EP features, or combination of them, allows this so called relaxed ordering.

> 


This depends on the PCIe design in the chip. I cannot know whether other architectures have issues when RO is enabled; if the chip fully supports the PCIe 3.0 standard and has no defects, it should support both RO and SO.

Thanks
Ding
> And a config option is probably not even the proper check.

> 

> John






> 

>> Thanks

>> Ding

>>

>>

>>> Thank you.

>>>

>>> .

>>>

>>

>>

>> .

>>

> 

> 

> 

> .

>
Gabriele Paoloni April 13, 2017, 9:10 a.m. UTC | #5
Hi David

> -----Original Message-----

> Subject: Re: [PATCH net-next 1/4] ixgbe: sparc: rename the

> ARCH_WANT_RELAX_ORDER to IXGBE_ALLOW_RELAXED_ORDER

> Date: Sat, 1 Apr 2017 11:26:03 -0700

> From: David Miller <davem@davemloft.net>

> To: dingtianhong@huawei.com

> CC: catalin.marinas@arm.com, will.deacon@arm.com, mark.rutland@arm.com,

> robin.murphy@arm.com, jeffrey.t.kirsher@intel.com,

> alexander.duyck@gmail.com, linux-arm-kernel@lists.infradead.org,

> netdev@vger.kernel.org

> 

> From: Ding Tianhong <dingtianhong@huawei.com>

> Date: Sat, 1 Apr 2017 15:25:51 +0800

> 

> > Till now only the Intel ixgbe could support enable

> > Relaxed ordering in the drivers for special architecture,

> > but the ARCH_WANT_RELAX_ORDER is looks like a general name

> > for all arch, so rename to a specific name for intel

> > card looks more appropriate.

> >

> > Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>

> 

> This is not a driver specific facility.

> 

> Any driver can test this symbol and act accordingly.

> 

> Just because IXGBE is the first and only user, doesn't mean

> the facility is driver specific.



Please correct me if I am wrong but my understanding is that the standard
way to enable/disable relaxed ordering is to set/clear bit 4 of the Device
Control Register (PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed
ordering */).
I have now looked at all the drivers that either enable or disable relaxed
ordering, and none of them seems to need a symbol to decide whether to
enable it or not.
Also, it seems to me that enabling/disabling relaxed ordering is never bound to the
host architecture.

So why should this be (or be expected to be) a generic symbol?
Wouldn't it be more correct to have this as a driver specific symbol now and
move it to a generic one later once we have other drivers requiring it?
  
Many thanks
Gab
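
The set/clear operation described above can be sketched on a plain 16-bit value. This is a user-space illustration only: the bit value 0x0010 is the one quoted in the message, while the two helper names are hypothetical.

```c
#include <stdint.h>

/* Value quoted in the thread: bit 4 of the PCIe Device Control register. */
#define PCI_EXP_DEVCTL_RELAX_EN 0x0010

/* Return the register value with Enable Relaxed Ordering set. */
uint16_t devctl_enable_ro(uint16_t devctl)
{
	return (uint16_t)(devctl | PCI_EXP_DEVCTL_RELAX_EN);
}

/* Return the register value with Enable Relaxed Ordering cleared. */
uint16_t devctl_disable_ro(uint16_t devctl)
{
	return (uint16_t)(devctl & ~PCI_EXP_DEVCTL_RELAX_EN);
}
```

Inside the kernel, a driver would normally not touch a raw value like this; the read-modify-write is typically done with pcie_capability_set_word() or pcie_capability_clear_word() on PCI_EXP_DEVCTL.
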

> 

> Thank you.

> 

> .

>
David Miller April 13, 2017, 2:53 p.m. UTC | #6
From: Gabriele Paoloni <gabriele.paoloni@huawei.com>

Date: Thu, 13 Apr 2017 09:10:32 +0000

> Wouldn't it be more correct to have this as a driver specific symbol

> now and move it to a generic one later once we have other drivers

> requiring it?


No, it would not.
David Laight April 18, 2017, 1:25 p.m. UTC | #7
From: Gabriele Paoloni

> Sent: 13 April 2017 10:11

> > > Till now only the Intel ixgbe could support enable

> > > Relaxed ordering in the drivers for special architecture,

> > > but the ARCH_WANT_RELAX_ORDER is looks like a general name

> > > for all arch, so rename to a specific name for intel

> > > card looks more appropriate.

> > >

> > > Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>

> >

> > This is not a driver specific facility.

> >

> > Any driver can test this symbol and act accordingly.

> >

> > Just because IXGBE is the first and only user, doesn't mean

> > the facility is driver specific.

> 

> 

> Please correct me if I am wrong but my understanding is that the standard

> way to enable/disable relaxed ordering is to set/clear bit 4 of the Device

> Control Register (PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed

> ordering */).

> Now I have looked up for all drivers either enabling or disabling relaxed

> ordering and none of them seems to need a symbol to decide whether to

> enable it or not.

> Also it seems to me enabling/disabling relaxed ordering is never bound to the

> host architecture.

> 

> So why this should be (or it is expected to be) a generic symbol?

> Wouldn't it be more correct to have this as a driver specific symbol now and

> move it to a generic one later once we have other drivers requiring it?


'Relaxed ordering' is a bit in the TLP header.
A device (like the ixgbe hardware) can set it for some transactions and
still have the transactions 'ordered enough' for the driver to work.
(If all transactions have 'relaxed ordering' set then I suspect it is
almost impossible to write a working driver.)
The host side could (probably does) have a bit to enable 'relaxed ordering',
it could also enforce stronger ordering than required by the PCIe spec
(eg keeping reads in order).

Clearly, on some sparc64 systems, ixgbe needs to use 'relaxed ordering'.
To me this requires two separate bits be enabled:
1) to the ixgbe driver to generate the 'relaxed' TLP.
2) to the host to actually act on them.
If the ixgbe driver works when both are enabled why have options to
disable either (except for bug-finding)?

	David
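
The "two separate bits" view above can be modelled as a simple conjunction. The struct and field names below are hypothetical and only mirror the two points in the message; they do not correspond to real registers.

```c
#include <stdbool.h>

/* Hypothetical model of the two independent enables described above. */
struct ro_config {
	bool device_may_send_ro;	/* 1) device generates 'relaxed' TLPs */
	bool host_acts_on_ro;		/* 2) host actually acts on them */
};

/* Relaxed ordering only has an effect when both sides are enabled. */
bool ro_takes_effect(const struct ro_config *cfg)
{
	return cfg->device_may_send_ro && cfg->host_acts_on_ro;
}
```

Under this model, disabling either side is enough to fall back to ordinary ordering, which is why a switch on just one side can still be useful for bug-finding.
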
Gabriele Paoloni April 19, 2017, 2:28 p.m. UTC | #8
Hi David

Many thanks for your reply

> -----Original Message-----

> From: David Laight [mailto:David.Laight@ACULAB.COM]

> Sent: 18 April 2017 14:26

> To: Gabriele Paoloni; davem@davemloft.net

> Cc: Catalin Marinas; Will Deacon; Mark Rutland; Robin Murphy;

> jeffrey.t.kirsher@intel.com; alexander.duyck@gmail.com; linux-arm-

> kernel@lists.infradead.org; netdev@vger.kernel.org; Dingtianhong;

> Linuxarm

> Subject: RE: Re: [PATCH net-next 1/4] ixgbe: sparc: rename the

> ARCH_WANT_RELAX_ORDER to IXGBE_ALLOW_RELAXED_ORDER

> 

> From: Gabriele Paoloni

> > Sent: 13 April 2017 10:11

> > > > Till now only the Intel ixgbe could support enable

> > > > Relaxed ordering in the drivers for special architecture,

> > > > but the ARCH_WANT_RELAX_ORDER is looks like a general name

> > > > for all arch, so rename to a specific name for intel

> > > > card looks more appropriate.

> > > >

> > > > Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>

> > >

> > > This is not a driver specific facility.

> > >

> > > Any driver can test this symbol and act accordingly.

> > >

> > > Just because IXGBE is the first and only user, doesn't mean

> > > the facility is driver specific.

> >

> >

> > Please correct me if I am wrong but my understanding is that the

> standard

> > way to enable/disable relaxed ordering is to set/clear bit 4 of the

> Device

> > Control Register (PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed

> > ordering */).

> > Now I have looked up for all drivers either enabling or disabling

> relaxed

> > ordering and none of them seems to need a symbol to decide whether to

> > enable it or not.

> > Also it seems to me enabling/disabling relaxed ordering is never

> bound to the

> > host architecture.

> >

> > So why this should be (or it is expected to be) a generic symbol?

> > Wouldn't it be more correct to have this as a driver specific symbol

> now and

> > move it to a generic one later once we have other drivers requiring

> it?

> 

> 'Relaxed ordering' is a bit in the TLP header.

> A device (like the ixgbe hardware) can set it for some transactions and

> still have the transactions 'ordered enough' for the driver to work.

> (If all transactions have 'relaxed ordering' set then I suspect it is

> almost impossible to write a working driver.)

> The host side could (probably does) have a bit to enable 'relaxed

> ordering',

> it could also enforce stronger ordering than required by the PCIe spec

> (eg keeping reads in order).


My understanding is that from the host side the host is always allowed
(as long as it complies with the rules specified in sec.2.4.1 of the PCIe
Specs) to set the RO attribute in the TLP and the target function should
be able to cope with it.

On the device side the device is allowed to set the RO attribute in the
TLP only if bit4 of the "Device Control Register" is set.

> 

> Clearly, on some sparc64 systems, ixgbe needs to use 'relaxed

> ordering'.

> To me this requires two separate bits be enabled:

> 1) to the ixgbe driver to generate the 'relaxed' TLP.

> 2) to the host to actually act on them.


My understanding is that for performance reasons when possible we
should enable relaxed ordering and I think this is up to the host
(i.e. the host somehow should know when it is capable of handling
RO TLPs and therefore it will try to enable it on the driver).

> If the ixgbe driver works when both are enabled why have options to

> disable either (except for bug-finding)?


I think that by default the ixgbe driver disables RO since there are
issues with "some chipsets" according to commit 3d5c520727ce "ixgbe:
move disabling of relaxed ordering in start_hw()".
What this means is a bit obscure to me and does not seem to be related to
the host architecture.

Also looking at where and why the other drivers set/clear the "Enable
Relaxed Ordering" bit it seems that currently this is not tied to the
host architecture nor to any global symbol; instead it seems purely
dependent on the PCIe device chipset itself.

> 

> 	David
Gabriele Paoloni April 19, 2017, 2:46 p.m. UTC | #9
Hi Amir
 
> From: Amir Ancel [mailto:amira@mellanox.com]

> Sent: 18 April 2017 21:18

> To: David Laight; Gabriele Paoloni; davem@davemloft.net

> Cc: Catalin Marinas; Will Deacon; Mark Rutland; Robin Murphy;

> jeffrey.t.kirsher@intel.com; alexander.duyck@gmail.com; linux-arm-

> kernel@lists.infradead.org; netdev@vger.kernel.org; Dingtianhong;

> Linuxarm

> Subject: Re: Re: [PATCH net-next 1/4] ixgbe: sparc: rename the

> ARCH_WANT_RELAX_ORDER to IXGBE_ALLOW_RELAXED_ORDER

> 

> Hi,

> mlx5 driver is planned to have RO support this year.

> I believe drivers should be able to query whether the arch support it


I guess that here when you say query you mean having a config symbol
that is set according to the host architecture, right?

As already said I have looked around a bit and other drivers do not seem
to enable/disable RO for their EP on the basis of the host architecture.
So why should mlx5 do it according to the host?

Also my understanding is that some architectures (like ARM64 for example)
can have different PCI host controller implementations depending on the
vendor...therefore maybe it is not appropriate there to have a Kconfig
symbol selected by the architecture...

Thanks
Gab

> or not and enable it in the network adapter accordingly.

> 

> -Amir

> ________________________________________

> From: netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org> on

> behalf of David Laight <David.Laight@ACULAB.COM>

> Sent: Tuesday, April 18, 2017 4:25:44 PM

> To: 'Gabriele Paoloni'; davem@davemloft.net

> Cc: Catalin Marinas; Will Deacon; Mark Rutland; Robin Murphy;

> jeffrey.t.kirsher@intel.com; alexander.duyck@gmail.com; linux-arm-

> kernel@lists.infradead.org; netdev@vger.kernel.org; Dingtianhong;

> Linuxarm

> Subject: RE: Re: [PATCH net-next 1/4] ixgbe: sparc: rename the

> ARCH_WANT_RELAX_ORDER to IXGBE_ALLOW_RELAXED_ORDER

> 

> From: Gabriele Paoloni

> > Sent: 13 April 2017 10:11

> > > > Till now only the Intel ixgbe could support enable

> > > > Relaxed ordering in the drivers for special architecture,

> > > > but the ARCH_WANT_RELAX_ORDER is looks like a general name

> > > > for all arch, so rename to a specific name for intel

> > > > card looks more appropriate.

> > > >

> > > > Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>

> > >

> > > This is not a driver specific facility.

> > >

> > > Any driver can test this symbol and act accordingly.

> > >

> > > Just because IXGBE is the first and only user, doesn't mean

> > > the facility is driver specific.

> >

> >

> > Please correct me if I am wrong but my understanding is that the

> standard

> > way to enable/disable relaxed ordering is to set/clear bit 4 of the

> Device

> > Control Register (PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed

> > ordering */).

> > Now I have looked up for all drivers either enabling or disabling

> relaxed

> > ordering and none of them seems to need a symbol to decide whether to

> > enable it or not.

> > Also it seems to me enabling/disabling relaxed ordering is never

> bound to the

> > host architecture.

> >

> > So why this should be (or it is expected to be) a generic symbol?

> > Wouldn't it be more correct to have this as a driver specific symbol

> now and

> > move it to a generic one later once we have other drivers requiring

> it?

> 

> 'Relaxed ordering' is a bit in the TLP header.

> A device (like the ixgbe hardware) can set it for some transactions and

> still have the transactions 'ordered enough' for the driver to work.

> (If all transactions have 'relaxed ordering' set then I suspect it is

> almost impossible to write a working driver.)

> The host side could (probably does) have a bit to enable 'relaxed

> ordering',

> it could also enforce stronger ordering than required by the PCIe spec

> (eg keeping reads in order).

> 

> Clearly, on some sparc64 systems, ixgbe needs to use 'relaxed

> ordering'.

> To me this requires two separate bits be enabled:

> 1) to the ixgbe driver to generate the 'relaxed' TLP.

> 2) to the host to actually act on them.

> If the ixgbe driver works when both are enabled why have options to

> disable either (except for bug-finding)?

> 

>         David
Will Deacon April 24, 2017, 2:53 p.m. UTC | #10
On Wed, Apr 19, 2017 at 02:46:19PM +0000, Gabriele Paoloni wrote:
> > From: Amir Ancel [mailto:amira@mellanox.com]

> > Sent: 18 April 2017 21:18

> > To: David Laight; Gabriele Paoloni; davem@davemloft.net

> > Cc: Catalin Marinas; Will Deacon; Mark Rutland; Robin Murphy;

> > jeffrey.t.kirsher@intel.com; alexander.duyck@gmail.com; linux-arm-

> > kernel@lists.infradead.org; netdev@vger.kernel.org; Dingtianhong;

> > Linuxarm

> > Subject: Re: Re: [PATCH net-next 1/4] ixgbe: sparc: rename the

> > ARCH_WANT_RELAX_ORDER to IXGBE_ALLOW_RELAXED_ORDER

> > 

> > Hi,

> > mlx5 driver is planned to have RO support this year.

> > I believe drivers should be able to query whether the arch support it

> 

> I guess that here when you say query you mean having a config symbol

> that is set accordingly to the host architecture, right?

> 

> As already said I have looked around a bit and other drivers do not seem

> to enable/disable RO for their EP on the basis of the host architecture.

> So why should mlx5 do it according to the host?

> 

> Also my understating is that some architectures (like ARM64 for example)

> can have different PCI host controller implementations depending on the

> vendor...therefore maybe it is not appropriate there to have a Kconfig

> symbol selected by the architecture...  


Indeed. We're not able to determine whether or not RO is supported at
compile time, so we'd have to detect this dynamically if we want to support
it for arm64 with a single kernel Image. That means either passing something
through firmware, having the PCI host controller opt-in or something coarse
like a command-line option.

Will
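
To make the run-time alternative concrete, here is a small sketch that combines the three sources mentioned above. Everything in it is an assumption for illustration: the type and field names are hypothetical, and the precedence (command line first, then firmware, then host controller opt-in) is one possible policy, not something proposed in the thread.

```c
#include <stdbool.h>

enum ro_hint { RO_UNSET, RO_OFF, RO_ON };

/* Hypothetical run-time inputs; none of these exist under these names. */
struct ro_sources {
	enum ro_hint cmdline;		/* coarse command-line override */
	enum ro_hint firmware;		/* hint passed through firmware */
	bool host_bridge_opt_in;	/* PCI host controller opted in */
};

/* Decide at run time, so a single kernel image can serve every platform. */
bool ro_allowed_at_runtime(const struct ro_sources *s)
{
	if (s->cmdline != RO_UNSET)
		return s->cmdline == RO_ON;
	if (s->firmware != RO_UNSET)
		return s->firmware == RO_ON;
	return s->host_bridge_opt_in;
}
```

The point of the sketch is only that the decision moves from the preprocessor to data available at boot, which a Kconfig symbol selected by the architecture cannot express.
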
Ding Tianhong April 26, 2017, 9:26 a.m. UTC | #11
Hi Amir:

I am really glad to hear that mlx5 will support RO mode this year. If so, would you agree to enabling it dynamically via ethtool -s xxx?
We tried that several months ago, but at the time only one driver would have used it, so the maintainer was against it. If mlx5 is going to support RO,
we could try to revive this solution. What do you think? :)

Thanks
Ding

On 2017/4/19 4:17, Amir Ancel wrote:
> Hi,

> 

> mlx5 driver is planned to have RO support this year.

> 

> I believe drivers should be able to query whether the arch support it or not and enable it in the network adapter accordingly.

> 

>  

> 

> -Amir

> 

> ----------------------------------------

> *From:* netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org> on behalf of David Laight <David.Laight@ACULAB.COM>

> *Sent:* Tuesday, April 18, 2017 4:25:44 PM

> *To:* 'Gabriele Paoloni'; davem@davemloft.net

> *Cc:* Catalin Marinas; Will Deacon; Mark Rutland; Robin Murphy; jeffrey.t.kirsher@intel.com; alexander.duyck@gmail.com; linux-arm-kernel@lists.infradead.org; netdev@vger.kernel.org; Dingtianhong; Linuxarm

> *Subject:* RE: Re: [PATCH net-next 1/4] ixgbe: sparc: rename the ARCH_WANT_RELAX_ORDER to IXGBE_ALLOW_RELAXED_ORDER

>  

> From: Gabriele Paoloni

>> Sent: 13 April 2017 10:11

>> > > Till now only the Intel ixgbe could support enable

>> > > Relaxed ordering in the drivers for special architecture,

>> > > but the ARCH_WANT_RELAX_ORDER is looks like a general name

>> > > for all arch, so rename to a specific name for intel

>> > > card looks more appropriate.

>> > >

>> > > Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>

>> >

>> > This is not a driver specific facility.

>> >

>> > Any driver can test this symbol and act accordingly.

>> >

>> > Just because IXGBE is the first and only user, doesn't mean

>> > the facility is driver specific.

>> 

>> 

>> Please correct me if I am wrong but my understanding is that the standard

>> way to enable/disable relaxed ordering is to set/clear bit 4 of the Device

>> Control Register (PCI_EXP_DEVCTL_RELAX_EN 0x0010 /* Enable relaxed

>> ordering */).

>> Now I have looked up for all drivers either enabling or disabling relaxed

>> ordering and none of them seems to need a symbol to decide whether to

>> enable it or not.

>> Also it seems to me enabling/disabling relaxed ordering is never bound to the

>> host architecture.

>> 

>> So why this should be (or it is expected to be) a generic symbol?

>> Wouldn't it be more correct to have this as a driver specific symbol now and

>> move it to a generic one later once we have other drivers requiring it?

> 

> 'Relaxed ordering' is a bit in the TLP header.

> A device (like the ixgbe hardware) can set it for some transactions and

> still have the transactions 'ordered enough' for the driver to work.

> (If all transactions have 'relaxed ordering' set then I suspect it is

> almost impossible to write a working driver.)

> The host side could (probably does) have a bit to enable 'relaxed ordering',

> it could also enforce stronger ordering than required by the PCIe spec

> (eg keeping reads in order).

> 

> Clearly, on some sparc64 systems, ixgbe needs to use 'relaxed ordering'.

> To me this requires two separate bits be enabled:

> 1) to the ixgbe driver to generate the 'relaxed' TLP.

> 2) to the host to actually act on them.

> If the ixgbe driver works when both are enabled why have options to

> disable either (except for bug-finding)?

> 

>         David

>
Alexander Duyck April 26, 2017, 4:18 p.m. UTC | #12
On Wed, Apr 26, 2017 at 2:26 AM, Ding Tianhong <dingtianhong@huawei.com> wrote:
> Hi Amir:

>

> It is really glad to hear that the mlx5 will support RO mode this year, if so, do you agree that enable it dynamic by ethtool -s xxx,

> we have try it several month ago but there was only one drivers would use it at that time so the maintainer against it, it mlx5 would support RO,

> we could try to restart this solution, what do you think about it. :)

>

> Thanks

> Ding


Hi Ding,

Enabling relaxed ordering really doesn't have any place in ethtool. It
is a PCIe attribute that you are essentially wanting to enable.

It might be worthwhile to take a look at updating the PCIe code path
to handle this. Really what we should probably do is guarantee that
the architectures that need relaxed ordering are setting it in the
PCIe Device Control register and that the ones that don't are clearing
the bit. It's possible that this is already occurring, but I don't
know what the state of handling those bits is in the kernel. Once we can
guarantee that, we could use it to have the drivers determine their
behavior in regards to relaxed ordering. For example in the case of
igb/ixgbe we could probably change the behavior so that it will by
default try to use relaxed ordering but if it is not enabled in PCIe
Device Control register the hardware should not request to use it. It
would simplify things in the drivers and allow for each architecture
to control things as needed in their PCIe code.

- Alex
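
The behaviour suggested above, where a driver defaults to relaxed ordering but defers to the Device Control register, might look roughly like this. It is a user-space sketch on a raw register value; the function name and the driver_prefers_ro parameter are hypothetical, and the bit value is the one quoted earlier in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Enable Relaxed Ordering bit of the PCIe Device Control register. */
#define PCI_EXP_DEVCTL_RELAX_EN 0x0010

/*
 * Hypothetical driver-side check: use relaxed ordering only when the
 * driver wants it and the arch/PCIe code left the enable bit set.
 */
bool driver_should_use_ro(uint16_t devctl, bool driver_prefers_ro)
{
	bool enabled_in_devctl = (devctl & PCI_EXP_DEVCTL_RELAX_EN) != 0;

	return driver_prefers_ro && enabled_in_devctl;
}
```

With this split, the architecture's PCIe code owns the register bit and the driver only reads it, so no per-driver Kconfig symbol is needed.
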
Bjorn Helgaas April 27, 2017, 5:19 p.m. UTC | #13
[+cc Casey]

On Wed, Apr 26, 2017 at 09:18:33AM -0700, Alexander Duyck wrote:
> On Wed, Apr 26, 2017 at 2:26 AM, Ding Tianhong <dingtianhong@huawei.com> wrote:

> > Hi Amir:

> >

> > It is really glad to hear that the mlx5 will support RO mode this year, if so, do you agree that enable it dynamic by ethtool -s xxx,

> > we have try it several month ago but there was only one drivers would use it at that time so the maintainer against it, it mlx5 would support RO,

> > we could try to restart this solution, what do you think about it. :)

> >

> > Thanks

> > Ding

> 

> Hi Ding,

> 

> Enabing relaxed ordering really doesn't have any place in ethtool. It

> is a PCIe attribute that you are essentially wanting to enable.

> 

> It might be worth while to take a look at updating the PCIe code path

> to handle this. Really what we should probably do is guarantee that

> the architectures that need relaxed ordering are setting it in the

> PCIe Device Control register and that the ones that don't are clearing

> the bit. It's possible that this is already occurring, but I don't

> know the state of handling those bits is in the kernel. Once we can

> guarantee that we could use that to have the drivers determine their

> behavior in regards to relaxed ordering. For example in the case of

> igb/ixgbe we could probably change the behavior so that it will bey

> default try to use relaxed ordering but if it is not enabled in PCIe

> Device Control register the hardware should not request to use it. It

> would simplify things in the drivers and allow for each architecture

> to control things as needed in their PCIe code.


I thought Relaxed Ordering was an optimization.  Are there cases where
it is actually required for correct behavior?

The PCI core doesn't currently do anything with Relaxed Ordering.
Some drivers enable/disable it directly.  I think it would probably be
better if the core provided an interface for this.  One reason is
because I think Casey has identified some systems where Relaxed
Ordering doesn't work correctly, and I'd rather deal with them once in
the core than in every driver.

Are you hinting that the PCI core or arch code could actually *enable*
Relaxed Ordering without the driver doing anything?  Is it safe to do
that?  Is there such a thing as a device that is capable of using RO,
but where the driver must be aware of it being enabled, so it programs
the device appropriately?

Bjorn
Casey Leedom April 27, 2017, 7 p.m. UTC | #14
Thanks for adding me to the Cc list Bjorn.  Hopefully my message will make
it out to the netdev and linux-pci lists -- I'm not currently subscribed to
them.  I've explicitly marked this message to be sent in plain text but
modern email agents suck with respect to this. (sigh) I have officially
become a curmudgeon. 

  So, officially, Relaxed Ordering should be a Semantic Noop as far as PCIe
transfers are concerned, as long as you don't care what order the PCIe
Transaction Layer Packets are processed in by the target PCIe Fabric End
Point.

  Basically, if you have some number of back-to-back PCIe TLPs between two
Fabric End Points {A} -> {B} which have the Relaxed Ordering Attribute set,
the End Point {B} receiving these RO TLPs may process them in any order it
likes.  When a TLP without Relaxed Ordering is sent {A} -> {B}, all
preceding TLPs with Relaxed Ordering set must be processed by {B} prior to
processing the TLP without Relaxed Ordering set.  In this sense, a TLP
without Relaxed Ordering set is something akin to a "memory barrier".
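
The ordering rule described above can be sketched as a small checker. It models only the simplified rule as stated in this message (a TLP without RO may not be processed until every TLP sent before it has been processed, while RO TLPs may be processed in any order); it is not the full PCIe ordering table, and the function name and the MAX_TLPS limit are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TLPS 64	/* arbitrary limit for this sketch */

/*
 * ro[i]    : true if the i-th TLP sent from {A} has Relaxed Ordering set.
 * order[p] : send index of the TLP that {B} processes at step p.
 * Returns true if the processing order respects the barrier rule.
 */
bool processing_order_is_legal(const bool *ro, const size_t *order, size_t n)
{
	bool done[MAX_TLPS] = { false };

	if (n > MAX_TLPS)
		return false;

	for (size_t p = 0; p < n; p++) {
		size_t j = order[p];

		if (j >= n || done[j])
			return false;	/* not a valid permutation */

		if (!ro[j]) {
			/* non-RO TLP: everything sent earlier must be done */
			for (size_t i = 0; i < j; i++)
				if (!done[i])
					return false;
		}
		done[j] = true;
	}
	return true;
}
```

For example, with two RO TLPs followed by one non-RO TLP, processing them as 1, 0, 2 is accepted, while 2, 0, 1 is rejected because the non-RO TLP would overtake the earlier ones.
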

  All of this is covered in Section 2.4.1 of the PCIe 3.0 Specification (PCI
Express(r) Base Specification Revision 3.0 November 10, 2010).

  The advantage of using Relaxed Ordering (which is heavily used when
sending data to Graphics Cards as I understand it), is that the PCIe
Endpoint can potentially optimize the processing order of RO TLPs with
things like a local multi-channel Memory Controller in order to achieve the
highest transfer bandwidth possible.

  However, we have discovered at least two PCIe 3.0 Root Complex
implementations which have problems with TLPs directed at them with the
Relaxed Ordering Attribute set and I'm in the process of working up a Linux
Kernel PCI "Quirk" to allow those PCIe End Points to be marked as "not being
advisable to send RO TLPs to".  These problems range from "mere" Performance
Problems to outright Data Corruption.  I'm working with the vendors of these
...  "problematic" Root Complex implementations and hope to have this patch
submitted to the linux-pci list by tomorrow.

  By the way, it's important to note that just because, say, a Root Complex
has problems with RO TLPs directed at it, that doesn't mean that you want to
avoid all use of Relaxed Ordering within the PCIe Fabric.  For instance,
with the vendor whose Root Complex has a Performance Problem with RO TLPs
directed at it, it's perfectly reasonable -- and desired -- to use Relaxed
Ordering in Peer-to-Peer traffic.  Say for instance, with an NVMe <->
Ethernet application.
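
  A minimal sketch of that point, assuming a per-target flag of the kind such a quirk might set. The structure, the flag, and the function name are all hypothetical and do not correspond to any existing kernel interface; the only thing shown is that the decision is made per target rather than for the whole fabric.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of the node a TLP is addressed to. */
struct tlp_target {
	bool avoid_ro;	/* hypothetical quirk result: do not send RO TLPs here */
};

/*
 * Decide per target, not per fabric: only traffic aimed at a flagged
 * node loses the Relaxed Ordering Attribute.
 */
bool use_ro_for(const struct tlp_target *target, bool sender_wants_ro)
{
	assert(target != NULL);
	return sender_wants_ro && !target->avoid_ro;
}
```

  Under this model, traffic to a flagged Root Complex goes out without RO, while Peer-to-Peer traffic to an unflagged End Point keeps it.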

Casey
Casey Leedom April 27, 2017, 8:34 p.m. UTC | #15
| From: Bjorn Helgaas <helgaas@kernel.org>
| Sent: Thursday, April 27, 2017 10:19 AM
|
| Are you hinting that the PCI core or arch code could actually *enable*
| Relaxed Ordering without the driver doing anything?  Is it safe to do that?
| Is there such a thing as a device that is capable of using RO, but where the
| driver must be aware of it being enabled, so it programs the device
| appropriately?

  I forgot to reply to this portion of Bjorn's email.

  The PCI Configuration Space PCIe Capability Device Control[Enable Relaxed
Ordering] bit governs the _ability_ of the PCIe Device to send TLPs with the
Relaxed Ordering Attribute set.  It does not _cause_ RO to be set on TLPs.
Forcing RO onto every TLP would almost certainly cause Data Corruption Bugs
since you only want a subset of TLPs to have RO set.

  For instance, we typically use RO for Ingress Packet Data delivery but
non-RO for messages notifying the Host that an Ingress Packet has been
delivered.  This ensures that the "Ingress Packet Delivered" non-RO TLP is
processed _after_ any preceding RO TLPs delivering the actual Ingress Packet
Data.

  In the above scenario, if one were to turn off Enable Relaxed Ordering via
the PCIe Capability, then the on-chip PCIe engine would simply never send a
TLP with the Relaxed Ordering Attribute set, regardless of any other chip
programming.
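
  The relationship between the Device Control bit and the device's own choices can be sketched as below. This is a userspace model, not driver code: `struct endpoint`, `enum tlp_kind`, and both functions are hypothetical, and the data/notification policy simply mirrors the Ingress Packet example above. The mask value 0x0010 is the one the kernel's pci_regs.h uses for PCI_EXP_DEVCTL_RELAX_EN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as PCI_EXP_DEVCTL_RELAX_EN in the kernel's pci_regs.h. */
#define DEVCTL_RELAX_EN 0x0010

enum tlp_kind { TLP_PACKET_DATA, TLP_DELIVERY_NOTICE };

/* Hypothetical model of an End Point: only its Device Control word. */
struct endpoint {
	uint16_t devctl;
};

/* Device-specific policy: which TLPs the device would like to mark RO. */
static bool device_wants_ro(enum tlp_kind kind)
{
	return kind == TLP_PACKET_DATA;
}

/* RO is set only if Device Control permits it AND the device asks for it. */
bool tlp_carries_ro(const struct endpoint *ep, enum tlp_kind kind)
{
	assert(ep != NULL);
	return (ep->devctl & DEVCTL_RELAX_EN) != 0 && device_wants_ro(kind);
}
```

  Setting the bit never turns the notification TLP into an RO TLP, and clearing it strips RO from the data TLPs regardless of the device's policy.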

  And finally, just to be absolutely clear, using Relaxed Ordering isn't an
"Architecture Thing".  It's a PCIe Fabric End Point Thing.  Many End Points
simply ignore the Relaxed Ordering Attribute (except to reflect it back in
Response TLPs).  In this sense, Relaxed Ordering simply provides
potentially useful optimization information to the PCIe End Point.

Casey
Lucas Stach April 28, 2017, 8:51 a.m. UTC | #16
Am Donnerstag, den 27.04.2017, 12:19 -0500 schrieb Bjorn Helgaas:
> [+cc Casey]
>
> On Wed, Apr 26, 2017 at 09:18:33AM -0700, Alexander Duyck wrote:
> > On Wed, Apr 26, 2017 at 2:26 AM, Ding Tianhong <dingtianhong@huawei.com> wrote:
> > > Hi Amir:
> > >
> > > I am really glad to hear that mlx5 will support RO mode this year. If so,
> > > do you agree that it should be enabled dynamically via ethtool -s xxx?
> > > We tried that several months ago, but only one driver would have used it
> > > at the time, so the maintainer was against it. If mlx5 supports RO, we
> > > could try to restart this solution. What do you think? :)
> > >
> > > Thanks
> > > Ding
> >
> > Hi Ding,
> >
> > Enabling relaxed ordering really doesn't have any place in ethtool. It
> > is a PCIe attribute that you are essentially wanting to enable.
> >
> > It might be worth while to take a look at updating the PCIe code path
> > to handle this. Really what we should probably do is guarantee that
> > the architectures that need relaxed ordering are setting it in the
> > PCIe Device Control register and that the ones that don't are clearing
> > the bit. It's possible that this is already occurring, but I don't
> > know what the state of handling those bits is in the kernel. Once we can
> > guarantee that we could use that to have the drivers determine their
> > behavior in regards to relaxed ordering. For example in the case of
> > igb/ixgbe we could probably change the behavior so that it will by
> > default try to use relaxed ordering but if it is not enabled in PCIe
> > Device Control register the hardware should not request to use it. It
> > would simplify things in the drivers and allow for each architecture
> > to control things as needed in their PCIe code.
>
> I thought Relaxed Ordering was an optimization.  Are there cases where
> it is actually required for correct behavior?


Yes, at least the Tegra 2 TRM claims that RO needs to be enabled on the
device side for correct operation with the following language:

"Tegra 2 requires relaxed ordering for responses to downstream requests
(responses can pass writes). It is possible in some circumstances for
PCIe transfers from an external bus masters (i.e. upstream transfers) to
become blocked by a downstream read or non-posted write. The responses
to these downstream requests are blocked by upstream posted writes only
when PCIe strict ordering is imposed. It is therefore necessary to never
impose strict ordering that would block a response to a downstream
NPW/read request and always set the relaxed ordering bit to 1. Only
devices that are capable of relaxed ordering may be used with Tegra 2
devices."

Regards,
Lucas
Gabriele Paoloni April 28, 2017, 9:12 a.m. UTC | #17
Hi Casey

Many thanks for the detailed explanation

> -----Original Message-----
> From: Casey Leedom [mailto:leedom@chelsio.com]
> Sent: 27 April 2017 21:35
> To: Bjorn Helgaas; Alexander Duyck
> Cc: Dingtianhong; Mark Rutland; Amir Ancel; Gabriele Paoloni;
> linux-pci@vger.kernel.org; Catalin Marinas; Will Deacon; Linuxarm;
> David Laight; jeffrey.t.kirsher@intel.com; netdev@vger.kernel.org;
> Robin Murphy; davem@davemloft.net; linux-arm-kernel@lists.infradead.org
> Subject: Re: [PATCH net-next 1/4] ixgbe: sparc: rename the
> ARCH_WANT_RELAX_ORDER to IXGBE_ALLOW_RELAXED_ORDER
>
> | From: Bjorn Helgaas <helgaas@kernel.org>
> | Sent: Thursday, April 27, 2017 10:19 AM
> |
> | Are you hinting that the PCI core or arch code could actually *enable*
> | Relaxed Ordering without the driver doing anything?  Is it safe to do that?
> | Is there such a thing as a device that is capable of using RO, but where the
> | driver must be aware of it being enabled, so it programs the device
> | appropriately?
>
>   I forgot to reply to this portion of Bjorn's email.
>
>   The PCI Configuration Space PCI Capability Device Control[Enable Relaxed
> Ordering] bit governs enabling the _ability_ for the PCIe Device to send
> TLPs with the Relaxed Ordering Attribute set.  It does not _cause_ RO to be
> set on TLPs.  Doing that would almost certainly cause Data Corruption Bugs
> since you only want a subset of TLPs to have RO set.
>
>   For instance, we typically use RO for Ingress Packet Data delivery but
> non-RO for messages notifying the Host that an Ingress Packet has been
> delivered.  This ensures that the "Ingress Packet Delivered" non-RO TLP is
> processed _after_ any preceding RO TLPs delivering the actual Ingress Packet
> Data.
>
>   In the above scenario, if one were to turn off Enable Relaxed Ordering via
> the PCIe Capability, then the on-chip PCIe engine would simply never send a
> TLP with the Relaxed Ordering Attribute set, regardless of any other chip
> programming.
>
>   And finally, just to be absolutely clear, using Relaxed Ordering isn't an
> "Architecture Thing".  It's a PCIe Fabric End Point Thing.  Many End Points
> simply ignore the Relaxed Ordering Attribute (except to reflect it back in
> Response TLPs).  In this sense, Relaxed Ordering simply provides
> potentially useful optimization information to the PCIe End Point.


I think your view matches what I found out about the current usage of the
"Enable Relaxed Ordering" bit in Linux mainline: looking at where and why
other drivers set or clear "Enable Relaxed Ordering", they neither check any
global symbol nor look at the host architecture.

So for the ixgbe driver specifically, I guess the main questions are why
Intel disabled RO by default for this EP (commit 3d5c520727ce mentions
issues with "some chipsets"), and why it is then safe to re-enable it on
SPARC?

Thanks
Gab

> 

> Casey
Casey Leedom April 28, 2017, 6:42 p.m. UTC | #18
| From: Lucas Stach <l.stach@pengutronix.de>
| Sent: Friday, April 28, 2017 1:51 AM
|     
| Am Donnerstag, den 27.04.2017, 12:19 -0500 schrieb Bjorn Helgaas:
| > 
| > 
| > I thought Relaxed Ordering was an optimization.  Are there cases where
| > it is actually required for correct behavior?
| 
| Yes, at least the Tegra 2 TRM claims that RO needs to be enabled on the
| device side for correct operation with the following language:
| 
| "Tegra 2 requires relaxed ordering for responses to downstream requests
| (responses can pass writes). It is possible in some circumstances for PCIe
| transfers from an external bus masters (i.e. upstream transfers) to become
| blocked by a downstream read or non-posted write. The responses to these
| downstream requests are blocked by upstream posted writes only when PCIe
| strict ordering is imposed. It is therefore necessary to never impose strict
| ordering that would block a response to a downstream NPW/read request and
| always set the relaxed ordering bit to 1. Only devices that are capable of
| relaxed ordering may be used with Tegra 2 devices."

  (woof) Reading through the above paragraph is difficult because the author
seems to shift language and terminology mid-sentence and isn't following
standard PCI terminology conventions.  The Root Complex is "Upstream", a
non-Root Complex Node in the PCIe Fabric is "Downstream", Requests that a
Downstream Device (End Point) sends to the Root Complex are called "Upstream
Requests", and responses that the Root Complex sends to a Device are called
"Downstream Responses" (or, even more pedantically, "Responses sent
Downstream for an earlier Upstream Request").

  Because a Root Complex is Upstream but sends its Requests Downstream,
and Downstream Devices send their Requests Upstream, it's very important
that we use exceedingly precise language.

  So, it ~sounds like~ the nVidia Tegra 2 document is talking about the need
for Downstream Devices to echo the Relaxed Ordering Attribute in their
Responses directed Upstream to Requests sent Downstream from the Root
Complex.  Moreover, there's code in drivers/pci/host/pci-tegra.c:
tegra_pcie_relax_enable() which appears to set the PCIe Capability Device
Control[Enable Relaxed Ordering] bit on all PCIe Fabric Nodes.

  If my reading of the intent of the nVidia document is correct -- and
that's a Big If because of the extremely imprecise language used -- that
means that the tegra_pcie_relax_enable() is completely bogus.  The PCIe 3.0
Specification states that Responses MUST reflect the Relaxed Ordering and No
Snoop Attributes of the Requests for which they are responding.  Section
2.2.9 of PCI Express(r) Base Specification Revision 3.0 November 10, 2010:
"Completion headers must supply the same values for the Attribute as were
supplied in the header of the corresponding Request, except as explicitly
allowed when IDO is used."

  And, specifically, the PCIe Capability Device Control[Enable Relaxed
Ordering] bit _only_ affects the ability of that Device to originate
Transaction Layer Packet Requests with the Relaxed Ordering Attribute set.
Thus, tegra_pcie_relax_enable() setting those bits on all the Downstream
Devices (and intervening Bridges) does not _cause_ those Devices to generate
Requests with Relaxed Ordering set.  And, if the Devices are PCIe 3.0
compliant, it also doesn't affect the Responses that they send back Upstream
to the Root Complex.
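
  The two rules in the preceding paragraphs can be put side by side in a short model. This is only a sketch of the reading given here: it ignores the IDO exception mentioned in the quoted spec text, and `struct tlp_attr` and `completion_attr()` are hypothetical names. The mask value 0x0010 matches PCI_EXP_DEVCTL_RELAX_EN in the kernel's pci_regs.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as PCI_EXP_DEVCTL_RELAX_EN in the kernel's pci_regs.h. */
#define DEVCTL_RELAX_EN 0x0010

/* The two header Attributes discussed above. */
struct tlp_attr {
	bool relaxed_ordering;
	bool no_snoop;
};

/*
 * Attributes of the Completion a device sends for a given Request.
 * They are copied from the Request; the completer's own Device Control
 * setting plays no part (IDO is not modelled).
 */
struct tlp_attr completion_attr(const struct tlp_attr *request,
				uint16_t completer_devctl)
{
	assert(request != NULL);
	(void)completer_devctl;	/* deliberately unused: it only gates Requests */
	return *request;
}
```

  In this model a Request sent Downstream with RO set gets an RO Completion even when the completer's Enable Relaxed Ordering bit is clear, and setting that bit does not add RO to the Completion of a non-RO Request.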

  I apologize for the incredibly detailed nature of these responses, but
it's very easy for people new to PCIe to get these things wrong and/or
misinterpret the PCIe Specifications.

Casey

Patch

diff --git a/arch/Kconfig b/arch/Kconfig
index cd211a1..bc0ab44 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -844,7 +844,7 @@  config STRICT_MODULE_RWX
 	  and non-text memory will be made non-executable. This provides
 	  protection against certain security exploits (e.g. writing to text)
 
-config ARCH_WANT_RELAX_ORDER
+config IXGBE_ALLOW_RELAXED_ORDER
 	bool
 
 source "kernel/gcov/Kconfig"
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 68ac5c7..f56bcf4 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -44,7 +44,7 @@  config SPARC
 	select CPU_NO_EFFICIENT_FFS
 	select HAVE_ARCH_HARDENED_USERCOPY
 	select PROVE_LOCKING_SMALL if PROVE_LOCKING
-	select ARCH_WANT_RELAX_ORDER
+	select IXGBE_ALLOW_RELAXED_ORDER
 
 config SPARC32
 	def_bool !64BIT
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
index c38d50c..563ea15 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
@@ -350,7 +350,7 @@  s32 ixgbe_start_hw_gen2(struct ixgbe_hw *hw)
 	}
 	IXGBE_WRITE_FLUSH(hw);
 
-#ifndef CONFIG_ARCH_WANT_RELAX_ORDER
+#ifndef CONFIG_IXGBE_ALLOW_RELAXED_ORDER
 	/* Disable relaxed ordering */
 	for (i = 0; i < hw->mac.max_tx_queues; i++) {
 		u32 regval;