[net-next,0/7] tc: Introduce a trap-and-forward action

Message ID 20210408133829.2135103-1-petrm@nvidia.com

Message

Petr Machata April 8, 2021, 1:38 p.m. UTC
The TC action "trap" is used to instruct the HW datapath to drop the
matched packet and transfer it to the host for processing in the SW
pipeline. If instead it is desirable to forward the packet in the HW
datapath, and to transfer a _copy_ to the SW pipeline, there is no
practical way to achieve that.

As a particular use case, the mlxsw driver could instruct a Spectrum
machine to mirror ECN-marked packets to the host. However, these packets
are still forwarded in the HW datapath, so describing this mirroring
through the "trap" action is incorrect. A new action is needed.

To that end, this patchset introduces a new generic action, trap_fwd. In
the software pipeline, it is equivalent to an OK. When offloading, it
should transfer a copy of the packet to the host, but unlike trap it
should not drop the packet from the HW datapath.
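
The intended difference between trap and trap_fwd can be sketched as a
small C model. This is only an illustration of the semantics described
above, not code from the patchset: the enum, the struct and the function
names are all hypothetical, and the values do not correspond to the
kernel's TC_ACT_* opcodes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical verdicts for illustration; not the kernel's TC_ACT_* values. */
enum verdict { V_OK, V_TRAP, V_TRAP_FWD };

/* What an offloading device does with a matched packet. */
struct hw_outcome {
	bool forwarded_in_hw;	/* packet continues through the HW datapath */
	bool sent_to_host;	/* packet (or a copy) reaches the SW pipeline */
};

/* Offloaded behavior as described in the cover letter. */
struct hw_outcome hw_apply(enum verdict v)
{
	struct hw_outcome out = { .forwarded_in_hw = true, .sent_to_host = false };

	switch (v) {
	case V_TRAP:
		/* trap: dropped in HW, transferred to the host instead */
		out.forwarded_in_hw = false;
		out.sent_to_host = true;
		break;
	case V_TRAP_FWD:
		/* trap_fwd: still forwarded in HW, a copy goes to the host */
		out.sent_to_host = true;
		break;
	case V_OK:
		break;
	}
	return out;
}

/* In the software pipeline, trap_fwd is treated like OK. */
enum verdict sw_equivalent(enum verdict v)
{
	return v == V_TRAP_FWD ? V_OK : v;
}

/* Small self-check of the model. */
void check_trap_semantics(void)
{
	assert(!hw_apply(V_TRAP).forwarded_in_hw);
	assert(hw_apply(V_TRAP_FWD).forwarded_in_hw);
	assert(hw_apply(V_TRAP_FWD).sent_to_host);
	assert(sw_equivalent(V_TRAP_FWD) == V_OK);
}
```

In this model the only difference between the two verdicts is the
forwarded_in_hw flag; both deliver the packet to the host.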

This patchset proceeds as follows:

- In patch #1, introduce the new action, and modify the TC code to
  recognize it as an OK.

- In patches #2 and #3, introduce the artifacts necessary for offloading
  the trap_fwd action, and a new trap so that drivers can report the
  trapped packets.

- Patches #4 and #5 offload trap_fwd in mlxsw.

- Patches #6 and #7 add selftests.

Petr Machata (7):
  net: sched: Add a trap-and-forward action
  net: sched: Make the action trap_fwd offloadable
  devlink: Add a new trap for the trap_fwd action
  mlxsw: Propagate extack to mlxsw_afa_block_commit()
  mlxsw: Offload trap_fwd
  selftests: forwarding: Add a test for TC trapping behavior
  selftests: mlxsw: Add a trap_fwd test to devlink_trap_control

 .../networking/devlink/devlink-trap.rst       |   4 +
 .../mellanox/mlxsw/core_acl_flex_actions.c    |  28 ++-
 .../mellanox/mlxsw/core_acl_flex_actions.h    |   3 +-
 .../net/ethernet/mellanox/mlxsw/spectrum.h    |   4 +-
 .../mellanox/mlxsw/spectrum1_acl_tcam.c       |   2 +-
 .../ethernet/mellanox/mlxsw/spectrum_acl.c    |  11 +-
 .../ethernet/mellanox/mlxsw/spectrum_flower.c |   9 +-
 .../mellanox/mlxsw/spectrum_mr_tcam.c         |   2 +-
 .../ethernet/mellanox/mlxsw/spectrum_trap.c   |   8 +
 drivers/net/ethernet/mellanox/mlxsw/trap.h    |   2 +
 include/net/devlink.h                         |   3 +
 include/net/flow_offload.h                    |   1 +
 include/net/tc_act/tc_gact.h                  |   5 +
 include/uapi/linux/pkt_cls.h                  |   6 +-
 net/core/dev.c                                |   2 +
 net/core/devlink.c                            |   1 +
 net/sched/act_bpf.c                           |  13 +-
 net/sched/cls_api.c                           |   2 +
 net/sched/cls_bpf.c                           |   1 +
 net/sched/sch_dsmark.c                        |   1 +
 tools/include/uapi/linux/pkt_cls.h            |   6 +-
 .../drivers/net/mlxsw/devlink_trap_control.sh |  23 ++-
 .../selftests/net/forwarding/tc_trap.sh       | 170 ++++++++++++++++++
 23 files changed, 288 insertions(+), 19 deletions(-)
 create mode 100755 tools/testing/selftests/net/forwarding/tc_trap.sh

Comments

Jakub Kicinski April 8, 2021, 9:25 p.m. UTC | #1
On Thu, 8 Apr 2021 10:05:07 -0400 Jamal Hadi Salim wrote:
> On 2021-04-08 9:38 a.m., Petr Machata wrote:
> > The TC action "trap" is used to instruct the HW datapath to drop the
> > matched packet and transfer it for processing in the SW pipeline. If
> > instead it is desirable to forward the packet and transfer a _copy_ to
> > the SW pipeline, there is no practical way to achieve that.
> > 
> > To that end add a new generic action, trap_fwd. In the software pipeline,
> > it is equivalent to an OK. When offloading, it should forward the packet to
> > the host, but unlike trap it should not drop the packet.
> 
> I am concerned about adding new opcodes which only make sense if you
> offload (or make sense only if you are running in s/w).
> 
> Those opcodes are intended to be generic abstractions so the dispatcher
> can decide what to do next. Adding things that are specific only
> to scenarios of hardware offload removes that opaqueness.
> I must have missed the discussion on ACT_TRAP because it is the
> same issue there i.e. shouldn't be an opcode. For details see:
> https://people.netfilter.org/pablo/netdev0.1/papers/Linux-Traffic-Control-Classifier-Action-Subsystem-Architecture.pdf
> 
> IMO:
> It seems to me there are two actions here encapsulated in one.
> The first is to "trap" and the second is to "drop".
> This is no different semantically than say "mirror and drop"
> offload being enunciated by "skip_sw".
> 
> Does the spectrum not support multiple actions?
> e.g. with a policy like:
>   match blah action trap action drop skip_sw

To make sure I understand - are you saying that trap should become 
more general and support both "and then drop" as well as "and then 
pass" semantics?

Seems like that ship has sailed, but also - how does it make it any
better WRT not having HW only opcodes? Or are you saying one is better
than two?
Petr Machata April 9, 2021, 11:03 a.m. UTC | #2
Jamal Hadi Salim <jhs@mojatatu.com> writes:

> I am concerned about adding new opcodes which only make sense if you
> offload (or make sense only if you are running in s/w).
>
> Those opcodes are intended to be generic abstractions so the dispatcher
> can decide what to do next. Adding things that are specific only
> to scenarios of hardware offload removes that opaqueness.
> I must have missed the discussion on ACT_TRAP because it is the
> same issue there i.e. shouldn't be an opcode. For details see:
> https://people.netfilter.org/pablo/netdev0.1/papers/Linux-Traffic-Control-Classifier-Action-Subsystem-Architecture.pdf

Trap has been in since 4.13, so 2017ish. It's done and dusted at this
point.

> IMO:
> It seems to me there are two actions here encapsulated in one.
> The first is to "trap" and the second is to "drop".
>
> This is no different semantically than say "mirror and drop"
> offload being enunciated by "skip_sw".
>
> Does the spectrum not support multiple actions?
> e.g. with a policy like:
>  match blah action trap action drop skip_sw

Trap drops implicitly. We need a "trap, but don't drop". Expressed in
terms of existing actions it would be "mirred egress redirect dev
$cpu_port". But how to express $cpu_port except again by a HW-specific
magic token I don't know.
Jamal Hadi Salim April 9, 2021, 11:13 a.m. UTC | #3
On 2021-04-08 5:25 p.m., Jakub Kicinski wrote:
> On Thu, 8 Apr 2021 10:05:07 -0400 Jamal Hadi Salim wrote:
>> On 2021-04-08 9:38 a.m., Petr Machata wrote:
>>> The TC action "trap" is used to instruct the HW datapath to drop the
>>> matched packet and transfer it for processing in the SW pipeline. If
>>> instead it is desirable to forward the packet and transfer a _copy_ to
>>> the SW pipeline, there is no practical way to achieve that.
>>>
>>> To that end add a new generic action, trap_fwd. In the software pipeline,
>>> it is equivalent to an OK. When offloading, it should forward the packet to
>>> the host, but unlike trap it should not drop the packet.
>>
>> I am concerned about adding new opcodes which only make sense if you
>> offload (or make sense only if you are running in s/w).
>>
>> Those opcodes are intended to be generic abstractions so the dispatcher
>> can decide what to do next. Adding things that are specific only
>> to scenarios of hardware offload removes that opaqueness.
>> I must have missed the discussion on ACT_TRAP because it is the
>> same issue there i.e. shouldn't be an opcode. For details see:
>> https://people.netfilter.org/pablo/netdev0.1/papers/Linux-Traffic-Control-Classifier-Action-Subsystem-Architecture.pdf
>>
>> IMO:
>> It seems to me there are two actions here encapsulated in one.
>> The first is to "trap" and the second is to "drop".
>> This is no different semantically than say "mirror and drop"
>> offload being enunciated by "skip_sw".
>>
>> Does the spectrum not support multiple actions?
>> e.g. with a policy like:
>>    match blah action trap action drop skip_sw
>
> To make sure I understand - are you saying that trap should become
> more general and support both "and then drop" as well as "and then
> pass" semantics?

No.
The main issue is the pollution of the opcodes - whether it is
one or multiple actions is less of a concern.
Those opcodes are intended to be for the core action dispatcher's
consumption. See figure 6 and table 1 of the document i referred to.

Basically:
You don't write an action and then add an opcode for it, even if it is
hardware offloaded (otherwise that opcode space would have grown a lot
more by now for all those actions that are offloaded).
Trap, for example, could have been a dummy action that just returns
the STOLEN/DROP/PASS opcode and does nothing else.
Typically we expect things that are offloaded to have a software
equivalent. It makes for consistent, clean control.

> Seems like that ship has sailed, but also - how does it make it any
> better WRT not having HW only opcodes? Or are you saying one is better
> than two?

The opcodes are not tied to whether an action is offloaded or not.
That role belongs to the "skip_sw" axis - which works well
today since we don't offload actions on their own without some
filter rule which specifies the offload option.

I will barf if someone implements 3 actions: "trap", "trap and forward",
"trap and drop" - but that is not messing up the core architecture,
so the barfing is more due to the bad taste of that approach.
A cleaner approach is to code one and change the return code for those
3 to "STOLEN", "PIPE", and "DROP".

cheers,
jamal
Jamal Hadi Salim April 9, 2021, 11:44 a.m. UTC | #4
On 2021-04-09 7:03 a.m., Petr Machata wrote:
> Jamal Hadi Salim <jhs@mojatatu.com> writes:
>
>> I am concerned about adding new opcodes which only make sense if you
>> offload (or make sense only if you are running in s/w).
>>
>> Those opcodes are intended to be generic abstractions so the dispatcher
>> can decide what to do next. Adding things that are specific only
>> to scenarios of hardware offload removes that opaqueness.
>> I must have missed the discussion on ACT_TRAP because it is the
>> same issue there i.e. shouldn't be an opcode. For details see:
>> https://people.netfilter.org/pablo/netdev0.1/papers/Linux-Traffic-Control-Classifier-Action-Subsystem-Architecture.pdf
>
> Trap has been in since 4.13, so 2017ish. It's done and dusted at this
> point.

I am afraid that is not a good argument. With all due respect,
here's how it translates:
"We already made a mistake, therefore, it's ok to build on it and
make more mistakes". Touching those opcodes is really dirty; at
least i have seen no convincing argument _at all_ for it. And,
it is not too late not to make more mistakes.
I don't remember, I may have spoken against TRAP; what i know is had
i seen the patch i would have said something - maybe i did and should
have been louder. Mea culpa.

>> IMO:
>> It seems to me there are two actions here encapsulated in one.
>> The first is to "trap" and the second is to "drop".
>>
>> This is no different semantically than say "mirror and drop"
>> offload being enunciated by "skip_sw".
>>
>> Does the spectrum not support multiple actions?
>> e.g. with a policy like:
>>   match blah action trap action drop skip_sw
>
> Trap drops implicitly. We need a "trap, but don't drop". Expressed in
> terms of existing actions it would be "mirred egress redirect dev
> $cpu_port". But how to express $cpu_port except again by a HW-specific
> magic token I don't know.

Note: mirred was originally intended to send redirect/mirror
packets to user space (the comment is still there in the code).
In fact there is a patch lying around somewhere that does that with
packet sockets (the author hasn't been serious about pushing it
upstream). In that case the semantics are redirecting to a file
descriptor. Could we have something like that here which points
to whatever representation $cpu_port has? Sounds like semantics
for "trap and forward" are just "mirror and forward".

I think there is value in having something like trap action
which generalizes the combinations only to the fact that
it will make it easier to relay the info to the offload without
much transformation.
If i was to do it i would write one action configured by user space:
- to return DROP if you want action trap-and-drop semantics.
- to return STOLEN if you want trap
- to return PIPE if you want trap and forward. You will need a second
action composed to forward.

I said dummy because this action has no value in s/w. Someone could
use it in s/w but it would be no different than gact.
Maybe it could be extended to work also in s/w by adding the "trap to
fd" in userspace.
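
A minimal sketch of that single-action idea, in C. It only illustrates
the mapping listed above; the type and function names are hypothetical,
the enum values are not the kernel's TC_ACT_* constants, and nothing
here is taken from an existing action module.

```c
#include <assert.h>
#include <stdbool.h>

/* Return codes named after the ones in the list above; values are illustrative. */
enum ret { RET_DROP, RET_STOLEN, RET_PIPE };

/* Hypothetical mode that user space would configure on the action. */
enum trap_mode { TRAP_AND_DROP, TRAP_ONLY, TRAP_AND_FORWARD };

struct trap_action {
	enum trap_mode mode;
};

/* One action; only the configured return code differs between the modes. */
enum ret trap_act(const struct trap_action *a)
{
	switch (a->mode) {
	case TRAP_AND_DROP:
		return RET_DROP;
	case TRAP_ONLY:
		return RET_STOLEN;
	case TRAP_AND_FORWARD:
		return RET_PIPE;
	}
	assert(!"unknown trap mode");
	return RET_DROP;
}

/* PIPE is the only one of the three that continues to the next action. */
bool runs_next_action(enum ret r)
{
	return r == RET_PIPE;
}
```

With mode set to TRAP_AND_FORWARD the action returns PIPE, so a second
action composed after it would be the one that does the forwarding.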

cheers,
jamal
Petr Machata April 9, 2021, 1:43 p.m. UTC | #5
Jamal Hadi Salim <jhs@mojatatu.com> writes:

> On 2021-04-09 7:03 a.m., Petr Machata wrote:
>> Jamal Hadi Salim <jhs@mojatatu.com> writes:
>>
>>> I am concerned about adding new opcodes which only make sense if you
>>> offload (or make sense only if you are running in s/w).
>>>
>>> Those opcodes are intended to be generic abstractions so the dispatcher
>>> can decide what to do next.
>>> [...]
>>> For details see:
>>> https://people.netfilter.org/pablo/netdev0.1/papers/Linux-Traffic-Control-Classifier-Action-Subsystem-Architecture.pdf
>>
>> Trap has been in since 4.13, so 2017ish. It's done and dusted at this
>> point.
>
> here's how it translates:
> "We already made a mistake, therefore, it's ok to build on it and
> make more mistakes".

I can see how it reads that way, but that was not the intention. I was
actually thinking about whether there might be a way to gradually
migrate all this stuff over to mirred, but at this point, trap is very
much baked in.

>>> IMO:
>>> It seems to me there are two actions here encapsulated in one.
>>> The first is to "trap" and the second is to "drop".
>>>
>>> This is no different semantically than say "mirror and drop"
>>> offload being enunciated by "skip_sw".
>>>
>>> Does the spectrum not support multiple actions?
>>> e.g. with a policy like:
>>>   match blah action trap action drop skip_sw
>> Trap drops implicitly. We need a "trap, but don't drop". Expressed in
>> terms of existing actions it would be "mirred egress redirect dev
>> $cpu_port". But how to express $cpu_port except again by a HW-specific
>> magic token I don't know.

(I meant mirred egress mirror, not redirect.)

> Note: mirred was originally intended to send redirect/mirror
> packets to user space (the comment is still there in the code).
> In fact there is a patch lying around somewhere that does that with
> packet sockets (the author hasn't been serious about pushing it
> upstream). In that case the semantics are redirecting to a file
> descriptor. Could we have something like that here which points
> to whatever representation $cpu_port has? Sounds like semantics
> for "trap and forward" are just "mirror and forward".

Hmm, we have devlink ports, the CPU port is exposed there. But that's
the only thing that comes to mind. Those are specific for the given
device though, it doesn't look suitable...

> I think there is value in having something like trap action
> which generalizes the combinations only to the fact that
> it will make it easier to relay the info to the offload without
> much transformation.
> If i was to do it i would write one action configured by user space:
> - to return DROP if you want action trap-and-drop semantics.
> - to return STOLEN if you want trap
> - to return PIPE if you want trap and forward. You will need a second
> action composed to forward.

I think your STOLEN and PIPE are the same behavior. Both are "transfer
the packet to the SW datapath, but keep it in the HW datapath".

In general I have no issue expressing this stuff as a new action,
instead of an opcode. I'll take a look at this.
Jamal Hadi Salim April 11, 2021, 7:23 p.m. UTC | #6
On 2021-04-09 9:43 a.m., Petr Machata wrote:
> Jamal Hadi Salim <jhs@mojatatu.com> writes:
>
>>>> Does the spectrum not support multiple actions?
>>>> e.g. with a policy like:
>>>>    match blah action trap action drop skip_sw
>>> Trap drops implicitly. We need a "trap, but don't drop". Expressed in
>>> terms of existing actions it would be "mirred egress redirect dev
>>> $cpu_port". But how to express $cpu_port except again by a HW-specific
>>> magic token I don't know.
>
> (I meant mirred egress mirror, not redirect.)

Ok.

>> Note: mirred was originally intended to send redirect/mirror
>> packets to user space (the comment is still there in the code).
>> In fact there is a patch lying around somewhere that does that with
>> packet sockets (the author hasn't been serious about pushing it
>> upstream). In that case the semantics are redirecting to a file
>> descriptor. Could we have something like that here which points
>> to whatever representation $cpu_port has? Sounds like semantics
>> for "trap and forward" are just "mirror and forward".
>
> Hmm, we have devlink ports, the CPU port is exposed there. But that's
> the only thing that comes to mind. Those are specific for the given
> device though, it doesn't look suitable...

If it has an ifindex, that should be good enough for abstraction
purposes.

>> I think there is value in having something like trap action
>> which generalizes the combinations only to the fact that
>> it will make it easier to relay the info to the offload without
>> much transformation.
>> If i was to do it i would write one action configured by user space:
>> - to return DROP if you want action trap-and-drop semantics.
>> - to return STOLEN if you want trap
>> - to return PIPE if you want trap and forward. You will need a second
>> action composed to forward.
>
> I think your STOLEN and PIPE are the same behavior. Both are "transfer
> the packet to the SW datapath, but keep it in the HW datapath".
>
> In general I have no issue expressing this stuff as a new action,
> instead of an opcode. I'll take a look at this.

Ok, thanks.

cheers,
jamal