Message ID | 20220818170005.747015-1-dima@arista.com |
---|---|
Series | net/tcp: Add TCP-AO support |
On 8/21/22 2:34 PM, Leonard Crestez wrote:
> On 8/18/22 19:59, Dmitry Safonov wrote:
>> This patchset implements the TCP-AO option as described in RFC5925. There
>> is a request from industry to move away from TCP-MD5SIG and it seems the
>> time is right to have a TCP-AO upstreamed. This TCP option is meant to
>> replace the TCP MD5 option and address its shortcomings. Specifically, it
>> provides more secure hashing, key rotation and support for long-lived
>> connections (see the summary of TCP-AO advantages over TCP-MD5 in (1.3)
>> of RFC5925). The patch series starts with six patches that are not
>> specific to TCP-AO but implement a general crypto facility that we thought
>> is useful to eliminate code duplication between TCP-MD5SIG and TCP-AO as
>> well as other crypto users. These six patches are being submitted
>> separately in a different patchset [1]. Including them here will show
>> better the gain in code sharing. Next are 18 patches that implement the
>> actual TCP-AO option, followed by patches implementing selftests.
>>
>> The patch set was written as a collaboration of three authors (in
>> alphabetical order): Dmitry Safonov, Francesco Ruggeri and Salam
>> Noureddine. Additional credits should be given to Prasad Koya, who was
>> involved in early prototyping a few years back. There is also a separate
>> submission done by Leonard Crestez whom we thank for his efforts getting
>> an implementation of RFC5925 submitted for review upstream [2]. This is
>> an independent implementation that makes different design decisions.
>
> Is this based on something that Arista has had running for a while now
> or is a recent new development?
>

...

> Seeing an entirely distinct unrelated implementation is very unexpected.
> What made you do this?
>

I am curious as well. You are well aware of Leonard's efforts which go
back a long time, why go off and do a separate implementation?

On Sun, Aug 21, 2022 at 1:34 PM Leonard Crestez <cdleonard@gmail.com> wrote:
>
> On 8/18/22 19:59, Dmitry Safonov wrote:
> > This patchset implements the TCP-AO option as described in RFC5925. There
> > is a request from industry to move away from TCP-MD5SIG and it seems the
> > time is right to have a TCP-AO upstreamed. This TCP option is meant to
> > replace the TCP MD5 option and address its shortcomings.
...
> >
> > The patch set was written as a collaboration of three authors (in
> > alphabetical order): Dmitry Safonov, Francesco Ruggeri and Salam
> > Noureddine. Additional credits should be given to Prasad Koya, who was
> > involved in early prototyping a few years back. There is also a separate
> > submission done by Leonard Crestez whom we thank for his efforts getting
> > an implementation of RFC5925 submitted for review upstream [2]. This is
> > an independent implementation that makes different design decisions.
>
> Is this based on something that Arista has had running for a while now
> or is a recent new development?
>

This is based on prototype code we had worked on internally three years
ago. The implementation effort was restarted to get it over the finish
line. For business reasons we had to have our own implementation ready and
not tied to unmerged upstream code. Please also note that our
implementation is based on linux-4.19 and was only ported forward to
mainline recently. So it wasn’t ready to be submitted upstream.

> > For example, we chose a similar design to the TCP-MD5SIG implementation
> > and used setsockopt()s to program per-socket keys, avoiding the extra
> > complexity of managing a centralized key database in the kernel. A
> > centralized database in the kernel has dubious benefits since it doesn’t
> > eliminate per-socket setsockopts needed to specify which sockets need
> > TCP-AO and what are the currently preferred keys. It also complicates
> > traffic key caching and preventing deletion of in-use keys.
>
> My implementation started with per-socket lists but switched to a global
> list because this way is much easier to manage from userspace. In
> practice userspace apps will want to ensure that all sockets use the
> same set of keys anyway.
>

We did consider a global list early on but we didn’t find it beneficial.
We still believe that per-socket lists reduce complexity of the
implementation, are more scalable and ensure predictable behavior. Our
expectation is that TCP-AO will be only useful for a limited set of
routing applications, rather than used transparently like IPSEC for
non-routing apps. We would be happy to discuss this in more detail.

> > In this implementation, a centralized database of keys can be thought of
> > as living in user space and user applications would have to program those
> > keys on matching sockets. On the server side, the user application
> > programs keys (MKTS in TCP-AO nomenclature) on the listening socket for
> > all peers that are expected to connect. Prefix matching on the peer
> > address is supported.
...
>
> My series doesn't try to prevent inconsistencies inside the key lists
> because it's not clear that the kernel should prevent userspace from
> shooting itself in the foot. Worst case is connection failure on
> misconfiguration which seems fine.
>
> The RFC doesn't specify in detail how key management is to be performed,
> for example if two valid keys are available it doesn't mention which one
> should be used. Some guidance is found in RFC8177 but again not very much.
>
> I implemented an ABI that can be used by userspace for RFC8177-style key
> management and asked for feedback but received very little. If you had
> come with a clear ABI proposal I would have tried to implement it.
>
> Here's a link to our older discussion:
>
> https://lore.kernel.org/netdev/e7f0449a-2bad-99ad-4737-016a0e6b8b84@gmail.com/
>
> Seeing an entirely distinct unrelated implementation is very unexpected.
> What made you do this?
>
> --
> Regards,
> Leonard

Our goal was not to have a competing TCP-AO upstream submission but to
implement the RFC for our customers to use. Had there been an already
upstreamed implementation we would have used it and implemented customer
requirements on top of it, just like we do with all other kernel features.

This is not a bad situation; we believe it is good for the upstream
community to have two fully functioning implementations to consider.
Possibly a third collaborative submission might emerge that takes the best
of both. A year ago, there wasn’t much available online about TCP-AO
besides the RFC. We are excited by the current interest in TCP-AO and hope
to see it upstreamed soon.

Best,
Salam

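The per-socket model described in this message, where the key database lives in user space and the application programs matching keys onto each socket, can be sketched roughly as follows. This is only an illustration under stated assumptions: `Mkt`, `SocketKeys` and `program_socket` are hypothetical names, not taken from either patch set, and `add_key()` merely stands in for whatever setsockopt() call a real application would issue.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Mkt:
    """Hypothetical user-space record for one Master Key Tuple (MKT)."""
    peer_prefix: str  # peers this key applies to, e.g. "10.0.0.0/24"
    send_id: int
    recv_id: int
    secret: bytes


@dataclass
class SocketKeys:
    """Stand-in for the per-socket key list that the kernel would hold."""
    peer: str
    keys: list = field(default_factory=list)

    def add_key(self, mkt):
        # A real application would call setsockopt() on the socket here.
        self.keys.append(mkt)


def program_socket(database, sock):
    """Copy every MKT whose prefix covers the socket's peer onto that socket."""
    peer = ipaddress.ip_address(sock.peer)
    for mkt in database:
        if peer in ipaddress.ip_network(mkt.peer_prefix):
            sock.add_key(mkt)
    return sock


# The "centralized database" is just application state in user space.
database = [
    Mkt("10.0.0.0/24", send_id=1, recv_id=1, secret=b"alpha"),
    Mkt("10.0.0.0/24", send_id=2, recv_id=2, secret=b"beta"),
    Mkt("192.0.2.0/24", send_id=1, recv_id=1, secret=b"gamma"),
]
sock = program_socket(database, SocketKeys(peer="10.0.0.7"))
```

In this sketch the kernel never sees keys for peers the socket does not talk to, which is the property the per-socket design relies on; the cost is that user space has to repeat the programming step for every socket.
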
Hi Leonard, David,

On 8/22/22 00:51, David Ahern wrote:
> On 8/21/22 2:34 PM, Leonard Crestez wrote:
>> On 8/18/22 19:59, Dmitry Safonov wrote:
[..]
>> Seeing an entirely distinct unrelated implementation is very unexpected.
>> What made you do this?
>>
>
> I am curious as well.
> You are well aware of Leonard's efforts which go back a long time, why go
> off and do a separate implementation?

When I started working on this, there was a prototype that was neither
good for upstream, nor for customers. At the moment Leonard submitted
his RFC, I was already giving feedback/reviews to local code and
patches. So, as I was aware of the details of TCP-AO, I started giving
Leonard feedback and reviews, based on what I've learned from RFC/code.
I thought whatever code will make it upstream, it can benefit from my
reviews. Some of my comments were based on better code I saw locally,
or a way of improving it that I've suggested to both sides.

I'm quite happy that Leonard addressed some of my comments (e.g.
extendable syscalls), I see that it improved his patches.
On the other hand, some of the comments I've left moved to "known
limitations" with no foreseeable plan to fix them, while they were
addressed in local/Arista code.

And now a little bit more than a year later, it seems that the quality
of local patches has reached a point where they can be submitted for
an upstream review. So, please don't misunderstand me, it's not that
"drop your implementation, take ours" and it's not that we've
intentionally hidden that we're working on TCP-AO. It's that it is the
first moment we can make upstream aware of an alternative implementation.

Personally, I think it's best for the open source community:
- Arista's implementation is now public
- there are now at least 4 people that understand RFC5925 and the
  code/details
- in a discussion, it will be possible to find what will be the best
  from both implementations for Linux and come up with better code

At this particular moment, it seems neither of the patch sets is ready to
be merged "as-is". But it seems that there's enough interest from both
sides and likely it guarantees that there will be enough effort to make
something merge-able, that will work for all interested parties.

As for my part, I'm interested in the best code upstream, regardless of
who the author is. This includes:
- reusing the existing TCP-MD5 code, rather than copying'n'pasting for
  TCP-AO with intent to refactor it some day later
- making setsockopt()s and other syscalls extendable
- covering functionality with selftests
- following RFC5925 in implementation, especially "required" and "must"
  parts

I hope that clarifies how and why now there are two patch sets that
implement the same RFC/functionality.

Thanks,
Dmitry

On 8/22/22 23:35, Dmitry Safonov wrote:
> Hi Leonard, David,
[..]
> At this particular moment, it seems neither of patch sets is ready to be
> merged "as-is".
> But it seems that there's enough interest from both
> sides and likely it guarantees that there will be enough effort to make
> something merge-able, that will work for all interested parties.
>
> As for my part, I'm interested in the best code upstream, regardless of
> who the author is. This includes:
> - reusing the existing TCP-MD5 code, rather than copying'n'pasting for
>   TCP-AO with intent to refactor it some day later

I had a requirement to deploy on linux 5.4 so I very deliberately
avoided touching MD5. I'm not sure there is very much duplication anyway.

> - making setsockopt()s and other syscalls extendable
> - covering functionality with selftests

My implementation is tested with a standalone python package; this is a
design choice which doesn't particularly matter.

> - following RFC5925 in implementation, especially "required" and "must"
>   parts

I'm not convinced that "don't delete current key" needs to be literally
interpreted as a hard requirement for the linux ABI. Most TCP RFCs don't
specify any sort of API at all and it would be entirely valid to
implement BGP-TCP-AO as a single executable with no internally
documented boundaries.

> I hope that clarifies how and why now there are two patch sets that
> implement the same RFC/functionality.

As far as I can tell the biggest problem is that it is quite difficult to
implement the userspace side of TCP-AO complete with key rollover. Our
two implementations both claim to support this but through different ABIs
and both require active management from userspace.

I think it would make sense to push key validity times and the key
selection policy entirely in the kernel so that it can handle key
rotation/expiration by itself. This way userspace only has to configure
the keys and doesn't have to touch established connections at all.

My series has a "flags" field on the key struct where it can filter by
IP, prefix, ifindex and so on.
It would be possible to add additional
flags for making the key only valid between certain times (by wall time).

The kernel could then make key selections itself:
- send rnextkeyid based on the key with the longest recv lifetime
- send keyid based on remote rnextkeyid.
- If not applicable (rnextkeyid not found locally, or for SYN) pick
  based on longest send lifetime.
- If all keys expire then return an error on write()
- Solve other ambiguities in a predictable way: if valid times are
  equal then pick the lowest numeric send_id or recv_id.

Explicit key selection from userspace could still be supported but it
would be optional and most apps wouldn't bother implementing their own
policy. The biggest advantage is that it would be much easier for
applications to adopt TCP-AO.

--
Regards,
Leonard

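The selection rules proposed in this message are concrete enough to write down. The sketch below is only an illustration of that list, not code from either series: the `Key` fields, the function names and the `AllKeysExpired` exception are hypothetical, the thread does not say which error write() would return, and treating an expired key as "not found locally" is an assumption made here.

```python
from dataclasses import dataclass


class AllKeysExpired(Exception):
    """Hypothetical error standing in for the write() failure in the proposal."""


@dataclass(frozen=True)
class Key:
    send_id: int
    recv_id: int
    send_until: float  # end of send validity, on any consistent clock
    recv_until: float  # end of recv validity


def pick_rnext_key(keys, now):
    """Key whose recv_id we advertise as rnextkeyid: longest recv lifetime,
    lowest recv_id when lifetimes are equal. Returns None if none is valid."""
    valid = [k for k in keys if k.recv_until > now]
    if not valid:
        return None
    return min(valid, key=lambda k: (-k.recv_until, k.recv_id))


def pick_send_key(keys, now, remote_rnextkeyid=None):
    """Key used to sign outgoing segments."""
    valid = [k for k in keys if k.send_until > now]
    if not valid:
        raise AllKeysExpired("no key with a valid send lifetime")
    # Follow the peer's request when it names a key we can still send with.
    if remote_rnextkeyid is not None:
        for k in valid:
            if k.send_id == remote_rnextkeyid:
                return k
    # SYN, or the requested key is unknown: longest send lifetime,
    # lowest send_id when lifetimes are equal.
    return min(valid, key=lambda k: (-k.send_until, k.send_id))


keys = [
    Key(send_id=1, recv_id=1, send_until=100.0, recv_until=100.0),
    Key(send_id=2, recv_id=2, send_until=200.0, recv_until=200.0),
    Key(send_id=3, recv_id=3, send_until=200.0, recv_until=300.0),
]
```

With these keys, a SYN at time 0 would be signed with send_id 2 (keys 2 and 3 tie on send lifetime, and the lower id wins) while advertising recv_id 3 as rnextkeyid.
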
On 8/23/22 16:30, Leonard Crestez wrote:
> On 8/22/22 23:35, Dmitry Safonov wrote:
>> Hi Leonard, David,
[..]
>> At this particular moment, it seems neither of patch sets is ready to be
>> merged "as-is". But it seems that there's enough interest from both
>> sides and likely it guarantees that there will be enough effort to make
>> something merge-able, that will work for all interested parties.
>>
>> As for my part, I'm interested in the best code upstream, regardless of
>> who the author is. This includes:
>> - reusing the existing TCP-MD5 code, rather than copying'n'pasting for
>>   TCP-AO with intent to refactor it some day later
>
> I had a requirement to deploy on linux 5.4 so I very deliberately
> avoided touching MD5. I'm not sure there is very much duplication anyway.

Yeah, I know what you mean: we deployed it on v4.19.
But for the code upstream I personally prefer to see "reusing" rather
than copying. Less code is easier to maintain in the future. Upstream
submissions in my view should be based on "what would be easier to
maintain in future", rather than on "what would be easier to backport
to my maintenance release".

>> - making setsockopt()s and other syscalls extendable
>> - covering functionality with selftests
>
> My implementation is tested with a standalone python package; this is a
> design choice which doesn't particularly matter.
>
>> - following RFC5925 in implementation, especially "required" and "must"
>>   parts
>
> I'm not convinced that "don't delete current key" needs to be literally
> interpreted as a hard requirement for the linux ABI. Most TCP RFCs don't
> specify any sort of API at all and it would be entirely valid to
> implement BGP-TCP-AO as a single executable with no internally
> documented boundaries.

I agree that RFC requirements and "musts" can be implemented in
userspace, rather than in kernel.
On the other hand, my opinion is that if you have
"must"/"must not"/"required" in RFC and it's not hard to limit those in
kernel, then you _should_ do it. From this point of view, debugging "hey,
setsockopt() for key removal returned -EBUSY, what's going on?" is
better than "hey, tcp connection died on my side and I didn't have
tcp dump running, what was that?".

>> I hope that clarifies how and why now there are two patch sets that
>> implement the same RFC/functionality.
>
> As far as I can tell the biggest problem is that it is quite difficult to
> implement the userspace side of TCP-AO complete with key rollover. Our
> two implementations both claim to support this but through different ABIs
> and both require active management from userspace.
>
> I think it would make sense to push key validity times and the key
> selection policy entirely in the kernel so that it can handle key
> rotation/expiration by itself. This way userspace only has to configure
> the keys and doesn't have to touch established connections at all.

Respectfully I disagree here. I think all such policies should be
implemented in userspace. The kernel has to have as little as possible,
but enough to provide a way to sign, verify, log messages on TCP
segments. All the logic that may change, all business decisions and key
management should be implemented in userspace, keeping the kernel part
as simple, in the "KISS" sense, as possible.

> My series has a "flags" field on the key struct where it can filter by
> IP, prefix, ifindex and so on. It would be possible to add additional
> flags for making the key only valid between certain times (by wall time).
>
> The kernel could then make key selections itself:
> - send rnextkeyid based on the key with the longest recv lifetime
> - send keyid based on remote rnextkeyid.
> - If not applicable (rnextkeyid not found locally, or for SYN) pick
>   based on longest send lifetime.
> - If all keys expire then return an error on write()
> - Solve other ambiguities in a predictable way: if valid times are
>   equal then pick the lowest numeric send_id or recv_id.
>
> Explicit key selection from userspace could still be supported but it
> would be optional and most apps wouldn't bother implementing their own
> policy. The biggest advantage is that it would be much easier for
> applications to adopt TCP-AO.

Personally, I would think that all you mentioned had better stay in the
userspace app. The kernel should do as minimal and as predictable a job
as possible here, without 10 possible outcomes.

If you want to share the logic of key rotation/expiration, all timers
and synchronization between different BGP applications, that would be
a proper job for a shared library.

Thanks,
Dmitry

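The "-EBUSY on key removal" behaviour argued for in this message can be modelled in a few lines. This is a hedged sketch of the idea only: `SocketKeyList` and its methods are hypothetical and do not reflect the actual data structures or setsockopt() interface of either patch set.

```python
import errno


class SocketKeyList:
    """Hypothetical model of one socket's key list."""

    def __init__(self):
        self.keys = {}       # send_id -> secret
        self.current = None  # send_id currently used to sign segments

    def add(self, send_id, secret):
        self.keys[send_id] = secret
        if self.current is None:
            self.current = send_id

    def set_current(self, send_id):
        if send_id not in self.keys:
            raise OSError(errno.ENOENT, "no such key")
        self.current = send_id

    def delete(self, send_id):
        if send_id not in self.keys:
            raise OSError(errno.ENOENT, "no such key")
        if send_id == self.current:
            # Refuse early with a clear error instead of letting the
            # connection die later when segments can no longer be signed.
            raise OSError(errno.EBUSY, "key is in use")
        del self.keys[send_id]


key_list = SocketKeyList()
key_list.add(1, b"old")
key_list.add(2, b"new")
```

Here `key_list.delete(1)` fails with EBUSY until the application has switched with `key_list.set_current(2)`, so the mistake shows up at the call site rather than as a dropped connection.
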
> I think it would make sense to push key validity times and the key selection
> policy entirely in the kernel so that it can handle key rotation/expiration
> by itself. This way userspace only has to configure the keys and doesn't
> have to touch established connections at all.

I know nothing about TCP-AO, nor much about kTLS. But doesn't kTLS
have the same issue? Is there anything which can be learnt from kTLS?
Maybe the same mechanisms can be used? No point inventing something
new if you can copy/refactor working code?

> My series has a "flags" field on the key struct where it can filter by IP,
> prefix, ifindex and so on. It would be possible to add additional flags for
> making the key only valid between certain times (by wall time).

Watch out for wall clock time, it jumps around in funny ways. Plus the
kernel has no idea what time zone the wall the wall clock is mounted
on is in.

Andrew

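The wall-clock concern can be illustrated from the user-space side. The sketch below assumes a key's lifetime is tracked as a duration against a monotonic clock, which Python documents as unaffected by system clock updates; `KeyLifetime` is a hypothetical helper, and this is not how either series handles validity times.

```python
import time


class KeyLifetime:
    """Hypothetical helper: remaining key lifetime on a monotonic clock."""

    def __init__(self, valid_for_seconds, clock=time.monotonic):
        # The clock is injectable so the behaviour can be exercised in tests.
        self._clock = clock
        self._deadline = clock() + valid_for_seconds

    def expired(self):
        return self._clock() >= self._deadline

    def remaining(self):
        return max(0.0, self._deadline - self._clock())


# Usage with a fake clock, to show the expiry logic without waiting.
fake_now = [100.0]
lifetime = KeyLifetime(30, clock=lambda: fake_now[0])
```

A deadline kept this way does not move when the system time is stepped or the time zone changes, but it is local to one running process, so peers would still have to agree on the schedule by some other means.
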
On 8/24/22 15:46, Andrew Lunn wrote:
>> I think it would make sense to push key validity times and the key selection
>> policy entirely in the kernel so that it can handle key rotation/expiration
>> by itself. This way userspace only has to configure the keys and doesn't
>> have to touch established connections at all.
>
> I know nothing about TCP-AO, nor much about kTLS. But doesn't kTLS
> have the same issue? Is there anything which can be learnt from kTLS?
> Maybe the same mechanisms can be used? No point inventing something
> new if you can copy/refactor working code?
>
>> My series has a "flags" field on the key struct where it can filter by IP,
>> prefix, ifindex and so on. It would be possible to add additional flags for
>> making the key only valid between certain times (by wall time).
>
> Watch out for wall clock time, it jumps around in funny ways. Plus the
> kernel has no idea what time zone the wall the wall clock is mounted
> on is in.

A close equivalent seems to exist in ipsec in the "xfrm_lifetime_cfg"
struct, specifically the soft/hard expires timers. These are optional
validity times for each xfrm_state, which is equivalent to a "key".

I'm not familiar with how those are used, but ipsec usually relies on
complex userspace daemons for managing xfrm states and policies, and
those daemons should be capable of adding and removing keys based on
internal timers. Still, the linux kernel supports checking for key
validity on its own.

--
Regards,
Leonard