[API-NEXT,PATCHv4,1/4] api: packet reference count support

Message ID 1444395547-13968-2-git-send-email-maxim.uvarov@linaro.org
State New
Headers show

Commit Message

Maxim Uvarov Oct. 9, 2015, 12:59 p.m.
Add api for packet reference count support. Which is useful in case:
 - multicast support
 - TCP retransmission
 - traffic generator

Introduced new call: newpkt = odp_packet_create_ref(pkt) which creates
reference to original packet, but handles for reference packet and original
are different.

Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org>
---
 include/odp/api/config.h | 5 +++++
 include/odp/api/packet.h | 9 +++++++++
 2 files changed, 14 insertions(+)

Comments

Nicolas Morey-Chaisemartin Oct. 9, 2015, 1:04 p.m. | #1
On 10/09/2015 02:59 PM, Maxim Uvarov wrote:
> Add api for packet reference count support. Which is useful in case:
>  - multicast support
>  - TCP retransmission
>  - traffic generator
>
> Introduced new call: newpkt = odp_packet_create_ref(pkt) which creates
> reference to original packet, but handles for reference packet and original
> are different.
"May" be different  more than "are". But not a big deal as only in the commit message.

>
> Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org>
> ---
>  include/odp/api/config.h | 5 +++++
>  include/odp/api/packet.h | 9 +++++++++
>  2 files changed, 14 insertions(+)
>
> diff --git a/include/odp/api/config.h b/include/odp/api/config.h
> index 295b10d..985290a 100644
> --- a/include/odp/api/config.h
> +++ b/include/odp/api/config.h
> @@ -126,6 +126,11 @@ extern "C" {
>   */
>  #define ODP_CONFIG_PACKET_BUF_LEN_MAX (ODP_CONFIG_PACKET_SEG_LEN_MIN*6)
>  
> +/**
> + * Maximum packet references.
> + */
> +#define ODP_CONFIG_PACKET_REFS 2
> +
>  /** Maximum number of shared memory blocks.
>   *
>   * This the the number of separate SHM areas that can be reserved concurrently
> diff --git a/include/odp/api/packet.h b/include/odp/api/packet.h
> index 5d46b7b..f9745fb 100644
> --- a/include/odp/api/packet.h
> +++ b/include/odp/api/packet.h
> @@ -125,6 +125,15 @@ odp_packet_t odp_packet_from_event(odp_event_t ev);
>   */
>  odp_event_t odp_packet_to_event(odp_packet_t pkt);
>  
> +/**
> + * Create reference for packet handle
> + *
> + * @param pkt  Packet handle
> + *
> + * @return New packet handle
> + */
> +odp_packet_t odp_packet_create_ref(odp_packet_t pkt);
> +
>  /*
>   *
>   * Pointers and lengths
Ola Liljedahl Oct. 15, 2015, 8:49 p.m. | #2
On 9 October 2015 at 08:59, Maxim Uvarov <maxim.uvarov@linaro.org> wrote:

> Add api for packet reference count support. Which is useful in case:
>  - multicast support
>
Different packets will probably need different headers prepended to the
shared packet. How is this handled?

 - TCP retransmission
>
When a TCP packet is retransmitted, the IP id should be changed (similar
for QUIC retransmissions where the SN is updated). So you might have to
modify the packet (header) before you know that the previous transmit has
actually completed. How is this handled?

 - traffic generator
>
A traffic generator might want to include an increasing SN or timestamp in
each packet (packet payload) which otherwise are identical. How is this
handled?


>
> Introduced new call: newpkt = odp_packet_create_ref(pkt) which creates
> reference to original packet, but handles for reference packet and original
> are different.
>
I think something like Bill's "copy-on-write" proposal (from Connect) is
better than explicit reference counting (of the whole packet). Increase the
abstraction level and allow for innovation in the underlying implementation.

Any API support should be based on use case analysis and code examples of
how the use cases are implemented using the suggested API's. Otherwise the
risk of missing the target is significant.


> Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org>
> ---
>  include/odp/api/config.h | 5 +++++
>  include/odp/api/packet.h | 9 +++++++++
>  2 files changed, 14 insertions(+)
>
> diff --git a/include/odp/api/config.h b/include/odp/api/config.h
> index 295b10d..985290a 100644
> --- a/include/odp/api/config.h
> +++ b/include/odp/api/config.h
> @@ -126,6 +126,11 @@ extern "C" {
>   */
>  #define ODP_CONFIG_PACKET_BUF_LEN_MAX (ODP_CONFIG_PACKET_SEG_LEN_MIN*6)
>
> +/**
> + * Maximum packet references.
> + */
> +#define ODP_CONFIG_PACKET_REFS 2
>
???
Can a packet only have two references? Why? Why is there a limit at all?



> +
>  /** Maximum number of shared memory blocks.
>   *
>   * This the the number of separate SHM areas that can be reserved
> concurrently
> diff --git a/include/odp/api/packet.h b/include/odp/api/packet.h
> index 5d46b7b..f9745fb 100644
> --- a/include/odp/api/packet.h
> +++ b/include/odp/api/packet.h
> @@ -125,6 +125,15 @@ odp_packet_t odp_packet_from_event(odp_event_t ev);
>   */
>  odp_event_t odp_packet_to_event(odp_packet_t pkt);
>
> +/**
> + * Create reference for packet handle
> + *
> + * @param pkt  Packet handle
> + *
> + * @return New packet handle
> + */
> +odp_packet_t odp_packet_create_ref(odp_packet_t pkt);
> +
>  /*
>   *
>   * Pointers and lengths
> --
> 1.9.1
>
> _______________________________________________
> lng-odp mailing list
> lng-odp@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/lng-odp
>
Maxim Uvarov Oct. 20, 2015, 3:41 p.m. | #3
On 10/15/2015 23:49, Ola Liljedahl wrote:
> On 9 October 2015 at 08:59, Maxim Uvarov <maxim.uvarov@linaro.org 
> <mailto:maxim.uvarov@linaro.org>> wrote:
>
>     Add api for packet reference count support. Which is useful in case:
>      - multicast support
>
> Different packets will probably need different headers prepended to 
> the shared packet. How is this handled?
>

That has to be handled with reference counts for segments support. For 
now you have packet, you can copy body of packet to new packet and put 
your own header.

>      - TCP retransmission
>
> When a TCP packet is retransmitted, the IP id should be changed 
> (similar for QUIC retransmissions where the SN is updated). So you 
> might have to modify the packet (header) before you know that the 
> previous transmit has actually completed. How is this handled?
>

Same as above. Without segmentation you can only copy to new packet.

>      - traffic generator
>
> A traffic generator might want to include an increasing SN or 
> timestamp in each packet (packet payload) which otherwise are 
> identical. How is this handled?
Same question. If reference count for hole packet is more then 1 you can 
not modify that packet.

You can modify packet after send:

pkt = odp_packet_create_ref(pkt);
pktio_send(pkt)
modify (pkt)
pkt = odp_packet_create_ref(pkt);
pktio_send(pkt)
modify (pkt)
........


Maxim.
>
>
>     Introduced new call: newpkt = odp_packet_create_ref(pkt) which creates
>     reference to original packet, but handles for reference packet and
>     original
>     are different.
>
> I think something like Bill's "copy-on-write" proposal (from Connect) 
> is better than explicit reference counting (of the whole packet). 
> Increase the abstraction level and allow for innovation in the 
> underlying implementation.
>
> Any API support should be based on use case analysis and code examples 
> of how the use cases are implemented using the suggested API's. 
> Otherwise the risk of missing the target is significant.
>
>
>     Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org
>     <mailto:maxim.uvarov@linaro.org>>
>     ---
>      include/odp/api/config.h | 5 +++++
>      include/odp/api/packet.h | 9 +++++++++
>      2 files changed, 14 insertions(+)
>
>     diff --git a/include/odp/api/config.h b/include/odp/api/config.h
>     index 295b10d..985290a 100644
>     --- a/include/odp/api/config.h
>     +++ b/include/odp/api/config.h
>     @@ -126,6 +126,11 @@ extern "C" {
>       */
>      #define ODP_CONFIG_PACKET_BUF_LEN_MAX
>     (ODP_CONFIG_PACKET_SEG_LEN_MIN*6)
>
>     +/**
>     + * Maximum packet references.
>     + */
>     +#define ODP_CONFIG_PACKET_REFS 2
>
> ???
> Can a packet only have two references? Why? Why is there a limit at all?
>
>     +
>      /** Maximum number of shared memory blocks.
>       *
>       * This the the number of separate SHM areas that can be reserved
>     concurrently
>     diff --git a/include/odp/api/packet.h b/include/odp/api/packet.h
>     index 5d46b7b..f9745fb 100644
>     --- a/include/odp/api/packet.h
>     +++ b/include/odp/api/packet.h
>     @@ -125,6 +125,15 @@ odp_packet_t
>     odp_packet_from_event(odp_event_t ev);
>       */
>      odp_event_t odp_packet_to_event(odp_packet_t pkt);
>
>     +/**
>     + * Create reference for packet handle
>     + *
>     + * @param pkt  Packet handle
>     + *
>     + * @return New packet handle
>     + */
>     +odp_packet_t odp_packet_create_ref(odp_packet_t pkt);
>     +
>      /*
>       *
>       * Pointers and lengths
>     --
>     1.9.1
>
>     _______________________________________________
>     lng-odp mailing list
>     lng-odp@lists.linaro.org <mailto:lng-odp@lists.linaro.org>
>     https://lists.linaro.org/mailman/listinfo/lng-odp
>
>
Ola Liljedahl Oct. 21, 2015, 12:27 p.m. | #4
On 20 October 2015 at 17:41, Maxim Uvarov <maxim.uvarov@linaro.org> wrote:

> On 10/15/2015 23:49, Ola Liljedahl wrote:
>
>> On 9 October 2015 at 08:59, Maxim Uvarov <maxim.uvarov@linaro.org
>> <mailto:maxim.uvarov@linaro.org>> wrote:
>>
>>     Add api for packet reference count support. Which is useful in case:
>>      - multicast support
>>
>> Different packets will probably need different headers prepended to the
>> shared packet. How is this handled?
>>
>>
> That has to be handled with reference counts for segments support. For now
> you have packet, you can copy body of packet to new packet and put your own
> header.

So your proposal does not support multicasting (i.e. making it more
efficient, e.g. by avoiding copying the whole packet)?


>
>
>      - TCP retransmission
>>
>> When a TCP packet is retransmitted, the IP id should be changed (similar
>> for QUIC retransmissions where the SN is updated). So you might have to
>> modify the packet (header) before you know that the previous transmit has
>> actually completed. How is this handled?
>>
>>
> Same as above. Without segmentation you can only copy to new packet.

So this use case is also not supported?

In the normal case, the packet has already been sent when the
retransmission timer expires and you need to retransmit the packet. How can
the application see that it is OK to reuse (modify) the original packet and
not have to create a copy?
Without having to expose the reference counter, you could have a call which
conditionally creates a new packet (as a clone of the specified packet) if
the specified packet cannot be modified (because it has other references).
When transmission is complete, the system automatically frees the packet
(i.e. decreases the refcount). When the retransmission timer expires, the
application can call this function that either returns the original packet
(because it is no longer being "in transmission") or it clones the packet
and return the new handle. Safe and optimised for that normal case of
retransmission.


>
>      - traffic generator
>>
>> A traffic generator might want to include an increasing SN or timestamp
>> in each packet (packet payload) which otherwise are identical. How is this
>> handled?
>>
> Same question. If reference count for hole packet is more then 1 you can
> not modify that packet.
>
> You can modify packet after send:
>
I am not so sure. Packet transmission is not immediate, it is essentially
an asynchronous operation without any notification that it has completed.
pktio_send() does not make a copy of the packet so that you can immediately
modify the packet whose handle you kept. There is no specific ODP call that
indicates that you want to modify the packet and thus want a "clean" copy
that is safe to modify.


> pkt = odp_packet_create_ref(pkt);
> pktio_send(pkt)
> modify (pkt)
> pkt = odp_packet_create_ref(pkt);
> pktio_send(pkt)
> modify (pkt)
> ........
>

I don't see how this proposal actually solves any interesting problems. I
see reference counting as being related to packet segments (an mbuf or
skbuf represents a segment so they have a refcount per mbuf/skbuf (actually
skbufs are more complicated than this, the data has its own refcount,
different skbufs can refer to the same data but with separate start and
lengths)) but for ODP packets, the segmentation is not directly visible (we
don't link packets together), it happens under the hood with some limited
support so that you can access packet data directly.

ODP reference counting should support the relevant use cases (e.g.
multicasting, retransmission, pktgen can be seen as similar to multicasting
I think), making them more performant (and less memory consuming) than
creating new copies of every packet (which is what you need to do today). A
solution should be able to reuse common segments (which normally
constitutes the larger part of the packet data) for different packets that
may use different headers (a smaller part of the packet data). The API
might not need a specific function for creating a new packet reference.
Instead you specify a packet handle and the area (starting offset, size)
you want to be able to modify in a safe way (so not to disturb other users
of the packet). If there is only one reference to this part of the packet,
the implementation can return the original packet handle. If there are more
than one reference, a new packet needs to be allocated and the relevant
segments cloned (refcount==1) while (trailing) untouched segments can be
reused (refcount++).



>
> Maxim.
>
>>
>>
>>     Introduced new call: newpkt = odp_packet_create_ref(pkt) which creates
>>     reference to original packet, but handles for reference packet and
>>     original
>>     are different.
>>
>> I think something like Bill's "copy-on-write" proposal (from Connect) is
>> better than explicit reference counting (of the whole packet). Increase the
>> abstraction level and allow for innovation in the underlying implementation.
>>
>> Any API support should be based on use case analysis and code examples of
>> how the use cases are implemented using the suggested API's. Otherwise the
>> risk of missing the target is significant.
>>
>>
>>     Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org
>>     <mailto:maxim.uvarov@linaro.org>>
>>
>>     ---
>>      include/odp/api/config.h | 5 +++++
>>      include/odp/api/packet.h | 9 +++++++++
>>      2 files changed, 14 insertions(+)
>>
>>     diff --git a/include/odp/api/config.h b/include/odp/api/config.h
>>     index 295b10d..985290a 100644
>>     --- a/include/odp/api/config.h
>>     +++ b/include/odp/api/config.h
>>     @@ -126,6 +126,11 @@ extern "C" {
>>       */
>>      #define ODP_CONFIG_PACKET_BUF_LEN_MAX
>>     (ODP_CONFIG_PACKET_SEG_LEN_MIN*6)
>>
>>     +/**
>>     + * Maximum packet references.
>>     + */
>>     +#define ODP_CONFIG_PACKET_REFS 2
>>
>> ???
>> Can a packet only have two references? Why? Why is there a limit at all?
>>
>>     +
>>      /** Maximum number of shared memory blocks.
>>       *
>>       * This the the number of separate SHM areas that can be reserved
>>     concurrently
>>     diff --git a/include/odp/api/packet.h b/include/odp/api/packet.h
>>     index 5d46b7b..f9745fb 100644
>>     --- a/include/odp/api/packet.h
>>     +++ b/include/odp/api/packet.h
>>     @@ -125,6 +125,15 @@ odp_packet_t
>>     odp_packet_from_event(odp_event_t ev);
>>       */
>>      odp_event_t odp_packet_to_event(odp_packet_t pkt);
>>
>>     +/**
>>     + * Create reference for packet handle
>>     + *
>>     + * @param pkt  Packet handle
>>     + *
>>     + * @return New packet handle
>>     + */
>>     +odp_packet_t odp_packet_create_ref(odp_packet_t pkt);
>>     +
>>      /*
>>       *
>>       * Pointers and lengths
>>     --
>>     1.9.1
>>
>>     _______________________________________________
>>     lng-odp mailing list
>>     lng-odp@lists.linaro.org <mailto:lng-odp@lists.linaro.org>
>>     https://lists.linaro.org/mailman/listinfo/lng-odp
>>
>>
>>
>
Maxim Uvarov Oct. 21, 2015, 12:46 p.m. | #5
On 10/21/2015 15:27, Ola Liljedahl wrote:
> On 20 October 2015 at 17:41, Maxim Uvarov <maxim.uvarov@linaro.org 
> <mailto:maxim.uvarov@linaro.org>> wrote:
>
>     On 10/15/2015 23:49, Ola Liljedahl wrote:
>
>         On 9 October 2015 at 08:59, Maxim Uvarov
>         <maxim.uvarov@linaro.org <mailto:maxim.uvarov@linaro.org>
>         <mailto:maxim.uvarov@linaro.org
>         <mailto:maxim.uvarov@linaro.org>>> wrote:
>
>             Add api for packet reference count support. Which is
>         useful in case:
>              - multicast support
>
>         Different packets will probably need different headers
>         prepended to the shared packet. How is this handled?
>
>
>     That has to be handled with reference counts for segments support.
>     For now you have packet, you can copy body of packet to new packet
>     and put your own header.
>
> So your proposal does not support multicasting (i.e. making it more 
> efficient, e.g. by avoiding copying the whole packet)?

yes, that is true.

>
>
>              - TCP retransmission
>
>         When a TCP packet is retransmitted, the IP id should be
>         changed (similar for QUIC retransmissions where the SN is
>         updated). So you might have to modify the packet (header)
>         before you know that the previous transmit has actually
>         completed. How is this handled?
>
>
>     Same as above. Without segmentation you can only copy to new packet.
>
> So this use case is also not supported?
>
> In the normal case, the packet has already been sent when the 
> retransmission timer expires and you need to retransmit the packet. 
> How can the application see that it is OK to reuse (modify) the 
> original packet and not have to create a copy?
> Without having to expose the reference counter, you could have a call 
> which conditionally creates a new packet (as a clone of the specified 
> packet) if the specified packet cannot be modified (because it has 
> other references). When transmission is complete, the system 
> automatically frees the packet (i.e. decreases the refcount). When the 
> retransmission timer expires, the application can call this function 
> that either returns the original packet (because it is no longer being 
> "in transmission") or it clones the packet and return the new handle. 
> Safe and optimised for that normal case of retransmission.
>
>
>
>              - traffic generator
>
>         A traffic generator might want to include an increasing SN or
>         timestamp in each packet (packet payload) which otherwise are
>         identical. How is this handled?
>
>     Same question. If reference count for hole packet is more then 1
>     you can not modify that packet.
>
>     You can modify packet after send:
>
> I am not so sure. Packet transmission is not immediate, it is 
> essentially an asynchronous operation without any notification that it 
> has completed. pktio_send() does not make a copy of the packet so that 
> you can immediately modify the packet whose handle you kept. There is 
> no specific ODP call that indicates that you want to modify the packet 
> and thus want a "clean" copy that is safe to modify.
>
>
>     pkt = odp_packet_create_ref(pkt);
>     pktio_send(pkt)
>     modify (pkt)
>     pkt = odp_packet_create_ref(pkt);
>     pktio_send(pkt)
>     modify (pkt)
>     ........
>
>
> I don't see how this proposal actually solves any interesting 
> problems. I see reference counting as being related to packet segments 
> (an mbuf or skbuf represents a segment so they have a refcount per 
> mbuf/skbuf (actually skbufs are more complicated than this, the data 
> has its own refcount, different skbufs can refer to the same data but 
> with separate start and lengths)) but for ODP packets, the 
> segmentation is not directly visible (we don't link packets together), 
> it happens under the hood with some limited support so that you can 
> access packet data directly.
>
> ODP reference counting should support the relevant use cases (e.g. 
> multicasting, retransmission, pktgen can be seen as similar to 
> multicasting I think), making them more performant (and less memory 
> consuming) than creating new copies of every packet (which is what you 
> need to do today). A solution should be able to reuse common segments 
> (which normally constitutes the larger part of the packet data) for 
> different packets that may use different headers (a smaller part of 
> the packet data). The API might not need a specific function for 
> creating a new packet reference. Instead you specify a packet handle 
> and the area (starting offset, size) you want to be able to modify in 
> a safe way (so not to disturb other users of the packet). If there is 
> only one reference to this part of the packet, the implementation can 
> return the original packet handle. If there are more than one 
> reference, a new packet needs to be allocated and the relevant 
> segments cloned (refcount==1) while (trailing) untouched segments can 
> be reused (refcount++).

Ola, I understand and agree with your point.

In my understanding api can be about the same. Like:

newpkt = odp_packet_create_ref(oldpkt);

Then when you modify newpkt or oldpkt implementation should take care 
about segmentation in that packet. I.e. implementation will take decision
how is is better to reuse old packet or allocate new and copy parts and etc.

Now I'm not really know if we need linux-generic implementation take 
care about packets segments for reference packets. It's looks like we can
propose API and for linux-generic use odp_packet_copy() because it's not 
performance platform. In that case you will be able to use that api
and modify packet as you want. After that we can optimize linux-generic 
to real support segments if needed.

What do you think about that?

Maxim.

>
>
>
>
>     Maxim.
>
>
>
>             Introduced new call: newpkt = odp_packet_create_ref(pkt)
>         which creates
>             reference to original packet, but handles for reference
>         packet and
>             original
>             are different.
>
>         I think something like Bill's "copy-on-write" proposal (from
>         Connect) is better than explicit reference counting (of the
>         whole packet). Increase the abstraction level and allow for
>         innovation in the underlying implementation.
>
>         Any API support should be based on use case analysis and code
>         examples of how the use cases are implemented using the
>         suggested API's. Otherwise the risk of missing the target is
>         significant.
>
>
>             Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org
>         <mailto:maxim.uvarov@linaro.org>
>             <mailto:maxim.uvarov@linaro.org
>         <mailto:maxim.uvarov@linaro.org>>>
>
>             ---
>              include/odp/api/config.h | 5 +++++
>              include/odp/api/packet.h | 9 +++++++++
>              2 files changed, 14 insertions(+)
>
>             diff --git a/include/odp/api/config.h
>         b/include/odp/api/config.h
>             index 295b10d..985290a 100644
>             --- a/include/odp/api/config.h
>             +++ b/include/odp/api/config.h
>             @@ -126,6 +126,11 @@ extern "C" {
>               */
>              #define ODP_CONFIG_PACKET_BUF_LEN_MAX
>             (ODP_CONFIG_PACKET_SEG_LEN_MIN*6)
>
>             +/**
>             + * Maximum packet references.
>             + */
>             +#define ODP_CONFIG_PACKET_REFS 2
>
>         ???
>         Can a packet only have two references? Why? Why is there a
>         limit at all?
>
>             +
>              /** Maximum number of shared memory blocks.
>               *
>               * This the the number of separate SHM areas that can be
>         reserved
>             concurrently
>             diff --git a/include/odp/api/packet.h
>         b/include/odp/api/packet.h
>             index 5d46b7b..f9745fb 100644
>             --- a/include/odp/api/packet.h
>             +++ b/include/odp/api/packet.h
>             @@ -125,6 +125,15 @@ odp_packet_t
>             odp_packet_from_event(odp_event_t ev);
>               */
>              odp_event_t odp_packet_to_event(odp_packet_t pkt);
>
>             +/**
>             + * Create reference for packet handle
>             + *
>             + * @param pkt  Packet handle
>             + *
>             + * @return New packet handle
>             + */
>             +odp_packet_t odp_packet_create_ref(odp_packet_t pkt);
>             +
>              /*
>               *
>               * Pointers and lengths
>             --
>             1.9.1
>
>             _______________________________________________
>             lng-odp mailing list
>         lng-odp@lists.linaro.org <mailto:lng-odp@lists.linaro.org>
>         <mailto:lng-odp@lists.linaro.org
>         <mailto:lng-odp@lists.linaro.org>>
>         https://lists.linaro.org/mailman/listinfo/lng-odp
>
>
>
>
Ola Liljedahl Oct. 21, 2015, 5:57 p.m. | #6
On 21 October 2015 at 14:46, Maxim Uvarov <maxim.uvarov@linaro.org> wrote:

> On 10/21/2015 15:27, Ola Liljedahl wrote:
>
>> On 20 October 2015 at 17:41, Maxim Uvarov <maxim.uvarov@linaro.org
>> <mailto:maxim.uvarov@linaro.org>> wrote:
>>
>>     On 10/15/2015 23:49, Ola Liljedahl wrote:
>>
>>         On 9 October 2015 at 08:59, Maxim Uvarov
>>         <maxim.uvarov@linaro.org <mailto:maxim.uvarov@linaro.org>
>>         <mailto:maxim.uvarov@linaro.org
>>         <mailto:maxim.uvarov@linaro.org>>> wrote:
>>
>>             Add api for packet reference count support. Which is
>>         useful in case:
>>              - multicast support
>>
>>         Different packets will probably need different headers
>>         prepended to the shared packet. How is this handled?
>>
>>
>>     That has to be handled with reference counts for segments support.
>>     For now you have packet, you can copy body of packet to new packet
>>     and put your own header.
>>
>> So your proposal does not support multicasting (i.e. making it more
>> efficient, e.g. by avoiding copying the whole packet)?
>>
>
> yes, that is true.
>
>
>
>>
>>              - TCP retransmission
>>
>>         When a TCP packet is retransmitted, the IP id should be
>>         changed (similar for QUIC retransmissions where the SN is
>>         updated). So you might have to modify the packet (header)
>>         before you know that the previous transmit has actually
>>         completed. How is this handled?
>>
>>
>>     Same as above. Without segmentation you can only copy to new packet.
>>
>> So this use case is also not supported?
>>
>> In the normal case, the packet has already been sent when the
>> retransmission timer expires and you need to retransmit the packet. How can
>> the application see that it is OK to reuse (modify) the original packet and
>> not have to create a copy?
>> Without having to expose the reference counter, you could have a call
>> which conditionally creates a new packet (as a clone of the specified
>> packet) if the specified packet cannot be modified (because it has other
>> references). When transmission is complete, the system automatically frees
>> the packet (i.e. decreases the refcount). When the retransmission timer
>> expires, the application can call this function that either returns the
>> original packet (because it is no longer being "in transmission") or it
>> clones the packet and return the new handle. Safe and optimised for that
>> normal case of retransmission.
>>
>>
>>
>>              - traffic generator
>>
>>         A traffic generator might want to include an increasing SN or
>>         timestamp in each packet (packet payload) which otherwise are
>>         identical. How is this handled?
>>
>>     Same question. If reference count for hole packet is more then 1
>>     you can not modify that packet.
>>
>>     You can modify packet after send:
>>
>> I am not so sure. Packet transmission is not immediate, it is essentially
>> an asynchronous operation without any notification that it has completed.
>> pktio_send() does not make a copy of the packet so that you can immediately
>> modify the packet whose handle you kept. There is no specific ODP call that
>> indicates that you want to modify the packet and thus want a "clean" copy
>> that is safe to modify.
>>
>>
>>     pkt = odp_packet_create_ref(pkt);
>>     pktio_send(pkt)
>>     modify (pkt)
>>     pkt = odp_packet_create_ref(pkt);
>>     pktio_send(pkt)
>>     modify (pkt)
>>     ........
>>
>>
>> I don't see how this proposal actually solves any interesting problems. I
>> see reference counting as being related to packet segments (an mbuf or
>> skbuf represents a segment so they have a refcount per mbuf/skbuf (actually
>> skbufs are more complicated than this, the data has its own refcount,
>> different skbufs can refer to the same data but with separate start and
>> lengths)) but for ODP packets, the segmentation is not directly visible (we
>> don't link packets together), it happens under the hood with some limited
>> support so that you can access packet data directly.
>>
>> ODP reference counting should support the relevant use cases (e.g.
>> multicasting, retransmission, pktgen can be seen as similar to multicasting
>> I think), making them more performant (and less memory consuming) than
>> creating new copies of every packet (which is what you need to do today). A
>> solution should be able to reuse common segments (which normally
>> constitutes the larger part of the packet data) for different packets that
>> may use different headers (a smaller part of the packet data). The API
>> might not need a specific function for creating a new packet reference.
>> Instead you specify a packet handle and the area (starting offset, size)
>> you want to be able to modify in a safe way (so not to disturb other users
>> of the packet). If there is only one reference to this part of the packet,
>> the implementation can return the original packet handle. If there are more
>> than one reference, a new packet needs to be allocated and the relevant
>> segments cloned (refcount==1) while (trailing) untouched segments can be
>> reused (refcount++).
>>
>
> Ola, I understand and agree with your point.
>
> In my understanding api can be about the same. Like:
>
> newpkt = odp_packet_create_ref(oldpkt);
>
> Then when you modify newpkt or oldpkt implementation should take care
> about segmentation in that packet. I.e. implementation will take decision
> how is is better to reuse old packet or allocate new and copy parts and
> etc.
>
But ODP might not know when you start to modify the packet. E.g. by getting
the data base address and modify the IP header.


>
> Now I'm not really know if we need linux-generic implementation take care
> about packets segments for reference packets. It's looks like we can
> propose API and for linux-generic use odp_packet_copy() because it's not
> performance platform. In that case you will be able to use that api
> and modify packet as you want. After that we can optimize linux-generic to
> real support segments if needed.
>
Linux-generic might not need to do segmentation, at least not due the
underlying memory system requiring it (some special NPU platforms need
that, they don't use vanilla DRAM for packet buffers) but linux-generic
runs on cache-coherent shared memory systems with lots of DRAM. The API we
define must however support segmentation efficiently (e.g. it must be
possible for an implementation to only clone those segments ("particles")
that will be changed and keep multiple references to the shared unmodified
segments. That's why it is a risk to define a new packet API for
linux-generic which doesn't know anything about segmentation.


>
> What do you think about that?
>
You start from the wrong end. You need to start with use cases and see what
functionality you need from ODP to be able to optimise the design (minimise
cycle and memory consumption). From that you can derive any necessary API
extensions.

Write the ODP-based skeleton code for the multicast and retransmission use
cases using the existing primitives. Identify performance problems.
Identify functionality you need to avoid or minimise those performance
problems (e.g. make it possible to create a virtual copy of a packet where
you can safely modify the header without having to explicitly copy the
whole packet). Define the corresponding ODP functions, prototype their
implementations in linux-generic.

For the retransmission case, creating a second reference (handle) to the
original packet might be sufficient, you create a complete copy if you need
to modify and retransmit the packet. The overhead would be acceptable since
retransmissions are supposed to be infrequent. I still think you want more
than just creating a new reference to an existing packet, you want a
function that conditionally creates that copy but this only needs to happen
when there still are other references to the packet. Otherwise you can
reuse the existing packet since you have the only reference and your
updates won't cause problems for other users.

I leave the multicast and packet generation use case to you. They are
different from the retransmission use case.


> Maxim.
>
>
>>
>>
>>
>>     Maxim.
>>
>>
>>
>>             Introduced new call: newpkt = odp_packet_create_ref(pkt)
>>         which creates
>>             reference to original packet, but handles for reference
>>         packet and
>>             original
>>             are different.
>>
>>         I think something like Bill's "copy-on-write" proposal (from
>>         Connect) is better than explicit reference counting (of the
>>         whole packet). Increase the abstraction level and allow for
>>         innovation in the underlying implementation.
>>
>>         Any API support should be based on use case analysis and code
>>         examples of how the use cases are implemented using the
>>         suggested API's. Otherwise the risk of missing the target is
>>         significant.
>>
>>
>>             Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org
>>         <mailto:maxim.uvarov@linaro.org>
>>             <mailto:maxim.uvarov@linaro.org
>>
>>         <mailto:maxim.uvarov@linaro.org>>>
>>
>>             ---
>>              include/odp/api/config.h | 5 +++++
>>              include/odp/api/packet.h | 9 +++++++++
>>              2 files changed, 14 insertions(+)
>>
>>             diff --git a/include/odp/api/config.h
>>         b/include/odp/api/config.h
>>             index 295b10d..985290a 100644
>>             --- a/include/odp/api/config.h
>>             +++ b/include/odp/api/config.h
>>             @@ -126,6 +126,11 @@ extern "C" {
>>               */
>>              #define ODP_CONFIG_PACKET_BUF_LEN_MAX
>>             (ODP_CONFIG_PACKET_SEG_LEN_MIN*6)
>>
>>             +/**
>>             + * Maximum packet references.
>>             + */
>>             +#define ODP_CONFIG_PACKET_REFS 2
>>
>>         ???
>>         Can a packet only have two references? Why? Why is there a
>>         limit at all?
>>
>>             +
>>              /** Maximum number of shared memory blocks.
>>               *
>>               * This the the number of separate SHM areas that can be
>>         reserved
>>             concurrently
>>             diff --git a/include/odp/api/packet.h
>>         b/include/odp/api/packet.h
>>             index 5d46b7b..f9745fb 100644
>>             --- a/include/odp/api/packet.h
>>             +++ b/include/odp/api/packet.h
>>             @@ -125,6 +125,15 @@ odp_packet_t
>>             odp_packet_from_event(odp_event_t ev);
>>               */
>>              odp_event_t odp_packet_to_event(odp_packet_t pkt);
>>
>>             +/**
>>             + * Create reference for packet handle
>>             + *
>>             + * @param pkt  Packet handle
>>             + *
>>             + * @return New packet handle
>>             + */
>>             +odp_packet_t odp_packet_create_ref(odp_packet_t pkt);
>>             +
>>              /*
>>               *
>>               * Pointers and lengths
>>             --
>>             1.9.1
>>
>>             _______________________________________________
>>             lng-odp mailing list
>>         lng-odp@lists.linaro.org <mailto:lng-odp@lists.linaro.org>
>>         <mailto:lng-odp@lists.linaro.org
>>         <mailto:lng-odp@lists.linaro.org>>
>>         https://lists.linaro.org/mailman/listinfo/lng-odp
>>
>>
>>
>>
>>
>

Patch

diff --git a/include/odp/api/config.h b/include/odp/api/config.h
index 295b10d..985290a 100644
--- a/include/odp/api/config.h
+++ b/include/odp/api/config.h
@@ -126,6 +126,11 @@  extern "C" {
  */
 #define ODP_CONFIG_PACKET_BUF_LEN_MAX (ODP_CONFIG_PACKET_SEG_LEN_MIN*6)
 
+/**
+ * Maximum packet references.
+ */
+#define ODP_CONFIG_PACKET_REFS 2
+
 /** Maximum number of shared memory blocks.
  *
  * This the the number of separate SHM areas that can be reserved concurrently
diff --git a/include/odp/api/packet.h b/include/odp/api/packet.h
index 5d46b7b..f9745fb 100644
--- a/include/odp/api/packet.h
+++ b/include/odp/api/packet.h
@@ -125,6 +125,15 @@  odp_packet_t odp_packet_from_event(odp_event_t ev);
  */
 odp_event_t odp_packet_to_event(odp_packet_t pkt);
 
+/**
+ * Create reference for packet handle
+ *
+ * @param pkt  Packet handle
+ *
+ * @return New packet handle
+ */
+odp_packet_t odp_packet_create_ref(odp_packet_t pkt);
+
 /*
  *
  * Pointers and lengths