diff mbox

[API-NEXT,PATCHv2,1/5] api: packet: add support for packet references

Message ID 1479221083-30553-2-git-send-email-bill.fischofer@linaro.org
State New
Headers show

Commit Message

Bill Fischofer Nov. 15, 2016, 2:44 p.m. UTC
Introduce three new APIs that support efficient sharing of portions of
packets.

odp_packet_ref_static() creates an alias for a base packet

odp_packet_ref() creates a reference to a base packet

odp_packet_ref_pkt() creates a reference to a base packet from a supplied
header packet

In addition, three other APIs simplify working with references

odp_packet_is_ref() says whether a packet is a reference

odp_packet_has_ref() says whether a packet has had a reference made to it

odp_packet_unshared_len() gives the length of the prefix bytes that are
unique to this reference. These are the only bytes of the packet that may
be modified safely.

Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

---
 include/odp/api/spec/packet.h | 168 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 168 insertions(+)

-- 
2.7.4

Comments

Balasubramanian Manoharan Dec. 19, 2016, 10:06 a.m. UTC | #1
Comments inline....


On 15 November 2016 at 20:14, Bill Fischofer <bill.fischofer@linaro.org> wrote:
> Introduce three new APIs that support efficient sharing of portions of

> packets.

>

> odp_packet_ref_static() creates an alias for a base packet

>

> odp_packet_ref() creates a reference to a base packet

>

> odp_packet_ref_pkt() creates a reference to a base packet from a supplied

> header packet

>

> In addition, three other APIs simplify working with references

>

> odp_packet_is_ref() says whether a packet is a reference

>

> odp_packet_has_ref() says whether a packet has had a reference made to it

>

> odp_packet_unshared_len() gives the length of the prefix bytes that are

> unique to this reference. These are the only bytes of the packet that may

> be modified safely.

>

> Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

> ---

>  include/odp/api/spec/packet.h | 168 ++++++++++++++++++++++++++++++++++++++++++

>  1 file changed, 168 insertions(+)

>

> diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h

> index faf62e2..137024f 100644

> --- a/include/odp/api/spec/packet.h

> +++ b/include/odp/api/spec/packet.h

> @@ -256,6 +256,23 @@ uint32_t odp_packet_seg_len(odp_packet_t pkt);

>  uint32_t odp_packet_len(odp_packet_t pkt);

>

>  /**

> + * Packet unshared data len

> + *

> + * Returns the sum of data lengths over all unshared packet segments. These

> + * are the only bytes that should be modified by the caller. The rest of the

> + * packet should be treated as read-only.

> + *

> + * This routine will return 0 if odp_packet_has_ref() != 0 and will be the

> + * same as odp_packet_len() if odp_packet_has_ref() == 0 and

> + * odp_packet_is_ref() == 0.

> + *

> + * @param pkt Packet handle

> + *

> + * @return Packet unshared data length

> + */

> +uint32_t odp_packet_unshared_len(odp_packet_t pkt);

> +

> +/**

>   * Packet headroom length

>   *

>   * Returns the current packet level headroom length.

> @@ -847,6 +864,157 @@ int odp_packet_split(odp_packet_t *pkt, uint32_t len, odp_packet_t *tail);

>

>  /*

>   *

> + * References

> + * ********************************************************

> + *

> + */

> +

> +/**

> + * Create a static reference to a packet

> + *

> + * A static reference is used to obtain an additional handle for referring to

> + * a packet so that the storage behind it is not freed until all references to

> + * the packet are freed. This can be used, for example, to support efficient

> + * retransmission processing.

> + *

> + * The intent of a static reference is that both the base packet and the

> + * returned reference will be treated as read-only after this call. Results

> + * are unpredictable if this restriction is not observed.

> + *

> + * Static references have restrictions but may have performance advantages on

> + * some platforms if the caller does not intend to modify the reference

> + * packet. If modification is needed (e.g., to prefix a unique header onto the

> + * packet) then odp_packet_ref() or odp_packet_ref_pkt() should be used.

> + *

> + * @param pkt    Handle of the base packet for which a static reference is

> + *               to be created.

> + *

> + * @return                    Handle of the static reference packet

> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

> + */

> +odp_packet_t odp_packet_ref_static(odp_packet_t pkt);


It is better to change the API signature to return the updated handle
of the base packet also to the application.
Similar to the API for odp_packet_extend_head().

> +

> +/**

> + * Create a reference to a packet

> + *

> + * Create a dynamic reference to a base packet starting at a specified byte

> + * offset. This reference may be used as an argument to header manipulation

> + * APIs to prefix a unique header onto the shared base. The storage associated

> + * with the base packet is not freed until all references to it are freed,

> + * which permits easy multicasting or retransmission processing to be

> + * performed. Following a successful call, the base packet should be treated

> + * as read-only. Results are unpredictable if this restriction is not

> + * observed.

> + *

> + * This operation prepends a zero-length header onto the base packet beginning

> + * at the specified offset. This header is always drawn from the same pool as

> + * the base packet.

> + *

> + * Because references are unique packets, any bytes pushed onto the head of a

> + * reference via odp_packet_push_head() or odp_packet_extend_head() are unique

> + * to this reference and are not seen by the base packet or by any other

> + * reference to the same base packet.

> + *

> + * The base packet used as input to this routine may itself by a reference to

> + * some other base packet. Implementations MAY restrict the ability to create

> + * such compound references. Attempts to exceed any implementation limits in

> + * this regard will result in this call failing and returning

> + * ODP_PACKET_INVALID.

> + *

> + * If the caller does not indend to push a header onto the returned reference,

> + * the odp_packet_ref_static() API may be used. This may be a more efficient

> + * means of obtaining another reference to a base packet that will be treated

> + * as read-only.

> + *

> + * @param pkt    Handle of the base packet for which a reference is to be

> + *               created.

> + *

> + * @param offset Byte offset in the base packet at which the shared reference

> + *               is to begin. Must be in the range 0..odp_packet_len(pkt)-1.

> + *

> + * @return                    Handle of the reference packet

> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

> + *

> +

> + */

> +odp_packet_t odp_packet_ref(odp_packet_t pkt, uint32_t offset);

> +

> +/**

> + * Create a reference to another packet from a header packet

> + *

> + * Create a dynamic reference to a base packet starting at a specified byte

> + * offset by prepending a supplied header packet. The resulting reference

> + * consists of the unshared header followed by the shared base packet starting

> + * at the specified offset. This shared portion should be regarded as

> + * read-only. The storage associated with the shared portion of the reference

> + * is not freed until all references to it are freed, which permits easy

> + * multicasting or retransmission processing to be performed.


My concerns with this API is what happens when application creates
multiple references from multiple offsets of the base packet. In that
scenario the read-only status of the shared portion could not be
maintained since if different references gives different offset there
will be more than one shared portion.

> + *

> + * Either packet input to this routine may itself be a reference, however

> + * individual implementations MAY impose limits on how deeply splices may be

> + * nested and fail the call if those limits are exceeded.

> + *

> + * The packets used as input to this routine may reside in different pools,

> + * however individual implementations MAY require that both reside in the same

> + * pool and fail the call if this restriction is not observed.


Not sure if there is a requirement to support reference with packets
from multiple pools.

> + *

> + * odp_packet_pool() on the returned reference returns the pool id of the

> + * header packet.

> + *

> + * @param pkt    Handle of the base packet for which a reference is to be

> + *               created.

> + *

> + * @param offset Byte offset in the base packet at which the shared reference

> + *               to begin. Must be in the range 0..odp_packet_len(pkt)-1.

> + *

> + * @param hdr    Handle of the header packet to be prefixed onto the base

> + *               packet to create the reference. If this is specified as

> + *               ODP_PACKET_INVALID, this call is equivalent to

> + *               odp_packet_ref(). The supplied hdr must be distinct from

> + *               the base pkt and results are undefined if an attempt is made

> + *               to create circular references.

> + *

> + * @return       Handle of the reference packet. This may or may not be the

> + *               same as the input hdr packet. The caller should only use this

> + *               handle since the original hdr packet no longer has any

> + *               independent existence.

> + *

> + * @retval ODP_PACKET_INVALID If the reference failed. In this case both pkt

> + *                            and hdr are unchanged.

> + */

> +odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,

> +                               odp_packet_t hdr);

> +

> +/**

> + * Test if a packet is a reference

> + *

> + * A packet is a reference if it was created by odp_packet_ref() or

> + * odp_packet_ref_pkt().  Note that a reference created from another

> + * reference is considered a compound reference. Calls on such packets will

> + * result in return values greater than 1.

> + *

> + * @param pkt Packet handle

> + *

> + * @retval 0  Packet is not a reference

> + * @retval >0 Packet is a reference consisting of N individual packets.

> + */

> +int odp_packet_is_ref(odp_packet_t pkt);


It is better to keep the return value as 0 or 1. Is it expected to
return the number of references?
The total number of references created is not interesting and also it
is not so easy since references are created at segment level as per
this proposal.
Application will have to call odp_packet_free() for all the packet
handle it is holding.

> +

> +/**

> + * Test if a packet has a reference

> + *

> + * A packet has a reference if a reference was created on it by

> + * odp_packet_ref_static(), odp_packet_ref(), or odp_packet_ref_pkt().

> + *

> + * @param pkt Packet handle

> + *

> + * @retval 0  Packet does not have any references

> + * @retval >0 Packet has N references based on it

> + */

> +int odp_packet_has_ref(odp_packet_t pkt);


What is the difference between odp_packet_has_ref() and
odp_packet_is_ref()? IMO Once a reference is created there is no
difference between the base packet and reference.

Regards,
Bala
> +

> +/*

> + *

>   * Copy

>   * ********************************************************

>   *

> --

> 2.7.4

>
Bill Fischofer Dec. 19, 2016, 2:53 p.m. UTC | #2
On Mon, Dec 19, 2016 at 4:06 AM, Bala Manoharan
<bala.manoharan@linaro.org> wrote:
> Comments inline....

>

>

> On 15 November 2016 at 20:14, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>> Introduce three new APIs that support efficient sharing of portions of

>> packets.

>>

>> odp_packet_ref_static() creates an alias for a base packet

>>

>> odp_packet_ref() creates a reference to a base packet

>>

>> odp_packet_ref_pkt() creates a reference to a base packet from a supplied

>> header packet

>>

>> In addition, three other APIs simplify working with references

>>

>> odp_packet_is_ref() says whether a packet is a reference

>>

>> odp_packet_has_ref() says whether a packet has had a reference made to it

>>

>> odp_packet_unshared_len() gives the length of the prefix bytes that are

>> unique to this reference. These are the only bytes of the packet that may

>> be modified safely.

>>

>> Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

>> ---

>>  include/odp/api/spec/packet.h | 168 ++++++++++++++++++++++++++++++++++++++++++

>>  1 file changed, 168 insertions(+)

>>

>> diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h

>> index faf62e2..137024f 100644

>> --- a/include/odp/api/spec/packet.h

>> +++ b/include/odp/api/spec/packet.h

>> @@ -256,6 +256,23 @@ uint32_t odp_packet_seg_len(odp_packet_t pkt);

>>  uint32_t odp_packet_len(odp_packet_t pkt);

>>

>>  /**

>> + * Packet unshared data len

>> + *

>> + * Returns the sum of data lengths over all unshared packet segments. These

>> + * are the only bytes that should be modified by the caller. The rest of the

>> + * packet should be treated as read-only.

>> + *

>> + * This routine will return 0 if odp_packet_has_ref() != 0 and will be the

>> + * same as odp_packet_len() if odp_packet_has_ref() == 0 and

>> + * odp_packet_is_ref() == 0.

>> + *

>> + * @param pkt Packet handle

>> + *

>> + * @return Packet unshared data length

>> + */

>> +uint32_t odp_packet_unshared_len(odp_packet_t pkt);

>> +

>> +/**

>>   * Packet headroom length

>>   *

>>   * Returns the current packet level headroom length.

>> @@ -847,6 +864,157 @@ int odp_packet_split(odp_packet_t *pkt, uint32_t len, odp_packet_t *tail);

>>

>>  /*

>>   *

>> + * References

>> + * ********************************************************

>> + *

>> + */

>> +

>> +/**

>> + * Create a static reference to a packet

>> + *

>> + * A static reference is used to obtain an additional handle for referring to

>> + * a packet so that the storage behind it is not freed until all references to

>> + * the packet are freed. This can be used, for example, to support efficient

>> + * retransmission processing.

>> + *

>> + * The intent of a static reference is that both the base packet and the

>> + * returned reference will be treated as read-only after this call. Results

>> + * are unpredictable if this restriction is not observed.

>> + *

>> + * Static references have restrictions but may have performance advantages on

>> + * some platforms if the caller does not intend to modify the reference

>> + * packet. If modification is needed (e.g., to prefix a unique header onto the

>> + * packet) then odp_packet_ref() or odp_packet_ref_pkt() should be used.

>> + *

>> + * @param pkt    Handle of the base packet for which a static reference is

>> + *               to be created.

>> + *

>> + * @return                    Handle of the static reference packet

>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>> + */

>> +odp_packet_t odp_packet_ref_static(odp_packet_t pkt);

>

> It is better to change the API signature to return the updated handle

> of the base packet also to the application.

> Similar to the API for odp_packet_extend_head().


I think returning the packet ref directly rather than indirectly makes
for easier coding on the part of applications. Failure is indicated by
returning ODP_PACKET_INVALID. So in this sense we're like
odp_packet_alloc(). The alternative:

int odp_packet_ref_static(odp_packet_t pkt, odp_packet_t *refpkt);

seems cumbersome since if RC != 0 then the refpkt return would be meaningless.

>

>> +

>> +/**

>> + * Create a reference to a packet

>> + *

>> + * Create a dynamic reference to a base packet starting at a specified byte

>> + * offset. This reference may be used as an argument to header manipulation

>> + * APIs to prefix a unique header onto the shared base. The storage associated

>> + * with the base packet is not freed until all references to it are freed,

>> + * which permits easy multicasting or retransmission processing to be

>> + * performed. Following a successful call, the base packet should be treated

>> + * as read-only. Results are unpredictable if this restriction is not

>> + * observed.

>> + *

>> + * This operation prepends a zero-length header onto the base packet beginning

>> + * at the specified offset. This header is always drawn from the same pool as

>> + * the base packet.

>> + *

>> + * Because references are unique packets, any bytes pushed onto the head of a

>> + * reference via odp_packet_push_head() or odp_packet_extend_head() are unique

>> + * to this reference and are not seen by the base packet or by any other

>> + * reference to the same base packet.

>> + *

>> + * The base packet used as input to this routine may itself by a reference to

>> + * some other base packet. Implementations MAY restrict the ability to create

>> + * such compound references. Attempts to exceed any implementation limits in

>> + * this regard will result in this call failing and returning

>> + * ODP_PACKET_INVALID.

>> + *

>> + * If the caller does not indend to push a header onto the returned reference,

>> + * the odp_packet_ref_static() API may be used. This may be a more efficient

>> + * means of obtaining another reference to a base packet that will be treated

>> + * as read-only.

>> + *

>> + * @param pkt    Handle of the base packet for which a reference is to be

>> + *               created.

>> + *

>> + * @param offset Byte offset in the base packet at which the shared reference

>> + *               is to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>> + *

>> + * @return                    Handle of the reference packet

>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>> + *

>> +

>> + */

>> +odp_packet_t odp_packet_ref(odp_packet_t pkt, uint32_t offset);

>> +

>> +/**

>> + * Create a reference to another packet from a header packet

>> + *

>> + * Create a dynamic reference to a base packet starting at a specified byte

>> + * offset by prepending a supplied header packet. The resulting reference

>> + * consists of the unshared header followed by the shared base packet starting

>> + * at the specified offset. This shared portion should be regarded as

>> + * read-only. The storage associated with the shared portion of the reference

>> + * is not freed until all references to it are freed, which permits easy

>> + * multicasting or retransmission processing to be performed.

>

> My concerns with this API is what happens when application creates

> multiple references from multiple offsets of the base packet. In that

> scenario the read-only status of the shared portion could not be

> maintained since if different references gives different offset there

> will be more than one shared portion.

>


Why would this be a problem? We're relying on an "honor system" here
since there is no defined enforcement mechanism. The rule is that you
should only modify the unshared portion of any packet and results are
undefined if this rule is ignored. That's why we have the
odp_packet_unshared_len() as well as the odp_packet_has_ref() APIs to
help applications know whether they should or should not attempt to
modify a packet given it's handle.

>> + *

>> + * Either packet input to this routine may itself be a reference, however

>> + * individual implementations MAY impose limits on how deeply splices may be

>> + * nested and fail the call if those limits are exceeded.

>> + *

>> + * The packets used as input to this routine may reside in different pools,

>> + * however individual implementations MAY require that both reside in the same

>> + * pool and fail the call if this restriction is not observed.

>

> Not sure if there is a requirement to support reference with packets

> from multiple pools.


That's why this is a MAY. We could expose this as a capability or
simply state that this is not supported, however some implementations
may be able to support this (e.g., odp-linux has no need to make this
restriction) and I could see how it could be useful to have header
templates in their own pool for storage management purposes.

>

>> + *

>> + * odp_packet_pool() on the returned reference returns the pool id of the

>> + * header packet.

>> + *

>> + * @param pkt    Handle of the base packet for which a reference is to be

>> + *               created.

>> + *

>> + * @param offset Byte offset in the base packet at which the shared reference

>> + *               to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>> + *

>> + * @param hdr    Handle of the header packet to be prefixed onto the base

>> + *               packet to create the reference. If this is specified as

>> + *               ODP_PACKET_INVALID, this call is equivalent to

>> + *               odp_packet_ref(). The supplied hdr must be distinct from

>> + *               the base pkt and results are undefined if an attempt is made

>> + *               to create circular references.

>> + *

>> + * @return       Handle of the reference packet. This may or may not be the

>> + *               same as the input hdr packet. The caller should only use this

>> + *               handle since the original hdr packet no longer has any

>> + *               independent existence.

>> + *

>> + * @retval ODP_PACKET_INVALID If the reference failed. In this case both pkt

>> + *                            and hdr are unchanged.

>> + */

>> +odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,

>> +                               odp_packet_t hdr);

>> +

>> +/**

>> + * Test if a packet is a reference

>> + *

>> + * A packet is a reference if it was created by odp_packet_ref() or

>> + * odp_packet_ref_pkt().  Note that a reference created from another

>> + * reference is considered a compound reference. Calls on such packets will

>> + * result in return values greater than 1.

>> + *

>> + * @param pkt Packet handle

>> + *

>> + * @retval 0  Packet is not a reference

>> + * @retval >0 Packet is a reference consisting of N individual packets.

>> + */

>> +int odp_packet_is_ref(odp_packet_t pkt);

>

> It is better to keep the return value as 0 or 1. Is it expected to

> return the number of references?

> The total number of references created is not interesting and also it

> is not so easy since references are created at segment level as per

> this proposal.

> Application will have to call odp_packet_free() for all the packet

> handle it is holding.


Since any implementation supporting references needs to have some
internal notion of a reference count this is just attempting to expose
that in an implementation-independent manner. This is also why there
is both the odp_packet_is_ref() and odp_packet_has_ref() APIs defined.

>

>> +

>> +/**

>> + * Test if a packet has a reference

>> + *

>> + * A packet has a reference if a reference was created on it by

>> + * odp_packet_ref_static(), odp_packet_ref(), or odp_packet_ref_pkt().

>> + *

>> + * @param pkt Packet handle

>> + *

>> + * @retval 0  Packet does not have any references

>> + * @retval >0 Packet has N references based on it

>> + */

>> +int odp_packet_has_ref(odp_packet_t pkt);

>

> What is the difference between odp_packet_has_ref() and

> odp_packet_is_ref()? IMO Once a reference is created there is no

> difference between the base packet and reference.


Not true. This is discussed (with diagrams) in the User Guide updates
associated with this patch series.

Consider the call:

pkt_a = odp_packet_ref(pkt_b, 0);

After this call the following conditions hold:

odp_packet_is_ref(pkt_a) == 1
odp_packet_is_ref(pkt_b) == 0
odp_packet_has_ref(pkt_a) == 0
odp_packet_has_ref(pkt_b) == 1

Now consider the more complex case:

pkt_a = odp_packet_ref(pkt_b, offset1);
pkt_c = odp_packet_ref(pkt_b, offset2);
pkt_d = odp_packet_ref(pkt_a, offset3);

Now we have:

odp_packet_is_ref(pkt_a) == 1
odp_packet_is_ref(pkt_b) == 0
odp_packet_is_ref(pkt_c) == 1
odp_packet_is_ref(pkt_d) == 2
odp_packet_has_ref(pkt_a) == 1
odp_packet_has_ref(pkt_b) == 3
odp_packet_has_ref(pkt_c) == 0
odp_packet_has_ref(pkt_d) == 0

Essentially, odp_packet_is_ref() answers the question "How many other
packets are referenced via this handle"? While odp_packet_has_ref()
answers the question "How many other handles can be used to reference
this packet"?

>

> Regards,

> Bala

>> +

>> +/*

>> + *

>>   * Copy

>>   * ********************************************************

>>   *

>> --

>> 2.7.4

>>
Maxim Uvarov Dec. 19, 2016, 3:57 p.m. UTC | #3
how about group all packet reference apis to odp_packet_ref_ naming?

In that case

odp_packet_ref() -> odp_packet_ref_create() (same as odp_pool_create,
odp_queue, create).

both
odp_packet_is_ref() and odp_packet_has_ref()  -> odp_packet_ref()
(function which returns ref counter)

It will be more easy if application could not see difference between
reference and original packet. At least if we do not have special use
case for that.

Maxim.


On 11/15/16 17:44, Bill Fischofer wrote:
> Introduce three new APIs that support efficient sharing of portions of

> packets.

> 

> odp_packet_ref_static() creates an alias for a base packet

> 

> odp_packet_ref() creates a reference to a base packet

> 

> odp_packet_ref_pkt() creates a reference to a base packet from a supplied

> header packet

> 

> In addition, three other APIs simplify working with references

> 

> odp_packet_is_ref() says whether a packet is a reference

> 

> odp_packet_has_ref() says whether a packet has had a reference made to it

> 

> odp_packet_unshared_len() gives the length of the prefix bytes that are

> unique to this reference. These are the only bytes of the packet that may

> be modified safely.

> 

> Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

> ---

>  include/odp/api/spec/packet.h | 168 ++++++++++++++++++++++++++++++++++++++++++

>  1 file changed, 168 insertions(+)

> 

> diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h

> index faf62e2..137024f 100644

> --- a/include/odp/api/spec/packet.h

> +++ b/include/odp/api/spec/packet.h

> @@ -256,6 +256,23 @@ uint32_t odp_packet_seg_len(odp_packet_t pkt);

>  uint32_t odp_packet_len(odp_packet_t pkt);

>  

>  /**

> + * Packet unshared data len

> + *

> + * Returns the sum of data lengths over all unshared packet segments. These

> + * are the only bytes that should be modified by the caller. The rest of the

> + * packet should be treated as read-only.

> + *

> + * This routine will return 0 if odp_packet_has_ref() != 0 and will be the

> + * same as odp_packet_len() if odp_packet_has_ref() == 0 and

> + * odp_packet_is_ref() == 0.

> + *

> + * @param pkt Packet handle

> + *

> + * @return Packet unshared data length

> + */

> +uint32_t odp_packet_unshared_len(odp_packet_t pkt);

> +

> +/**

>   * Packet headroom length

>   *

>   * Returns the current packet level headroom length.

> @@ -847,6 +864,157 @@ int odp_packet_split(odp_packet_t *pkt, uint32_t len, odp_packet_t *tail);

>  

>  /*

>   *

> + * References

> + * ********************************************************

> + *

> + */

> +

> +/**

> + * Create a static reference to a packet

> + *

> + * A static reference is used to obtain an additional handle for referring to

> + * a packet so that the storage behind it is not freed until all references to

> + * the packet are freed. This can be used, for example, to support efficient

> + * retransmission processing.

> + *

> + * The intent of a static reference is that both the base packet and the

> + * returned reference will be treated as read-only after this call. Results

> + * are unpredictable if this restriction is not observed.

> + *

> + * Static references have restrictions but may have performance advantages on

> + * some platforms if the caller does not intend to modify the reference

> + * packet. If modification is needed (e.g., to prefix a unique header onto the

> + * packet) then odp_packet_ref() or odp_packet_ref_pkt() should be used.

> + *

> + * @param pkt    Handle of the base packet for which a static reference is

> + *               to be created.

> + *

> + * @return                    Handle of the static reference packet

> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

> + */

> +odp_packet_t odp_packet_ref_static(odp_packet_t pkt);

> +

> +/**

> + * Create a reference to a packet

> + *

> + * Create a dynamic reference to a base packet starting at a specified byte

> + * offset. This reference may be used as an argument to header manipulation

> + * APIs to prefix a unique header onto the shared base. The storage associated

> + * with the base packet is not freed until all references to it are freed,

> + * which permits easy multicasting or retransmission processing to be

> + * performed. Following a successful call, the base packet should be treated

> + * as read-only. Results are unpredictable if this restriction is not

> + * observed.

> + *

> + * This operation prepends a zero-length header onto the base packet beginning

> + * at the specified offset. This header is always drawn from the same pool as

> + * the base packet.

> + *

> + * Because references are unique packets, any bytes pushed onto the head of a

> + * reference via odp_packet_push_head() or odp_packet_extend_head() are unique

> + * to this reference and are not seen by the base packet or by any other

> + * reference to the same base packet.

> + *

> + * The base packet used as input to this routine may itself by a reference to

> + * some other base packet. Implementations MAY restrict the ability to create

> + * such compound references. Attempts to exceed any implementation limits in

> + * this regard will result in this call failing and returning

> + * ODP_PACKET_INVALID.

> + *

> + * If the caller does not indend to push a header onto the returned reference,

> + * the odp_packet_ref_static() API may be used. This may be a more efficient

> + * means of obtaining another reference to a base packet that will be treated

> + * as read-only.

> + *

> + * @param pkt    Handle of the base packet for which a reference is to be

> + *               created.

> + *

> + * @param offset Byte offset in the base packet at which the shared reference

> + *               is to begin. Must be in the range 0..odp_packet_len(pkt)-1.

> + *

> + * @return                    Handle of the reference packet

> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

> + *

> +

> + */

> +odp_packet_t odp_packet_ref(odp_packet_t pkt, uint32_t offset);

> +

> +/**

> + * Create a reference to another packet from a header packet

> + *

> + * Create a dynamic reference to a base packet starting at a specified byte

> + * offset by prepending a supplied header packet. The resulting reference

> + * consists of the unshared header followed by the shared base packet starting

> + * at the specified offset. This shared portion should be regarded as

> + * read-only. The storage associated with the shared portion of the reference

> + * is not freed until all references to it are freed, which permits easy

> + * multicasting or retransmission processing to be performed.

> + *

> + * Either packet input to this routine may itself be a reference, however

> + * individual implementations MAY impose limits on how deeply splices may be

> + * nested and fail the call if those limits are exceeded.

> + *

> + * The packets used as input to this routine may reside in different pools,

> + * however individual implementations MAY require that both reside in the same

> + * pool and fail the call if this restriction is not observed.

> + *

> + * odp_packet_pool() on the returned reference returns the pool id of the

> + * header packet.

> + *

> + * @param pkt    Handle of the base packet for which a reference is to be

> + *               created.

> + *

> + * @param offset Byte offset in the base packet at which the shared reference

> + *               to begin. Must be in the range 0..odp_packet_len(pkt)-1.

> + *

> + * @param hdr    Handle of the header packet to be prefixed onto the base

> + *               packet to create the reference. If this is specified as

> + *               ODP_PACKET_INVALID, this call is equivalent to

> + *               odp_packet_ref(). The supplied hdr must be distinct from

> + *               the base pkt and results are undefined if an attempt is made

> + *               to create circular references.

> + *

> + * @return       Handle of the reference packet. This may or may not be the

> + *               same as the input hdr packet. The caller should only use this

> + *               handle since the original hdr packet no longer has any

> + *               independent existence.

> + *

> + * @retval ODP_PACKET_INVALID If the reference failed. In this case both pkt

> + *                            and hdr are unchanged.

> + */

> +odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,

> +				odp_packet_t hdr);

> +

> +/**

> + * Test if a packet is a reference

> + *

> + * A packet is a reference if it was created by odp_packet_ref() or

> + * odp_packet_ref_pkt().  Note that a reference created from another

> + * reference is considered a compound reference. Calls on such packets will

> + * result in return values greater than 1.

> + *

> + * @param pkt Packet handle

> + *

> + * @retval 0  Packet is not a reference

> + * @retval >0 Packet is a reference consisting of N individual packets.

> + */

> +int odp_packet_is_ref(odp_packet_t pkt);

> +

> +/**

> + * Test if a packet has a reference

> + *

> + * A packet has a reference if a reference was created on it by

> + * odp_packet_ref_static(), odp_packet_ref(), or odp_packet_ref_pkt().

> + *

> + * @param pkt Packet handle

> + *

> + * @retval 0  Packet does not have any references

> + * @retval >0 Packet has N references based on it

> + */

> +int odp_packet_has_ref(odp_packet_t pkt);

> +

> +/*

> + *

>   * Copy

>   * ********************************************************

>   *

>
Bill Fischofer Dec. 19, 2016, 4:41 p.m. UTC | #4
On Mon, Dec 19, 2016 at 9:57 AM, Maxim Uvarov <maxim.uvarov@linaro.org> wrote:
> how about group all packet reference apis to odp_packet_ref_ naming?

>

> In that case

>

> odp_packet_ref() -> odp_packet_ref_create() (same as odp_pool_create,

> odp_queue, create).


I think the _create() suffix is unnecessary but have no objection to
that if that's a consensus view. Note that we also have the other
variants to deal with and finding a way to add _create() to them
without it looking awkward might be a challenge.

>

> both

> odp_packet_is_ref() and odp_packet_has_ref()  -> odp_packet_ref()

> (function which returns ref counter)

>

> It will be more easy if application could not see difference between

> reference and original packet. At least if we do not have special use

> case for that.


The reason applications see a difference (and why there are two calls
rather than one) is that there is a difference, as previously
discussed. I can modify a reference (indeed that's one of the main
uses for it) by pushing a unique header onto it. I cannot modify a
referenced packet as the entire packet must be treated as read-only,
per our convention. If you want to modify both independently the way
you do that is create two references to the same base packet, and then
both of these can have their own unique prefixes while the base packet
remains read-only.

>

> Maxim.

>

>

> On 11/15/16 17:44, Bill Fischofer wrote:

>> Introduce three new APIs that support efficient sharing of portions of

>> packets.

>>

>> odp_packet_ref_static() creates an alias for a base packet

>>

>> odp_packet_ref() creates a reference to a base packet

>>

>> odp_packet_ref_pkt() creates a reference to a base packet from a supplied

>> header packet

>>

>> In addition, three other APIs simplify working with references

>>

>> odp_packet_is_ref() says whether a packet is a reference

>>

>> odp_packet_has_ref() says whether a packet has had a reference made to it

>>

>> odp_packet_unshared_len() gives the length of the prefix bytes that are

>> unique to this reference. These are the only bytes of the packet that may

>> be modified safely.

>>

>> Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

>> ---

>>  include/odp/api/spec/packet.h | 168 ++++++++++++++++++++++++++++++++++++++++++

>>  1 file changed, 168 insertions(+)

>>

>> diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h

>> index faf62e2..137024f 100644

>> --- a/include/odp/api/spec/packet.h

>> +++ b/include/odp/api/spec/packet.h

>> @@ -256,6 +256,23 @@ uint32_t odp_packet_seg_len(odp_packet_t pkt);

>>  uint32_t odp_packet_len(odp_packet_t pkt);

>>

>>  /**

>> + * Packet unshared data len

>> + *

>> + * Returns the sum of data lengths over all unshared packet segments. These

>> + * are the only bytes that should be modified by the caller. The rest of the

>> + * packet should be treated as read-only.

>> + *

>> + * This routine will return 0 if odp_packet_has_ref() != 0 and will be the

>> + * same as odp_packet_len() if odp_packet_has_ref() == 0 and

>> + * odp_packet_is_ref() == 0.

>> + *

>> + * @param pkt Packet handle

>> + *

>> + * @return Packet unshared data length

>> + */

>> +uint32_t odp_packet_unshared_len(odp_packet_t pkt);

>> +

>> +/**

>>   * Packet headroom length

>>   *

>>   * Returns the current packet level headroom length.

>> @@ -847,6 +864,157 @@ int odp_packet_split(odp_packet_t *pkt, uint32_t len, odp_packet_t *tail);

>>

>>  /*

>>   *

>> + * References

>> + * ********************************************************

>> + *

>> + */

>> +

>> +/**

>> + * Create a static reference to a packet

>> + *

>> + * A static reference is used to obtain an additional handle for referring to

>> + * a packet so that the storage behind it is not freed until all references to

>> + * the packet are freed. This can be used, for example, to support efficient

>> + * retransmission processing.

>> + *

>> + * The intent of a static reference is that both the base packet and the

>> + * returned reference will be treated as read-only after this call. Results

>> + * are unpredictable if this restriction is not observed.

>> + *

>> + * Static references have restrictions but may have performance advantages on

>> + * some platforms if the caller does not intend to modify the reference

>> + * packet. If modification is needed (e.g., to prefix a unique header onto the

>> + * packet) then odp_packet_ref() or odp_packet_ref_pkt() should be used.

>> + *

>> + * @param pkt    Handle of the base packet for which a static reference is

>> + *               to be created.

>> + *

>> + * @return                    Handle of the static reference packet

>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>> + */

>> +odp_packet_t odp_packet_ref_static(odp_packet_t pkt);

>> +

>> +/**

>> + * Create a reference to a packet

>> + *

>> + * Create a dynamic reference to a base packet starting at a specified byte

>> + * offset. This reference may be used as an argument to header manipulation

>> + * APIs to prefix a unique header onto the shared base. The storage associated

>> + * with the base packet is not freed until all references to it are freed,

>> + * which permits easy multicasting or retransmission processing to be

>> + * performed. Following a successful call, the base packet should be treated

>> + * as read-only. Results are unpredictable if this restriction is not

>> + * observed.

>> + *

>> + * This operation prepends a zero-length header onto the base packet beginning

>> + * at the specified offset. This header is always drawn from the same pool as

>> + * the base packet.

>> + *

>> + * Because references are unique packets, any bytes pushed onto the head of a

>> + * reference via odp_packet_push_head() or odp_packet_extend_head() are unique

>> + * to this reference and are not seen by the base packet or by any other

>> + * reference to the same base packet.

>> + *

>> + * The base packet used as input to this routine may itself by a reference to

>> + * some other base packet. Implementations MAY restrict the ability to create

>> + * such compound references. Attempts to exceed any implementation limits in

>> + * this regard will result in this call failing and returning

>> + * ODP_PACKET_INVALID.

>> + *

>> + * If the caller does not indend to push a header onto the returned reference,

>> + * the odp_packet_ref_static() API may be used. This may be a more efficient

>> + * means of obtaining another reference to a base packet that will be treated

>> + * as read-only.

>> + *

>> + * @param pkt    Handle of the base packet for which a reference is to be

>> + *               created.

>> + *

>> + * @param offset Byte offset in the base packet at which the shared reference

>> + *               is to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>> + *

>> + * @return                    Handle of the reference packet

>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>> + *

>> +

>> + */

>> +odp_packet_t odp_packet_ref(odp_packet_t pkt, uint32_t offset);

>> +

>> +/**

>> + * Create a reference to another packet from a header packet

>> + *

>> + * Create a dynamic reference to a base packet starting at a specified byte

>> + * offset by prepending a supplied header packet. The resulting reference

>> + * consists of the unshared header followed by the shared base packet starting

>> + * at the specified offset. This shared portion should be regarded as

>> + * read-only. The storage associated with the shared portion of the reference

>> + * is not freed until all references to it are freed, which permits easy

>> + * multicasting or retransmission processing to be performed.

>> + *

>> + * Either packet input to this routine may itself be a reference, however

>> + * individual implementations MAY impose limits on how deeply splices may be

>> + * nested and fail the call if those limits are exceeded.

>> + *

>> + * The packets used as input to this routine may reside in different pools,

>> + * however individual implementations MAY require that both reside in the same

>> + * pool and fail the call if this restriction is not observed.

>> + *

>> + * odp_packet_pool() on the returned reference returns the pool id of the

>> + * header packet.

>> + *

>> + * @param pkt    Handle of the base packet for which a reference is to be

>> + *               created.

>> + *

>> + * @param offset Byte offset in the base packet at which the shared reference

>> + *               to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>> + *

>> + * @param hdr    Handle of the header packet to be prefixed onto the base

>> + *               packet to create the reference. If this is specified as

>> + *               ODP_PACKET_INVALID, this call is equivalent to

>> + *               odp_packet_ref(). The supplied hdr must be distinct from

>> + *               the base pkt and results are undefined if an attempt is made

>> + *               to create circular references.

>> + *

>> + * @return       Handle of the reference packet. This may or may not be the

>> + *               same as the input hdr packet. The caller should only use this

>> + *               handle since the original hdr packet no longer has any

>> + *               independent existence.

>> + *

>> + * @retval ODP_PACKET_INVALID If the reference failed. In this case both pkt

>> + *                            and hdr are unchanged.

>> + */

>> +odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,

>> +                             odp_packet_t hdr);

>> +

>> +/**

>> + * Test if a packet is a reference

>> + *

>> + * A packet is a reference if it was created by odp_packet_ref() or

>> + * odp_packet_ref_pkt().  Note that a reference created from another

>> + * reference is considered a compound reference. Calls on such packets will

>> + * result in return values greater than 1.

>> + *

>> + * @param pkt Packet handle

>> + *

>> + * @retval 0  Packet is not a reference

>> + * @retval >0 Packet is a reference consisting of N individual packets.

>> + */

>> +int odp_packet_is_ref(odp_packet_t pkt);

>> +

>> +/**

>> + * Test if a packet has a reference

>> + *

>> + * A packet has a reference if a reference was created on it by

>> + * odp_packet_ref_static(), odp_packet_ref(), or odp_packet_ref_pkt().

>> + *

>> + * @param pkt Packet handle

>> + *

>> + * @retval 0  Packet does not have any references

>> + * @retval >0 Packet has N references based on it

>> + */

>> +int odp_packet_has_ref(odp_packet_t pkt);

>> +

>> +/*

>> + *

>>   * Copy

>>   * ********************************************************

>>   *

>>

>
Maxim Uvarov Dec. 19, 2016, 7:22 p.m. UTC | #5
On 12/19/16 19:41, Bill Fischofer wrote:
> On Mon, Dec 19, 2016 at 9:57 AM, Maxim Uvarov <maxim.uvarov@linaro.org> wrote:

>> how about group all packet reference apis to odp_packet_ref_ naming?

>>

>> In that case

>>

>> odp_packet_ref() -> odp_packet_ref_create() (same as odp_pool_create,

>> odp_queue, create).

> 

> I think the _create() suffix is unnecessary but have no objection to

> that if that's a consensus view. Note that we also have the other

> variants to deal with and finding a way to add _create() to them

> without it looking awkward might be a challenge.

> 

>>

>> both

>> odp_packet_is_ref() and odp_packet_has_ref()  -> odp_packet_ref()

>> (function which returns ref counter)

>>

>> It will be more easy if application could not see difference between

>> reference and original packet. At least if we do not have special use

>> case for that.

> 

> The reason applications see a difference (and why there are two calls

> rather than one) is that there is a difference, as previously

> discussed. I can modify a reference (indeed that's one of the main

> uses for it) by pushing a unique header onto it. I cannot modify a

> referenced packet as the entire packet must be treated as read-only,

> per our convention. If you want to modify both independently the way

> you do that is create two references to the same base packet, and then

> both of these can have their own unique prefixes while the base packet

> remains read-only.

> 


Maybe I understood previously something wrong. My understanding before
what that read-only for packet which has references can be
implementation specific. I.e. when you set up reference you defend
original memory with mprotect(READ) and set up flag in buffer header
that memory is protected. Then if somebody needs to modify that memory
first he needs to get access to packet with something like
*odp_packet_data(pkt) and here we have 2 options: 1) if uses asks for
data then alloc new packet and gave him writeable pointer 2) add call
*odp_packet_data_ro(pkt) which will keep packet read only and
*odp_packet_data(pkt) will assume writes. Other functions with packets
work with handles and can take care about rw internally. That looks more
easy to use from application side.

Maxim.


>>

>> Maxim.

>>

>>

>> On 11/15/16 17:44, Bill Fischofer wrote:

>>> Introduce three new APIs that support efficient sharing of portions of

>>> packets.

>>>

>>> odp_packet_ref_static() creates an alias for a base packet

>>>

>>> odp_packet_ref() creates a reference to a base packet

>>>

>>> odp_packet_ref_pkt() creates a reference to a base packet from a supplied

>>> header packet

>>>

>>> In addition, three other APIs simplify working with references

>>>

>>> odp_packet_is_ref() says whether a packet is a reference

>>>

>>> odp_packet_has_ref() says whether a packet has had a reference made to it

>>>

>>> odp_packet_unshared_len() gives the length of the prefix bytes that are

>>> unique to this reference. These are the only bytes of the packet that may

>>> be modified safely.

>>>

>>> Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

>>> ---

>>>  include/odp/api/spec/packet.h | 168 ++++++++++++++++++++++++++++++++++++++++++

>>>  1 file changed, 168 insertions(+)

>>>

>>> diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h

>>> index faf62e2..137024f 100644

>>> --- a/include/odp/api/spec/packet.h

>>> +++ b/include/odp/api/spec/packet.h

>>> @@ -256,6 +256,23 @@ uint32_t odp_packet_seg_len(odp_packet_t pkt);

>>>  uint32_t odp_packet_len(odp_packet_t pkt);

>>>

>>>  /**

>>> + * Packet unshared data len

>>> + *

>>> + * Returns the sum of data lengths over all unshared packet segments. These

>>> + * are the only bytes that should be modified by the caller. The rest of the

>>> + * packet should be treated as read-only.

>>> + *

>>> + * This routine will return 0 if odp_packet_has_ref() != 0 and will be the

>>> + * same as odp_packet_len() if odp_packet_has_ref() == 0 and

>>> + * odp_packet_is_ref() == 0.

>>> + *

>>> + * @param pkt Packet handle

>>> + *

>>> + * @return Packet unshared data length

>>> + */

>>> +uint32_t odp_packet_unshared_len(odp_packet_t pkt);

>>> +

>>> +/**

>>>   * Packet headroom length

>>>   *

>>>   * Returns the current packet level headroom length.

>>> @@ -847,6 +864,157 @@ int odp_packet_split(odp_packet_t *pkt, uint32_t len, odp_packet_t *tail);

>>>

>>>  /*

>>>   *

>>> + * References

>>> + * ********************************************************

>>> + *

>>> + */

>>> +

>>> +/**

>>> + * Create a static reference to a packet

>>> + *

>>> + * A static reference is used to obtain an additional handle for referring to

>>> + * a packet so that the storage behind it is not freed until all references to

>>> + * the packet are freed. This can be used, for example, to support efficient

>>> + * retransmission processing.

>>> + *

>>> + * The intent of a static reference is that both the base packet and the

>>> + * returned reference will be treated as read-only after this call. Results

>>> + * are unpredictable if this restriction is not observed.

>>> + *

>>> + * Static references have restrictions but may have performance advantages on

>>> + * some platforms if the caller does not intend to modify the reference

>>> + * packet. If modification is needed (e.g., to prefix a unique header onto the

>>> + * packet) then odp_packet_ref() or odp_packet_ref_pkt() should be used.

>>> + *

>>> + * @param pkt    Handle of the base packet for which a static reference is

>>> + *               to be created.

>>> + *

>>> + * @return                    Handle of the static reference packet

>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>> + */

>>> +odp_packet_t odp_packet_ref_static(odp_packet_t pkt);

>>> +

>>> +/**

>>> + * Create a reference to a packet

>>> + *

>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>> + * offset. This reference may be used as an argument to header manipulation

>>> + * APIs to prefix a unique header onto the shared base. The storage associated

>>> + * with the base packet is not freed until all references to it are freed,

>>> + * which permits easy multicasting or retransmission processing to be

>>> + * performed. Following a successful call, the base packet should be treated

>>> + * as read-only. Results are unpredictable if this restriction is not

>>> + * observed.

>>> + *

>>> + * This operation prepends a zero-length header onto the base packet beginning

>>> + * at the specified offset. This header is always drawn from the same pool as

>>> + * the base packet.

>>> + *

>>> + * Because references are unique packets, any bytes pushed onto the head of a

>>> + * reference via odp_packet_push_head() or odp_packet_extend_head() are unique

>>> + * to this reference and are not seen by the base packet or by any other

>>> + * reference to the same base packet.

>>> + *

>>> + * The base packet used as input to this routine may itself by a reference to

>>> + * some other base packet. Implementations MAY restrict the ability to create

>>> + * such compound references. Attempts to exceed any implementation limits in

>>> + * this regard will result in this call failing and returning

>>> + * ODP_PACKET_INVALID.

>>> + *

>>> + * If the caller does not indend to push a header onto the returned reference,

>>> + * the odp_packet_ref_static() API may be used. This may be a more efficient

>>> + * means of obtaining another reference to a base packet that will be treated

>>> + * as read-only.

>>> + *

>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>> + *               created.

>>> + *

>>> + * @param offset Byte offset in the base packet at which the shared reference

>>> + *               is to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>> + *

>>> + * @return                    Handle of the reference packet

>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>> + *

>>> +

>>> + */

>>> +odp_packet_t odp_packet_ref(odp_packet_t pkt, uint32_t offset);

>>> +

>>> +/**

>>> + * Create a reference to another packet from a header packet

>>> + *

>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>> + * offset by prepending a supplied header packet. The resulting reference

>>> + * consists of the unshared header followed by the shared base packet starting

>>> + * at the specified offset. This shared portion should be regarded as

>>> + * read-only. The storage associated with the shared portion of the reference

>>> + * is not freed until all references to it are freed, which permits easy

>>> + * multicasting or retransmission processing to be performed.

>>> + *

>>> + * Either packet input to this routine may itself be a reference, however

>>> + * individual implementations MAY impose limits on how deeply splices may be

>>> + * nested and fail the call if those limits are exceeded.

>>> + *

>>> + * The packets used as input to this routine may reside in different pools,

>>> + * however individual implementations MAY require that both reside in the same

>>> + * pool and fail the call if this restriction is not observed.

>>> + *

>>> + * odp_packet_pool() on the returned reference returns the pool id of the

>>> + * header packet.

>>> + *

>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>> + *               created.

>>> + *

>>> + * @param offset Byte offset in the base packet at which the shared reference

>>> + *               to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>> + *

>>> + * @param hdr    Handle of the header packet to be prefixed onto the base

>>> + *               packet to create the reference. If this is specified as

>>> + *               ODP_PACKET_INVALID, this call is equivalent to

>>> + *               odp_packet_ref(). The supplied hdr must be distinct from

>>> + *               the base pkt and results are undefined if an attempt is made

>>> + *               to create circular references.

>>> + *

>>> + * @return       Handle of the reference packet. This may or may not be the

>>> + *               same as the input hdr packet. The caller should only use this

>>> + *               handle since the original hdr packet no longer has any

>>> + *               independent existence.

>>> + *

>>> + * @retval ODP_PACKET_INVALID If the reference failed. In this case both pkt

>>> + *                            and hdr are unchanged.

>>> + */

>>> +odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,

>>> +                             odp_packet_t hdr);

>>> +

>>> +/**

>>> + * Test if a packet is a reference

>>> + *

>>> + * A packet is a reference if it was created by odp_packet_ref() or

>>> + * odp_packet_ref_pkt().  Note that a reference created from another

>>> + * reference is considered a compound reference. Calls on such packets will

>>> + * result in return values greater than 1.

>>> + *

>>> + * @param pkt Packet handle

>>> + *

>>> + * @retval 0  Packet is not a reference

>>> + * @retval >0 Packet is a reference consisting of N individual packets.

>>> + */

>>> +int odp_packet_is_ref(odp_packet_t pkt);

>>> +

>>> +/**

>>> + * Test if a packet has a reference

>>> + *

>>> + * A packet has a reference if a reference was created on it by

>>> + * odp_packet_ref_static(), odp_packet_ref(), or odp_packet_ref_pkt().

>>> + *

>>> + * @param pkt Packet handle

>>> + *

>>> + * @retval 0  Packet does not have any references

>>> + * @retval >0 Packet has N references based on it

>>> + */

>>> +int odp_packet_has_ref(odp_packet_t pkt);

>>> +

>>> +/*

>>> + *

>>>   * Copy

>>>   * ********************************************************

>>>   *

>>>

>>
Bill Fischofer Dec. 19, 2016, 8:06 p.m. UTC | #6
On Mon, Dec 19, 2016 at 1:22 PM, Maxim Uvarov <maxim.uvarov@linaro.org> wrote:
> On 12/19/16 19:41, Bill Fischofer wrote:

>> On Mon, Dec 19, 2016 at 9:57 AM, Maxim Uvarov <maxim.uvarov@linaro.org> wrote:

>>> how about group all packet reference apis to odp_packet_ref_ naming?

>>>

>>> In that case

>>>

>>> odp_packet_ref() -> odp_packet_ref_create() (same as odp_pool_create,

>>> odp_queue, create).

>>

>> I think the _create() suffix is unnecessary but have no objection to

>> that if that's a consensus view. Note that we also have the other

>> variants to deal with and finding a way to add _create() to them

>> without it looking awkward might be a challenge.

>>

>>>

>>> both

>>> odp_packet_is_ref() and odp_packet_has_ref()  -> odp_packet_ref()

>>> (function which returns ref counter)

>>>

>>> It will be more easy if application could not see difference between

>>> reference and original packet. At least if we do not have special use

>>> case for that.

>>

>> The reason applications see a difference (and why there are two calls

>> rather than one) is that there is a difference, as previously

>> discussed. I can modify a reference (indeed that's one of the main

>> uses for it) by pushing a unique header onto it. I cannot modify a

>> referenced packet as the entire packet must be treated as read-only,

>> per our convention. If you want to modify both independently the way

>> you do that is create two references to the same base packet, and then

>> both of these can have their own unique prefixes while the base packet

>> remains read-only.

>>

>

> Maybe I understood previously something wrong. My understanding before

> what that read-only for packet which has references can be

> implementation specific. I.e. when you set up reference you defend

> original memory with mprotect(READ) and set up flag in buffer header

> that memory is protected. Then if somebody needs to modify that memory

> first he needs to get access to packet with something like

> *odp_packet_data(pkt) and here we have 2 options: 1) if uses asks for

> data then alloc new packet and gave him writeable pointer 2) add call

> *odp_packet_data_ro(pkt) which will keep packet read only and

> *odp_packet_data(pkt) will assume writes. Other functions with packets

> work with handles and can take care about rw internally. That looks more

> easy to use from application side.


Doing syscalls and inserting special headers is all very high overhead
whereas packet references are intended  to be a lightweight
abstraction, which is why Petri suggested that the simplest solution
to portability is to have the convention that applications treat any
packet that has a reference as read only. That avoids a lot of
overhead and is a lot simpler to explain and implement across varying
platforms. One of the reasons for adding the odp_packet_has_ref() and
odp_packet_unshared_len() APIs is to make it easy for applications (or
library routines) to know whether they should observe this restriction
when handed an unknown packet handle.

Since references are lightweight, it's a simple matter to create
multiple references if you want to manipulate two (or more) prefixes
to a common payload independently. The base packet is treated as read
only while the references do push() calls for their unique prefixes.
Since this latter case is expected to be fairly common, we have the
odp_packet_ref_pkt() API that does this in one call to allow
implementations to optimize this further.

The provision of the odp_packet_ref_static() API is another
optimization if the caller knows that the reference itself will also
be treated as read only (e.g., to hold onto a packet reference for
retransmission purposes). In that case the implementation may not have
to do anything more than increment a reference counter (which is how
odp-linux behaves for this API).

>

> Maxim.

>

>

>>>

>>> Maxim.

>>>

>>>

>>> On 11/15/16 17:44, Bill Fischofer wrote:

>>>> Introduce three new APIs that support efficient sharing of portions of

>>>> packets.

>>>>

>>>> odp_packet_ref_static() creates an alias for a base packet

>>>>

>>>> odp_packet_ref() creates a reference to a base packet

>>>>

>>>> odp_packet_ref_pkt() creates a reference to a base packet from a supplied

>>>> header packet

>>>>

>>>> In addition, three other APIs simplify working with references

>>>>

>>>> odp_packet_is_ref() says whether a packet is a reference

>>>>

>>>> odp_packet_has_ref() says whether a packet has had a reference made to it

>>>>

>>>> odp_packet_unshared_len() gives the length of the prefix bytes that are

>>>> unique to this reference. These are the only bytes of the packet that may

>>>> be modified safely.

>>>>

>>>> Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

>>>> ---

>>>>  include/odp/api/spec/packet.h | 168 ++++++++++++++++++++++++++++++++++++++++++

>>>>  1 file changed, 168 insertions(+)

>>>>

>>>> diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h

>>>> index faf62e2..137024f 100644

>>>> --- a/include/odp/api/spec/packet.h

>>>> +++ b/include/odp/api/spec/packet.h

>>>> @@ -256,6 +256,23 @@ uint32_t odp_packet_seg_len(odp_packet_t pkt);

>>>>  uint32_t odp_packet_len(odp_packet_t pkt);

>>>>

>>>>  /**

>>>> + * Packet unshared data len

>>>> + *

>>>> + * Returns the sum of data lengths over all unshared packet segments. These

>>>> + * are the only bytes that should be modified by the caller. The rest of the

>>>> + * packet should be treated as read-only.

>>>> + *

>>>> + * This routine will return 0 if odp_packet_has_ref() != 0 and will be the

>>>> + * same as odp_packet_len() if odp_packet_has_ref() == 0 and

>>>> + * odp_packet_is_ref() == 0.

>>>> + *

>>>> + * @param pkt Packet handle

>>>> + *

>>>> + * @return Packet unshared data length

>>>> + */

>>>> +uint32_t odp_packet_unshared_len(odp_packet_t pkt);

>>>> +

>>>> +/**

>>>>   * Packet headroom length

>>>>   *

>>>>   * Returns the current packet level headroom length.

>>>> @@ -847,6 +864,157 @@ int odp_packet_split(odp_packet_t *pkt, uint32_t len, odp_packet_t *tail);

>>>>

>>>>  /*

>>>>   *

>>>> + * References

>>>> + * ********************************************************

>>>> + *

>>>> + */

>>>> +

>>>> +/**

>>>> + * Create a static reference to a packet

>>>> + *

>>>> + * A static reference is used to obtain an additional handle for referring to

>>>> + * a packet so that the storage behind it is not freed until all references to

>>>> + * the packet are freed. This can be used, for example, to support efficient

>>>> + * retransmission processing.

>>>> + *

>>>> + * The intent of a static reference is that both the base packet and the

>>>> + * returned reference will be treated as read-only after this call. Results

>>>> + * are unpredictable if this restriction is not observed.

>>>> + *

>>>> + * Static references have restrictions but may have performance advantages on

>>>> + * some platforms if the caller does not intend to modify the reference

>>>> + * packet. If modification is needed (e.g., to prefix a unique header onto the

>>>> + * packet) then odp_packet_ref() or odp_packet_ref_pkt() should be used.

>>>> + *

>>>> + * @param pkt    Handle of the base packet for which a static reference is

>>>> + *               to be created.

>>>> + *

>>>> + * @return                    Handle of the static reference packet

>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>> + */

>>>> +odp_packet_t odp_packet_ref_static(odp_packet_t pkt);

>>>> +

>>>> +/**

>>>> + * Create a reference to a packet

>>>> + *

>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>> + * offset. This reference may be used as an argument to header manipulation

>>>> + * APIs to prefix a unique header onto the shared base. The storage associated

>>>> + * with the base packet is not freed until all references to it are freed,

>>>> + * which permits easy multicasting or retransmission processing to be

>>>> + * performed. Following a successful call, the base packet should be treated

>>>> + * as read-only. Results are unpredictable if this restriction is not

>>>> + * observed.

>>>> + *

>>>> + * This operation prepends a zero-length header onto the base packet beginning

>>>> + * at the specified offset. This header is always drawn from the same pool as

>>>> + * the base packet.

>>>> + *

>>>> + * Because references are unique packets, any bytes pushed onto the head of a

>>>> + * reference via odp_packet_push_head() or odp_packet_extend_head() are unique

>>>> + * to this reference and are not seen by the base packet or by any other

>>>> + * reference to the same base packet.

>>>> + *

>>>> + * The base packet used as input to this routine may itself by a reference to

>>>> + * some other base packet. Implementations MAY restrict the ability to create

>>>> + * such compound references. Attempts to exceed any implementation limits in

>>>> + * this regard will result in this call failing and returning

>>>> + * ODP_PACKET_INVALID.

>>>> + *

>>>> + * If the caller does not indend to push a header onto the returned reference,

>>>> + * the odp_packet_ref_static() API may be used. This may be a more efficient

>>>> + * means of obtaining another reference to a base packet that will be treated

>>>> + * as read-only.

>>>> + *

>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>> + *               created.

>>>> + *

>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>> + *               is to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>> + *

>>>> + * @return                    Handle of the reference packet

>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>> + *

>>>> +

>>>> + */

>>>> +odp_packet_t odp_packet_ref(odp_packet_t pkt, uint32_t offset);

>>>> +

>>>> +/**

>>>> + * Create a reference to another packet from a header packet

>>>> + *

>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>> + * offset by prepending a supplied header packet. The resulting reference

>>>> + * consists of the unshared header followed by the shared base packet starting

>>>> + * at the specified offset. This shared portion should be regarded as

>>>> + * read-only. The storage associated with the shared portion of the reference

>>>> + * is not freed until all references to it are freed, which permits easy

>>>> + * multicasting or retransmission processing to be performed.

>>>> + *

>>>> + * Either packet input to this routine may itself be a reference, however

>>>> + * individual implementations MAY impose limits on how deeply splices may be

>>>> + * nested and fail the call if those limits are exceeded.

>>>> + *

>>>> + * The packets used as input to this routine may reside in different pools,

>>>> + * however individual implementations MAY require that both reside in the same

>>>> + * pool and fail the call if this restriction is not observed.

>>>> + *

>>>> + * odp_packet_pool() on the returned reference returns the pool id of the

>>>> + * header packet.

>>>> + *

>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>> + *               created.

>>>> + *

>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>> + *               to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>> + *

>>>> + * @param hdr    Handle of the header packet to be prefixed onto the base

>>>> + *               packet to create the reference. If this is specified as

>>>> + *               ODP_PACKET_INVALID, this call is equivalent to

>>>> + *               odp_packet_ref(). The supplied hdr must be distinct from

>>>> + *               the base pkt and results are undefined if an attempt is made

>>>> + *               to create circular references.

>>>> + *

>>>> + * @return       Handle of the reference packet. This may or may not be the

>>>> + *               same as the input hdr packet. The caller should only use this

>>>> + *               handle since the original hdr packet no longer has any

>>>> + *               independent existence.

>>>> + *

>>>> + * @retval ODP_PACKET_INVALID If the reference failed. In this case both pkt

>>>> + *                            and hdr are unchanged.

>>>> + */

>>>> +odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,

>>>> +                             odp_packet_t hdr);

>>>> +

>>>> +/**

>>>> + * Test if a packet is a reference

>>>> + *

>>>> + * A packet is a reference if it was created by odp_packet_ref() or

>>>> + * odp_packet_ref_pkt().  Note that a reference created from another

>>>> + * reference is considered a compound reference. Calls on such packets will

>>>> + * result in return values greater than 1.

>>>> + *

>>>> + * @param pkt Packet handle

>>>> + *

>>>> + * @retval 0  Packet is not a reference

>>>> + * @retval >0 Packet is a reference consisting of N individual packets.

>>>> + */

>>>> +int odp_packet_is_ref(odp_packet_t pkt);

>>>> +

>>>> +/**

>>>> + * Test if a packet has a reference

>>>> + *

>>>> + * A packet has a reference if a reference was created on it by

>>>> + * odp_packet_ref_static(), odp_packet_ref(), or odp_packet_ref_pkt().

>>>> + *

>>>> + * @param pkt Packet handle

>>>> + *

>>>> + * @retval 0  Packet does not have any references

>>>> + * @retval >0 Packet has N references based on it

>>>> + */

>>>> +int odp_packet_has_ref(odp_packet_t pkt);

>>>> +

>>>> +/*

>>>> + *

>>>>   * Copy

>>>>   * ********************************************************

>>>>   *

>>>>

>>>

>
Balasubramanian Manoharan Dec. 20, 2016, 11:59 a.m. UTC | #7
Regards,
Bala


On 19 December 2016 at 20:23, Bill Fischofer <bill.fischofer@linaro.org> wrote:
> On Mon, Dec 19, 2016 at 4:06 AM, Bala Manoharan

> <bala.manoharan@linaro.org> wrote:

>> Comments inline....

>>

>>

>> On 15 November 2016 at 20:14, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>> Introduce three new APIs that support efficient sharing of portions of

>>> packets.

>>>

>>> odp_packet_ref_static() creates an alias for a base packet

>>>

>>> odp_packet_ref() creates a reference to a base packet

>>>

>>> odp_packet_ref_pkt() creates a reference to a base packet from a supplied

>>> header packet

>>>

>>> In addition, three other APIs simplify working with references

>>>

>>> odp_packet_is_ref() says whether a packet is a reference

>>>

>>> odp_packet_has_ref() says whether a packet has had a reference made to it

>>>

>>> odp_packet_unshared_len() gives the length of the prefix bytes that are

>>> unique to this reference. These are the only bytes of the packet that may

>>> be modified safely.

>>>

>>> Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

>>> ---

>>>  include/odp/api/spec/packet.h | 168 ++++++++++++++++++++++++++++++++++++++++++

>>>  1 file changed, 168 insertions(+)

>>>

>>> diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h

>>> index faf62e2..137024f 100644

>>> --- a/include/odp/api/spec/packet.h

>>> +++ b/include/odp/api/spec/packet.h

>>> @@ -256,6 +256,23 @@ uint32_t odp_packet_seg_len(odp_packet_t pkt);

>>>  uint32_t odp_packet_len(odp_packet_t pkt);

>>>

>>>  /**

>>> + * Packet unshared data len

>>> + *

>>> + * Returns the sum of data lengths over all unshared packet segments. These

>>> + * are the only bytes that should be modified by the caller. The rest of the

>>> + * packet should be treated as read-only.

>>> + *

>>> + * This routine will return 0 if odp_packet_has_ref() != 0 and will be the

>>> + * same as odp_packet_len() if odp_packet_has_ref() == 0 and

>>> + * odp_packet_is_ref() == 0.

>>> + *

>>> + * @param pkt Packet handle

>>> + *

>>> + * @return Packet unshared data length

>>> + */

>>> +uint32_t odp_packet_unshared_len(odp_packet_t pkt);

>>> +

>>> +/**

>>>   * Packet headroom length

>>>   *

>>>   * Returns the current packet level headroom length.

>>> @@ -847,6 +864,157 @@ int odp_packet_split(odp_packet_t *pkt, uint32_t len, odp_packet_t *tail);

>>>

>>>  /*

>>>   *

>>> + * References

>>> + * ********************************************************

>>> + *

>>> + */

>>> +

>>> +/**

>>> + * Create a static reference to a packet

>>> + *

>>> + * A static reference is used to obtain an additional handle for referring to

>>> + * a packet so that the storage behind it is not freed until all references to

>>> + * the packet are freed. This can be used, for example, to support efficient

>>> + * retransmission processing.

>>> + *

>>> + * The intent of a static reference is that both the base packet and the

>>> + * returned reference will be treated as read-only after this call. Results

>>> + * are unpredictable if this restriction is not observed.

>>> + *

>>> + * Static references have restrictions but may have performance advantages on

>>> + * some platforms if the caller does not intend to modify the reference

>>> + * packet. If modification is needed (e.g., to prefix a unique header onto the

>>> + * packet) then odp_packet_ref() or odp_packet_ref_pkt() should be used.

>>> + *

>>> + * @param pkt    Handle of the base packet for which a static reference is

>>> + *               to be created.

>>> + *

>>> + * @return                    Handle of the static reference packet

>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>> + */

>>> +odp_packet_t odp_packet_ref_static(odp_packet_t pkt);

>>

>> It is better to change the API signature to return the updated handle

>> of the base packet also to the application.

>> Similar to the API for odp_packet_extend_head().

>

> I think returning the packet ref directly rather than indirectly makes

> for easier coding on the part of applications. Failure is indicated by

> returning ODP_PACKET_INVALID. So in this sense we're like

> odp_packet_alloc(). The alternative:

>

> int odp_packet_ref_static(odp_packet_t pkt, odp_packet_t *refpkt);

>

> seems cumbersome since if RC != 0 then the refpkt return would be meaningless.

>

>>

>>> +

>>> +/**

>>> + * Create a reference to a packet

>>> + *

>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>> + * offset. This reference may be used as an argument to header manipulation

>>> + * APIs to prefix a unique header onto the shared base. The storage associated

>>> + * with the base packet is not freed until all references to it are freed,

>>> + * which permits easy multicasting or retransmission processing to be

>>> + * performed. Following a successful call, the base packet should be treated

>>> + * as read-only. Results are unpredictable if this restriction is not

>>> + * observed.

>>> + *

>>> + * This operation prepends a zero-length header onto the base packet beginning

>>> + * at the specified offset. This header is always drawn from the same pool as

>>> + * the base packet.

>>> + *

>>> + * Because references are unique packets, any bytes pushed onto the head of a

>>> + * reference via odp_packet_push_head() or odp_packet_extend_head() are unique

>>> + * to this reference and are not seen by the base packet or by any other

>>> + * reference to the same base packet.

>>> + *

>>> + * The base packet used as input to this routine may itself by a reference to

>>> + * some other base packet. Implementations MAY restrict the ability to create

>>> + * such compound references. Attempts to exceed any implementation limits in

>>> + * this regard will result in this call failing and returning

>>> + * ODP_PACKET_INVALID.

>>> + *

>>> + * If the caller does not indend to push a header onto the returned reference,

>>> + * the odp_packet_ref_static() API may be used. This may be a more efficient

>>> + * means of obtaining another reference to a base packet that will be treated

>>> + * as read-only.

>>> + *

>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>> + *               created.

>>> + *

>>> + * @param offset Byte offset in the base packet at which the shared reference

>>> + *               is to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>> + *

>>> + * @return                    Handle of the reference packet

>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>> + *

>>> +

>>> + */

>>> +odp_packet_t odp_packet_ref(odp_packet_t pkt, uint32_t offset);

>>> +

>>> +/**

>>> + * Create a reference to another packet from a header packet

>>> + *

>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>> + * offset by prepending a supplied header packet. The resulting reference

>>> + * consists of the unshared header followed by the shared base packet starting

>>> + * at the specified offset. This shared portion should be regarded as

>>> + * read-only. The storage associated with the shared portion of the reference

>>> + * is not freed until all references to it are freed, which permits easy

>>> + * multicasting or retransmission processing to be performed.

>>

>> My concerns with this API is what happens when application creates

>> multiple references from multiple offsets of the base packet. In that

>> scenario the read-only status of the shared portion could not be

>> maintained since if different references gives different offset there

>> will be more than one shared portion.

>>

>

> Why would this be a problem? We're relying on an "honor system" here

> since there is no defined enforcement mechanism. The rule is that you

> should only modify the unshared portion of any packet and results are

> undefined if this rule is ignored. That's why we have the

> odp_packet_unshared_len() as well as the odp_packet_has_ref() APIs to

> help applications know whether they should or should not attempt to

> modify a packet given it's handle.


Let us consider an example,

pkt_a = odp_packet_ref(pkt_base, 200);
-----
-----
-----
pkt_b = odp_packet_ref(pkt_base, 100);

In the above case, the read-only portion of pkt_base was from 200 to
end-of-packet during creation of first reference and it is moved to
100 to end-of-packet during creation of second reference so all the
segment handle of pkt_base previously owned by the application becomes
invalid.

>

>>> + *

>>> + * Either packet input to this routine may itself be a reference, however

>>> + * individual implementations MAY impose limits on how deeply splices may be

>>> + * nested and fail the call if those limits are exceeded.

>>> + *

>>> + * The packets used as input to this routine may reside in different pools,

>>> + * however individual implementations MAY require that both reside in the same

>>> + * pool and fail the call if this restriction is not observed.

>>

>> Not sure if there is a requirement to support reference with packets

>> from multiple pools.

>

> That's why this is a MAY. We could expose this as a capability or

> simply state that this is not supported, however some implementations

> may be able to support this (e.g., odp-linux has no need to make this

> restriction) and I could see how it could be useful to have header

> templates in their own pool for storage management purposes.


odp-linux is a special case since it does not use any HW pool manager,
most platforms using HW pool manager might not be able to create a
packet with segments from multiple pools since it will be difficult to
return the segments to respective pools during odp_packet_free().
packet pool is a limited resource in a system and it might not be
advisable to use a separate pool for header template.

>

>>

>>> + *

>>> + * odp_packet_pool() on the returned reference returns the pool id of the

>>> + * header packet.

>>> + *

>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>> + *               created.

>>> + *

>>> + * @param offset Byte offset in the base packet at which the shared reference

>>> + *               to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>> + *

>>> + * @param hdr    Handle of the header packet to be prefixed onto the base

>>> + *               packet to create the reference. If this is specified as

>>> + *               ODP_PACKET_INVALID, this call is equivalent to

>>> + *               odp_packet_ref(). The supplied hdr must be distinct from

>>> + *               the base pkt and results are undefined if an attempt is made

>>> + *               to create circular references.

>>> + *

>>> + * @return       Handle of the reference packet. This may or may not be the

>>> + *               same as the input hdr packet. The caller should only use this

>>> + *               handle since the original hdr packet no longer has any

>>> + *               independent existence.

>>> + *

>>> + * @retval ODP_PACKET_INVALID If the reference failed. In this case both pkt

>>> + *                            and hdr are unchanged.

>>> + */

>>> +odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,

>>> +                               odp_packet_t hdr);

>>> +

>>> +/**

>>> + * Test if a packet is a reference

>>> + *

>>> + * A packet is a reference if it was created by odp_packet_ref() or

>>> + * odp_packet_ref_pkt().  Note that a reference created from another

>>> + * reference is considered a compound reference. Calls on such packets will

>>> + * result in return values greater than 1.

>>> + *

>>> + * @param pkt Packet handle

>>> + *

>>> + * @retval 0  Packet is not a reference

>>> + * @retval >0 Packet is a reference consisting of N individual packets.

>>> + */

>>> +int odp_packet_is_ref(odp_packet_t pkt);

>>

>> It is better to keep the return value as 0 or 1. Is it expected to

>> return the number of references?

>> The total number of references created is not interesting and also it

>> is not so easy since references are created at segment level as per

>> this proposal.

>> Application will have to call odp_packet_free() for all the packet

>> handle it is holding.

>

> Since any implementation supporting references needs to have some

> internal notion of a reference count this is just attempting to expose

> that in an implementation-independent manner. This is also why there

> is both the odp_packet_is_ref() and odp_packet_has_ref() APIs defined.


Again this depends on the implementation, based on the proposal here
the reference count should be maintained per segment since the
reference is being created using an offset, different segments of the
packets could have different references.
My concern is this count is not useful since application already has
packet handles per reference and it has to free all the handles which
was populated using reference APIs.

>

>>

>>> +

>>> +/**

>>> + * Test if a packet has a reference

>>> + *

>>> + * A packet has a reference if a reference was created on it by

>>> + * odp_packet_ref_static(), odp_packet_ref(), or odp_packet_ref_pkt().

>>> + *

>>> + * @param pkt Packet handle

>>> + *

>>> + * @retval 0  Packet does not have any references

>>> + * @retval >0 Packet has N references based on it

>>> + */

>>> +int odp_packet_has_ref(odp_packet_t pkt);

>>

>> What is the difference between odp_packet_has_ref() and

>> odp_packet_is_ref()? IMO Once a reference is created there is no

>> difference between the base packet and reference.

>

> Not true. This is discussed (with diagrams) in the User Guide updates

> associated with this patch series.

>

> Consider the call:

>

> pkt_a = odp_packet_ref(pkt_b, 0);

>

> After this call the following conditions hold:

>

> odp_packet_is_ref(pkt_a) == 1

> odp_packet_is_ref(pkt_b) == 0

> odp_packet_has_ref(pkt_a) == 0

> odp_packet_has_ref(pkt_b) == 1

>

> Now consider the more complex case:

>

> pkt_a = odp_packet_ref(pkt_b, offset1);

> pkt_c = odp_packet_ref(pkt_b, offset2);

> pkt_d = odp_packet_ref(pkt_a, offset3);

>

> Now we have:

>

> odp_packet_is_ref(pkt_a) == 1

> odp_packet_is_ref(pkt_b) == 0

> odp_packet_is_ref(pkt_c) == 1

> odp_packet_is_ref(pkt_d) == 2

> odp_packet_has_ref(pkt_a) == 1

> odp_packet_has_ref(pkt_b) == 3

> odp_packet_has_ref(pkt_c) == 0

> odp_packet_has_ref(pkt_d) == 0

>

> Essentially, odp_packet_is_ref() answers the question "How many other

> packets are referenced via this handle"? While odp_packet_has_ref()

> answers the question "How many other handles can be used to reference

> this packet"?


This information requires lots of unnecessary computation at the
implementation level, references are created by linking multiple
segments together from application point of view there is no
difference between a base packet and a reference both have few shared
segments and have few unique segments. I understand the definition of
this API, my query is the use-case for returning this number of
has_ref and is_ref()?

Regards,
Bala
>

>>

>> Regards,

>> Bala

>>> +

>>> +/*

>>> + *

>>>   * Copy

>>>   * ********************************************************

>>>   *

>>> --

>>> 2.7.4

>>>
Bill Fischofer Dec. 20, 2016, 12:56 p.m. UTC | #8
On Tue, Dec 20, 2016 at 5:59 AM, Bala Manoharan
<bala.manoharan@linaro.org> wrote:
> Regards,

> Bala

>

>

> On 19 December 2016 at 20:23, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>> On Mon, Dec 19, 2016 at 4:06 AM, Bala Manoharan

>> <bala.manoharan@linaro.org> wrote:

>>> Comments inline....

>>>

>>>

>>> On 15 November 2016 at 20:14, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>> Introduce three new APIs that support efficient sharing of portions of

>>>> packets.

>>>>

>>>> odp_packet_ref_static() creates an alias for a base packet

>>>>

>>>> odp_packet_ref() creates a reference to a base packet

>>>>

>>>> odp_packet_ref_pkt() creates a reference to a base packet from a supplied

>>>> header packet

>>>>

>>>> In addition, three other APIs simplify working with references

>>>>

>>>> odp_packet_is_ref() says whether a packet is a reference

>>>>

>>>> odp_packet_has_ref() says whether a packet has had a reference made to it

>>>>

>>>> odp_packet_unshared_len() gives the length of the prefix bytes that are

>>>> unique to this reference. These are the only bytes of the packet that may

>>>> be modified safely.

>>>>

>>>> Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

>>>> ---

>>>>  include/odp/api/spec/packet.h | 168 ++++++++++++++++++++++++++++++++++++++++++

>>>>  1 file changed, 168 insertions(+)

>>>>

>>>> diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h

>>>> index faf62e2..137024f 100644

>>>> --- a/include/odp/api/spec/packet.h

>>>> +++ b/include/odp/api/spec/packet.h

>>>> @@ -256,6 +256,23 @@ uint32_t odp_packet_seg_len(odp_packet_t pkt);

>>>>  uint32_t odp_packet_len(odp_packet_t pkt);

>>>>

>>>>  /**

>>>> + * Packet unshared data len

>>>> + *

>>>> + * Returns the sum of data lengths over all unshared packet segments. These

>>>> + * are the only bytes that should be modified by the caller. The rest of the

>>>> + * packet should be treated as read-only.

>>>> + *

>>>> + * This routine will return 0 if odp_packet_has_ref() != 0 and will be the

>>>> + * same as odp_packet_len() if odp_packet_has_ref() == 0 and

>>>> + * odp_packet_is_ref() == 0.

>>>> + *

>>>> + * @param pkt Packet handle

>>>> + *

>>>> + * @return Packet unshared data length

>>>> + */

>>>> +uint32_t odp_packet_unshared_len(odp_packet_t pkt);

>>>> +

>>>> +/**

>>>>   * Packet headroom length

>>>>   *

>>>>   * Returns the current packet level headroom length.

>>>> @@ -847,6 +864,157 @@ int odp_packet_split(odp_packet_t *pkt, uint32_t len, odp_packet_t *tail);

>>>>

>>>>  /*

>>>>   *

>>>> + * References

>>>> + * ********************************************************

>>>> + *

>>>> + */

>>>> +

>>>> +/**

>>>> + * Create a static reference to a packet

>>>> + *

>>>> + * A static reference is used to obtain an additional handle for referring to

>>>> + * a packet so that the storage behind it is not freed until all references to

>>>> + * the packet are freed. This can be used, for example, to support efficient

>>>> + * retransmission processing.

>>>> + *

>>>> + * The intent of a static reference is that both the base packet and the

>>>> + * returned reference will be treated as read-only after this call. Results

>>>> + * are unpredictable if this restriction is not observed.

>>>> + *

>>>> + * Static references have restrictions but may have performance advantages on

>>>> + * some platforms if the caller does not intend to modify the reference

>>>> + * packet. If modification is needed (e.g., to prefix a unique header onto the

>>>> + * packet) then odp_packet_ref() or odp_packet_ref_pkt() should be used.

>>>> + *

>>>> + * @param pkt    Handle of the base packet for which a static reference is

>>>> + *               to be created.

>>>> + *

>>>> + * @return                    Handle of the static reference packet

>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>> + */

>>>> +odp_packet_t odp_packet_ref_static(odp_packet_t pkt);

>>>

>>> It is better to change the API signature to return the updated handle

>>> of the base packet also to the application.

>>> Similar to the API for odp_packet_extend_head().

>>

>> I think returning the packet ref directly rather than indirectly makes

>> for easier coding on the part of applications. Failure is indicated by

>> returning ODP_PACKET_INVALID. So in this sense we're like

>> odp_packet_alloc(). The alternative:

>>

>> int odp_packet_ref_static(odp_packet_t pkt, odp_packet_t *refpkt);

>>

>> seems cumbersome since if RC != 0 then the refpkt return would be meaningless.

>>

>>>

>>>> +

>>>> +/**

>>>> + * Create a reference to a packet

>>>> + *

>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>> + * offset. This reference may be used as an argument to header manipulation

>>>> + * APIs to prefix a unique header onto the shared base. The storage associated

>>>> + * with the base packet is not freed until all references to it are freed,

>>>> + * which permits easy multicasting or retransmission processing to be

>>>> + * performed. Following a successful call, the base packet should be treated

>>>> + * as read-only. Results are unpredictable if this restriction is not

>>>> + * observed.

>>>> + *

>>>> + * This operation prepends a zero-length header onto the base packet beginning

>>>> + * at the specified offset. This header is always drawn from the same pool as

>>>> + * the base packet.

>>>> + *

>>>> + * Because references are unique packets, any bytes pushed onto the head of a

>>>> + * reference via odp_packet_push_head() or odp_packet_extend_head() are unique

>>>> + * to this reference and are not seen by the base packet or by any other

>>>> + * reference to the same base packet.

>>>> + *

>>>> + * The base packet used as input to this routine may itself by a reference to

>>>> + * some other base packet. Implementations MAY restrict the ability to create

>>>> + * such compound references. Attempts to exceed any implementation limits in

>>>> + * this regard will result in this call failing and returning

>>>> + * ODP_PACKET_INVALID.

>>>> + *

>>>> + * If the caller does not indend to push a header onto the returned reference,

>>>> + * the odp_packet_ref_static() API may be used. This may be a more efficient

>>>> + * means of obtaining another reference to a base packet that will be treated

>>>> + * as read-only.

>>>> + *

>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>> + *               created.

>>>> + *

>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>> + *               is to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>> + *

>>>> + * @return                    Handle of the reference packet

>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>> + *

>>>> +

>>>> + */

>>>> +odp_packet_t odp_packet_ref(odp_packet_t pkt, uint32_t offset);

>>>> +

>>>> +/**

>>>> + * Create a reference to another packet from a header packet

>>>> + *

>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>> + * offset by prepending a supplied header packet. The resulting reference

>>>> + * consists of the unshared header followed by the shared base packet starting

>>>> + * at the specified offset. This shared portion should be regarded as

>>>> + * read-only. The storage associated with the shared portion of the reference

>>>> + * is not freed until all references to it are freed, which permits easy

>>>> + * multicasting or retransmission processing to be performed.

>>>

>>> My concerns with this API is what happens when application creates

>>> multiple references from multiple offsets of the base packet. In that

>>> scenario the read-only status of the shared portion could not be

>>> maintained since if different references gives different offset there

>>> will be more than one shared portion.

>>>

>>

>> Why would this be a problem? We're relying on an "honor system" here

>> since there is no defined enforcement mechanism. The rule is that you

>> should only modify the unshared portion of any packet and results are

>> undefined if this rule is ignored. That's why we have the

>> odp_packet_unshared_len() as well as the odp_packet_has_ref() APIs to

>> help applications know whether they should or should not attempt to

>> modify a packet given it's handle.

>

> Let us consider an example,

>

> pkt_a = odp_packet_ref(pkt_base, 200);

> -----

> -----

> -----

> pkt_b = odp_packet_ref(pkt_base, 100);

>

> In the above case, the read-only portion of pkt_base was from 200 to

> end-of-packet during creation of first reference and it is moved to

> 100 to end-of-packet during creation of second reference so all the

> segment handle of pkt_base previously owned by the application becomes

> invalid.

>


No, in the above examples the entirety of pkt_base should be treated
as read-only from the moment odp_packet_has_ref(pkt_base) > 0 until
odp_packet_has_ref(pkt_base) == 0. That's the simplest, cleanest, and
most portable rule. During this time odp_packet_unshared_len(pkt_base)
will return 0. I don't see a use case for any other interpretation,
and any other interpretation is going to encounter a lot of
portability issues. Again, this is an "honor system". It's up to the
application to observe this convention since we don't want to specify
that the implementation has to somehow make pkt_base read only. All
ODP says is results are undefined if applications do not observe this
rule.

Note that for both pkt_a and pkt_b odp_packet_has_ref() is 0 while
odp_packet_is_ref() will be 1. That means that pkt_a and pkt_b can be
modified. However since after the odp_packet_ref() call
odp_packet_unshared_len() is 0 for both, that means that the only way
they should be modified would be to call odp_packet_push_head() or
odp_packet_extend_head() to put a unique prefix on the shared payload.
So if we call odp_packet_push_head(pkt_a, 64);, then
odp_packet_unshare_len(pkt_a) == 64, etc.

>>

>>>> + *

>>>> + * Either packet input to this routine may itself be a reference, however

>>>> + * individual implementations MAY impose limits on how deeply splices may be

>>>> + * nested and fail the call if those limits are exceeded.

>>>> + *

>>>> + * The packets used as input to this routine may reside in different pools,

>>>> + * however individual implementations MAY require that both reside in the same

>>>> + * pool and fail the call if this restriction is not observed.

>>>

>>> Not sure if there is a requirement to support reference with packets

>>> from multiple pools.

>>

>> That's why this is a MAY. We could expose this as a capability or

>> simply state that this is not supported, however some implementations

>> may be able to support this (e.g., odp-linux has no need to make this

>> restriction) and I could see how it could be useful to have header

>> templates in their own pool for storage management purposes.

>

> odp-linux is a special case since it does not use any HW pool manager,

> most platforms using HW pool manager might not be able to create a

> packet with segments from multiple pools since it will be difficult to

> return the segments to respective pools during odp_packet_free().

> packet pool is a limited resource in a system and it might not be

> advisable to use a separate pool for header template.


That's why this is a MAY. Implementation that don't support this would
fail the odp_packet_ref_pkt() call by returning ODP_PACKET_INVALID if
the header and packet to be referenced reside in different pools. This
is no different than an odp_packet_alloc() call failing because the
pool is out of memory, etc. Applications are expected to be able to
deal with these sort of situations.

Again, if we want we can simply say that both packets must come from
the same pool and leave it at that. I have no problem changing that if
people feel that a more uniform interpretation is desirable. Note
however that many of the other packet routines (odp_packet_copy(),
odp_packet_concat()) support multi-pool operation.

>

>>

>>>

>>>> + *

>>>> + * odp_packet_pool() on the returned reference returns the pool id of the

>>>> + * header packet.

>>>> + *

>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>> + *               created.

>>>> + *

>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>> + *               to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>> + *

>>>> + * @param hdr    Handle of the header packet to be prefixed onto the base

>>>> + *               packet to create the reference. If this is specified as

>>>> + *               ODP_PACKET_INVALID, this call is equivalent to

>>>> + *               odp_packet_ref(). The supplied hdr must be distinct from

>>>> + *               the base pkt and results are undefined if an attempt is made

>>>> + *               to create circular references.

>>>> + *

>>>> + * @return       Handle of the reference packet. This may or may not be the

>>>> + *               same as the input hdr packet. The caller should only use this

>>>> + *               handle since the original hdr packet no longer has any

>>>> + *               independent existence.

>>>> + *

>>>> + * @retval ODP_PACKET_INVALID If the reference failed. In this case both pkt

>>>> + *                            and hdr are unchanged.

>>>> + */

>>>> +odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,

>>>> +                               odp_packet_t hdr);

>>>> +

>>>> +/**

>>>> + * Test if a packet is a reference

>>>> + *

>>>> + * A packet is a reference if it was created by odp_packet_ref() or

>>>> + * odp_packet_ref_pkt().  Note that a reference created from another

>>>> + * reference is considered a compound reference. Calls on such packets will

>>>> + * result in return values greater than 1.

>>>> + *

>>>> + * @param pkt Packet handle

>>>> + *

>>>> + * @retval 0  Packet is not a reference

>>>> + * @retval >0 Packet is a reference consisting of N individual packets.

>>>> + */

>>>> +int odp_packet_is_ref(odp_packet_t pkt);

>>>

>>> It is better to keep the return value as 0 or 1. Is it expected to

>>> return the number of references?

>>> The total number of references created is not interesting and also it

>>> is not so easy since references are created at segment level as per

>>> this proposal.

>>> Application will have to call odp_packet_free() for all the packet

>>> handle it is holding.

>>

>> Since any implementation supporting references needs to have some

>> internal notion of a reference count this is just attempting to expose

>> that in an implementation-independent manner. This is also why there

>> is both the odp_packet_is_ref() and odp_packet_has_ref() APIs defined.

>

> Again this depends on the implementation, based on the proposal here

> the reference count should be maintained per segment since the

> reference is being created using an offset, different segments of the

> packets could have different references.

> My concern is this count is not useful since application already has

> packet handles per reference and it has to free all the handles which

> was populated using reference APIs.


No, a reference count is logically on a per-packet basis because as
noted earlier the entire packet should be treated as read only if any
reference to it exists. If implementations want to use per-segment
reference counts for internal convenience that's fine, but the API is
referring to per-packet references. Remember that applications need to
be aware of the existence of segments but ODP APIs are designed to
manipulate packets, not segments, since it's assumed that the
existence of segments is for the implementation's convenience, not the
application's. Yes, we do have various odp_packet_seg_xxx() APIs but
they're a bit of a sidecar to the other packet APIs and it's unclear
how applications would use these in any generally portable manner.

>

>>

>>>

>>>> +

>>>> +/**

>>>> + * Test if a packet has a reference

>>>> + *

>>>> + * A packet has a reference if a reference was created on it by

>>>> + * odp_packet_ref_static(), odp_packet_ref(), or odp_packet_ref_pkt().

>>>> + *

>>>> + * @param pkt Packet handle

>>>> + *

>>>> + * @retval 0  Packet does not have any references

>>>> + * @retval >0 Packet has N references based on it

>>>> + */

>>>> +int odp_packet_has_ref(odp_packet_t pkt);

>>>

>>> What is the difference between odp_packet_has_ref() and

>>> odp_packet_is_ref()? IMO Once a reference is created there is no

>>> difference between the base packet and reference.

>>

>> Not true. This is discussed (with diagrams) in the User Guide updates

>> associated with this patch series.

>>

>> Consider the call:

>>

>> pkt_a = odp_packet_ref(pkt_b, 0);

>>

>> After this call the following conditions hold:

>>

>> odp_packet_is_ref(pkt_a) == 1

>> odp_packet_is_ref(pkt_b) == 0

>> odp_packet_has_ref(pkt_a) == 0

>> odp_packet_has_ref(pkt_b) == 1

>>

>> Now consider the more complex case:

>>

>> pkt_a = odp_packet_ref(pkt_b, offset1);

>> pkt_c = odp_packet_ref(pkt_b, offset2);

>> pkt_d = odp_packet_ref(pkt_a, offset3);

>>

>> Now we have:

>>

>> odp_packet_is_ref(pkt_a) == 1

>> odp_packet_is_ref(pkt_b) == 0

>> odp_packet_is_ref(pkt_c) == 1

>> odp_packet_is_ref(pkt_d) == 2

>> odp_packet_has_ref(pkt_a) == 1

>> odp_packet_has_ref(pkt_b) == 3

>> odp_packet_has_ref(pkt_c) == 0

>> odp_packet_has_ref(pkt_d) == 0

>>

>> Essentially, odp_packet_is_ref() answers the question "How many other

>> packets are referenced via this handle"? While odp_packet_has_ref()

>> answers the question "How many other handles can be used to reference

>> this packet"?

>

> This information requires lots of unnecessary computation at the

> implementation level, references are created by linking multiple

> segments together from application point of view there is no

> difference between a base packet and a reference both have few shared

> segments and have few unique segments. I understand the definition of

> this API, my query is the use-case for returning this number of

> has_ref and is_ref()?


No that's not true since each packet (whether it's a reference or a
base packet) has its own metadata (including headroom, user area,
etc.). One of the reasons we introduced the odp_packet_ref_static()
API is to allow implementations to further optimize this and create
the type of references you refer to here when applications assert that
they're going to treat the reference as entirely read only (and hence
can share metadata with the base packet). You cannot just link
segments together and have a valid packet reference since that doesn't
cover the unique metadata like headroom, etc.

Given the requirement for unique metadata on a per-reference basis, I
don't see the implementation of these APIs as problematic. The
odp-linux implementation shows just how simple they really are.

>

> Regards,

> Bala


Thank you! This is a great dialog.

>>

>>>

>>> Regards,

>>> Bala

>>>> +

>>>> +/*

>>>> + *

>>>>   * Copy

>>>>   * ********************************************************

>>>>   *

>>>> --

>>>> 2.7.4

>>>>
Balasubramanian Manoharan Dec. 20, 2016, 2:26 p.m. UTC | #9
On 20 December 2016 at 18:26, Bill Fischofer <bill.fischofer@linaro.org> wrote:
> On Tue, Dec 20, 2016 at 5:59 AM, Bala Manoharan

> <bala.manoharan@linaro.org> wrote:

>> Regards,

>> Bala

>>

>>

>> On 19 December 2016 at 20:23, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>> On Mon, Dec 19, 2016 at 4:06 AM, Bala Manoharan

>>> <bala.manoharan@linaro.org> wrote:

>>>> Comments inline....

>>>>

>>>>

>>>> On 15 November 2016 at 20:14, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>>> Introduce three new APIs that support efficient sharing of portions of

>>>>> packets.

>>>>>

>>>>> odp_packet_ref_static() creates an alias for a base packet

>>>>>

>>>>> odp_packet_ref() creates a reference to a base packet

>>>>>

>>>>> odp_packet_ref_pkt() creates a reference to a base packet from a supplied

>>>>> header packet

>>>>>

>>>>> In addition, three other APIs simplify working with references

>>>>>

>>>>> odp_packet_is_ref() says whether a packet is a reference

>>>>>

>>>>> odp_packet_has_ref() says whether a packet has had a reference made to it

>>>>>

>>>>> odp_packet_unshared_len() gives the length of the prefix bytes that are

>>>>> unique to this reference. These are the only bytes of the packet that may

>>>>> be modified safely.

>>>>>

>>>>> Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

>>>>> ---

>>>>>  include/odp/api/spec/packet.h | 168 ++++++++++++++++++++++++++++++++++++++++++

>>>>>  1 file changed, 168 insertions(+)

>>>>>

>>>>> diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h

>>>>> index faf62e2..137024f 100644

>>>>> --- a/include/odp/api/spec/packet.h

>>>>> +++ b/include/odp/api/spec/packet.h

>>>>> @@ -256,6 +256,23 @@ uint32_t odp_packet_seg_len(odp_packet_t pkt);

>>>>>  uint32_t odp_packet_len(odp_packet_t pkt);

>>>>>

>>>>>  /**

>>>>> + * Packet unshared data len

>>>>> + *

>>>>> + * Returns the sum of data lengths over all unshared packet segments. These

>>>>> + * are the only bytes that should be modified by the caller. The rest of the

>>>>> + * packet should be treated as read-only.

>>>>> + *

>>>>> + * This routine will return 0 if odp_packet_has_ref() != 0 and will be the

>>>>> + * same as odp_packet_len() if odp_packet_has_ref() == 0 and

>>>>> + * odp_packet_is_ref() == 0.

>>>>> + *

>>>>> + * @param pkt Packet handle

>>>>> + *

>>>>> + * @return Packet unshared data length

>>>>> + */

>>>>> +uint32_t odp_packet_unshared_len(odp_packet_t pkt);

>>>>> +

>>>>> +/**

>>>>>   * Packet headroom length

>>>>>   *

>>>>>   * Returns the current packet level headroom length.

>>>>> @@ -847,6 +864,157 @@ int odp_packet_split(odp_packet_t *pkt, uint32_t len, odp_packet_t *tail);

>>>>>

>>>>>  /*

>>>>>   *

>>>>> + * References

>>>>> + * ********************************************************

>>>>> + *

>>>>> + */

>>>>> +

>>>>> +/**

>>>>> + * Create a static reference to a packet

>>>>> + *

>>>>> + * A static reference is used to obtain an additional handle for referring to

>>>>> + * a packet so that the storage behind it is not freed until all references to

>>>>> + * the packet are freed. This can be used, for example, to support efficient

>>>>> + * retransmission processing.

>>>>> + *

>>>>> + * The intent of a static reference is that both the base packet and the

>>>>> + * returned reference will be treated as read-only after this call. Results

>>>>> + * are unpredictable if this restriction is not observed.

>>>>> + *

>>>>> + * Static references have restrictions but may have performance advantages on

>>>>> + * some platforms if the caller does not intend to modify the reference

>>>>> + * packet. If modification is needed (e.g., to prefix a unique header onto the

>>>>> + * packet) then odp_packet_ref() or odp_packet_ref_pkt() should be used.

>>>>> + *

>>>>> + * @param pkt    Handle of the base packet for which a static reference is

>>>>> + *               to be created.

>>>>> + *

>>>>> + * @return                    Handle of the static reference packet

>>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>>> + */

>>>>> +odp_packet_t odp_packet_ref_static(odp_packet_t pkt);

>>>>

>>>> It is better to change the API signature to return the updated handle

>>>> of the base packet also to the application.

>>>> Similar to the API for odp_packet_extend_head().

>>>

>>> I think returning the packet ref directly rather than indirectly makes

>>> for easier coding on the part of applications. Failure is indicated by

>>> returning ODP_PACKET_INVALID. So in this sense we're like

>>> odp_packet_alloc(). The alternative:

>>>

>>> int odp_packet_ref_static(odp_packet_t pkt, odp_packet_t *refpkt);

>>>

>>> seems cumbersome since if RC != 0 then the refpkt return would be meaningless.

>>>

>>>>

>>>>> +

>>>>> +/**

>>>>> + * Create a reference to a packet

>>>>> + *

>>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>>> + * offset. This reference may be used as an argument to header manipulation

>>>>> + * APIs to prefix a unique header onto the shared base. The storage associated

>>>>> + * with the base packet is not freed until all references to it are freed,

>>>>> + * which permits easy multicasting or retransmission processing to be

>>>>> + * performed. Following a successful call, the base packet should be treated

>>>>> + * as read-only. Results are unpredictable if this restriction is not

>>>>> + * observed.

>>>>> + *

>>>>> + * This operation prepends a zero-length header onto the base packet beginning

>>>>> + * at the specified offset. This header is always drawn from the same pool as

>>>>> + * the base packet.

>>>>> + *

>>>>> + * Because references are unique packets, any bytes pushed onto the head of a

>>>>> + * reference via odp_packet_push_head() or odp_packet_extend_head() are unique

>>>>> + * to this reference and are not seen by the base packet or by any other

>>>>> + * reference to the same base packet.

>>>>> + *

>>>>> + * The base packet used as input to this routine may itself by a reference to

>>>>> + * some other base packet. Implementations MAY restrict the ability to create

>>>>> + * such compound references. Attempts to exceed any implementation limits in

>>>>> + * this regard will result in this call failing and returning

>>>>> + * ODP_PACKET_INVALID.

>>>>> + *

>>>>> + * If the caller does not indend to push a header onto the returned reference,

>>>>> + * the odp_packet_ref_static() API may be used. This may be a more efficient

>>>>> + * means of obtaining another reference to a base packet that will be treated

>>>>> + * as read-only.

>>>>> + *

>>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>>> + *               created.

>>>>> + *

>>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>>> + *               is to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>>> + *

>>>>> + * @return                    Handle of the reference packet

>>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>>> + *

>>>>> +

>>>>> + */

>>>>> +odp_packet_t odp_packet_ref(odp_packet_t pkt, uint32_t offset);

>>>>> +

>>>>> +/**

>>>>> + * Create a reference to another packet from a header packet

>>>>> + *

>>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>>> + * offset by prepending a supplied header packet. The resulting reference

>>>>> + * consists of the unshared header followed by the shared base packet starting

>>>>> + * at the specified offset. This shared portion should be regarded as

>>>>> + * read-only. The storage associated with the shared portion of the reference

>>>>> + * is not freed until all references to it are freed, which permits easy

>>>>> + * multicasting or retransmission processing to be performed.

>>>>

>>>> My concerns with this API is what happens when application creates

>>>> multiple references from multiple offsets of the base packet. In that

>>>> scenario the read-only status of the shared portion could not be

>>>> maintained since if different references gives different offset there

>>>> will be more than one shared portion.

>>>>

>>>

>>> Why would this be a problem? We're relying on an "honor system" here

>>> since there is no defined enforcement mechanism. The rule is that you

>>> should only modify the unshared portion of any packet and results are

>>> undefined if this rule is ignored. That's why we have the

>>> odp_packet_unshared_len() as well as the odp_packet_has_ref() APIs to

>>> help applications know whether they should or should not attempt to

>>> modify a packet given it's handle.

>>

>> Let us consider an example,

>>

>> pkt_a = odp_packet_ref(pkt_base, 200);

>> -----

>> -----

>> -----

>> pkt_b = odp_packet_ref(pkt_base, 100);

>>

>> In the above case, the read-only portion of pkt_base was from 200 to

>> end-of-packet during creation of first reference and it is moved to

>> 100 to end-of-packet during creation of second reference so all the

>> segment handle of pkt_base previously owned by the application becomes

>> invalid.

>>

>

> No, in the above examples the entirety of pkt_base should be treated

> as read-only from the moment odp_packet_has_ref(pkt_base) > 0 until

> odp_packet_has_ref(pkt_base) == 0. That's the simplest, cleanest, and

> most portable rule. During this time odp_packet_unshared_len(pkt_base)

> will return 0. I don't see a use case for any other interpretation,

> and any other interpretation is going to encounter a lot of

> portability issues. Again, this is an "honor system". It's up to the

> application to observe this convention since we don't want to specify

> that the implementation has to somehow make pkt_base read only. All

> ODP says is results are undefined if applications do not observe this

> rule.

>

> Note that for both pkt_a and pkt_b odp_packet_has_ref() is 0 while

> odp_packet_is_ref() will be 1. That means that pkt_a and pkt_b can be

> modified. However since after the odp_packet_ref() call

> odp_packet_unshared_len() is 0 for both, that means that the only way

> they should be modified would be to call odp_packet_push_head() or

> odp_packet_extend_head() to put a unique prefix on the shared payload.

> So if we call odp_packet_push_head(pkt_a, 64);, then

> odp_packet_unshare_len(pkt_a) == 64, etc.


I do not see the need why the entire packet should be marked as
read-only. If you mark the entire base packet as read-only then you
can not do any header modification for the base packet (ie. tcp or udp
checksum update) which will make the packet entirely use-less for
transmission. you already have an API for unshared length which could
be used to update the unique sections of the base packet.

Also what happens when you free the base packet after creating the
reference? I believe it should not cause any issue.
Contrary to this, we can also restrict the number of offset for the
base packet to 1, so the above conflict do not arise.

>

>>>

>>>>> + *

>>>>> + * Either packet input to this routine may itself be a reference, however

>>>>> + * individual implementations MAY impose limits on how deeply splices may be

>>>>> + * nested and fail the call if those limits are exceeded.

>>>>> + *

>>>>> + * The packets used as input to this routine may reside in different pools,

>>>>> + * however individual implementations MAY require that both reside in the same

>>>>> + * pool and fail the call if this restriction is not observed.

>>>>

>>>> Not sure if there is a requirement to support reference with packets

>>>> from multiple pools.

>>>

>>> That's why this is a MAY. We could expose this as a capability or

>>> simply state that this is not supported, however some implementations

>>> may be able to support this (e.g., odp-linux has no need to make this

>>> restriction) and I could see how it could be useful to have header

>>> templates in their own pool for storage management purposes.

>>

>> odp-linux is a special case since it does not use any HW pool manager,

>> most platforms using HW pool manager might not be able to create a

>> packet with segments from multiple pools since it will be difficult to

>> return the segments to respective pools during odp_packet_free().

>> packet pool is a limited resource in a system and it might not be

>> advisable to use a separate pool for header template.

>

> That's why this is a MAY. Implementation that don't support this would

> fail the odp_packet_ref_pkt() call by returning ODP_PACKET_INVALID if

> the header and packet to be referenced reside in different pools. This

> is no different than an odp_packet_alloc() call failing because the

> pool is out of memory, etc. Applications are expected to be able to

> deal with these sort of situations.

>

> Again, if we want we can simply say that both packets must come from

> the same pool and leave it at that. I have no problem changing that if

> people feel that a more uniform interpretation is desirable. Note

> however that many of the other packet routines (odp_packet_copy(),

> odp_packet_concat()) support multi-pool operation.


During packet_copy and packet_concat() there is a need to create two
completely independent packets after the call there is no shared data
between the packets and you need to copy the packet from "src" to
"dst" but that is not the case for reference creation this could run
in the fast path and should be done as efficient as possible.

>

>>

>>>

>>>>

>>>>> + *

>>>>> + * odp_packet_pool() on the returned reference returns the pool id of the

>>>>> + * header packet.

>>>>> + *

>>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>>> + *               created.

>>>>> + *

>>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>>> + *               to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>>> + *

>>>>> + * @param hdr    Handle of the header packet to be prefixed onto the base

>>>>> + *               packet to create the reference. If this is specified as

>>>>> + *               ODP_PACKET_INVALID, this call is equivalent to

>>>>> + *               odp_packet_ref(). The supplied hdr must be distinct from

>>>>> + *               the base pkt and results are undefined if an attempt is made

>>>>> + *               to create circular references.

>>>>> + *

>>>>> + * @return       Handle of the reference packet. This may or may not be the

>>>>> + *               same as the input hdr packet. The caller should only use this

>>>>> + *               handle since the original hdr packet no longer has any

>>>>> + *               independent existence.

>>>>> + *

>>>>> + * @retval ODP_PACKET_INVALID If the reference failed. In this case both pkt

>>>>> + *                            and hdr are unchanged.

>>>>> + */

>>>>> +odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,

>>>>> +                               odp_packet_t hdr);

>>>>> +

>>>>> +/**

>>>>> + * Test if a packet is a reference

>>>>> + *

>>>>> + * A packet is a reference if it was created by odp_packet_ref() or

>>>>> + * odp_packet_ref_pkt().  Note that a reference created from another

>>>>> + * reference is considered a compound reference. Calls on such packets will

>>>>> + * result in return values greater than 1.

>>>>> + *

>>>>> + * @param pkt Packet handle

>>>>> + *

>>>>> + * @retval 0  Packet is not a reference

>>>>> + * @retval >0 Packet is a reference consisting of N individual packets.

>>>>> + */

>>>>> +int odp_packet_is_ref(odp_packet_t pkt);

>>>>

>>>> It is better to keep the return value as 0 or 1. Is it expected to

>>>> return the number of references?

>>>> The total number of references created is not interesting and also it

>>>> is not so easy since references are created at segment level as per

>>>> this proposal.

>>>> Application will have to call odp_packet_free() for all the packet

>>>> handle it is holding.

>>>

>>> Since any implementation supporting references needs to have some

>>> internal notion of a reference count this is just attempting to expose

>>> that in an implementation-independent manner. This is also why there

>>> is both the odp_packet_is_ref() and odp_packet_has_ref() APIs defined.

>>

>> Again this depends on the implementation, based on the proposal here

>> the reference count should be maintained per segment since the

>> reference is being created using an offset, different segments of the

>> packets could have different references.

>> My concern is this count is not useful since application already has

>> packet handles per reference and it has to free all the handles which

>> was populated using reference APIs.

>

> No, a reference count is logically on a per-packet basis because as

> noted earlier the entire packet should be treated as read only if any

> reference to it exists. If implementations want to use per-segment

> reference counts for internal convenience that's fine, but the API is

> referring to per-packet references. Remember that applications need to

> be aware of the existence of segments but ODP APIs are designed to

> manipulate packets, not segments, since it's assumed that the

> existence of segments is for the implementation's convenience, not the

> application's. Yes, we do have various odp_packet_seg_xxx() APIs but

> they're a bit of a sidecar to the other packet APIs and it's unclear

> how applications would use these in any generally portable manner.


Again this is not true. We seem to contradict each other heavily :)
If you want to have an efficient implementation this reference
creation should be done with zero-copy or as limited coping as
possible which would be achieved only by reusing common segments.
IMO there is no use in getting the number of reference which this
packet holds if you have any valid use-case then we can add this
feature.
The main concern for me is that this requires additional per-packet
meta-data storage.

>

>>

>>>

>>>>

>>>>> +

>>>>> +/**

>>>>> + * Test if a packet has a reference

>>>>> + *

>>>>> + * A packet has a reference if a reference was created on it by

>>>>> + * odp_packet_ref_static(), odp_packet_ref(), or odp_packet_ref_pkt().

>>>>> + *

>>>>> + * @param pkt Packet handle

>>>>> + *

>>>>> + * @retval 0  Packet does not have any references

>>>>> + * @retval >0 Packet has N references based on it

>>>>> + */

>>>>> +int odp_packet_has_ref(odp_packet_t pkt);

>>>>

>>>> What is the difference between odp_packet_has_ref() and

>>>> odp_packet_is_ref()? IMO Once a reference is created there is no

>>>> difference between the base packet and reference.

>>>

>>> Not true. This is discussed (with diagrams) in the User Guide updates

>>> associated with this patch series.

>>>

>>> Consider the call:

>>>

>>> pkt_a = odp_packet_ref(pkt_b, 0);

>>>

>>> After this call the following conditions hold:

>>>

>>> odp_packet_is_ref(pkt_a) == 1

>>> odp_packet_is_ref(pkt_b) == 0

>>> odp_packet_has_ref(pkt_a) == 0

>>> odp_packet_has_ref(pkt_b) == 1

>>>

>>> Now consider the more complex case:

>>>

>>> pkt_a = odp_packet_ref(pkt_b, offset1);

>>> pkt_c = odp_packet_ref(pkt_b, offset2);

>>> pkt_d = odp_packet_ref(pkt_a, offset3);

>>>

>>> Now we have:

>>>

>>> odp_packet_is_ref(pkt_a) == 1

>>> odp_packet_is_ref(pkt_b) == 0

>>> odp_packet_is_ref(pkt_c) == 1

>>> odp_packet_is_ref(pkt_d) == 2

>>> odp_packet_has_ref(pkt_a) == 1

>>> odp_packet_has_ref(pkt_b) == 3

>>> odp_packet_has_ref(pkt_c) == 0

>>> odp_packet_has_ref(pkt_d) == 0

>>>

>>> Essentially, odp_packet_is_ref() answers the question "How many other

>>> packets are referenced via this handle"? While odp_packet_has_ref()

>>> answers the question "How many other handles can be used to reference

>>> this packet"?

>>

>> This information requires lots of unnecessary computation at the

>> implementation level, references are created by linking multiple

>> segments together from application point of view there is no

>> difference between a base packet and a reference both have few shared

>> segments and have few unique segments. I understand the definition of

>> this API, my query is the use-case for returning this number of

>> has_ref and is_ref()?

>

> No that's not true since each packet (whether it's a reference or a

> base packet) has its own metadata (including headroom, user area,

> etc.). One of the reasons we introduced the odp_packet_ref_static()

> API is to allow implementations to further optimize this and create

> the type of references you refer to here when applications assert that

> they're going to treat the reference as entirely read only (and hence

> can share metadata with the base packet). You cannot just link

> segments together and have a valid packet reference since that doesn't

> cover the unique metadata like headroom, etc.


For creating packet reference only header meta-data needs to be unique
but the segments could be shared even in odp_packet_ref() API.
It is better to share as many segments as possible since copying data
will not be as efficient as linking multiple segments together.

>

> Given the requirement for unique metadata on a per-reference basis, I

> don't see the implementation of these APIs as problematic. The

> odp-linux implementation shows just how simple they really are.


odp-linux does not have any constraint on the per-packet meta-data but
it is not true for all implementations, I can agree to add this value
in meta-data if there is any valid use-case for this number if not
then I would prefer not to waste packet meta-data unnecessarily.

Regards,
Bala
>

>>

>> Regards,

>> Bala

>

> Thank you! This is a great dialog.

>

>>>

>>>>

>>>> Regards,

>>>> Bala

>>>>> +

>>>>> +/*

>>>>> + *

>>>>>   * Copy

>>>>>   * ********************************************************

>>>>>   *

>>>>> --

>>>>> 2.7.4

>>>>>
Bill Fischofer Dec. 20, 2016, 5:05 p.m. UTC | #10
On Tue, Dec 20, 2016 at 8:26 AM, Bala Manoharan
<bala.manoharan@linaro.org> wrote:
> On 20 December 2016 at 18:26, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>> On Tue, Dec 20, 2016 at 5:59 AM, Bala Manoharan

>> <bala.manoharan@linaro.org> wrote:

>>> Regards,

>>> Bala

>>>

>>>

>>> On 19 December 2016 at 20:23, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>> On Mon, Dec 19, 2016 at 4:06 AM, Bala Manoharan

>>>> <bala.manoharan@linaro.org> wrote:

>>>>> Comments inline....

>>>>>

>>>>>

>>>>> On 15 November 2016 at 20:14, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>>>> Introduce three new APIs that support efficient sharing of portions of

>>>>>> packets.

>>>>>>

>>>>>> odp_packet_ref_static() creates an alias for a base packet

>>>>>>

>>>>>> odp_packet_ref() creates a reference to a base packet

>>>>>>

>>>>>> odp_packet_ref_pkt() creates a reference to a base packet from a supplied

>>>>>> header packet

>>>>>>

>>>>>> In addition, three other APIs simplify working with references

>>>>>>

>>>>>> odp_packet_is_ref() says whether a packet is a reference

>>>>>>

>>>>>> odp_packet_has_ref() says whether a packet has had a reference made to it

>>>>>>

>>>>>> odp_packet_unshared_len() gives the length of the prefix bytes that are

>>>>>> unique to this reference. These are the only bytes of the packet that may

>>>>>> be modified safely.

>>>>>>

>>>>>> Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

>>>>>> ---

>>>>>>  include/odp/api/spec/packet.h | 168 ++++++++++++++++++++++++++++++++++++++++++

>>>>>>  1 file changed, 168 insertions(+)

>>>>>>

>>>>>> diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h

>>>>>> index faf62e2..137024f 100644

>>>>>> --- a/include/odp/api/spec/packet.h

>>>>>> +++ b/include/odp/api/spec/packet.h

>>>>>> @@ -256,6 +256,23 @@ uint32_t odp_packet_seg_len(odp_packet_t pkt);

>>>>>>  uint32_t odp_packet_len(odp_packet_t pkt);

>>>>>>

>>>>>>  /**

>>>>>> + * Packet unshared data len

>>>>>> + *

>>>>>> + * Returns the sum of data lengths over all unshared packet segments. These

>>>>>> + * are the only bytes that should be modified by the caller. The rest of the

>>>>>> + * packet should be treated as read-only.

>>>>>> + *

>>>>>> + * This routine will return 0 if odp_packet_has_ref() != 0 and will be the

>>>>>> + * same as odp_packet_len() if odp_packet_has_ref() == 0 and

>>>>>> + * odp_packet_is_ref() == 0.

>>>>>> + *

>>>>>> + * @param pkt Packet handle

>>>>>> + *

>>>>>> + * @return Packet unshared data length

>>>>>> + */

>>>>>> +uint32_t odp_packet_unshared_len(odp_packet_t pkt);

>>>>>> +

>>>>>> +/**

>>>>>>   * Packet headroom length

>>>>>>   *

>>>>>>   * Returns the current packet level headroom length.

>>>>>> @@ -847,6 +864,157 @@ int odp_packet_split(odp_packet_t *pkt, uint32_t len, odp_packet_t *tail);

>>>>>>

>>>>>>  /*

>>>>>>   *

>>>>>> + * References

>>>>>> + * ********************************************************

>>>>>> + *

>>>>>> + */

>>>>>> +

>>>>>> +/**

>>>>>> + * Create a static reference to a packet

>>>>>> + *

>>>>>> + * A static reference is used to obtain an additional handle for referring to

>>>>>> + * a packet so that the storage behind it is not freed until all references to

>>>>>> + * the packet are freed. This can be used, for example, to support efficient

>>>>>> + * retransmission processing.

>>>>>> + *

>>>>>> + * The intent of a static reference is that both the base packet and the

>>>>>> + * returned reference will be treated as read-only after this call. Results

>>>>>> + * are unpredictable if this restriction is not observed.

>>>>>> + *

>>>>>> + * Static references have restrictions but may have performance advantages on

>>>>>> + * some platforms if the caller does not intend to modify the reference

>>>>>> + * packet. If modification is needed (e.g., to prefix a unique header onto the

>>>>>> + * packet) then odp_packet_ref() or odp_packet_ref_pkt() should be used.

>>>>>> + *

>>>>>> + * @param pkt    Handle of the base packet for which a static reference is

>>>>>> + *               to be created.

>>>>>> + *

>>>>>> + * @return                    Handle of the static reference packet

>>>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>>>> + */

>>>>>> +odp_packet_t odp_packet_ref_static(odp_packet_t pkt);

>>>>>

>>>>> It is better to change the API signature to return the updated handle

>>>>> of the base packet also to the application.

>>>>> Similar to the API for odp_packet_extend_head().

>>>>

>>>> I think returning the packet ref directly rather than indirectly makes

>>>> for easier coding on the part of applications. Failure is indicated by

>>>> returning ODP_PACKET_INVALID. So in this sense we're like

>>>> odp_packet_alloc(). The alternative:

>>>>

>>>> int odp_packet_ref_static(odp_packet_t pkt, odp_packet_t *refpkt);

>>>>

>>>> seems cumbersome since if RC != 0 then the refpkt return would be meaningless.

>>>>

>>>>>

>>>>>> +

>>>>>> +/**

>>>>>> + * Create a reference to a packet

>>>>>> + *

>>>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>>>> + * offset. This reference may be used as an argument to header manipulation

>>>>>> + * APIs to prefix a unique header onto the shared base. The storage associated

>>>>>> + * with the base packet is not freed until all references to it are freed,

>>>>>> + * which permits easy multicasting or retransmission processing to be

>>>>>> + * performed. Following a successful call, the base packet should be treated

>>>>>> + * as read-only. Results are unpredictable if this restriction is not

>>>>>> + * observed.

>>>>>> + *

>>>>>> + * This operation prepends a zero-length header onto the base packet beginning

>>>>>> + * at the specified offset. This header is always drawn from the same pool as

>>>>>> + * the base packet.

>>>>>> + *

>>>>>> + * Because references are unique packets, any bytes pushed onto the head of a

>>>>>> + * reference via odp_packet_push_head() or odp_packet_extend_head() are unique

>>>>>> + * to this reference and are not seen by the base packet or by any other

>>>>>> + * reference to the same base packet.

>>>>>> + *

>>>>>> + * The base packet used as input to this routine may itself by a reference to

>>>>>> + * some other base packet. Implementations MAY restrict the ability to create

>>>>>> + * such compound references. Attempts to exceed any implementation limits in

>>>>>> + * this regard will result in this call failing and returning

>>>>>> + * ODP_PACKET_INVALID.

>>>>>> + *

>>>>>> + * If the caller does not indend to push a header onto the returned reference,

>>>>>> + * the odp_packet_ref_static() API may be used. This may be a more efficient

>>>>>> + * means of obtaining another reference to a base packet that will be treated

>>>>>> + * as read-only.

>>>>>> + *

>>>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>>>> + *               created.

>>>>>> + *

>>>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>>>> + *               is to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>>>> + *

>>>>>> + * @return                    Handle of the reference packet

>>>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>>>> + *

>>>>>> +

>>>>>> + */

>>>>>> +odp_packet_t odp_packet_ref(odp_packet_t pkt, uint32_t offset);

>>>>>> +

>>>>>> +/**

>>>>>> + * Create a reference to another packet from a header packet

>>>>>> + *

>>>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>>>> + * offset by prepending a supplied header packet. The resulting reference

>>>>>> + * consists of the unshared header followed by the shared base packet starting

>>>>>> + * at the specified offset. This shared portion should be regarded as

>>>>>> + * read-only. The storage associated with the shared portion of the reference

>>>>>> + * is not freed until all references to it are freed, which permits easy

>>>>>> + * multicasting or retransmission processing to be performed.

>>>>>

>>>>> My concerns with this API is what happens when application creates

>>>>> multiple references from multiple offsets of the base packet. In that

>>>>> scenario the read-only status of the shared portion could not be

>>>>> maintained since if different references gives different offset there

>>>>> will be more than one shared portion.

>>>>>

>>>>

>>>> Why would this be a problem? We're relying on an "honor system" here

>>>> since there is no defined enforcement mechanism. The rule is that you

>>>> should only modify the unshared portion of any packet and results are

>>>> undefined if this rule is ignored. That's why we have the

>>>> odp_packet_unshared_len() as well as the odp_packet_has_ref() APIs to

>>>> help applications know whether they should or should not attempt to

>>>> modify a packet given it's handle.

>>>

>>> Let us consider an example,

>>>

>>> pkt_a = odp_packet_ref(pkt_base, 200);

>>> -----

>>> -----

>>> -----

>>> pkt_b = odp_packet_ref(pkt_base, 100);

>>>

>>> In the above case, the read-only portion of pkt_base was from 200 to

>>> end-of-packet during creation of first reference and it is moved to

>>> 100 to end-of-packet during creation of second reference so all the

>>> segment handle of pkt_base previously owned by the application becomes

>>> invalid.

>>>

>>

>> No, in the above examples the entirety of pkt_base should be treated

>> as read-only from the moment odp_packet_has_ref(pkt_base) > 0 until

>> odp_packet_has_ref(pkt_base) == 0. That's the simplest, cleanest, and

>> most portable rule. During this time odp_packet_unshared_len(pkt_base)

>> will return 0. I don't see a use case for any other interpretation,

>> and any other interpretation is going to encounter a lot of

>> portability issues. Again, this is an "honor system". It's up to the

>> application to observe this convention since we don't want to specify

>> that the implementation has to somehow make pkt_base read only. All

>> ODP says is results are undefined if applications do not observe this

>> rule.

>>

>> Note that for both pkt_a and pkt_b odp_packet_has_ref() is 0 while

>> odp_packet_is_ref() will be 1. That means that pkt_a and pkt_b can be

>> modified. However since after the odp_packet_ref() call

>> odp_packet_unshared_len() is 0 for both, that means that the only way

>> they should be modified would be to call odp_packet_push_head() or

>> odp_packet_extend_head() to put a unique prefix on the shared payload.

>> So if we call odp_packet_push_head(pkt_a, 64);, then

>> odp_packet_unshare_len(pkt_a) == 64, etc.

>

> I do not see the need why the entire packet should be marked as

> read-only. If you mark the entire base packet as read-only then you

> can not do any header modification for the base packet (ie. tcp or udp

> checksum update) which will make the packet entirely use-less for

> transmission. you already have an API for unshared length which could

> be used to update the unique sections of the base packet.


It's not "marked". This is a user convention. Marking implies that the
implementation is taking some sort of enforcement steps, which is what
we're trying to avoid. You wouldn't modify the base packet in any way.
Instead you'd create as many references to it as you need and modify
the (unique) headers for each reference. This was the point we
discussed earlier. Under the v2 patch if you need 30 references you
create 30 references and deal with them individually. I agreed to
rework the patch so that you only need to create 29 references and can
use the base packet as a reference in its own right, however this will
incur additional overhead (at least in SW implementations). The
arguments for wanting this seem to be mostly aesthetic since the idea
is that references are cheap so you can create as many as you need
with little overhead. Creating one less but incurring extra overhead
in the process for each one due to the need to maintain an invariant
offset in the presence of a possibly-modifiable base packet seems to
me to be a poor tradeoff.

>

> Also what happens when you free the base packet after creating the

> reference? I believe it should not cause any issue.


There are no issues with this. It's implementation-defined whether all
of the segments of the base packet are retained or if only the shared
segments are retained. Many implementations will find it's simpler and
easier to just retain them all until all packet references have been
freed and then free everything. Trying to optimize frees on a
per-segment basis is likely to incur more overhead for limited gain
since packet lifespans in the data plane are expected to be short
duration while the packet is transiting the application.

> Contrary to this, we can also restrict the number of offset for the

> base packet to 1, so the above conflict do not arise.


I'm not sure I understand this point. For odp_packet_ref() the API is
defined as valid offsets are 0..odp_packet_len(base_pkt)-1. If the
implementation cannot accommodate this full range then it is always
free to fail the call by returning ODP_PACKET_INVALID. Applications
must always check the returned odp_packet_t since there is no
guarantee that this API call will always succeed.

>

>>

>>>>

>>>>>> + *

>>>>>> + * Either packet input to this routine may itself be a reference, however

>>>>>> + * individual implementations MAY impose limits on how deeply splices may be

>>>>>> + * nested and fail the call if those limits are exceeded.

>>>>>> + *

>>>>>> + * The packets used as input to this routine may reside in different pools,

>>>>>> + * however individual implementations MAY require that both reside in the same

>>>>>> + * pool and fail the call if this restriction is not observed.

>>>>>

>>>>> Not sure if there is a requirement to support reference with packets

>>>>> from multiple pools.

>>>>

>>>> That's why this is a MAY. We could expose this as a capability or

>>>> simply state that this is not supported, however some implementations

>>>> may be able to support this (e.g., odp-linux has no need to make this

>>>> restriction) and I could see how it could be useful to have header

>>>> templates in their own pool for storage management purposes.

>>>

>>> odp-linux is a special case since it does not use any HW pool manager,

>>> most platforms using HW pool manager might not be able to create a

>>> packet with segments from multiple pools since it will be difficult to

>>> return the segments to respective pools during odp_packet_free().

>>> packet pool is a limited resource in a system and it might not be

>>> advisable to use a separate pool for header template.

>>

>> That's why this is a MAY. Implementation that don't support this would

>> fail the odp_packet_ref_pkt() call by returning ODP_PACKET_INVALID if

>> the header and packet to be referenced reside in different pools. This

>> is no different than an odp_packet_alloc() call failing because the

>> pool is out of memory, etc. Applications are expected to be able to

>> deal with these sort of situations.

>>

>> Again, if we want we can simply say that both packets must come from

>> the same pool and leave it at that. I have no problem changing that if

>> people feel that a more uniform interpretation is desirable. Note

>> however that many of the other packet routines (odp_packet_copy(),

>> odp_packet_concat()) support multi-pool operation.

>

> During packet_copy and packet_concat() there is a need to create two

> completely independent packets after the call there is no shared data

> between the packets and you need to copy the packet from "src" to

> "dst" but that is not the case for reference creation this could run

> in the fast path and should be done as efficient as possible.


Agreed. The whole idea of references is to copy as little as possible
(ideally nothing). That's what odp-linux does. Having the convention
that the base packet is treated as read only was supposed to make this
simpler for implementations since it doesn't have to worry about the
referenced packet changing. For reasons I don't fully appreciate, this
convention makes life more difficult for your implementation?

>

>>

>>>

>>>>

>>>>>

>>>>>> + *

>>>>>> + * odp_packet_pool() on the returned reference returns the pool id of the

>>>>>> + * header packet.

>>>>>> + *

>>>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>>>> + *               created.

>>>>>> + *

>>>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>>>> + *               to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>>>> + *

>>>>>> + * @param hdr    Handle of the header packet to be prefixed onto the base

>>>>>> + *               packet to create the reference. If this is specified as

>>>>>> + *               ODP_PACKET_INVALID, this call is equivalent to

>>>>>> + *               odp_packet_ref(). The supplied hdr must be distinct from

>>>>>> + *               the base pkt and results are undefined if an attempt is made

>>>>>> + *               to create circular references.

>>>>>> + *

>>>>>> + * @return       Handle of the reference packet. This may or may not be the

>>>>>> + *               same as the input hdr packet. The caller should only use this

>>>>>> + *               handle since the original hdr packet no longer has any

>>>>>> + *               independent existence.

>>>>>> + *

>>>>>> + * @retval ODP_PACKET_INVALID If the reference failed. In this case both pkt

>>>>>> + *                            and hdr are unchanged.

>>>>>> + */

>>>>>> +odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,

>>>>>> +                               odp_packet_t hdr);

>>>>>> +

>>>>>> +/**

>>>>>> + * Test if a packet is a reference

>>>>>> + *

>>>>>> + * A packet is a reference if it was created by odp_packet_ref() or

>>>>>> + * odp_packet_ref_pkt().  Note that a reference created from another

>>>>>> + * reference is considered a compound reference. Calls on such packets will

>>>>>> + * result in return values greater than 1.

>>>>>> + *

>>>>>> + * @param pkt Packet handle

>>>>>> + *

>>>>>> + * @retval 0  Packet is not a reference

>>>>>> + * @retval >0 Packet is a reference consisting of N individual packets.

>>>>>> + */

>>>>>> +int odp_packet_is_ref(odp_packet_t pkt);

>>>>>

>>>>> It is better to keep the return value as 0 or 1. Is it expected to

>>>>> return the number of references?

>>>>> The total number of references created is not interesting and also it

>>>>> is not so easy since references are created at segment level as per

>>>>> this proposal.

>>>>> Application will have to call odp_packet_free() for all the packet

>>>>> handle it is holding.

>>>>

>>>> Since any implementation supporting references needs to have some

>>>> internal notion of a reference count this is just attempting to expose

>>>> that in an implementation-independent manner. This is also why there

>>>> is both the odp_packet_is_ref() and odp_packet_has_ref() APIs defined.

>>>

>>> Again this depends on the implementation, based on the proposal here

>>> the reference count should be maintained per segment since the

>>> reference is being created using an offset, different segments of the

>>> packets could have different references.

>>> My concern is this count is not useful since application already has

>>> packet handles per reference and it has to free all the handles which

>>> was populated using reference APIs.

>>

>> No, a reference count is logically on a per-packet basis because as

>> noted earlier the entire packet should be treated as read only if any

>> reference to it exists. If implementations want to use per-segment

>> reference counts for internal convenience that's fine, but the API is

>> referring to per-packet references. Remember that applications need to

>> be aware of the existence of segments but ODP APIs are designed to

>> manipulate packets, not segments, since it's assumed that the

>> existence of segments is for the implementation's convenience, not the

>> application's. Yes, we do have various odp_packet_seg_xxx() APIs but

>> they're a bit of a sidecar to the other packet APIs and it's unclear

>> how applications would use these in any generally portable manner.

>

> Again this is not true. We seem to contradict each other heavily :)

> If you want to have an efficient implementation this reference

> creation should be done with zero-copy or as limited coping as

> possible which would be achieved only by reusing common segments.

> IMO there is no use in getting the number of reference which this

> packet holds if you have any valid use-case then we can add this

> feature.

> The main concern for me is that this requires additional per-packet

> meta-data storage.


Well, you can see my implementation (which is pretty simple and
efficient) but I can't see yours so I don't fully understand the
problem you're trying to solve. If you'd like to have a private
hangout to explore this further let me know.


>

>>

>>>

>>>>

>>>>>

>>>>>> +

>>>>>> +/**

>>>>>> + * Test if a packet has a reference

>>>>>> + *

>>>>>> + * A packet has a reference if a reference was created on it by

>>>>>> + * odp_packet_ref_static(), odp_packet_ref(), or odp_packet_ref_pkt().

>>>>>> + *

>>>>>> + * @param pkt Packet handle

>>>>>> + *

>>>>>> + * @retval 0  Packet does not have any references

>>>>>> + * @retval >0 Packet has N references based on it

>>>>>> + */

>>>>>> +int odp_packet_has_ref(odp_packet_t pkt);

>>>>>

>>>>> What is the difference between odp_packet_has_ref() and

>>>>> odp_packet_is_ref()? IMO Once a reference is created there is no

>>>>> difference between the base packet and reference.

>>>>

>>>> Not true. This is discussed (with diagrams) in the User Guide updates

>>>> associated with this patch series.

>>>>

>>>> Consider the call:

>>>>

>>>> pkt_a = odp_packet_ref(pkt_b, 0);

>>>>

>>>> After this call the following conditions hold:

>>>>

>>>> odp_packet_is_ref(pkt_a) == 1

>>>> odp_packet_is_ref(pkt_b) == 0

>>>> odp_packet_has_ref(pkt_a) == 0

>>>> odp_packet_has_ref(pkt_b) == 1

>>>>

>>>> Now consider the more complex case:

>>>>

>>>> pkt_a = odp_packet_ref(pkt_b, offset1);

>>>> pkt_c = odp_packet_ref(pkt_b, offset2);

>>>> pkt_d = odp_packet_ref(pkt_a, offset3);

>>>>

>>>> Now we have:

>>>>

>>>> odp_packet_is_ref(pkt_a) == 1

>>>> odp_packet_is_ref(pkt_b) == 0

>>>> odp_packet_is_ref(pkt_c) == 1

>>>> odp_packet_is_ref(pkt_d) == 2

>>>> odp_packet_has_ref(pkt_a) == 1

>>>> odp_packet_has_ref(pkt_b) == 3

>>>> odp_packet_has_ref(pkt_c) == 0

>>>> odp_packet_has_ref(pkt_d) == 0

>>>>

>>>> Essentially, odp_packet_is_ref() answers the question "How many other

>>>> packets are referenced via this handle"? While odp_packet_has_ref()

>>>> answers the question "How many other handles can be used to reference

>>>> this packet"?

>>>

>>> This information requires lots of unnecessary computation at the

>>> implementation level, references are created by linking multiple

>>> segments together from application point of view there is no

>>> difference between a base packet and a reference both have few shared

>>> segments and have few unique segments. I understand the definition of

>>> this API, my query is the use-case for returning this number of

>>> has_ref and is_ref()?

>>

>> No that's not true since each packet (whether it's a reference or a

>> base packet) has its own metadata (including headroom, user area,

>> etc.). One of the reasons we introduced the odp_packet_ref_static()

>> API is to allow implementations to further optimize this and create

>> the type of references you refer to here when applications assert that

>> they're going to treat the reference as entirely read only (and hence

>> can share metadata with the base packet). You cannot just link

>> segments together and have a valid packet reference since that doesn't

>> cover the unique metadata like headroom, etc.

>

> For creating packet reference only header meta-data needs to be unique

> but the segments could be shared even in odp_packet_ref() API.

> It is better to share as many segments as possible since copying data

> will not be as efficient as linking multiple segments together.

>

>>

>> Given the requirement for unique metadata on a per-reference basis, I

>> don't see the implementation of these APIs as problematic. The

>> odp-linux implementation shows just how simple they really are.

>

> odp-linux does not have any constraint on the per-packet meta-data but

> it is not true for all implementations, I can agree to add this value

> in meta-data if there is any valid use-case for this number if not

> then I would prefer not to waste packet meta-data unnecessarily.

>

> Regards,

> Bala

>>

>>>

>>> Regards,

>>> Bala

>>

>> Thank you! This is a great dialog.

>>

>>>>

>>>>>

>>>>> Regards,

>>>>> Bala

>>>>>> +

>>>>>> +/*

>>>>>> + *

>>>>>>   * Copy

>>>>>>   * ********************************************************

>>>>>>   *

>>>>>> --

>>>>>> 2.7.4

>>>>>>
Balasubramanian Manoharan Dec. 21, 2016, 12:42 p.m. UTC | #11
Regards,
Bala


On 20 December 2016 at 22:35, Bill Fischofer <bill.fischofer@linaro.org> wrote:
> On Tue, Dec 20, 2016 at 8:26 AM, Bala Manoharan

> <bala.manoharan@linaro.org> wrote:

>> On 20 December 2016 at 18:26, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>> On Tue, Dec 20, 2016 at 5:59 AM, Bala Manoharan

>>> <bala.manoharan@linaro.org> wrote:

>>>> Regards,

>>>> Bala

>>>>

>>>>

>>>> On 19 December 2016 at 20:23, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>>> On Mon, Dec 19, 2016 at 4:06 AM, Bala Manoharan

>>>>> <bala.manoharan@linaro.org> wrote:

>>>>>> Comments inline....

>>>>>>

>>>>>>

>>>>>> On 15 November 2016 at 20:14, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>>>>> Introduce three new APIs that support efficient sharing of portions of

>>>>>>> packets.

>>>>>>>

>>>>>>> odp_packet_ref_static() creates an alias for a base packet

>>>>>>>

>>>>>>> odp_packet_ref() creates a reference to a base packet

>>>>>>>

>>>>>>> odp_packet_ref_pkt() creates a reference to a base packet from a supplied

>>>>>>> header packet

>>>>>>>

>>>>>>> In addition, three other APIs simplify working with references

>>>>>>>

>>>>>>> odp_packet_is_ref() says whether a packet is a reference

>>>>>>>

>>>>>>> odp_packet_has_ref() says whether a packet has had a reference made to it

>>>>>>>

>>>>>>> odp_packet_unshared_len() gives the length of the prefix bytes that are

>>>>>>> unique to this reference. These are the only bytes of the packet that may

>>>>>>> be modified safely.

>>>>>>>

>>>>>>> Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

>>>>>>> ---

>>>>>>>  include/odp/api/spec/packet.h | 168 ++++++++++++++++++++++++++++++++++++++++++

>>>>>>>  1 file changed, 168 insertions(+)

>>>>>>>

>>>>>>> diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h

>>>>>>> index faf62e2..137024f 100644

>>>>>>> --- a/include/odp/api/spec/packet.h

>>>>>>> +++ b/include/odp/api/spec/packet.h

>>>>>>> @@ -256,6 +256,23 @@ uint32_t odp_packet_seg_len(odp_packet_t pkt);

>>>>>>>  uint32_t odp_packet_len(odp_packet_t pkt);

>>>>>>>

>>>>>>>  /**

>>>>>>> + * Packet unshared data len

>>>>>>> + *

>>>>>>> + * Returns the sum of data lengths over all unshared packet segments. These

>>>>>>> + * are the only bytes that should be modified by the caller. The rest of the

>>>>>>> + * packet should be treated as read-only.

>>>>>>> + *

>>>>>>> + * This routine will return 0 if odp_packet_has_ref() != 0 and will be the

>>>>>>> + * same as odp_packet_len() if odp_packet_has_ref() == 0 and

>>>>>>> + * odp_packet_is_ref() == 0.

>>>>>>> + *

>>>>>>> + * @param pkt Packet handle

>>>>>>> + *

>>>>>>> + * @return Packet unshared data length

>>>>>>> + */

>>>>>>> +uint32_t odp_packet_unshared_len(odp_packet_t pkt);

>>>>>>> +

>>>>>>> +/**

>>>>>>>   * Packet headroom length

>>>>>>>   *

>>>>>>>   * Returns the current packet level headroom length.

>>>>>>> @@ -847,6 +864,157 @@ int odp_packet_split(odp_packet_t *pkt, uint32_t len, odp_packet_t *tail);

>>>>>>>

>>>>>>>  /*

>>>>>>>   *

>>>>>>> + * References

>>>>>>> + * ********************************************************

>>>>>>> + *

>>>>>>> + */

>>>>>>> +

>>>>>>> +/**

>>>>>>> + * Create a static reference to a packet

>>>>>>> + *

>>>>>>> + * A static reference is used to obtain an additional handle for referring to

>>>>>>> + * a packet so that the storage behind it is not freed until all references to

>>>>>>> + * the packet are freed. This can be used, for example, to support efficient

>>>>>>> + * retransmission processing.

>>>>>>> + *

>>>>>>> + * The intent of a static reference is that both the base packet and the

>>>>>>> + * returned reference will be treated as read-only after this call. Results

>>>>>>> + * are unpredictable if this restriction is not observed.

>>>>>>> + *

>>>>>>> + * Static references have restrictions but may have performance advantages on

>>>>>>> + * some platforms if the caller does not intend to modify the reference

>>>>>>> + * packet. If modification is needed (e.g., to prefix a unique header onto the

>>>>>>> + * packet) then odp_packet_ref() or odp_packet_ref_pkt() should be used.

>>>>>>> + *

>>>>>>> + * @param pkt    Handle of the base packet for which a static reference is

>>>>>>> + *               to be created.

>>>>>>> + *

>>>>>>> + * @return                    Handle of the static reference packet

>>>>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>>>>> + */

>>>>>>> +odp_packet_t odp_packet_ref_static(odp_packet_t pkt);

>>>>>>

>>>>>> It is better to change the API signature to return the updated handle

>>>>>> of the base packet also to the application.

>>>>>> Similar to the API for odp_packet_extend_head().

>>>>>

>>>>> I think returning the packet ref directly rather than indirectly makes

>>>>> for easier coding on the part of applications. Failure is indicated by

>>>>> returning ODP_PACKET_INVALID. So in this sense we're like

>>>>> odp_packet_alloc(). The alternative:

>>>>>

>>>>> int odp_packet_ref_static(odp_packet_t pkt, odp_packet_t *refpkt);

>>>>>

>>>>> seems cumbersome since if RC != 0 then the refpkt return would be meaningless.

>>>>>

>>>>>>

>>>>>>> +

>>>>>>> +/**

>>>>>>> + * Create a reference to a packet

>>>>>>> + *

>>>>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>>>>> + * offset. This reference may be used as an argument to header manipulation

>>>>>>> + * APIs to prefix a unique header onto the shared base. The storage associated

>>>>>>> + * with the base packet is not freed until all references to it are freed,

>>>>>>> + * which permits easy multicasting or retransmission processing to be

>>>>>>> + * performed. Following a successful call, the base packet should be treated

>>>>>>> + * as read-only. Results are unpredictable if this restriction is not

>>>>>>> + * observed.

>>>>>>> + *

>>>>>>> + * This operation prepends a zero-length header onto the base packet beginning

>>>>>>> + * at the specified offset. This header is always drawn from the same pool as

>>>>>>> + * the base packet.

>>>>>>> + *

>>>>>>> + * Because references are unique packets, any bytes pushed onto the head of a

>>>>>>> + * reference via odp_packet_push_head() or odp_packet_extend_head() are unique

>>>>>>> + * to this reference and are not seen by the base packet or by any other

>>>>>>> + * reference to the same base packet.

>>>>>>> + *

>>>>>>> + * The base packet used as input to this routine may itself by a reference to

>>>>>>> + * some other base packet. Implementations MAY restrict the ability to create

>>>>>>> + * such compound references. Attempts to exceed any implementation limits in

>>>>>>> + * this regard will result in this call failing and returning

>>>>>>> + * ODP_PACKET_INVALID.

>>>>>>> + *

>>>>>>> + * If the caller does not indend to push a header onto the returned reference,

>>>>>>> + * the odp_packet_ref_static() API may be used. This may be a more efficient

>>>>>>> + * means of obtaining another reference to a base packet that will be treated

>>>>>>> + * as read-only.

>>>>>>> + *

>>>>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>>>>> + *               created.

>>>>>>> + *

>>>>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>>>>> + *               is to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>>>>> + *

>>>>>>> + * @return                    Handle of the reference packet

>>>>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>>>>> + *

>>>>>>> +

>>>>>>> + */

>>>>>>> +odp_packet_t odp_packet_ref(odp_packet_t pkt, uint32_t offset);

>>>>>>> +

>>>>>>> +/**

>>>>>>> + * Create a reference to another packet from a header packet

>>>>>>> + *

>>>>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>>>>> + * offset by prepending a supplied header packet. The resulting reference

>>>>>>> + * consists of the unshared header followed by the shared base packet starting

>>>>>>> + * at the specified offset. This shared portion should be regarded as

>>>>>>> + * read-only. The storage associated with the shared portion of the reference

>>>>>>> + * is not freed until all references to it are freed, which permits easy

>>>>>>> + * multicasting or retransmission processing to be performed.

>>>>>>

>>>>>> My concerns with this API is what happens when application creates

>>>>>> multiple references from multiple offsets of the base packet. In that

>>>>>> scenario the read-only status of the shared portion could not be

>>>>>> maintained since if different references gives different offset there

>>>>>> will be more than one shared portion.

>>>>>>

>>>>>

>>>>> Why would this be a problem? We're relying on an "honor system" here

>>>>> since there is no defined enforcement mechanism. The rule is that you

>>>>> should only modify the unshared portion of any packet and results are

>>>>> undefined if this rule is ignored. That's why we have the

>>>>> odp_packet_unshared_len() as well as the odp_packet_has_ref() APIs to

>>>>> help applications know whether they should or should not attempt to

>>>>> modify a packet given it's handle.

>>>>

>>>> Let us consider an example,

>>>>

>>>> pkt_a = odp_packet_ref(pkt_base, 200);

>>>> -----

>>>> -----

>>>> -----

>>>> pkt_b = odp_packet_ref(pkt_base, 100);

>>>>

>>>> In the above case, the read-only portion of pkt_base was from 200 to

>>>> end-of-packet during creation of first reference and it is moved to

>>>> 100 to end-of-packet during creation of second reference so all the

>>>> segment handle of pkt_base previously owned by the application becomes

>>>> invalid.

>>>>

>>>

>>> No, in the above examples the entirety of pkt_base should be treated

>>> as read-only from the moment odp_packet_has_ref(pkt_base) > 0 until

>>> odp_packet_has_ref(pkt_base) == 0. That's the simplest, cleanest, and

>>> most portable rule. During this time odp_packet_unshared_len(pkt_base)

>>> will return 0. I don't see a use case for any other interpretation,

>>> and any other interpretation is going to encounter a lot of

>>> portability issues. Again, this is an "honor system". It's up to the

>>> application to observe this convention since we don't want to specify

>>> that the implementation has to somehow make pkt_base read only. All

>>> ODP says is results are undefined if applications do not observe this

>>> rule.

>>>

>>> Note that for both pkt_a and pkt_b odp_packet_has_ref() is 0 while

>>> odp_packet_is_ref() will be 1. That means that pkt_a and pkt_b can be

>>> modified. However since after the odp_packet_ref() call

>>> odp_packet_unshared_len() is 0 for both, that means that the only way

>>> they should be modified would be to call odp_packet_push_head() or

>>> odp_packet_extend_head() to put a unique prefix on the shared payload.

>>> So if we call odp_packet_push_head(pkt_a, 64);, then

>>> odp_packet_unshare_len(pkt_a) == 64, etc.

>>

>> I do not see the need why the entire packet should be marked as

>> read-only. If you mark the entire base packet as read-only then you

>> can not do any header modification for the base packet (ie. tcp or udp

>> checksum update) which will make the packet entirely use-less for

>> transmission. you already have an API for unshared length which could

>> be used to update the unique sections of the base packet.

>

> It's not "marked". This is a user convention. Marking implies that the

> implementation is taking some sort of enforcement steps, which is what

> we're trying to avoid. You wouldn't modify the base packet in any way.

> Instead you'd create as many references to it as you need and modify

> the (unique) headers for each reference. This was the point we

> discussed earlier. Under the v2 patch if you need 30 references you

> create 30 references and deal with them individually. I agreed to

> rework the patch so that you only need to create 29 references and can

> use the base packet as a reference in its own right, however this will

> incur additional overhead (at least in SW implementations). The

> arguments for wanting this seem to be mostly aesthetic since the idea

> is that references are cheap so you can create as many as you need

> with little overhead. Creating one less but incurring extra overhead

> in the process for each one due to the need to maintain an invariant

> offset in the presence of a possibly-modifiable base packet seems to

> me to be a poor tradeoff.


Actually it is not. The idea of creating multiple reference without
reserving a read-only base packet is more efficient in HW
implementations since it removes the need to differentiate between
"base" and "reference" packet and all the packet have equal rights and
can be manipulated equally. If your implementation handles packets as
linked list of buffers then each buffer probably has a next pointer
stored at the start of the buffer and it is not possible to create a
segment by pointing at the middle of any segment without creating a
new segment since you need to have space for next pointer.
Also as discussed in the meeting keeping the shared portion of all
referenced packet as read-only has to be done by the application and
implementation does not have any control over it.

>

>>

>> Also what happens when you free the base packet after creating the

>> reference? I believe it should not cause any issue.

>

> There are no issues with this. It's implementation-defined whether all

> of the segments of the base packet are retained or if only the shared

> segments are retained. Many implementations will find it's simpler and

> easier to just retain them all until all packet references have been

> freed and then free everything. Trying to optimize frees on a

> per-segment basis is likely to incur more overhead for limited gain

> since packet lifespans in the data plane are expected to be short

> duration while the packet is transiting the application.


If we follow the proposal that there is no difference between a base
packet and a reference then the application is free to delete the
packets in any order as it prefers and there is no need for any kind
of book keeping required from the application. This segment level
manipulation is very efficient since it can achieve creation of
multiple references by copy of very minimal number of bytes.

Regards,
Bala

>

>> Contrary to this, we can also restrict the number of offset for the

>> base packet to 1, so the above conflict do not arise.

>

> I'm not sure I understand this point. For odp_packet_ref() the API is

> defined as valid offsets are 0..odp_packet_len(base_pkt)-1. If the

> implementation cannot accommodate this full range then it is always

> free to fail the call by returning ODP_PACKET_INVALID. Applications

> must always check the returned odp_packet_t since there is no

> guarantee that this API call will always succeed.

>

>>

>>>

>>>>>

>>>>>>> + *

>>>>>>> + * Either packet input to this routine may itself be a reference, however

>>>>>>> + * individual implementations MAY impose limits on how deeply splices may be

>>>>>>> + * nested and fail the call if those limits are exceeded.

>>>>>>> + *

>>>>>>> + * The packets used as input to this routine may reside in different pools,

>>>>>>> + * however individual implementations MAY require that both reside in the same

>>>>>>> + * pool and fail the call if this restriction is not observed.

>>>>>>

>>>>>> Not sure if there is a requirement to support reference with packets

>>>>>> from multiple pools.

>>>>>

>>>>> That's why this is a MAY. We could expose this as a capability or

>>>>> simply state that this is not supported, however some implementations

>>>>> may be able to support this (e.g., odp-linux has no need to make this

>>>>> restriction) and I could see how it could be useful to have header

>>>>> templates in their own pool for storage management purposes.

>>>>

>>>> odp-linux is a special case since it does not use any HW pool manager,

>>>> most platforms using HW pool manager might not be able to create a

>>>> packet with segments from multiple pools since it will be difficult to

>>>> return the segments to respective pools during odp_packet_free().

>>>> packet pool is a limited resource in a system and it might not be

>>>> advisable to use a separate pool for header template.

>>>

>>> That's why this is a MAY. Implementation that don't support this would

>>> fail the odp_packet_ref_pkt() call by returning ODP_PACKET_INVALID if

>>> the header and packet to be referenced reside in different pools. This

>>> is no different than an odp_packet_alloc() call failing because the

>>> pool is out of memory, etc. Applications are expected to be able to

>>> deal with these sort of situations.

>>>

>>> Again, if we want we can simply say that both packets must come from

>>> the same pool and leave it at that. I have no problem changing that if

>>> people feel that a more uniform interpretation is desirable. Note

>>> however that many of the other packet routines (odp_packet_copy(),

>>> odp_packet_concat()) support multi-pool operation.

>>

>> During packet_copy and packet_concat() there is a need to create two

>> completely independent packets after the call there is no shared data

>> between the packets and you need to copy the packet from "src" to

>> "dst" but that is not the case for reference creation this could run

>> in the fast path and should be done as efficient as possible.

>

> Agreed. The whole idea of references is to copy as little as possible

> (ideally nothing). That's what odp-linux does. Having the convention

> that the base packet is treated as read only was supposed to make this

> simpler for implementations since it doesn't have to worry about the

> referenced packet changing. For reasons I don't fully appreciate, this

> convention makes life more difficult for your implementation?

>

>>

>>>

>>>>

>>>>>

>>>>>>

>>>>>>> + *

>>>>>>> + * odp_packet_pool() on the returned reference returns the pool id of the

>>>>>>> + * header packet.

>>>>>>> + *

>>>>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>>>>> + *               created.

>>>>>>> + *

>>>>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>>>>> + *               to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>>>>> + *

>>>>>>> + * @param hdr    Handle of the header packet to be prefixed onto the base

>>>>>>> + *               packet to create the reference. If this is specified as

>>>>>>> + *               ODP_PACKET_INVALID, this call is equivalent to

>>>>>>> + *               odp_packet_ref(). The supplied hdr must be distinct from

>>>>>>> + *               the base pkt and results are undefined if an attempt is made

>>>>>>> + *               to create circular references.

>>>>>>> + *

>>>>>>> + * @return       Handle of the reference packet. This may or may not be the

>>>>>>> + *               same as the input hdr packet. The caller should only use this

>>>>>>> + *               handle since the original hdr packet no longer has any

>>>>>>> + *               independent existence.

>>>>>>> + *

>>>>>>> + * @retval ODP_PACKET_INVALID If the reference failed. In this case both pkt

>>>>>>> + *                            and hdr are unchanged.

>>>>>>> + */

>>>>>>> +odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,

>>>>>>> +                               odp_packet_t hdr);

>>>>>>> +

>>>>>>> +/**

>>>>>>> + * Test if a packet is a reference

>>>>>>> + *

>>>>>>> + * A packet is a reference if it was created by odp_packet_ref() or

>>>>>>> + * odp_packet_ref_pkt().  Note that a reference created from another

>>>>>>> + * reference is considered a compound reference. Calls on such packets will

>>>>>>> + * result in return values greater than 1.

>>>>>>> + *

>>>>>>> + * @param pkt Packet handle

>>>>>>> + *

>>>>>>> + * @retval 0  Packet is not a reference

>>>>>>> + * @retval >0 Packet is a reference consisting of N individual packets.

>>>>>>> + */

>>>>>>> +int odp_packet_is_ref(odp_packet_t pkt);

>>>>>>

>>>>>> It is better to keep the return value as 0 or 1. Is it expected to

>>>>>> return the number of references?

>>>>>> The total number of references created is not interesting and also it

>>>>>> is not so easy since references are created at segment level as per

>>>>>> this proposal.

>>>>>> Application will have to call odp_packet_free() for all the packet

>>>>>> handle it is holding.

>>>>>

>>>>> Since any implementation supporting references needs to have some

>>>>> internal notion of a reference count this is just attempting to expose

>>>>> that in an implementation-independent manner. This is also why there

>>>>> is both the odp_packet_is_ref() and odp_packet_has_ref() APIs defined.

>>>>

>>>> Again this depends on the implementation, based on the proposal here

>>>> the reference count should be maintained per segment since the

>>>> reference is being created using an offset, different segments of the

>>>> packets could have different references.

>>>> My concern is this count is not useful since application already has

>>>> packet handles per reference and it has to free all the handles which

>>>> was populated using reference APIs.

>>>

>>> No, a reference count is logically on a per-packet basis because as

>>> noted earlier the entire packet should be treated as read only if any

>>> reference to it exists. If implementations want to use per-segment

>>> reference counts for internal convenience that's fine, but the API is

>>> referring to per-packet references. Remember that applications need to

>>> be aware of the existence of segments but ODP APIs are designed to

>>> manipulate packets, not segments, since it's assumed that the

>>> existence of segments is for the implementation's convenience, not the

>>> application's. Yes, we do have various odp_packet_seg_xxx() APIs but

>>> they're a bit of a sidecar to the other packet APIs and it's unclear

>>> how applications would use these in any generally portable manner.

>>

>> Again this is not true. We seem to contradict each other heavily :)

>> If you want to have an efficient implementation this reference

>> creation should be done with zero-copy or as limited coping as

>> possible which would be achieved only by reusing common segments.

>> IMO there is no use in getting the number of reference which this

>> packet holds if you have any valid use-case then we can add this

>> feature.

>> The main concern for me is that this requires additional per-packet

>> meta-data storage.

>

> Well, you can see my implementation (which is pretty simple and

> efficient) but I can't see yours so I don't fully understand the

> problem you're trying to solve. If you'd like to have a private

> hangout to explore this further let me know.

>

>

>>

>>>

>>>>

>>>>>

>>>>>>

>>>>>>> +

>>>>>>> +/**

>>>>>>> + * Test if a packet has a reference

>>>>>>> + *

>>>>>>> + * A packet has a reference if a reference was created on it by

>>>>>>> + * odp_packet_ref_static(), odp_packet_ref(), or odp_packet_ref_pkt().

>>>>>>> + *

>>>>>>> + * @param pkt Packet handle

>>>>>>> + *

>>>>>>> + * @retval 0  Packet does not have any references

>>>>>>> + * @retval >0 Packet has N references based on it

>>>>>>> + */

>>>>>>> +int odp_packet_has_ref(odp_packet_t pkt);

>>>>>>

>>>>>> What is the difference between odp_packet_has_ref() and

>>>>>> odp_packet_is_ref()? IMO Once a reference is created there is no

>>>>>> difference between the base packet and reference.

>>>>>

>>>>> Not true. This is discussed (with diagrams) in the User Guide updates

>>>>> associated with this patch series.

>>>>>

>>>>> Consider the call:

>>>>>

>>>>> pkt_a = odp_packet_ref(pkt_b, 0);

>>>>>

>>>>> After this call the following conditions hold:

>>>>>

>>>>> odp_packet_is_ref(pkt_a) == 1

>>>>> odp_packet_is_ref(pkt_b) == 0

>>>>> odp_packet_has_ref(pkt_a) == 0

>>>>> odp_packet_has_ref(pkt_b) == 1

>>>>>

>>>>> Now consider the more complex case:

>>>>>

>>>>> pkt_a = odp_packet_ref(pkt_b, offset1);

>>>>> pkt_c = odp_packet_ref(pkt_b, offset2);

>>>>> pkt_d = odp_packet_ref(pkt_a, offset3);

>>>>>

>>>>> Now we have:

>>>>>

>>>>> odp_packet_is_ref(pkt_a) == 1

>>>>> odp_packet_is_ref(pkt_b) == 0

>>>>> odp_packet_is_ref(pkt_c) == 1

>>>>> odp_packet_is_ref(pkt_d) == 2

>>>>> odp_packet_has_ref(pkt_a) == 1

>>>>> odp_packet_has_ref(pkt_b) == 3

>>>>> odp_packet_has_ref(pkt_c) == 0

>>>>> odp_packet_has_ref(pkt_d) == 0

>>>>>

>>>>> Essentially, odp_packet_is_ref() answers the question "How many other

>>>>> packets are referenced via this handle"? While odp_packet_has_ref()

>>>>> answers the question "How many other handles can be used to reference

>>>>> this packet"?

>>>>

>>>> This information requires lots of unnecessary computation at the

>>>> implementation level, references are created by linking multiple

>>>> segments together from application point of view there is no

>>>> difference between a base packet and a reference both have few shared

>>>> segments and have few unique segments. I understand the definition of

>>>> this API, my query is the use-case for returning this number of

>>>> has_ref and is_ref()?

>>>

>>> No that's not true since each packet (whether it's a reference or a

>>> base packet) has its own metadata (including headroom, user area,

>>> etc.). One of the reasons we introduced the odp_packet_ref_static()

>>> API is to allow implementations to further optimize this and create

>>> the type of references you refer to here when applications assert that

>>> they're going to treat the reference as entirely read only (and hence

>>> can share metadata with the base packet). You cannot just link

>>> segments together and have a valid packet reference since that doesn't

>>> cover the unique metadata like headroom, etc.

>>

>> For creating packet reference only header meta-data needs to be unique

>> but the segments could be shared even in odp_packet_ref() API.

>> It is better to share as many segments as possible since copying data

>> will not be as efficient as linking multiple segments together.

>>

>>>

>>> Given the requirement for unique metadata on a per-reference basis, I

>>> don't see the implementation of these APIs as problematic. The

>>> odp-linux implementation shows just how simple they really are.

>>

>> odp-linux does not have any constraint on the per-packet meta-data but

>> it is not true for all implementations, I can agree to add this value

>> in meta-data if there is any valid use-case for this number if not

>> then I would prefer not to waste packet meta-data unnecessarily.

>>

>> Regards,

>> Bala

>>>

>>>>

>>>> Regards,

>>>> Bala

>>>

>>> Thank you! This is a great dialog.

>>>

>>>>>

>>>>>>

>>>>>> Regards,

>>>>>> Bala

>>>>>>> +

>>>>>>> +/*

>>>>>>> + *

>>>>>>>   * Copy

>>>>>>>   * ********************************************************

>>>>>>>   *

>>>>>>> --

>>>>>>> 2.7.4

>>>>>>>
Bill Fischofer Dec. 21, 2016, 2:15 p.m. UTC | #12
On Wed, Dec 21, 2016 at 6:42 AM, Bala Manoharan
<bala.manoharan@linaro.org> wrote:
> Regards,

> Bala

>

>

> On 20 December 2016 at 22:35, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>> On Tue, Dec 20, 2016 at 8:26 AM, Bala Manoharan

>> <bala.manoharan@linaro.org> wrote:

>>> On 20 December 2016 at 18:26, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>> On Tue, Dec 20, 2016 at 5:59 AM, Bala Manoharan

>>>> <bala.manoharan@linaro.org> wrote:

>>>>> Regards,

>>>>> Bala

>>>>>

>>>>>

>>>>> On 19 December 2016 at 20:23, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>>>> On Mon, Dec 19, 2016 at 4:06 AM, Bala Manoharan

>>>>>> <bala.manoharan@linaro.org> wrote:

>>>>>>> Comments inline....

>>>>>>>

>>>>>>>

>>>>>>> On 15 November 2016 at 20:14, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>>>>>> Introduce three new APIs that support efficient sharing of portions of

>>>>>>>> packets.

>>>>>>>>

>>>>>>>> odp_packet_ref_static() creates an alias for a base packet

>>>>>>>>

>>>>>>>> odp_packet_ref() creates a reference to a base packet

>>>>>>>>

>>>>>>>> odp_packet_ref_pkt() creates a reference to a base packet from a supplied

>>>>>>>> header packet

>>>>>>>>

>>>>>>>> In addition, three other APIs simplify working with references

>>>>>>>>

>>>>>>>> odp_packet_is_ref() says whether a packet is a reference

>>>>>>>>

>>>>>>>> odp_packet_has_ref() says whether a packet has had a reference made to it

>>>>>>>>

>>>>>>>> odp_packet_unshared_len() gives the length of the prefix bytes that are

>>>>>>>> unique to this reference. These are the only bytes of the packet that may

>>>>>>>> be modified safely.

>>>>>>>>

>>>>>>>> Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

>>>>>>>> ---

>>>>>>>>  include/odp/api/spec/packet.h | 168 ++++++++++++++++++++++++++++++++++++++++++

>>>>>>>>  1 file changed, 168 insertions(+)

>>>>>>>>

>>>>>>>> diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h

>>>>>>>> index faf62e2..137024f 100644

>>>>>>>> --- a/include/odp/api/spec/packet.h

>>>>>>>> +++ b/include/odp/api/spec/packet.h

>>>>>>>> @@ -256,6 +256,23 @@ uint32_t odp_packet_seg_len(odp_packet_t pkt);

>>>>>>>>  uint32_t odp_packet_len(odp_packet_t pkt);

>>>>>>>>

>>>>>>>>  /**

>>>>>>>> + * Packet unshared data len

>>>>>>>> + *

>>>>>>>> + * Returns the sum of data lengths over all unshared packet segments. These

>>>>>>>> + * are the only bytes that should be modified by the caller. The rest of the

>>>>>>>> + * packet should be treated as read-only.

>>>>>>>> + *

>>>>>>>> + * This routine will return 0 if odp_packet_has_ref() != 0 and will be the

>>>>>>>> + * same as odp_packet_len() if odp_packet_has_ref() == 0 and

>>>>>>>> + * odp_packet_is_ref() == 0.

>>>>>>>> + *

>>>>>>>> + * @param pkt Packet handle

>>>>>>>> + *

>>>>>>>> + * @return Packet unshared data length

>>>>>>>> + */

>>>>>>>> +uint32_t odp_packet_unshared_len(odp_packet_t pkt);

>>>>>>>> +

>>>>>>>> +/**

>>>>>>>>   * Packet headroom length

>>>>>>>>   *

>>>>>>>>   * Returns the current packet level headroom length.

>>>>>>>> @@ -847,6 +864,157 @@ int odp_packet_split(odp_packet_t *pkt, uint32_t len, odp_packet_t *tail);

>>>>>>>>

>>>>>>>>  /*

>>>>>>>>   *

>>>>>>>> + * References

>>>>>>>> + * ********************************************************

>>>>>>>> + *

>>>>>>>> + */

>>>>>>>> +

>>>>>>>> +/**

>>>>>>>> + * Create a static reference to a packet

>>>>>>>> + *

>>>>>>>> + * A static reference is used to obtain an additional handle for referring to

>>>>>>>> + * a packet so that the storage behind it is not freed until all references to

>>>>>>>> + * the packet are freed. This can be used, for example, to support efficient

>>>>>>>> + * retransmission processing.

>>>>>>>> + *

>>>>>>>> + * The intent of a static reference is that both the base packet and the

>>>>>>>> + * returned reference will be treated as read-only after this call. Results

>>>>>>>> + * are unpredictable if this restriction is not observed.

>>>>>>>> + *

>>>>>>>> + * Static references have restrictions but may have performance advantages on

>>>>>>>> + * some platforms if the caller does not intend to modify the reference

>>>>>>>> + * packet. If modification is needed (e.g., to prefix a unique header onto the

>>>>>>>> + * packet) then odp_packet_ref() or odp_packet_ref_pkt() should be used.

>>>>>>>> + *

>>>>>>>> + * @param pkt    Handle of the base packet for which a static reference is

>>>>>>>> + *               to be created.

>>>>>>>> + *

>>>>>>>> + * @return                    Handle of the static reference packet

>>>>>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>>>>>> + */

>>>>>>>> +odp_packet_t odp_packet_ref_static(odp_packet_t pkt);

>>>>>>>

>>>>>>> It is better to change the API signature to return the updated handle

>>>>>>> of the base packet also to the application.

>>>>>>> Similar to the API for odp_packet_extend_head().

>>>>>>

>>>>>> I think returning the packet ref directly rather than indirectly makes

>>>>>> for easier coding on the part of applications. Failure is indicated by

>>>>>> returning ODP_PACKET_INVALID. So in this sense we're like

>>>>>> odp_packet_alloc(). The alternative:

>>>>>>

>>>>>> int odp_packet_ref_static(odp_packet_t pkt, odp_packet_t *refpkt);

>>>>>>

>>>>>> seems cumbersome since if RC != 0 then the refpkt return would be meaningless.

>>>>>>

>>>>>>>

>>>>>>>> +

>>>>>>>> +/**

>>>>>>>> + * Create a reference to a packet

>>>>>>>> + *

>>>>>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>>>>>> + * offset. This reference may be used as an argument to header manipulation

>>>>>>>> + * APIs to prefix a unique header onto the shared base. The storage associated

>>>>>>>> + * with the base packet is not freed until all references to it are freed,

>>>>>>>> + * which permits easy multicasting or retransmission processing to be

>>>>>>>> + * performed. Following a successful call, the base packet should be treated

>>>>>>>> + * as read-only. Results are unpredictable if this restriction is not

>>>>>>>> + * observed.

>>>>>>>> + *

>>>>>>>> + * This operation prepends a zero-length header onto the base packet beginning

>>>>>>>> + * at the specified offset. This header is always drawn from the same pool as

>>>>>>>> + * the base packet.

>>>>>>>> + *

>>>>>>>> + * Because references are unique packets, any bytes pushed onto the head of a

>>>>>>>> + * reference via odp_packet_push_head() or odp_packet_extend_head() are unique

>>>>>>>> + * to this reference and are not seen by the base packet or by any other

>>>>>>>> + * reference to the same base packet.

>>>>>>>> + *

>>>>>>>> + * The base packet used as input to this routine may itself by a reference to

>>>>>>>> + * some other base packet. Implementations MAY restrict the ability to create

>>>>>>>> + * such compound references. Attempts to exceed any implementation limits in

>>>>>>>> + * this regard will result in this call failing and returning

>>>>>>>> + * ODP_PACKET_INVALID.

>>>>>>>> + *

>>>>>>>> + * If the caller does not indend to push a header onto the returned reference,

>>>>>>>> + * the odp_packet_ref_static() API may be used. This may be a more efficient

>>>>>>>> + * means of obtaining another reference to a base packet that will be treated

>>>>>>>> + * as read-only.

>>>>>>>> + *

>>>>>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>>>>>> + *               created.

>>>>>>>> + *

>>>>>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>>>>>> + *               is to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>>>>>> + *

>>>>>>>> + * @return                    Handle of the reference packet

>>>>>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>>>>>> + *

>>>>>>>> +

>>>>>>>> + */

>>>>>>>> +odp_packet_t odp_packet_ref(odp_packet_t pkt, uint32_t offset);

>>>>>>>> +

>>>>>>>> +/**

>>>>>>>> + * Create a reference to another packet from a header packet

>>>>>>>> + *

>>>>>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>>>>>> + * offset by prepending a supplied header packet. The resulting reference

>>>>>>>> + * consists of the unshared header followed by the shared base packet starting

>>>>>>>> + * at the specified offset. This shared portion should be regarded as

>>>>>>>> + * read-only. The storage associated with the shared portion of the reference

>>>>>>>> + * is not freed until all references to it are freed, which permits easy

>>>>>>>> + * multicasting or retransmission processing to be performed.

>>>>>>>

>>>>>>> My concerns with this API is what happens when application creates

>>>>>>> multiple references from multiple offsets of the base packet. In that

>>>>>>> scenario the read-only status of the shared portion could not be

>>>>>>> maintained since if different references gives different offset there

>>>>>>> will be more than one shared portion.

>>>>>>>

>>>>>>

>>>>>> Why would this be a problem? We're relying on an "honor system" here

>>>>>> since there is no defined enforcement mechanism. The rule is that you

>>>>>> should only modify the unshared portion of any packet and results are

>>>>>> undefined if this rule is ignored. That's why we have the

>>>>>> odp_packet_unshared_len() as well as the odp_packet_has_ref() APIs to

>>>>>> help applications know whether they should or should not attempt to

>>>>>> modify a packet given it's handle.

>>>>>

>>>>> Let us consider an example,

>>>>>

>>>>> pkt_a = odp_packet_ref(pkt_base, 200);

>>>>> -----

>>>>> -----

>>>>> -----

>>>>> pkt_b = odp_packet_ref(pkt_base, 100);

>>>>>

>>>>> In the above case, the read-only portion of pkt_base was from 200 to

>>>>> end-of-packet during creation of first reference and it is moved to

>>>>> 100 to end-of-packet during creation of second reference so all the

>>>>> segment handle of pkt_base previously owned by the application becomes

>>>>> invalid.

>>>>>

>>>>

>>>> No, in the above examples the entirety of pkt_base should be treated

>>>> as read-only from the moment odp_packet_has_ref(pkt_base) > 0 until

>>>> odp_packet_has_ref(pkt_base) == 0. That's the simplest, cleanest, and

>>>> most portable rule. During this time odp_packet_unshared_len(pkt_base)

>>>> will return 0. I don't see a use case for any other interpretation,

>>>> and any other interpretation is going to encounter a lot of

>>>> portability issues. Again, this is an "honor system". It's up to the

>>>> application to observe this convention since we don't want to specify

>>>> that the implementation has to somehow make pkt_base read only. All

>>>> ODP says is results are undefined if applications do not observe this

>>>> rule.

>>>>

>>>> Note that for both pkt_a and pkt_b odp_packet_has_ref() is 0 while

>>>> odp_packet_is_ref() will be 1. That means that pkt_a and pkt_b can be

>>>> modified. However since after the odp_packet_ref() call

>>>> odp_packet_unshared_len() is 0 for both, that means that the only way

>>>> they should be modified would be to call odp_packet_push_head() or

>>>> odp_packet_extend_head() to put a unique prefix on the shared payload.

>>>> So if we call odp_packet_push_head(pkt_a, 64);, then

>>>> odp_packet_unshare_len(pkt_a) == 64, etc.

>>>

>>> I do not see the need why the entire packet should be marked as

>>> read-only. If you mark the entire base packet as read-only then you

>>> can not do any header modification for the base packet (ie. tcp or udp

>>> checksum update) which will make the packet entirely use-less for

>>> transmission. you already have an API for unshared length which could

>>> be used to update the unique sections of the base packet.

>>

>> It's not "marked". This is a user convention. Marking implies that the

>> implementation is taking some sort of enforcement steps, which is what

>> we're trying to avoid. You wouldn't modify the base packet in any way.

>> Instead you'd create as many references to it as you need and modify

>> the (unique) headers for each reference. This was the point we

>> discussed earlier. Under the v2 patch if you need 30 references you

>> create 30 references and deal with them individually. I agreed to

>> rework the patch so that you only need to create 29 references and can

>> use the base packet as a reference in its own right, however this will

>> incur additional overhead (at least in SW implementations). The

>> arguments for wanting this seem to be mostly aesthetic since the idea

>> is that references are cheap so you can create as many as you need

>> with little overhead. Creating one less but incurring extra overhead

>> in the process for each one due to the need to maintain an invariant

>> offset in the presence of a possibly-modifiable base packet seems to

>> me to be a poor tradeoff.

>

> Actually it is not. The idea of creating multiple reference without

> reserving a read-only base packet is more efficient in HW

> implementations since it removes the need to differentiate between

> "base" and "reference" packet and all the packet have equal rights and

> can be manipulated equally. If your implementation handles packets as

> linked list of buffers then each buffer probably has a next pointer

> stored at the start of the buffer and it is not possible to create a

> segment by pointing at the middle of any segment without creating a

> new segment since you need to have space for next pointer.

> Also as discussed in the meeting keeping the shared portion of all

> referenced packet as read-only has to be done by the application and

> implementation does not have any control over it.


We're not reserving a read-only base packet. There are no differences
between base packets and references at an implementation level. The
idea here is simply that if the application observes the *discipline*
that it treats base packets as entirely read-only that implementation
of references can be done more efficiently. That's certainly the case
in odp-linux and it may be true for other implementations as well
where perhaps this discipline would remove the need for segment copies
in many instances.

>

>>

>>>

>>> Also what happens when you free the base packet after creating the

>>> reference? I believe it should not cause any issue.

>>

>> There are no issues with this. It's implementation-defined whether all

>> of the segments of the base packet are retained or if only the shared

>> segments are retained. Many implementations will find it's simpler and

>> easier to just retain them all until all packet references have been

>> freed and then free everything. Trying to optimize frees on a

>> per-segment basis is likely to incur more overhead for limited gain

>> since packet lifespans in the data plane are expected to be short

>> duration while the packet is transiting the application.

>

> If we follow the proposal that there is no difference between a base

> packet and a reference then the application is free to delete the

> packets in any order as it prefers and there is no need for any kind


It's always the case that frees can be done in any order. If you have
an odp_packet_t you can always call odp_packet_free() for it whether
or not it is a reference. The implementation must employ reference
counters or similar mechanism to ensure that the actual packet data is
not freed as long as any references to it are still outstanding. The
application has no need to do bookkeeping in this regard as this is
entirely the responsibility of the implementation.

> of book keeping required from the application. This segment level

> manipulation is very efficient since it can achieve creation of

> multiple references by copy of very minimal number of bytes.


The problem is that what you really want is zero-copy, not "minimal
copy". If you assume that most packets are a single segment and/or
that the vast majority of references will be created with offsets in
the first segment, this means that every reference is going to involve
a segment copy of some sort, which means that the creation and use of
references is going to be relatively high overhead. As we discussed,
the goal is for references to be highly efficient since the
expectation is that they will be used frequently for things like
retransmission, multicast, etc. The updated User Guide discusses this
and shows code examples of these expected use cases and should be
considered as part of this discussion.

>

> Regards,

> Bala

>

>>

>>> Contrary to this, we can also restrict the number of offset for the

>>> base packet to 1, so the above conflict do not arise.

>>

>> I'm not sure I understand this point. For odp_packet_ref() the API is

>> defined as valid offsets are 0..odp_packet_len(base_pkt)-1. If the

>> implementation cannot accommodate this full range then it is always

>> free to fail the call by returning ODP_PACKET_INVALID. Applications

>> must always check the returned odp_packet_t since there is no

>> guarantee that this API call will always succeed.

>>

>>>

>>>>

>>>>>>

>>>>>>>> + *

>>>>>>>> + * Either packet input to this routine may itself be a reference, however

>>>>>>>> + * individual implementations MAY impose limits on how deeply splices may be

>>>>>>>> + * nested and fail the call if those limits are exceeded.

>>>>>>>> + *

>>>>>>>> + * The packets used as input to this routine may reside in different pools,

>>>>>>>> + * however individual implementations MAY require that both reside in the same

>>>>>>>> + * pool and fail the call if this restriction is not observed.

>>>>>>>

>>>>>>> Not sure if there is a requirement to support reference with packets

>>>>>>> from multiple pools.

>>>>>>

>>>>>> That's why this is a MAY. We could expose this as a capability or

>>>>>> simply state that this is not supported, however some implementations

>>>>>> may be able to support this (e.g., odp-linux has no need to make this

>>>>>> restriction) and I could see how it could be useful to have header

>>>>>> templates in their own pool for storage management purposes.

>>>>>

>>>>> odp-linux is a special case since it does not use any HW pool manager,

>>>>> most platforms using HW pool manager might not be able to create a

>>>>> packet with segments from multiple pools since it will be difficult to

>>>>> return the segments to respective pools during odp_packet_free().

>>>>> packet pool is a limited resource in a system and it might not be

>>>>> advisable to use a separate pool for header template.

>>>>

>>>> That's why this is a MAY. Implementation that don't support this would

>>>> fail the odp_packet_ref_pkt() call by returning ODP_PACKET_INVALID if

>>>> the header and packet to be referenced reside in different pools. This

>>>> is no different than an odp_packet_alloc() call failing because the

>>>> pool is out of memory, etc. Applications are expected to be able to

>>>> deal with these sort of situations.

>>>>

>>>> Again, if we want we can simply say that both packets must come from

>>>> the same pool and leave it at that. I have no problem changing that if

>>>> people feel that a more uniform interpretation is desirable. Note

>>>> however that many of the other packet routines (odp_packet_copy(),

>>>> odp_packet_concat()) support multi-pool operation.

>>>

>>> During packet_copy and packet_concat() there is a need to create two

>>> completely independent packets after the call there is no shared data

>>> between the packets and you need to copy the packet from "src" to

>>> "dst" but that is not the case for reference creation this could run

>>> in the fast path and should be done as efficient as possible.

>>

>> Agreed. The whole idea of references is to copy as little as possible

>> (ideally nothing). That's what odp-linux does. Having the convention

>> that the base packet is treated as read only was supposed to make this

>> simpler for implementations since it doesn't have to worry about the

>> referenced packet changing. For reasons I don't fully appreciate, this

>> convention makes life more difficult for your implementation?

>>

>>>

>>>>

>>>>>

>>>>>>

>>>>>>>

>>>>>>>> + *

>>>>>>>> + * odp_packet_pool() on the returned reference returns the pool id of the

>>>>>>>> + * header packet.

>>>>>>>> + *

>>>>>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>>>>>> + *               created.

>>>>>>>> + *

>>>>>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>>>>>> + *               to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>>>>>> + *

>>>>>>>> + * @param hdr    Handle of the header packet to be prefixed onto the base

>>>>>>>> + *               packet to create the reference. If this is specified as

>>>>>>>> + *               ODP_PACKET_INVALID, this call is equivalent to

>>>>>>>> + *               odp_packet_ref(). The supplied hdr must be distinct from

>>>>>>>> + *               the base pkt and results are undefined if an attempt is made

>>>>>>>> + *               to create circular references.

>>>>>>>> + *

>>>>>>>> + * @return       Handle of the reference packet. This may or may not be the

>>>>>>>> + *               same as the input hdr packet. The caller should only use this

>>>>>>>> + *               handle since the original hdr packet no longer has any

>>>>>>>> + *               independent existence.

>>>>>>>> + *

>>>>>>>> + * @retval ODP_PACKET_INVALID If the reference failed. In this case both pkt

>>>>>>>> + *                            and hdr are unchanged.

>>>>>>>> + */

>>>>>>>> +odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,

>>>>>>>> +                               odp_packet_t hdr);

>>>>>>>> +

>>>>>>>> +/**

>>>>>>>> + * Test if a packet is a reference

>>>>>>>> + *

>>>>>>>> + * A packet is a reference if it was created by odp_packet_ref() or

>>>>>>>> + * odp_packet_ref_pkt().  Note that a reference created from another

>>>>>>>> + * reference is considered a compound reference. Calls on such packets will

>>>>>>>> + * result in return values greater than 1.

>>>>>>>> + *

>>>>>>>> + * @param pkt Packet handle

>>>>>>>> + *

>>>>>>>> + * @retval 0  Packet is not a reference

>>>>>>>> + * @retval >0 Packet is a reference consisting of N individual packets.

>>>>>>>> + */

>>>>>>>> +int odp_packet_is_ref(odp_packet_t pkt);

>>>>>>>

>>>>>>> It is better to keep the return value as 0 or 1. Is it expected to

>>>>>>> return the number of references?

>>>>>>> The total number of references created is not interesting and also it

>>>>>>> is not so easy since references are created at segment level as per

>>>>>>> this proposal.

>>>>>>> Application will have to call odp_packet_free() for all the packet

>>>>>>> handle it is holding.

>>>>>>

>>>>>> Since any implementation supporting references needs to have some

>>>>>> internal notion of a reference count this is just attempting to expose

>>>>>> that in an implementation-independent manner. This is also why there

>>>>>> is both the odp_packet_is_ref() and odp_packet_has_ref() APIs defined.

>>>>>

>>>>> Again this depends on the implementation, based on the proposal here

>>>>> the reference count should be maintained per segment since the

>>>>> reference is being created using an offset, different segments of the

>>>>> packets could have different references.

>>>>> My concern is this count is not useful since application already has

>>>>> packet handles per reference and it has to free all the handles which

>>>>> was populated using reference APIs.

>>>>

>>>> No, a reference count is logically on a per-packet basis because as

>>>> noted earlier the entire packet should be treated as read only if any

>>>> reference to it exists. If implementations want to use per-segment

>>>> reference counts for internal convenience that's fine, but the API is

>>>> referring to per-packet references. Remember that applications need to

>>>> be aware of the existence of segments but ODP APIs are designed to

>>>> manipulate packets, not segments, since it's assumed that the

>>>> existence of segments is for the implementation's convenience, not the

>>>> application's. Yes, we do have various odp_packet_seg_xxx() APIs but

>>>> they're a bit of a sidecar to the other packet APIs and it's unclear

>>>> how applications would use these in any generally portable manner.

>>>

>>> Again this is not true. We seem to contradict each other heavily :)

>>> If you want to have an efficient implementation this reference

>>> creation should be done with zero-copy or as limited coping as

>>> possible which would be achieved only by reusing common segments.

>>> IMO there is no use in getting the number of reference which this

>>> packet holds if you have any valid use-case then we can add this

>>> feature.

>>> The main concern for me is that this requires additional per-packet

>>> meta-data storage.

>>

>> Well, you can see my implementation (which is pretty simple and

>> efficient) but I can't see yours so I don't fully understand the

>> problem you're trying to solve. If you'd like to have a private

>> hangout to explore this further let me know.

>>

>>

>>>

>>>>

>>>>>

>>>>>>

>>>>>>>

>>>>>>>> +

>>>>>>>> +/**

>>>>>>>> + * Test if a packet has a reference

>>>>>>>> + *

>>>>>>>> + * A packet has a reference if a reference was created on it by

>>>>>>>> + * odp_packet_ref_static(), odp_packet_ref(), or odp_packet_ref_pkt().

>>>>>>>> + *

>>>>>>>> + * @param pkt Packet handle

>>>>>>>> + *

>>>>>>>> + * @retval 0  Packet does not have any references

>>>>>>>> + * @retval >0 Packet has N references based on it

>>>>>>>> + */

>>>>>>>> +int odp_packet_has_ref(odp_packet_t pkt);

>>>>>>>

>>>>>>> What is the difference between odp_packet_has_ref() and

>>>>>>> odp_packet_is_ref()? IMO Once a reference is created there is no

>>>>>>> difference between the base packet and reference.

>>>>>>

>>>>>> Not true. This is discussed (with diagrams) in the User Guide updates

>>>>>> associated with this patch series.

>>>>>>

>>>>>> Consider the call:

>>>>>>

>>>>>> pkt_a = odp_packet_ref(pkt_b, 0);

>>>>>>

>>>>>> After this call the following conditions hold:

>>>>>>

>>>>>> odp_packet_is_ref(pkt_a) == 1

>>>>>> odp_packet_is_ref(pkt_b) == 0

>>>>>> odp_packet_has_ref(pkt_a) == 0

>>>>>> odp_packet_has_ref(pkt_b) == 1

>>>>>>

>>>>>> Now consider the more complex case:

>>>>>>

>>>>>> pkt_a = odp_packet_ref(pkt_b, offset1);

>>>>>> pkt_c = odp_packet_ref(pkt_b, offset2);

>>>>>> pkt_d = odp_packet_ref(pkt_a, offset3);

>>>>>>

>>>>>> Now we have:

>>>>>>

>>>>>> odp_packet_is_ref(pkt_a) == 1

>>>>>> odp_packet_is_ref(pkt_b) == 0

>>>>>> odp_packet_is_ref(pkt_c) == 1

>>>>>> odp_packet_is_ref(pkt_d) == 2

>>>>>> odp_packet_has_ref(pkt_a) == 1

>>>>>> odp_packet_has_ref(pkt_b) == 3

>>>>>> odp_packet_has_ref(pkt_c) == 0

>>>>>> odp_packet_has_ref(pkt_d) == 0

>>>>>>

>>>>>> Essentially, odp_packet_is_ref() answers the question "How many other

>>>>>> packets are referenced via this handle"? While odp_packet_has_ref()

>>>>>> answers the question "How many other handles can be used to reference

>>>>>> this packet"?

>>>>>

>>>>> This information requires lots of unnecessary computation at the

>>>>> implementation level, references are created by linking multiple

>>>>> segments together from application point of view there is no

>>>>> difference between a base packet and a reference both have few shared

>>>>> segments and have few unique segments. I understand the definition of

>>>>> this API, my query is the use-case for returning this number of

>>>>> has_ref and is_ref()?

>>>>

>>>> No that's not true since each packet (whether it's a reference or a

>>>> base packet) has its own metadata (including headroom, user area,

>>>> etc.). One of the reasons we introduced the odp_packet_ref_static()

>>>> API is to allow implementations to further optimize this and create

>>>> the type of references you refer to here when applications assert that

>>>> they're going to treat the reference as entirely read only (and hence

>>>> can share metadata with the base packet). You cannot just link

>>>> segments together and have a valid packet reference since that doesn't

>>>> cover the unique metadata like headroom, etc.

>>>

>>> For creating packet reference only header meta-data needs to be unique

>>> but the segments could be shared even in odp_packet_ref() API.

>>> It is better to share as many segments as possible since copying data

>>> will not be as efficient as linking multiple segments together.

>>>

>>>>

>>>> Given the requirement for unique metadata on a per-reference basis, I

>>>> don't see the implementation of these APIs as problematic. The

>>>> odp-linux implementation shows just how simple they really are.

>>>

>>> odp-linux does not have any constraint on the per-packet meta-data but

>>> it is not true for all implementations, I can agree to add this value

>>> in meta-data if there is any valid use-case for this number if not

>>> then I would prefer not to waste packet meta-data unnecessarily.

>>>

>>> Regards,

>>> Bala

>>>>

>>>>>

>>>>> Regards,

>>>>> Bala

>>>>

>>>> Thank you! This is a great dialog.

>>>>

>>>>>>

>>>>>>>

>>>>>>> Regards,

>>>>>>> Bala

>>>>>>>> +

>>>>>>>> +/*

>>>>>>>> + *

>>>>>>>>   * Copy

>>>>>>>>   * ********************************************************

>>>>>>>>   *

>>>>>>>> --

>>>>>>>> 2.7.4

>>>>>>>>
Balasubramanian Manoharan Dec. 22, 2016, 3:07 p.m. UTC | #13
On 21 December 2016 at 19:45, Bill Fischofer <bill.fischofer@linaro.org> wrote:
> On Wed, Dec 21, 2016 at 6:42 AM, Bala Manoharan

> <bala.manoharan@linaro.org> wrote:

>> Regards,

>> Bala

>>

>>

>> On 20 December 2016 at 22:35, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>> On Tue, Dec 20, 2016 at 8:26 AM, Bala Manoharan

>>> <bala.manoharan@linaro.org> wrote:

>>>> On 20 December 2016 at 18:26, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>>> On Tue, Dec 20, 2016 at 5:59 AM, Bala Manoharan

>>>>> <bala.manoharan@linaro.org> wrote:

>>>>>> Regards,

>>>>>> Bala

>>>>>>

>>>>>>

>>>>>> On 19 December 2016 at 20:23, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>>>>> On Mon, Dec 19, 2016 at 4:06 AM, Bala Manoharan

>>>>>>> <bala.manoharan@linaro.org> wrote:

>>>>>>>> Comments inline....

>>>>>>>>

>>>>>>>>

>>>>>>>> On 15 November 2016 at 20:14, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>>>>>>> Introduce three new APIs that support efficient sharing of portions of

>>>>>>>>> packets.

>>>>>>>>>

>>>>>>>>> odp_packet_ref_static() creates an alias for a base packet

>>>>>>>>>

>>>>>>>>> odp_packet_ref() creates a reference to a base packet

>>>>>>>>>

>>>>>>>>> odp_packet_ref_pkt() creates a reference to a base packet from a supplied

>>>>>>>>> header packet

>>>>>>>>>

>>>>>>>>> In addition, three other APIs simplify working with references

>>>>>>>>>

>>>>>>>>> odp_packet_is_ref() says whether a packet is a reference

>>>>>>>>>

>>>>>>>>> odp_packet_has_ref() says whether a packet has had a reference made to it

>>>>>>>>>

>>>>>>>>> odp_packet_unshared_len() gives the length of the prefix bytes that are

>>>>>>>>> unique to this reference. These are the only bytes of the packet that may

>>>>>>>>> be modified safely.

>>>>>>>>>

>>>>>>>>> Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

>>>>>>>>> ---

>>>>>>>>>  include/odp/api/spec/packet.h | 168 ++++++++++++++++++++++++++++++++++++++++++

>>>>>>>>>  1 file changed, 168 insertions(+)

>>>>>>>>>

>>>>>>>>> diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h

>>>>>>>>> index faf62e2..137024f 100644

>>>>>>>>> --- a/include/odp/api/spec/packet.h

>>>>>>>>> +++ b/include/odp/api/spec/packet.h

>>>>>>>>> @@ -256,6 +256,23 @@ uint32_t odp_packet_seg_len(odp_packet_t pkt);

>>>>>>>>>  uint32_t odp_packet_len(odp_packet_t pkt);

>>>>>>>>>

>>>>>>>>>  /**

>>>>>>>>> + * Packet unshared data len

>>>>>>>>> + *

>>>>>>>>> + * Returns the sum of data lengths over all unshared packet segments. These

>>>>>>>>> + * are the only bytes that should be modified by the caller. The rest of the

>>>>>>>>> + * packet should be treated as read-only.

>>>>>>>>> + *

>>>>>>>>> + * This routine will return 0 if odp_packet_has_ref() != 0 and will be the

>>>>>>>>> + * same as odp_packet_len() if odp_packet_has_ref() == 0 and

>>>>>>>>> + * odp_packet_is_ref() == 0.

>>>>>>>>> + *

>>>>>>>>> + * @param pkt Packet handle

>>>>>>>>> + *

>>>>>>>>> + * @return Packet unshared data length

>>>>>>>>> + */

>>>>>>>>> +uint32_t odp_packet_unshared_len(odp_packet_t pkt);

>>>>>>>>> +

>>>>>>>>> +/**

>>>>>>>>>   * Packet headroom length

>>>>>>>>>   *

>>>>>>>>>   * Returns the current packet level headroom length.

>>>>>>>>> @@ -847,6 +864,157 @@ int odp_packet_split(odp_packet_t *pkt, uint32_t len, odp_packet_t *tail);

>>>>>>>>>

>>>>>>>>>  /*

>>>>>>>>>   *

>>>>>>>>> + * References

>>>>>>>>> + * ********************************************************

>>>>>>>>> + *

>>>>>>>>> + */

>>>>>>>>> +

>>>>>>>>> +/**

>>>>>>>>> + * Create a static reference to a packet

>>>>>>>>> + *

>>>>>>>>> + * A static reference is used to obtain an additional handle for referring to

>>>>>>>>> + * a packet so that the storage behind it is not freed until all references to

>>>>>>>>> + * the packet are freed. This can be used, for example, to support efficient

>>>>>>>>> + * retransmission processing.

>>>>>>>>> + *

>>>>>>>>> + * The intent of a static reference is that both the base packet and the

>>>>>>>>> + * returned reference will be treated as read-only after this call. Results

>>>>>>>>> + * are unpredictable if this restriction is not observed.

>>>>>>>>> + *

>>>>>>>>> + * Static references have restrictions but may have performance advantages on

>>>>>>>>> + * some platforms if the caller does not intend to modify the reference

>>>>>>>>> + * packet. If modification is needed (e.g., to prefix a unique header onto the

>>>>>>>>> + * packet) then odp_packet_ref() or odp_packet_ref_pkt() should be used.

>>>>>>>>> + *

>>>>>>>>> + * @param pkt    Handle of the base packet for which a static reference is

>>>>>>>>> + *               to be created.

>>>>>>>>> + *

>>>>>>>>> + * @return                    Handle of the static reference packet

>>>>>>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>>>>>>> + */

>>>>>>>>> +odp_packet_t odp_packet_ref_static(odp_packet_t pkt);

>>>>>>>>

>>>>>>>> It is better to change the API signature to return the updated handle

>>>>>>>> of the base packet also to the application.

>>>>>>>> Similar to the API for odp_packet_extend_head().

>>>>>>>

>>>>>>> I think returning the packet ref directly rather than indirectly makes

>>>>>>> for easier coding on the part of applications. Failure is indicated by

>>>>>>> returning ODP_PACKET_INVALID. So in this sense we're like

>>>>>>> odp_packet_alloc(). The alternative:

>>>>>>>

>>>>>>> int odp_packet_ref_static(odp_packet_t pkt, odp_packet_t *refpkt);

>>>>>>>

>>>>>>> seems cumbersome since if RC != 0 then the refpkt return would be meaningless.

>>>>>>>

>>>>>>>>

>>>>>>>>> +

>>>>>>>>> +/**

>>>>>>>>> + * Create a reference to a packet

>>>>>>>>> + *

>>>>>>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>>>>>>> + * offset. This reference may be used as an argument to header manipulation

>>>>>>>>> + * APIs to prefix a unique header onto the shared base. The storage associated

>>>>>>>>> + * with the base packet is not freed until all references to it are freed,

>>>>>>>>> + * which permits easy multicasting or retransmission processing to be

>>>>>>>>> + * performed. Following a successful call, the base packet should be treated

>>>>>>>>> + * as read-only. Results are unpredictable if this restriction is not

>>>>>>>>> + * observed.

>>>>>>>>> + *

>>>>>>>>> + * This operation prepends a zero-length header onto the base packet beginning

>>>>>>>>> + * at the specified offset. This header is always drawn from the same pool as

>>>>>>>>> + * the base packet.

>>>>>>>>> + *

>>>>>>>>> + * Because references are unique packets, any bytes pushed onto the head of a

>>>>>>>>> + * reference via odp_packet_push_head() or odp_packet_extend_head() are unique

>>>>>>>>> + * to this reference and are not seen by the base packet or by any other

>>>>>>>>> + * reference to the same base packet.

>>>>>>>>> + *

>>>>>>>>> + * The base packet used as input to this routine may itself by a reference to

>>>>>>>>> + * some other base packet. Implementations MAY restrict the ability to create

>>>>>>>>> + * such compound references. Attempts to exceed any implementation limits in

>>>>>>>>> + * this regard will result in this call failing and returning

>>>>>>>>> + * ODP_PACKET_INVALID.

>>>>>>>>> + *

>>>>>>>>> + * If the caller does not indend to push a header onto the returned reference,

>>>>>>>>> + * the odp_packet_ref_static() API may be used. This may be a more efficient

>>>>>>>>> + * means of obtaining another reference to a base packet that will be treated

>>>>>>>>> + * as read-only.

>>>>>>>>> + *

>>>>>>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>>>>>>> + *               created.

>>>>>>>>> + *

>>>>>>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>>>>>>> + *               is to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>>>>>>> + *

>>>>>>>>> + * @return                    Handle of the reference packet

>>>>>>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>>>>>>> + *

>>>>>>>>> +

>>>>>>>>> + */

>>>>>>>>> +odp_packet_t odp_packet_ref(odp_packet_t pkt, uint32_t offset);

>>>>>>>>> +

>>>>>>>>> +/**

>>>>>>>>> + * Create a reference to another packet from a header packet

>>>>>>>>> + *

>>>>>>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>>>>>>> + * offset by prepending a supplied header packet. The resulting reference

>>>>>>>>> + * consists of the unshared header followed by the shared base packet starting

>>>>>>>>> + * at the specified offset. This shared portion should be regarded as

>>>>>>>>> + * read-only. The storage associated with the shared portion of the reference

>>>>>>>>> + * is not freed until all references to it are freed, which permits easy

>>>>>>>>> + * multicasting or retransmission processing to be performed.

>>>>>>>>

>>>>>>>> My concerns with this API is what happens when application creates

>>>>>>>> multiple references from multiple offsets of the base packet. In that

>>>>>>>> scenario the read-only status of the shared portion could not be

>>>>>>>> maintained since if different references gives different offset there

>>>>>>>> will be more than one shared portion.

>>>>>>>>

>>>>>>>

>>>>>>> Why would this be a problem? We're relying on an "honor system" here

>>>>>>> since there is no defined enforcement mechanism. The rule is that you

>>>>>>> should only modify the unshared portion of any packet and results are

>>>>>>> undefined if this rule is ignored. That's why we have the

>>>>>>> odp_packet_unshared_len() as well as the odp_packet_has_ref() APIs to

>>>>>>> help applications know whether they should or should not attempt to

>>>>>>> modify a packet given it's handle.

>>>>>>

>>>>>> Let us consider an example,

>>>>>>

>>>>>> pkt_a = odp_packet_ref(pkt_base, 200);

>>>>>> -----

>>>>>> -----

>>>>>> -----

>>>>>> pkt_b = odp_packet_ref(pkt_base, 100);

>>>>>>

>>>>>> In the above case, the read-only portion of pkt_base was from 200 to

>>>>>> end-of-packet during creation of first reference and it is moved to

>>>>>> 100 to end-of-packet during creation of second reference so all the

>>>>>> segment handle of pkt_base previously owned by the application becomes

>>>>>> invalid.

>>>>>>

>>>>>

>>>>> No, in the above examples the entirety of pkt_base should be treated

>>>>> as read-only from the moment odp_packet_has_ref(pkt_base) > 0 until

>>>>> odp_packet_has_ref(pkt_base) == 0. That's the simplest, cleanest, and

>>>>> most portable rule. During this time odp_packet_unshared_len(pkt_base)

>>>>> will return 0. I don't see a use case for any other interpretation,

>>>>> and any other interpretation is going to encounter a lot of

>>>>> portability issues. Again, this is an "honor system". It's up to the

>>>>> application to observe this convention since we don't want to specify

>>>>> that the implementation has to somehow make pkt_base read only. All

>>>>> ODP says is results are undefined if applications do not observe this

>>>>> rule.

>>>>>

>>>>> Note that for both pkt_a and pkt_b odp_packet_has_ref() is 0 while

>>>>> odp_packet_is_ref() will be 1. That means that pkt_a and pkt_b can be

>>>>> modified. However since after the odp_packet_ref() call

>>>>> odp_packet_unshared_len() is 0 for both, that means that the only way

>>>>> they should be modified would be to call odp_packet_push_head() or

>>>>> odp_packet_extend_head() to put a unique prefix on the shared payload.

>>>>> So if we call odp_packet_push_head(pkt_a, 64);, then

>>>>> odp_packet_unshare_len(pkt_a) == 64, etc.

>>>>

>>>> I do not see the need why the entire packet should be marked as

>>>> read-only. If you mark the entire base packet as read-only then you

>>>> can not do any header modification for the base packet (ie. tcp or udp

>>>> checksum update) which will make the packet entirely use-less for

>>>> transmission. you already have an API for unshared length which could

>>>> be used to update the unique sections of the base packet.

>>>

>>> It's not "marked". This is a user convention. Marking implies that the

>>> implementation is taking some sort of enforcement steps, which is what

>>> we're trying to avoid. You wouldn't modify the base packet in any way.

>>> Instead you'd create as many references to it as you need and modify

>>> the (unique) headers for each reference. This was the point we

>>> discussed earlier. Under the v2 patch if you need 30 references you

>>> create 30 references and deal with them individually. I agreed to

>>> rework the patch so that you only need to create 29 references and can

>>> use the base packet as a reference in its own right, however this will

>>> incur additional overhead (at least in SW implementations). The

>>> arguments for wanting this seem to be mostly aesthetic since the idea

>>> is that references are cheap so you can create as many as you need

>>> with little overhead. Creating one less but incurring extra overhead

>>> in the process for each one due to the need to maintain an invariant

>>> offset in the presence of a possibly-modifiable base packet seems to

>>> me to be a poor tradeoff.

>>

>> Actually it is not. The idea of creating multiple reference without

>> reserving a read-only base packet is more efficient in HW

>> implementations since it removes the need to differentiate between

>> "base" and "reference" packet and all the packet have equal rights and

>> can be manipulated equally. If your implementation handles packets as

>> linked list of buffers then each buffer probably has a next pointer

>> stored at the start of the buffer and it is not possible to create a

>> segment by pointing at the middle of any segment without creating a

>> new segment since you need to have space for next pointer.

>> Also as discussed in the meeting keeping the shared portion of all

>> referenced packet as read-only has to be done by the application and

>> implementation does not have any control over it.

>

> We're not reserving a read-only base packet. There are no differences

> between base packets and references at an implementation level. The

> idea here is simply that if the application observes the *discipline*

> that it treats base packets as entirely read-only that implementation

> of references can be done more efficiently. That's certainly the case

> in odp-linux and it may be true for other implementations as well

> where perhaps this discipline would remove the need for segment copies

> in many instances.


I understand there is no concept of read-only it is just a
recommendation for the application which is what I would like to
avoid, it is okay to have read-only maintained for the shared section
but not for the unshared head portion. It is better to have same
behaviour for both base and reference packet which would be beneficial
in-terms of performance. I don't see this as very useful for
implementations using segmentation or otherwise. I hope on your next
proposal you would align with this logic.

>

>>

>>>

>>>>

>>>> Also what happens when you free the base packet after creating the

>>>> reference? I believe it should not cause any issue.

>>>

>>> There are no issues with this. It's implementation-defined whether all

>>> of the segments of the base packet are retained or if only the shared

>>> segments are retained. Many implementations will find it's simpler and

>>> easier to just retain them all until all packet references have been

>>> freed and then free everything. Trying to optimize frees on a

>>> per-segment basis is likely to incur more overhead for limited gain

>>> since packet lifespans in the data plane are expected to be short

>>> duration while the packet is transiting the application.

>>

>> If we follow the proposal that there is no difference between a base

>> packet and a reference then the application is free to delete the

>> packets in any order as it prefers and there is no need for any kind

>

> It's always the case that frees can be done in any order. If you have

> an odp_packet_t you can always call odp_packet_free() for it whether

> or not it is a reference. The implementation must employ reference

> counters or similar mechanism to ensure that the actual packet data is

> not freed as long as any references to it are still outstanding. The

> application has no need to do bookkeeping in this regard as this is

> entirely the responsibility of the implementation.

>

>> of book keeping required from the application. This segment level

>> manipulation is very efficient since it can achieve creation of

>> multiple references by copy of very minimal number of bytes.

>

> The problem is that what you really want is zero-copy, not "minimal

> copy". If you assume that most packets are a single segment and/or

> that the vast majority of references will be created with offsets in

> the first segment, this means that every reference is going to involve

> a segment copy of some sort, which means that the creation and use of

> references is going to be relatively high overhead. As we discussed,

> the goal is for references to be highly efficient since the

> expectation is that they will be used frequently for things like

> retransmission, multicast, etc. The updated User Guide discusses this

> and shows code examples of these expected use cases and should be

> considered as part of this discussion.


Yes. there will be a copy of few bytes of the segment but this will be
minimal and most efficient means to create a reference in
implementations which support segmentation. since most of the unshared
portion will be in the head of the packet new segment or copy of few
bytes of the segment will most likely happen only for the first
segment.
zero-copy logic cannot be implemented for implementations supporting
segmentation since if packet is stored as linked-list of segments then
every segment will have to have next pointer usually stored at the
start of the segment and you cannot have a packet in which segment
starts at middle of other segment since you need to store your next
pointer somewhere.

Regards,
Bala

>>

>> Regards,

>> Bala

>>

>>>

>>>> Contrary to this, we can also restrict the number of offset for the

>>>> base packet to 1, so the above conflict do not arise.

>>>

>>> I'm not sure I understand this point. For odp_packet_ref() the API is

>>> defined as valid offsets are 0..odp_packet_len(base_pkt)-1. If the

>>> implementation cannot accommodate this full range then it is always

>>> free to fail the call by returning ODP_PACKET_INVALID. Applications

>>> must always check the returned odp_packet_t since there is no

>>> guarantee that this API call will always succeed.

>>>

>>>>

>>>>>

>>>>>>>

>>>>>>>>> + *

>>>>>>>>> + * Either packet input to this routine may itself be a reference, however

>>>>>>>>> + * individual implementations MAY impose limits on how deeply splices may be

>>>>>>>>> + * nested and fail the call if those limits are exceeded.

>>>>>>>>> + *

>>>>>>>>> + * The packets used as input to this routine may reside in different pools,

>>>>>>>>> + * however individual implementations MAY require that both reside in the same

>>>>>>>>> + * pool and fail the call if this restriction is not observed.

>>>>>>>>

>>>>>>>> Not sure if there is a requirement to support reference with packets

>>>>>>>> from multiple pools.

>>>>>>>

>>>>>>> That's why this is a MAY. We could expose this as a capability or

>>>>>>> simply state that this is not supported, however some implementations

>>>>>>> may be able to support this (e.g., odp-linux has no need to make this

>>>>>>> restriction) and I could see how it could be useful to have header

>>>>>>> templates in their own pool for storage management purposes.

>>>>>>

>>>>>> odp-linux is a special case since it does not use any HW pool manager,

>>>>>> most platforms using HW pool manager might not be able to create a

>>>>>> packet with segments from multiple pools since it will be difficult to

>>>>>> return the segments to respective pools during odp_packet_free().

>>>>>> packet pool is a limited resource in a system and it might not be

>>>>>> advisable to use a separate pool for header template.

>>>>>

>>>>> That's why this is a MAY. Implementation that don't support this would

>>>>> fail the odp_packet_ref_pkt() call by returning ODP_PACKET_INVALID if

>>>>> the header and packet to be referenced reside in different pools. This

>>>>> is no different than an odp_packet_alloc() call failing because the

>>>>> pool is out of memory, etc. Applications are expected to be able to

>>>>> deal with these sort of situations.

>>>>>

>>>>> Again, if we want we can simply say that both packets must come from

>>>>> the same pool and leave it at that. I have no problem changing that if

>>>>> people feel that a more uniform interpretation is desirable. Note

>>>>> however that many of the other packet routines (odp_packet_copy(),

>>>>> odp_packet_concat()) support multi-pool operation.

>>>>

>>>> During packet_copy and packet_concat() there is a need to create two

>>>> completely independent packets after the call there is no shared data

>>>> between the packets and you need to copy the packet from "src" to

>>>> "dst" but that is not the case for reference creation this could run

>>>> in the fast path and should be done as efficient as possible.

>>>

>>> Agreed. The whole idea of references is to copy as little as possible

>>> (ideally nothing). That's what odp-linux does. Having the convention

>>> that the base packet is treated as read only was supposed to make this

>>> simpler for implementations since it doesn't have to worry about the

>>> referenced packet changing. For reasons I don't fully appreciate, this

>>> convention makes life more difficult for your implementation?

>>>

>>>>

>>>>>

>>>>>>

>>>>>>>

>>>>>>>>

>>>>>>>>> + *

>>>>>>>>> + * odp_packet_pool() on the returned reference returns the pool id of the

>>>>>>>>> + * header packet.

>>>>>>>>> + *

>>>>>>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>>>>>>> + *               created.

>>>>>>>>> + *

>>>>>>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>>>>>>> + *               to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>>>>>>> + *

>>>>>>>>> + * @param hdr    Handle of the header packet to be prefixed onto the base

>>>>>>>>> + *               packet to create the reference. If this is specified as

>>>>>>>>> + *               ODP_PACKET_INVALID, this call is equivalent to

>>>>>>>>> + *               odp_packet_ref(). The supplied hdr must be distinct from

>>>>>>>>> + *               the base pkt and results are undefined if an attempt is made

>>>>>>>>> + *               to create circular references.

>>>>>>>>> + *

>>>>>>>>> + * @return       Handle of the reference packet. This may or may not be the

>>>>>>>>> + *               same as the input hdr packet. The caller should only use this

>>>>>>>>> + *               handle since the original hdr packet no longer has any

>>>>>>>>> + *               independent existence.

>>>>>>>>> + *

>>>>>>>>> + * @retval ODP_PACKET_INVALID If the reference failed. In this case both pkt

>>>>>>>>> + *                            and hdr are unchanged.

>>>>>>>>> + */

>>>>>>>>> +odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,

>>>>>>>>> +                               odp_packet_t hdr);

>>>>>>>>> +

>>>>>>>>> +/**

>>>>>>>>> + * Test if a packet is a reference

>>>>>>>>> + *

>>>>>>>>> + * A packet is a reference if it was created by odp_packet_ref() or

>>>>>>>>> + * odp_packet_ref_pkt().  Note that a reference created from another

>>>>>>>>> + * reference is considered a compound reference. Calls on such packets will

>>>>>>>>> + * result in return values greater than 1.

>>>>>>>>> + *

>>>>>>>>> + * @param pkt Packet handle

>>>>>>>>> + *

>>>>>>>>> + * @retval 0  Packet is not a reference

>>>>>>>>> + * @retval >0 Packet is a reference consisting of N individual packets.

>>>>>>>>> + */

>>>>>>>>> +int odp_packet_is_ref(odp_packet_t pkt);

>>>>>>>>

>>>>>>>> It is better to keep the return value as 0 or 1. Is it expected to

>>>>>>>> return the number of references?

>>>>>>>> The total number of references created is not interesting and also it

>>>>>>>> is not so easy since references are created at segment level as per

>>>>>>>> this proposal.

>>>>>>>> Application will have to call odp_packet_free() for all the packet

>>>>>>>> handle it is holding.

>>>>>>>

>>>>>>> Since any implementation supporting references needs to have some

>>>>>>> internal notion of a reference count this is just attempting to expose

>>>>>>> that in an implementation-independent manner. This is also why there

>>>>>>> is both the odp_packet_is_ref() and odp_packet_has_ref() APIs defined.

>>>>>>

>>>>>> Again this depends on the implementation, based on the proposal here

>>>>>> the reference count should be maintained per segment since the

>>>>>> reference is being created using an offset, different segments of the

>>>>>> packets could have different references.

>>>>>> My concern is this count is not useful since application already has

>>>>>> packet handles per reference and it has to free all the handles which

>>>>>> was populated using reference APIs.

>>>>>

>>>>> No, a reference count is logically on a per-packet basis because as

>>>>> noted earlier the entire packet should be treated as read only if any

>>>>> reference to it exists. If implementations want to use per-segment

>>>>> reference counts for internal convenience that's fine, but the API is

>>>>> referring to per-packet references. Remember that applications need to

>>>>> be aware of the existence of segments but ODP APIs are designed to

>>>>> manipulate packets, not segments, since it's assumed that the

>>>>> existence of segments is for the implementation's convenience, not the

>>>>> application's. Yes, we do have various odp_packet_seg_xxx() APIs but

>>>>> they're a bit of a sidecar to the other packet APIs and it's unclear

>>>>> how applications would use these in any generally portable manner.

>>>>

>>>> Again this is not true. We seem to contradict each other heavily :)

>>>> If you want to have an efficient implementation this reference

>>>> creation should be done with zero-copy or as limited coping as

>>>> possible which would be achieved only by reusing common segments.

>>>> IMO there is no use in getting the number of reference which this

>>>> packet holds if you have any valid use-case then we can add this

>>>> feature.

>>>> The main concern for me is that this requires additional per-packet

>>>> meta-data storage.

>>>

>>> Well, you can see my implementation (which is pretty simple and

>>> efficient) but I can't see yours so I don't fully understand the

>>> problem you're trying to solve. If you'd like to have a private

>>> hangout to explore this further let me know.

>>>

>>>

>>>>

>>>>>

>>>>>>

>>>>>>>

>>>>>>>>

>>>>>>>>> +

>>>>>>>>> +/**

>>>>>>>>> + * Test if a packet has a reference

>>>>>>>>> + *

>>>>>>>>> + * A packet has a reference if a reference was created on it by

>>>>>>>>> + * odp_packet_ref_static(), odp_packet_ref(), or odp_packet_ref_pkt().

>>>>>>>>> + *

>>>>>>>>> + * @param pkt Packet handle

>>>>>>>>> + *

>>>>>>>>> + * @retval 0  Packet does not have any references

>>>>>>>>> + * @retval >0 Packet has N references based on it

>>>>>>>>> + */

>>>>>>>>> +int odp_packet_has_ref(odp_packet_t pkt);

>>>>>>>>

>>>>>>>> What is the difference between odp_packet_has_ref() and

>>>>>>>> odp_packet_is_ref()? IMO Once a reference is created there is no

>>>>>>>> difference between the base packet and reference.

>>>>>>>

>>>>>>> Not true. This is discussed (with diagrams) in the User Guide updates

>>>>>>> associated with this patch series.

>>>>>>>

>>>>>>> Consider the call:

>>>>>>>

>>>>>>> pkt_a = odp_packet_ref(pkt_b, 0);

>>>>>>>

>>>>>>> After this call the following conditions hold:

>>>>>>>

>>>>>>> odp_packet_is_ref(pkt_a) == 1

>>>>>>> odp_packet_is_ref(pkt_b) == 0

>>>>>>> odp_packet_has_ref(pkt_a) == 0

>>>>>>> odp_packet_has_ref(pkt_b) == 1

>>>>>>>

>>>>>>> Now consider the more complex case:

>>>>>>>

>>>>>>> pkt_a = odp_packet_ref(pkt_b, offset1);

>>>>>>> pkt_c = odp_packet_ref(pkt_b, offset2);

>>>>>>> pkt_d = odp_packet_ref(pkt_a, offset3);

>>>>>>>

>>>>>>> Now we have:

>>>>>>>

>>>>>>> odp_packet_is_ref(pkt_a) == 1

>>>>>>> odp_packet_is_ref(pkt_b) == 0

>>>>>>> odp_packet_is_ref(pkt_c) == 1

>>>>>>> odp_packet_is_ref(pkt_d) == 2

>>>>>>> odp_packet_has_ref(pkt_a) == 1

>>>>>>> odp_packet_has_ref(pkt_b) == 3

>>>>>>> odp_packet_has_ref(pkt_c) == 0

>>>>>>> odp_packet_has_ref(pkt_d) == 0

>>>>>>>

>>>>>>> Essentially, odp_packet_is_ref() answers the question "How many other

>>>>>>> packets are referenced via this handle"? While odp_packet_has_ref()

>>>>>>> answers the question "How many other handles can be used to reference

>>>>>>> this packet"?

>>>>>>

>>>>>> This information requires lots of unnecessary computation at the

>>>>>> implementation level, references are created by linking multiple

>>>>>> segments together from application point of view there is no

>>>>>> difference between a base packet and a reference both have few shared

>>>>>> segments and have few unique segments. I understand the definition of

>>>>>> this API, my query is the use-case for returning this number of

>>>>>> has_ref and is_ref()?

>>>>>

>>>>> No that's not true since each packet (whether it's a reference or a

>>>>> base packet) has its own metadata (including headroom, user area,

>>>>> etc.). One of the reasons we introduced the odp_packet_ref_static()

>>>>> API is to allow implementations to further optimize this and create

>>>>> the type of references you refer to here when applications assert that

>>>>> they're going to treat the reference as entirely read only (and hence

>>>>> can share metadata with the base packet). You cannot just link

>>>>> segments together and have a valid packet reference since that doesn't

>>>>> cover the unique metadata like headroom, etc.

>>>>

>>>> For creating packet reference only header meta-data needs to be unique

>>>> but the segments could be shared even in odp_packet_ref() API.

>>>> It is better to share as many segments as possible since copying data

>>>> will not be as efficient as linking multiple segments together.

>>>>

>>>>>

>>>>> Given the requirement for unique metadata on a per-reference basis, I

>>>>> don't see the implementation of these APIs as problematic. The

>>>>> odp-linux implementation shows just how simple they really are.

>>>>

>>>> odp-linux does not have any constraint on the per-packet meta-data but

>>>> it is not true for all implementations, I can agree to add this value

>>>> in meta-data if there is any valid use-case for this number if not

>>>> then I would prefer not to waste packet meta-data unnecessarily.

>>>>

>>>> Regards,

>>>> Bala

>>>>>

>>>>>>

>>>>>> Regards,

>>>>>> Bala

>>>>>

>>>>> Thank you! This is a great dialog.

>>>>>

>>>>>>>

>>>>>>>>

>>>>>>>> Regards,

>>>>>>>> Bala

>>>>>>>>> +

>>>>>>>>> +/*

>>>>>>>>> + *

>>>>>>>>>   * Copy

>>>>>>>>>   * ********************************************************

>>>>>>>>>   *

>>>>>>>>> --

>>>>>>>>> 2.7.4

>>>>>>>>>
Bill Fischofer Dec. 23, 2016, 1:27 a.m. UTC | #14
On Thu, Dec 22, 2016 at 9:07 AM, Bala Manoharan
<bala.manoharan@linaro.org> wrote:
> On 21 December 2016 at 19:45, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>> On Wed, Dec 21, 2016 at 6:42 AM, Bala Manoharan

>> <bala.manoharan@linaro.org> wrote:

>>> Regards,

>>> Bala

>>>

>>>

>>> On 20 December 2016 at 22:35, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>> On Tue, Dec 20, 2016 at 8:26 AM, Bala Manoharan

>>>> <bala.manoharan@linaro.org> wrote:

>>>>> On 20 December 2016 at 18:26, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>>>> On Tue, Dec 20, 2016 at 5:59 AM, Bala Manoharan

>>>>>> <bala.manoharan@linaro.org> wrote:

>>>>>>> Regards,

>>>>>>> Bala

>>>>>>>

>>>>>>>

>>>>>>> On 19 December 2016 at 20:23, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>>>>>> On Mon, Dec 19, 2016 at 4:06 AM, Bala Manoharan

>>>>>>>> <bala.manoharan@linaro.org> wrote:

>>>>>>>>> Comments inline....

>>>>>>>>>

>>>>>>>>>

>>>>>>>>> On 15 November 2016 at 20:14, Bill Fischofer <bill.fischofer@linaro.org> wrote:

>>>>>>>>>> Introduce three new APIs that support efficient sharing of portions of

>>>>>>>>>> packets.

>>>>>>>>>>

>>>>>>>>>> odp_packet_ref_static() creates an alias for a base packet

>>>>>>>>>>

>>>>>>>>>> odp_packet_ref() creates a reference to a base packet

>>>>>>>>>>

>>>>>>>>>> odp_packet_ref_pkt() creates a reference to a base packet from a supplied

>>>>>>>>>> header packet

>>>>>>>>>>

>>>>>>>>>> In addition, three other APIs simplify working with references

>>>>>>>>>>

>>>>>>>>>> odp_packet_is_ref() says whether a packet is a reference

>>>>>>>>>>

>>>>>>>>>> odp_packet_has_ref() says whether a packet has had a reference made to it

>>>>>>>>>>

>>>>>>>>>> odp_packet_unshared_len() gives the length of the prefix bytes that are

>>>>>>>>>> unique to this reference. These are the only bytes of the packet that may

>>>>>>>>>> be modified safely.

>>>>>>>>>>

>>>>>>>>>> Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

>>>>>>>>>> ---

>>>>>>>>>>  include/odp/api/spec/packet.h | 168 ++++++++++++++++++++++++++++++++++++++++++

>>>>>>>>>>  1 file changed, 168 insertions(+)

>>>>>>>>>>

>>>>>>>>>> diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h

>>>>>>>>>> index faf62e2..137024f 100644

>>>>>>>>>> --- a/include/odp/api/spec/packet.h

>>>>>>>>>> +++ b/include/odp/api/spec/packet.h

>>>>>>>>>> @@ -256,6 +256,23 @@ uint32_t odp_packet_seg_len(odp_packet_t pkt);

>>>>>>>>>>  uint32_t odp_packet_len(odp_packet_t pkt);

>>>>>>>>>>

>>>>>>>>>>  /**

>>>>>>>>>> + * Packet unshared data len

>>>>>>>>>> + *

>>>>>>>>>> + * Returns the sum of data lengths over all unshared packet segments. These

>>>>>>>>>> + * are the only bytes that should be modified by the caller. The rest of the

>>>>>>>>>> + * packet should be treated as read-only.

>>>>>>>>>> + *

>>>>>>>>>> + * This routine will return 0 if odp_packet_has_ref() != 0 and will be the

>>>>>>>>>> + * same as odp_packet_len() if odp_packet_has_ref() == 0 and

>>>>>>>>>> + * odp_packet_is_ref() == 0.

>>>>>>>>>> + *

>>>>>>>>>> + * @param pkt Packet handle

>>>>>>>>>> + *

>>>>>>>>>> + * @return Packet unshared data length

>>>>>>>>>> + */

>>>>>>>>>> +uint32_t odp_packet_unshared_len(odp_packet_t pkt);

>>>>>>>>>> +

>>>>>>>>>> +/**

>>>>>>>>>>   * Packet headroom length

>>>>>>>>>>   *

>>>>>>>>>>   * Returns the current packet level headroom length.

>>>>>>>>>> @@ -847,6 +864,157 @@ int odp_packet_split(odp_packet_t *pkt, uint32_t len, odp_packet_t *tail);

>>>>>>>>>>

>>>>>>>>>>  /*

>>>>>>>>>>   *

>>>>>>>>>> + * References

>>>>>>>>>> + * ********************************************************

>>>>>>>>>> + *

>>>>>>>>>> + */

>>>>>>>>>> +

>>>>>>>>>> +/**

>>>>>>>>>> + * Create a static reference to a packet

>>>>>>>>>> + *

>>>>>>>>>> + * A static reference is used to obtain an additional handle for referring to

>>>>>>>>>> + * a packet so that the storage behind it is not freed until all references to

>>>>>>>>>> + * the packet are freed. This can be used, for example, to support efficient

>>>>>>>>>> + * retransmission processing.

>>>>>>>>>> + *

>>>>>>>>>> + * The intent of a static reference is that both the base packet and the

>>>>>>>>>> + * returned reference will be treated as read-only after this call. Results

>>>>>>>>>> + * are unpredictable if this restriction is not observed.

>>>>>>>>>> + *

>>>>>>>>>> + * Static references have restrictions but may have performance advantages on

>>>>>>>>>> + * some platforms if the caller does not intend to modify the reference

>>>>>>>>>> + * packet. If modification is needed (e.g., to prefix a unique header onto the

>>>>>>>>>> + * packet) then odp_packet_ref() or odp_packet_ref_pkt() should be used.

>>>>>>>>>> + *

>>>>>>>>>> + * @param pkt    Handle of the base packet for which a static reference is

>>>>>>>>>> + *               to be created.

>>>>>>>>>> + *

>>>>>>>>>> + * @return                    Handle of the static reference packet

>>>>>>>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>>>>>>>> + */

>>>>>>>>>> +odp_packet_t odp_packet_ref_static(odp_packet_t pkt);

>>>>>>>>>

>>>>>>>>> It is better to change the API signature to return the updated handle

>>>>>>>>> of the base packet also to the application.

>>>>>>>>> Similar to the API for odp_packet_extend_head().

>>>>>>>>

>>>>>>>> I think returning the packet ref directly rather than indirectly makes

>>>>>>>> for easier coding on the part of applications. Failure is indicated by

>>>>>>>> returning ODP_PACKET_INVALID. So in this sense we're like

>>>>>>>> odp_packet_alloc(). The alternative:

>>>>>>>>

>>>>>>>> int odp_packet_ref_static(odp_packet_t pkt, odp_packet_t *refpkt);

>>>>>>>>

>>>>>>>> seems cumbersome since if RC != 0 then the refpkt return would be meaningless.

>>>>>>>>

>>>>>>>>>

>>>>>>>>>> +

>>>>>>>>>> +/**

>>>>>>>>>> + * Create a reference to a packet

>>>>>>>>>> + *

>>>>>>>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>>>>>>>> + * offset. This reference may be used as an argument to header manipulation

>>>>>>>>>> + * APIs to prefix a unique header onto the shared base. The storage associated

>>>>>>>>>> + * with the base packet is not freed until all references to it are freed,

>>>>>>>>>> + * which permits easy multicasting or retransmission processing to be

>>>>>>>>>> + * performed. Following a successful call, the base packet should be treated

>>>>>>>>>> + * as read-only. Results are unpredictable if this restriction is not

>>>>>>>>>> + * observed.

>>>>>>>>>> + *

>>>>>>>>>> + * This operation prepends a zero-length header onto the base packet beginning

>>>>>>>>>> + * at the specified offset. This header is always drawn from the same pool as

>>>>>>>>>> + * the base packet.

>>>>>>>>>> + *

>>>>>>>>>> + * Because references are unique packets, any bytes pushed onto the head of a

>>>>>>>>>> + * reference via odp_packet_push_head() or odp_packet_extend_head() are unique

>>>>>>>>>> + * to this reference and are not seen by the base packet or by any other

>>>>>>>>>> + * reference to the same base packet.

>>>>>>>>>> + *

>>>>>>>>>> + * The base packet used as input to this routine may itself by a reference to

>>>>>>>>>> + * some other base packet. Implementations MAY restrict the ability to create

>>>>>>>>>> + * such compound references. Attempts to exceed any implementation limits in

>>>>>>>>>> + * this regard will result in this call failing and returning

>>>>>>>>>> + * ODP_PACKET_INVALID.

>>>>>>>>>> + *

>>>>>>>>>> + * If the caller does not indend to push a header onto the returned reference,

>>>>>>>>>> + * the odp_packet_ref_static() API may be used. This may be a more efficient

>>>>>>>>>> + * means of obtaining another reference to a base packet that will be treated

>>>>>>>>>> + * as read-only.

>>>>>>>>>> + *

>>>>>>>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>>>>>>>> + *               created.

>>>>>>>>>> + *

>>>>>>>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>>>>>>>> + *               is to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>>>>>>>> + *

>>>>>>>>>> + * @return                    Handle of the reference packet

>>>>>>>>>> + * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.

>>>>>>>>>> + *

>>>>>>>>>> +

>>>>>>>>>> + */

>>>>>>>>>> +odp_packet_t odp_packet_ref(odp_packet_t pkt, uint32_t offset);

>>>>>>>>>> +

>>>>>>>>>> +/**

>>>>>>>>>> + * Create a reference to another packet from a header packet

>>>>>>>>>> + *

>>>>>>>>>> + * Create a dynamic reference to a base packet starting at a specified byte

>>>>>>>>>> + * offset by prepending a supplied header packet. The resulting reference

>>>>>>>>>> + * consists of the unshared header followed by the shared base packet starting

>>>>>>>>>> + * at the specified offset. This shared portion should be regarded as

>>>>>>>>>> + * read-only. The storage associated with the shared portion of the reference

>>>>>>>>>> + * is not freed until all references to it are freed, which permits easy

>>>>>>>>>> + * multicasting or retransmission processing to be performed.

>>>>>>>>>

>>>>>>>>> My concerns with this API is what happens when application creates

>>>>>>>>> multiple references from multiple offsets of the base packet. In that

>>>>>>>>> scenario the read-only status of the shared portion could not be

>>>>>>>>> maintained since if different references gives different offset there

>>>>>>>>> will be more than one shared portion.

>>>>>>>>>

>>>>>>>>

>>>>>>>> Why would this be a problem? We're relying on an "honor system" here

>>>>>>>> since there is no defined enforcement mechanism. The rule is that you

>>>>>>>> should only modify the unshared portion of any packet and results are

>>>>>>>> undefined if this rule is ignored. That's why we have the

>>>>>>>> odp_packet_unshared_len() as well as the odp_packet_has_ref() APIs to

>>>>>>>> help applications know whether they should or should not attempt to

>>>>>>>> modify a packet given it's handle.

>>>>>>>

>>>>>>> Let us consider an example,

>>>>>>>

>>>>>>> pkt_a = odp_packet_ref(pkt_base, 200);

>>>>>>> -----

>>>>>>> -----

>>>>>>> -----

>>>>>>> pkt_b = odp_packet_ref(pkt_base, 100);

>>>>>>>

>>>>>>> In the above case, the read-only portion of pkt_base was from 200 to

>>>>>>> end-of-packet during creation of first reference and it is moved to

>>>>>>> 100 to end-of-packet during creation of second reference so all the

>>>>>>> segment handle of pkt_base previously owned by the application becomes

>>>>>>> invalid.

>>>>>>>

>>>>>>

>>>>>> No, in the above examples the entirety of pkt_base should be treated

>>>>>> as read-only from the moment odp_packet_has_ref(pkt_base) > 0 until

>>>>>> odp_packet_has_ref(pkt_base) == 0. That's the simplest, cleanest, and

>>>>>> most portable rule. During this time odp_packet_unshared_len(pkt_base)

>>>>>> will return 0. I don't see a use case for any other interpretation,

>>>>>> and any other interpretation is going to encounter a lot of

>>>>>> portability issues. Again, this is an "honor system". It's up to the

>>>>>> application to observe this convention since we don't want to specify

>>>>>> that the implementation has to somehow make pkt_base read only. All

>>>>>> ODP says is results are undefined if applications do not observe this

>>>>>> rule.

>>>>>>

>>>>>> Note that for both pkt_a and pkt_b odp_packet_has_ref() is 0 while

>>>>>> odp_packet_is_ref() will be 1. That means that pkt_a and pkt_b can be

>>>>>> modified. However since after the odp_packet_ref() call

>>>>>> odp_packet_unshared_len() is 0 for both, that means that the only way

>>>>>> they should be modified would be to call odp_packet_push_head() or

>>>>>> odp_packet_extend_head() to put a unique prefix on the shared payload.

>>>>>> So if we call odp_packet_push_head(pkt_a, 64);, then

>>>>>> odp_packet_unshare_len(pkt_a) == 64, etc.

>>>>>

>>>>> I do not see the need why the entire packet should be marked as

>>>>> read-only. If you mark the entire base packet as read-only then you

>>>>> can not do any header modification for the base packet (ie. tcp or udp

>>>>> checksum update) which will make the packet entirely use-less for

>>>>> transmission. you already have an API for unshared length which could

>>>>> be used to update the unique sections of the base packet.

>>>>

>>>> It's not "marked". This is a user convention. Marking implies that the

>>>> implementation is taking some sort of enforcement steps, which is what

>>>> we're trying to avoid. You wouldn't modify the base packet in any way.

>>>> Instead you'd create as many references to it as you need and modify

>>>> the (unique) headers for each reference. This was the point we

>>>> discussed earlier. Under the v2 patch if you need 30 references you

>>>> create 30 references and deal with them individually. I agreed to

>>>> rework the patch so that you only need to create 29 references and can

>>>> use the base packet as a reference in its own right, however this will

>>>> incur additional overhead (at least in SW implementations). The

>>>> arguments for wanting this seem to be mostly aesthetic since the idea

>>>> is that references are cheap so you can create as many as you need

>>>> with little overhead. Creating one less but incurring extra overhead

>>>> in the process for each one due to the need to maintain an invariant

>>>> offset in the presence of a possibly-modifiable base packet seems to

>>>> me to be a poor tradeoff.

>>>

>>> Actually it is not. The idea of creating multiple reference without

>>> reserving a read-only base packet is more efficient in HW

>>> implementations since it removes the need to differentiate between

>>> "base" and "reference" packet and all the packet have equal rights and

>>> can be manipulated equally. If your implementation handles packets as

>>> linked list of buffers then each buffer probably has a next pointer

>>> stored at the start of the buffer and it is not possible to create a

>>> segment by pointing at the middle of any segment without creating a

>>> new segment since you need to have space for next pointer.

>>> Also as discussed in the meeting keeping the shared portion of all

>>> referenced packet as read-only has to be done by the application and

>>> implementation does not have any control over it.

>>

>> We're not reserving a read-only base packet. There are no differences

>> between base packets and references at an implementation level. The

>> idea here is simply that if the application observes the *discipline*

>> that it treats base packets as entirely read-only that implementation

>> of references can be done more efficiently. That's certainly the case

>> in odp-linux and it may be true for other implementations as well

>> where perhaps this discipline would remove the need for segment copies

>> in many instances.

>

> I understand there is no concept of read-only it is just a

> recommendation for the application which is what I would like to

> avoid, it is okay to have read-only maintained for the shared section

> but not for the unshared head portion. It is better to have same

> behaviour for both base and reference packet which would be beneficial

> in-terms of performance. I don't see this as very useful for

> implementations using segmentation or otherwise. I hope on your next

> proposal you would align with this logic.


Yes, in the next revision I'll post the distinction between base and
reference packets disappears and you can continue to do push/pull
operations on the head of a referenced packet as long as you're
operating in the unshared portion of the packet. There is additional
overhead in odp-linux to support this generalization, but it shouldn't
be meaningful. I'm rebasing things on top of Petri's latest patch
series since the current patch conflicts with that and his fixes
should go in first before new functionality is introduced.

>

>>

>>>

>>>>

>>>>>

>>>>> Also what happens when you free the base packet after creating the

>>>>> reference? I believe it should not cause any issue.

>>>>

>>>> There are no issues with this. It's implementation-defined whether all

>>>> of the segments of the base packet are retained or if only the shared

>>>> segments are retained. Many implementations will find it's simpler and

>>>> easier to just retain them all until all packet references have been

>>>> freed and then free everything. Trying to optimize frees on a

>>>> per-segment basis is likely to incur more overhead for limited gain

>>>> since packet lifespans in the data plane are expected to be short

>>>> duration while the packet is transiting the application.

>>>

>>> If we follow the proposal that there is no difference between a base

>>> packet and a reference then the application is free to delete the

>>> packets in any order as it prefers and there is no need for any kind

>>

>> It's always the case that frees can be done in any order. If you have

>> an odp_packet_t you can always call odp_packet_free() for it whether

>> or not it is a reference. The implementation must employ reference

>> counters or similar mechanism to ensure that the actual packet data is

>> not freed as long as any references to it are still outstanding. The

>> application has no need to do bookkeeping in this regard as this is

>> entirely the responsibility of the implementation.

>>

>>> of book keeping required from the application. This segment level

>>> manipulation is very efficient since it can achieve creation of

>>> multiple references by copy of very minimal number of bytes.

>>

>> The problem is that what you really want is zero-copy, not "minimal

>> copy". If you assume that most packets are a single segment and/or

>> that the vast majority of references will be created with offsets in

>> the first segment, this means that every reference is going to involve

>> a segment copy of some sort, which means that the creation and use of

>> references is going to be relatively high overhead. As we discussed,

>> the goal is for references to be highly efficient since the

>> expectation is that they will be used frequently for things like

>> retransmission, multicast, etc. The updated User Guide discusses this

>> and shows code examples of these expected use cases and should be

>> considered as part of this discussion.

>

> Yes. there will be a copy of few bytes of the segment but this will be

> minimal and most efficient means to create a reference in

> implementations which support segmentation. since most of the unshared

> portion will be in the head of the packet new segment or copy of few

> bytes of the segment will most likely happen only for the first

> segment.

> zero-copy logic cannot be implemented for implementations supporting

> segmentation since if packet is stored as linked-list of segments then

> every segment will have to have next pointer usually stored at the

> start of the segment and you cannot have a packet in which segment

> starts at middle of other segment since you need to store your next

> pointer somewhere.


With the segmentation support that Petri has introduced recently,
odp-linux has the same considerations, however if you look at the
implementation of references they are done at a higher level, which
permits zero-copy references with arbitrary offsets into shared
segments.

The basic idea is not to try to overload the existing segment
structure used to support packets but rather create a way of linking
packets together to form compound packets that simply indirect from
one to the next at the specified offsets. I'm not sure if this
structure can be translated easily to HW-based implementations, but
something to look at as it may provide some ideas to increase
efficient sharing.

>

> Regards,

> Bala

>

>>>

>>> Regards,

>>> Bala

>>>

>>>>

>>>>> Contrary to this, we can also restrict the number of offset for the

>>>>> base packet to 1, so the above conflict do not arise.

>>>>

>>>> I'm not sure I understand this point. For odp_packet_ref() the API is

>>>> defined as valid offsets are 0..odp_packet_len(base_pkt)-1. If the

>>>> implementation cannot accommodate this full range then it is always

>>>> free to fail the call by returning ODP_PACKET_INVALID. Applications

>>>> must always check the returned odp_packet_t since there is no

>>>> guarantee that this API call will always succeed.

>>>>

>>>>>

>>>>>>

>>>>>>>>

>>>>>>>>>> + *

>>>>>>>>>> + * Either packet input to this routine may itself be a reference, however

>>>>>>>>>> + * individual implementations MAY impose limits on how deeply splices may be

>>>>>>>>>> + * nested and fail the call if those limits are exceeded.

>>>>>>>>>> + *

>>>>>>>>>> + * The packets used as input to this routine may reside in different pools,

>>>>>>>>>> + * however individual implementations MAY require that both reside in the same

>>>>>>>>>> + * pool and fail the call if this restriction is not observed.

>>>>>>>>>

>>>>>>>>> Not sure if there is a requirement to support reference with packets

>>>>>>>>> from multiple pools.

>>>>>>>>

>>>>>>>> That's why this is a MAY. We could expose this as a capability or

>>>>>>>> simply state that this is not supported, however some implementations

>>>>>>>> may be able to support this (e.g., odp-linux has no need to make this

>>>>>>>> restriction) and I could see how it could be useful to have header

>>>>>>>> templates in their own pool for storage management purposes.

>>>>>>>

>>>>>>> odp-linux is a special case since it does not use any HW pool manager,

>>>>>>> most platforms using HW pool manager might not be able to create a

>>>>>>> packet with segments from multiple pools since it will be difficult to

>>>>>>> return the segments to respective pools during odp_packet_free().

>>>>>>> packet pool is a limited resource in a system and it might not be

>>>>>>> advisable to use a separate pool for header template.

>>>>>>

>>>>>> That's why this is a MAY. Implementation that don't support this would

>>>>>> fail the odp_packet_ref_pkt() call by returning ODP_PACKET_INVALID if

>>>>>> the header and packet to be referenced reside in different pools. This

>>>>>> is no different than an odp_packet_alloc() call failing because the

>>>>>> pool is out of memory, etc. Applications are expected to be able to

>>>>>> deal with these sort of situations.

>>>>>>

>>>>>> Again, if we want we can simply say that both packets must come from

>>>>>> the same pool and leave it at that. I have no problem changing that if

>>>>>> people feel that a more uniform interpretation is desirable. Note

>>>>>> however that many of the other packet routines (odp_packet_copy(),

>>>>>> odp_packet_concat()) support multi-pool operation.

>>>>>

>>>>> During packet_copy and packet_concat() there is a need to create two

>>>>> completely independent packets after the call there is no shared data

>>>>> between the packets and you need to copy the packet from "src" to

>>>>> "dst" but that is not the case for reference creation this could run

>>>>> in the fast path and should be done as efficient as possible.

>>>>

>>>> Agreed. The whole idea of references is to copy as little as possible

>>>> (ideally nothing). That's what odp-linux does. Having the convention

>>>> that the base packet is treated as read only was supposed to make this

>>>> simpler for implementations since it doesn't have to worry about the

>>>> referenced packet changing. For reasons I don't fully appreciate, this

>>>> convention makes life more difficult for your implementation?

>>>>

>>>>>

>>>>>>

>>>>>>>

>>>>>>>>

>>>>>>>>>

>>>>>>>>>> + *

>>>>>>>>>> + * odp_packet_pool() on the returned reference returns the pool id of the

>>>>>>>>>> + * header packet.

>>>>>>>>>> + *

>>>>>>>>>> + * @param pkt    Handle of the base packet for which a reference is to be

>>>>>>>>>> + *               created.

>>>>>>>>>> + *

>>>>>>>>>> + * @param offset Byte offset in the base packet at which the shared reference

>>>>>>>>>> + *               to begin. Must be in the range 0..odp_packet_len(pkt)-1.

>>>>>>>>>> + *

>>>>>>>>>> + * @param hdr    Handle of the header packet to be prefixed onto the base

>>>>>>>>>> + *               packet to create the reference. If this is specified as

>>>>>>>>>> + *               ODP_PACKET_INVALID, this call is equivalent to

>>>>>>>>>> + *               odp_packet_ref(). The supplied hdr must be distinct from

>>>>>>>>>> + *               the base pkt and results are undefined if an attempt is made

>>>>>>>>>> + *               to create circular references.

>>>>>>>>>> + *

>>>>>>>>>> + * @return       Handle of the reference packet. This may or may not be the

>>>>>>>>>> + *               same as the input hdr packet. The caller should only use this

>>>>>>>>>> + *               handle since the original hdr packet no longer has any

>>>>>>>>>> + *               independent existence.

>>>>>>>>>> + *

>>>>>>>>>> + * @retval ODP_PACKET_INVALID If the reference failed. In this case both pkt

>>>>>>>>>> + *                            and hdr are unchanged.

>>>>>>>>>> + */

>>>>>>>>>> +odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,

>>>>>>>>>> +                               odp_packet_t hdr);

>>>>>>>>>> +

>>>>>>>>>> +/**

>>>>>>>>>> + * Test if a packet is a reference

>>>>>>>>>> + *

>>>>>>>>>> + * A packet is a reference if it was created by odp_packet_ref() or

>>>>>>>>>> + * odp_packet_ref_pkt().  Note that a reference created from another

>>>>>>>>>> + * reference is considered a compound reference. Calls on such packets will

>>>>>>>>>> + * result in return values greater than 1.

>>>>>>>>>> + *

>>>>>>>>>> + * @param pkt Packet handle

>>>>>>>>>> + *

>>>>>>>>>> + * @retval 0  Packet is not a reference

>>>>>>>>>> + * @retval >0 Packet is a reference consisting of N individual packets.

>>>>>>>>>> + */

>>>>>>>>>> +int odp_packet_is_ref(odp_packet_t pkt);

>>>>>>>>>

>>>>>>>>> It is better to keep the return value as 0 or 1. Is it expected to

>>>>>>>>> return the number of references?

>>>>>>>>> The total number of references created is not interesting and also it

>>>>>>>>> is not so easy since references are created at segment level as per

>>>>>>>>> this proposal.

>>>>>>>>> Application will have to call odp_packet_free() for all the packet

>>>>>>>>> handle it is holding.

>>>>>>>>

>>>>>>>> Since any implementation supporting references needs to have some

>>>>>>>> internal notion of a reference count this is just attempting to expose

>>>>>>>> that in an implementation-independent manner. This is also why there

>>>>>>>> is both the odp_packet_is_ref() and odp_packet_has_ref() APIs defined.

>>>>>>>

>>>>>>> Again this depends on the implementation, based on the proposal here

>>>>>>> the reference count should be maintained per segment since the

>>>>>>> reference is being created using an offset, different segments of the

>>>>>>> packets could have different references.

>>>>>>> My concern is this count is not useful since application already has

>>>>>>> packet handles per reference and it has to free all the handles which

>>>>>>> was populated using reference APIs.

>>>>>>

>>>>>> No, a reference count is logically on a per-packet basis because as

>>>>>> noted earlier the entire packet should be treated as read only if any

>>>>>> reference to it exists. If implementations want to use per-segment

>>>>>> reference counts for internal convenience that's fine, but the API is

>>>>>> referring to per-packet references. Remember that applications need to

>>>>>> be aware of the existence of segments but ODP APIs are designed to

>>>>>> manipulate packets, not segments, since it's assumed that the

>>>>>> existence of segments is for the implementation's convenience, not the

>>>>>> application's. Yes, we do have various odp_packet_seg_xxx() APIs but

>>>>>> they're a bit of a sidecar to the other packet APIs and it's unclear

>>>>>> how applications would use these in any generally portable manner.

>>>>>

>>>>> Again this is not true. We seem to contradict each other heavily :)

>>>>> If you want to have an efficient implementation this reference

>>>>> creation should be done with zero-copy or as limited coping as

>>>>> possible which would be achieved only by reusing common segments.

>>>>> IMO there is no use in getting the number of reference which this

>>>>> packet holds if you have any valid use-case then we can add this

>>>>> feature.

>>>>> The main concern for me is that this requires additional per-packet

>>>>> meta-data storage.

>>>>

>>>> Well, you can see my implementation (which is pretty simple and

>>>> efficient) but I can't see yours so I don't fully understand the

>>>> problem you're trying to solve. If you'd like to have a private

>>>> hangout to explore this further let me know.

>>>>

>>>>

>>>>>

>>>>>>

>>>>>>>

>>>>>>>>

>>>>>>>>>

>>>>>>>>>> +

>>>>>>>>>> +/**

>>>>>>>>>> + * Test if a packet has a reference

>>>>>>>>>> + *

>>>>>>>>>> + * A packet has a reference if a reference was created on it by

>>>>>>>>>> + * odp_packet_ref_static(), odp_packet_ref(), or odp_packet_ref_pkt().

>>>>>>>>>> + *

>>>>>>>>>> + * @param pkt Packet handle

>>>>>>>>>> + *

>>>>>>>>>> + * @retval 0  Packet does not have any references

>>>>>>>>>> + * @retval >0 Packet has N references based on it

>>>>>>>>>> + */

>>>>>>>>>> +int odp_packet_has_ref(odp_packet_t pkt);

>>>>>>>>>

>>>>>>>>> What is the difference between odp_packet_has_ref() and

>>>>>>>>> odp_packet_is_ref()? IMO Once a reference is created there is no

>>>>>>>>> difference between the base packet and reference.

>>>>>>>>

>>>>>>>> Not true. This is discussed (with diagrams) in the User Guide updates

>>>>>>>> associated with this patch series.

>>>>>>>>

>>>>>>>> Consider the call:

>>>>>>>>

>>>>>>>> pkt_a = odp_packet_ref(pkt_b, 0);

>>>>>>>>

>>>>>>>> After this call the following conditions hold:

>>>>>>>>

>>>>>>>> odp_packet_is_ref(pkt_a) == 1

>>>>>>>> odp_packet_is_ref(pkt_b) == 0

>>>>>>>> odp_packet_has_ref(pkt_a) == 0

>>>>>>>> odp_packet_has_ref(pkt_b) == 1

>>>>>>>>

>>>>>>>> Now consider the more complex case:

>>>>>>>>

>>>>>>>> pkt_a = odp_packet_ref(pkt_b, offset1);

>>>>>>>> pkt_c = odp_packet_ref(pkt_b, offset2);

>>>>>>>> pkt_d = odp_packet_ref(pkt_a, offset3);

>>>>>>>>

>>>>>>>> Now we have:

>>>>>>>>

>>>>>>>> odp_packet_is_ref(pkt_a) == 1

>>>>>>>> odp_packet_is_ref(pkt_b) == 0

>>>>>>>> odp_packet_is_ref(pkt_c) == 1

>>>>>>>> odp_packet_is_ref(pkt_d) == 2

>>>>>>>> odp_packet_has_ref(pkt_a) == 1

>>>>>>>> odp_packet_has_ref(pkt_b) == 3

>>>>>>>> odp_packet_has_ref(pkt_c) == 0

>>>>>>>> odp_packet_has_ref(pkt_d) == 0

>>>>>>>>

>>>>>>>> Essentially, odp_packet_is_ref() answers the question "How many other

>>>>>>>> packets are referenced via this handle"? While odp_packet_has_ref()

>>>>>>>> answers the question "How many other handles can be used to reference

>>>>>>>> this packet"?

>>>>>>>

>>>>>>> This information requires lots of unnecessary computation at the

>>>>>>> implementation level, references are created by linking multiple

>>>>>>> segments together from application point of view there is no

>>>>>>> difference between a base packet and a reference both have few shared

>>>>>>> segments and have few unique segments. I understand the definition of

>>>>>>> this API, my query is the use-case for returning this number of

>>>>>>> has_ref and is_ref()?

>>>>>>

>>>>>> No that's not true since each packet (whether it's a reference or a

>>>>>> base packet) has its own metadata (including headroom, user area,

>>>>>> etc.). One of the reasons we introduced the odp_packet_ref_static()

>>>>>> API is to allow implementations to further optimize this and create

>>>>>> the type of references you refer to here when applications assert that

>>>>>> they're going to treat the reference as entirely read only (and hence

>>>>>> can share metadata with the base packet). You cannot just link

>>>>>> segments together and have a valid packet reference since that doesn't

>>>>>> cover the unique metadata like headroom, etc.

>>>>>

>>>>> For creating packet reference only header meta-data needs to be unique

>>>>> but the segments could be shared even in odp_packet_ref() API.

>>>>> It is better to share as many segments as possible since copying data

>>>>> will not be as efficient as linking multiple segments together.

>>>>>

>>>>>>

>>>>>> Given the requirement for unique metadata on a per-reference basis, I

>>>>>> don't see the implementation of these APIs as problematic. The

>>>>>> odp-linux implementation shows just how simple they really are.

>>>>>

>>>>> odp-linux does not have any constraint on the per-packet meta-data but

>>>>> it is not true for all implementations, I can agree to add this value

>>>>> in meta-data if there is any valid use-case for this number if not

>>>>> then I would prefer not to waste packet meta-data unnecessarily.

>>>>>

>>>>> Regards,

>>>>> Bala

>>>>>>

>>>>>>>

>>>>>>> Regards,

>>>>>>> Bala

>>>>>>

>>>>>> Thank you! This is a great dialog.

>>>>>>

>>>>>>>>

>>>>>>>>>

>>>>>>>>> Regards,

>>>>>>>>> Bala

>>>>>>>>>> +

>>>>>>>>>> +/*

>>>>>>>>>> + *

>>>>>>>>>>   * Copy

>>>>>>>>>>   * ********************************************************

>>>>>>>>>>   *

>>>>>>>>>> --

>>>>>>>>>> 2.7.4

>>>>>>>>>>
diff mbox

Patch

diff --git a/include/odp/api/spec/packet.h b/include/odp/api/spec/packet.h
index faf62e2..137024f 100644
--- a/include/odp/api/spec/packet.h
+++ b/include/odp/api/spec/packet.h
@@ -256,6 +256,23 @@  uint32_t odp_packet_seg_len(odp_packet_t pkt);
 uint32_t odp_packet_len(odp_packet_t pkt);
 
 /**
+ * Packet unshared data len
+ *
+ * Returns the sum of data lengths over all unshared packet segments. These
+ * are the only bytes that should be modified by the caller. The rest of the
+ * packet should be treated as read-only.
+ *
+ * This routine will return 0 if odp_packet_has_ref() != 0 and will be the
+ * same as odp_packet_len() if odp_packet_has_ref() == 0 and
+ * odp_packet_is_ref() == 0.
+ *
+ * @param pkt Packet handle
+ *
+ * @return Packet unshared data length
+ */
+uint32_t odp_packet_unshared_len(odp_packet_t pkt);
+
+/**
  * Packet headroom length
  *
  * Returns the current packet level headroom length.
@@ -847,6 +864,157 @@  int odp_packet_split(odp_packet_t *pkt, uint32_t len, odp_packet_t *tail);
 
 /*
  *
+ * References
+ * ********************************************************
+ *
+ */
+
+/**
+ * Create a static reference to a packet
+ *
+ * A static reference is used to obtain an additional handle for referring to
+ * a packet so that the storage behind it is not freed until all references to
+ * the packet are freed. This can be used, for example, to support efficient
+ * retransmission processing.
+ *
+ * The intent of a static reference is that both the base packet and the
+ * returned reference will be treated as read-only after this call. Results
+ * are unpredictable if this restriction is not observed.
+ *
+ * Static references have restrictions but may have performance advantages on
+ * some platforms if the caller does not intend to modify the reference
+ * packet. If modification is needed (e.g., to prefix a unique header onto the
+ * packet) then odp_packet_ref() or odp_packet_ref_pkt() should be used.
+ *
+ * @param pkt    Handle of the base packet for which a static reference is
+ *               to be created.
+ *
+ * @return                    Handle of the static reference packet
+ * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.
+ */
+odp_packet_t odp_packet_ref_static(odp_packet_t pkt);
+
+/**
+ * Create a reference to a packet
+ *
+ * Create a dynamic reference to a base packet starting at a specified byte
+ * offset. This reference may be used as an argument to header manipulation
+ * APIs to prefix a unique header onto the shared base. The storage associated
+ * with the base packet is not freed until all references to it are freed,
+ * which permits easy multicasting or retransmission processing to be
+ * performed. Following a successful call, the base packet should be treated
+ * as read-only. Results are unpredictable if this restriction is not
+ * observed.
+ *
+ * This operation prepends a zero-length header onto the base packet beginning
+ * at the specified offset. This header is always drawn from the same pool as
+ * the base packet.
+ *
+ * Because references are unique packets, any bytes pushed onto the head of a
+ * reference via odp_packet_push_head() or odp_packet_extend_head() are unique
+ * to this reference and are not seen by the base packet or by any other
+ * reference to the same base packet.
+ *
+ * The base packet used as input to this routine may itself by a reference to
+ * some other base packet. Implementations MAY restrict the ability to create
+ * such compound references. Attempts to exceed any implementation limits in
+ * this regard will result in this call failing and returning
+ * ODP_PACKET_INVALID.
+ *
+ * If the caller does not indend to push a header onto the returned reference,
+ * the odp_packet_ref_static() API may be used. This may be a more efficient
+ * means of obtaining another reference to a base packet that will be treated
+ * as read-only.
+ *
+ * @param pkt    Handle of the base packet for which a reference is to be
+ *               created.
+ *
+ * @param offset Byte offset in the base packet at which the shared reference
+ *               is to begin. Must be in the range 0..odp_packet_len(pkt)-1.
+ *
+ * @return                    Handle of the reference packet
+ * @retval ODP_PACKET_INVALID Operation failed. Base packet remains unchanged.
+ *
+
+ */
+odp_packet_t odp_packet_ref(odp_packet_t pkt, uint32_t offset);
+
+/**
+ * Create a reference to another packet from a header packet
+ *
+ * Create a dynamic reference to a base packet starting at a specified byte
+ * offset by prepending a supplied header packet. The resulting reference
+ * consists of the unshared header followed by the shared base packet starting
+ * at the specified offset. This shared portion should be regarded as
+ * read-only. The storage associated with the shared portion of the reference
+ * is not freed until all references to it are freed, which permits easy
+ * multicasting or retransmission processing to be performed.
+ *
+ * Either packet input to this routine may itself be a reference, however
+ * individual implementations MAY impose limits on how deeply splices may be
+ * nested and fail the call if those limits are exceeded.
+ *
+ * The packets used as input to this routine may reside in different pools,
+ * however individual implementations MAY require that both reside in the same
+ * pool and fail the call if this restriction is not observed.
+ *
+ * odp_packet_pool() on the returned reference returns the pool id of the
+ * header packet.
+ *
+ * @param pkt    Handle of the base packet for which a reference is to be
+ *               created.
+ *
+ * @param offset Byte offset in the base packet at which the shared reference
+ *               to begin. Must be in the range 0..odp_packet_len(pkt)-1.
+ *
+ * @param hdr    Handle of the header packet to be prefixed onto the base
+ *               packet to create the reference. If this is specified as
+ *               ODP_PACKET_INVALID, this call is equivalent to
+ *               odp_packet_ref(). The supplied hdr must be distinct from
+ *               the base pkt and results are undefined if an attempt is made
+ *               to create circular references.
+ *
+ * @return       Handle of the reference packet. This may or may not be the
+ *               same as the input hdr packet. The caller should only use this
+ *               handle since the original hdr packet no longer has any
+ *               independent existence.
+ *
+ * @retval ODP_PACKET_INVALID If the reference failed. In this case both pkt
+ *                            and hdr are unchanged.
+ */
+odp_packet_t odp_packet_ref_pkt(odp_packet_t pkt, uint32_t offset,
+				odp_packet_t hdr);
+
+/**
+ * Test if a packet is a reference
+ *
+ * A packet is a reference if it was created by odp_packet_ref() or
+ * odp_packet_ref_pkt().  Note that a reference created from another
+ * reference is considered a compound reference. Calls on such packets will
+ * result in return values greater than 1.
+ *
+ * @param pkt Packet handle
+ *
+ * @retval 0  Packet is not a reference
+ * @retval >0 Packet is a reference consisting of N individual packets.
+ */
+int odp_packet_is_ref(odp_packet_t pkt);
+
+/**
+ * Test if a packet has a reference
+ *
+ * A packet has a reference if a reference was created on it by
+ * odp_packet_ref_static(), odp_packet_ref(), or odp_packet_ref_pkt().
+ *
+ * @param pkt Packet handle
+ *
+ * @retval 0  Packet does not have any references
+ * @retval >0 Packet has N references based on it
+ */
+int odp_packet_has_ref(odp_packet_t pkt);
+
+/*
+ *
  * Copy
  * ********************************************************
  *