
[v2,1/3] api: pool: Added packet pool parameters

Message ID 1422968368-14340-1-git-send-email-petri.savolainen@linaro.org
State New

Commit Message

Petri Savolainen Feb. 3, 2015, 12:59 p.m. UTC
Completed odp_pool_param_t definition with packet pool parameters.
Parameter definition is close to what we are using already.

* seg_len: Defines the minimum segment buffer length.
           With this parameter the user can:
           * trade off pool memory usage against SW performance (linear memory access)
           * avoid segmentation in the packet head (e.g. if a legacy app cannot handle
             segmentation in the middle of the packet headers)
           * seg_len < ODP_CONFIG_PACKET_SEG_LEN_MIN is rounded up to ODP_CONFIG_PACKET_SEG_LEN_MIN
           * seg_len > ODP_CONFIG_PACKET_SEG_LEN_MAX is not valid

* seg_align: Defines the minimum segment buffer alignment. With this parameter,
             the user can force buffer alignment to match e.g. the alignment
             requirements of data structures stored in, or algorithms accessing,
             the packet headroom. When the user doesn't have a specific alignment
             requirement, 0 should be used for the default.

* seg_num: Number of segments. This is also the maximum number of packets.
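
For illustration, a minimal sketch of how an application could fill these
fields. The pool type field, ODP_SHM_NULL and the exact odp_pool_create()
signature are assumptions taken from the current tree, not part of this patch:

    #include <string.h>
    #include <odp.h>

    static odp_pool_t create_pkt_pool(void)
    {
            odp_pool_param_t params;

            memset(&params, 0, sizeof(params));
            params.type          = ODP_POOL_PACKET; /* assumption: pool type selected as in the current tree */
            params.pkt.seg_len   = 256;  /* at least 256 bytes of linear packet head per segment */
            params.pkt.seg_align = 0;    /* no special alignment need: take the default */
            params.pkt.seg_num   = 1024; /* 1024 segments, i.e. at most 1024 packets */

            /* ODP_SHM_NULL asks the implementation to allocate the pool memory itself */
            return odp_pool_create("pkt_pool", ODP_SHM_NULL, &params);
    }

The caller checks the return value against ODP_POOL_INVALID, e.g. for the
case where seg_len exceeds ODP_CONFIG_PACKET_SEG_LEN_MAX.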

Signed-off-by: Petri Savolainen <petri.savolainen@linaro.org>
---
 include/odp/api/pool.h | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

Comments

Ola Liljedahl Feb. 3, 2015, 1:31 p.m. UTC | #1
On 3 February 2015 at 13:59, Petri Savolainen
<petri.savolainen@linaro.org> wrote:
> Completed odp_pool_param_t definition with packet pool parameters.
> Parameter definition is close to what we are using already.
>
> * seg_len: Defines minimum segment buffer length.
>            With this parameter user can:
>            * trade-off between pool memory usage and SW performance (linear memory access)
>            * avoid segmentation in packet head (e.g. if legacy app cannot handle
>              segmentation in the middle of the packet headers)
We have already defined a minimum segment size for conforming ODP
implementations. Isn't that enough?

I can see value in specifying the minimum size of the first segment of
a packet (which would contain all headers the application is going to
process). But this proposal goes much further than that.


>            * seg_len < ODP_CONFIG_PACKET_SEG_LEN_MIN is rounded up to ODP_CONFIG_PACKET_SEG_LEN_MIN
>            * seg_len > ODP_CONFIG_PACKET_SEG_LEN_MAX is not valid
>
> * seg_align: Defines minimum segment buffer alignment. With this parameter,
>              user can force buffer alignment to match e.g. aligment requirements
>              of data structures stored in or algorithms accessing the packet
Can you give a practical example of when this configuration is useful?
To my knowledge, most data structures have quite small alignment
requirements, e.g. based on alignment requirements of individual
fields. But I assume that we would specify alignment in multiples
of cache lines here (because the minimum segment alignment would be
the cache line size).

>              headroom. When user don't have specific alignment requirement 0
>              should be used for default.
>
> * seg_num: Number of segments. This is also the maximum number of packets.
I think these configurations could be hints but not strict
requirements. They do not change the *functionality*, so an application
should not fail if these configurations cannot be obeyed (except for
that legacy situation you describe above). The hints enable more
optimal utilization of e.g. packet memory and may decrease SW overhead
during packet processing but do not change the functionality.

To enable different hardware implementations, ODP apps should not
enforce unnecessary (non-functional) requirements on ODP
implementations, which would limit the number of targets ODP can be
implemented on. ODP is not DPDK.

Applications should also not have to first check the limits of the
specific ODP implementation (as you suggested yesterday), adapt their
configuration to that, and then send back those requirements to the ODP
implementation (which still has to check the parameters to verify that
they are valid). This is too complicated and will likely lead to code
that cheats and thus is not portable. It is better for applications just to
specify their requested configuration to ODP and then get back the
results (i.e. the actual values that will be used). The application can
then, if necessary, check that the configuration was honored. This
follows the normal programming flow.
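
In code, the flow I have in mind would look roughly like this. This is a
sketch only: it assumes the parameter struct is treated as in/out, which is
not what the current API does, and uses the odp_pool_create() signature of
the current tree:

    odp_pool_param_t req;

    memset(&req, 0, sizeof(req));
    req.pkt.seg_len = 512;   /* requested values */
    req.pkt.seg_num = 2048;

    /* Assumption: the implementation overwrites the fields with the
     * values it will actually use. */
    odp_pool_t pool = odp_pool_create("pkt_pool", ODP_SHM_NULL, &req);

    if (pool == ODP_POOL_INVALID)
            return -1;

    if (req.pkt.seg_len < 512) {
            /* The request was not honored; the application decides whether
             * that is fatal or can be worked around in SW. */
    }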

>
> Signed-off-by: Petri Savolainen <petri.savolainen@linaro.org>
> ---
>  include/odp/api/pool.h | 26 +++++++++++++++++++++-----
>  1 file changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/include/odp/api/pool.h b/include/odp/api/pool.h
> index d09d92e..a1d7494 100644
> --- a/include/odp/api/pool.h
> +++ b/include/odp/api/pool.h
> @@ -61,13 +61,29 @@ typedef struct odp_pool_param_t {
>                                              of 8. */
>                         uint32_t num;   /**< Number of buffers in the pool */
>                 } buf;
> -/* Reserved for packet and timeout specific params
>                 struct {
> -                       uint32_t seg_size;
> -                       uint32_t seg_align;
> -                       uint32_t num;
> +                       uint32_t seg_len;   /**< Minimum packet segment buffer
> +                                                length in bytes. It includes
> +                                                possible head-/tailroom bytes.
> +                                                The maximum value is defined by
> +                                                ODP_CONFIG_PACKET_SEG_LEN_MAX.
> +                                                Use 0 for default length. */
> +                       uint32_t seg_align; /**< Minimum packet segment buffer
> +                                                alignment in bytes. Valid
> +                                                values are powers of two. The
> +                                                maximum value is defined by
> +                                                ODP_CONFIG_PACKET_SEG_ALIGN_MAX
> +                                                . Use 0 for default alignment.
> +                                                Default will always be a
> +                                                multiple of 8.
> +                                            */
> +                       uint32_t seg_num;   /**< Number of packet segments in
> +                                                the pool. This is also the
> +                                                maximum number of packets,
> +                                                since each packet consist of
> +                                                at least one segment.
What if both seg_num and a shared memory region are specified in the
odp_pool_create call? Which takes precedence?

> +                                            */
>                 } pkt;
> -*/
>                 struct {
>                         uint32_t __res1; /* Keep struct identical to buf, */
>                         uint32_t __res2; /* until pool implementation is fixed*/
> --
> 2.2.2
>
>
> _______________________________________________
> lng-odp mailing list
> lng-odp@lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/lng-odp
Ola Liljedahl Feb. 3, 2015, 9:04 p.m. UTC | #2
Here is my alternative approach; it should achieve the same goals as Petri's
but give more freedom to implementations. You don't have to approve of
it; I just want to show that, given a defined and understood problem,
many potential solutions exist and the first alternative might not be
the best. Let's work together.

* init_seg_len:
        On input: user's required (desired) minimal length of initial
segment (including headroom)
        On output: implementation's best effort to match user's request
        Purpose: ensure that those packet headers the application
normally will process are stored in a consecutive memory area.
        Applications do not have to check any implementation limits first
just to build a configuration which the implementation has to
validate anyway.
        Applications should check the output values to see whether their
desired values were matched. The application decides whether a failure to
match is a fatal error or something it can handle
anyway (e.g. with degraded performance because it has to do some
workarounds in SW).

* seg_len:
        On input: user's desired length of other segments
        On output: implementation's best effort to match user's request
        Purpose: a hint from the user on how to partition the pool into
segments for the best trade-off between memory utilization and SW
processing performance.
        Note: I know some HW can support multiple segment sizes, so I
think it is useful for the API to allow for this. Targets which only
support one segment size (per packet pool) could use e.g.
max(init_seg_len, seg_len). Some targets may not allow user-defined
segment sizes at all; the ODP implementation will just return the
actual values and the application can check whether those are
acceptable.

* init_seg_num:
       On input: Number of initial segments.
       On output: Updated with actual number of segments if a shared
memory region was specified?
* seg_num:
        On input: Number of other segments.
       On output: Updated with actual number of segments if a shared
memory region was specified?
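
To make the above concrete, a hypothetical packet-pool parameter layout
(not existing ODP API) could look like:

    struct {
        uint32_t init_seg_len; /* in:  minimum length of the initial segment
                                       (including headroom)
                                  out: length the implementation will use */
        uint32_t seg_len;      /* in:  desired length of the other segments
                                  out: actual length; a single-segment-size
                                       target could use max(init_seg_len, seg_len) */
        uint32_t init_seg_num; /* in:  number of initial segments
                                  out: actual number, e.g. when capped by a shm region */
        uint32_t seg_num;      /* in/out: number of other segments */
    } pkt;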

I dislike the defines because they will make a future portable ABI
(binary interface) definition impossible. We will need a portable ABI
to support e.g. shared library implementations. So all ODP_CONFIG's
should only be used internally (by the implementation in question) and
not be accessible as compile time constants to applications, i.e. they
should not be part of the ODP API. Are there any other instances where
the application is supposed to use these constants?

-- Ola

On 3 February 2015 at 14:31, Ola Liljedahl <ola.liljedahl@linaro.org> wrote:
> On 3 February 2015 at 13:59, Petri Savolainen
> <petri.savolainen@linaro.org> wrote:
>> Completed odp_pool_param_t definition with packet pool parameters.
>> Parameter definition is close to what we are using already.
>>
>> * seg_len: Defines minimum segment buffer length.
>>            With this parameter user can:
>>            * trade-off between pool memory usage and SW performance (linear memory access)
>>            * avoid segmentation in packet head (e.g. if legacy app cannot handle
>>              segmentation in the middle of the packet headers)
> We already had defined a minimum segment size for conforming ODP
> implementations. Isn't that enough?
>
> I can see value in specifying the minimum size of the first segment of
> a packet (which would contain all headers the application is going to
> process). But this proposal goes much further than that.
>
>
>>            * seg_len < ODP_CONFIG_PACKET_SEG_LEN_MIN is rounded up to ODP_CONFIG_PACKET_SEG_LEN_MIN
>>            * seg_len > ODP_CONFIG_PACKET_SEG_LEN_MAX is not valid
>>
>> * seg_align: Defines minimum segment buffer alignment. With this parameter,
>>              user can force buffer alignment to match e.g. aligment requirements
>>              of data structures stored in or algorithms accessing the packet
> Can you give a practical example of when this configuration is useful?
> To my knowledge, most data structures have quite small alignment
> requirements, e.g. based on alignment requirements of individual
> fields. But here I assume that we would specify alignment in multiples
> of cache lines here (because the minimum segment alignment would be
> the cache line size).
>
>>              headroom. When user don't have specific alignment requirement 0
>>              should be used for default.
>>
>> * seg_num: Number of segments. This is also the maximum number of packets.
> I think these configurations could be hints but not strict
> requirements. They do not change the *functionality* so an application
> should not fail if these configurations can not be obeyed (except for
> that legacy situation you describe above). The hints enable more
> optimal utilization of e.g. packet memory and may decrease SW overhead
> during packet processing but do not change the functionality.
>
> To enable different hardware implementations, ODP apps should not
> enforce unnecessary (non-functional) requirements on the ODP
> implementations and limit the number of targets ODP can be implemented
> on. ODP is not DPDK.
>
> Applications should also not have to first check the limits of the
> specific ODP implementation (as you suggested yesterday), adapts its
> configuration to that and then send back those requirements to the ODP
> implementation (which still has to check the parameters to verify that
> they are valid). This is too complicated and will likely lead to code
> that cheats and thus is not portable. Better for applications just to
> specify its requested configuration to ODP and then get back the
> results (i.e. actual values that will be used). The application can
> then if necessary check that the configuration was honored. This
> follows the normal programming flow.
>
>>
>> Signed-off-by: Petri Savolainen <petri.savolainen@linaro.org>
>> ---
>>  include/odp/api/pool.h | 26 +++++++++++++++++++++-----
>>  1 file changed, 21 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/odp/api/pool.h b/include/odp/api/pool.h
>> index d09d92e..a1d7494 100644
>> --- a/include/odp/api/pool.h
>> +++ b/include/odp/api/pool.h
>> @@ -61,13 +61,29 @@ typedef struct odp_pool_param_t {
>>                                              of 8. */
>>                         uint32_t num;   /**< Number of buffers in the pool */
>>                 } buf;
>> -/* Reserved for packet and timeout specific params
>>                 struct {
>> -                       uint32_t seg_size;
>> -                       uint32_t seg_align;
>> -                       uint32_t num;
>> +                       uint32_t seg_len;   /**< Minimum packet segment buffer
>> +                                                length in bytes. It includes
>> +                                                possible head-/tailroom bytes.
>> +                                                The maximum value is defined by
>> +                                                ODP_CONFIG_PACKET_SEG_LEN_MAX.
>> +                                                Use 0 for default length. */
>> +                       uint32_t seg_align; /**< Minimum packet segment buffer
>> +                                                alignment in bytes. Valid
>> +                                                values are powers of two. The
>> +                                                maximum value is defined by
>> +                                                ODP_CONFIG_PACKET_SEG_ALIGN_MAX
>> +                                                . Use 0 for default alignment.
>> +                                                Default will always be a
>> +                                                multiple of 8.
>> +                                            */
>> +                       uint32_t seg_num;   /**< Number of packet segments in
>> +                                                the pool. This is also the
>> +                                                maximum number of packets,
>> +                                                since each packet consist of
>> +                                                at least one segment.
> What if both seg_num and a shared memory region is specified in the
> odp_pool_create call? Which takes precedence?
>
>> +                                            */
>>                 } pkt;
>> -*/
>>                 struct {
>>                         uint32_t __res1; /* Keep struct identical to buf, */
>>                         uint32_t __res2; /* until pool implementation is fixed*/
>> --
>> 2.2.2
>>
>>
>> _______________________________________________
>> lng-odp mailing list
>> lng-odp@lists.linaro.org
>> http://lists.linaro.org/mailman/listinfo/lng-odp
Bill Fischofer Feb. 3, 2015, 11:48 p.m. UTC | #3
Passing in an application-supplied shm is guesswork since there is no
architected way for an application to know how much storage an
implementation needs to honor any given request.  The way this currently
works is the call succeeds if the supplied shm is large enough and fails
otherwise.  I see no reason to change that behavior.

We originally capped the requested size to ODP_CONFIG_PACKET_BUF_LEN_MAX
but that immediately broke a number of the crypto tests which need 32K
buffers.  That's why the code currently treats requests for large sizes as
a request for an unsegmented pool.

I don't understand the reluctance to recognize unsegmented pools as a
useful construct.  All of this attempted configurability is easily solved
in a fully portable manner by saying that if an application has some reason
for wanting things to be of a specific size it can simply request an
unsegmented pool and be assured that packets will never appear segmented to
it.  There was universal agreement last Summer that this was useful so I'm
not sure what changed in the interim.

Performance may suffer on some platforms when using unsegmented pools, but
presumably that's a trade-off the application has consciously chosen.
Otherwise the application should be content to use whatever segment sizes
the implementation chooses since the relevant ODP APIs always return a
seglen to enable it to navigate them as needed, and we've already
stipulated that the implementation-chosen segment size will never be less
than 256 bytes.
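
Segment-size independent code is not hard to write. A sketch, assuming
segment iteration calls along the lines of odp_packet_first_seg() and
friends in the current packet API:

    /* Process a packet one segment at a time, using whatever segment
     * sizes the implementation has chosen. */
    odp_packet_seg_t seg = odp_packet_first_seg(pkt);

    while (seg != ODP_PACKET_SEG_INVALID) {
            uint8_t *data = odp_packet_seg_data(pkt, seg);
            uint32_t len  = odp_packet_seg_data_len(pkt, seg);

            process(data, len);  /* application-defined per-segment work */
            seg = odp_packet_next_seg(pkt, seg);
    }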

A bit of diversion into HW design may explain my obstinacy here.  The
scaling problems with network processing at higher speeds all have to do
with memory processing speeds.  The rule of thumb is that in the time that
network speeds grow by a factor of 10, corresponding memory speeds grow by
a factor of 3.  This means that for every generation in the progression
from 1Gb to 10Gb to 100Gb and beyond at each step memory bottlenecks become
more and more the gating factor on the ability to handle traffic. Between
10Gb and 100Gb you have to go to NUMA-type architectures because
byte-addressable SMP DRAM simply cannot keep pace.  This has nothing to do
with how clever your SW is or how fast your processors are. It's the memory
latencies and access times that kill you.  Playing around with cache
alignments and such buys you nothing.
So what HW designers do is create segregated DRAMs designed and optimized
for packet processing.  Multiple interleaved memories, in fact, since they
need to be multiplexed to overcome the I/O-memory performance disparity.
These memories have two properties that are relevant to ODP.  First, they
are not directly addressable by processing cores.  This is why there is
always some sort of implicit or explicit mapping function needed to make
the contents of these memories addressable to cores.  Second, they are
block-addressable, not byte-addressable.  The reason for this is that
address decode lines take silicon die area and power and have a real cost
associated with them.  By eliminating byte addressability you make the
memories both faster and cheaper since address decode logic is one of the
more expensive components of any memory subsystem. Cores need byte
addressability to run SW, but HW components use whatever addressability
makes sense for them.  This is where HW packet segmentation arises.  It's
not done for arbitrary reasons and it's "baked into" the design of the
memory system and is not subject to change by SW.

So this is why the implementation chooses packet segment sizes.  It's
because it's matched to the block sizes used by the HW packet memory
subsystem.  It's also why SW should process packets on a per-segment basis,
because otherwise it's incurring additional latency until all of the
segments containing a packet can be made available to it.  HW will
typically prefetch the first segment and start the core processing it as
soon as it's available and let the SW request additional segments as needed
since by design it's expected that any information the SW may need (i.e.,
the packet headers) will reside in the first segment.

I hope that makes sense.  The point is that scalable designs can scale down
to lower speeds easily while those designed for concepts that only work at
lower speeds break completely as you try to scale them up to higher
levels.  So if we design ODP for shared-memory SMP systems we need to
understand that these APIs will need to be completely reworked to enable
applications to work with high-speed devices.  So best to get them right in
the first place.

Bill

On Tue, Feb 3, 2015 at 3:04 PM, Ola Liljedahl <ola.liljedahl@linaro.org>
wrote:

> My alternative approach that should achieve that same goals as Petri
> but give more freedom to implementations. You don't have to approve of
> it, I just want to show that given a defined and understood problem,
> many potential solutions exist and the first alternative might not be
> the best. Let's work together.
>
> * init_seg_len:
>         On input: user's required (desired) minimal length of initial
> segment (including headroom)
>         On output: implementation's best effort to match user's request
>         Purpose: ensure that those packet headers the application
> normally will process are stored in a consecutive memory area.
>         Applications do not have to check any configuration in order
> to initialize a configuration which the implementation anyway has to
> check if it can support.
>         Applications should check output values to see if its desired
> values were matched. The application decides whether a failure to
> match is a fatal error or the application can handle the situation
> anyway (e.g. with degraded performance because it has to do some
> workarounds in SW).
>
> * seg_len:
>         On input: user's desired length of other segments
>         On output: implementation's best effort to match user's request
>         Purpose: a hint from the user how to partition to pool into
> segments for best trade-off between memory utilization and SW
> processing performance.
>         Note: I know some HW can support multiple segment sizes so I
> think it is useful for the API to allow for this. Targets which only
> support one segment size (per packet pool) could use e.g.
> max(int_seg_len, seg_len). Some targets may not allow user-defined
> segment sizes at all, the ODP implementation will just return the
> actual values and the application can check whether those are
> acceptable.
>
> * init_seg_num:
>        On input: Number of initial segments.
>        On output: Updated with actual number of segments if a shared
> memory region was specified?
> * seg_num:
>         On input: Number of other segments.
>        On output: Updated with actual number of segments if a shared
> memory region was specified?
>
> I dislike the defines because they will make a future portable ABI
> (binary interface) definition impossible. We will need a portable ABI
> to support e.g. shared library implementations. So all ODP_CONFIG's
> should only be used internally (by the implementation in question) and
> not be accessible as compile time constants to applications, i.e. they
> should not be part of the ODP API. Are there any other instances where
> the application is supposed to use these constants?
>
> -- Ola
>
> On 3 February 2015 at 14:31, Ola Liljedahl <ola.liljedahl@linaro.org>
> wrote:
> > On 3 February 2015 at 13:59, Petri Savolainen
> > <petri.savolainen@linaro.org> wrote:
> >> Completed odp_pool_param_t definition with packet pool parameters.
> >> Parameter definition is close to what we are using already.
> >>
> >> * seg_len: Defines minimum segment buffer length.
> >>            With this parameter user can:
> >>            * trade-off between pool memory usage and SW performance
> (linear memory access)
> >>            * avoid segmentation in packet head (e.g. if legacy app
> cannot handle
> >>              segmentation in the middle of the packet headers)
> > We already had defined a minimum segment size for conforming ODP
> > implementations. Isn't that enough?
> >
> > I can see value in specifying the minimum size of the first segment of
> > a packet (which would contain all headers the application is going to
> > process). But this proposal goes much further than that.
> >
> >
> >>            * seg_len < ODP_CONFIG_PACKET_SEG_LEN_MIN is rounded up to
> ODP_CONFIG_PACKET_SEG_LEN_MIN
> >>            * seg_len > ODP_CONFIG_PACKET_SEG_LEN_MAX is not valid
> >>
> >> * seg_align: Defines minimum segment buffer alignment. With this
> parameter,
> >>              user can force buffer alignment to match e.g. aligment
> requirements
> >>              of data structures stored in or algorithms accessing the
> packet
> > Can you give a practical example of when this configuration is useful?
> > To my knowledge, most data structures have quite small alignment
> > requirements, e.g. based on alignment requirements of individual
> > fields. But here I assume that we would specify alignment in multiples
> > of cache lines here (because the minimum segment alignment would be
> > the cache line size).
> >
> >>              headroom. When user don't have specific alignment
> requirement 0
> >>              should be used for default.
> >>
> >> * seg_num: Number of segments. This is also the maximum number of
> packets.
> > I think these configurations could be hints but not strict
> > requirements. They do not change the *functionality* so an application
> > should not fail if these configurations can not be obeyed (except for
> > that legacy situation you describe above). The hints enable more
> > optimal utilization of e.g. packet memory and may decrease SW overhead
> > during packet processing but do not change the functionality.
> >
> > To enable different hardware implementations, ODP apps should not
> > enforce unnecessary (non-functional) requirements on the ODP
> > implementations and limit the number of targets ODP can be implemented
> > on. ODP is not DPDK.
> >
> > Applications should also not have to first check the limits of the
> > specific ODP implementation (as you suggested yesterday), adapts its
> > configuration to that and then send back those requirements to the ODP
> > implementation (which still has to check the parameters to verify that
> > they are valid). This is too complicated and will likely lead to code
> > that cheats and thus is not portable. Better for applications just to
> > specify its requested configuration to ODP and then get back the
> > results (i.e. actual values that will be used). The application can
> > then if necessary check that the configuration was honored. This
> > follows the normal programming flow.
> >
> >>
> >> Signed-off-by: Petri Savolainen <petri.savolainen@linaro.org>
> >> ---
> >>  include/odp/api/pool.h | 26 +++++++++++++++++++++-----
> >>  1 file changed, 21 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/include/odp/api/pool.h b/include/odp/api/pool.h
> >> index d09d92e..a1d7494 100644
> >> --- a/include/odp/api/pool.h
> >> +++ b/include/odp/api/pool.h
> >> @@ -61,13 +61,29 @@ typedef struct odp_pool_param_t {
> >>                                              of 8. */
> >>                         uint32_t num;   /**< Number of buffers in the
> pool */
> >>                 } buf;
> >> -/* Reserved for packet and timeout specific params
> >>                 struct {
> >> -                       uint32_t seg_size;
> >> -                       uint32_t seg_align;
> >> -                       uint32_t num;
> >> +                       uint32_t seg_len;   /**< Minimum packet segment
> buffer
> >> +                                                length in bytes. It
> includes
> >> +                                                possible
> head-/tailroom bytes.
> >> +                                                The maximum value is
> defined by
> >> +
> ODP_CONFIG_PACKET_SEG_LEN_MAX.
> >> +                                                Use 0 for default
> length. */
> >> +                       uint32_t seg_align; /**< Minimum packet segment
> buffer
> >> +                                                alignment in bytes.
> Valid
> >> +                                                values are powers of
> two. The
> >> +                                                maximum value is
> defined by
> >> +
> ODP_CONFIG_PACKET_SEG_ALIGN_MAX
> >> +                                                . Use 0 for default
> alignment.
> >> +                                                Default will always be
> a
> >> +                                                multiple of 8.
> >> +                                            */
> >> +                       uint32_t seg_num;   /**< Number of packet
> segments in
> >> +                                                the pool. This is also
> the
> >> +                                                maximum number of
> packets,
> >> +                                                since each packet
> consist of
> >> +                                                at least one segment.
> > What if both seg_num and a shared memory region is specified in the
> > odp_pool_create call? Which takes precedence?
> >
> >> +                                            */
> >>                 } pkt;
> >> -*/
> >>                 struct {
> >>                         uint32_t __res1; /* Keep struct identical to
> buf, */
> >>                         uint32_t __res2; /* until pool implementation
> is fixed*/
> >> --
> >> 2.2.2
> >>
> >>
> >> _______________________________________________
> >> lng-odp mailing list
> >> lng-odp@lists.linaro.org
> >> http://lists.linaro.org/mailman/listinfo/lng-odp
>
> _______________________________________________
> lng-odp mailing list
> lng-odp@lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/lng-odp
>
Ola Liljedahl Feb. 4, 2015, 11:24 a.m. UTC | #4
On 4 February 2015 at 00:48, Bill Fischofer <bill.fischofer@linaro.org> wrote:
> Passing in an application-supplied shm is guesswork since there is no
> architected way for an application to know how much storage an
> implementation needs to honor any given request.  The way this currently
> works is the call succeeds if the supplied shm is large enough and fails
> otherwise.  I see no reason to change that behavior.
Because the size of the shm region allocated for packet storage might
be determined by other means (e.g. because you are partitioning your
total memory map and allocating areas for different purposes) and you
will get as many packets as the implementation can fit into that
space. Basically if you specify a shm region (of some size), this will
override any requested number of packets (or segments). But it could
still be useful to understand how many packets will actually fit into
that space.
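
Something like this, with the odp_pool_create() signature of the current
tree (sketch only):

    /* The application partitions its memory map itself and hands one region
     * to the pool; the implementation fits as many segments as it can. */
    odp_shm_t shm = odp_shm_reserve("pkt_pool_mem", 16 * 1024 * 1024,
                                    ODP_CACHE_LINE_SIZE, 0);

    odp_pool_param_t params;
    memset(&params, 0, sizeof(params));
    params.pkt.seg_len = 256;
    params.pkt.seg_num = 0;  /* let the shm size decide; being able to read
                                back the resulting count is what I ask for */

    odp_pool_t pool = odp_pool_create("pkt_pool", shm, &params);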

>
> We originally capped the requested size to ODP_CONFIG_PACKET_BUF_LEN_MAX but
> that immediately broke a number of the crypto tests which need 32K buffers.
> That's why the code currently treats requests for large sizes as a request
> for an unsegmented pool.
Isn't this a behavior that should be part of the API specification?
Per your description, this seems just to be part of the linux-generic
implementation and could be interpreted differently by other
implementations. Personally, however, I think unsegmented pools should
be requested explicitly rather than inferred from other parameters.
The default is to get whatever segmentation the HW
happens to use (at least 256 bytes per our ODP prerequisite). When the
application needs an exception, this request should be made explicit.
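
Purely as an illustration of what "explicit" could mean (this flag and the
opts field do not exist today, they are hypothetical):

    params.pkt.seg_len = 0;                         /* default: HW-chosen segmentation */
    /* versus an explicit, hypothetical request for an unsegmented pool: */
    params.pkt.opts    = ODP_POOL_PKT_UNSEGMENTED;  /* hypothetical flag */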

>
> I don't understand the reluctance to recognize unsegmented pools as a useful
> construct.  All of this attempted configurability is easily solved in a
> fully portable manner by saying that if an application has some reason for
> wanting things to be of a specific size it can simply request an unsegmented
> pool and be assured that packets will never appear segmented to it.  There
> was universal agreement last Summer that this was useful so I'm not sure
> what changed in the interim.
>
> Performance may suffer on some platforms when using unsegmented pools, but
> presumably that's a trade-off the application has consciously chosen.
> Otherwise the application should be content to use whatever segment sizes
> the implementation chooses since the relevant ODP APIs always return a
> seglen to enable it to navigate them as needed, and we've already stipulated
> that the implementation-chosen segment size will never be less than 256
> bytes.
>
> A bit of diversion into HW design may explain my obstinacy here.  The
> scaling problems with network processing at higher speeds all have to do
> with memory processing speeds.  The rule of thumb is that in the time that
> network speeds grow by a factor of 10, corresponding memory speeds grow by a
> factor of 3.  This means that for every generation in the progression from
> 1Gb to 10Gb to 100Gb and beyond at each step memory bottlenecks become more
> and more the gating factor on the ability to handle traffic. Between 10Gb
> and 100Gb you have to go to NUMA-type architectures because byte-addressable
> SMP DRAM simply cannot keep pace.  This has nothing to do with how clever
> your SW is or how fast your processors are. It's the memory latencies and
> access times that kill you.  Playing around with cache alignments and such
> buys you nothing.
> So what HW designers do is create segregated DRAMs designed and optimized
> for packet processing.  Multiple interleaved memories, in fact, since they
> need to be multiplexed to overcome the I/O-memory performance disparity.
> These memories have two properties that are relevant to ODP.  First, they
> are not directly addressable by processing cores.  This is why there is
> always some sort of implicit or explicit mapping function needed to make the
> contents of these memories addressable to cores.  Second, they are
> block-addressable, not byte-addressable.  The reason for this is that
> address decode lines take silicon die area and power and have a real cost
> associated with them.  By eliminating byte addressability you make the
> memories both faster and cheaper since address decode logic is one of the
> more expensive components of any memory subsystem. Cores need byte
> addressability to run SW, but HW components use whatever addressability
> makes sense for them.  This is where HW packet segmentation arises.  It's
> not done for arbitrary reasons and it's "baked into" the design of the
> memory system and is not subject to change by SW.
>
> So this is why the implementation choses packet segment sizes.  It's because
> it's matched to the block sizes used by the HW packet memory subsystem.
> It's also why SW should process packets on a per-segment basis, because
> otherwise it's incurring additional latency until all of the segments
> containing a packet can be made available to it.  HW will typically prefetch
> the first segment and start the core processing it as soon as it's available
> and let the SW request additional segments as needed since by design it's
> expected that any information the SW may need (i.e., the packet headers)
> will reside in the first segment.
>
> I hope that makes sense.  The point is that scalable designs can scale down
> to lower speeds easily while those designed for concepts that only work at
> lower speeds break completely as you try to scale them up to higher levels.
> So if we design ODP for shared-memory SMP systems we need to understand that
> these APIs will need to be completely reworked to enable applications to
> work with high-speed devices.  So best to get them right in the first place.
>
> Bill
>
> On Tue, Feb 3, 2015 at 3:04 PM, Ola Liljedahl <ola.liljedahl@linaro.org>
> wrote:
>>
>> My alternative approach that should achieve that same goals as Petri
>> but give more freedom to implementations. You don't have to approve of
>> it, I just want to show that given a defined and understood problem,
>> many potential solutions exist and the first alternative might not be
>> the best. Let's work together.
>>
>> * init_seg_len:
>>         On input: user's required (desired) minimal length of initial
>> segment (including headroom)
>>         On output: implementation's best effort to match user's request
>>         Purpose: ensure that those packet headers the application
>> normally will process are stored in a consecutive memory area.
>>         Applications do not have to check any configuration in order
>> to initialize a configuration which the implementation anyway has to
>> check if it can support.
>>         Applications should check output values to see if its desired
>> values were matched. The application decides whether a failure to
>> match is a fatal error or the application can handle the situation
>> anyway (e.g. with degraded performance because it has to do some
>> workarounds in SW).
>>
>> * seg_len:
>>         On input: user's desired length of other segments
>>         On output: implementation's best effort to match user's request
>>         Purpose: a hint from the user how to partition to pool into
>> segments for best trade-off between memory utilization and SW
>> processing performance.
>>         Note: I know some HW can support multiple segment sizes so I
>> think it is useful for the API to allow for this. Targets which only
>> support one segment size (per packet pool) could use e.g.
>> max(int_seg_len, seg_len). Some targets may not allow user-defined
>> segment sizes at all, the ODP implementation will just return the
>> actual values and the application can check whether those are
>> acceptable.
>>
>> * init_seg_num:
>>        On input: Number of initial segments.
>>        On output: Updated with actual number of segments if a shared
>> memory region was specified?
>> * seg_num:
>>         On input: Number of other segments.
>>        On output: Updated with actual number of segments if a shared
>> memory region was specified?
>>
>> I dislike the defines because they will make a future portable ABI
>> (binary interface) definition impossible. We will need a portable ABI
>> to support e.g. shared library implementations. So all ODP_CONFIG's
>> should only be used internally (by the implementation in question) and
>> not be accessible as compile time constants to applications, i.e. they
>> should not be part of the ODP API. Are there any other instances where
>> the application is supposed to use these constants?
>>
>> -- Ola
>>
>> On 3 February 2015 at 14:31, Ola Liljedahl <ola.liljedahl@linaro.org>
>> wrote:
>> > On 3 February 2015 at 13:59, Petri Savolainen
>> > <petri.savolainen@linaro.org> wrote:
>> >> Completed odp_pool_param_t definition with packet pool parameters.
>> >> Parameter definition is close to what we are using already.
>> >>
>> >> * seg_len: Defines minimum segment buffer length.
>> >>            With this parameter user can:
>> >>            * trade-off between pool memory usage and SW performance
>> >> (linear memory access)
>> >>            * avoid segmentation in packet head (e.g. if legacy app
>> >> cannot handle
>> >>              segmentation in the middle of the packet headers)
>> > We already had defined a minimum segment size for conforming ODP
>> > implementations. Isn't that enough?
>> >
>> > I can see value in specifying the minimum size of the first segment of
>> > a packet (which would contain all headers the application is going to
>> > process). But this proposal goes much further than that.
>> >
>> >
>> >>            * seg_len < ODP_CONFIG_PACKET_SEG_LEN_MIN is rounded up to
>> >> ODP_CONFIG_PACKET_SEG_LEN_MIN
>> >>            * seg_len > ODP_CONFIG_PACKET_SEG_LEN_MAX is not valid
>> >>
>> >> * seg_align: Defines minimum segment buffer alignment. With this
>> >> parameter,
>> >>              user can force buffer alignment to match e.g. aligment
>> >> requirements
>> >>              of data structures stored in or algorithms accessing the
>> >> packet
>> > Can you give a practical example of when this configuration is useful?
>> > To my knowledge, most data structures have quite small alignment
>> > requirements, e.g. based on alignment requirements of individual
>> > fields. But here I assume that we would specify alignment in multiples
>> > of cache lines here (because the minimum segment alignment would be
>> > the cache line size).
>> >
>> >>              headroom. When user don't have specific alignment
>> >> requirement 0
>> >>              should be used for default.
>> >>
>> >> * seg_num: Number of segments. This is also the maximum number of
>> >> packets.
>> > I think these configurations could be hints but not strict
>> > requirements. They do not change the *functionality* so an application
>> > should not fail if these configurations can not be obeyed (except for
>> > that legacy situation you describe above). The hints enable more
>> > optimal utilization of e.g. packet memory and may decrease SW overhead
>> > during packet processing but do not change the functionality.
>> >
>> > To enable different hardware implementations, ODP apps should not
>> > enforce unnecessary (non-functional) requirements on the ODP
>> > implementations and limit the number of targets ODP can be implemented
>> > on. ODP is not DPDK.
>> >
>> > Applications should also not have to first check the limits of the
>> > specific ODP implementation (as you suggested yesterday), adapts its
>> > configuration to that and then send back those requirements to the ODP
>> > implementation (which still has to check the parameters to verify that
>> > they are valid). This is too complicated and will likely lead to code
>> > that cheats and thus is not portable. Better for applications just to
>> > specify its requested configuration to ODP and then get back the
>> > results (i.e. actual values that will be used). The application can
>> > then if necessary check that the configuration was honored. This
>> > follows the normal programming flow.
>> >
>> >>
>> >> Signed-off-by: Petri Savolainen <petri.savolainen@linaro.org>
>> >> ---
>> >>  include/odp/api/pool.h | 26 +++++++++++++++++++++-----
>> >>  1 file changed, 21 insertions(+), 5 deletions(-)
>> >>
>> >> diff --git a/include/odp/api/pool.h b/include/odp/api/pool.h
>> >> index d09d92e..a1d7494 100644
>> >> --- a/include/odp/api/pool.h
>> >> +++ b/include/odp/api/pool.h
>> >> @@ -61,13 +61,29 @@ typedef struct odp_pool_param_t {
>> >>                                              of 8. */
>> >>                         uint32_t num;   /**< Number of buffers in the
>> >> pool */
>> >>                 } buf;
>> >> -/* Reserved for packet and timeout specific params
>> >>                 struct {
>> >> -                       uint32_t seg_size;
>> >> -                       uint32_t seg_align;
>> >> -                       uint32_t num;
>> >> +                       uint32_t seg_len;   /**< Minimum packet segment
>> >> buffer
>> >> +                                                length in bytes. It
>> >> includes
>> >> +                                                possible
>> >> head-/tailroom bytes.
>> >> +                                                The maximum value is
>> >> defined by
>> >> +
>> >> ODP_CONFIG_PACKET_SEG_LEN_MAX.
>> >> +                                                Use 0 for default
>> >> length. */
>> >> +                       uint32_t seg_align; /**< Minimum packet segment
>> >> buffer
>> >> +                                                alignment in bytes.
>> >> Valid
>> >> +                                                values are powers of
>> >> two. The
>> >> +                                                maximum value is
>> >> defined by
>> >> +
>> >> ODP_CONFIG_PACKET_SEG_ALIGN_MAX
>> >> +                                                . Use 0 for default
>> >> alignment.
>> >> +                                                Default will always be
>> >> a
>> >> +                                                multiple of 8.
>> >> +                                            */
>> >> +                       uint32_t seg_num;   /**< Number of packet
>> >> segments in
>> >> +                                                the pool. This is also
>> >> the
>> >> +                                                maximum number of
>> >> packets,
>> >> +                                                since each packet
>> >> consist of
>> >> +                                                at least one segment.
>> > What if both seg_num and a shared memory region is specified in the
>> > odp_pool_create call? Which takes precedence?
>> >
>> >> +                                            */
>> >>                 } pkt;
>> >> -*/
>> >>                 struct {
>> >>                         uint32_t __res1; /* Keep struct identical to
>> >> buf, */
>> >>                         uint32_t __res2; /* until pool implementation
>> >> is fixed*/
>> >> --
>> >> 2.2.2
>> >>
>> >>
>> >> _______________________________________________
>> >> lng-odp mailing list
>> >> lng-odp@lists.linaro.org
>> >> http://lists.linaro.org/mailman/listinfo/lng-odp
>>
>> _______________________________________________
>> lng-odp mailing list
>> lng-odp@lists.linaro.org
>> http://lists.linaro.org/mailman/listinfo/lng-odp
>
>
Maxim Uvarov Feb. 4, 2015, 4:07 p.m. UTC | #5
Will there be an updated version?

Maxim.

On 02/03/2015 03:59 PM, Petri Savolainen wrote:
> Completed odp_pool_param_t definition with packet pool parameters.
> Parameter definition is close to what we are using already.
>
> * seg_len: Defines minimum segment buffer length.
>             With this parameter user can:
>             * trade-off between pool memory usage and SW performance (linear memory access)
>             * avoid segmentation in packet head (e.g. if legacy app cannot handle
>               segmentation in the middle of the packet headers)
>             * seg_len < ODP_CONFIG_PACKET_SEG_LEN_MIN is rounded up to ODP_CONFIG_PACKET_SEG_LEN_MIN
>             * seg_len > ODP_CONFIG_PACKET_SEG_LEN_MAX is not valid
>
> * seg_align: Defines minimum segment buffer alignment. With this parameter,
>               user can force buffer alignment to match e.g. aligment requirements
>               of data structures stored in or algorithms accessing the packet
>               headroom. When user don't have specific alignment requirement 0
>               should be used for default.
>
> * seg_num: Number of segments. This is also the maximum number of packets.
>
> Signed-off-by: Petri Savolainen <petri.savolainen@linaro.org>
> ---
>   include/odp/api/pool.h | 26 +++++++++++++++++++++-----
>   1 file changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/include/odp/api/pool.h b/include/odp/api/pool.h
> index d09d92e..a1d7494 100644
> --- a/include/odp/api/pool.h
> +++ b/include/odp/api/pool.h
> @@ -61,13 +61,29 @@ typedef struct odp_pool_param_t {
>   					     of 8. */
>   			uint32_t num;   /**< Number of buffers in the pool */
>   		} buf;
> -/* Reserved for packet and timeout specific params
>   		struct {
> -			uint32_t seg_size;
> -			uint32_t seg_align;
> -			uint32_t num;
> +			uint32_t seg_len;   /**< Minimum packet segment buffer
> +						 length in bytes. It includes
> +						 possible head-/tailroom bytes.
> +						 The maximum value is defined by
> +						 ODP_CONFIG_PACKET_SEG_LEN_MAX.
> +						 Use 0 for default length. */
> +			uint32_t seg_align; /**< Minimum packet segment buffer
> +						 alignment in bytes. Valid
> +						 values are powers of two. The
> +						 maximum value is defined by
> +						 ODP_CONFIG_PACKET_SEG_ALIGN_MAX
> +						 . Use 0 for default alignment.
> +						 Default will always be a
> +						 multiple of 8.
> +					     */
> +			uint32_t seg_num;   /**< Number of packet segments in
> +						 the pool. This is also the
> +						 maximum number of packets,
> +						 since each packet consist of
> +						 at least one segment.
> +					     */
>   		} pkt;
> -*/
>   		struct {
>   			uint32_t __res1; /* Keep struct identical to buf, */
>   			uint32_t __res2; /* until pool implementation is fixed*/
Taras Kondratiuk Feb. 5, 2015, 9:12 p.m. UTC | #6
On 02/03/2015 11:04 PM, Ola Liljedahl wrote:
> My alternative approach that should achieve that same goals as Petri
> but give more freedom to implementations. You don't have to approve of
> it, I just want to show that given a defined and understood problem,
> many potential solutions exist and the first alternative might not be
> the best. Let's work together.
> 
> * init_seg_len:
>         On input: user's required (desired) minimal length of initial
> segment (including headroom)
>         On output: implementation's best effort to match user's request
>         Purpose: ensure that those packet headers the application
> normally will process are stored in a consecutive memory area.
>         Applications do not have to check any configuration in order
> to initialize a configuration which the implementation anyway has to
> check if it can support.
>         Applications should check output values to see if its desired
> values were matched. The application decides whether a failure to
> match is a fatal error or the application can handle the situation
> anyway (e.g. with degraded performance because it has to do some
> workarounds in SW).
> 
> * seg_len:
>         On input: user's desired length of other segments
>         On output: implementation's best effort to match user's request
>         Purpose: a hint from the user how to partition to pool into
> segments for best trade-off between memory utilization and SW
> processing performance.
>         Note: I know some HW can support multiple segment sizes so I
> think it is useful for the API to allow for this. Targets which only
> support one segment size (per packet pool) could use e.g.
> max(int_seg_len, seg_len). Some targets may not allow user-defined
> segment sizes at all, the ODP implementation will just return the
> actual values and the application can check whether those are
> acceptable.
> 
> * init_seg_num:
>        On input: Number of initial segments.
>        On output: Updated with actual number of segments if a shared
> memory region was specified?
> * seg_num:
>         On input: Number of other segments.
>        On output: Updated with actual number of segments if a shared
> memory region was specified?

Wouldn't it be simpler to create two pools and assign both to Pktio?
Ola Liljedahl Feb. 5, 2015, 10:18 p.m. UTC | #7
On 5 February 2015 at 22:12, Taras Kondratiuk
<taras.kondratiuk@linaro.org> wrote:
> On 02/03/2015 11:04 PM, Ola Liljedahl wrote:
>> My alternative approach that should achieve that same goals as Petri
>> but give more freedom to implementations. You don't have to approve of
>> it, I just want to show that given a defined and understood problem,
>> many potential solutions exist and the first alternative might not be
>> the best. Let's work together.
>>
>> * init_seg_len:
>>         On input: user's required (desired) minimal length of initial
>> segment (including headroom)
>>         On output: implementation's best effort to match user's request
>>         Purpose: ensure that those packet headers the application
>> normally will process are stored in a consecutive memory area.
>>         Applications do not have to check any configuration in order
>> to initialize a configuration which the implementation anyway has to
>> check if it can support.
>>         Applications should check output values to see if its desired
>> values were matched. The application decides whether a failure to
>> match is a fatal error or the application can handle the situation
>> anyway (e.g. with degraded performance because it has to do some
>> workarounds in SW).
>>
>> * seg_len:
>>         On input: user's desired length of other segments
>>         On output: implementation's best effort to match user's request
>>         Purpose: a hint from the user how to partition to pool into
>> segments for best trade-off between memory utilization and SW
>> processing performance.
>>         Note: I know some HW can support multiple segment sizes so I
>> think it is useful for the API to allow for this. Targets which only
>> support one segment size (per packet pool) could use e.g.
>> max(int_seg_len, seg_len). Some targets may not allow user-defined
>> segment sizes at all, the ODP implementation will just return the
>> actual values and the application can check whether those are
>> acceptable.
>>
>> * init_seg_num:
>>        On input: Number of initial segments.
>>        On output: Updated with actual number of segments if a shared
>> memory region was specified?
>> * seg_num:
>>         On input: Number of other segments.
>>        On output: Updated with actual number of segments if a shared
>> memory region was specified?
>
> Wouldn't it be simpler to create two pools and assign both to Pktio?
Exactly the kind of constructive comments I want to enable by starting
with defining and agreeing on the problem!

If this is not an acceptable solution because of some other (currently
unknown) requirement, we need to understand and formalize that
requirement as well.

Is there any disadvantage of pre-segregating the buffers into
different pools (with different segment sizes)? Could this limit how
HW uses those segments for different purposes? Can a multi-segment
packet contain segments from different pools?
Bill Fischofer Feb. 5, 2015, 11:08 p.m. UTC | #8
HW is free to organize whatever underlying facilities it has at its
disposal to match ODP API semantics.  That's the whole point of having ODP
APIs specify only what's actually needed at the application level and
giving the implementation freedom in how to provide the needed semantics.
For example, we've already discussed how an implementation that supports
only a single physical pool might partition that into multiple logical
pools using usage counters or some other scheme.  So there's no assurance
that the ODP concept of pools map one-to-one to underlying HW concepts.
The actual structure of a pool is completely opaque.  The only observable
semantics are those that the ODP APIs define.  If we take the
API/Implementation separation model seriously, we really need to take care
to ensure that ODP API definitions do not assume an underlying
implementation.

In the case of segmentation, if the application is allowed to specify pool
segment sizes, then what the API is actually specifying is what the
application wishes to observe--it has nothing to do with how the underlying
implementation may actually work on a given platform.  So we can easily get
into a situation with this or other APIs that try to specify implementation
behavior where the application believes it is helping to "optimize" things
but is only causing additional work and overhead at the implementation
level to present the requested "optimized" observable behavior back to the
application.

This is why I keep coming back to fundamentals.  Either the application is
or is not prepared to deal with segments.  If it is not, just say so and
let the implementation do whatever it needs to do to hide segments from the
application.  If it is, then just deal with whatever segment sizes are
native to that platform, trusting that the designers of that platform know
what they are doing.  Writing segment-size independent code is no different
than writing endian-neutral code or code that can run both 32-bit and
64-bit. Trying to control segment sizes at the application level is
effectively saying that the application really can't deal with segments but
somehow feels guilty about that and hence wants to "support" segment
processing by mandating sizes such that it won't ever actually have to deal
with them.

Taras rightly points out that we already have a perfectly usable model for
sorting packets into unsegmented pools of varying sizes based on the
lengths of incoming packets. This seems the perfect compromise between
application simplicity and memory efficiency, and I'm not sure why we
wouldn't want to promote this approach for applications that don't want to
deal with segmentation.

On Thu, Feb 5, 2015 at 4:18 PM, Ola Liljedahl <ola.liljedahl@linaro.org>
wrote:

> On 5 February 2015 at 22:12, Taras Kondratiuk
> <taras.kondratiuk@linaro.org> wrote:
> > On 02/03/2015 11:04 PM, Ola Liljedahl wrote:
> >> My alternative approach that should achieve that same goals as Petri
> >> but give more freedom to implementations. You don't have to approve of
> >> it, I just want to show that given a defined and understood problem,
> >> many potential solutions exist and the first alternative might not be
> >> the best. Let's work together.
> >>
> >> * init_seg_len:
> >>         On input: user's required (desired) minimal length of initial
> >> segment (including headroom)
> >>         On output: implementation's best effort to match user's request
> >>         Purpose: ensure that those packet headers the application
> >> normally will process are stored in a consecutive memory area.
> >>         Applications do not have to check any configuration in order
> >> to initialize a configuration which the implementation anyway has to
> >> check if it can support.
> >>         Applications should check output values to see if its desired
> >> values were matched. The application decides whether a failure to
> >> match is a fatal error or the application can handle the situation
> >> anyway (e.g. with degraded performance because it has to do some
> >> workarounds in SW).
> >>
> >> * seg_len:
> >>         On input: user's desired length of other segments
> >>         On output: implementation's best effort to match user's request
> >>         Purpose: a hint from the user how to partition to pool into
> >> segments for best trade-off between memory utilization and SW
> >> processing performance.
> >>         Note: I know some HW can support multiple segment sizes so I
> >> think it is useful for the API to allow for this. Targets which only
> >> support one segment size (per packet pool) could use e.g.
> >> max(int_seg_len, seg_len). Some targets may not allow user-defined
> >> segment sizes at all, the ODP implementation will just return the
> >> actual values and the application can check whether those are
> >> acceptable.
> >>
> >> * init_seg_num:
> >>        On input: Number of initial segments.
> >>        On output: Updated with actual number of segments if a shared
> >> memory region was specified?
> >> * seg_num:
> >>         On input: Number of other segments.
> >>        On output: Updated with actual number of segments if a shared
> >> memory region was specified?
> >
> > Wouldn't it be simpler to create two pools and assign both to Pktio?
> Exactly the kind of constructive comments I want to enable by starting
> with defining and agreeing on the problem!
>
> If this is not an acceptable solution because of some other (currently
> unknown) requirement, we need to understand and formalize that
> requirement as well.
>
> Is there any disadvantage of pre-segregating the buffers into
> different pools (with different segment sizes)? Could this limit how
> HW uses those segments for different purposes? Can a multi-segment
> packet contain segments from different pools?
>
Bill Fischofer Feb. 6, 2015, 2:22 p.m. UTC | #9
On Fri, Feb 6, 2015 at 3:15 AM, Savolainen, Petri (NSN - FI/Espoo) <
petri.savolainen@nsn.com> wrote:

>
>
> > -----Original Message-----
> > From: ext Ola Liljedahl [mailto:ola.liljedahl@linaro.org]
> > Sent: Tuesday, February 03, 2015 11:05 PM
> > To: Petri Savolainen
> > Cc: LNG ODP Mailman List
> > Subject: Re: [lng-odp] [PATCH v2 1/3] api: pool: Added packet pool
> > parameters
> >
> > My alternative approach that should achieve that same goals as Petri
> > but give more freedom to implementations. You don't have to approve of
> > it, I just want to show that given a defined and understood problem,
> > many potential solutions exist and the first alternative might not be
> > the best. Let's work together.
>
> I think your proposal is not substantially different what I have suggested
> already (including conf calls). Main difference is that 1st segment vs
> other segments configuration would be introduced already in v1.0. I was
> thinking to nail v1.0 with minimum support (keep it close to what we have
> already done) and extend from there (with concept of 1st/other segments).
>
>
> >
> > * init_seg_len:
> >         On input: user's required (desired) minimal length of initial
> > segment (including headroom)
> >         On output: implementation's best effort to match user's request
> >         Purpose: ensure that those packet headers the application
> > normally will process are stored in a consecutive memory area.
> >         Applications do not have to check any configuration in order
> > to initialize a configuration which the implementation anyway has to
> > check if it can support.
> >         Applications should check output values to see if its desired
> > values were matched. The application decides whether a failure to
> > match is a fatal error or the application can handle the situation
> > anyway (e.g. with degraded performance because it has to do some
> > workarounds in SW).
>
> This would be .first_seg_len. Application is not interested on the actual
> len, either implementation can support the minimum requirement or not.
> That's why output param is not necessary, but min/max limits are needed.
> Limits can be provided as macros or function calls.
>
> I'd postpone splitting .seg_len from current proposal to .first_seg_len
> and .other_seg_len some time after v1.0 when we continue packet API
> segmentation API development anyway.
>
> >
> > * seg_len:
> >         On input: user's desired length of other segments
> >         On output: implementation's best effort to match user's request
> >         Purpose: a hint from the user how to partition to pool into
> > segments for best trade-off between memory utilization and SW
> > processing performance.
> >         Note: I know some HW can support multiple segment sizes so I
> > think it is useful for the API to allow for this. Targets which only
> > support one segment size (per packet pool) could use e.g.
> > max(int_seg_len, seg_len). Some targets may not allow user-defined
> > segment sizes at all, the ODP implementation will just return the
> > actual values and the application can check whether those are
> > acceptable.
>
> This would be .other_seg_len. This can be hint vs 1st seg len is strict
> requirement.
>
> >
> > * init_seg_num:
> >        On input: Number of initial segments.
> >        On output: Updated with actual number of segments if a shared
> > memory region was specified?
>
> This is also max number of packets (== seg_num in my proposal). Again I
> considered to postpone the split of first_seg_num/packet_num and
> other_seg_num after v1.0.
>
> Application may want (usually does) to specify max number of packet as
> hard limit, since that's the limit for maximum number of in-flight packet
> inside your app / SoC. The limit is  needed for QoS, per packet context
> memory allocation, etc.
>

I agree, which is why there should be a num_pkts parameter specified here
just as there is a num_bufs parameter for buffer pool creation.  The number
of packets is independent of the number of segments (one or more) they may
occupy.  Given num_pkts and the expected average length of those packets,
the implementation can determine how much storage it needs to reserve for
the pool, which is what the current linux-generic code is doing.
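
Purely as an illustration of that sizing step (names here are made up, not
the linux-generic internals), the arithmetic might look like:

#include <stdint.h>

/* Illustrative only: estimate the storage a pool needs from the packet
 * count, the expected average packet length and the chosen segment size.
 * Headroom is assumed to live in the first segment of each packet. */
static uint64_t pool_storage_bytes(uint32_t num_pkts, uint32_t avg_pkt_len,
				   uint32_t seg_len, uint32_t headroom)
{
	uint32_t segs_per_pkt = (avg_pkt_len + headroom + seg_len - 1)
				/ seg_len;

	return (uint64_t)num_pkts * segs_per_pkt * seg_len;
}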


>
> It depends on application if actual number is interesting or only
> success/failure.
>
>
> > * seg_num:
> >         On input: Number of other segments.
> >        On output: Updated with actual number of segments if a shared
> > memory region was specified?
> >
> > I dislike the defines because they will make a future portable ABI
> > (binary interface) definition impossible. We will need a portable ABI
> > to support e.g. shared library implementations. So all ODP_CONFIG's
> > should only be used internally (by the implementation in question) and
> > not be accessible as compile time constants to applications, i.e. they
> > should not be part of the ODP API. Are there any other instances where
> > the application is supposed to use these constants?
>
> This a new requirement, and is probably a valid one also. Obviously, any
> config interface development need to be postponed after v1.0.
>
> Since v1.0 has ODP_CONFIG_XXX (config.h) in the API, it's possible to use
> that mechanism also for these limits. Functions could be used also, but it
> would not solve the config issue as have used ODP_CONFIG_XXX already for
> other things like buffers. Minimizing changes to current practices to reach
> v1.0 sooner than later...
>
>
> -Petri
>
>
> >
> > -- Ola
> >
> > On 3 February 2015 at 14:31, Ola Liljedahl <ola.liljedahl@linaro.org>
> > wrote:
> > > On 3 February 2015 at 13:59, Petri Savolainen
> > > <petri.savolainen@linaro.org> wrote:
> > >> Completed odp_pool_param_t definition with packet pool parameters.
> > >> Parameter definition is close to what we are using already.
> > >>
> > >> * seg_len: Defines minimum segment buffer length.
> > >>            With this parameter user can:
> > >>            * trade-off between pool memory usage and SW performance
> > (linear memory access)
> > >>            * avoid segmentation in packet head (e.g. if legacy app
> > cannot handle
> > >>              segmentation in the middle of the packet headers)
> > > We already had defined a minimum segment size for conforming ODP
> > > implementations. Isn't that enough?
> > >
> > > I can see value in specifying the minimum size of the first segment of
> > > a packet (which would contain all headers the application is going to
> > > process). But this proposal goes much further than that.
> > >
> > >
> > >>            * seg_len < ODP_CONFIG_PACKET_SEG_LEN_MIN is rounded up to
> > ODP_CONFIG_PACKET_SEG_LEN_MIN
> > >>            * seg_len > ODP_CONFIG_PACKET_SEG_LEN_MAX is not valid
> > >>
> > >> * seg_align: Defines minimum segment buffer alignment. With this
> > parameter,
> > >>              user can force buffer alignment to match e.g. aligment
> > requirements
> > >>              of data structures stored in or algorithms accessing the
> > packet
> > > Can you give a practical example of when this configuration is useful?
> > > To my knowledge, most data structures have quite small alignment
> > > requirements, e.g. based on alignment requirements of individual
> > > fields. But here I assume that we would specify alignment in multiples
> > > of cache lines here (because the minimum segment alignment would be
> > > the cache line size).
> > >
> > >>              headroom. When user don't have specific alignment
> > requirement 0
> > >>              should be used for default.
> > >>
> > >> * seg_num: Number of segments. This is also the maximum number of
> > packets.
> > > I think these configurations could be hints but not strict
> > > requirements. They do not change the *functionality* so an application
> > > should not fail if these configurations can not be obeyed (except for
> > > that legacy situation you describe above). The hints enable more
> > > optimal utilization of e.g. packet memory and may decrease SW overhead
> > > during packet processing but do not change the functionality.
> > >
> > > To enable different hardware implementations, ODP apps should not
> > > enforce unnecessary (non-functional) requirements on the ODP
> > > implementations and limit the number of targets ODP can be implemented
> > > on. ODP is not DPDK.
> > >
> > > Applications should also not have to first check the limits of the
> > > specific ODP implementation (as you suggested yesterday), adapts its
> > > configuration to that and then send back those requirements to the ODP
> > > implementation (which still has to check the parameters to verify that
> > > they are valid). This is too complicated and will likely lead to code
> > > that cheats and thus is not portable. Better for applications just to
> > > specify its requested configuration to ODP and then get back the
> > > results (i.e. actual values that will be used). The application can
> > > then if necessary check that the configuration was honored. This
> > > follows the normal programming flow.
> > >
> > >>
> > >> Signed-off-by: Petri Savolainen <petri.savolainen@linaro.org>
> > >> ---
> > >>  include/odp/api/pool.h | 26 +++++++++++++++++++++-----
> > >>  1 file changed, 21 insertions(+), 5 deletions(-)
> > >>
> > >> diff --git a/include/odp/api/pool.h b/include/odp/api/pool.h
> > >> index d09d92e..a1d7494 100644
> > >> --- a/include/odp/api/pool.h
> > >> +++ b/include/odp/api/pool.h
> > >> @@ -61,13 +61,29 @@ typedef struct odp_pool_param_t {
> > >>                                              of 8. */
> > >>                         uint32_t num;   /**< Number of buffers in the
> > pool */
> > >>                 } buf;
> > >> -/* Reserved for packet and timeout specific params
> > >>                 struct {
> > >> -                       uint32_t seg_size;
> > >> -                       uint32_t seg_align;
> > >> -                       uint32_t num;
> > >> +                       uint32_t seg_len;   /**< Minimum packet
> segment
> > buffer
> > >> +                                                length in bytes. It
> > includes
> > >> +                                                possible head-
> > /tailroom bytes.
> > >> +                                                The maximum value is
> > defined by
> > >> +
> > ODP_CONFIG_PACKET_SEG_LEN_MAX.
> > >> +                                                Use 0 for default
> > length. */
> > >> +                       uint32_t seg_align; /**< Minimum packet
> segment
> > buffer
> > >> +                                                alignment in bytes.
> > Valid
> > >> +                                                values are powers of
> > two. The
> > >> +                                                maximum value is
> > defined by
> > >> +
> > ODP_CONFIG_PACKET_SEG_ALIGN_MAX
> > >> +                                                . Use 0 for default
> > alignment.
> > >> +                                                Default will always
> be
> > a
> > >> +                                                multiple of 8.
> > >> +                                            */
> > >> +                       uint32_t seg_num;   /**< Number of packet
> > segments in
> > >> +                                                the pool. This is
> also
> > the
> > >> +                                                maximum number of
> > packets,
> > >> +                                                since each packet
> > consist of
> > >> +                                                at least one segment.
> > > What if both seg_num and a shared memory region is specified in the
> > > odp_pool_create call? Which takes precedence?
> > >
> > >> +                                            */
> > >>                 } pkt;
> > >> -*/
> > >>                 struct {
> > >>                         uint32_t __res1; /* Keep struct identical to
> > buf, */
> > >>                         uint32_t __res2; /* until pool implementation
> > is fixed*/
> > >> --
> > >> 2.2.2
> > >>
> > >>
diff mbox

Patch

diff --git a/include/odp/api/pool.h b/include/odp/api/pool.h
index d09d92e..a1d7494 100644
--- a/include/odp/api/pool.h
+++ b/include/odp/api/pool.h
@@ -61,13 +61,29 @@  typedef struct odp_pool_param_t {
 					     of 8. */
 			uint32_t num;   /**< Number of buffers in the pool */
 		} buf;
-/* Reserved for packet and timeout specific params
 		struct {
-			uint32_t seg_size;
-			uint32_t seg_align;
-			uint32_t num;
+			uint32_t seg_len;   /**< Minimum packet segment buffer
+						 length in bytes. It includes
+						 possible head-/tailroom bytes.
+						 The maximum value is defined by
+						 ODP_CONFIG_PACKET_SEG_LEN_MAX.
+						 Use 0 for default length. */
+			uint32_t seg_align; /**< Minimum packet segment buffer
+						 alignment in bytes. Valid
+						 values are powers of two. The
+						 maximum value is defined by
+						 ODP_CONFIG_PACKET_SEG_ALIGN_MAX
+						 . Use 0 for default alignment.
+						 Default will always be a
+						 multiple of 8.
+					     */
+			uint32_t seg_num;   /**< Number of packet segments in
+						 the pool. This is also the
+						 maximum number of packets,
+						 since each packet consist of
+						 at least one segment.
+					     */
 		} pkt;
-*/
 		struct {
 			uint32_t __res1; /* Keep struct identical to buf, */
 			uint32_t __res2; /* until pool implementation is fixed*/
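
For completeness, a hedged usage sketch of the fields this patch adds. The
odp_pool_create() call with an odp_shm_t argument, ODP_SHM_NULL and
ODP_POOL_INVALID are assumptions based on the API of this period, and pool
type selection is omitted since it is not visible in this diff:

#include <string.h>
#include <odp.h>

/* Sketch only: one packet pool using the new pkt parameters. */
static odp_pool_t create_pkt_pool(void)
{
	odp_pool_param_t params;
	odp_pool_t pool;

	memset(&params, 0, sizeof(params));
	/* Headroom plus the headers the app wants in linear memory;
	 * values above ODP_CONFIG_PACKET_SEG_LEN_MAX are invalid. */
	params.pkt.seg_len   = 512;
	/* 0: no special requirement, use the implementation default. */
	params.pkt.seg_align = 0;
	/* Also the maximum number of packets in flight from this pool. */
	params.pkt.seg_num   = 4096;

	pool = odp_pool_create("pkt_pool", ODP_SHM_NULL, &params);

	/* Caller checks for ODP_POOL_INVALID, e.g. if the segment length
	 * or count could not be supported. */
	return pool;
}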