diff mbox

[v2,1/2] performance: odp_pktio_perf: fix potential overflow for send_duration

Message ID 1438871431-27825-2-git-send-email-ivan.khoronzhuk@linaro.org
State New
Headers show

Commit Message

Ivan Khoronzhuk Aug. 6, 2015, 2:30 p.m. UTC
The direct comparing of "cur_cycles" and "end_cycles" is not valid,
as "end_cycles" can be overflowed and comparison will give wrong
result. So use odp_time_diff_cycles() for that, as it takes in
account cycles overflow.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 test/performance/odp_pktio_perf.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

Comments

Stuart Haslam Aug. 6, 2015, 5:03 p.m. UTC | #1
On Thu, Aug 06, 2015 at 05:30:30PM +0300, Ivan Khoronzhuk wrote:
> The direct comparing of "cur_cycles" and "end_cycles" is not valid,
> as "end_cycles" can be overflowed and comparison will give wrong
> result. So use odp_time_diff_cycles() for that, as it takes in
> account cycles overflow.
> 
> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> ---
>  test/performance/odp_pktio_perf.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/test/performance/odp_pktio_perf.c b/test/performance/odp_pktio_perf.c
> index fcbc4ec..360d832 100644
> --- a/test/performance/odp_pktio_perf.c
> +++ b/test/performance/odp_pktio_perf.c
> @@ -303,7 +303,7 @@ static void *run_thread_tx(void *arg)
>  	int thr_id;
>  	odp_queue_t outq;
>  	pkt_tx_stats_t *stats;
> -	uint64_t next_tx_cycles, end_cycles, cur_cycles;
> +	uint64_t next_tx_cycles, end_cycles, cur_cycles, send_duration;
>  	uint64_t burst_gap_cycles;
>  	uint32_t batch_len;
>  	int unsent_pkts = 0;
> @@ -331,12 +331,11 @@ static void *run_thread_tx(void *arg)
>  
>  	odp_barrier_wait(&globals->tx_barrier);
>  
> +	send_duration = odp_time_ns_to_cycles(targs->duration * ODP_TIME_SEC);
>  	cur_cycles     = odp_time_cycles();
>  	next_tx_cycles = cur_cycles;
> -	end_cycles     = cur_cycles +
> -			 odp_time_ns_to_cycles(targs->duration * ODP_TIME_SEC);
> -
> -	while (cur_cycles < end_cycles) {
> +	end_cycles     = cur_cycles + send_duration;
> +	while (odp_time_diff_cycles(cur_cycles, end_cycles) <= send_duration) {

I sent a comment on v1 before I saw this v2. This is still not right,
it *looks* like this will wait for 2*send_duration, but actually
cur_cycles will end up > end_cycles and a wrap will be incorrectly
assumed.
Ivan Khoronzhuk Aug. 6, 2015, 5:18 p.m. UTC | #2
On 06.08.15 20:03, Stuart Haslam wrote:
> On Thu, Aug 06, 2015 at 05:30:30PM +0300, Ivan Khoronzhuk wrote:
>> The direct comparing of "cur_cycles" and "end_cycles" is not valid,
>> as "end_cycles" can be overflowed and comparison will give wrong
>> result. So use odp_time_diff_cycles() for that, as it takes in
>> account cycles overflow.
>>
>> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
>> ---
>>   test/performance/odp_pktio_perf.c | 9 ++++-----
>>   1 file changed, 4 insertions(+), 5 deletions(-)
>>
>> diff --git a/test/performance/odp_pktio_perf.c b/test/performance/odp_pktio_perf.c
>> index fcbc4ec..360d832 100644
>> --- a/test/performance/odp_pktio_perf.c
>> +++ b/test/performance/odp_pktio_perf.c
>> @@ -303,7 +303,7 @@ static void *run_thread_tx(void *arg)
>>   	int thr_id;
>>   	odp_queue_t outq;
>>   	pkt_tx_stats_t *stats;
>> -	uint64_t next_tx_cycles, end_cycles, cur_cycles;
>> +	uint64_t next_tx_cycles, end_cycles, cur_cycles, send_duration;
>>   	uint64_t burst_gap_cycles;
>>   	uint32_t batch_len;
>>   	int unsent_pkts = 0;
>> @@ -331,12 +331,11 @@ static void *run_thread_tx(void *arg)
>>
>>   	odp_barrier_wait(&globals->tx_barrier);
>>
>> +	send_duration = odp_time_ns_to_cycles(targs->duration * ODP_TIME_SEC);
>>   	cur_cycles     = odp_time_cycles();
>>   	next_tx_cycles = cur_cycles;
>> -	end_cycles     = cur_cycles +
>> -			 odp_time_ns_to_cycles(targs->duration * ODP_TIME_SEC);
>> -
>> -	while (cur_cycles < end_cycles) {
>> +	end_cycles     = cur_cycles + send_duration;
>> +	while (odp_time_diff_cycles(cur_cycles, end_cycles) <= send_duration) {
>
> I sent a comment on v1 before I saw this v2. This is still not right,
> it *looks* like this will wait for 2*send_duration,

One moment. How?
1: t1 = 3; t2 = 4; diff = 1;
2: t1 = 4; t2 = 4; diff = MAX;
3: t1 = 5; t2 = 4; diff = MAX -1;

The cycle will end at step 2.

1: t1 = MAX-1; t2 = 2; diff = 4;
2: t1 = MAX;   t2 = 2; diff = 3;
3: t1 = 0;     t2 = 2; diff = 2;
4: t1 = 1;     t2 = 2; diff = 1;
5: t1 = 2;     t2 = 2; diff = MAX;
6: t1 = 3;     t2 = 2; diff = MAX-1;

The cycle will end at step 5

It cannot be even diff will return 0 in case of equal times.


  but actually
> cur_cycles will end up > end_cycles and a wrap will be incorrectly
> assumed.
>

Sorry, could you explain?
Stuart Haslam Aug. 7, 2015, 9:33 a.m. UTC | #3
On Thu, Aug 06, 2015 at 08:18:36PM +0300, Ivan Khoronzhuk wrote:
> 
> 
> On 06.08.15 20:03, Stuart Haslam wrote:
> >On Thu, Aug 06, 2015 at 05:30:30PM +0300, Ivan Khoronzhuk wrote:
> >>The direct comparing of "cur_cycles" and "end_cycles" is not valid,
> >>as "end_cycles" can be overflowed and comparison will give wrong
> >>result. So use odp_time_diff_cycles() for that, as it takes in
> >>account cycles overflow.
> >>
> >>Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> >>---
> >>  test/performance/odp_pktio_perf.c | 9 ++++-----
> >>  1 file changed, 4 insertions(+), 5 deletions(-)
> >>
> >>diff --git a/test/performance/odp_pktio_perf.c b/test/performance/odp_pktio_perf.c
> >>index fcbc4ec..360d832 100644
> >>--- a/test/performance/odp_pktio_perf.c
> >>+++ b/test/performance/odp_pktio_perf.c
> >>@@ -303,7 +303,7 @@ static void *run_thread_tx(void *arg)
> >>  	int thr_id;
> >>  	odp_queue_t outq;
> >>  	pkt_tx_stats_t *stats;
> >>-	uint64_t next_tx_cycles, end_cycles, cur_cycles;
> >>+	uint64_t next_tx_cycles, end_cycles, cur_cycles, send_duration;
> >>  	uint64_t burst_gap_cycles;
> >>  	uint32_t batch_len;
> >>  	int unsent_pkts = 0;
> >>@@ -331,12 +331,11 @@ static void *run_thread_tx(void *arg)
> >>
> >>  	odp_barrier_wait(&globals->tx_barrier);
> >>
> >>+	send_duration = odp_time_ns_to_cycles(targs->duration * ODP_TIME_SEC);
> >>  	cur_cycles     = odp_time_cycles();
> >>  	next_tx_cycles = cur_cycles;
> >>-	end_cycles     = cur_cycles +
> >>-			 odp_time_ns_to_cycles(targs->duration * ODP_TIME_SEC);
> >>-
> >>-	while (cur_cycles < end_cycles) {
> >>+	end_cycles     = cur_cycles + send_duration;
> >>+	while (odp_time_diff_cycles(cur_cycles, end_cycles) <= send_duration) {
> >
> >I sent a comment on v1 before I saw this v2. This is still not right,
> >it *looks* like this will wait for 2*send_duration,
> 
> One moment. How?
> 1: t1 = 3; t2 = 4; diff = 1;
> 2: t1 = 4; t2 = 4; diff = MAX;
> 3: t1 = 5; t2 = 4; diff = MAX -1;
> 
> The cycle will end at step 2.
> 
> 1: t1 = MAX-1; t2 = 2; diff = 4;
> 2: t1 = MAX;   t2 = 2; diff = 3;
> 3: t1 = 0;     t2 = 2; diff = 2;
> 4: t1 = 1;     t2 = 2; diff = 1;
> 5: t1 = 2;     t2 = 2; diff = MAX;
> 6: t1 = 3;     t2 = 2; diff = MAX-1;
> 
> The cycle will end at step 5

I understand that it works as you expected, my point was that it *looks*
wrong. odp_time_diff_cycles() allows the application to not worry about
wraps so it's odd to explicitly provoke one.

> 
> It cannot be even diff will return 0 in case of equal times.
> 
> 
>  but actually
> >cur_cycles will end up > end_cycles and a wrap will be incorrectly
> >assumed.
> >
> 
> Sorry, could you explain?
> 

As above, provoking a wrap by intentionally passing t1 > t2 feels like
misuse of the API.
Ivan Khoronzhuk Aug. 7, 2015, 10:26 a.m. UTC | #4
Hi, Stuart
I've combined here v1 and v2.

On 07.08.15 12:33, Stuart Haslam wrote:
> On Thu, Aug 06, 2015 at 08:18:36PM +0300, Ivan Khoronzhuk wrote:
>>
>>
>> On 06.08.15 20:03, Stuart Haslam wrote:
>>> On Thu, Aug 06, 2015 at 05:30:30PM +0300, Ivan Khoronzhuk wrote:
>>>> The direct comparing of "cur_cycles" and "end_cycles" is not valid,
>>>> as "end_cycles" can be overflowed and comparison will give wrong
>>>> result. So use odp_time_diff_cycles() for that, as it takes in
>>>> account cycles overflow.
>>>>
>>>> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
>>>> ---
>>>>   test/performance/odp_pktio_perf.c | 9 ++++-----
>>>>   1 file changed, 4 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/test/performance/odp_pktio_perf.c b/test/performance/odp_pktio_perf.c
>>>> index fcbc4ec..360d832 100644
>>>> --- a/test/performance/odp_pktio_perf.c
>>>> +++ b/test/performance/odp_pktio_perf.c
>>>> @@ -303,7 +303,7 @@ static void *run_thread_tx(void *arg)
>>>>   	int thr_id;
>>>>   	odp_queue_t outq;
>>>>   	pkt_tx_stats_t *stats;
>>>> -	uint64_t next_tx_cycles, end_cycles, cur_cycles;
>>>> +	uint64_t next_tx_cycles, end_cycles, cur_cycles, send_duration;
>>>>   	uint64_t burst_gap_cycles;
>>>>   	uint32_t batch_len;
>>>>   	int unsent_pkts = 0;
>>>> @@ -331,12 +331,11 @@ static void *run_thread_tx(void *arg)
>>>>
>>>>   	odp_barrier_wait(&globals->tx_barrier);
>>>>
>>>> +	send_duration = odp_time_ns_to_cycles(targs->duration * ODP_TIME_SEC);
>>>>   	cur_cycles     = odp_time_cycles();
>>>>   	next_tx_cycles = cur_cycles;
>>>> -	end_cycles     = cur_cycles +
>>>> -			 odp_time_ns_to_cycles(targs->duration * ODP_TIME_SEC);
>>>> -
>>>> -	while (cur_cycles < end_cycles) {
>>>> +	end_cycles     = cur_cycles + send_duration;
>>>> +	while (odp_time_diff_cycles(cur_cycles, end_cycles) <= send_duration) {
>>>
>>> I sent a comment on v1 before I saw this v2. This is still not right,
>>> it *looks* like this will wait for 2*send_duration,
>>
>> One moment. How?
>> 1: t1 = 3; t2 = 4; diff = 1;
>> 2: t1 = 4; t2 = 4; diff = MAX;
>> 3: t1 = 5; t2 = 4; diff = MAX -1;
>>
>> The cycle will end at step 2.
>>
>> 1: t1 = MAX-1; t2 = 2; diff = 4;
>> 2: t1 = MAX;   t2 = 2; diff = 3;
>> 3: t1 = 0;     t2 = 2; diff = 2;
>> 4: t1 = 1;     t2 = 2; diff = 1;
>> 5: t1 = 2;     t2 = 2; diff = MAX;
>> 6: t1 = 3;     t2 = 2; diff = MAX-1;
>>
>> The cycle will end at step 5
>
> I understand that it works as you expected, my point was that it *looks*
> wrong. odp_time_diff_cycles() allows the application to not worry about
> wraps so it's odd to explicitly provoke one.
>

No objections, I can send version with start address.
It requires a little more changes, but I like more simple changes.

>>
>> It cannot be even diff will return 0 in case of equal times.
>>
>>
>>   but actually
>>> cur_cycles will end up > end_cycles and a wrap will be incorrectly
>>> assumed.
>>>
>>
>> Sorry, could you explain?
>>
>
> As above, provoking a wrap by intentionally passing t1 > t2 feels like
> misuse of the API.
>
 > Duration is important, you can't measure a 60 second interval in this
 > way using a counter that wraps every 20 seconds. I asked because I
 > wasn't sure if you were actually hitting an issue with this code or
 > fixing a theoretical one, given that a 64-bit counter will wrap every
 > couple of centuries

I supposed only 64-bit version.
Yes, on current freqs it can take a very long time to overflow.
But the application shouldn't rely on it. Here is no guarantee how
counter was initialized, in which state it is when you start test.
and who knows how fast systems future ready for us.

 > or so, I thought you may be using a 32-bit counter.

No, I'm using 64-bit counter. See below.

 > It has to return a 64-bit value but it's not specified that it has to
 > be a 64-bit counter or what frequency it runs at.. we should probably
 > add some guidelines for acceptable ranges though.

Maybe it's not specified but it's obvious everywhere in sources that
it can be only 64-bit counter. We don't have odp_time_add_cycles().
We believe that simple operation of t1+t2 gives us correct overlap of
only 64-bit values.. if odp_time_cycles() returns us max 32-bit count,
then sources will have much more problems. So I supposed implementation,
if has 32-bit counter, has to emulate 64-bit counter. It has to 
correctly handle 64-bit wraps.

Another point of view.
Why is odp_time_diff_cycles() needed if counter cannot overflow?
(on current freqs it's ~ centuries). If we have such function, we
suppose that timer can overflow. If timer can overflow we need this
fix. In another way simple t2 - t1 can be used. Done this for future or 
as insurance against not ideal initialization never mind.
Stuart Haslam Aug. 7, 2015, 11:10 a.m. UTC | #5
On Fri, Aug 07, 2015 at 01:26:46PM +0300, Ivan Khoronzhuk wrote:
> Hi, Stuart
> I've combined here v1 and v2.
> 
> On 07.08.15 12:33, Stuart Haslam wrote:
> >On Thu, Aug 06, 2015 at 08:18:36PM +0300, Ivan Khoronzhuk wrote:
> >>
> >>
> >>On 06.08.15 20:03, Stuart Haslam wrote:
> >>>On Thu, Aug 06, 2015 at 05:30:30PM +0300, Ivan Khoronzhuk wrote:
> >>>>The direct comparing of "cur_cycles" and "end_cycles" is not valid,
> >>>>as "end_cycles" can be overflowed and comparison will give wrong
> >>>>result. So use odp_time_diff_cycles() for that, as it takes in
> >>>>account cycles overflow.
> >>>>
> >>>>Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> >>>>---
> >>>>  test/performance/odp_pktio_perf.c | 9 ++++-----
> >>>>  1 file changed, 4 insertions(+), 5 deletions(-)
> >>>>
> >>>>diff --git a/test/performance/odp_pktio_perf.c b/test/performance/odp_pktio_perf.c
> >>>>index fcbc4ec..360d832 100644
> >>>>--- a/test/performance/odp_pktio_perf.c
> >>>>+++ b/test/performance/odp_pktio_perf.c
> >>>>@@ -303,7 +303,7 @@ static void *run_thread_tx(void *arg)
> >>>>  	int thr_id;
> >>>>  	odp_queue_t outq;
> >>>>  	pkt_tx_stats_t *stats;
> >>>>-	uint64_t next_tx_cycles, end_cycles, cur_cycles;
> >>>>+	uint64_t next_tx_cycles, end_cycles, cur_cycles, send_duration;
> >>>>  	uint64_t burst_gap_cycles;
> >>>>  	uint32_t batch_len;
> >>>>  	int unsent_pkts = 0;
> >>>>@@ -331,12 +331,11 @@ static void *run_thread_tx(void *arg)
> >>>>
> >>>>  	odp_barrier_wait(&globals->tx_barrier);
> >>>>
> >>>>+	send_duration = odp_time_ns_to_cycles(targs->duration * ODP_TIME_SEC);
> >>>>  	cur_cycles     = odp_time_cycles();
> >>>>  	next_tx_cycles = cur_cycles;
> >>>>-	end_cycles     = cur_cycles +
> >>>>-			 odp_time_ns_to_cycles(targs->duration * ODP_TIME_SEC);
> >>>>-
> >>>>-	while (cur_cycles < end_cycles) {
> >>>>+	end_cycles     = cur_cycles + send_duration;
> >>>>+	while (odp_time_diff_cycles(cur_cycles, end_cycles) <= send_duration) {
> >>>
> >>>I sent a comment on v1 before I saw this v2. This is still not right,
> >>>it *looks* like this will wait for 2*send_duration,
> >>
> >>One moment. How?
> >>1: t1 = 3; t2 = 4; diff = 1;
> >>2: t1 = 4; t2 = 4; diff = MAX;
> >>3: t1 = 5; t2 = 4; diff = MAX -1;
> >>
> >>The cycle will end at step 2.
> >>
> >>1: t1 = MAX-1; t2 = 2; diff = 4;
> >>2: t1 = MAX;   t2 = 2; diff = 3;
> >>3: t1 = 0;     t2 = 2; diff = 2;
> >>4: t1 = 1;     t2 = 2; diff = 1;
> >>5: t1 = 2;     t2 = 2; diff = MAX;
> >>6: t1 = 3;     t2 = 2; diff = MAX-1;
> >>
> >>The cycle will end at step 5
> >
> >I understand that it works as you expected, my point was that it *looks*
> >wrong. odp_time_diff_cycles() allows the application to not worry about
> >wraps so it's odd to explicitly provoke one.
> >
> 
> No objections, I can send version with start address.

OK

> It requires a little more changes, but I like more simple changes.
> 
> >>
> >>It cannot be even diff will return 0 in case of equal times.
> >>
> >>
> >>  but actually
> >>>cur_cycles will end up > end_cycles and a wrap will be incorrectly
> >>>assumed.
> >>>
> >>
> >>Sorry, could you explain?
> >>
> >
> >As above, provoking a wrap by intentionally passing t1 > t2 feels like
> >misuse of the API.
> >
> > Duration is important, you can't measure a 60 second interval in this
> > way using a counter that wraps every 20 seconds. I asked because I
> > wasn't sure if you were actually hitting an issue with this code or
> > fixing a theoretical one, given that a 64-bit counter will wrap every
> > couple of centuries
> 
> I supposed only 64-bit version.
> Yes, on current freqs it can take a very long time to overflow.
> But the application shouldn't rely on it. Here is no guarantee how
> counter was initialized, in which state it is when you start test.
> and who knows how fast systems future ready for us.
> 
> > or so, I thought you may be using a 32-bit counter.
> 
> No, I'm using 64-bit counter. See below.
> 
> > It has to return a 64-bit value but it's not specified that it has to
> > be a 64-bit counter or what frequency it runs at.. we should probably
> > add some guidelines for acceptable ranges though.
> 
> Maybe it's not specified but it's obvious everywhere in sources that
> it can be only 64-bit counter. We don't have odp_time_add_cycles().
> We believe that simple operation of t1+t2 gives us correct overlap of
> only 64-bit values.. if odp_time_cycles() returns us max 32-bit count,
> then sources will have much more problems. So I supposed implementation,
> if has 32-bit counter, has to emulate 64-bit counter. It has to
> correctly handle 64-bit wraps.
> 
> Another point of view.
> Why is odp_time_diff_cycles() needed if counter cannot overflow?
> (on current freqs it's ~ centuries). If we have such function, we
> suppose that timer can overflow. If timer can overflow we need this
> fix. In another way simple t2 - t1 can be used. Done this for future
> or as insurance against not ideal initialization never mind.
> 

Sorry if it wasn't clear, but I wasn't suggesting that we didn't need to
cope with overflows. I agree these changes are needed.
diff mbox

Patch

diff --git a/test/performance/odp_pktio_perf.c b/test/performance/odp_pktio_perf.c
index fcbc4ec..360d832 100644
--- a/test/performance/odp_pktio_perf.c
+++ b/test/performance/odp_pktio_perf.c
@@ -303,7 +303,7 @@  static void *run_thread_tx(void *arg)
 	int thr_id;
 	odp_queue_t outq;
 	pkt_tx_stats_t *stats;
-	uint64_t next_tx_cycles, end_cycles, cur_cycles;
+	uint64_t next_tx_cycles, end_cycles, cur_cycles, send_duration;
 	uint64_t burst_gap_cycles;
 	uint32_t batch_len;
 	int unsent_pkts = 0;
@@ -331,12 +331,11 @@  static void *run_thread_tx(void *arg)
 
 	odp_barrier_wait(&globals->tx_barrier);
 
+	send_duration = odp_time_ns_to_cycles(targs->duration * ODP_TIME_SEC);
 	cur_cycles     = odp_time_cycles();
 	next_tx_cycles = cur_cycles;
-	end_cycles     = cur_cycles +
-			 odp_time_ns_to_cycles(targs->duration * ODP_TIME_SEC);
-
-	while (cur_cycles < end_cycles) {
+	end_cycles     = cur_cycles + send_duration;
+	while (odp_time_diff_cycles(cur_cycles, end_cycles) <= send_duration) {
 		unsigned alloc_cnt = 0, tx_cnt;
 
 		if (cur_cycles < next_tx_cycles) {