[API-NEXT] validation: timer: increase cyle timeout to prevent timer segfault

Message ID 1453208349-4330-1-git-send-email-maxim.uvarov@linaro.org
State New
Headers show

Commit Message

Maxim Uvarov Jan. 19, 2016, 12:59 p.m.
Increase after loop delay to:
10/10 - 7/10 = 3/10. 3/10 * 2000 + some 50ms = 650ms (instead of 1ms)
to make sure that timer hanles expired.
This patch can be considered as temporary fix for timer test to prevent
segfault. More detailed fix will be later.

Suggested-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org>
---
 test/validation/timer/timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Ivan Khoronzhuk Jan. 19, 2016, 1:11 p.m. | #1
Maxim.

On 19.01.16 14:59, Maxim Uvarov wrote:
> Increase after loop delay to:
> 10/10 - 7/10 = 3/10. 3/10 * 2000 + some 50ms = 650ms (instead of 1ms)
> to make sure that timer hanles expired.
> This patch can be considered as temporary fix for timer test to prevent
It's not temporary fix and should be considered as fix of after-loop delay.

> segfault. More detailed fix will be later.
>
> Suggested-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org>
> ---
>   test/validation/timer/timer.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/test/validation/timer/timer.c b/test/validation/timer/timer.c
> index dda5e4c..f533f05 100644
> --- a/test/validation/timer/timer.c
> +++ b/test/validation/timer/timer.c
> @@ -374,7 +374,7 @@ static void *worker_entrypoint(void *arg TEST_UNUSED)
>   		}
>   		struct timespec ts;
>   		ts.tv_sec = 0;
> -		ts.tv_nsec = 1000000; /* 1ms */
> +		ts.tv_nsec = 650000000; /* 650ms */
It's not correct place for delay. You've increased polling period in the main loop.
I meant after-loop delay (line 408). And this delay is RANGE_MS dependent, it can be different when changed.
use: ts.tv_nsec = 3 * RANGE_MS / 10 + 50; /* time to receive the rest of events */
And if you tested it, please add tested-by.

Also I propose to add preliminary patch to replace timespec on ODP time API.

>   		if (nanosleep(&ts, NULL) < 0)
>   			CU_FAIL_FATAL("nanosleep failed");
>   	}
>
Ivan Khoronzhuk Jan. 19, 2016, 1:47 p.m. | #2
On 19.01.16 15:11, Ivan Khoronzhuk wrote:
> Maxim.
>
> On 19.01.16 14:59, Maxim Uvarov wrote:
>> Increase after loop delay to:
>> 10/10 - 7/10 = 3/10. 3/10 * 2000 + some 50ms = 650ms (instead of 1ms)
>> to make sure that timer hanles expired.
>> This patch can be considered as temporary fix for timer test to prevent
> It's not temporary fix and should be considered as fix of after-loop delay.
>
>> segfault. More detailed fix will be later.
>>
>> Suggested-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
>> Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org>
>> ---
>>   test/validation/timer/timer.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/test/validation/timer/timer.c b/test/validation/timer/timer.c
>> index dda5e4c..f533f05 100644
>> --- a/test/validation/timer/timer.c
>> +++ b/test/validation/timer/timer.c
>> @@ -374,7 +374,7 @@ static void *worker_entrypoint(void *arg TEST_UNUSED)
>>           }
>>           struct timespec ts;
>>           ts.tv_sec = 0;
>> -        ts.tv_nsec = 1000000; /* 1ms */
>> +        ts.tv_nsec = 650000000; /* 650ms */
> It's not correct place for delay. You've increased polling period in the main loop.
> I meant after-loop delay (line 408). And this delay is RANGE_MS dependent, it can be different when changed.
> use: ts.tv_nsec = 3 * RANGE_MS / 10 + 50; /* time to receive the rest of events */
> And if you tested it, please add tested-by.
ts.tv_nsec = (3 * RANGE_MS / 10 + 50) * ODP_TIME_MSEC_IN_NS;

>
> Also I propose to add preliminary patch to replace timespec on ODP time API.
>
>>           if (nanosleep(&ts, NULL) < 0)
>>               CU_FAIL_FATAL("nanosleep failed");
>>       }
>>
>
Maxim Uvarov Jan. 19, 2016, 2:16 p.m. | #3
On 01/19/2016 16:47, Ivan Khoronzhuk wrote:
>
>
> On 19.01.16 15:11, Ivan Khoronzhuk wrote:
>> Maxim.
>>
>> On 19.01.16 14:59, Maxim Uvarov wrote:
>>> Increase after loop delay to:
>>> 10/10 - 7/10 = 3/10. 3/10 * 2000 + some 50ms = 650ms (instead of 1ms)
>>> to make sure that timer hanles expired.
>>> This patch can be considered as temporary fix for timer test to prevent
>> It's not temporary fix and should be considered as fix of after-loop 
>> delay.
>>
>>> segfault. More detailed fix will be later.
>>>
>>> Suggested-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
>>> Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org>
>>> ---
>>>   test/validation/timer/timer.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/test/validation/timer/timer.c 
>>> b/test/validation/timer/timer.c
>>> index dda5e4c..f533f05 100644
>>> --- a/test/validation/timer/timer.c
>>> +++ b/test/validation/timer/timer.c
>>> @@ -374,7 +374,7 @@ static void *worker_entrypoint(void *arg 
>>> TEST_UNUSED)
>>>           }
>>>           struct timespec ts;
>>>           ts.tv_sec = 0;
>>> -        ts.tv_nsec = 1000000; /* 1ms */
>>> +        ts.tv_nsec = 650000000; /* 650ms */
>> It's not correct place for delay. You've increased polling period in 
>> the main loop.
>> I meant after-loop delay (line 408). And this delay is RANGE_MS 
>> dependent, it can be different when changed.
>> use: ts.tv_nsec = 3 * RANGE_MS / 10 + 50; /* time to receive the rest 
>> of events */
>> And if you tested it, please add tested-by.
> ts.tv_nsec = (3 * RANGE_MS / 10 + 50) * ODP_TIME_MSEC_IN_NS;

No, that does not work. First patch works.

Maxim.

>
>>
>> Also I propose to add preliminary patch to replace timespec on ODP 
>> time API.
>>
>>>           if (nanosleep(&ts, NULL) < 0)
>>>               CU_FAIL_FATAL("nanosleep failed");
>>>       }
>>>
>>
>
Ivan Khoronzhuk Jan. 19, 2016, 3:02 p.m. | #4
On 19.01.16 16:16, Maxim Uvarov wrote:
> On 01/19/2016 16:47, Ivan Khoronzhuk wrote:
>>
>>
>> On 19.01.16 15:11, Ivan Khoronzhuk wrote:
>>> Maxim.
>>>
>>> On 19.01.16 14:59, Maxim Uvarov wrote:
>>>> Increase after loop delay to:
>>>> 10/10 - 7/10 = 3/10. 3/10 * 2000 + some 50ms = 650ms (instead of 1ms)
>>>> to make sure that timer hanles expired.
>>>> This patch can be considered as temporary fix for timer test to prevent
>>> It's not temporary fix and should be considered as fix of after-loop delay.
>>>
>>>> segfault. More detailed fix will be later.
>>>>
>>>> Suggested-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
>>>> Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org>
>>>> ---
>>>>   test/validation/timer/timer.c | 2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/test/validation/timer/timer.c b/test/validation/timer/timer.c
>>>> index dda5e4c..f533f05 100644
>>>> --- a/test/validation/timer/timer.c
>>>> +++ b/test/validation/timer/timer.c
>>>> @@ -374,7 +374,7 @@ static void *worker_entrypoint(void *arg TEST_UNUSED)
>>>>           }
>>>>           struct timespec ts;
>>>>           ts.tv_sec = 0;
>>>> -        ts.tv_nsec = 1000000; /* 1ms */
>>>> +        ts.tv_nsec = 650000000; /* 650ms */
>>> It's not correct place for delay. You've increased polling period in the main loop.
>>> I meant after-loop delay (line 408). And this delay is RANGE_MS dependent, it can be different when changed.
>>> use: ts.tv_nsec = 3 * RANGE_MS / 10 + 50; /* time to receive the rest of events */
>>> And if you tested it, please add tested-by.
>> ts.tv_nsec = (3 * RANGE_MS / 10 + 50) * ODP_TIME_MSEC_IN_NS;
>
> No, that does not work. First patch works.
>
> Maxim.
Unfortunately I cannot reproduce the issue on my PC at all.
Did you test it with my last comment:
ts.tv_nsec = (3 * RANGE_MS / 10 + 50) * ODP_TIME_MSEC_IN_NS; <- multiplied on
?

If yes then probably you need to split 	/* Cancel and free all timers */ loop.
1. Cancel timers.
2. wait delay in question
3. delete timers.


>
>>
>>>
>>> Also I propose to add preliminary patch to replace timespec on ODP time API.
>>>
>>>>           if (nanosleep(&ts, NULL) < 0)
>>>>               CU_FAIL_FATAL("nanosleep failed");
>>>>       }
>>>>
>>>
>>
>
> _______________________________________________
> lng-odp mailing list
> lng-odp@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/lng-odp
Ivan Khoronzhuk Jan. 19, 2016, 4:56 p.m. | #5
On 19.01.16 17:02, Ivan Khoronzhuk wrote:
>
>
> On 19.01.16 16:16, Maxim Uvarov wrote:
>> On 01/19/2016 16:47, Ivan Khoronzhuk wrote:
>>>
>>>
>>> On 19.01.16 15:11, Ivan Khoronzhuk wrote:
>>>> Maxim.
>>>>
>>>> On 19.01.16 14:59, Maxim Uvarov wrote:
>>>>> Increase after loop delay to:
>>>>> 10/10 - 7/10 = 3/10. 3/10 * 2000 + some 50ms = 650ms (instead of 1ms)
>>>>> to make sure that timer hanles expired.
>>>>> This patch can be considered as temporary fix for timer test to prevent
>>>> It's not temporary fix and should be considered as fix of after-loop delay.
>>>>
>>>>> segfault. More detailed fix will be later.
>>>>>
>>>>> Suggested-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
>>>>> Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org>
>>>>> ---
>>>>>   test/validation/timer/timer.c | 2 +-
>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/test/validation/timer/timer.c b/test/validation/timer/timer.c
>>>>> index dda5e4c..f533f05 100644
>>>>> --- a/test/validation/timer/timer.c
>>>>> +++ b/test/validation/timer/timer.c
>>>>> @@ -374,7 +374,7 @@ static void *worker_entrypoint(void *arg TEST_UNUSED)
>>>>>           }
>>>>>           struct timespec ts;
>>>>>           ts.tv_sec = 0;
>>>>> -        ts.tv_nsec = 1000000; /* 1ms */
>>>>> +        ts.tv_nsec = 650000000; /* 650ms */
>>>> It's not correct place for delay. You've increased polling period in the main loop.
>>>> I meant after-loop delay (line 408). And this delay is RANGE_MS dependent, it can be different when changed.
>>>> use: ts.tv_nsec = 3 * RANGE_MS / 10 + 50; /* time to receive the rest of events */
>>>> And if you tested it, please add tested-by.
>>> ts.tv_nsec = (3 * RANGE_MS / 10 + 50) * ODP_TIME_MSEC_IN_NS;
>>
>> No, that does not work. First patch works.
>>
>> Maxim.
> Unfortunately I cannot reproduce the issue on my PC at all.
> Did you test it with my last comment:
> ts.tv_nsec = (3 * RANGE_MS / 10 + 50) * ODP_TIME_MSEC_IN_NS; <- multiplied on
> ?
>
> If yes then probably you need to split     /* Cancel and free all timers */ loop.
> 1. Cancel timers.
> 2. wait delay in question
> 3. delete timers.
>
It's also interesting to know when the event is arrived.
before main test loop set in some global var:

start = odp_time_global();
printf("start time of testing: %", odp_time_to_ns(start));
...
loop
....

and in notify routine while crash get this value:
event_time = odp_time_global();
printf("start time of testing: %", odp_time_to_ns(event_time));

Then compare, the difference should be more than 2050ms.
If it less, then we cannot call timer_getoverrun() if timer is canceled.
And this is a problem then. It it's more then 2sec -> timings are set incorrectly.

But anyway, this patch and patch for replacing on time API (odp_time_wait_ns) have to be added.
I believe that intention of creator of this test was to wait on correct period, 1ms - it's hard typo.

>
>>
>>>
>>>>
>>>> Also I propose to add preliminary patch to replace timespec on ODP time API.
>>>>
>>>>>           if (nanosleep(&ts, NULL) < 0)
>>>>>               CU_FAIL_FATAL("nanosleep failed");
>>>>>       }
>>>>>
>>>>
>>>
>>
>> _______________________________________________
>> lng-odp mailing list
>> lng-odp@lists.linaro.org
>> https://lists.linaro.org/mailman/listinfo/lng-odp
>
Maxim Uvarov Jan. 19, 2016, 6:13 p.m. | #6
On 01/19/2016 19:56, Ivan Khoronzhuk wrote:
>
>
> On 19.01.16 17:02, Ivan Khoronzhuk wrote:
>>
>>
>> On 19.01.16 16:16, Maxim Uvarov wrote:
>>> On 01/19/2016 16:47, Ivan Khoronzhuk wrote:
>>>>
>>>>
>>>> On 19.01.16 15:11, Ivan Khoronzhuk wrote:
>>>>> Maxim.
>>>>>
>>>>> On 19.01.16 14:59, Maxim Uvarov wrote:
>>>>>> Increase after loop delay to:
>>>>>> 10/10 - 7/10 = 3/10. 3/10 * 2000 + some 50ms = 650ms (instead of 
>>>>>> 1ms)
>>>>>> to make sure that timer hanles expired.
>>>>>> This patch can be considered as temporary fix for timer test to 
>>>>>> prevent
>>>>> It's not temporary fix and should be considered as fix of 
>>>>> after-loop delay.
>>>>>
>>>>>> segfault. More detailed fix will be later.
>>>>>>
>>>>>> Suggested-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
>>>>>> Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org>
>>>>>> ---
>>>>>>   test/validation/timer/timer.c | 2 +-
>>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/test/validation/timer/timer.c 
>>>>>> b/test/validation/timer/timer.c
>>>>>> index dda5e4c..f533f05 100644
>>>>>> --- a/test/validation/timer/timer.c
>>>>>> +++ b/test/validation/timer/timer.c
>>>>>> @@ -374,7 +374,7 @@ static void *worker_entrypoint(void *arg 
>>>>>> TEST_UNUSED)
>>>>>>           }
>>>>>>           struct timespec ts;
>>>>>>           ts.tv_sec = 0;
>>>>>> -        ts.tv_nsec = 1000000; /* 1ms */
>>>>>> +        ts.tv_nsec = 650000000; /* 650ms */
>>>>> It's not correct place for delay. You've increased polling period 
>>>>> in the main loop.
>>>>> I meant after-loop delay (line 408). And this delay is RANGE_MS 
>>>>> dependent, it can be different when changed.
>>>>> use: ts.tv_nsec = 3 * RANGE_MS / 10 + 50; /* time to receive the 
>>>>> rest of events */
>>>>> And if you tested it, please add tested-by.
>>>> ts.tv_nsec = (3 * RANGE_MS / 10 + 50) * ODP_TIME_MSEC_IN_NS;
>>>
>>> No, that does not work. First patch works.
>>>
>>> Maxim.
>> Unfortunately I cannot reproduce the issue on my PC at all.
>> Did you test it with my last comment:
>> ts.tv_nsec = (3 * RANGE_MS / 10 + 50) * ODP_TIME_MSEC_IN_NS; <- 
>> multiplied on
>> ?
>>
>> If yes then probably you need to split     /* Cancel and free all 
>> timers */ loop.
>> 1. Cancel timers.
>> 2. wait delay in question
>> 3. delete timers.
>>
> It's also interesting to know when the event is arrived.
> before main test loop set in some global var:
>
> start = odp_time_global();
> printf("start time of testing: %", odp_time_to_ns(start));
> ...
> loop
> ....
>
> and in notify routine while crash get this value:
> event_time = odp_time_global();
> printf("start time of testing: %", odp_time_to_ns(event_time));
>
> Then compare, the difference should be more than 2050ms.
> If it less, then we cannot call timer_getoverrun() if timer is canceled.
> And this is a problem then. It it's more then 2sec -> timings are set 
> incorrectly.
>
> But anyway, this patch and patch for replacing on time API 
> (odp_time_wait_ns) have to be added.
> I believe that intention of creator of this test was to wait on 
> correct period, 1ms - it's hard typo.
>

Ok, I will test that.

We also find that changing nanosleep to odp time sleep solves problem. 
But it's still unclear why. Let's
continue debug it tomorrow. I just think to remove random things from 
the test. That should increase
constantly fail / pass result repetition.

Maxim.

>>
>>>
>>>>
>>>>>
>>>>> Also I propose to add preliminary patch to replace timespec on ODP 
>>>>> time API.
>>>>>
>>>>>>           if (nanosleep(&ts, NULL) < 0)
>>>>>>               CU_FAIL_FATAL("nanosleep failed");
>>>>>>       }
>>>>>>
>>>>>
>>>>
>>>
>>> _______________________________________________
>>> lng-odp mailing list
>>> lng-odp@lists.linaro.org
>>> https://lists.linaro.org/mailman/listinfo/lng-odp
>>
>
Ivan Khoronzhuk Jan. 19, 2016, 7:46 p.m. | #7
On 19.01.16 20:13, Maxim Uvarov wrote:
> On 01/19/2016 19:56, Ivan Khoronzhuk wrote:
>>
>>
>> On 19.01.16 17:02, Ivan Khoronzhuk wrote:
>>>
>>>
>>> On 19.01.16 16:16, Maxim Uvarov wrote:
>>>> On 01/19/2016 16:47, Ivan Khoronzhuk wrote:
>>>>>
>>>>>
>>>>> On 19.01.16 15:11, Ivan Khoronzhuk wrote:
>>>>>> Maxim.
>>>>>>
>>>>>> On 19.01.16 14:59, Maxim Uvarov wrote:
>>>>>>> Increase after loop delay to:
>>>>>>> 10/10 - 7/10 = 3/10. 3/10 * 2000 + some 50ms = 650ms (instead of 1ms)
>>>>>>> to make sure that timer hanles expired.
>>>>>>> This patch can be considered as temporary fix for timer test to prevent
>>>>>> It's not temporary fix and should be considered as fix of after-loop delay.
>>>>>>
>>>>>>> segfault. More detailed fix will be later.
>>>>>>>
>>>>>>> Suggested-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
>>>>>>> Signed-off-by: Maxim Uvarov <maxim.uvarov@linaro.org>
>>>>>>> ---
>>>>>>>   test/validation/timer/timer.c | 2 +-
>>>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/test/validation/timer/timer.c b/test/validation/timer/timer.c
>>>>>>> index dda5e4c..f533f05 100644
>>>>>>> --- a/test/validation/timer/timer.c
>>>>>>> +++ b/test/validation/timer/timer.c
>>>>>>> @@ -374,7 +374,7 @@ static void *worker_entrypoint(void *arg TEST_UNUSED)
>>>>>>>           }
>>>>>>>           struct timespec ts;
>>>>>>>           ts.tv_sec = 0;
>>>>>>> -        ts.tv_nsec = 1000000; /* 1ms */
>>>>>>> +        ts.tv_nsec = 650000000; /* 650ms */
>>>>>> It's not correct place for delay. You've increased polling period in the main loop.
>>>>>> I meant after-loop delay (line 408). And this delay is RANGE_MS dependent, it can be different when changed.
>>>>>> use: ts.tv_nsec = 3 * RANGE_MS / 10 + 50; /* time to receive the rest of events */
>>>>>> And if you tested it, please add tested-by.
>>>>> ts.tv_nsec = (3 * RANGE_MS / 10 + 50) * ODP_TIME_MSEC_IN_NS;
>>>>
>>>> No, that does not work. First patch works.
>>>>
>>>> Maxim.
>>> Unfortunately I cannot reproduce the issue on my PC at all.
>>> Did you test it with my last comment:
>>> ts.tv_nsec = (3 * RANGE_MS / 10 + 50) * ODP_TIME_MSEC_IN_NS; <- multiplied on
>>> ?
>>>
>>> If yes then probably you need to split     /* Cancel and free all timers */ loop.
>>> 1. Cancel timers.
>>> 2. wait delay in question
>>> 3. delete timers.
>>>
>> It's also interesting to know when the event is arrived.
>> before main test loop set in some global var:
>>
>> start = odp_time_global();
>> printf("start time of testing: %", odp_time_to_ns(start));
>> ...
>> loop
>> ....
>>
>> and in notify routine while crash get this value:
>> event_time = odp_time_global();
>> printf("start time of testing: %", odp_time_to_ns(event_time));
>>
>> Then compare, the difference should be more than 2050ms.
>> If it less, then we cannot call timer_getoverrun() if timer is canceled.
>> And this is a problem then. It it's more then 2sec -> timings are set incorrectly.
>>
>> But anyway, this patch and patch for replacing on time API (odp_time_wait_ns) have to be added.
>> I believe that intention of creator of this test was to wait on correct period, 1ms - it's hard typo.
>>
>
> Ok, I will test that.
>
> We also find that changing nanosleep to odp time sleep solves problem. But it's still unclear why. Let's
> continue debug it tomorrow. I just think to remove random things from the test. That should increase
> constantly fail / pass result repetition.
randomness should be present after debugging, but replaced on odp_random_data.
nsleep should be present till the end of debugging.

>
> Maxim.
>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> Also I propose to add preliminary patch to replace timespec on ODP time API.
>>>>>>
>>>>>>>           if (nanosleep(&ts, NULL) < 0)
>>>>>>>               CU_FAIL_FATAL("nanosleep failed");
>>>>>>>       }
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> lng-odp mailing list
>>>> lng-odp@lists.linaro.org
>>>> https://lists.linaro.org/mailman/listinfo/lng-odp
>>>
>>
>

Patch

diff --git a/test/validation/timer/timer.c b/test/validation/timer/timer.c
index dda5e4c..f533f05 100644
--- a/test/validation/timer/timer.c
+++ b/test/validation/timer/timer.c
@@ -374,7 +374,7 @@  static void *worker_entrypoint(void *arg TEST_UNUSED)
 		}
 		struct timespec ts;
 		ts.tv_sec = 0;
-		ts.tv_nsec = 1000000; /* 1ms */
+		ts.tv_nsec = 650000000; /* 650ms */
 		if (nanosleep(&ts, NULL) < 0)
 			CU_FAIL_FATAL("nanosleep failed");
 	}