
[V2,Resend,4/4] timer: Migrate running timer

Message ID CAKohpokfKPQ3EhMf3WB8koXuH2rF6QoejwNOTw08X+XfnVhpEg@mail.gmail.com
State Accepted

Commit Message

Viresh Kumar May 22, 2013, 8:34 a.m. UTC
Sorry for being late in replying to your queries.

On 13 May 2013 16:05, Thomas Gleixner <tglx@linutronix.de> wrote:
> Which mechanism is migrating the timer away?

It will be the same: get_nohz_timer_target(), which decides the target
CPU for migration.

> I have no objections to the functionality per se, but the proposed
> solution is not going to fly.
>
> Aside of bloating the data structure you're changing the semantics of
> __mod_timer(). No __mod_timer() caller can deal with -EBUSY. So you'd
> break the world and some more.

Ahh.. That idea was dropped already.

> Here is a list of questions:
>
>       - Which mechanism migrates timers?
>
>       - How is that mechanism triggered?

The mechanism remains the same as for non-rearming timers, i.e.
get_nohz_timer_target().

We are just trying to propose an approach with which we can migrate
running timers, i.e. those which re-arm themselves from their
handlers.
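
Just for clarity, this is the kind of user we mean (an illustrative
sketch with made-up names, not code taken from any particular driver):

static struct timer_list my_timer;

static void my_timer_fn(unsigned long data)
{
        /* ... do the periodic work ... */

        /*
         * Re-arm from inside the handler. At this point the timer is
         * "running" (base->running_timer == timer), which is exactly
         * the case we want to make migratable.
         */
        mod_timer(&my_timer, jiffies + HZ);
}

static void my_timer_start(void)
{
        setup_timer(&my_timer, my_timer_fn, 0);
        mod_timer(&my_timer, jiffies + HZ);
}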

>       - How does that deal with CPU bound timers?

We still check the 'pinned' flag for such a timer, just as for any
other timer, so CPU-bound (pinned) timers are not migrated.
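
Roughly, the existing target selection in __mod_timer() looks like this
(paraphrased from the current code, not a verbatim copy), and it stays
untouched:

        cpu = smp_processor_id();
        /* pinned timers never leave their CPU */
        if (!pinned && get_sysctl_timer_migration() && idle_cpu(cpu))
                cpu = get_nohz_timer_target();
        new_base = per_cpu(tvec_bases, cpu);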

So, this is the clean draft for the idea I had.. (Naming is poor for
now):

        timer->expires = expires;
@@ -990,7 +984,8 @@ int try_to_del_timer_sync(struct timer_list *timer)

        base = lock_timer_base(timer, &flags);

-       if (base->running_timer != timer) {
+       if ((base->running_timer != timer) &&
+                       !timer->wait_for_migration_to_complete) {
                timer_stats_timer_clear_start_info(timer);
                ret = detach_if_pending(timer, base, true);
        }
@@ -1183,6 +1178,8 @@ static inline void __run_timers(struct tvec_base *base)
                                call_timer_fn(timer, fn, data);
                                spin_lock_irq(&base->lock);
                        }
+                       if (timer->wait_for_migration_to_complete)
+                               timer->wait_for_migration_to_complete = 0;
                }
        }
        base->running_timer = NULL;
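
For context, del_timer_sync() just loops over try_to_del_timer_sync()
until it stops returning -1 (simplified sketch of the existing helper,
with the lockdep and irq-context checks left out), which is why setting
the new flag also makes it wait for the migration to finish:

int del_timer_sync(struct timer_list *timer)
{
        for (;;) {
                int ret = try_to_del_timer_sync(timer);
                if (ret >= 0)
                        return ret;
                cpu_relax();
        }
}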


Please see if it is a junk idea or has some light of hope :)

Comments

Peter Zijlstra May 22, 2013, 9:06 a.m. UTC | #1
On Wed, May 22, 2013 at 02:04:16PM +0530, Viresh Kumar wrote:

> So, this is the clean draft for the idea I had.. (Naming is poor for
> now):
> 
> diff --git a/include/linux/timer.h b/include/linux/timer.h
> index 8c5a197..ad00ebe 100644
> --- a/include/linux/timer.h
> +++ b/include/linux/timer.h
> @@ -20,6 +20,7 @@ struct timer_list {
> 
>         void (*function)(unsigned long);
>         unsigned long data;
> +       int wait_for_migration_to_complete;

If you play games with the alignment constraints of struct tvec_base you
can get extra low bits to play with for TIMER_FLAG_MASK. See struct
pool_workqueue for similar games.

That would avoid the struct bloat and I suppose make tglx happy :-)
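
For readers who haven't seen the trick: if struct tvec_base is given
sufficient alignment, the low bits of timer->base are guaranteed zero
and can carry per-timer flags instead of a new struct member. A rough
sketch (all names and the extra flag below are made up for
illustration; the real code would extend TIMER_FLAG_MASK):

#define MY_TIMER_FLAG_MASK      0x7UL
#define MY_TIMER_MIGRATING      0x4UL   /* hypothetical new flag */

static inline struct tvec_base *my_timer_base(struct timer_list *timer)
{
        /* mask off the flag bits to recover the real base pointer */
        return (struct tvec_base *)((unsigned long)timer->base &
                                    ~MY_TIMER_FLAG_MASK);
}

static inline bool my_timer_migrating(struct timer_list *timer)
{
        return (unsigned long)timer->base & MY_TIMER_MIGRATING;
}

static inline void my_timer_set_migrating(struct timer_list *timer)
{
        timer->base = (struct tvec_base *)((unsigned long)timer->base |
                                           MY_TIMER_MIGRATING);
}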
Viresh Kumar May 22, 2013, 9:23 a.m. UTC | #2
On 22 May 2013 14:36, Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, May 22, 2013 at 02:04:16PM +0530, Viresh Kumar wrote:
>
>> [...]
>
> If you play games with the alignment constraints of struct tvec_base you
> can get extra low bits to play with for TIMER_FLAG_MASK. See struct
> pool_workqueue for similar games.
>
> That would avoid the struct bloat and I suppose make tglx happy :-)

Sure, once the initial draft is approved we can optimize it to the best
of our ability. :)
Viresh Kumar May 31, 2013, 10:49 a.m. UTC | #3
On 22 May 2013 14:04, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> Sorry for being late in replying to your queries.
>
> [...]
>
> Please see if it is a junk idea or has some light of hope :)

Ping!!
Viresh Kumar June 18, 2013, 4:51 a.m. UTC | #4
On 31 May 2013 16:19, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 22 May 2013 14:04, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> [...]
>
> Ping!!

Ping!!
Viresh Kumar July 24, 2013, 9:17 a.m. UTC | #5
On 18 June 2013 10:21, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 31 May 2013 16:19, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> On 22 May 2013 14:04, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>>> [...]
>>
>> Ping!!
>
> Ping!!

Hi Thomas,

Do you still see some light of hope in this patch? :)

--
viresh
Viresh Kumar Aug. 7, 2013, 9:55 a.m. UTC | #6
On 24 July 2013 14:47, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 18 June 2013 10:21, Viresh Kumar <viresh.kumar@linaro.org> wrote:

> Hi Thomas,
>
> Do you still see some light of hope in this patch? :)

Ping!!
Sebastian Andrzej Siewior Oct. 4, 2013, 12:39 p.m. UTC | #7
* Viresh Kumar | 2013-08-07 15:25:55 [+0530]:

>On 24 July 2013 14:47, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> On 18 June 2013 10:21, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
>> Hi Thomas,
>>
>> Do you still see some light of hope in this patch? :)
>
>Ping!!

tglx, could you please take a look at this patch / thread?

Sebastian
Viresh Kumar Oct. 23, 2013, 5:55 a.m. UTC | #8
On 4 October 2013 18:09, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> tglx, could you please take a look at this patch / thread?

Thomas, Ping!!

Patch

diff --git a/include/linux/timer.h b/include/linux/timer.h
index 8c5a197..ad00ebe 100644
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -20,6 +20,7 @@  struct timer_list {

        void (*function)(unsigned long);
        unsigned long data;
+       int wait_for_migration_to_complete;

        int slack;

diff --git a/kernel/timer.c b/kernel/timer.c
index a860bba..7791f28 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -746,21 +746,15 @@  __mod_timer(struct timer_list *timer, unsigned
long expires,
        new_base = per_cpu(tvec_bases, cpu);

        if (base != new_base) {
-               /*
-                * We are trying to schedule the timer on the local CPU.
-                * However we can't change timer's base while it is running,
-                * otherwise del_timer_sync() can't detect that the timer's
-                * handler yet has not finished. This also guarantees that
-                * the timer is serialized wrt itself.
-                */
-               if (likely(base->running_timer != timer)) {
-                       /* See the comment in lock_timer_base() */
-                       timer_set_base(timer, NULL);
-                       spin_unlock(&base->lock);
-                       base = new_base;
-                       spin_lock(&base->lock);
-                       timer_set_base(timer, base);
-               }
+               if (base->running_timer == timer)
+                       timer->wait_for_migration_to_complete = 1;
+
+               /* See the comment in lock_timer_base() */
+               timer_set_base(timer, NULL);
+               spin_unlock(&base->lock);
+               base = new_base;
+               spin_lock(&base->lock);
+               timer_set_base(timer, base);
        }