diff mbox series

pm_runtime: move autosuspend on hrtimer

Message ID 1543393048-30426-1-git-send-email-vincent.guittot@linaro.org
State Superseded
Headers show
Series pm_runtime: move autosuspend on hrtimer | expand

Commit Message

Vincent Guittot Nov. 28, 2018, 8:17 a.m. UTC
pm runtime uses the timer infrastructure for autosuspend. This implies that
the minimum time before autosuspending a device is in the range
[1 tick - 2 ticks [
-On arm64 this means [4ms - 8ms[ with default jiffies configuration
-And on arm, it is [10ms - 20ms[

These values are quite high for embedded system which sometimes wants
duration in the range of 1 ms.

We can move autosuspend on hrtimer to get finer granularity for short
duration and takes advantage of slack to keep some margins and gathers
long timeout in minimum wakeups.

On an arm64 platform that uses 1ms of its GPU, the power consumption
improves by 10% for idle use case with hrtimer.

The performance impact on arm64 hikey octo coresis :
- mark_last_busy: from 1.11 us to 1.25 us
- rpm_suspend: from 15.54 us to 15.38 us
  Only the code path of rpm_suspend that starts hrtimer has been measured

arm64 image (arm64 default defconfig) decreases by around 3KB
with following details:

$ size vmlinux-timer
   text	   data	    bss	    dec	    hex	filename
12034646	6869268	 386840	19290754	1265a82	vmlinux

$ size vmlinux-hrtimer
   text	   data	    bss	    dec	    hex	filename
12030550	6870164	 387032	19287746	1264ec2	vmlinux

The performance impact on arm 32bits snowball dual cores is :
- mark_last_busy: from 0.31 us usec to 0.77 us
- rpm_suspend: from 6.83 us to 6.67 usec

The increase of the image for snowball platform that I used for testing
performance impact, is neglictable (244B)

$ size vmlinux-timer
   text	   data	    bss	    dec	    hex	filename
7157961	2119580	 264120	9541661	 91981d	build-ux500/vmlinux

size vmlinux-hrtimer
   text	   data	    bss	    dec	    hex	filename
7157773	2119884	 264248	9541905	 919911	vmlinux-hrtimer

And arm 32bits image (multi_v7_defconfig) increases by around 1.7KB
with following details:

$ size vmlinux-timer
   text	   data	    bss	    dec	    hex	filename
13304443	6803420	 402768	20510631	138f7a7	vmlinux

$ size vmlinux-hrtimer
   text	   data	    bss	    dec	    hex	filename
13304299	6805276	 402768	20512343	138fe57	vmlinux

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>

---
 drivers/base/power/runtime.c | 63 ++++++++++++++++++++++++--------------------
 include/linux/pm.h           |  5 ++--
 include/linux/pm_runtime.h   |  6 ++---
 3 files changed, 40 insertions(+), 34 deletions(-)

-- 
2.7.4

Comments

Ulf Hansson Dec. 11, 2018, 1:14 p.m. UTC | #1
On Wed, 28 Nov 2018 at 09:17, Vincent Guittot
<vincent.guittot@linaro.org> wrote:
>

> pm runtime uses the timer infrastructure for autosuspend. This implies that

> the minimum time before autosuspending a device is in the range

> [1 tick - 2 ticks [

> -On arm64 this means [4ms - 8ms[ with default jiffies configuration

> -And on arm, it is [10ms - 20ms[


Nitpick: The brackets defining the ranges looks a bit odd to me.

>

> These values are quite high for embedded system which sometimes wants

> duration in the range of 1 ms.

>

> We can move autosuspend on hrtimer to get finer granularity for short

> duration and takes advantage of slack to keep some margins and gathers

> long timeout in minimum wakeups.

>

> On an arm64 platform that uses 1ms of its GPU, the power consumption


Maybe point to the driver in question and possibly also mention that
the 1 ms you refer to, is the "autosuspend timeout".

> improves by 10% for idle use case with hrtimer.

>

> The performance impact on arm64 hikey octo coresis :


I suggest to use the terminology "latency", rather than "performance",
as I think it makes it more clear.

> - mark_last_busy: from 1.11 us to 1.25 us

> - rpm_suspend: from 15.54 us to 15.38 us

>   Only the code path of rpm_suspend that starts hrtimer has been measured

>

> arm64 image (arm64 default defconfig) decreases by around 3KB

> with following details:

>

> $ size vmlinux-timer

>    text    data     bss     dec     hex filename

> 12034646        6869268  386840 19290754        1265a82 vmlinux

>

> $ size vmlinux-hrtimer

>    text    data     bss     dec     hex filename

> 12030550        6870164  387032 19287746        1264ec2 vmlinux

>

> The performance impact on arm 32bits snowball dual cores is :

> - mark_last_busy: from 0.31 us usec to 0.77 us

> - rpm_suspend: from 6.83 us to 6.67 usec

>

> The increase of the image for snowball platform that I used for testing

> performance impact, is neglictable (244B)

>

> $ size vmlinux-timer

>    text    data     bss     dec     hex filename

> 7157961 2119580  264120 9541661  91981d build-ux500/vmlinux

>

> size vmlinux-hrtimer

>    text    data     bss     dec     hex filename

> 7157773 2119884  264248 9541905  919911 vmlinux-hrtimer

>

> And arm 32bits image (multi_v7_defconfig) increases by around 1.7KB

> with following details:

>

> $ size vmlinux-timer

>    text    data     bss     dec     hex filename

> 13304443        6803420  402768 20510631        138f7a7 vmlinux

>

> $ size vmlinux-hrtimer

>    text    data     bss     dec     hex filename

> 13304299        6805276  402768 20512343        138fe57 vmlinux


I am impressed that you cared to collect and provide all this data,
you make me feel lazy. :-)

>

> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>

> ---

>  drivers/base/power/runtime.c | 63 ++++++++++++++++++++++++--------------------

>  include/linux/pm.h           |  5 ++--

>  include/linux/pm_runtime.h   |  6 ++---

>  3 files changed, 40 insertions(+), 34 deletions(-)

>

> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c

> index beb85c3..7062469 100644

> --- a/drivers/base/power/runtime.c

> +++ b/drivers/base/power/runtime.c

> @@ -8,6 +8,8 @@

>   */

>

>  #include <linux/sched/mm.h>

> +#include <linux/ktime.h>

> +#include <linux/hrtimer.h>

>  #include <linux/export.h>

>  #include <linux/pm_runtime.h>

>  #include <linux/pm_wakeirq.h>

> @@ -93,7 +95,7 @@ static void __update_runtime_status(struct device *dev, enum rpm_status status)

>  static void pm_runtime_deactivate_timer(struct device *dev)

>  {

>         if (dev->power.timer_expires > 0) {

> -               del_timer(&dev->power.suspend_timer);

> +               hrtimer_cancel(&dev->power.suspend_timer);

>                 dev->power.timer_expires = 0;

>         }

>  }

> @@ -124,12 +126,11 @@ static void pm_runtime_cancel_pending(struct device *dev)

>   * This function may be called either with or without dev->power.lock held.

>   * Either way it can be racy, since power.last_busy may be updated at any time.

>   */

> -unsigned long pm_runtime_autosuspend_expiration(struct device *dev)

> +u64 pm_runtime_autosuspend_expiration(struct device *dev)

>  {

>         int autosuspend_delay;

> -       long elapsed;

> -       unsigned long last_busy;

> -       unsigned long expires = 0;

> +       u64 last_busy, expires = 0;

> +       u64 now = ktime_to_ns(ktime_get());

>

>         if (!dev->power.use_autosuspend)

>                 goto out;

> @@ -139,19 +140,9 @@ unsigned long pm_runtime_autosuspend_expiration(struct device *dev)

>                 goto out;

>

>         last_busy = READ_ONCE(dev->power.last_busy);

> -       elapsed = jiffies - last_busy;

> -       if (elapsed < 0)

> -               goto out;       /* jiffies has wrapped around. */

>

> -       /*

> -        * If the autosuspend_delay is >= 1 second, align the timer by rounding

> -        * up to the nearest second.

> -        */

> -       expires = last_busy + msecs_to_jiffies(autosuspend_delay);

> -       if (autosuspend_delay >= 1000)

> -               expires = round_jiffies(expires);

> -       expires += !expires;

> -       if (elapsed >= expires - last_busy)

> +       expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;


It may be a good idea to convert the last_busy member in struct
dev_pm_info, into a ktime_t instead. In this way, we don't need to
convert to nanoseconds, every time pm_runtime_mark_last_busy() is
called - and more importantly, it should also help to address the
"overflow" scenario. Or maybe you already deal with that?

Moreover, you may want to look into more of the ktime helpers, while
doing the different computations. Like ktime_get_ns() and ktime_sub(),
for example.

> +       if (expires <= now)

>                 expires = 0;    /* Already expired. */

>

>   out:

> @@ -515,7 +506,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)

>         /* If the autosuspend_delay time hasn't expired yet, reschedule. */

>         if ((rpmflags & RPM_AUTO)

>             && dev->power.runtime_status != RPM_SUSPENDING) {

> -               unsigned long expires = pm_runtime_autosuspend_expiration(dev);

> +               u64 expires = pm_runtime_autosuspend_expiration(dev);

>

>                 if (expires != 0) {

>                         /* Pending requests need to be canceled. */

> @@ -528,10 +519,20 @@ static int rpm_suspend(struct device *dev, int rpmflags)

>                          * expire; pm_suspend_timer_fn() will take care of the

>                          * rest.

>                          */

> -                       if (!(dev->power.timer_expires && time_before_eq(

> -                           dev->power.timer_expires, expires))) {

> +                       if (!(dev->power.timer_expires &&

> +                                       dev->power.timer_expires <= expires)) {

> +                               /*

> +                                * We add a slack of 25% to gather wakeups

> +                                * without sacrificing the granularity.

> +                                */

> +                               u64 slack = READ_ONCE(dev->power.autosuspend_delay) *

> +                                                   (NSEC_PER_MSEC >> 2);

> +

>                                 dev->power.timer_expires = expires;

> -                               mod_timer(&dev->power.suspend_timer, expires);

> +                               hrtimer_start_range_ns(&dev->power.suspend_timer,

> +                                               ns_to_ktime(expires),

> +                                               slack,

> +                                               HRTIMER_MODE_ABS);

>                         }

>                         dev->power.timer_autosuspends = 1;

>                         goto out;

> @@ -895,23 +896,25 @@ static void pm_runtime_work(struct work_struct *work)

>   *

>   * Check if the time is right and queue a suspend request.

>   */

> -static void pm_suspend_timer_fn(struct timer_list *t)

> +static enum hrtimer_restart  pm_suspend_timer_fn(struct hrtimer *timer)

>  {

> -       struct device *dev = from_timer(dev, t, power.suspend_timer);

> +       struct device *dev = container_of(timer, struct device, power.suspend_timer);

>         unsigned long flags;

> -       unsigned long expires;

> +       u64 expires;

>

>         spin_lock_irqsave(&dev->power.lock, flags);

>

>         expires = dev->power.timer_expires;

>         /* If 'expire' is after 'jiffies' we've been called too early. */

> -       if (expires > 0 && !time_after(expires, jiffies)) {

> +       if (expires > 0 && expires < ktime_to_ns(ktime_get())) {

>                 dev->power.timer_expires = 0;

>                 rpm_suspend(dev, dev->power.timer_autosuspends ?

>                     (RPM_ASYNC | RPM_AUTO) : RPM_ASYNC);

>         }

>

>         spin_unlock_irqrestore(&dev->power.lock, flags);

> +

> +       return HRTIMER_NORESTART;

>  }

>

>  /**

> @@ -922,6 +925,7 @@ static void pm_suspend_timer_fn(struct timer_list *t)

>  int pm_schedule_suspend(struct device *dev, unsigned int delay)

>  {

>         unsigned long flags;

> +       ktime_t expires;

>         int retval;

>

>         spin_lock_irqsave(&dev->power.lock, flags);

> @@ -938,10 +942,10 @@ int pm_schedule_suspend(struct device *dev, unsigned int delay)

>         /* Other scheduled or pending requests need to be canceled. */

>         pm_runtime_cancel_pending(dev);

>

> -       dev->power.timer_expires = jiffies + msecs_to_jiffies(delay);

> -       dev->power.timer_expires += !dev->power.timer_expires;

> +       expires = ktime_add(ktime_get(), ms_to_ktime(delay));

> +       dev->power.timer_expires = ktime_to_ns(expires);

>         dev->power.timer_autosuspends = 0;

> -       mod_timer(&dev->power.suspend_timer, dev->power.timer_expires);

> +       hrtimer_start(&dev->power.suspend_timer, expires, HRTIMER_MODE_ABS);

>

>   out:

>         spin_unlock_irqrestore(&dev->power.lock, flags);

> @@ -1491,7 +1495,8 @@ void pm_runtime_init(struct device *dev)

>         INIT_WORK(&dev->power.work, pm_runtime_work);

>

>         dev->power.timer_expires = 0;

> -       timer_setup(&dev->power.suspend_timer, pm_suspend_timer_fn, 0);

> +       hrtimer_init(&dev->power.suspend_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);

> +       dev->power.suspend_timer.function = pm_suspend_timer_fn;

>

>         init_waitqueue_head(&dev->power.wait_queue);

>  }

> diff --git a/include/linux/pm.h b/include/linux/pm.h

> index e723b78..0bd9de1 100644

> --- a/include/linux/pm.h

> +++ b/include/linux/pm.h

> @@ -26,6 +26,7 @@

>  #include <linux/spinlock.h>

>  #include <linux/wait.h>

>  #include <linux/timer.h>

> +#include <linux/hrtimer.h>

>  #include <linux/completion.h>

>

>  /*

> @@ -608,7 +609,7 @@ struct dev_pm_info {

>         unsigned int            should_wakeup:1;

>  #endif

>  #ifdef CONFIG_PM

> -       struct timer_list       suspend_timer;

> +       struct hrtimer          suspend_timer;

>         unsigned long           timer_expires;

>         struct work_struct      work;

>         wait_queue_head_t       wait_queue;

> @@ -631,7 +632,7 @@ struct dev_pm_info {

>         enum rpm_status         runtime_status;

>         int                     runtime_error;

>         int                     autosuspend_delay;

> -       unsigned long           last_busy;

> +       u64                     last_busy;

>         unsigned long           active_jiffies;

>         unsigned long           suspended_jiffies;

>         unsigned long           accounting_timestamp;

> diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h

> index f0fc470..54af4ee 100644

> --- a/include/linux/pm_runtime.h

> +++ b/include/linux/pm_runtime.h

> @@ -51,7 +51,7 @@ extern void pm_runtime_no_callbacks(struct device *dev);

>  extern void pm_runtime_irq_safe(struct device *dev);

>  extern void __pm_runtime_use_autosuspend(struct device *dev, bool use);

>  extern void pm_runtime_set_autosuspend_delay(struct device *dev, int delay);

> -extern unsigned long pm_runtime_autosuspend_expiration(struct device *dev);

> +extern u64 pm_runtime_autosuspend_expiration(struct device *dev);

>  extern void pm_runtime_update_max_time_suspended(struct device *dev,

>                                                  s64 delta_ns);

>  extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);

> @@ -105,7 +105,7 @@ static inline bool pm_runtime_callbacks_present(struct device *dev)

>

>  static inline void pm_runtime_mark_last_busy(struct device *dev)

>  {

> -       WRITE_ONCE(dev->power.last_busy, jiffies);

> +       WRITE_ONCE(dev->power.last_busy, ktime_to_ns(ktime_get()));

>  }

>

>  static inline bool pm_runtime_is_irq_safe(struct device *dev)

> @@ -168,7 +168,7 @@ static inline void __pm_runtime_use_autosuspend(struct device *dev,

>                                                 bool use) {}

>  static inline void pm_runtime_set_autosuspend_delay(struct device *dev,

>                                                 int delay) {}

> -static inline unsigned long pm_runtime_autosuspend_expiration(

> +static inline u64 pm_runtime_autosuspend_expiration(

>                                 struct device *dev) { return 0; }

>  static inline void pm_runtime_set_memalloc_noio(struct device *dev,

>                                                 bool enable){}

> --

> 2.7.4

>


An overall comment.

Assuming that we agree that $subject patch seems like a reasonable
change to do, then I think we should also convert the "accounting"
data, which tracks the time a device have been
runtime_suspended|resumed, into using ktime rather than the jiffies.
Genpd already use ktime for this matter.

Would you mind, already at this step, adding a patch converting the
"accounting" data into ktime? I think it would provide a more complete
solution, hence the request.

Kind regards
Uffe
Vincent Guittot Dec. 11, 2018, 2 p.m. UTC | #2
On Tue, 11 Dec 2018 at 14:15, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>

> On Wed, 28 Nov 2018 at 09:17, Vincent Guittot

> <vincent.guittot@linaro.org> wrote:

> >

> > pm runtime uses the timer infrastructure for autosuspend. This implies that

> > the minimum time before autosuspending a device is in the range

> > [1 tick - 2 ticks [

> > -On arm64 this means [4ms - 8ms[ with default jiffies configuration

> > -And on arm, it is [10ms - 20ms[

>

> Nitpick: The brackets defining the ranges looks a bit odd to me.


The range is between 10 ms included and 20ms excluded
What do you suggest ?

>

> >

> > These values are quite high for embedded system which sometimes wants

> > duration in the range of 1 ms.

> >

> > We can move autosuspend on hrtimer to get finer granularity for short

> > duration and takes advantage of slack to keep some margins and gathers

> > long timeout in minimum wakeups.

> >

> > On an arm64 platform that uses 1ms of its GPU, the power consumption

>

> Maybe point to the driver in question and possibly also mention that

> the 1 ms you refer to, is the "autosuspend timeout".


This has been done on an android phone and I'm afraid that the driver
is upstreamed mainline yet

>

> > improves by 10% for idle use case with hrtimer.

> >

> > The performance impact on arm64 hikey octo coresis :

>

> I suggest to use the terminology "latency", rather than "performance",

> as I think it makes it more clear.


ok

>

> > - mark_last_busy: from 1.11 us to 1.25 us

> > - rpm_suspend: from 15.54 us to 15.38 us

> >   Only the code path of rpm_suspend that starts hrtimer has been measured

> >

> > arm64 image (arm64 default defconfig) decreases by around 3KB

> > with following details:

> >

> > $ size vmlinux-timer

> >    text    data     bss     dec     hex filename

> > 12034646        6869268  386840 19290754        1265a82 vmlinux

> >

> > $ size vmlinux-hrtimer

> >    text    data     bss     dec     hex filename

> > 12030550        6870164  387032 19287746        1264ec2 vmlinux

> >

> > The performance impact on arm 32bits snowball dual cores is :

> > - mark_last_busy: from 0.31 us usec to 0.77 us

> > - rpm_suspend: from 6.83 us to 6.67 usec

> >

> > The increase of the image for snowball platform that I used for testing

> > performance impact, is neglictable (244B)

> >

> > $ size vmlinux-timer

> >    text    data     bss     dec     hex filename

> > 7157961 2119580  264120 9541661  91981d build-ux500/vmlinux

> >

> > size vmlinux-hrtimer

> >    text    data     bss     dec     hex filename

> > 7157773 2119884  264248 9541905  919911 vmlinux-hrtimer

> >

> > And arm 32bits image (multi_v7_defconfig) increases by around 1.7KB

> > with following details:

> >

> > $ size vmlinux-timer

> >    text    data     bss     dec     hex filename

> > 13304443        6803420  402768 20510631        138f7a7 vmlinux

> >

> > $ size vmlinux-hrtimer

> >    text    data     bss     dec     hex filename

> > 13304299        6805276  402768 20512343        138fe57 vmlinux

>

> I am impressed that you cared to collect and provide all this data,

> you make me feel lazy. :-)


;-)

>

> >

> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>

> > ---

> >  drivers/base/power/runtime.c | 63 ++++++++++++++++++++++++--------------------

> >  include/linux/pm.h           |  5 ++--

> >  include/linux/pm_runtime.h   |  6 ++---

> >  3 files changed, 40 insertions(+), 34 deletions(-)

> >

> > diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c

> > index beb85c3..7062469 100644

> > --- a/drivers/base/power/runtime.c

> > +++ b/drivers/base/power/runtime.c

> > @@ -8,6 +8,8 @@

> >   */

> >

> >  #include <linux/sched/mm.h>

> > +#include <linux/ktime.h>

> > +#include <linux/hrtimer.h>

> >  #include <linux/export.h>

> >  #include <linux/pm_runtime.h>

> >  #include <linux/pm_wakeirq.h>

> > @@ -93,7 +95,7 @@ static void __update_runtime_status(struct device *dev, enum rpm_status status)

> >  static void pm_runtime_deactivate_timer(struct device *dev)

> >  {

> >         if (dev->power.timer_expires > 0) {

> > -               del_timer(&dev->power.suspend_timer);

> > +               hrtimer_cancel(&dev->power.suspend_timer);

> >                 dev->power.timer_expires = 0;

> >         }

> >  }

> > @@ -124,12 +126,11 @@ static void pm_runtime_cancel_pending(struct device *dev)

> >   * This function may be called either with or without dev->power.lock held.

> >   * Either way it can be racy, since power.last_busy may be updated at any time.

> >   */

> > -unsigned long pm_runtime_autosuspend_expiration(struct device *dev)

> > +u64 pm_runtime_autosuspend_expiration(struct device *dev)

> >  {

> >         int autosuspend_delay;

> > -       long elapsed;

> > -       unsigned long last_busy;

> > -       unsigned long expires = 0;

> > +       u64 last_busy, expires = 0;

> > +       u64 now = ktime_to_ns(ktime_get());

> >

> >         if (!dev->power.use_autosuspend)

> >                 goto out;

> > @@ -139,19 +140,9 @@ unsigned long pm_runtime_autosuspend_expiration(struct device *dev)

> >                 goto out;

> >

> >         last_busy = READ_ONCE(dev->power.last_busy);

> > -       elapsed = jiffies - last_busy;

> > -       if (elapsed < 0)

> > -               goto out;       /* jiffies has wrapped around. */

> >

> > -       /*

> > -        * If the autosuspend_delay is >= 1 second, align the timer by rounding

> > -        * up to the nearest second.

> > -        */

> > -       expires = last_busy + msecs_to_jiffies(autosuspend_delay);

> > -       if (autosuspend_delay >= 1000)

> > -               expires = round_jiffies(expires);

> > -       expires += !expires;

> > -       if (elapsed >= expires - last_busy)

> > +       expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;

>

> It may be a good idea to convert the last_busy member in struct

> dev_pm_info, into a ktime_t instead. In this way, we don't need to

> convert to nanoseconds, every time pm_runtime_mark_last_busy() is


I'm using the raw ns value so it doesn't depend on a struct and we can
do direct comparison
The code looks easier to read IMO

expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;

compare to

expires = ms_to_ktime(autosuspend_delay);
expires = ktime_add(expires, last_busy);

In addition, the current ktime_get_ns() is a nop and doesn't add any overhead

> called - and more importantly, it should also help to address the

> "overflow" scenario. Or maybe you already deal with that?


AFAICT, we don't consider the overflow case with 64bits

>

> Moreover, you may want to look into more of the ktime helpers, while

> doing the different computations. Like ktime_get_ns() and ktime_sub(),

> for example.

>

> > +       if (expires <= now)

> >                 expires = 0;    /* Already expired. */

> >

> >   out:

> > @@ -515,7 +506,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)

> >         /* If the autosuspend_delay time hasn't expired yet, reschedule. */

> >         if ((rpmflags & RPM_AUTO)

> >             && dev->power.runtime_status != RPM_SUSPENDING) {

> > -               unsigned long expires = pm_runtime_autosuspend_expiration(dev);

> > +               u64 expires = pm_runtime_autosuspend_expiration(dev);

> >

> >                 if (expires != 0) {

> >                         /* Pending requests need to be canceled. */

> > @@ -528,10 +519,20 @@ static int rpm_suspend(struct device *dev, int rpmflags)

> >                          * expire; pm_suspend_timer_fn() will take care of the

> >                          * rest.

> >                          */

> > -                       if (!(dev->power.timer_expires && time_before_eq(

> > -                           dev->power.timer_expires, expires))) {

> > +                       if (!(dev->power.timer_expires &&

> > +                                       dev->power.timer_expires <= expires)) {

> > +                               /*

> > +                                * We add a slack of 25% to gather wakeups

> > +                                * without sacrificing the granularity.

> > +                                */

> > +                               u64 slack = READ_ONCE(dev->power.autosuspend_delay) *

> > +                                                   (NSEC_PER_MSEC >> 2);

> > +

> >                                 dev->power.timer_expires = expires;

> > -                               mod_timer(&dev->power.suspend_timer, expires);

> > +                               hrtimer_start_range_ns(&dev->power.suspend_timer,

> > +                                               ns_to_ktime(expires),

> > +                                               slack,

> > +                                               HRTIMER_MODE_ABS);

> >                         }

> >                         dev->power.timer_autosuspends = 1;

> >                         goto out;

> > @@ -895,23 +896,25 @@ static void pm_runtime_work(struct work_struct *work)

> >   *

> >   * Check if the time is right and queue a suspend request.

> >   */

> > -static void pm_suspend_timer_fn(struct timer_list *t)

> > +static enum hrtimer_restart  pm_suspend_timer_fn(struct hrtimer *timer)

> >  {

> > -       struct device *dev = from_timer(dev, t, power.suspend_timer);

> > +       struct device *dev = container_of(timer, struct device, power.suspend_timer);

> >         unsigned long flags;

> > -       unsigned long expires;

> > +       u64 expires;

> >

> >         spin_lock_irqsave(&dev->power.lock, flags);

> >

> >         expires = dev->power.timer_expires;

> >         /* If 'expire' is after 'jiffies' we've been called too early. */

> > -       if (expires > 0 && !time_after(expires, jiffies)) {

> > +       if (expires > 0 && expires < ktime_to_ns(ktime_get())) {

> >                 dev->power.timer_expires = 0;

> >                 rpm_suspend(dev, dev->power.timer_autosuspends ?

> >                     (RPM_ASYNC | RPM_AUTO) : RPM_ASYNC);

> >         }

> >

> >         spin_unlock_irqrestore(&dev->power.lock, flags);

> > +

> > +       return HRTIMER_NORESTART;

> >  }

> >

> >  /**

> > @@ -922,6 +925,7 @@ static void pm_suspend_timer_fn(struct timer_list *t)

> >  int pm_schedule_suspend(struct device *dev, unsigned int delay)

> >  {

> >         unsigned long flags;

> > +       ktime_t expires;

> >         int retval;

> >

> >         spin_lock_irqsave(&dev->power.lock, flags);

> > @@ -938,10 +942,10 @@ int pm_schedule_suspend(struct device *dev, unsigned int delay)

> >         /* Other scheduled or pending requests need to be canceled. */

> >         pm_runtime_cancel_pending(dev);

> >

> > -       dev->power.timer_expires = jiffies + msecs_to_jiffies(delay);

> > -       dev->power.timer_expires += !dev->power.timer_expires;

> > +       expires = ktime_add(ktime_get(), ms_to_ktime(delay));

> > +       dev->power.timer_expires = ktime_to_ns(expires);

> >         dev->power.timer_autosuspends = 0;

> > -       mod_timer(&dev->power.suspend_timer, dev->power.timer_expires);

> > +       hrtimer_start(&dev->power.suspend_timer, expires, HRTIMER_MODE_ABS);

> >

> >   out:

> >         spin_unlock_irqrestore(&dev->power.lock, flags);

> > @@ -1491,7 +1495,8 @@ void pm_runtime_init(struct device *dev)

> >         INIT_WORK(&dev->power.work, pm_runtime_work);

> >

> >         dev->power.timer_expires = 0;

> > -       timer_setup(&dev->power.suspend_timer, pm_suspend_timer_fn, 0);

> > +       hrtimer_init(&dev->power.suspend_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);

> > +       dev->power.suspend_timer.function = pm_suspend_timer_fn;

> >

> >         init_waitqueue_head(&dev->power.wait_queue);

> >  }

> > diff --git a/include/linux/pm.h b/include/linux/pm.h

> > index e723b78..0bd9de1 100644

> > --- a/include/linux/pm.h

> > +++ b/include/linux/pm.h

> > @@ -26,6 +26,7 @@

> >  #include <linux/spinlock.h>

> >  #include <linux/wait.h>

> >  #include <linux/timer.h>

> > +#include <linux/hrtimer.h>

> >  #include <linux/completion.h>

> >

> >  /*

> > @@ -608,7 +609,7 @@ struct dev_pm_info {

> >         unsigned int            should_wakeup:1;

> >  #endif

> >  #ifdef CONFIG_PM

> > -       struct timer_list       suspend_timer;

> > +       struct hrtimer          suspend_timer;

> >         unsigned long           timer_expires;

> >         struct work_struct      work;

> >         wait_queue_head_t       wait_queue;

> > @@ -631,7 +632,7 @@ struct dev_pm_info {

> >         enum rpm_status         runtime_status;

> >         int                     runtime_error;

> >         int                     autosuspend_delay;

> > -       unsigned long           last_busy;

> > +       u64                     last_busy;

> >         unsigned long           active_jiffies;

> >         unsigned long           suspended_jiffies;

> >         unsigned long           accounting_timestamp;

> > diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h

> > index f0fc470..54af4ee 100644

> > --- a/include/linux/pm_runtime.h

> > +++ b/include/linux/pm_runtime.h

> > @@ -51,7 +51,7 @@ extern void pm_runtime_no_callbacks(struct device *dev);

> >  extern void pm_runtime_irq_safe(struct device *dev);

> >  extern void __pm_runtime_use_autosuspend(struct device *dev, bool use);

> >  extern void pm_runtime_set_autosuspend_delay(struct device *dev, int delay);

> > -extern unsigned long pm_runtime_autosuspend_expiration(struct device *dev);

> > +extern u64 pm_runtime_autosuspend_expiration(struct device *dev);

> >  extern void pm_runtime_update_max_time_suspended(struct device *dev,

> >                                                  s64 delta_ns);

> >  extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);

> > @@ -105,7 +105,7 @@ static inline bool pm_runtime_callbacks_present(struct device *dev)

> >

> >  static inline void pm_runtime_mark_last_busy(struct device *dev)

> >  {

> > -       WRITE_ONCE(dev->power.last_busy, jiffies);

> > +       WRITE_ONCE(dev->power.last_busy, ktime_to_ns(ktime_get()));

> >  }

> >

> >  static inline bool pm_runtime_is_irq_safe(struct device *dev)

> > @@ -168,7 +168,7 @@ static inline void __pm_runtime_use_autosuspend(struct device *dev,

> >                                                 bool use) {}

> >  static inline void pm_runtime_set_autosuspend_delay(struct device *dev,

> >                                                 int delay) {}

> > -static inline unsigned long pm_runtime_autosuspend_expiration(

> > +static inline u64 pm_runtime_autosuspend_expiration(

> >                                 struct device *dev) { return 0; }

> >  static inline void pm_runtime_set_memalloc_noio(struct device *dev,

> >                                                 bool enable){}

> > --

> > 2.7.4

> >

>

> An overall comment.

>

> Assuming that we agree that $subject patch seems like a reasonable

> change to do, then I think we should also convert the "accounting"

> data, which tracks the time a device have been

> runtime_suspended|resumed, into using ktime rather than the jiffies.


I wanted to focus on the timer vs hrtimer in this 1st version and the
patch for converting accounting was not ready yet
The conversion of the accounting will be added in a next version

Regards,
Vincent

> Genpd already use ktime for this matter.

>

> Would you mind, already at this step, adding a patch converting the

> "accounting" data into ktime? I think it would provide a more complete

> solution, hence the request.

>

> Kind regards

> Uffe
Ulf Hansson Dec. 11, 2018, 2:19 p.m. UTC | #3
On Tue, 11 Dec 2018 at 15:00, Vincent Guittot
<vincent.guittot@linaro.org> wrote:
>

> On Tue, 11 Dec 2018 at 14:15, Ulf Hansson <ulf.hansson@linaro.org> wrote:

> >

> > On Wed, 28 Nov 2018 at 09:17, Vincent Guittot

> > <vincent.guittot@linaro.org> wrote:

> > >

> > > pm runtime uses the timer infrastructure for autosuspend. This implies that

> > > the minimum time before autosuspending a device is in the range

> > > [1 tick - 2 ticks [

> > > -On arm64 this means [4ms - 8ms[ with default jiffies configuration

> > > -And on arm, it is [10ms - 20ms[

> >

> > Nitpick: The brackets defining the ranges looks a bit odd to me.

>

> The range is between 10 ms included and 20ms excluded

> What do you suggest ?


Just write it in text, it becomes more obvious, at least to me.

>

> >

> > >

> > > These values are quite high for embedded system which sometimes wants

> > > duration in the range of 1 ms.

> > >

> > > We can move autosuspend on hrtimer to get finer granularity for short

> > > duration and takes advantage of slack to keep some margins and gathers

> > > long timeout in minimum wakeups.

> > >

> > > On an arm64 platform that uses 1ms of its GPU, the power consumption

> >

> > Maybe point to the driver in question and possibly also mention that

> > the 1 ms you refer to, is the "autosuspend timeout".

>

> This has been done on an android phone and I'm afraid that the driver

> is upstreamed mainline yet


Okay, no worries, just clarify that it's the "autosuspend timeout" you
are talking about.

[...]

 > > @@ -139,19 +140,9 @@ unsigned long

pm_runtime_autosuspend_expiration(struct device *dev)
> > >                 goto out;

> > >

> > >         last_busy = READ_ONCE(dev->power.last_busy);

> > > -       elapsed = jiffies - last_busy;

> > > -       if (elapsed < 0)

> > > -               goto out;       /* jiffies has wrapped around. */

> > >

> > > -       /*

> > > -        * If the autosuspend_delay is >= 1 second, align the timer by rounding

> > > -        * up to the nearest second.

> > > -        */

> > > -       expires = last_busy + msecs_to_jiffies(autosuspend_delay);

> > > -       if (autosuspend_delay >= 1000)

> > > -               expires = round_jiffies(expires);

> > > -       expires += !expires;

> > > -       if (elapsed >= expires - last_busy)

> > > +       expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;

> >

> > It may be a good idea to convert the last_busy member in struct

> > dev_pm_info, into a ktime_t instead. In this way, we don't need to

> > convert to nanoseconds, every time pm_runtime_mark_last_busy() is

>

> I'm using the raw ns value so it doesn't depend on a struct and we can

> do direct comparison

> The code looks easier to read IMO

>

> expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;

>

> compare to

>

> expires = ms_to_ktime(autosuspend_delay);

> expires = ktime_add(expires, last_busy);

>

> In addition, the current ktime_get_ns() is a nop and doesn't add any overhead


You have a point!

Still, there is also another reason to why changing to ktime_t could
make sense, which is to be consistent with the way genpd supports the
idle statistics. Genpd stores a ktime_t in the genpd struct. However,
it's not a big deal to me, pick whatever option you prefer.

[...]

> >

> > An overall comment.

> >

> > Assuming that we agree that $subject patch seems like a reasonable

> > change to do, then I think we should also convert the "accounting"

> > data, which tracks the time a device have been

> > runtime_suspended|resumed, into using ktime rather than the jiffies.

>

> I wanted to focus on the timer vs hrtimer in this 1st version and the

> patch for converting accounting was not ready yet

> The conversion of the accounting will be added in a next version


Well, to be clear, please keep it a separate patch. On top of $subject patch.

[...]

Kind regards
Uffe
Vincent Guittot Dec. 11, 2018, 2:24 p.m. UTC | #4
On Tue, 11 Dec 2018 at 15:20, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>

> On Tue, 11 Dec 2018 at 15:00, Vincent Guittot

> <vincent.guittot@linaro.org> wrote:

> >

> > On Tue, 11 Dec 2018 at 14:15, Ulf Hansson <ulf.hansson@linaro.org> wrote:

> > >

> > > On Wed, 28 Nov 2018 at 09:17, Vincent Guittot

> > > <vincent.guittot@linaro.org> wrote:

> > > >

> > > > pm runtime uses the timer infrastructure for autosuspend. This implies that

> > > > the minimum time before autosuspending a device is in the range

> > > > [1 tick - 2 ticks [

> > > > -On arm64 this means [4ms - 8ms[ with default jiffies configuration

> > > > -And on arm, it is [10ms - 20ms[

> > >

> > > Nitpick: The brackets defining the ranges looks a bit odd to me.

> >

> > The range is between 10 ms included and 20ms excluded

> > What do you suggest ?

>

> Just write it in text, it becomes more obvious, at least to me.


Ok

>

> >

> > >

> > > >

> > > > These values are quite high for embedded system which sometimes wants

> > > > duration in the range of 1 ms.

> > > >

> > > > We can move autosuspend on hrtimer to get finer granularity for short

> > > > duration and takes advantage of slack to keep some margins and gathers

> > > > long timeout in minimum wakeups.

> > > >

> > > > On an arm64 platform that uses 1ms of its GPU, the power consumption

> > >

> > > Maybe point to the driver in question and possibly also mention that

> > > the 1 ms you refer to, is the "autosuspend timeout".

> >

> > This has been done on an android phone and I'm afraid that the driver

> > is upstreamed mainline yet

>

> Okay, no worries, just clarify that it's the "autosuspend timeout" you

> are talking about.

>

> [...]

>

>  > > @@ -139,19 +140,9 @@ unsigned long

> pm_runtime_autosuspend_expiration(struct device *dev)

> > > >                 goto out;

> > > >

> > > >         last_busy = READ_ONCE(dev->power.last_busy);

> > > > -       elapsed = jiffies - last_busy;

> > > > -       if (elapsed < 0)

> > > > -               goto out;       /* jiffies has wrapped around. */

> > > >

> > > > -       /*

> > > > -        * If the autosuspend_delay is >= 1 second, align the timer by rounding

> > > > -        * up to the nearest second.

> > > > -        */

> > > > -       expires = last_busy + msecs_to_jiffies(autosuspend_delay);

> > > > -       if (autosuspend_delay >= 1000)

> > > > -               expires = round_jiffies(expires);

> > > > -       expires += !expires;

> > > > -       if (elapsed >= expires - last_busy)

> > > > +       expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;

> > >

> > > It may be a good idea to convert the last_busy member in struct

> > > dev_pm_info, into a ktime_t instead. In this way, we don't need to

> > > convert to nanoseconds, every time pm_runtime_mark_last_busy() is

> >

> > I'm using the raw ns value so it doesn't depend on a struct and we can

> > do direct comparison

> > The code looks easier to read IMO

> >

> > expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;

> >

> > compare to

> >

> > expires = ms_to_ktime(autosuspend_delay);

> > expires = ktime_add(expires, last_busy);

> >

> > In addition, the current ktime_get_ns() is a nop and doesn't add any overhead

>

> You have a point!

>

> Still, there is also another reason to why changing to ktime_t could

> make sense, which is to be consistent with the way genpd supports the

> idle statistics. Genpd stores a ktime_t in the genpd struct. However,

> it's not a big deal to me, pick whatever option you prefer.


let see what other are saying but i would prefer keep raw ns value

>

> [...]

>

> > >

> > > An overall comment.

> > >

> > > Assuming that we agree that $subject patch seems like a reasonable

> > > change to do, then I think we should also convert the "accounting"

> > > data, which tracks the time a device have been

> > > runtime_suspended|resumed, into using ktime rather than the jiffies.

> >

> > I wanted to focus on the timer vs hrtimer in this 1st version and the

> > patch for converting accounting was not ready yet

> > The conversion of the accounting will be added in a next version

>

> Well, to be clear, please keep it a separate patch. On top of $subject patch.


Yes that was my intent

Thanks

>

> [...]

>

> Kind regards

> Uffe
Rafael J. Wysocki Dec. 12, 2018, 12:59 p.m. UTC | #5
On Tue, Dec 11, 2018 at 3:24 PM Vincent Guittot
<vincent.guittot@linaro.org> wrote:
>

> On Tue, 11 Dec 2018 at 15:20, Ulf Hansson <ulf.hansson@linaro.org> wrote:

> >

> > On Tue, 11 Dec 2018 at 15:00, Vincent Guittot

> > <vincent.guittot@linaro.org> wrote:

> > >

> > > On Tue, 11 Dec 2018 at 14:15, Ulf Hansson <ulf.hansson@linaro.org> wrote:

> > > >

> > > > On Wed, 28 Nov 2018 at 09:17, Vincent Guittot

> > > > <vincent.guittot@linaro.org> wrote:

> > > > >

> > > > > pm runtime uses the timer infrastructure for autosuspend. This implies that

> > > > > the minimum time before autosuspending a device is in the range

> > > > > [1 tick - 2 ticks [

> > > > > -On arm64 this means [4ms - 8ms[ with default jiffies configuration

> > > > > -And on arm, it is [10ms - 20ms[

> > > >

> > > > Nitpick: The brackets defining the ranges looks a bit odd to me.

> > >

> > > The range is between 10 ms included and 20ms excluded

> > > What do you suggest ?

> >

> > Just write it in text, it becomes more obvious, at least to me.

>

> Ok

>

> >

> > >

> > > >

> > > > >

> > > > > These values are quite high for embedded system which sometimes wants

> > > > > duration in the range of 1 ms.

> > > > >

> > > > > We can move autosuspend on hrtimer to get finer granularity for short

> > > > > duration and takes advantage of slack to keep some margins and gathers

> > > > > long timeout in minimum wakeups.

> > > > >

> > > > > On an arm64 platform that uses 1ms of its GPU, the power consumption

> > > >

> > > > Maybe point to the driver in question and possibly also mention that

> > > > the 1 ms you refer to, is the "autosuspend timeout".

> > >

> > > This has been done on an android phone and I'm afraid that the driver

> > > is upstreamed mainline yet

> >

> > Okay, no worries, just clarify that it's the "autosuspend timeout" you

> > are talking about.

> >

> > [...]

> >

> >  > > @@ -139,19 +140,9 @@ unsigned long

> > pm_runtime_autosuspend_expiration(struct device *dev)

> > > > >                 goto out;

> > > > >

> > > > >         last_busy = READ_ONCE(dev->power.last_busy);

> > > > > -       elapsed = jiffies - last_busy;

> > > > > -       if (elapsed < 0)

> > > > > -               goto out;       /* jiffies has wrapped around. */

> > > > >

> > > > > -       /*

> > > > > -        * If the autosuspend_delay is >= 1 second, align the timer by rounding

> > > > > -        * up to the nearest second.

> > > > > -        */

> > > > > -       expires = last_busy + msecs_to_jiffies(autosuspend_delay);

> > > > > -       if (autosuspend_delay >= 1000)

> > > > > -               expires = round_jiffies(expires);

> > > > > -       expires += !expires;

> > > > > -       if (elapsed >= expires - last_busy)

> > > > > +       expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;

> > > >

> > > > It may be a good idea to convert the last_busy member in struct

> > > > dev_pm_info, into a ktime_t instead. In this way, we don't need to

> > > > convert to nanoseconds, every time pm_runtime_mark_last_busy() is

> > >

> > > I'm using the raw ns value so it doesn't depend on a struct and we can

> > > do direct comparison

> > > The code looks easier to read IMO

> > >

> > > expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;

> > >

> > > compare to

> > >

> > > expires = ms_to_ktime(autosuspend_delay);

> > > expires = ktime_add(expires, last_busy);

> > >

> > > In addition, the current ktime_get_ns() is a nop and doesn't add any overhead

> >

> > You have a point!

> >

> > Still, there is also another reason to why changing to ktime_t could

> > make sense, which is to be consistent with the way genpd supports the

> > idle statistics. Genpd stores a ktime_t in the genpd struct. However,

> > it's not a big deal to me, pick whatever option you prefer.

>

> let see what other are saying but i would prefer keep raw ns value


Yes, a raw ns value is fine (ktime_t is just more overhead as you have
pointed out).

I will be expecting an update of this one.

Cheers,
Rafael
diff mbox series

Patch

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index beb85c3..7062469 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -8,6 +8,8 @@ 
  */
 
 #include <linux/sched/mm.h>
+#include <linux/ktime.h>
+#include <linux/hrtimer.h>
 #include <linux/export.h>
 #include <linux/pm_runtime.h>
 #include <linux/pm_wakeirq.h>
@@ -93,7 +95,7 @@  static void __update_runtime_status(struct device *dev, enum rpm_status status)
 static void pm_runtime_deactivate_timer(struct device *dev)
 {
 	if (dev->power.timer_expires > 0) {
-		del_timer(&dev->power.suspend_timer);
+		hrtimer_cancel(&dev->power.suspend_timer);
 		dev->power.timer_expires = 0;
 	}
 }
@@ -124,12 +126,11 @@  static void pm_runtime_cancel_pending(struct device *dev)
  * This function may be called either with or without dev->power.lock held.
  * Either way it can be racy, since power.last_busy may be updated at any time.
  */
-unsigned long pm_runtime_autosuspend_expiration(struct device *dev)
+u64 pm_runtime_autosuspend_expiration(struct device *dev)
 {
 	int autosuspend_delay;
-	long elapsed;
-	unsigned long last_busy;
-	unsigned long expires = 0;
+	u64 last_busy, expires = 0;
+	u64 now = ktime_to_ns(ktime_get());
 
 	if (!dev->power.use_autosuspend)
 		goto out;
@@ -139,19 +140,9 @@  unsigned long pm_runtime_autosuspend_expiration(struct device *dev)
 		goto out;
 
 	last_busy = READ_ONCE(dev->power.last_busy);
-	elapsed = jiffies - last_busy;
-	if (elapsed < 0)
-		goto out;	/* jiffies has wrapped around. */
 
-	/*
-	 * If the autosuspend_delay is >= 1 second, align the timer by rounding
-	 * up to the nearest second.
-	 */
-	expires = last_busy + msecs_to_jiffies(autosuspend_delay);
-	if (autosuspend_delay >= 1000)
-		expires = round_jiffies(expires);
-	expires += !expires;
-	if (elapsed >= expires - last_busy)
+	expires = last_busy + autosuspend_delay * NSEC_PER_MSEC;
+	if (expires <= now)
 		expires = 0;	/* Already expired. */
 
  out:
@@ -515,7 +506,7 @@  static int rpm_suspend(struct device *dev, int rpmflags)
 	/* If the autosuspend_delay time hasn't expired yet, reschedule. */
 	if ((rpmflags & RPM_AUTO)
 	    && dev->power.runtime_status != RPM_SUSPENDING) {
-		unsigned long expires = pm_runtime_autosuspend_expiration(dev);
+		u64 expires = pm_runtime_autosuspend_expiration(dev);
 
 		if (expires != 0) {
 			/* Pending requests need to be canceled. */
@@ -528,10 +519,20 @@  static int rpm_suspend(struct device *dev, int rpmflags)
 			 * expire; pm_suspend_timer_fn() will take care of the
 			 * rest.
 			 */
-			if (!(dev->power.timer_expires && time_before_eq(
-			    dev->power.timer_expires, expires))) {
+			if (!(dev->power.timer_expires &&
+					dev->power.timer_expires <= expires)) {
+				/*
+				 * We add a slack of 25% to gather wakeups
+				 * without sacrificing the granularity.
+				 */
+				u64 slack = READ_ONCE(dev->power.autosuspend_delay) *
+						    (NSEC_PER_MSEC >> 2);
+
 				dev->power.timer_expires = expires;
-				mod_timer(&dev->power.suspend_timer, expires);
+				hrtimer_start_range_ns(&dev->power.suspend_timer,
+						ns_to_ktime(expires),
+						slack,
+						HRTIMER_MODE_ABS);
 			}
 			dev->power.timer_autosuspends = 1;
 			goto out;
@@ -895,23 +896,25 @@  static void pm_runtime_work(struct work_struct *work)
  *
  * Check if the time is right and queue a suspend request.
  */
-static void pm_suspend_timer_fn(struct timer_list *t)
+static enum hrtimer_restart  pm_suspend_timer_fn(struct hrtimer *timer)
 {
-	struct device *dev = from_timer(dev, t, power.suspend_timer);
+	struct device *dev = container_of(timer, struct device, power.suspend_timer);
 	unsigned long flags;
-	unsigned long expires;
+	u64 expires;
 
 	spin_lock_irqsave(&dev->power.lock, flags);
 
 	expires = dev->power.timer_expires;
 	/* If 'expire' is after 'jiffies' we've been called too early. */
-	if (expires > 0 && !time_after(expires, jiffies)) {
+	if (expires > 0 && expires < ktime_to_ns(ktime_get())) {
 		dev->power.timer_expires = 0;
 		rpm_suspend(dev, dev->power.timer_autosuspends ?
 		    (RPM_ASYNC | RPM_AUTO) : RPM_ASYNC);
 	}
 
 	spin_unlock_irqrestore(&dev->power.lock, flags);
+
+	return HRTIMER_NORESTART;
 }
 
 /**
@@ -922,6 +925,7 @@  static void pm_suspend_timer_fn(struct timer_list *t)
 int pm_schedule_suspend(struct device *dev, unsigned int delay)
 {
 	unsigned long flags;
+	ktime_t expires;
 	int retval;
 
 	spin_lock_irqsave(&dev->power.lock, flags);
@@ -938,10 +942,10 @@  int pm_schedule_suspend(struct device *dev, unsigned int delay)
 	/* Other scheduled or pending requests need to be canceled. */
 	pm_runtime_cancel_pending(dev);
 
-	dev->power.timer_expires = jiffies + msecs_to_jiffies(delay);
-	dev->power.timer_expires += !dev->power.timer_expires;
+	expires = ktime_add(ktime_get(), ms_to_ktime(delay));
+	dev->power.timer_expires = ktime_to_ns(expires);
 	dev->power.timer_autosuspends = 0;
-	mod_timer(&dev->power.suspend_timer, dev->power.timer_expires);
+	hrtimer_start(&dev->power.suspend_timer, expires, HRTIMER_MODE_ABS);
 
  out:
 	spin_unlock_irqrestore(&dev->power.lock, flags);
@@ -1491,7 +1495,8 @@  void pm_runtime_init(struct device *dev)
 	INIT_WORK(&dev->power.work, pm_runtime_work);
 
 	dev->power.timer_expires = 0;
-	timer_setup(&dev->power.suspend_timer, pm_suspend_timer_fn, 0);
+	hrtimer_init(&dev->power.suspend_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
+	dev->power.suspend_timer.function = pm_suspend_timer_fn;
 
 	init_waitqueue_head(&dev->power.wait_queue);
 }
diff --git a/include/linux/pm.h b/include/linux/pm.h
index e723b78..0bd9de1 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -26,6 +26,7 @@ 
 #include <linux/spinlock.h>
 #include <linux/wait.h>
 #include <linux/timer.h>
+#include <linux/hrtimer.h>
 #include <linux/completion.h>
 
 /*
@@ -608,7 +609,7 @@  struct dev_pm_info {
 	unsigned int		should_wakeup:1;
 #endif
 #ifdef CONFIG_PM
-	struct timer_list	suspend_timer;
+	struct hrtimer		suspend_timer;
 	unsigned long		timer_expires;
 	struct work_struct	work;
 	wait_queue_head_t	wait_queue;
@@ -631,7 +632,7 @@  struct dev_pm_info {
 	enum rpm_status		runtime_status;
 	int			runtime_error;
 	int			autosuspend_delay;
-	unsigned long		last_busy;
+	u64			last_busy;
 	unsigned long		active_jiffies;
 	unsigned long		suspended_jiffies;
 	unsigned long		accounting_timestamp;
diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
index f0fc470..54af4ee 100644
--- a/include/linux/pm_runtime.h
+++ b/include/linux/pm_runtime.h
@@ -51,7 +51,7 @@  extern void pm_runtime_no_callbacks(struct device *dev);
 extern void pm_runtime_irq_safe(struct device *dev);
 extern void __pm_runtime_use_autosuspend(struct device *dev, bool use);
 extern void pm_runtime_set_autosuspend_delay(struct device *dev, int delay);
-extern unsigned long pm_runtime_autosuspend_expiration(struct device *dev);
+extern u64 pm_runtime_autosuspend_expiration(struct device *dev);
 extern void pm_runtime_update_max_time_suspended(struct device *dev,
 						 s64 delta_ns);
 extern void pm_runtime_set_memalloc_noio(struct device *dev, bool enable);
@@ -105,7 +105,7 @@  static inline bool pm_runtime_callbacks_present(struct device *dev)
 
 static inline void pm_runtime_mark_last_busy(struct device *dev)
 {
-	WRITE_ONCE(dev->power.last_busy, jiffies);
+	WRITE_ONCE(dev->power.last_busy, ktime_to_ns(ktime_get()));
 }
 
 static inline bool pm_runtime_is_irq_safe(struct device *dev)
@@ -168,7 +168,7 @@  static inline void __pm_runtime_use_autosuspend(struct device *dev,
 						bool use) {}
 static inline void pm_runtime_set_autosuspend_delay(struct device *dev,
 						int delay) {}
-static inline unsigned long pm_runtime_autosuspend_expiration(
+static inline u64 pm_runtime_autosuspend_expiration(
 				struct device *dev) { return 0; }
 static inline void pm_runtime_set_memalloc_noio(struct device *dev,
 						bool enable){}