diff mbox series

[v3,1/2] PM: runtime: Add pm_runtime_resume_and_get to deal with usage counter

Message ID 20201110092933.3342784-2-zhangqilong3@huawei.com
State Accepted
Commit dd8088d5a8969dc2b42f71d7bc01c25c61a78066
Headers show
Series Fix usage counter leak by adding a general sync ops | expand

Commit Message

Zhang Qilong Nov. 10, 2020, 9:29 a.m. UTC
In many case, we need to check return value of pm_runtime_get_sync, but
it brings a trouble to the usage counter processing. Many callers forget
to decrease the usage counter when it failed, which could resulted in
reference leak. It has been discussed a lot[0][1]. So we add a function
to deal with the usage counter for better coding.

[0]https://lkml.org/lkml/2020/6/14/88
[1]https://patchwork.ozlabs.org/project/linux-tegra/list/?series=178139
Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
---
 include/linux/pm_runtime.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

Comments

Rafael J. Wysocki Nov. 16, 2020, 12:49 p.m. UTC | #1
On Tue, Nov 10, 2020 at 10:25 AM Zhang Qilong <zhangqilong3@huawei.com> wrote:
>
> In many case, we need to check return value of pm_runtime_get_sync, but
> it brings a trouble to the usage counter processing. Many callers forget
> to decrease the usage counter when it failed, which could resulted in
> reference leak. It has been discussed a lot[0][1]. So we add a function
> to deal with the usage counter for better coding.
>
> [0]https://lkml.org/lkml/2020/6/14/88
> [1]https://patchwork.ozlabs.org/project/linux-tegra/list/?series=178139
> Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>

Acked-by: Rafael J. Wysocki  <rafael.j.wysocki@intel.com>

> ---
>  include/linux/pm_runtime.h | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
>
> diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
> index 4b708f4e8eed..b492ae00cc90 100644
> --- a/include/linux/pm_runtime.h
> +++ b/include/linux/pm_runtime.h
> @@ -386,6 +386,27 @@ static inline int pm_runtime_get_sync(struct device *dev)
>         return __pm_runtime_resume(dev, RPM_GET_PUT);
>  }
>
> +/**
> + * pm_runtime_resume_and_get - Bump up usage counter of a device and resume it.
> + * @dev: Target device.
> + *
> + * Resume @dev synchronously and if that is successful, increment its runtime
> + * PM usage counter. Return 0 if the runtime PM usage counter of @dev has been
> + * incremented or a negative error code otherwise.
> + */
> +static inline int pm_runtime_resume_and_get(struct device *dev)
> +{
> +       int ret;
> +
> +       ret = __pm_runtime_resume(dev, RPM_GET_PUT);
> +       if (ret < 0) {
> +               pm_runtime_put_noidle(dev);
> +               return ret;
> +       }
> +
> +       return 0;
> +}
> +
>  /**
>   * pm_runtime_put - Drop device usage counter and queue up "idle check" if 0.
>   * @dev: Target device.
> --
> 2.25.4
>
Rafael J. Wysocki Nov. 30, 2020, 4:37 p.m. UTC | #2
On Fri, Nov 27, 2020 at 11:16 AM Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
>

> Hi Zhang,

>

> On Tue, Nov 10, 2020 at 10:29 AM Zhang Qilong <zhangqilong3@huawei.com> wrote:

> > In many case, we need to check return value of pm_runtime_get_sync, but

> > it brings a trouble to the usage counter processing. Many callers forget

> > to decrease the usage counter when it failed, which could resulted in

> > reference leak. It has been discussed a lot[0][1]. So we add a function

> > to deal with the usage counter for better coding.

> >

> > [0]https://lkml.org/lkml/2020/6/14/88

> > [1]https://patchwork.ozlabs.org/project/linux-tegra/list/?series=178139

> > Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>

>

> Thanks for your patch, which is now commit dd8088d5a8969dc2 ("PM:

> runtime: Add pm_runtime_resume_and_get to deal with usage counter") in

> v5.10-rc5.

>

> > --- a/include/linux/pm_runtime.h

> > +++ b/include/linux/pm_runtime.h

> > @@ -386,6 +386,27 @@ static inline int pm_runtime_get_sync(struct device *dev)

> >         return __pm_runtime_resume(dev, RPM_GET_PUT);

> >  }

> >

> > +/**

> > + * pm_runtime_resume_and_get - Bump up usage counter of a device and resume it.

> > + * @dev: Target device.

> > + *

> > + * Resume @dev synchronously and if that is successful, increment its runtime

> > + * PM usage counter. Return 0 if the runtime PM usage counter of @dev has been

> > + * incremented or a negative error code otherwise.

> > + */

> > +static inline int pm_runtime_resume_and_get(struct device *dev)

>

> Perhaps this function should be called pm_runtime_resume_and_get_sync(),


No, really.

I might consider calling it pm_runtime_acquire(), and adding a
matching _release() as a pm_runtime_get() synonym for that matter, but
not the above.

> to make it clear it does a synchronous get?

>

> I had to look into the implementation to verify that a change like


I'm not sure why, because the kerneldoc is unambiguous AFAICS.

>

> -       ret = pm_runtime_get_sync(&pdev->dev);

> +       ret = pm_runtime_resume_and_get(&pdev->dev);

>

> in the follow-up patches is actually a valid change, maintaining

> synchronous operation. Oh, pm_runtime_resume() is synchronous, too...


Yes, it is.
Laurent Pinchart Nov. 30, 2020, 5:35 p.m. UTC | #3
On Mon, Nov 30, 2020 at 05:37:52PM +0100, Rafael J. Wysocki wrote:
> On Fri, Nov 27, 2020 at 11:16 AM Geert Uytterhoeven wrote:

> > On Tue, Nov 10, 2020 at 10:29 AM Zhang Qilong <zhangqilong3@huawei.com> wrote:

> > > In many case, we need to check return value of pm_runtime_get_sync, but

> > > it brings a trouble to the usage counter processing. Many callers forget

> > > to decrease the usage counter when it failed, which could resulted in

> > > reference leak. It has been discussed a lot[0][1]. So we add a function

> > > to deal with the usage counter for better coding.

> > >

> > > [0]https://lkml.org/lkml/2020/6/14/88

> > > [1]https://patchwork.ozlabs.org/project/linux-tegra/list/?series=178139

> > > Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>

> >

> > Thanks for your patch, which is now commit dd8088d5a8969dc2 ("PM:

> > runtime: Add pm_runtime_resume_and_get to deal with usage counter") in

> > v5.10-rc5.

> >

> > > --- a/include/linux/pm_runtime.h

> > > +++ b/include/linux/pm_runtime.h

> > > @@ -386,6 +386,27 @@ static inline int pm_runtime_get_sync(struct device *dev)

> > >         return __pm_runtime_resume(dev, RPM_GET_PUT);

> > >  }

> > >

> > > +/**

> > > + * pm_runtime_resume_and_get - Bump up usage counter of a device and resume it.

> > > + * @dev: Target device.

> > > + *

> > > + * Resume @dev synchronously and if that is successful, increment its runtime

> > > + * PM usage counter. Return 0 if the runtime PM usage counter of @dev has been

> > > + * incremented or a negative error code otherwise.

> > > + */

> > > +static inline int pm_runtime_resume_and_get(struct device *dev)

> >

> > Perhaps this function should be called pm_runtime_resume_and_get_sync(),

> 

> No, really.

> 

> I might consider calling it pm_runtime_acquire(), and adding a

> matching _release() as a pm_runtime_get() synonym for that matter, but

> not the above.


pm_runtime_acquire() seems better to me too. Would pm_runtime_release()
would be an alias for pm_runtime_put() ?

We would also likely need a pm_runtime_release_autosuspend() too then.
But on that topic, I was wondering, is there a reason we can't select
autosuspend behaviour automatically when autosuspend is enabled ?

> > to make it clear it does a synchronous get?

> >

> > I had to look into the implementation to verify that a change like

> 

> I'm not sure why, because the kerneldoc is unambiguous AFAICS.

> 

> >

> > -       ret = pm_runtime_get_sync(&pdev->dev);

> > +       ret = pm_runtime_resume_and_get(&pdev->dev);

> >

> > in the follow-up patches is actually a valid change, maintaining

> > synchronous operation. Oh, pm_runtime_resume() is synchronous, too...

> 

> Yes, it is.


-- 
Regards,

Laurent Pinchart
Rafael J. Wysocki Nov. 30, 2020, 5:55 p.m. UTC | #4
On Mon, Nov 30, 2020 at 6:35 PM Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
>

> On Mon, Nov 30, 2020 at 05:37:52PM +0100, Rafael J. Wysocki wrote:

> > On Fri, Nov 27, 2020 at 11:16 AM Geert Uytterhoeven wrote:

> > > On Tue, Nov 10, 2020 at 10:29 AM Zhang Qilong <zhangqilong3@huawei.com> wrote:

> > > > In many case, we need to check return value of pm_runtime_get_sync, but

> > > > it brings a trouble to the usage counter processing. Many callers forget

> > > > to decrease the usage counter when it failed, which could resulted in

> > > > reference leak. It has been discussed a lot[0][1]. So we add a function

> > > > to deal with the usage counter for better coding.

> > > >

> > > > [0]https://lkml.org/lkml/2020/6/14/88

> > > > [1]https://patchwork.ozlabs.org/project/linux-tegra/list/?series=178139

> > > > Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>

> > >

> > > Thanks for your patch, which is now commit dd8088d5a8969dc2 ("PM:

> > > runtime: Add pm_runtime_resume_and_get to deal with usage counter") in

> > > v5.10-rc5.

> > >

> > > > --- a/include/linux/pm_runtime.h

> > > > +++ b/include/linux/pm_runtime.h

> > > > @@ -386,6 +386,27 @@ static inline int pm_runtime_get_sync(struct device *dev)

> > > >         return __pm_runtime_resume(dev, RPM_GET_PUT);

> > > >  }

> > > >

> > > > +/**

> > > > + * pm_runtime_resume_and_get - Bump up usage counter of a device and resume it.

> > > > + * @dev: Target device.

> > > > + *

> > > > + * Resume @dev synchronously and if that is successful, increment its runtime

> > > > + * PM usage counter. Return 0 if the runtime PM usage counter of @dev has been

> > > > + * incremented or a negative error code otherwise.

> > > > + */

> > > > +static inline int pm_runtime_resume_and_get(struct device *dev)

> > >

> > > Perhaps this function should be called pm_runtime_resume_and_get_sync(),

> >

> > No, really.

> >

> > I might consider calling it pm_runtime_acquire(), and adding a

> > matching _release() as a pm_runtime_get() synonym for that matter, but

> > not the above.

>

> pm_runtime_acquire() seems better to me too. Would pm_runtime_release()

> would be an alias for pm_runtime_put() ?


Yes.  This covers all of the use cases relevant for drivers AFAICS.

> We would also likely need a pm_runtime_release_autosuspend() too then.


Why would we?

> But on that topic, I was wondering, is there a reason we can't select

> autosuspend behaviour automatically when autosuspend is enabled ?


That is the case already.

pm_runtime_put() will autosuspend if enabled and the usage counter is
0, as long as ->runtime_idle() returns 0 (or is absent).

pm_runtime_put_autosuspend() is an optimization allowing
->runtime_idle() to be skipped entirely, but I'm wondering how many
users really need that.
Laurent Pinchart Nov. 30, 2020, 6:50 p.m. UTC | #5
Hi Rafael,

On Mon, Nov 30, 2020 at 06:55:57PM +0100, Rafael J. Wysocki wrote:
> On Mon, Nov 30, 2020 at 6:35 PM Laurent Pinchart wrote:
> > On Mon, Nov 30, 2020 at 05:37:52PM +0100, Rafael J. Wysocki wrote:
> > > On Fri, Nov 27, 2020 at 11:16 AM Geert Uytterhoeven wrote:
> > > > On Tue, Nov 10, 2020 at 10:29 AM Zhang Qilong <zhangqilong3@huawei.com> wrote:
> > > > > In many case, we need to check return value of pm_runtime_get_sync, but
> > > > > it brings a trouble to the usage counter processing. Many callers forget
> > > > > to decrease the usage counter when it failed, which could resulted in
> > > > > reference leak. It has been discussed a lot[0][1]. So we add a function
> > > > > to deal with the usage counter for better coding.
> > > > >
> > > > > [0]https://lkml.org/lkml/2020/6/14/88
> > > > > [1]https://patchwork.ozlabs.org/project/linux-tegra/list/?series=178139
> > > > > Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
> > > >
> > > > Thanks for your patch, which is now commit dd8088d5a8969dc2 ("PM:
> > > > runtime: Add pm_runtime_resume_and_get to deal with usage counter") in
> > > > v5.10-rc5.
> > > >
> > > > > --- a/include/linux/pm_runtime.h
> > > > > +++ b/include/linux/pm_runtime.h
> > > > > @@ -386,6 +386,27 @@ static inline int pm_runtime_get_sync(struct device *dev)
> > > > >         return __pm_runtime_resume(dev, RPM_GET_PUT);
> > > > >  }
> > > > >
> > > > > +/**
> > > > > + * pm_runtime_resume_and_get - Bump up usage counter of a device and resume it.
> > > > > + * @dev: Target device.
> > > > > + *
> > > > > + * Resume @dev synchronously and if that is successful, increment its runtime
> > > > > + * PM usage counter. Return 0 if the runtime PM usage counter of @dev has been
> > > > > + * incremented or a negative error code otherwise.
> > > > > + */
> > > > > +static inline int pm_runtime_resume_and_get(struct device *dev)
> > > >
> > > > Perhaps this function should be called pm_runtime_resume_and_get_sync(),
> > >
> > > No, really.
> > >
> > > I might consider calling it pm_runtime_acquire(), and adding a
> > > matching _release() as a pm_runtime_get() synonym for that matter, but
> > > not the above.
> >
> > pm_runtime_acquire() seems better to me too. Would pm_runtime_release()
> > would be an alias for pm_runtime_put() ?
> 
> Yes.  This covers all of the use cases relevant for drivers AFAICS.
> 
> > We would also likely need a pm_runtime_release_autosuspend() too then.
> 
> Why would we?
> 
> > But on that topic, I was wondering, is there a reason we can't select
> > autosuspend behaviour automatically when autosuspend is enabled ?
> 
> That is the case already.
> 
> pm_runtime_put() will autosuspend if enabled and the usage counter is
> 0, as long as ->runtime_idle() returns 0 (or is absent).
> 
> pm_runtime_put_autosuspend() is an optimization allowing
> ->runtime_idle() to be skipped entirely, but I'm wondering how many
> users really need that.

Ah, I didn't know that, that's good to know. We then don't need
pm_runtime_release_autosuspend() (unless the optimization really makes a
big difference).

Should I write new drievr code with pm_runtime_put() instead of
pm_runtime_put_autosuspend() ? I haven't found clear guidelines on this
in the documentation.
Rafael J. Wysocki Nov. 30, 2020, 7:12 p.m. UTC | #6
On Mon, Nov 30, 2020 at 7:50 PM Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
>

> Hi Rafael,

>

> On Mon, Nov 30, 2020 at 06:55:57PM +0100, Rafael J. Wysocki wrote:

> > On Mon, Nov 30, 2020 at 6:35 PM Laurent Pinchart wrote:

> > > On Mon, Nov 30, 2020 at 05:37:52PM +0100, Rafael J. Wysocki wrote:

> > > > On Fri, Nov 27, 2020 at 11:16 AM Geert Uytterhoeven wrote:

> > > > > On Tue, Nov 10, 2020 at 10:29 AM Zhang Qilong <zhangqilong3@huawei.com> wrote:

> > > > > > In many case, we need to check return value of pm_runtime_get_sync, but

> > > > > > it brings a trouble to the usage counter processing. Many callers forget

> > > > > > to decrease the usage counter when it failed, which could resulted in

> > > > > > reference leak. It has been discussed a lot[0][1]. So we add a function

> > > > > > to deal with the usage counter for better coding.

> > > > > >

> > > > > > [0]https://lkml.org/lkml/2020/6/14/88

> > > > > > [1]https://patchwork.ozlabs.org/project/linux-tegra/list/?series=178139

> > > > > > Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>

> > > > >

> > > > > Thanks for your patch, which is now commit dd8088d5a8969dc2 ("PM:

> > > > > runtime: Add pm_runtime_resume_and_get to deal with usage counter") in

> > > > > v5.10-rc5.

> > > > >

> > > > > > --- a/include/linux/pm_runtime.h

> > > > > > +++ b/include/linux/pm_runtime.h

> > > > > > @@ -386,6 +386,27 @@ static inline int pm_runtime_get_sync(struct device *dev)

> > > > > >         return __pm_runtime_resume(dev, RPM_GET_PUT);

> > > > > >  }

> > > > > >

> > > > > > +/**

> > > > > > + * pm_runtime_resume_and_get - Bump up usage counter of a device and resume it.

> > > > > > + * @dev: Target device.

> > > > > > + *

> > > > > > + * Resume @dev synchronously and if that is successful, increment its runtime

> > > > > > + * PM usage counter. Return 0 if the runtime PM usage counter of @dev has been

> > > > > > + * incremented or a negative error code otherwise.

> > > > > > + */

> > > > > > +static inline int pm_runtime_resume_and_get(struct device *dev)

> > > > >

> > > > > Perhaps this function should be called pm_runtime_resume_and_get_sync(),

> > > >

> > > > No, really.

> > > >

> > > > I might consider calling it pm_runtime_acquire(), and adding a

> > > > matching _release() as a pm_runtime_get() synonym for that matter, but

> > > > not the above.

> > >

> > > pm_runtime_acquire() seems better to me too. Would pm_runtime_release()

> > > would be an alias for pm_runtime_put() ?

> >

> > Yes.  This covers all of the use cases relevant for drivers AFAICS.

> >

> > > We would also likely need a pm_runtime_release_autosuspend() too then.

> >

> > Why would we?

> >

> > > But on that topic, I was wondering, is there a reason we can't select

> > > autosuspend behaviour automatically when autosuspend is enabled ?

> >

> > That is the case already.

> >

> > pm_runtime_put() will autosuspend if enabled and the usage counter is

> > 0, as long as ->runtime_idle() returns 0 (or is absent).

> >

> > pm_runtime_put_autosuspend() is an optimization allowing

> > ->runtime_idle() to be skipped entirely, but I'm wondering how many

> > users really need that.

>

> Ah, I didn't know that, that's good to know. We then don't need

> pm_runtime_release_autosuspend() (unless the optimization really makes a

> big difference).

>

> Should I write new drievr code with pm_runtime_put() instead of

> pm_runtime_put_autosuspend() ?


If you don't have ->runtime_idle() in the driver (and in the bus type
generally speaking, but none of them provide it IIRC),
pm_runtime_put() is basically equivalent to
pm_runtime_put_autosuspend() AFAICS, except for some extra checks done
by the former.

Otherwise it all depends on what the ->runtime_idle() callback does,
but it is hard to imagine a practical use case when the difference
would be really meaningful.

> I haven't found clear guidelines on this in the documentation.


Yes, that's one of the items I need to take care of.
diff mbox series

Patch

diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
index 4b708f4e8eed..b492ae00cc90 100644
--- a/include/linux/pm_runtime.h
+++ b/include/linux/pm_runtime.h
@@ -386,6 +386,27 @@  static inline int pm_runtime_get_sync(struct device *dev)
 	return __pm_runtime_resume(dev, RPM_GET_PUT);
 }
 
+/**
+ * pm_runtime_resume_and_get - Bump up usage counter of a device and resume it.
+ * @dev: Target device.
+ *
+ * Resume @dev synchronously and if that is successful, increment its runtime
+ * PM usage counter. Return 0 if the runtime PM usage counter of @dev has been
+ * incremented or a negative error code otherwise.
+ */
+static inline int pm_runtime_resume_and_get(struct device *dev)
+{
+	int ret;
+
+	ret = __pm_runtime_resume(dev, RPM_GET_PUT);
+	if (ret < 0) {
+		pm_runtime_put_noidle(dev);
+		return ret;
+	}
+
+	return 0;
+}
+
 /**
  * pm_runtime_put - Drop device usage counter and queue up "idle check" if 0.
  * @dev: Target device.