[v2,0/2] Fix usage counter leak by adding a general sync ops

Message ID 20201109150416.1877878-1-zhangqilong3@huawei.com

Message

Zhang Qilong Nov. 9, 2020, 3:04 p.m. UTC
In many cases we need to check the return value of pm_runtime_get_sync(),
but that complicates usage counter handling: many callers forget to
decrement the usage counter when the call fails. This has been discussed
at length [0][1]. So we add a function that handles the usage counter on
failure, and then replace pm_runtime_get_sync() with it in fec_main.c.

Zhang Qilong (2):
  PM: runtime: Add a general runtime get sync operation to deal with
    usage counter
  net: fec: Fix reference count leak in fec series ops

 drivers/net/ethernet/freescale/fec_main.c | 12 ++++-----
 include/linux/pm_runtime.h                | 30 +++++++++++++++++++++++
 2 files changed, 35 insertions(+), 7 deletions(-)
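
The leak described in the cover letter can be sketched in userspace. This is
an illustration only: struct demo_dev and the demo_* functions are
hypothetical stand-ins, not the kernel API. It assumes, as the thread
describes, that pm_runtime_get_sync() increments the usage counter first and
leaves it incremented even when the resume fails.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct device; not the kernel type. */
struct demo_dev {
	int usage_count;   /* models the runtime PM usage counter */
	int resume_result; /* value the simulated resume returns */
};

/* Models pm_runtime_get_sync(): the counter is bumped even on failure. */
static int demo_get_sync(struct demo_dev *dev)
{
	dev->usage_count++;
	return dev->resume_result;
}

/* Models pm_runtime_put_noidle(): drop the counter, no idle check. */
static void demo_put_noidle(struct demo_dev *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/* Leaky caller: bails out on error and forgets the counter. */
static int demo_leaky_open(struct demo_dev *dev)
{
	int ret = demo_get_sync(dev);

	if (ret < 0)
		return ret; /* usage counter is still incremented here */
	return 0;
}

/* Balanced caller: drops the counter before returning the error. */
static int demo_balanced_open(struct demo_dev *dev)
{
	int ret = demo_get_sync(dev);

	if (ret < 0) {
		demo_put_noidle(dev);
		return ret;
	}
	return 0;
}
```

With resume_result set to a negative errno, demo_leaky_open() leaves
usage_count at 1 while demo_balanced_open() leaves it at 0. The series moves
the balancing step into a shared helper so that callers cannot forget it.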

Comments

Zhang Qilong Nov. 9, 2020, 3:50 p.m. UTC | #1
> On Mon, Nov 9, 2020 at 4:00 PM Zhang Qilong <zhangqilong3@huawei.com> wrote:
> >
> > In many case, we need to check return value of pm_runtime_get_sync,
> > but it brings a trouble to the usage counter processing. Many callers
> > forget to decrease the usage counter when it failed. It has been
> > discussed a lot[0][1]. So we add a function to deal with the usage
> > counter for better coding.
> >
> > [0]https://lkml.org/lkml/2020/6/14/88
> > [1]https://patchwork.ozlabs.org/project/linux-tegra/patch/20200520095148.10995-1-dinghao.liu@zju.edu.cn/
> > Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
> > ---
> >  include/linux/pm_runtime.h | 30 ++++++++++++++++++++++++++++++
> >  1 file changed, 30 insertions(+)
> >
> > diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
> > index 4b708f4e8eed..6549ce764400 100644
> > --- a/include/linux/pm_runtime.h
> > +++ b/include/linux/pm_runtime.h
> > @@ -386,6 +386,36 @@ static inline int pm_runtime_get_sync(struct device *dev)
> >         return __pm_runtime_resume(dev, RPM_GET_PUT);
> >  }
> >
> > +/**
> > + * pm_runtime_general_get - Bump up usage counter of a device and resume it.
> > + * @dev: Target device.
> > + *
> > + * Increase runtime PM usage counter of @dev first, and carry out runtime-resume
> > + * of it synchronously. If __pm_runtime_resume return negative value(device is in
> > + * error state), we to need decrease the usage counter before it return. If
> > + * __pm_runtime_resume return positive value, it means the runtime of device has
> > + * already been in active state, and we let the new wrapper return zero instead.
> > + *
> > + * The possible return values of this function is zero or negative value.
> > + * zero:
> > + *    - it means resume succeeed or runtime of device has already been active, the
> > + *      runtime PM usage counter of @dev remains incremented.
> > + * negative:
> > + *    - it means failure and the runtime PM usage counter of @dev has been balanced.
>
> The kerneldoc above is kind of noisy and it is hard to figure out what the helper
> really does from it.
>
> You could basically say something like "Resume @dev synchronously and if that
> is successful, increment its runtime PM usage counter.  Return
> 0 if the runtime PM usage counter of @dev has been incremented or a negative
> error code otherwise."
>

How about the following description?
/**
 * pm_runtime_general_get - Bump up usage counter of a device and resume it.
 * @dev: Target device.
 *
 * Increase runtime PM usage counter of @dev first, and carry out runtime-resume
 * of it synchronously. If __pm_runtime_resume return negative value(device is in
 * error state), we to need decrease the usage counter before it return. If
 * __pm_runtime_resume return positive value, it means the runtime of device has
 * already been in active state, and we let the new wrapper return zero instead.
 *
 * Resume @dev synchronously and if that is successful, and increment its runtime
 * PM usage counter if it turn out to equal to 0. The runtime PM usage counter of
 * @dev has been incremented or a negative error code otherwise.
 */

Thanks,
Zhang

> > + */
> > +static inline int pm_runtime_general_get(struct device *dev)
>
> What about pm_runtime_resume_and_get()?
>

I think it's OK.

> > +{
> > +       int ret = 0;
>
> This extra initialization is not necessary.
>
> You can initialize ret to the __pm_runtime_resume() return value right away.
>

OK, good idea.

> > +
> > +       ret = __pm_runtime_resume(dev, RPM_GET_PUT);
> > +       if (ret < 0) {
> > +               pm_runtime_put_noidle(dev);
> > +               return ret;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> >  /**
> >   * pm_runtime_put - Drop device usage counter and queue up "idle check" if 0.
> >   * @dev: Target device.
> > --
Rafael J. Wysocki Nov. 9, 2020, 4 p.m. UTC | #2
On Mon, Nov 9, 2020 at 4:50 PM zhangqilong <zhangqilong3@huawei.com> wrote:
>
> > operation to deal with usage counter
> >
> > On Mon, Nov 9, 2020 at 4:00 PM Zhang Qilong <zhangqilong3@huawei.com>
> > wrote:
> > >
> > > In many case, we need to check return value of pm_runtime_get_sync,
> > > but it brings a trouble to the usage counter processing. Many callers
> > > forget to decrease the usage counter when it failed. It has been
> > > discussed a lot[0][1]. So we add a function to deal with the usage
> > > counter for better coding.
> > >
> > > [0]https://lkml.org/lkml/2020/6/14/88
> > > [1]https://patchwork.ozlabs.org/project/linux-tegra/patch/20200520095148.10995-1-dinghao.liu@zju.edu.cn/
> > > Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
> > > ---
> > >  include/linux/pm_runtime.h | 30 ++++++++++++++++++++++++++++++
> > >  1 file changed, 30 insertions(+)
> > >
> > > diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
> > > index 4b708f4e8eed..6549ce764400 100644
> > > --- a/include/linux/pm_runtime.h
> > > +++ b/include/linux/pm_runtime.h
> > > @@ -386,6 +386,36 @@ static inline int pm_runtime_get_sync(struct device *dev)
> > >         return __pm_runtime_resume(dev, RPM_GET_PUT);
> > >  }
> > >
> > > +/**
> > > + * pm_runtime_general_get - Bump up usage counter of a device and resume it.
> > > + * @dev: Target device.
> > > + *
> > > + * Increase runtime PM usage counter of @dev first, and carry out runtime-resume
> > > + * of it synchronously. If __pm_runtime_resume return negative value(device is in
> > > + * error state), we to need decrease the usage counter before it return. If
> > > + * __pm_runtime_resume return positive value, it means the runtime of device has
> > > + * already been in active state, and we let the new wrapper return zero instead.
> > > + *
> > > + * The possible return values of this function is zero or negative value.
> > > + * zero:
> > > + *    - it means resume succeeed or runtime of device has already been active, the
> > > + *      runtime PM usage counter of @dev remains incremented.
> > > + * negative:
> > > + *    - it means failure and the runtime PM usage counter of @dev has been balanced.
> >
> > The kerneldoc above is kind of noisy and it is hard to figure out what the helper
> > really does from it.
> >
> > You could basically say something like "Resume @dev synchronously and if that
> > is successful, increment its runtime PM usage counter.  Return
> > 0 if the runtime PM usage counter of @dev has been incremented or a negative
> > error code otherwise."
> >
>
> How about the following description?
> /**
>  * pm_runtime_general_get - Bump up usage counter of a device and resume it.
>  * @dev: Target device.
>  *
>  * Increase runtime PM usage counter of @dev first, and carry out runtime-resume
>  * of it synchronously. If __pm_runtime_resume return negative value(device is in
>  * error state), we to need decrease the usage counter before it return. If
>  * __pm_runtime_resume return positive value, it means the runtime of device has
>  * already been in active state, and we let the new wrapper return zero instead.
>  *

If you add the paragraph below, the one above becomes redundant IMV.

>  * Resume @dev synchronously and if that is successful, and increment its runtime

"Resume @dev synchronously and if that is successful, increment its runtime"

(drop the extra "and").

>  * PM usage counter if it turn out to equal to 0. The runtime PM usage counter of

The "if it turn out to equal to 0" phrase is redundant (and the
grammar in it is incorrect).

>  * @dev has been incremented or a negative error code otherwise.
>  */

Why don't you use what I said verbatim?
Rafael J. Wysocki Nov. 9, 2020, 4:40 p.m. UTC | #3
On Mon, Nov 9, 2020 at 5:15 PM zhangqilong <zhangqilong3@huawei.com> wrote:
>
> Hi
>
> >
> > On Mon, Nov 9, 2020 at 4:50 PM zhangqilong <zhangqilong3@huawei.com>
> > wrote:
> > >
> > > > operation to deal with usage counter
> > > >
> > > > On Mon, Nov 9, 2020 at 4:00 PM Zhang Qilong
> > > > <zhangqilong3@huawei.com>
> > > > wrote:
> > > > >
> > > > > In many case, we need to check return value of
> > > > > pm_runtime_get_sync, but it brings a trouble to the usage counter
> > > > > processing. Many callers forget to decrease the usage counter when
> > > > > it failed. It has been discussed a lot[0][1]. So we add a function
> > > > > to deal with the usage counter for better coding.
> > > > >
> > > > > [0]https://lkml.org/lkml/2020/6/14/88
> > > > > [1]https://patchwork.ozlabs.org/project/linux-tegra/patch/20200520095148.10995-1-dinghao.liu@zju.edu.cn/
> > > > > Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
> > > > > ---
> > > > >  include/linux/pm_runtime.h | 30 ++++++++++++++++++++++++++++++
> > > > >  1 file changed, 30 insertions(+)
> > > > >
> > > > > diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
> > > > > index 4b708f4e8eed..6549ce764400 100644
> > > > > --- a/include/linux/pm_runtime.h
> > > > > +++ b/include/linux/pm_runtime.h
> > > > > @@ -386,6 +386,36 @@ static inline int pm_runtime_get_sync(struct device *dev)
> > > > >         return __pm_runtime_resume(dev, RPM_GET_PUT);
> > > > >  }
> > > > >
> > > > > +/**
> > > > > + * pm_runtime_general_get - Bump up usage counter of a device and resume it.
> > > > > + * @dev: Target device.
> > > > > + *
> > > > > + * Increase runtime PM usage counter of @dev first, and carry out runtime-resume
> > > > > + * of it synchronously. If __pm_runtime_resume return negative value(device is in
> > > > > + * error state), we to need decrease the usage counter before it return. If
> > > > > + * __pm_runtime_resume return positive value, it means the runtime of device has
> > > > > + * already been in active state, and we let the new wrapper return zero instead.
> > > > > + *
> > > > > + * The possible return values of this function is zero or negative value.
> > > > > + * zero:
> > > > > + *    - it means resume succeeed or runtime of device has already been active, the
> > > > > + *      runtime PM usage counter of @dev remains incremented.
> > > > > + * negative:
> > > > > + *    - it means failure and the runtime PM usage counter of @dev has been balanced.
> > > >
> > > > The kerneldoc above is kind of noisy and it is hard to figure out
> > > > what the helper really does from it.
> > > >
> > > > You could basically say something like "Resume @dev synchronously
> > > > and if that is successful, increment its runtime PM usage counter.
> > > > Return
> > > > 0 if the runtime PM usage counter of @dev has been incremented or a
> > > > negative error code otherwise."
> > > >
> > >
> > > How about the following description?
> > > /**
> > >  * pm_runtime_general_get - Bump up usage counter of a device and resume it.
> > >  * @dev: Target device.
> > >  *
> > >  * Increase runtime PM usage counter of @dev first, and carry out runtime-resume
> > >  * of it synchronously. If __pm_runtime_resume return negative value(device is in
> > >  * error state), we to need decrease the usage counter before it return. If
> > >  * __pm_runtime_resume return positive value, it means the runtime of device has
> > >  * already been in active state, and we let the new wrapper return zero instead.
> > >  *
> >
> > If you add the paragraph below, the one above becomes redundant IMV.
> >
> > >  * Resume @dev synchronously and if that is successful, and increment its runtime
> >
> > "Resume @dev synchronously and if that is successful, increment its runtime"
> >
> > (drop the extra "and").
> >
> > >  * PM usage counter if it turn out to equal to 0. The runtime PM usage counter of
> >
> > The "if it turn out to equal to 0" phrase is redundant (and the grammar in it is
> > incorrect).
> >
> > >  * @dev has been incremented or a negative error code otherwise.
> > >  */
> >
> > Why don't you use what I said verbatim?
>
> I misunderstood just now, sorry about that. The description is as follows:
> /**
>  * pm_runtime_resume_and_get - Bump up usage counter of a device and resume it.
>  * @dev: Target device.
>  *
>  * Resume @dev synchronously if that is successful, increment its runtime PM

"Resume @dev synchronously and if that is successful, increment its runtime PM"

(missing "and").

>  * usage counter.  Return 0 if the runtime PM usage counter of @dev has been
>  * incremented or a negative error code otherwise.
>  */
>
> Do you think it's OK?

Apart from the above typo, yes it is.

Thanks!
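
Putting the review comments together (the new name, no redundant
initialization of ret, and the agreed kerneldoc wording), the helper might
end up looking roughly like the userspace sketch below. This is not the
posted patch: struct mock_device, MOCK_RPM_GET_PUT and the mock_* functions
are hypothetical stand-ins so the logic can be compiled outside the kernel,
and the mocked resume simply returns a preset value.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag standing in for RPM_GET_PUT; the value is arbitrary. */
#define MOCK_RPM_GET_PUT 0x04

/* Hypothetical stand-in for struct device; not the kernel type. */
struct mock_device {
	int usage_count;   /* models the runtime PM usage counter */
	int resume_result; /* <0 error, 0 resumed, 1 already active */
};

/* Models __pm_runtime_resume(): bump the counter if asked, then resume. */
static int mock_pm_runtime_resume(struct mock_device *dev, int rpmflags)
{
	if (rpmflags & MOCK_RPM_GET_PUT)
		dev->usage_count++;
	return dev->resume_result;
}

/* Models pm_runtime_put_noidle(): drop the counter, no idle check. */
static void mock_pm_runtime_put_noidle(struct mock_device *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/*
 * Resume @dev synchronously and if that is successful, increment its
 * runtime PM usage counter.  Return 0 if the usage counter of @dev has
 * been incremented or a negative error code otherwise.
 */
static int mock_pm_runtime_resume_and_get(struct mock_device *dev)
{
	/* ret is initialized directly from the resume call, as suggested. */
	int ret = mock_pm_runtime_resume(dev, MOCK_RPM_GET_PUT);

	if (ret < 0) {
		mock_pm_runtime_put_noidle(dev);
		return ret;
	}

	/* A positive result (already active) is folded into 0. */
	return 0;
}
```

A caller then only needs a single "if (ret < 0) return ret;" check, with no
put on the error path, because the counter is already balanced when the
helper fails.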