diff mbox series

cpufreq: Make cpufreq_online() call driver->offline() on errors

Message ID 11788436.O9o76ZdvQC@kreacher
State New
Headers show
Series cpufreq: Make cpufreq_online() call driver->offline() on errors | expand

Commit Message

Rafael J. Wysocki June 21, 2021, 5:26 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

In the CPU removal path the ->offline() callback provided by the
driver is always invoked before ->exit(), but in the cpufreq_online()
error path it is not, so ->exit() is somehow expected to know the
context in which it has been called and act accordingly.

That is less than straightforward, so make cpufreq_online() invoke
the driver's ->offline() callback before ->exit() too.

This only potentially affects intel_pstate at this point.

Fixes: 91a12e91dc39 ("cpufreq: Allow light-weight tear down and bring up of CPUs")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/cpufreq.c |    3 +++
 1 file changed, 3 insertions(+)

Comments

Viresh Kumar June 22, 2021, 6:20 a.m. UTC | #1
On 21-06-21, 19:26, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> 

> In the CPU removal path the ->offline() callback provided by the

> driver is always invoked before ->exit(),


Note that exit() doesn't get called if offline() is present in the CPU
removal path. We call exit() _only_ when the cpufreq driver gets
unregistered.

> but in the cpufreq_online()

> error path it is not, so ->exit() is somehow expected to know the

> context in which it has been called and act accordingly.


I agree, it isn't very clear.

> That is less than straightforward, so make cpufreq_online() invoke

> the driver's ->offline() callback before ->exit() too.

> 

> This only potentially affects intel_pstate at this point.

> 

> Fixes: 91a12e91dc39 ("cpufreq: Allow light-weight tear down and bring up of CPUs")

> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> ---

>  drivers/cpufreq/cpufreq.c |    3 +++

>  1 file changed, 3 insertions(+)

> 

> Index: linux-pm/drivers/cpufreq/cpufreq.c

> ===================================================================

> --- linux-pm.orig/drivers/cpufreq/cpufreq.c

> +++ linux-pm/drivers/cpufreq/cpufreq.c

> @@ -1516,6 +1516,9 @@ out_destroy_policy:

>  	up_write(&policy->rwsem);

>  

>  out_exit_policy:

> +	if (cpufreq_driver->offline)

> +		cpufreq_driver->offline(policy);

> +

>  	if (cpufreq_driver->exit)

>  		cpufreq_driver->exit(policy);


If we want to go down this path, then we better do more and make it
very explicit that ->offline() follows a previous invocation of
->online().

Otherwise, with above we will end up calling ->offline() for two separate
states, ->online() failed (i.e. two calls to offline() one after the other
here) and other generic failures after ->init() passed.

So, better make it clear that online/offline are paired like
init/exit.

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 1d1b563cea4b..0ce48dcb2e8a 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1347,14 +1347,11 @@ static int cpufreq_online(unsigned int cpu)
        }
 
        if (!new_policy && cpufreq_driver->online) {
-               ret = cpufreq_driver->online(policy);
-               if (ret) {
-                       pr_debug("%s: %d: initialization failed\n", __func__,
-                                __LINE__);
-                       goto out_exit_policy;
-               }
-
-               /* Recover policy->cpus using related_cpus */
+               /*
+                * We did light-weight tear down earlier, don't need to do heavy
+                * initialization here. Just recover policy->cpus using
+                * related_cpus.
+                */
                cpumask_copy(policy->cpus, policy->related_cpus);
        } else {
                cpumask_copy(policy->cpus, cpumask_of(cpu));
@@ -1378,6 +1375,13 @@ static int cpufreq_online(unsigned int cpu)
                cpumask_copy(policy->related_cpus, policy->cpus);
        }
 
+       ret = cpufreq_driver->online(policy);
+       if (ret) {
+               pr_debug("%s: %d: %d: ->online() failed\n", __func__, __LINE__,
+                        ret);
+               goto out_exit_policy;
+       }
+
        down_write(&policy->rwsem);
        /*
         * affected cpus must always be the one, which are online. We aren't
@@ -1518,6 +1522,9 @@ static int cpufreq_online(unsigned int cpu)
 
        up_write(&policy->rwsem);
 
+       if (cpufreq_driver->offline)
+               cpufreq_driver->offline(policy);
+
 out_exit_policy:
        if (cpufreq_driver->exit)
                cpufreq_driver->exit(policy);


-- 
viresh
Rafael J. Wysocki June 22, 2021, 12:18 p.m. UTC | #2
On Tue, Jun 22, 2021 at 8:20 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>

> On 21-06-21, 19:26, Rafael J. Wysocki wrote:

> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> >

> > In the CPU removal path the ->offline() callback provided by the

> > driver is always invoked before ->exit(),

>

> Note that exit() doesn't get called if offline() is present in the CPU

> removal path. We call exit() _only_ when the cpufreq driver gets

> unregistered.


True, but that doesn't contradict what I said.

> > but in the cpufreq_online()

> > error path it is not, so ->exit() is somehow expected to know the

> > context in which it has been called and act accordingly.

>

> I agree, it isn't very clear.

>

> > That is less than straightforward, so make cpufreq_online() invoke

> > the driver's ->offline() callback before ->exit() too.

> >

> > This only potentially affects intel_pstate at this point.

> >

> > Fixes: 91a12e91dc39 ("cpufreq: Allow light-weight tear down and bring up of CPUs")

> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> > ---

> >  drivers/cpufreq/cpufreq.c |    3 +++

> >  1 file changed, 3 insertions(+)

> >

> > Index: linux-pm/drivers/cpufreq/cpufreq.c

> > ===================================================================

> > --- linux-pm.orig/drivers/cpufreq/cpufreq.c

> > +++ linux-pm/drivers/cpufreq/cpufreq.c

> > @@ -1516,6 +1516,9 @@ out_destroy_policy:

> >       up_write(&policy->rwsem);

> >

> >  out_exit_policy:

> > +     if (cpufreq_driver->offline)

> > +             cpufreq_driver->offline(policy);

> > +

> >       if (cpufreq_driver->exit)

> >               cpufreq_driver->exit(policy);

>

> If we want to go down this path, then we better do more and make it

> very explicit that ->offline() follows a previous invocation of

> ->online().

>

> Otherwise, with above we will end up calling ->offline() for two separate

> states, ->online() failed (i.e. two calls to offline() one after the other

> here) and other generic failures after ->init() passed.


Good point.

> So, better make it clear that online/offline are paired like

> init/exit.

>

> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c

> index 1d1b563cea4b..0ce48dcb2e8a 100644

> --- a/drivers/cpufreq/cpufreq.c

> +++ b/drivers/cpufreq/cpufreq.c

> @@ -1347,14 +1347,11 @@ static int cpufreq_online(unsigned int cpu)

>         }

>

>         if (!new_policy && cpufreq_driver->online) {

> -               ret = cpufreq_driver->online(policy);

> -               if (ret) {

> -                       pr_debug("%s: %d: initialization failed\n", __func__,

> -                                __LINE__);

> -                       goto out_exit_policy;

> -               }

> -

> -               /* Recover policy->cpus using related_cpus */

> +               /*

> +                * We did light-weight tear down earlier, don't need to do heavy

> +                * initialization here. Just recover policy->cpus using

> +                * related_cpus.

> +                */

>                 cpumask_copy(policy->cpus, policy->related_cpus);

>         } else {

>                 cpumask_copy(policy->cpus, cpumask_of(cpu));

> @@ -1378,6 +1375,13 @@ static int cpufreq_online(unsigned int cpu)

>                 cpumask_copy(policy->related_cpus, policy->cpus);

>         }

>

> +       ret = cpufreq_driver->online(policy);


But I wouldn't make this change, because that would require reworking
->init() in the driver too.

Let's continue assuming that ->init() will do ->online() if successful
and so ->offline() is needed after a successful ->init() as well as
after a successful ->online().

> +       if (ret) {

> +               pr_debug("%s: %d: %d: ->online() failed\n", __func__, __LINE__,

> +                        ret);

> +               goto out_exit_policy;

> +       }

> +

>         down_write(&policy->rwsem);

>         /*

>          * affected cpus must always be the one, which are online. We aren't

> @@ -1518,6 +1522,9 @@ static int cpufreq_online(unsigned int cpu)

>

>         up_write(&policy->rwsem);

>


So I think that a new label is needed here to avoid running
->offline() after a failing ->online().

Let me update the patch accordingly.

> +       if (cpufreq_driver->offline)

> +               cpufreq_driver->offline(policy);

> +

>  out_exit_policy:

>         if (cpufreq_driver->exit)

>                 cpufreq_driver->exit(policy);
Viresh Kumar June 23, 2021, 3:10 a.m. UTC | #3
On 22-06-21, 21:11, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> Subject: [PATCH] cpufreq: Make cpufreq_online() call driver->offline() on errors

> 

> In the CPU removal path the ->offline() callback provided by the

> driver is always invoked before ->exit(), but in the cpufreq_online()

> error path it is not, so ->exit() is expected to somehow know the

> context in which it has been called and act accordingly.

> 

> That is less than straightforward, so make cpufreq_online() invoke

> the driver's ->offline() callback, if present, on errors before

> ->exit() too.

> 

> This only potentially affects intel_pstate.

> 

> Fixes: 91a12e91dc39 ("cpufreq: Allow light-weight tear down and bring up of CPUs")

> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> ---

> 

> -> v2:

>    * Avoid calling ->offline() after a failing ->online().

>    * Add a comment regarding the expected state after calling ->init().

>    * Edit the changelog a bit.


Acked-by: Viresh Kumar <viresh.kumar@linaro.org>


-- 
viresh
diff mbox series

Patch

Index: linux-pm/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/cpufreq.c
+++ linux-pm/drivers/cpufreq/cpufreq.c
@@ -1516,6 +1516,9 @@  out_destroy_policy:
 	up_write(&policy->rwsem);
 
 out_exit_policy:
+	if (cpufreq_driver->offline)
+		cpufreq_driver->offline(policy);
+
 	if (cpufreq_driver->exit)
 		cpufreq_driver->exit(policy);