[v7,3/5] clk: Supply the critical clock {init, enable, disable} framework

Message ID 1437570255-21049-4-git-send-email-lee.jones@linaro.org
State New
Headers show

Commit Message

Lee Jones July 22, 2015, 1:04 p.m.
These new API calls will firstly provide a mechanisms to tag a clock as
critical and secondly allow any knowledgeable driver to (un)gate clocks,
even if they are marked as critical.

Suggested-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
---
 drivers/clk/clk.c            | 45 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/clk-provider.h |  2 ++
 include/linux/clk.h          | 30 +++++++++++++++++++++++++++++
 3 files changed, 77 insertions(+)

Comments

Lee Jones July 27, 2015, 8:53 a.m. | #1
On Mon, 27 Jul 2015, Maxime Ripard wrote:

> On Wed, Jul 22, 2015 at 02:04:13PM +0100, Lee Jones wrote:
> > These new API calls will firstly provide a mechanisms to tag a clock as
> > critical and secondly allow any knowledgeable driver to (un)gate clocks,
> > even if they are marked as critical.
> > 
> > Suggested-by: Maxime Ripard <maxime.ripard@free-electrons.com>
> > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > ---
> >  drivers/clk/clk.c            | 45 ++++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/clk-provider.h |  2 ++
> >  include/linux/clk.h          | 30 +++++++++++++++++++++++++++++
> >  3 files changed, 77 insertions(+)
> > 
> > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> > index 61c3fc5..486b1da 100644
> > --- a/drivers/clk/clk.c
> > +++ b/drivers/clk/clk.c
> > @@ -46,6 +46,21 @@ static struct clk_core *clk_core_lookup(const char *name);
> >  
> >  /***    private data structures    ***/
> >  
> > +/**
> > + * struct critical -	Provides 'play' over critical clocks.  A clock can be
> > + *			marked as critical, meaning that it should not be
> > + *			disabled.  However, if a driver which is aware of the
> > + *			critical behaviour wants to control it, it can do so
> > + *			using clk_enable_critical() and clk_disable_critical().
> > + *
> > + * @enabled	Is clock critical?  Once set, doesn't change
> > + * @leave_on	Self explanatory.  Can be disabled by knowledgeable drivers
> > + */
> > +struct critical {
> > +	bool enabled;
> > +	bool leave_on;
> > +};
> > +
> >  struct clk_core {
> >  	const char		*name;
> >  	const struct clk_ops	*ops;
> > @@ -75,6 +90,7 @@ struct clk_core {
> >  	struct dentry		*dentry;
> >  #endif
> >  	struct kref		ref;
> > +	struct critical		critical;
> >  };
> >  
> >  struct clk {
> > @@ -995,6 +1011,10 @@ static void clk_core_disable(struct clk_core *clk)
> >  	if (WARN_ON(clk->enable_count == 0))
> >  		return;
> >  
> > +	/* Refuse to turn off a critical clock */
> > +	if (clk->enable_count == 1 && clk->critical.leave_on)
> > +		return;
> > +
> 
> I think it should be handled by a separate counting. Otherwise, if you
> have two users that marked the clock as critical, and then one of them
> disable it...
> 
> >  	if (--clk->enable_count > 0)
> >  		return;
> >  
> > @@ -1037,6 +1057,13 @@ void clk_disable(struct clk *clk)
> >  }
> >  EXPORT_SYMBOL_GPL(clk_disable);
> >  
> > +void clk_disable_critical(struct clk *clk)
> > +{
> > +	clk->core->critical.leave_on = false;
> 
> .. you just lost the fact that it was critical in the first place.

I thought about both of these points, which is why I came up with this
strategy.

Any device which uses the *_critical() API should a) have knowledge of
what happens when a particular critical clock is gated and b) have
thought about the consequences.  I don't think we can use reference
counting, because we'd need as many critical clock owners as there are
critical clocks.  Cast your mind back to the reasons for this critical
clock API.  One of the most important intentions of this API is the
requirement mitigation for each of the critical clocks to have an owner
(driver).

With regards to your second point, that's what 'critical.enabled'
is for.  Take a look at clk_enable_critical().
Lee Jones July 28, 2015, 1 p.m. | #2
On Tue, 28 Jul 2015, Maxime Ripard wrote:

> On Mon, Jul 27, 2015 at 09:53:38AM +0100, Lee Jones wrote:
> > On Mon, 27 Jul 2015, Maxime Ripard wrote:
> > 
> > > On Wed, Jul 22, 2015 at 02:04:13PM +0100, Lee Jones wrote:
> > > > These new API calls will firstly provide a mechanisms to tag a clock as
> > > > critical and secondly allow any knowledgeable driver to (un)gate clocks,
> > > > even if they are marked as critical.
> > > > 
> > > > Suggested-by: Maxime Ripard <maxime.ripard@free-electrons.com>
> > > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > > ---
> > > >  drivers/clk/clk.c            | 45 ++++++++++++++++++++++++++++++++++++++++++++
> > > >  include/linux/clk-provider.h |  2 ++
> > > >  include/linux/clk.h          | 30 +++++++++++++++++++++++++++++
> > > >  3 files changed, 77 insertions(+)
> > > > 
> > > > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> > > > index 61c3fc5..486b1da 100644
> > > > --- a/drivers/clk/clk.c
> > > > +++ b/drivers/clk/clk.c
> > > > @@ -46,6 +46,21 @@ static struct clk_core *clk_core_lookup(const char *name);
> > > >  
> > > >  /***    private data structures    ***/
> > > >  
> > > > +/**
> > > > + * struct critical -	Provides 'play' over critical clocks.  A clock can be
> > > > + *			marked as critical, meaning that it should not be
> > > > + *			disabled.  However, if a driver which is aware of the
> > > > + *			critical behaviour wants to control it, it can do so
> > > > + *			using clk_enable_critical() and clk_disable_critical().
> > > > + *
> > > > + * @enabled	Is clock critical?  Once set, doesn't change
> > > > + * @leave_on	Self explanatory.  Can be disabled by knowledgeable drivers
> > > > + */
> > > > +struct critical {
> > > > +	bool enabled;
> > > > +	bool leave_on;
> > > > +};
> > > > +
> > > >  struct clk_core {
> > > >  	const char		*name;
> > > >  	const struct clk_ops	*ops;
> > > > @@ -75,6 +90,7 @@ struct clk_core {
> > > >  	struct dentry		*dentry;
> > > >  #endif
> > > >  	struct kref		ref;
> > > > +	struct critical		critical;
> > > >  };
> > > >  
> > > >  struct clk {
> > > > @@ -995,6 +1011,10 @@ static void clk_core_disable(struct clk_core *clk)
> > > >  	if (WARN_ON(clk->enable_count == 0))
> > > >  		return;
> > > >  
> > > > +	/* Refuse to turn off a critical clock */
> > > > +	if (clk->enable_count == 1 && clk->critical.leave_on)
> > > > +		return;
> > > > +
> > > 
> > > I think it should be handled by a separate counting. Otherwise, if you
> > > have two users that marked the clock as critical, and then one of them
> > > disable it...
> > > 
> > > >  	if (--clk->enable_count > 0)
> > > >  		return;
> > > >  
> > > > @@ -1037,6 +1057,13 @@ void clk_disable(struct clk *clk)
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(clk_disable);
> > > >  
> > > > +void clk_disable_critical(struct clk *clk)
> > > > +{
> > > > +	clk->core->critical.leave_on = false;
> > > 
> > > .. you just lost the fact that it was critical in the first place.
> > 
> > I thought about both of these points, which is why I came up with this
> > strategy.
> > 
> > Any device which uses the *_critical() API should a) have knowledge of
> > what happens when a particular critical clock is gated and b) have
> > thought about the consequences.
> 
> Indeed.
> 
> > I don't think we can use reference counting, because we'd need as
> > many critical clock owners as there are critical clocks.
> 
> Which we can have if we replace the call to clk_prepare_enable you add
> in your fourth patch in __set_critical_clocks.

What should it be replaced with?

> > Cast your mind back to the reasons for this critical clock API.  One
> > of the most important intentions of this API is the requirement
> > mitigation for each of the critical clocks to have an owner
> > (driver).
> > 
> > With regards to your second point, that's what 'critical.enabled'
> > is for.  Take a look at clk_enable_critical().
> 
> I don't think this addresses the issue, if you just throw more
> customers at it, the issue remain with your implementation.
> 
> If you have three customers that used the critical API, and if on of
> these calls clk_disable_critical, you're losing leave_on.

That's the idea.  See my point above, the one you replied "Indeed"
to.  So when a driver uses clk_disable_critical() it's saying, "I know
why this clock is a critical clock, and I know that nothing terrible
will happen if I disable it, as I have that covered".  So then if it's
not the last user to call clk_disable(), the last one out the door
will be allowed to finally gate the clock, regardless whether it's
critical aware or not.

Then, when we come to enable the clock again, the critical aware user
then re-marks the clock as leave_on, so not critical un-aware user can
take the final reference and disable the clock.

> Which means that if there's one of the two users left that calls
> clk_disable on it, the clock will actually be disabled, which is
> clearly not what we want to do, as we have still a user that want the
> clock to be enabled.

That's not what happens (at least it shouldn't if I've coded it up
right).  The API _still_ requires all of the users to give-up their
reference.

> It would be much more robust to have another count for the critical
> stuff, initialised to one by the __set_critical_clocks function.

If I understand you correctly, we already have a count.  We use the
original reference count.  No need for one of our own.

Using your RAM Clock (Clock 4) as an example
--------------------------------------------

Early start-up:
  Clock 4 is marked as critical and a reference is taken (ref == 1)

Driver probe:
  SPI enables Clock 4 (ref == 2)
  I2C enables Clock 4 (ref == 3)

Suspend (without RAM driver's permission):
  SPI disables Clock 4 (ref == 2)
  I2C disables Clock 4 (ref == 1)
  /*
   * Clock won't be gated because:
   *   .leave_on is True - can't dec final reference
   */

Suspend (with RAM driver's permission):
  /* Order is unimportant */
  SPI disables Clock 4 (ref == 2)
  RAM disables Clock 4 (ref == 1) /* Won't turn off here (ref > 0)
  I2C disables Clock 4 (ref == 0) /* (.leave_on == False) last ref can be taken */
  /*
   * Clock will be gated because:
   *   .leave_on is False, so (ref == 0)
   */

Resume:
  /* Order is unimportant */
  SPI enables Clock 4 (ref == 1)
  RAM enables Clock 4 and re-enables .leave_on (ref == 2)
  I2C enables Clock 4 (ref == 3)

Hopefully that clears things up.

Please tell me if the code doesn't reflect this strategy, or if you
can see anything wrong with how it operates.
Mike Turquette July 30, 2015, 1:02 a.m. | #3
Hi Lee,

+ linux-clk ml

Quoting Lee Jones (2015-07-22 06:04:13)
> These new API calls will firstly provide a mechanisms to tag a clock as
> critical and secondly allow any knowledgeable driver to (un)gate clocks,
> even if they are marked as critical.
> 
> Suggested-by: Maxime Ripard <maxime.ripard@free-electrons.com>
> Signed-off-by: Lee Jones <lee.jones@linaro.org>
> ---
>  drivers/clk/clk.c            | 45 ++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/clk-provider.h |  2 ++
>  include/linux/clk.h          | 30 +++++++++++++++++++++++++++++
>  3 files changed, 77 insertions(+)
> 
> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> index 61c3fc5..486b1da 100644
> --- a/drivers/clk/clk.c
> +++ b/drivers/clk/clk.c
> @@ -46,6 +46,21 @@ static struct clk_core *clk_core_lookup(const char *name);
>  
>  /***    private data structures    ***/
>  
> +/**
> + * struct critical -   Provides 'play' over critical clocks.  A clock can be
> + *                     marked as critical, meaning that it should not be
> + *                     disabled.  However, if a driver which is aware of the
> + *                     critical behaviour wants to control it, it can do so
> + *                     using clk_enable_critical() and clk_disable_critical().
> + *
> + * @enabled    Is clock critical?  Once set, doesn't change
> + * @leave_on   Self explanatory.  Can be disabled by knowledgeable drivers

Not self explanatory. I need this explained to me. What does leave_on
do? Better yet, what would happen if leave_on did not exist?

> + */
> +struct critical {
> +       bool enabled;
> +       bool leave_on;
> +};
> +
>  struct clk_core {
>         const char              *name;
>         const struct clk_ops    *ops;
> @@ -75,6 +90,7 @@ struct clk_core {
>         struct dentry           *dentry;
>  #endif
>         struct kref             ref;
> +       struct critical         critical;
>  };
>  
>  struct clk {
> @@ -995,6 +1011,10 @@ static void clk_core_disable(struct clk_core *clk)
>         if (WARN_ON(clk->enable_count == 0))
>                 return;
>  
> +       /* Refuse to turn off a critical clock */
> +       if (clk->enable_count == 1 && clk->critical.leave_on)
> +               return;

How do we get to this point? clk_enable_critical actually calls
clk_enable, thus incrementing the enable_count. The only time that we
could hit the above case is if,

a) there is an imbalance in clk_enable and clk_disable calls. If this is
the case then the drivers need to be fixed. Or better yet some
infrastructure to catch that, now that we have per-user struct clk
cookies.

b) a driver knowingly calls clk_enable_critical(foo) and then regular,
old clk_disable(foo). But why would a driver do that?

It might be that I am missing the point here, so please feel free to
clue me in.

Regards,
Mike

> +
>         if (--clk->enable_count > 0)
>                 return;
>  
> @@ -1037,6 +1057,13 @@ void clk_disable(struct clk *clk)
>  }
>  EXPORT_SYMBOL_GPL(clk_disable);
>  
> +void clk_disable_critical(struct clk *clk)
> +{
> +       clk->core->critical.leave_on = false;
> +       clk_disable(clk);
> +}
> +EXPORT_SYMBOL_GPL(clk_disable_critical);
> +
>  static int clk_core_enable(struct clk_core *clk)
>  {
>         int ret = 0;
> @@ -1100,6 +1127,15 @@ int clk_enable(struct clk *clk)
>  }
>  EXPORT_SYMBOL_GPL(clk_enable);
>  
> +int clk_enable_critical(struct clk *clk)
> +{
> +       if (clk->core->critical.enabled)
> +               clk->core->critical.leave_on = true;
> +
> +       return clk_enable(clk);
> +}
> +EXPORT_SYMBOL_GPL(clk_enable_critical);
> +
>  static unsigned long clk_core_round_rate_nolock(struct clk_core *clk,
>                                                 unsigned long rate,
>                                                 unsigned long min_rate,
> @@ -2482,6 +2518,15 @@ fail_out:
>  }
>  EXPORT_SYMBOL_GPL(clk_register);
>  
> +void clk_init_critical(struct clk *clk)
> +{
> +       struct critical *critical = &clk->core->critical;
> +
> +       critical->enabled = true;
> +       critical->leave_on = true;
> +}
> +EXPORT_SYMBOL_GPL(clk_init_critical);
> +
>  /*
>   * Free memory allocated for a clock.
>   * Caller must hold prepare_lock.
> diff --git a/include/linux/clk-provider.h b/include/linux/clk-provider.h
> index 5591ea7..15ef8c9 100644
> --- a/include/linux/clk-provider.h
> +++ b/include/linux/clk-provider.h
> @@ -563,6 +563,8 @@ struct clk *devm_clk_register(struct device *dev, struct clk_hw *hw);
>  void clk_unregister(struct clk *clk);
>  void devm_clk_unregister(struct device *dev, struct clk *clk);
>  
> +void clk_init_critical(struct clk *clk);
> +
>  /* helper functions */
>  const char *__clk_get_name(struct clk *clk);
>  struct clk_hw *__clk_get_hw(struct clk *clk);
> diff --git a/include/linux/clk.h b/include/linux/clk.h
> index 8381bbf..9807f3b 100644
> --- a/include/linux/clk.h
> +++ b/include/linux/clk.h
> @@ -231,6 +231,19 @@ struct clk *devm_clk_get(struct device *dev, const char *id);
>  int clk_enable(struct clk *clk);
>  
>  /**
> + * clk_enable_critical - inform the system when the clock source should be
> + *                      running, even if clock is critical.
> + * @clk: clock source
> + *
> + * If the clock can not be enabled/disabled, this should return success.
> + *
> + * May be called from atomic contexts.
> + *
> + * Returns success (0) or negative errno.
> + */
> +int clk_enable_critical(struct clk *clk);
> +
> +/**
>   * clk_disable - inform the system when the clock source is no longer required.
>   * @clk: clock source
>   *
> @@ -247,6 +260,23 @@ int clk_enable(struct clk *clk);
>  void clk_disable(struct clk *clk);
>  
>  /**
> + * clk_disable_critical - inform the system when the clock source is no
> + *                       longer required, even if clock is critical.
> + * @clk: clock source
> + *
> + * Inform the system that a clock source is no longer required by
> + * a driver and may be shut down.
> + *
> + * May be called from atomic contexts.
> + *
> + * Implementation detail: if the clock source is shared between
> + * multiple drivers, clk_enable_critical() calls must be balanced
> + * by the same number of clk_disable_critical() calls for the clock
> + * source to be disabled.
> + */
> +void clk_disable_critical(struct clk *clk);
> +
> +/**
>   * clk_get_rate - obtain the current clock rate (in Hz) for a clock source.
>   *               This is only valid once the clock source has been enabled.
>   * @clk: clock source
> -- 
> 1.9.1
> 
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mike Turquette July 30, 2015, 1:19 a.m. | #4
Quoting Lee Jones (2015-07-28 06:00:55)
> On Tue, 28 Jul 2015, Maxime Ripard wrote:
> 
> > On Mon, Jul 27, 2015 at 09:53:38AM +0100, Lee Jones wrote:
> > > On Mon, 27 Jul 2015, Maxime Ripard wrote:
> > > 
> > > > On Wed, Jul 22, 2015 at 02:04:13PM +0100, Lee Jones wrote:
> > > > > These new API calls will firstly provide a mechanisms to tag a clock as
> > > > > critical and secondly allow any knowledgeable driver to (un)gate clocks,
> > > > > even if they are marked as critical.
> > > > > 
> > > > > Suggested-by: Maxime Ripard <maxime.ripard@free-electrons.com>
> > > > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > > > ---
> > > > >  drivers/clk/clk.c            | 45 ++++++++++++++++++++++++++++++++++++++++++++
> > > > >  include/linux/clk-provider.h |  2 ++
> > > > >  include/linux/clk.h          | 30 +++++++++++++++++++++++++++++
> > > > >  3 files changed, 77 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> > > > > index 61c3fc5..486b1da 100644
> > > > > --- a/drivers/clk/clk.c
> > > > > +++ b/drivers/clk/clk.c
> > > > > @@ -46,6 +46,21 @@ static struct clk_core *clk_core_lookup(const char *name);
> > > > >  
> > > > >  /***    private data structures    ***/
> > > > >  
> > > > > +/**
> > > > > + * struct critical -   Provides 'play' over critical clocks.  A clock can be
> > > > > + *                     marked as critical, meaning that it should not be
> > > > > + *                     disabled.  However, if a driver which is aware of the
> > > > > + *                     critical behaviour wants to control it, it can do so
> > > > > + *                     using clk_enable_critical() and clk_disable_critical().
> > > > > + *
> > > > > + * @enabled    Is clock critical?  Once set, doesn't change
> > > > > + * @leave_on   Self explanatory.  Can be disabled by knowledgeable drivers
> > > > > + */
> > > > > +struct critical {
> > > > > +       bool enabled;
> > > > > +       bool leave_on;
> > > > > +};
> > > > > +
> > > > >  struct clk_core {
> > > > >         const char              *name;
> > > > >         const struct clk_ops    *ops;
> > > > > @@ -75,6 +90,7 @@ struct clk_core {
> > > > >         struct dentry           *dentry;
> > > > >  #endif
> > > > >         struct kref             ref;
> > > > > +       struct critical         critical;
> > > > >  };
> > > > >  
> > > > >  struct clk {
> > > > > @@ -995,6 +1011,10 @@ static void clk_core_disable(struct clk_core *clk)
> > > > >         if (WARN_ON(clk->enable_count == 0))
> > > > >                 return;
> > > > >  
> > > > > +       /* Refuse to turn off a critical clock */
> > > > > +       if (clk->enable_count == 1 && clk->critical.leave_on)
> > > > > +               return;
> > > > > +
> > > > 
> > > > I think it should be handled by a separate counting. Otherwise, if you
> > > > have two users that marked the clock as critical, and then one of them
> > > > disable it...
> > > > 
> > > > >         if (--clk->enable_count > 0)
> > > > >                 return;
> > > > >  
> > > > > @@ -1037,6 +1057,13 @@ void clk_disable(struct clk *clk)
> > > > >  }
> > > > >  EXPORT_SYMBOL_GPL(clk_disable);
> > > > >  
> > > > > +void clk_disable_critical(struct clk *clk)
> > > > > +{
> > > > > +       clk->core->critical.leave_on = false;
> > > > 
> > > > .. you just lost the fact that it was critical in the first place.
> > > 
> > > I thought about both of these points, which is why I came up with this
> > > strategy.
> > > 
> > > Any device which uses the *_critical() API should a) have knowledge of
> > > what happens when a particular critical clock is gated and b) have
> > > thought about the consequences.
> > 
> > Indeed.
> > 
> > > I don't think we can use reference counting, because we'd need as
> > > many critical clock owners as there are critical clocks.
> > 
> > Which we can have if we replace the call to clk_prepare_enable you add
> > in your fourth patch in __set_critical_clocks.
> 
> What should it be replaced with?
> 
> > > Cast your mind back to the reasons for this critical clock API.  One
> > > of the most important intentions of this API is the requirement
> > > mitigation for each of the critical clocks to have an owner
> > > (driver).
> > > 
> > > With regards to your second point, that's what 'critical.enabled'
> > > is for.  Take a look at clk_enable_critical().
> > 
> > I don't think this addresses the issue, if you just throw more
> > customers at it, the issue remain with your implementation.
> > 
> > If you have three customers that used the critical API, and if on of
> > these calls clk_disable_critical, you're losing leave_on.
> 
> That's the idea.  See my point above, the one you replied "Indeed"
> to.  So when a driver uses clk_disable_critical() it's saying, "I know
> why this clock is a critical clock, and I know that nothing terrible
> will happen if I disable it, as I have that covered".  So then if it's
> not the last user to call clk_disable(), the last one out the door
> will be allowed to finally gate the clock, regardless whether it's
> critical aware or not.
> 
> Then, when we come to enable the clock again, the critical aware user
> then re-marks the clock as leave_on, so not critical un-aware user can
> take the final reference and disable the clock.
> 
> > Which means that if there's one of the two users left that calls
> > clk_disable on it, the clock will actually be disabled, which is
> > clearly not what we want to do, as we have still a user that want the
> > clock to be enabled.
> 
> That's not what happens (at least it shouldn't if I've coded it up
> right).  The API _still_ requires all of the users to give-up their
> reference.
> 
> > It would be much more robust to have another count for the critical
> > stuff, initialised to one by the __set_critical_clocks function.
> 
> If I understand you correctly, we already have a count.  We use the
> original reference count.  No need for one of our own.
> 
> Using your RAM Clock (Clock 4) as an example
> --------------------------------------------
> 
> Early start-up:
>   Clock 4 is marked as critical and a reference is taken (ref == 1)
> 
> Driver probe:
>   SPI enables Clock 4 (ref == 2)
>   I2C enables Clock 4 (ref == 3)
> 
> Suspend (without RAM driver's permission):
>   SPI disables Clock 4 (ref == 2)
>   I2C disables Clock 4 (ref == 1)
>   /*
>    * Clock won't be gated because:
>    *   .leave_on is True - can't dec final reference

I am clearly missing the point. The clock won't be gated because the
enable_count is still 1! What does .leave_on do here?

>    */
> 
> Suspend (with RAM driver's permission):
>   /* Order is unimportant */
>   SPI disables Clock 4 (ref == 2)
>   RAM disables Clock 4 (ref == 1) /* Won't turn off here (ref > 0)
>   I2C disables Clock 4 (ref == 0) /* (.leave_on == False) last ref can be taken */
>   /*
>    * Clock will be gated because:
>    *   .leave_on is False, so (ref == 0)

Again, .leave_on does nothing new here. We gate the clock because the
reference count is 0.

>    */
> 
> Resume:
>   /* Order is unimportant */
>   SPI enables Clock 4 (ref == 1)
>   RAM enables Clock 4 and re-enables .leave_on (ref == 2)
>   I2C enables Clock 4 (ref == 3)

Same again. As soon as RAM calls clk_enable_critical the ref count goes
up. .leave_on does nothing as far as I can tell. The all works because
of the reference counting, which already exists before this patch
series.

Regards,
Mike

> 
> Hopefully that clears things up.
> 
> Please tell me if the code doesn't reflect this strategy, or if you
> can see anything wrong with how it operates.
> 
> -- 
> Lee Jones
> Linaro STMicroelectronics Landing Team Lead
> Linaro.org │ Open source software for ARM SoCs
> Follow Linaro: Facebook | Twitter | Blog
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Mike Turquette July 30, 2015, 1:21 a.m. | #5
Quoting Lee Jones (2015-07-27 01:53:38)
> On Mon, 27 Jul 2015, Maxime Ripard wrote:
> 
> > On Wed, Jul 22, 2015 at 02:04:13PM +0100, Lee Jones wrote:
> > > These new API calls will firstly provide a mechanisms to tag a clock as
> > > critical and secondly allow any knowledgeable driver to (un)gate clocks,
> > > even if they are marked as critical.
> > > 
> > > Suggested-by: Maxime Ripard <maxime.ripard@free-electrons.com>
> > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > ---
> > >  drivers/clk/clk.c            | 45 ++++++++++++++++++++++++++++++++++++++++++++
> > >  include/linux/clk-provider.h |  2 ++
> > >  include/linux/clk.h          | 30 +++++++++++++++++++++++++++++
> > >  3 files changed, 77 insertions(+)
> > > 
> > > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> > > index 61c3fc5..486b1da 100644
> > > --- a/drivers/clk/clk.c
> > > +++ b/drivers/clk/clk.c
> > > @@ -46,6 +46,21 @@ static struct clk_core *clk_core_lookup(const char *name);
> > >  
> > >  /***    private data structures    ***/
> > >  
> > > +/**
> > > + * struct critical -       Provides 'play' over critical clocks.  A clock can be
> > > + *                 marked as critical, meaning that it should not be
> > > + *                 disabled.  However, if a driver which is aware of the
> > > + *                 critical behaviour wants to control it, it can do so
> > > + *                 using clk_enable_critical() and clk_disable_critical().
> > > + *
> > > + * @enabled        Is clock critical?  Once set, doesn't change
> > > + * @leave_on       Self explanatory.  Can be disabled by knowledgeable drivers
> > > + */
> > > +struct critical {
> > > +   bool enabled;
> > > +   bool leave_on;
> > > +};
> > > +
> > >  struct clk_core {
> > >     const char              *name;
> > >     const struct clk_ops    *ops;
> > > @@ -75,6 +90,7 @@ struct clk_core {
> > >     struct dentry           *dentry;
> > >  #endif
> > >     struct kref             ref;
> > > +   struct critical         critical;
> > >  };
> > >  
> > >  struct clk {
> > > @@ -995,6 +1011,10 @@ static void clk_core_disable(struct clk_core *clk)
> > >     if (WARN_ON(clk->enable_count == 0))
> > >             return;
> > >  
> > > +   /* Refuse to turn off a critical clock */
> > > +   if (clk->enable_count == 1 && clk->critical.leave_on)
> > > +           return;
> > > +
> > 
> > I think it should be handled by a separate counting. Otherwise, if you
> > have two users that marked the clock as critical, and then one of them
> > disable it...
> > 
> > >     if (--clk->enable_count > 0)
> > >             return;
> > >  
> > > @@ -1037,6 +1057,13 @@ void clk_disable(struct clk *clk)
> > >  }
> > >  EXPORT_SYMBOL_GPL(clk_disable);
> > >  
> > > +void clk_disable_critical(struct clk *clk)
> > > +{
> > > +   clk->core->critical.leave_on = false;
> > 
> > .. you just lost the fact that it was critical in the first place.
> 
> I thought about both of these points, which is why I came up with this
> strategy.
> 
> Any device which uses the *_critical() API should a) have knowledge of
> what happens when a particular critical clock is gated and b) have
> thought about the consequences.

If this statement above is true then I fail to see the need for a new
api. A driver which has a really great idea of when it is safe or unsafe
to gate a clock should call clk_prepare_enable at probe and then only
call clk_disable_unprepare once it is safe to do so.

The existing bookkeeping in the clock framework will do the rest.

Regards,
Mike

> I don't think we can use reference
> counting, because we'd need as many critical clock owners as there are
> critical clocks.  Cast your mind back to the reasons for this critical
> clock API.  One of the most important intentions of this API is the
> requirement mitigation for each of the critical clocks to have an owner
> (driver).
> 
> With regards to your second point, that's what 'critical.enabled'
> is for.  Take a look at clk_enable_critical().
> 
> -- 
> Lee Jones
> Linaro STMicroelectronics Landing Team Lead
> Linaro.org │ Open source software for ARM SoCs
> Follow Linaro: Facebook | Twitter | Blog
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lee Jones July 30, 2015, 9:21 a.m. | #6
On Wed, 29 Jul 2015, Michael Turquette wrote:
> Quoting Lee Jones (2015-07-27 01:53:38)
> > On Mon, 27 Jul 2015, Maxime Ripard wrote:
> > 
> > > On Wed, Jul 22, 2015 at 02:04:13PM +0100, Lee Jones wrote:
> > > > These new API calls will firstly provide a mechanisms to tag a clock as
> > > > critical and secondly allow any knowledgeable driver to (un)gate clocks,
> > > > even if they are marked as critical.
> > > > 
> > > > Suggested-by: Maxime Ripard <maxime.ripard@free-electrons.com>
> > > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > > ---
> > > >  drivers/clk/clk.c            | 45 ++++++++++++++++++++++++++++++++++++++++++++
> > > >  include/linux/clk-provider.h |  2 ++
> > > >  include/linux/clk.h          | 30 +++++++++++++++++++++++++++++
> > > >  3 files changed, 77 insertions(+)
> > > > 
> > > > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> > > > index 61c3fc5..486b1da 100644
> > > > --- a/drivers/clk/clk.c
> > > > +++ b/drivers/clk/clk.c
> > > > @@ -46,6 +46,21 @@ static struct clk_core *clk_core_lookup(const char *name);
> > > >  
> > > >  /***    private data structures    ***/
> > > >  
> > > > +/**
> > > > + * struct critical -       Provides 'play' over critical clocks.  A clock can be
> > > > + *                 marked as critical, meaning that it should not be
> > > > + *                 disabled.  However, if a driver which is aware of the
> > > > + *                 critical behaviour wants to control it, it can do so
> > > > + *                 using clk_enable_critical() and clk_disable_critical().
> > > > + *
> > > > + * @enabled        Is clock critical?  Once set, doesn't change
> > > > + * @leave_on       Self explanatory.  Can be disabled by knowledgeable drivers
> > > > + */
> > > > +struct critical {
> > > > +   bool enabled;
> > > > +   bool leave_on;
> > > > +};
> > > > +
> > > >  struct clk_core {
> > > >     const char              *name;
> > > >     const struct clk_ops    *ops;
> > > > @@ -75,6 +90,7 @@ struct clk_core {
> > > >     struct dentry           *dentry;
> > > >  #endif
> > > >     struct kref             ref;
> > > > +   struct critical         critical;
> > > >  };
> > > >  
> > > >  struct clk {
> > > > @@ -995,6 +1011,10 @@ static void clk_core_disable(struct clk_core *clk)
> > > >     if (WARN_ON(clk->enable_count == 0))
> > > >             return;
> > > >  
> > > > +   /* Refuse to turn off a critical clock */
> > > > +   if (clk->enable_count == 1 && clk->critical.leave_on)
> > > > +           return;
> > > > +
> > > 
> > > I think it should be handled by a separate counting. Otherwise, if you
> > > have two users that marked the clock as critical, and then one of them
> > > disable it...
> > > 
> > > >     if (--clk->enable_count > 0)
> > > >             return;
> > > >  
> > > > @@ -1037,6 +1057,13 @@ void clk_disable(struct clk *clk)
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(clk_disable);
> > > >  
> > > > +void clk_disable_critical(struct clk *clk)
> > > > +{
> > > > +   clk->core->critical.leave_on = false;
> > > 
> > > .. you just lost the fact that it was critical in the first place.
> > 
> > I thought about both of these points, which is why I came up with this
> > strategy.
> > 
> > Any device which uses the *_critical() API should a) have knowledge of
> > what happens when a particular critical clock is gated and b) have
> > thought about the consequences.
> 
> If this statement above is true then I fail to see the need for a new
> api. A driver which has a really great idea of when it is safe or unsafe
> to gate a clock should call clk_prepare_enable at probe and then only
> call clk_disable_unprepare once it is safe to do so.
> 
> The existing bookkeeping in the clock framework will do the rest.

I think you are viewing this particular API back-to-front.  The idea
is to mark all of the critical clocks at start-up by taking a
reference.  Then, if there are no knowledgable drivers who wish to
turn the clock off, the CCF will leave the clock ungated becuase the
reference count will always be >0.

The clk_{disable,enable}_critical() calls are to be used by
knowledgable drivers to say "[disable] I know what I'm doing and it's
okay for this clock to be turned off" and "[enable] right I'm done
with this clock now, let's turn it back on and mark it back as
critical, so no one else can turn it off".

To put things simply, the knowledgable driver will _not_ be enabling
the clock in the first place.  The first interaction it has with it is
to gate it.  Then, once it's done, it will enable it again and mark it
back up as critical.

Still confused ... let's go on another Q in one of your other emails
for another way of putting it.

> > I don't think we can use reference
> > counting, because we'd need as many critical clock owners as there are
> > critical clocks.  Cast your mind back to the reasons for this critical
> > clock API.  One of the most important intentions of this API is the
> > requirement mitigation for each of the critical clocks to have an owner
> > (driver).
> > 
> > With regards to your second point, that's what 'critical.enabled'
> > is for.  Take a look at clk_enable_critical().
> >
Lee Jones July 30, 2015, 9:50 a.m. | #7
On Wed, 29 Jul 2015, Michael Turquette wrote:
> Quoting Lee Jones (2015-07-28 06:00:55)
> > On Tue, 28 Jul 2015, Maxime Ripard wrote:
> > 
> > > On Mon, Jul 27, 2015 at 09:53:38AM +0100, Lee Jones wrote:
> > > > On Mon, 27 Jul 2015, Maxime Ripard wrote:
> > > > 
> > > > > On Wed, Jul 22, 2015 at 02:04:13PM +0100, Lee Jones wrote:
> > > > > > These new API calls will firstly provide a mechanisms to tag a clock as
> > > > > > critical and secondly allow any knowledgeable driver to (un)gate clocks,
> > > > > > even if they are marked as critical.
> > > > > > 
> > > > > > Suggested-by: Maxime Ripard <maxime.ripard@free-electrons.com>
> > > > > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > > > > ---
> > > > > >  drivers/clk/clk.c            | 45 ++++++++++++++++++++++++++++++++++++++++++++
> > > > > >  include/linux/clk-provider.h |  2 ++
> > > > > >  include/linux/clk.h          | 30 +++++++++++++++++++++++++++++
> > > > > >  3 files changed, 77 insertions(+)
> > > > > > 
> > > > > > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> > > > > > index 61c3fc5..486b1da 100644
> > > > > > --- a/drivers/clk/clk.c
> > > > > > +++ b/drivers/clk/clk.c
> > > > > > @@ -46,6 +46,21 @@ static struct clk_core *clk_core_lookup(const char *name);
> > > > > >  
> > > > > >  /***    private data structures    ***/
> > > > > >  
> > > > > > +/**
> > > > > > + * struct critical -   Provides 'play' over critical clocks.  A clock can be
> > > > > > + *                     marked as critical, meaning that it should not be
> > > > > > + *                     disabled.  However, if a driver which is aware of the
> > > > > > + *                     critical behaviour wants to control it, it can do so
> > > > > > + *                     using clk_enable_critical() and clk_disable_critical().
> > > > > > + *
> > > > > > + * @enabled    Is clock critical?  Once set, doesn't change
> > > > > > + * @leave_on   Self explanatory.  Can be disabled by knowledgeable drivers
> > > > > > + */
> > > > > > +struct critical {
> > > > > > +       bool enabled;
> > > > > > +       bool leave_on;
> > > > > > +};
> > > > > > +
> > > > > >  struct clk_core {
> > > > > >         const char              *name;
> > > > > >         const struct clk_ops    *ops;
> > > > > > @@ -75,6 +90,7 @@ struct clk_core {
> > > > > >         struct dentry           *dentry;
> > > > > >  #endif
> > > > > >         struct kref             ref;
> > > > > > +       struct critical         critical;
> > > > > >  };
> > > > > >  
> > > > > >  struct clk {
> > > > > > @@ -995,6 +1011,10 @@ static void clk_core_disable(struct clk_core *clk)
> > > > > >         if (WARN_ON(clk->enable_count == 0))
> > > > > >                 return;
> > > > > >  
> > > > > > +       /* Refuse to turn off a critical clock */
> > > > > > +       if (clk->enable_count == 1 && clk->critical.leave_on)
> > > > > > +               return;
> > > > > > +
> > > > > 
> > > > > I think it should be handled by a separate counting. Otherwise, if you
> > > > > have two users that marked the clock as critical, and then one of them
> > > > > disable it...
> > > > > 
> > > > > >         if (--clk->enable_count > 0)
> > > > > >                 return;
> > > > > >  
> > > > > > @@ -1037,6 +1057,13 @@ void clk_disable(struct clk *clk)
> > > > > >  }
> > > > > >  EXPORT_SYMBOL_GPL(clk_disable);
> > > > > >  
> > > > > > +void clk_disable_critical(struct clk *clk)
> > > > > > +{
> > > > > > +       clk->core->critical.leave_on = false;
> > > > > 
> > > > > .. you just lost the fact that it was critical in the first place.
> > > > 
> > > > I thought about both of these points, which is why I came up with this
> > > > strategy.
> > > > 
> > > > Any device which uses the *_critical() API should a) have knowledge of
> > > > what happens when a particular critical clock is gated and b) have
> > > > thought about the consequences.
> > > 
> > > Indeed.
> > > 
> > > > I don't think we can use reference counting, because we'd need as
> > > > many critical clock owners as there are critical clocks.
> > > 
> > > Which we can have if we replace the call to clk_prepare_enable you add
> > > in your fourth patch in __set_critical_clocks.
> > 
> > What should it be replaced with?
> > 
> > > > Cast your mind back to the reasons for this critical clock API.  One
> > > > of the most important intentions of this API is the requirement
> > > > mitigation for each of the critical clocks to have an owner
> > > > (driver).
> > > > 
> > > > With regards to your second point, that's what 'critical.enabled'
> > > > is for.  Take a look at clk_enable_critical().
> > > 
> > > I don't think this addresses the issue, if you just throw more
> > > customers at it, the issue remain with your implementation.
> > > 
> > > If you have three customers that used the critical API, and if on of
> > > these calls clk_disable_critical, you're losing leave_on.
> > 
> > That's the idea.  See my point above, the one you replied "Indeed"
> > to.  So when a driver uses clk_disable_critical() it's saying, "I know
> > why this clock is a critical clock, and I know that nothing terrible
> > will happen if I disable it, as I have that covered".  So then if it's
> > not the last user to call clk_disable(), the last one out the door
> > will be allowed to finally gate the clock, regardless whether it's
> > critical aware or not.
> > 
> > Then, when we come to enable the clock again, the critical aware user
> > then re-marks the clock as leave_on, so not critical un-aware user can
> > take the final reference and disable the clock.
> > 
> > > Which means that if there's one of the two users left that calls
> > > clk_disable on it, the clock will actually be disabled, which is
> > > clearly not what we want to do, as we have still a user that want the
> > > clock to be enabled.
> > 
> > That's not what happens (at least it shouldn't if I've coded it up
> > right).  The API _still_ requires all of the users to give-up their
> > reference.
> > 
> > > It would be much more robust to have another count for the critical
> > > stuff, initialised to one by the __set_critical_clocks function.
> > 
> > If I understand you correctly, we already have a count.  We use the
> > original reference count.  No need for one of our own.
> > 
> > Using your RAM Clock (Clock 4) as an example
> > --------------------------------------------
> > 
> > Early start-up:
> >   Clock 4 is marked as critical and a reference is taken (ref == 1)
> > 
> > Driver probe:
> >   SPI enables Clock 4 (ref == 2)
> >   I2C enables Clock 4 (ref == 3)
> > 
> > Suspend (without RAM driver's permission):
> >   SPI disables Clock 4 (ref == 2)
> >   I2C disables Clock 4 (ref == 1)
> >   /*
> >    * Clock won't be gated because:
> >    *   .leave_on is True - can't dec final reference
> 
> I am clearly missing the point. The clock won't be gated because the
> enable_count is still 1! What does .leave_on do here?

The point of _this_ (the extended) part of the API is so that the
clock _can_ be turned off.  Without the possibility to disable
.leave_on and the logic with accompanies it (i.e.
clk_disable_critical()) the clock will _never_ be gated.

> >    */
> > 
> > Suspend (with RAM driver's permission):
> >   /* Order is unimportant */
> >   SPI disables Clock 4 (ref == 2)
> >   RAM disables Clock 4 (ref == 1) /* Won't turn off here (ref > 0)
> >   I2C disables Clock 4 (ref == 0) /* (.leave_on == False) last ref can be taken */
> >   /*
> >    * Clock will be gated because:
> >    *   .leave_on is False, so (ref == 0)
> 
> Again, .leave_on does nothing new here. We gate the clock because the
> reference count is 0.

It's the fact that .leave_on has been disabled in
clk_disable_critical() that allows the final reference to be taken.

> >    */
> > 
> > Resume:
> >   /* Order is unimportant */
> >   SPI enables Clock 4 (ref == 1)
> >   RAM enables Clock 4 and re-enables .leave_on (ref == 2)
> >   I2C enables Clock 4 (ref == 3)
> 
> Same again. As soon as RAM calls clk_enable_critical the ref count goes
> up. .leave_on does nothing as far as I can tell. The all works because
> of the reference counting, which already exists before this patch
> series.

So fundamentally you're right in what you say.  All you really need to
disable a critical clock is write a knowledgeable driver, which is
intentionally unbalanced i.e. just calls clk_disable().  All this
extended API really does is makes the process more official and
ensures that an unintentionally unbalanced driver doesn't bugger up
the running platform.  We could also add a new WARN() to say that said
driver is unbalanced, as it just tried to turn off a critical clock.

What do you think is best?
Lee Jones July 30, 2015, 11:17 a.m. | #8
On Wed, 29 Jul 2015, Michael Turquette wrote:

> Hi Lee,
> 
> + linux-clk ml
> 
> Quoting Lee Jones (2015-07-22 06:04:13)
> > These new API calls will firstly provide a mechanisms to tag a clock as
> > critical and secondly allow any knowledgeable driver to (un)gate clocks,
> > even if they are marked as critical.
> > 
> > Suggested-by: Maxime Ripard <maxime.ripard@free-electrons.com>
> > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > ---
> >  drivers/clk/clk.c            | 45 ++++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/clk-provider.h |  2 ++
> >  include/linux/clk.h          | 30 +++++++++++++++++++++++++++++
> >  3 files changed, 77 insertions(+)
> > 
> > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> > index 61c3fc5..486b1da 100644
> > --- a/drivers/clk/clk.c
> > +++ b/drivers/clk/clk.c
> > @@ -46,6 +46,21 @@ static struct clk_core *clk_core_lookup(const char *name);
> >  
> >  /***    private data structures    ***/
> >  
> > +/**
> > + * struct critical -   Provides 'play' over critical clocks.  A clock can be
> > + *                     marked as critical, meaning that it should not be
> > + *                     disabled.  However, if a driver which is aware of the
> > + *                     critical behaviour wants to control it, it can do so
> > + *                     using clk_enable_critical() and clk_disable_critical().
> > + *
> > + * @enabled    Is clock critical?  Once set, doesn't change
> > + * @leave_on   Self explanatory.  Can be disabled by knowledgeable drivers
> 
> Not self explanatory. I need this explained to me. What does leave_on
> do? Better yet, what would happen if leave_on did not exist?
> 
> > + */
> > +struct critical {
> > +       bool enabled;
> > +       bool leave_on;
> > +};
> > +
> >  struct clk_core {
> >         const char              *name;
> >         const struct clk_ops    *ops;
> > @@ -75,6 +90,7 @@ struct clk_core {
> >         struct dentry           *dentry;
> >  #endif
> >         struct kref             ref;
> > +       struct critical         critical;
> >  };
> >  
> >  struct clk {
> > @@ -995,6 +1011,10 @@ static void clk_core_disable(struct clk_core *clk)
> >         if (WARN_ON(clk->enable_count == 0))
> >                 return;
> >  
> > +       /* Refuse to turn off a critical clock */
> > +       if (clk->enable_count == 1 && clk->critical.leave_on)
> > +               return;
> 
> How do we get to this point? clk_enable_critical actually calls
> clk_enable, thus incrementing the enable_count. The only time that we
> could hit the above case is if,
> 
> a) there is an imbalance in clk_enable and clk_disable calls. If this is
> the case then the drivers need to be fixed. Or better yet some
> infrastructure to catch that, now that we have per-user struct clk
> cookies.
> 
> b) a driver knowingly calls clk_enable_critical(foo) and then regular,
> old clk_disable(foo). But why would a driver do that?
> 
> It might be that I am missing the point here, so please feel free to
> clue me in.

This check behaves in a very similar to the WARN() above.  It's more
of a fail-safe.  If all drivers are behaving properly, then it
shouldn't ever be true.  If they're not, it prevents an incorrectly
written driver from irrecoverably crippling the system.

As I said in the other mail.  We can do without these 3 new wrappers.
We _could_ just write a driver which only calls clk_enable() _after_
it calls clk_disable(), a kind of intentional unbalance and it would
do that same thing.  However, what we're trying to do here is provide
a proper API, so we can see at first glance what the 'knowledgeable'
driver is trying to do and not have someone attempt to submit a 'fix'
which calls clk_enable() or something.

> > +
> >         if (--clk->enable_count > 0)
> >                 return;
> >  
> > @@ -1037,6 +1057,13 @@ void clk_disable(struct clk *clk)
> >  }
> >  EXPORT_SYMBOL_GPL(clk_disable);
> >  
> > +void clk_disable_critical(struct clk *clk)
> > +{
> > +       clk->core->critical.leave_on = false;
> > +       clk_disable(clk);
> > +}
> > +EXPORT_SYMBOL_GPL(clk_disable_critical);
> > +
> >  static int clk_core_enable(struct clk_core *clk)
> >  {
> >         int ret = 0;
> > @@ -1100,6 +1127,15 @@ int clk_enable(struct clk *clk)
> >  }
> >  EXPORT_SYMBOL_GPL(clk_enable);
> >  
> > +int clk_enable_critical(struct clk *clk)
> > +{
> > +       if (clk->core->critical.enabled)
> > +               clk->core->critical.leave_on = true;
> > +
> > +       return clk_enable(clk);
> > +}
> > +EXPORT_SYMBOL_GPL(clk_enable_critical);
> > +
> >  static unsigned long clk_core_round_rate_nolock(struct clk_core *clk,
> >                                                 unsigned long rate,
> >                                                 unsigned long min_rate,
> > @@ -2482,6 +2518,15 @@ fail_out:
> >  }
> >  EXPORT_SYMBOL_GPL(clk_register);
> >  
> > +void clk_init_critical(struct clk *clk)
> > +{
> > +       struct critical *critical = &clk->core->critical;
> > +
> > +       critical->enabled = true;
> > +       critical->leave_on = true;
> > +}
> > +EXPORT_SYMBOL_GPL(clk_init_critical);
> > +
> >  /*
> >   * Free memory allocated for a clock.
> >   * Caller must hold prepare_lock.
> > diff --git a/include/linux/clk-provider.h b/include/linux/clk-provider.h
> > index 5591ea7..15ef8c9 100644
> > --- a/include/linux/clk-provider.h
> > +++ b/include/linux/clk-provider.h
> > @@ -563,6 +563,8 @@ struct clk *devm_clk_register(struct device *dev, struct clk_hw *hw);
> >  void clk_unregister(struct clk *clk);
> >  void devm_clk_unregister(struct device *dev, struct clk *clk);
> >  
> > +void clk_init_critical(struct clk *clk);
> > +
> >  /* helper functions */
> >  const char *__clk_get_name(struct clk *clk);
> >  struct clk_hw *__clk_get_hw(struct clk *clk);
> > diff --git a/include/linux/clk.h b/include/linux/clk.h
> > index 8381bbf..9807f3b 100644
> > --- a/include/linux/clk.h
> > +++ b/include/linux/clk.h
> > @@ -231,6 +231,19 @@ struct clk *devm_clk_get(struct device *dev, const char *id);
> >  int clk_enable(struct clk *clk);
> >  
> >  /**
> > + * clk_enable_critical - inform the system when the clock source should be
> > + *                      running, even if clock is critical.
> > + * @clk: clock source
> > + *
> > + * If the clock can not be enabled/disabled, this should return success.
> > + *
> > + * May be called from atomic contexts.
> > + *
> > + * Returns success (0) or negative errno.
> > + */
> > +int clk_enable_critical(struct clk *clk);
> > +
> > +/**
> >   * clk_disable - inform the system when the clock source is no longer required.
> >   * @clk: clock source
> >   *
> > @@ -247,6 +260,23 @@ int clk_enable(struct clk *clk);
> >  void clk_disable(struct clk *clk);
> >  
> >  /**
> > + * clk_disable_critical - inform the system when the clock source is no
> > + *                       longer required, even if clock is critical.
> > + * @clk: clock source
> > + *
> > + * Inform the system that a clock source is no longer required by
> > + * a driver and may be shut down.
> > + *
> > + * May be called from atomic contexts.
> > + *
> > + * Implementation detail: if the clock source is shared between
> > + * multiple drivers, clk_enable_critical() calls must be balanced
> > + * by the same number of clk_disable_critical() calls for the clock
> > + * source to be disabled.
> > + */
> > +void clk_disable_critical(struct clk *clk);
> > +
> > +/**
> >   * clk_get_rate - obtain the current clock rate (in Hz) for a clock source.
> >   *               This is only valid once the clock source has been enabled.
> >   * @clk: clock source
Lee Jones July 31, 2015, 8:32 a.m. | #9
On Fri, 31 Jul 2015, Maxime Ripard wrote:

> On Thu, Jul 30, 2015 at 03:47:20PM -0700, Michael Turquette wrote:
> > > > >    */
> > > > > 
> > > > > Resume:
> > > > >   /* Order is unimportant */
> > > > >   SPI enables Clock 4 (ref == 1)
> > > > >   RAM enables Clock 4 and re-enables .leave_on (ref == 2)
> > > > >   I2C enables Clock 4 (ref == 3)
> > > > 
> > > > Same again. As soon as RAM calls clk_enable_critical the ref count goes
> > > > up. .leave_on does nothing as far as I can tell. The all works because
> > > > of the reference counting, which already exists before this patch
> > > > series.
> > > 
> > > So fundamentally you're right in what you say.  All you really need to
> > > disable a critical clock is write a knowledgeable driver, which is
> > > intentionally unbalanced i.e. just calls clk_disable().  All this
> > 
> > OK, the line above is helpful. What you really want is a formalized
> > hand-off mechanism, whereby the clock is enabled at registration-time
> > and it cannot be turned off until the right driver claims it and decides
> > turning it off is OK (with a priori knowledge that the clock is already
> > enabled).
> 
> There's two things that should be covered, and are related, yet can be
> done in two steps:
> 
>   - Have a way to, no matter what (which configuration we have, if we
>     have multiple users or not that might reparent or disable their
>     clocks, etc.), make sure that a clock will always be running by
>     default. This is done through the call in clk-conf, and we
>     identify such clocks using critical-clocks in the DT.
> 
>   - Now, even though that information is true, some driver who are
>     aware of that fact might want to disable those critical
>     clocks. This is what the clk_disable_critical and
>     clk_enable_critical functions are here for.

+1

> > Note that I don't think this implementation can really work in the near
> > future. Today we do not catch unbalanced calls to clk_enable and
> > clk_disable, but I have a patch that catches this and WARNs loudly in my
> > private tree. More on that in the next stanza.
> > 
> > What I don't understand is if there is ever a case for a clock consumer
> > driver to ever call clk_enable_critical... I do not think there is. What
> > you're trying to protect against is having the clock disabled BEFORE
> > that "knowledgeable driver" has a chance to enable it.

In my mind and in this implementation  clk_disable_critical() will be
used _first_ by the knowledgeable driver, then when the knowledgeable
driver has finished whatever it was doing (shutting down banks of RAM
etc...), it should then call clk_enable_critical() to reset the clock
state back to the way it found it i.e. enabled and marked as critical.

> It's really about what we want the API to look like in the second
> case.
> 
> Do we want such drivers to still call clk_prepare_enable? Some other
> function? Should they assume that the clock has already been enabled,
> or do we want a handover, or a force disable, or whatever.
> 
> I guess we should discuss those questions, before starting to think
> about how to implement it.
> 
> IMHO, I think that the existing way of doing should be used, or at
> least with the same mindset to avoid confusion, errors, and
> misinformed reviews.
> 
> So I'd expect the drivers to do something like:
> 
> probe:
>   clk_get
>   clk_critical_enable

Well becuase the clock has been marked as critical, a reference has
already been taken, so even if there are 0 users the clock now has 2
references attributed to it.

> remove / probe error path:
>   clk_critical_disable
>   clk_put

I think we should assume that the clock is already running in the
knowledgeable driver:

start-up:
  __set_critical_clocks()

probe:
  clk_get()

suspend (or whatever reason the driver wishes to disable the clk):
  clk_disable_critical()

resume (or whatever ...):
  clk_enable_critical()

remove:
  clk_put() /* Or just rely on devm_*() */ 

> and use the clk_critical_enable and clk_critical_disable whenever
> needed, and thing would just work as intended.
Lee Jones July 31, 2015, 8:48 a.m. | #10
On Fri, 31 Jul 2015, Maxime Ripard wrote:

> On Tue, Jul 28, 2015 at 02:00:55PM +0100, Lee Jones wrote:
> > > > I don't think we can use reference counting, because we'd need as
> > > > many critical clock owners as there are critical clocks.
> > > 
> > > Which we can have if we replace the call to clk_prepare_enable you add
> > > in your fourth patch in __set_critical_clocks.
> > 
> > What should it be replaced with?
> 
> clk->critical_count++
> clk_prepare_enable
> 
> ?

Ah, so not replace it then.  Just add a reference counter.

I'm with you, that's fine.

> > > > Cast your mind back to the reasons for this critical clock API.  One
> > > > of the most important intentions of this API is the requirement
> > > > mitigation for each of the critical clocks to have an owner
> > > > (driver).
> > > > 
> > > > With regards to your second point, that's what 'critical.enabled'
> > > > is for.  Take a look at clk_enable_critical().
> > > 
> > > I don't think this addresses the issue, if you just throw more
> > > customers at it, the issue remain with your implementation.
> > > 
> > > If you have three customers that used the critical API, and if on of
> > > these calls clk_disable_critical, you're losing leave_on.
> > 
> > That's the idea.  See my point above, the one you replied "Indeed"
> > to.  So when a driver uses clk_disable_critical() it's saying, "I know
> > why this clock is a critical clock, and I know that nothing terrible
> > will happen if I disable it, as I have that covered".
> 
> We do agree on the semantic of clk_disable_critical :)
> 
> > So then if it's not the last user to call clk_disable(), the last
> > one out the door will be allowed to finally gate the clock,
> > regardless whether it's critical aware or not.
> 
> That's right, but what I mean would be a case where you have two users
> that are aware that it is a critical clock (A and B), and one which is
> not (C).
> 
> If I understood correctly your code, if A calls clk_disable_critical,
> leave_on is set to false. That means that now, if C calls clk_disable
> on that clock, it will actually be shut down, while B still considers
> it a critical clock.

Hmm... I'd have to think about this.

How would you mark a clock as critical twice?

> > Then, when we come to enable the clock again, the critical aware user
> > then re-marks the clock as leave_on, so not critical un-aware user can
> > take the final reference and disable the clock.
> > 
> > > Which means that if there's one of the two users left that calls
> > > clk_disable on it, the clock will actually be disabled, which is
> > > clearly not what we want to do, as we have still a user that want the
> > > clock to be enabled.
> > 
> > That's not what happens (at least it shouldn't if I've coded it up
> > right).  The API _still_ requires all of the users to give-up their
> > reference.
> 
> Ah, right. So I guess it all boils down to the discussion you're
> having with Mike regarding whether critical users should expect to
> already have a reference taken or calling clk_prepare / clk_enable
> themselves.

Then we'd need a clk_prepare_enable_critical() :)

... which would be aware of the original reference (taken by
__set_critical_clocks).  However, if a second knowledgeable driver
were to call it, then how would it know that whether the original
reference was still present or not?  I guess that's where your
critical clock reference comes in.  If it's the first critical user,
it would decrement the original reference, it it's a subsequent user,
then it won't

> > > It would be much more robust to have another count for the critical
> > > stuff, initialised to one by the __set_critical_clocks function.
> > 
> > If I understand you correctly, we already have a count.  We use the
> > original reference count.  No need for one of our own.
> > 
> > Using your RAM Clock (Clock 4) as an example
> > --------------------------------------------
> > 
> > Early start-up:
> >   Clock 4 is marked as critical and a reference is taken (ref == 1)
> > 
> > Driver probe:
> >   SPI enables Clock 4 (ref == 2)
> >   I2C enables Clock 4 (ref == 3)
> > 
> > Suspend (without RAM driver's permission):
> >   SPI disables Clock 4 (ref == 2)
> >   I2C disables Clock 4 (ref == 1)
> >   /*
> >    * Clock won't be gated because:
> >    *   .leave_on is True - can't dec final reference
> >    */
> > 
> > Suspend (with RAM driver's permission):
> >   /* Order is unimportant */
> >   SPI disables Clock 4 (ref == 2)
> >   RAM disables Clock 4 (ref == 1) /* Won't turn off here (ref > 0)
> >   I2C disables Clock 4 (ref == 0) /* (.leave_on == False) last ref can be taken */
> >   /*
> >    * Clock will be gated because:
> >    *   .leave_on is False, so (ref == 0)
> >    */
> > 
> > Resume:
> >   /* Order is unimportant */
> >   SPI enables Clock 4 (ref == 1)
> >   RAM enables Clock 4 and re-enables .leave_on (ref == 2)
> >   I2C enables Clock 4 (ref == 3)
> > 
> > Hopefully that clears things up.
> 
> It does indeed. I simply forgot to take into account the fact that it
> would still need the reference to be set to 0. My bad.
> 
> Still, If we take the same kind of scenario:
> 
> Early start-up:
>   Clock 4 is marked as critical and a reference is taken (ref == 1, leave_on = true)
> 
> Driver probe:
>   SPI enables Clock 4 (ref == 2)
>   I2C enables Clock 4 (ref == 3)
>   RAM enables Clock 4 (ref == 4, leave_on = true )
>   CPUIdle enables Clock 4 (ref == 5, leave_on = true )
> 
> Suspend (with CPUIdle and RAM permissions):
>   /* Order is unimportant */
>   SPI disables Clock 4 (ref == 4)
>   CPUIdle disables Clock 4 (ref == 3, leave_on = false ) 
>   RAM disables Clock 4 (ref == 2, leave_on = false )
>   I2C disables Clock 4 (ref == 1)
> 
> And even though the clock will still be running when CPUIdle calls
> clk_disable_critical because of the enable_count, the status of
> leave_on is off, as the RAM driver still considers it to be left on
> (ie, hasn't called clk_disable_critical yet).
> 
> Or at least, that's what I understood of it.

Right, I understood this problem when you suggested that two critical
clock users could be using the same clock.  Other than having them
call clock_prepare_enable[_critical](), I'm not sure if that's
possible.

As I mentioned above, we can handle this with reference counting and
I'm happy to code that up.
Lee Jones July 31, 2015, 8:56 a.m. | #11
On Thu, 30 Jul 2015, Michael Turquette wrote:

> Quoting Lee Jones (2015-07-30 02:21:39)
> > On Wed, 29 Jul 2015, Michael Turquette wrote:
> > > Quoting Lee Jones (2015-07-27 01:53:38)
> > > > On Mon, 27 Jul 2015, Maxime Ripard wrote:
> > > > 
> > > > > On Wed, Jul 22, 2015 at 02:04:13PM +0100, Lee Jones wrote:
> > > > > > These new API calls will firstly provide a mechanisms to tag a clock as
> > > > > > critical and secondly allow any knowledgeable driver to (un)gate clocks,
> > > > > > even if they are marked as critical.
> > > > > > 
> > > > > > Suggested-by: Maxime Ripard <maxime.ripard@free-electrons.com>
> > > > > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > > > > ---
> > > > > >  drivers/clk/clk.c            | 45 ++++++++++++++++++++++++++++++++++++++++++++
> > > > > >  include/linux/clk-provider.h |  2 ++
> > > > > >  include/linux/clk.h          | 30 +++++++++++++++++++++++++++++
> > > > > >  3 files changed, 77 insertions(+)
> > > > > > 
> > > > > > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> > > > > > index 61c3fc5..486b1da 100644
> > > > > > --- a/drivers/clk/clk.c
> > > > > > +++ b/drivers/clk/clk.c
> > > > > > @@ -46,6 +46,21 @@ static struct clk_core *clk_core_lookup(const char *name);
> > > > > >  
> > > > > >  /***    private data structures    ***/
> > > > > >  
> > > > > > +/**
> > > > > > + * struct critical -       Provides 'play' over critical clocks.  A clock can be
> > > > > > + *                 marked as critical, meaning that it should not be
> > > > > > + *                 disabled.  However, if a driver which is aware of the
> > > > > > + *                 critical behaviour wants to control it, it can do so
> > > > > > + *                 using clk_enable_critical() and clk_disable_critical().
> > > > > > + *
> > > > > > + * @enabled        Is clock critical?  Once set, doesn't change
> > > > > > + * @leave_on       Self explanatory.  Can be disabled by knowledgeable drivers
> > > > > > + */
> > > > > > +struct critical {
> > > > > > +   bool enabled;
> > > > > > +   bool leave_on;
> > > > > > +};
> > > > > > +
> > > > > >  struct clk_core {
> > > > > >     const char              *name;
> > > > > >     const struct clk_ops    *ops;
> > > > > > @@ -75,6 +90,7 @@ struct clk_core {
> > > > > >     struct dentry           *dentry;
> > > > > >  #endif
> > > > > >     struct kref             ref;
> > > > > > +   struct critical         critical;
> > > > > >  };
> > > > > >  
> > > > > >  struct clk {
> > > > > > @@ -995,6 +1011,10 @@ static void clk_core_disable(struct clk_core *clk)
> > > > > >     if (WARN_ON(clk->enable_count == 0))
> > > > > >             return;
> > > > > >  
> > > > > > +   /* Refuse to turn off a critical clock */
> > > > > > +   if (clk->enable_count == 1 && clk->critical.leave_on)
> > > > > > +           return;
> > > > > > +
> > > > > 
> > > > > I think it should be handled by a separate counting. Otherwise, if you
> > > > > have two users that marked the clock as critical, and then one of them
> > > > > disable it...
> > > > > 
> > > > > >     if (--clk->enable_count > 0)
> > > > > >             return;
> > > > > >  
> > > > > > @@ -1037,6 +1057,13 @@ void clk_disable(struct clk *clk)
> > > > > >  }
> > > > > >  EXPORT_SYMBOL_GPL(clk_disable);
> > > > > >  
> > > > > > +void clk_disable_critical(struct clk *clk)
> > > > > > +{
> > > > > > +   clk->core->critical.leave_on = false;
> > > > > 
> > > > > .. you just lost the fact that it was critical in the first place.
> > > > 
> > > > I thought about both of these points, which is why I came up with this
> > > > strategy.
> > > > 
> > > > Any device which uses the *_critical() API should a) have knowledge of
> > > > what happens when a particular critical clock is gated and b) have
> > > > thought about the consequences.
> > > 
> > > If this statement above is true then I fail to see the need for a new
> > > api. A driver which has a really great idea of when it is safe or unsafe
> > > to gate a clock should call clk_prepare_enable at probe and then only
> > > call clk_disable_unprepare once it is safe to do so.
> > > 
> > > The existing bookkeeping in the clock framework will do the rest.
> > 
> > I think you are viewing this particular API back-to-front.  The idea
> > is to mark all of the critical clocks at start-up by taking a
> > reference.  Then, if there are no knowledgable drivers who wish to
> > turn the clock off, the CCF will leave the clock ungated becuase the
> > reference count will always be >0.
> 
> Right. So I'll ask the same question here that I asked in the other
> patch: is there ever a case where a clock consumer driver would want to
> call clk_enable_critical? It seems to me that in you usage of it, that
> call would only ever be made by the core framework code (e.g. clk-conf.c
> or perhaps somewhere in drivers/clk/clk.c).

Yes, _after_ it has called clk_disable_critical(), when it has
finished fiddling with it.  clk_enable_critical() simply resets the
clock back to an enabled/critical state (how the knowledgeable driver
found it).

> > The clk_{disable,enable}_critical() calls are to be used by
> > knowledgable drivers to say "[disable] I know what I'm doing and it's
> > okay for this clock to be turned off" and "[enable] right I'm done
> > with this clock now, let's turn it back on and mark it back as
> > critical, so no one else can turn it off".
> 
> OK, so this almost answers my question above. You have a driver that may
> finish using a clock for a while (ie, rmmod knowledgeable_driver), and
> you want it (critically) enabled again. Is this a real use case? Who
> would come along and disable this clock later on? If the driver is to be
> re-loaded then I would suggest never unloading it in the first place.

This has nothing to do with modules.  I believe the knowledgeable
consumer should only gate the clock (steal a reference) when the clock
is actually gated.  The rest of the time the framework will have it
marked as critical "do not turn me off".

> (Oh and bear in mind when answering my question above that imbalanced
> clk_enable/clk_disable calls will not succeed thanks to the vaporware
> patch that I have in my local tree)

They won't be imbalanced, because clk_enable() would have been called
first during start-up (__set_critical_clocks()).

> If you have a second knowledgeable_driver_2 that shares that clock and
> can be trusted to manage it (critically) then I would need to see that
> example, as it doesn't feel like a real use to me.

Nor me, that's why this impementation doesn't handle that use-case,
however Maxime thinks it is one, so we can solve that with reference
counting.

> > To put things simply, the knowledgable driver will _not_ be enabling
> > the clock in the first place.  The first interaction it has with it is
> > to gate it.  Then, once it's done, it will enable it again and mark it
> > back up as critical.a
> 
> I like the first part. Makes sense and fills a real need. I am fully
> on-board with a provider-handoff-to-consumer solution. It is all the
> weird stuff that comes after it that I'm unsure of.

I don't think there should be hand-off.  I think the
{enable,disable}_critical() should "momentarily" take the last
reference, then put everything back as it found it when it's finished
disabling the clock.

> > Still confused ... let's go on another Q in one of your other emails
> > for another way of putting it.
> > 
> > > > I don't think we can use reference
> > > > counting, because we'd need as many critical clock owners as there are
> > > > critical clocks.  Cast your mind back to the reasons for this critical
> > > > clock API.  One of the most important intentions of this API is the
> > > > requirement mitigation for each of the critical clocks to have an owner
> > > > (driver).
> > > > 
> > > > With regards to your second point, that's what 'critical.enabled'
> > > > is for.  Take a look at clk_enable_critical().
> > > > 
> >
Lee Jones July 31, 2015, 9:02 a.m. | #12
On Thu, 30 Jul 2015, Michael Turquette wrote:

> Quoting Lee Jones (2015-07-30 04:17:47)
> > On Wed, 29 Jul 2015, Michael Turquette wrote:
> > 
> > > Hi Lee,
> > > 
> > > + linux-clk ml
> > > 
> > > Quoting Lee Jones (2015-07-22 06:04:13)
> > > > These new API calls will firstly provide a mechanisms to tag a clock as
> > > > critical and secondly allow any knowledgeable driver to (un)gate clocks,
> > > > even if they are marked as critical.
> > > > 
> > > > Suggested-by: Maxime Ripard <maxime.ripard@free-electrons.com>
> > > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > > ---
> > > >  drivers/clk/clk.c            | 45 ++++++++++++++++++++++++++++++++++++++++++++
> > > >  include/linux/clk-provider.h |  2 ++
> > > >  include/linux/clk.h          | 30 +++++++++++++++++++++++++++++
> > > >  3 files changed, 77 insertions(+)
> > > > 
> > > > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> > > > index 61c3fc5..486b1da 100644
> > > > --- a/drivers/clk/clk.c
> > > > +++ b/drivers/clk/clk.c
> > > > @@ -46,6 +46,21 @@ static struct clk_core *clk_core_lookup(const char *name);
> > > >  
> > > >  /***    private data structures    ***/
> > > >  
> > > > +/**
> > > > + * struct critical -   Provides 'play' over critical clocks.  A clock can be
> > > > + *                     marked as critical, meaning that it should not be
> > > > + *                     disabled.  However, if a driver which is aware of the
> > > > + *                     critical behaviour wants to control it, it can do so
> > > > + *                     using clk_enable_critical() and clk_disable_critical().
> > > > + *
> > > > + * @enabled    Is clock critical?  Once set, doesn't change
> > > > + * @leave_on   Self explanatory.  Can be disabled by knowledgeable drivers
> > > 
> > > Not self explanatory. I need this explained to me. What does leave_on
> > > do? Better yet, what would happen if leave_on did not exist?
> > > 
> > > > + */
> > > > +struct critical {
> > > > +       bool enabled;
> > > > +       bool leave_on;
> > > > +};
> > > > +
> > > >  struct clk_core {
> > > >         const char              *name;
> > > >         const struct clk_ops    *ops;
> > > > @@ -75,6 +90,7 @@ struct clk_core {
> > > >         struct dentry           *dentry;
> > > >  #endif
> > > >         struct kref             ref;
> > > > +       struct critical         critical;
> > > >  };
> > > >  
> > > >  struct clk {
> > > > @@ -995,6 +1011,10 @@ static void clk_core_disable(struct clk_core *clk)
> > > >         if (WARN_ON(clk->enable_count == 0))
> > > >                 return;
> > > >  
> > > > +       /* Refuse to turn off a critical clock */
> > > > +       if (clk->enable_count == 1 && clk->critical.leave_on)
> > > > +               return;
> > > 
> > > How do we get to this point? clk_enable_critical actually calls
> > > clk_enable, thus incrementing the enable_count. The only time that we
> > > could hit the above case is if,
> > > 
> > > a) there is an imbalance in clk_enable and clk_disable calls. If this is
> > > the case then the drivers need to be fixed. Or better yet some
> > > infrastructure to catch that, now that we have per-user struct clk
> > > cookies.
> > > 
> > > b) a driver knowingly calls clk_enable_critical(foo) and then regular,
> > > old clk_disable(foo). But why would a driver do that?
> > > 
> > > It might be that I am missing the point here, so please feel free to
> > > clue me in.
> > 
> > This check behaves in a very similar to the WARN() above.  It's more
> > of a fail-safe.  If all drivers are behaving properly, then it
> > shouldn't ever be true.  If they're not, it prevents an incorrectly
> > written driver from irrecoverably crippling the system.
> 
> Then this check should be replaced with a generic approach that refuses
> to honor imbalances anyways. Below are two patches that probably resolve
> the issue of badly behaving drivers that cause enable imbalances.

Your patch should make the requirement for this check moot, so it can
probably be removed.

> > As I said in the other mail.  We can do without these 3 new wrappers.
> > We _could_ just write a driver which only calls clk_enable() _after_
> > it calls clk_disable(), a kind of intentional unbalance and it would
> > do that same thing.
> 
> This naive approach will not work with per-user imbalance tracking.

Steady on.  I said we "_could_", that that I think it's a good idea.

I think it's a bad idea, which is why I wrote this set. ;)

> > However, what we're trying to do here is provide
> > a proper API, so we can see at first glance what the 'knowledgeable'
> > driver is trying to do and not have someone attempt to submit a 'fix'
> > which calls clk_enable() or something.
> 
> We'll need some type of api for sure for the handoff.

This set will not trigger your new checks.  The clocks will be in
perfect ballance becuase a reference will be taken at start-up.

Again:

start-up:
  clk_prepare_enable()

knowlegable_driver_probe:
  clk_get()

knowlegable_driver_gate_clk:
  clk_disable_critical()

knowlegable_driver_ungate_clk:
  clk_enable_critical()

knowlegable_driver_remove:
  clk_put()

> From 3599ed206da9ce770bfafcfd95cbb9a03ac44473 Mon Sep 17 00:00:00 2001
> From: Michael Turquette <mturquette@baylibre.com>
> Date: Wed, 29 Jul 2015 18:22:45 -0700
> Subject: [PATCH 1/2] clk: per-user clk prepare & enable ref counts
> 
> This patch adds prepare and enable reference counts for the per-user
> handles that clock consumers have for a clock node. This patch warns if
> an imbalance occurs while trying to disable or unprepare a clock and
> aborts, leaving the hardware unaffected.
> 
> Signed-off-by: Michael Turquette <mturquette@baylibre.com>
> ---
>  drivers/clk/clk.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> index 898052e..72feee9 100644
> --- a/drivers/clk/clk.c
> +++ b/drivers/clk/clk.c
> @@ -84,6 +84,8 @@ struct clk {
>  	unsigned long min_rate;
>  	unsigned long max_rate;
>  	struct hlist_node clks_node;
> +	unsigned int enable_count;
> +	unsigned int prepare_count;
>  };
>  
>  /***           locking             ***/
> @@ -600,6 +602,9 @@ void clk_unprepare(struct clk *clk)
>  		return;
>  
>  	clk_prepare_lock();
> +	if (WARN_ON(clk->prepare_count == 0))
> +		return;
> +	clk->prepare_count--;
>  	clk_core_unprepare(clk->core);
>  	clk_prepare_unlock();
>  }
> @@ -657,6 +662,7 @@ int clk_prepare(struct clk *clk)
>  		return 0;
>  
>  	clk_prepare_lock();
> +	clk->prepare_count++;
>  	ret = clk_core_prepare(clk->core);
>  	clk_prepare_unlock();
>  
> @@ -707,6 +713,9 @@ void clk_disable(struct clk *clk)
>  		return;
>  
>  	flags = clk_enable_lock();
> +	if (WARN_ON(clk->enable_count == 0))
> +		return;
> +	clk->enable_count--;
>  	clk_core_disable(clk->core);
>  	clk_enable_unlock(flags);
>  }
> @@ -769,6 +778,7 @@ int clk_enable(struct clk *clk)
>  		return 0;
>  
>  	flags = clk_enable_lock();
> +	clk->enable_count++;
>  	ret = clk_core_enable(clk->core);
>  	clk_enable_unlock(flags);
>

Patch

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 61c3fc5..486b1da 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -46,6 +46,21 @@  static struct clk_core *clk_core_lookup(const char *name);
 
 /***    private data structures    ***/
 
+/**
+ * struct critical -	Provides 'play' over critical clocks.  A clock can be
+ *			marked as critical, meaning that it should not be
+ *			disabled.  However, if a driver which is aware of the
+ *			critical behaviour wants to control it, it can do so
+ *			using clk_enable_critical() and clk_disable_critical().
+ *
+ * @enabled	Is clock critical?  Once set, doesn't change
+ * @leave_on	Self explanatory.  Can be disabled by knowledgeable drivers
+ */
+struct critical {
+	bool enabled;
+	bool leave_on;
+};
+
 struct clk_core {
 	const char		*name;
 	const struct clk_ops	*ops;
@@ -75,6 +90,7 @@  struct clk_core {
 	struct dentry		*dentry;
 #endif
 	struct kref		ref;
+	struct critical		critical;
 };
 
 struct clk {
@@ -995,6 +1011,10 @@  static void clk_core_disable(struct clk_core *clk)
 	if (WARN_ON(clk->enable_count == 0))
 		return;
 
+	/* Refuse to turn off a critical clock */
+	if (clk->enable_count == 1 && clk->critical.leave_on)
+		return;
+
 	if (--clk->enable_count > 0)
 		return;
 
@@ -1037,6 +1057,13 @@  void clk_disable(struct clk *clk)
 }
 EXPORT_SYMBOL_GPL(clk_disable);
 
+void clk_disable_critical(struct clk *clk)
+{
+	clk->core->critical.leave_on = false;
+	clk_disable(clk);
+}
+EXPORT_SYMBOL_GPL(clk_disable_critical);
+
 static int clk_core_enable(struct clk_core *clk)
 {
 	int ret = 0;
@@ -1100,6 +1127,15 @@  int clk_enable(struct clk *clk)
 }
 EXPORT_SYMBOL_GPL(clk_enable);
 
+int clk_enable_critical(struct clk *clk)
+{
+	if (clk->core->critical.enabled)
+		clk->core->critical.leave_on = true;
+
+	return clk_enable(clk);
+}
+EXPORT_SYMBOL_GPL(clk_enable_critical);
+
 static unsigned long clk_core_round_rate_nolock(struct clk_core *clk,
 						unsigned long rate,
 						unsigned long min_rate,
@@ -2482,6 +2518,15 @@  fail_out:
 }
 EXPORT_SYMBOL_GPL(clk_register);
 
+void clk_init_critical(struct clk *clk)
+{
+	struct critical *critical = &clk->core->critical;
+
+	critical->enabled = true;
+	critical->leave_on = true;
+}
+EXPORT_SYMBOL_GPL(clk_init_critical);
+
 /*
  * Free memory allocated for a clock.
  * Caller must hold prepare_lock.
diff --git a/include/linux/clk-provider.h b/include/linux/clk-provider.h
index 5591ea7..15ef8c9 100644
--- a/include/linux/clk-provider.h
+++ b/include/linux/clk-provider.h
@@ -563,6 +563,8 @@  struct clk *devm_clk_register(struct device *dev, struct clk_hw *hw);
 void clk_unregister(struct clk *clk);
 void devm_clk_unregister(struct device *dev, struct clk *clk);
 
+void clk_init_critical(struct clk *clk);
+
 /* helper functions */
 const char *__clk_get_name(struct clk *clk);
 struct clk_hw *__clk_get_hw(struct clk *clk);
diff --git a/include/linux/clk.h b/include/linux/clk.h
index 8381bbf..9807f3b 100644
--- a/include/linux/clk.h
+++ b/include/linux/clk.h
@@ -231,6 +231,19 @@  struct clk *devm_clk_get(struct device *dev, const char *id);
 int clk_enable(struct clk *clk);
 
 /**
+ * clk_enable_critical - inform the system when the clock source should be
+ * 			 running, even if clock is critical.
+ * @clk: clock source
+ *
+ * If the clock can not be enabled/disabled, this should return success.
+ *
+ * May be called from atomic contexts.
+ *
+ * Returns success (0) or negative errno.
+ */
+int clk_enable_critical(struct clk *clk);
+
+/**
  * clk_disable - inform the system when the clock source is no longer required.
  * @clk: clock source
  *
@@ -247,6 +260,23 @@  int clk_enable(struct clk *clk);
 void clk_disable(struct clk *clk);
 
 /**
+ * clk_disable_critical - inform the system when the clock source is no
+ * 			  longer required, even if clock is critical.
+ * @clk: clock source
+ *
+ * Inform the system that a clock source is no longer required by
+ * a driver and may be shut down.
+ *
+ * May be called from atomic contexts.
+ *
+ * Implementation detail: if the clock source is shared between
+ * multiple drivers, clk_enable_critical() calls must be balanced
+ * by the same number of clk_disable_critical() calls for the clock
+ * source to be disabled.
+ */
+void clk_disable_critical(struct clk *clk);
+
+/**
  * clk_get_rate - obtain the current clock rate (in Hz) for a clock source.
  *		  This is only valid once the clock source has been enabled.
  * @clk: clock source