diff mbox

[2/2] tick: SHUTDOWN event-dev if no events are required for KTIME_MAX

Message ID 0fe81b08a8aa32912ce7b4a015fdc09b6e3830e3.1399623699.git.viresh.kumar@linaro.org
State New
Headers show

Commit Message

Viresh Kumar May 9, 2014, 8:40 a.m. UTC
When expires is set to KTIME_MAX in tick_program_event(), we are sure that there
are no events enqueued for a very long time and so there is no point keeping
event device running. We will get interrupted without any work to do many a
times, for example when timer's counter overflows.

So, its better to SHUTDOWN the event device then and restart it ones we get a
request for next event. For implementing this a new field 'last_mode' is added
to 'struct clock_event_device' to keep track of last mode used.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 include/linux/clockchips.h |  1 +
 kernel/time/tick-oneshot.c | 14 +++++++++++++-
 2 files changed, 14 insertions(+), 1 deletion(-)

Comments

Thomas Gleixner May 9, 2014, 8:09 p.m. UTC | #1
On Fri, 9 May 2014, Viresh Kumar wrote:
> When expires is set to KTIME_MAX in tick_program_event(), we are sure that there
> are no events enqueued for a very long time and so there is no point keeping
> event device running. We will get interrupted without any work to do many a
> times, for example when timer's counter overflows.
> 
> So, its better to SHUTDOWN the event device then and restart it ones we get a
> request for next event. For implementing this a new field 'last_mode' is added
> to 'struct clock_event_device' to keep track of last mode used.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  include/linux/clockchips.h |  1 +
>  kernel/time/tick-oneshot.c | 14 +++++++++++++-
>  2 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h
> index 2e4cb67..36a4ca6 100644
> --- a/include/linux/clockchips.h
> +++ b/include/linux/clockchips.h
> @@ -105,6 +105,7 @@ struct clock_event_device {
>  	u32			mult;
>  	u32			shift;
>  	enum clock_event_mode	mode;
> +	enum clock_event_mode	last_mode;
>  	unsigned int		features;
>  	unsigned long		retries;
>  
> diff --git a/kernel/time/tick-oneshot.c b/kernel/time/tick-oneshot.c
> index 8241090..9543815 100644
> --- a/kernel/time/tick-oneshot.c
> +++ b/kernel/time/tick-oneshot.c
> @@ -27,8 +27,20 @@
>  int tick_program_event(ktime_t expires, int force)
>  {
>  	struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);
> +	int ret = 0;
>  
> -	return clockevents_program_event(dev, expires, force);
> +	/* Shut down event device if it is not required for long */
> +	if (unlikely(expires.tv64 == KTIME_MAX)) {
> +		dev->last_mode = dev->mode;
> +		clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);

No, we are not doing a state change behind the scene and a magic
restore. And I know at least one way to make this fall flat on its
nose, because you are blindly doing dev->last_mode = dev->mode on
every invocation. So if that gets called twice without a restore in
between, the device is going to be in shutdown mode forever.

It's moronic anyway as the clock event device has the state
CLOCK_EVT_MODE_ONESHOT if its active, otherwise we would not be in
that code path.

But what's even worse: you just define that it's the best way for all
implementations of clockevents to handle this.

It's definitley NOT. Some startup/shutdown implementations are rather
complex, so that would burden them with rather big latencies and some
of them will even outright break.

There is a world outside of YOUR favourite subarch.

We do not hijack stuff just because we can and it works on some
machines. We think about it proper.

I prevented that the GPIO folks hijacked irq_startup in a disgusting
way for solving the GPIO issues for the very same reason. So in the
end we added a new OPT-In interface which solved the problem without
breaking any existing code. And it made the code simpler and cleaner
in the very end.

If we hijack some existing facility then we audit ALL implementation
sites and document that we did so and why we are sure that it won't
break stuff. It still might break some oddball case, but that's not a
big issue.

In the clockevents case we do not even need a new interface, but this
must be made OPT-in and not a flagday change for all users.

And no we are not going to abuse a feature flag for this. It's not a
feature.

I'd rather have a new state for this, simply because it is NOT
shutdown. It is in ONESHOT_STOPPED state. Whether a specific
implementation will use the SHUTDOWN code for it or not does not
matter.

That requires a full tree update of all implementations because most
of them have a switch case for the mode. And adding a state will cause
all of them which do not have a default clause to omit warnings
because the mode is an enum for this very reason.

And even if all of them would have a default clause, you'd need a way
to OPT-In, because some of the defaults have a BUG() in there. Again,
no feature flag exclusion. See above.

So the right thing to do this is:

1A) Change the prototype of the set_mode callback to return int and
    fixup all users. Either add the missing default clause or remove
    the existing BUG()/ pr_err()/whatever handling in the existing
    default clause and return a UNIQUE error code.

    I know I should have done that from the very beginning, but in
    hindsight one could have done everything better.

    coccinelle is your friend (if you need help ask me or Julia
    Lawall). But it's going to be quite some manual work on top.

1B) Audit the changes and look at the implementations. If the patch is
    just adding the default clause or replacing some BUG/printk error
    handling goto #1C

    If it looks like it needs some preparatory care or if you find
    bugs in a particular implementation, roll back the changes and do
    the bug fixes and preparatory changes first as separate patches.

    Go back to #1A until the coccinelle patches are just squeaky
    clean.

1C) Add proper error handling for the various modes to the set_mode
    callback call sites, only two AFAIK.

2A) Add a new mode ONESHOT_STOPPED. That's safe now as all error
    handling will be done in the core code.

2B) Implement the ONESHOT_STOPPED logic and make sure all of the core
    code is aware of it.

And don't tell me it can't be done. I've done it I don't know how many
times with interrupts, timers, locking and some more. It's hard work,
but it's valuable and way better than the brainless "make it work for
me" hackery.

You asked me yesterday about your other hrtimer patches. You know why
I do not come around to review them? Because I have found way too much
half baken stuff in your patches I reviewed so far. That forces me to
go through all of them with a fine comb and I simply do not scale.

Alone reviewing this patch took me several couple of hours, because I
had to think through the implications and stare into the code. And you
know why? Because, first of all I do not trust your patches and
secondly your changelogs (especially the one of the 1/2 patch) told me
clearly, that this is "works for me" hackery.

So YOU forced me to spend time on looking at the consequences all over
the place instead of YOU had looked in the first place and figured it
out yourself.

Did you look at ALL implementations of clock events when you made that
change?  Definitely NOT.

I did. And found quite some of them which are going to be hurt. I also
found some of them which are buggy.

Just get it. This is CORE code and it affects ALL of its users. You
can play that "hack it into submission game" with a random driver, i.e
at the end of the callchain, but core code is very very differrent.

There is always the risk to break something when you work on core code
and nobody will rip your head off, if you break something because you
did not notice the random oddity of some use site.

But breaking stuff wholesale by just not thinking about it carefully
won't earn you any brownie points.

Vs. your other pending patches, I have no idea whether I have the time
and the stomach to go through them before I vanish to Japan next
weekend.

If there are urgent bugfixes, which are obvious or proper thought
through and explained, please resend them.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Thomas Gleixner May 10, 2014, 11:01 a.m. UTC | #2
On Fri, 9 May 2014, Thomas Gleixner wrote:
> On Fri, 9 May 2014, Viresh Kumar wrote:
> So the right thing to do this is:
> 
> 1A) Change the prototype of the set_mode callback to return int and
>     fixup all users. Either add the missing default clause or remove
>     the existing BUG()/ pr_err()/whatever handling in the existing
>     default clause and return a UNIQUE error code.
> 
>     I know I should have done that from the very beginning, but in
>     hindsight one could have done everything better.
> 
>     coccinelle is your friend (if you need help ask me or Julia
>     Lawall). But it's going to be quite some manual work on top.

There is even a better way to do that:

1) Create a new callback set_state() which has an
   int return value.

2) Make the callsites do

   if (dev->set_state) {
      ret = dev->set_state();
      handle_return_value();
   } else
      dev->set_mode();

3) Convert implementations one by one to use the new callback

4) Remove the set_mode callback

5) Implement new features.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Viresh Kumar May 12, 2014, 5:35 a.m. UTC | #3
Thanks for blasting me off, it might be very helpful going forward :)

On 10 May 2014 01:39, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Fri, 9 May 2014, Viresh Kumar wrote:

>> diff --git a/kernel/time/tick-oneshot.c b/kernel/time/tick-oneshot.c

>>  int tick_program_event(ktime_t expires, int force)
>>  {
>>       struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);
>> +     int ret = 0;
>>
>> -     return clockevents_program_event(dev, expires, force);
>> +     /* Shut down event device if it is not required for long */
>> +     if (unlikely(expires.tv64 == KTIME_MAX)) {
>> +             dev->last_mode = dev->mode;
>> +             clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
>
> No, we are not doing a state change behind the scene and a magic
> restore. And I know at least one way to make this fall flat on its
> nose, because you are blindly doing dev->last_mode = dev->mode on
> every invocation. So if that gets called twice without a restore in
> between, the device is going to be in shutdown mode forever.

During my tests I had this as well:

        if (unlikely(expires.tv64 == KTIME_MAX)) {
+               WARN_ON(dev->mode == CLOCK_EVT_MODE_SHUTDOWN);

But it never got to it and I thought it might never happen, so removed it.
But yes, there should be some check here for that.

> It's moronic anyway as the clock event device has the state
> CLOCK_EVT_MODE_ONESHOT if its active, otherwise we would not be in
> that code path.

Yeah, Missed that earlier.

> But what's even worse: you just define that it's the best way for all
> implementations of clockevents to handle this.
>
> It's definitley NOT. Some startup/shutdown implementations are rather
> complex, so that would burden them with rather big latencies and some
> of them will even outright break.
>
> There is a world outside of YOUR favourite subarch.

:)

> We do not hijack stuff just because we can and it works on some
> machines. We think about it proper.

Agreed..

> If we hijack some existing facility then we audit ALL implementation
> sites and document that we did so and why we are sure that it won't
> break stuff. It still might break some oddball case, but that's not a
> big issue.

Because SHUTDOWN was an existing old API, I thought it will work
without breaking stuff. Yes, I must have done some auditing or made
this an RFC series atleast to get the discussion going forward..

> In the clockevents case we do not even need a new interface, but this
> must be made OPT-in and not a flagday change for all users.
>
> And no we are not going to abuse a feature flag for this. It's not a
> feature.

Okay.

> I'd rather have a new state for this, simply because it is NOT
> shutdown. It is in ONESHOT_STOPPED state. Whether a specific
> implementation will use the SHUTDOWN code for it or not does not
> matter.

Correct.

> That requires a full tree update of all implementations because most
> of them have a switch case for the mode. And adding a state will cause
> all of them which do not have a default clause to omit warnings
> because the mode is an enum for this very reason.
>
> And even if all of them would have a default clause, you'd need a way
> to OPT-In, because some of the defaults have a BUG() in there. Again,
> no feature flag exclusion. See above.

Okay..

> So the right thing to do this is:
>
> 1A) Change the prototype of the set_mode callback to return int and
>     fixup all users. Either add the missing default clause or remove
>     the existing BUG()/ pr_err()/whatever handling in the existing
>     default clause and return a UNIQUE error code.
>
>     I know I should have done that from the very beginning, but in
>     hindsight one could have done everything better.
>
>     coccinelle is your friend (if you need help ask me or Julia
>     Lawall). But it's going to be quite some manual work on top.

Sure.

> 1B) Audit the changes and look at the implementations. If the patch is
>     just adding the default clause or replacing some BUG/printk error
>     handling goto #1C
>
>     If it looks like it needs some preparatory care or if you find
>     bugs in a particular implementation, roll back the changes and do
>     the bug fixes and preparatory changes first as separate patches.
>
>     Go back to #1A until the coccinelle patches are just squeaky
>     clean.
>
> 1C) Add proper error handling for the various modes to the set_mode
>     callback call sites, only two AFAIK.
>
> 2A) Add a new mode ONESHOT_STOPPED. That's safe now as all error
>     handling will be done in the core code.
>
> 2B) Implement the ONESHOT_STOPPED logic and make sure all of the core
>     code is aware of it.

Okay..

> And don't tell me it can't be done.

No way :)

> I've done it I don't know how many
> times with interrupts, timers, locking and some more. It's hard work,
> but it's valuable and way better than the brainless "make it work for
> me" hackery.

I didn't mean that actually. I just pin pointed how badly things can go
with an example of ARM's platform. But I never meant that it must get
in as it "works for me" :) .. But yes, you got that impression and I need
to make sure it doesn't happen again.

> You asked me yesterday about your other hrtimer patches. You know why
> I do not come around to review them? Because I have found way too much
> half baken stuff in your patches I reviewed so far.

Hmm, that's bad. I thought most of them wouldn't make any difference
functionally, and so wouldn't break anything. Sorry about that.

> That forces me to
> go through all of them with a fine comb and I simply do not scale.
>
> Alone reviewing this patch took me several couple of hours, because I
> had to think through the implications and stare into the code. And you
> know why? Because, first of all I do not trust your patches and

I will try my best to come over that :)

> secondly your changelogs (especially the one of the 1/2 patch) told me
> clearly, that this is "works for me" hackery.

I really didn't meant that :(

> So YOU forced me to spend time on looking at the consequences all over
> the place instead of YOU had looked in the first place and figured it
> out yourself.
>
> Did you look at ALL implementations of clock events when you made that
> change?  Definitely NOT.

No, I didn't ... Yeah, I should have handled it in a better way.. With some
more study and work..

> I did. And found quite some of them which are going to be hurt. I also
> found some of them which are buggy.
>
> Just get it. This is CORE code and it affects ALL of its users. You
> can play that "hack it into submission game" with a random driver, i.e
> at the end of the callchain, but core code is very very differrent.
>
> There is always the risk to break something when you work on core code
> and nobody will rip your head off, if you break something because you
> did not notice the random oddity of some use site.
>
> But breaking stuff wholesale by just not thinking about it carefully
> won't earn you any brownie points.

Agreed.

> Vs. your other pending patches, I have no idea whether I have the time
> and the stomach to go through them before I vanish to Japan next
> weekend.
>
> If there are urgent bugfixes, which are obvious or proper thought
> through and explained, please resend them.

Only one as far as I remember and I already got a go ahead from you
on that, will resend it.

Let me get your next mail in here as well:

> There is even a better way to do that:
>
> 1) Create a new callback set_state() which has an
>    int return value.
>
> 2) Make the callsites do
>
>    if (dev->set_state) {
>       ret = dev->set_state();
>       handle_return_value();
>    } else
>       dev->set_mode();
>
> 3) Convert implementations one by one to use the new callback
>
> 4) Remove the set_mode callback
>
> 5) Implement new features.

Yeah, this is obviously going to be far more easy as there is less risk
of breaking things here :)

Again, sorry for the noise (Atleast the issue was real and important). I
wanted to do it in a better way but thought the existing API should work
smoothly..

I will do my best to earn your trust :)

Thanks..
Viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Viresh Kumar May 12, 2014, 7:07 a.m. UTC | #4
On 10 May 2014 16:31, Thomas Gleixner <tglx@linutronix.de> wrote:
> There is even a better way to do that:
>
> 1) Create a new callback set_state() which has an
>    int return value.
>
> 2) Make the callsites do
>
>    if (dev->set_state) {
>       ret = dev->set_state();
>       handle_return_value();
>    } else
>       dev->set_mode();

Do you want me to touch clock_event_mode as well?
Otherwise we will pass mode into a function setting state..

Or we can do s/mode/state after all the work suggested by you
is done ..

Or leave as is..

1 & 2, should be just 1-2 patches, I will try to send them as soon
as possible. Once these get your nod, will start working on fixing
all the clockevent drivers.

--
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Thomas Gleixner May 12, 2014, 7:39 a.m. UTC | #5
On Mon, 12 May 2014, Viresh Kumar wrote:

> On 10 May 2014 16:31, Thomas Gleixner <tglx@linutronix.de> wrote:
> > There is even a better way to do that:
> >
> > 1) Create a new callback set_state() which has an
> >    int return value.
> >
> > 2) Make the callsites do
> >
> >    if (dev->set_state) {
> >       ret = dev->set_state();
> >       handle_return_value();
> >    } else
> >       dev->set_mode();
> 
> Do you want me to touch clock_event_mode as well?
> Otherwise we will pass mode into a function setting state..
> 
> Or we can do s/mode/state after all the work suggested by you
> is done ..
> 
> Or leave as is..

You can name the new callback set_dev_mode() :)
 
set_state() was just pulled out of the air for illustration.

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
diff mbox

Patch

diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h
index 2e4cb67..36a4ca6 100644
--- a/include/linux/clockchips.h
+++ b/include/linux/clockchips.h
@@ -105,6 +105,7 @@  struct clock_event_device {
 	u32			mult;
 	u32			shift;
 	enum clock_event_mode	mode;
+	enum clock_event_mode	last_mode;
 	unsigned int		features;
 	unsigned long		retries;
 
diff --git a/kernel/time/tick-oneshot.c b/kernel/time/tick-oneshot.c
index 8241090..9543815 100644
--- a/kernel/time/tick-oneshot.c
+++ b/kernel/time/tick-oneshot.c
@@ -27,8 +27,20 @@ 
 int tick_program_event(ktime_t expires, int force)
 {
 	struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev);
+	int ret = 0;
 
-	return clockevents_program_event(dev, expires, force);
+	/* Shut down event device if it is not required for long */
+	if (unlikely(expires.tv64 == KTIME_MAX)) {
+		dev->last_mode = dev->mode;
+		clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
+	} else {
+		/* restore mode when restarting event dev */
+		if (unlikely(dev->mode == CLOCK_EVT_MODE_SHUTDOWN))
+			clockevents_set_mode(dev, dev->last_mode);
+		ret = clockevents_program_event(dev, expires, force);
+	}
+
+	return ret;
 }
 
 /**