[1/2] time: Fix NTP adjustment mult overflow.

Message ID 1412838271-11175-1-git-send-email-pang.xunlei@linaro.org
State Superseded

Commit Message

pang.xunlei Oct. 9, 2014, 7:04 a.m. UTC
The mult member of struct clocksource should always be a large u32 number when calculated through
__clocksource_updatefreq_scale(). The value of (cs->mult+cs->maxadj) may come very close to
0xFFFFFFFF. For instance, with a 555MHz oscillator: cs->mult is 0xE6A17102, cs->maxadj is 0x195E8EFD,
and cs->mult+cs->maxadj is 0xFFFFFFFF. Such oscillators probably exist on some processors like
MIPS, which use the CP0 compare/count CPU clock as the clock source.
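
The figures in this example can be reproduced with a short standalone sketch. It assumes a shift of 31 and a maximum adjustment margin of 11% of mult, neither of which is stated in the changelog; calc_mult() and calc_maxadj() are hypothetical helpers for illustration, not kernel functions.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: scaled multiplier for converting cycles to ns,
 * rounded to nearest. The shift value passed in is an assumption.
 */
static uint32_t calc_mult(uint64_t freq_hz, unsigned int shift)
{
	uint64_t tmp = NSEC_PER_SEC << shift;

	tmp += freq_hz / 2;
	return (uint32_t)(tmp / freq_hz);
}

/* Hypothetical helper: adjustment margin, assumed to be 11% of mult. */
static uint32_t calc_maxadj(uint32_t mult)
{
	return (uint32_t)(((uint64_t)mult * 11) / 100);
}
```

Under those assumptions, calc_mult(555000000, 31) gives 0xE6A17102 and calc_maxadj(0xE6A17102) gives 0x195E8EFD, whose sum is exactly 0xFFFFFFFF.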

A clocksource might encounter large frequency adjustments due to hardware instability, ambient
temperature, software deviation, NTP algorithm accuracy, etc. When NTP slews the clock, the kernel goes
through update_wall_time()->...->timekeeping_apply_adjustment(): tk->tkr.mult += mult_adj;
Unfortunately, tk->tkr.mult may overflow after this operation, though such cases are very unlikely
to happen in practice.
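
To make the failure concrete, here is a minimal sketch of the unguarded update on a 32-bit multiplier. The types (unsigned 32-bit mult, signed 32-bit mult_adj) are assumptions for illustration, and apply_unchecked() is a hypothetical name.

```c
#include <stdint.h>

/* Assumed types: mult is an unsigned 32-bit value, mult_adj a signed one. */
static uint32_t apply_unchecked(uint32_t mult, int32_t mult_adj)
{
	/* Unsigned arithmetic wraps modulo 2^32 instead of saturating. */
	return mult + (uint32_t)mult_adj;
}
```

With these definitions, apply_unchecked(0xFFFFFFFF, 1) yields 0 and apply_unchecked(0xFFFFFFF0, 0x20) yields 0x10, so a multiplier near the top of the range collapses to a tiny value after a small positive adjustment.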

This patch avoids mult overflow by checking for the overflow case before adding mult_adj to mult, and
also emits a WARNING when such a case is detected.

Signed-off-by: pang.xunlei <pang.xunlei@linaro.org>
---
 kernel/time/timekeeping.c |    6 ++++++
 1 file changed, 6 insertions(+)

Comments

Thomas Gleixner Oct. 21, 2014, 10:19 a.m. UTC | #1
On Thu, 9 Oct 2014, pang.xunlei wrote:

First of all: Please use proper line breaks in the changelog.

> The mult member of struct clocksource should always be a large u32
> number when calculated through __clocksource_updatefreq_scale(). The
> value of (cs->mult+cs->maxadj) may have a chance to reach very near
> 0xFFFFFFFF.

And what's the actual problem with reaching a value near 0xFFFFFFFF?

> For instance, 555MHz oscillator: cs->mult is 0xE6A17102,
> cs->maxadj is 0x195E8EFD, cs->mult+cs->maxadj is 0xFFFFFFFF. Such
> oscillators would probably exist on some processors like MIPS which
> use CP0 compare/count CPU clock as the clock source.

Again, what's the problem?

> Clocksource might encounter large frequency adjustment due to the
> hardware instability, environment temperature, software deviation,
> NTP algorithm accuracy, etc. When NTP slews the clock, kernel goes
> through update_wall_time()->...->timekeeping_apply_adjustment():
> tk->tkr.mult += mult_adj; Unfortunately, tk->tkr.mult may overflow
> after this operation, though such cases are next to impossible to
> happen in practice.

So you are adding this just for correctness reasons, not because you
observed the problem in practice?

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
John Stultz Oct. 21, 2014, 4:11 p.m. UTC | #2
On Tue, Oct 21, 2014 at 3:19 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Thu, 9 Oct 2014, pang.xunlei wrote:
>
> First of all: Please use proper line breaks in the changelog.
>
>> The mult member of struct clocksource should always be a large u32
>> number when calculated through __clocksource_updatefreq_scale(). The
>> value of (cs->mult+cs->maxadj) may have a chance to reach very near
>> 0xFFFFFFFF.
>
> And what's the actual problem with reaching a value near 0xFFFFFFFF?
>
>> For instance, 555MHz oscillator: cs->mult is 0xE6A17102,
>> cs->maxadj is 0x195E8EFD, cs->mult+cs->maxadj is 0xFFFFFFFF. Such
>> oscillators would probably exist on some processors like MIPS which
>> use CP0 compare/count CPU clock as the clock source.
>
> Again, what's the problem?

So the problem is that with an adjustment the mult value might
overflow, becoming very small, basically causing time to stop
increasing. This is mentioned below, but I agree we're burying the
headline a bit.
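
As a rough illustration of why a wrapped mult makes time nearly stop, the sketch below applies the usual clocksource conversion, ns = (cycles * mult) >> shift. The shift of 31 and the sample values are assumptions taken from the 555MHz example, and cyc_to_ns() is a hypothetical stand-in rather than the kernel's own helper.

```c
#include <stdint.h>

/* Hypothetical stand-in for the cycles-to-nanoseconds conversion. */
static uint64_t cyc_to_ns(uint64_t cycles, uint32_t mult, unsigned int shift)
{
	return (cycles * mult) >> shift;
}
```

One second's worth of cycles at 555MHz converts to 999999999 ns with mult = 0xE6A17102, but to only 4 ns if mult has wrapped around to 0x10.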


>> Clocksource might encounter large frequency adjustment due to the
>> hardware instability, environment temperature, software deviation,
>> NTP algorithm accuracy, etc. When NTP slews the clock, kernel goes
>> through update_wall_time()->...->timekeeping_apply_adjustment():
>> tk->tkr.mult += mult_adj; Unfortunately, tk->tkr.mult may overflow
>> after this operation, though such cases are next to impossible to
>> happen in practice.
>
> So you are adding this just for correctness reasons, not because you
> observed the problem in practice?

This is my understanding.

I'll work with Xunlei to make further clarifications to the changelog
to make this more explicit.

Thanks for your feedback!
-john
Thomas Gleixner Oct. 21, 2014, 7:35 p.m. UTC | #3
On Tue, 21 Oct 2014, John Stultz wrote:
> On Tue, Oct 21, 2014 at 3:19 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > So you are adding this just for correctness reasons, not because you
> > observed the problem in practice?
> 
> This is my understanding.
> 
> I'll work with Xunlei to make further clarifications to the changelog
> to make this more explicit.

Ok. I have no objections to the patch itself, just the changelog made
my brain go into spiral mode ...

Thanks,

	tglx
John Stultz Oct. 24, 2014, 4:22 a.m. UTC | #4
On Wed, Oct 22, 2014 at 5:37 AM, Xunlei Pang <pang.xunlei@linaro.org> wrote:
> The mult member of struct clocksource should always be a large u32 number
> when calculated through
> __clocksource_updatefreq_scale(). The value of (cs->mult+cs->maxadj) may
> have a chance to reach very
> near 0xFFFFFFFF, so it may overflow when doing NTP positive adjustment, see
> the following detail:
> When NTP slews the clock, kernel goes through
> update_wall_time()->...->timekeeping_apply_adjustment():
> tk->tkr.mult += mult_adj;
> Unfortunately, tk->tkr.mult may overflow after this operation.
>
>
> This patch avoids mult overflow by judging the overflow case before adding
> mult_adj to mult, also adds the
> WARNING message when capturing such case.
>
> Signed-off-by: pang.xunlei <pang.xunlei@linaro.org>

I reworded this a bit further, but it's in my queue for 3.19.

thanks
-john

Patch

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index ec1791f..cad61b3 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1332,6 +1332,12 @@  static __always_inline void timekeeping_apply_adjustment(struct timekeeper *tk,
 	 *
 	 * XXX - TODO: Doc ntp_error calculation.
 	 */
+	if (tk->tkr.mult + mult_adj < mult_adj) {
+		/* NTP adjustment caused clocksource mult overflow */
+		WARN_ON_ONCE(1);
+		return;
+	}
+
 	tk->tkr.mult += mult_adj;
 	tk->xtime_interval += interval;
 	tk->tkr.xtime_nsec -= offset;
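
One detail worth checking in the condition added by this hunk is signedness. If mult_adj is a signed 32-bit value (an assumption, since the hunk does not show its declaration), a negative adjustment converts to a large unsigned number, the sum wraps as part of an ordinary subtraction, and the comparison alone would also trigger for normal negative adjustments. The standalone sketch below, using hypothetical names, restricts the check to positive adjustments.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative only: returns false and leaves *mult untouched when a
 * positive adjustment would wrap past UINT32_MAX. Negative adjustments
 * are applied without the overflow check.
 */
static bool apply_adjustment_checked(uint32_t *mult, int32_t mult_adj)
{
	if (mult_adj > 0 && *mult + (uint32_t)mult_adj < (uint32_t)mult_adj)
		return false;

	*mult += (uint32_t)mult_adj;
	return true;
}
```

A caller could emit its warning and skip the remaining bookkeeping whenever the function returns false, which mirrors the early return in the hunk.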