Message ID | 20171108160056.3042976-1-arnd@arndb.de |
---|---|
State | Superseded |
Headers | show |
Series | pstore: use ktime_get_real_fast_ns() instead of __getnstimeofday() | expand |
On Wed, Nov 8, 2017 at 8:00 AM, Arnd Bergmann <arnd@arndb.de> wrote: > I noticed that __getnstimeofday() is a rather odd interface, with > a number of quirks: > > - The caller may come from NMI context, but the implementation is not NMI safe > - The calling conventions are different from any other timekeeping functions > - The naming doesn't fit into the 'ktime_get_*()' scheme > > This addresses the above issues by using a completely different > method to get the time: ktime_get_real_fast_ns() is NMI safe and > doesn't print a warning when called with the timekeeping suspended, > and we can easily transform the result into a timespec structure. Since > ktime_get_real_fast_ns() was not exported to modules, I also add the > export. > > The behavior for the suspended case changes, we now return the time > if we can find it out rather than setting it to zero. We could still > change that but this would require exporting the 'timekeeping_suspended' > to modules. > > I'm not trying to address y2038-safety at this point, but plan to > do that later with a more complex patch. > > Cc: Kees Cook <keescook@chromium.org> > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > --- > It would be helpful to get this merged into v4.15-rc1, since I have > some other cleanups that depend on this. Please have a look to see if > this makes sense to you. As long as this is sane to call when timekeeping is not running, I'm happy to take the patch. Would you prefer I take this via pstore or do you have another tree it should go through? Acked-by: Kees Cook <keescook@chromium.org> -Kees > --- > fs/pstore/platform.c | 5 +---- > kernel/time/timekeeping.c | 1 + > 2 files changed, 2 insertions(+), 4 deletions(-) > > diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c > index ec7199e859d2..086e491faf04 100644 > --- a/fs/pstore/platform.c > +++ b/fs/pstore/platform.c > @@ -482,10 +482,7 @@ void pstore_record_init(struct pstore_record *record, > record->psi = psinfo; > > /* Report zeroed timestamp if called before timekeeping has resumed. */ > - if (__getnstimeofday(&record->time)) { > - record->time.tv_sec = 0; > - record->time.tv_nsec = 0; > - } > + record->time = ns_to_timespec(ktime_get_real_fast_ns()); > } > > /* > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c > index 353f7bd1eeb0..198afa78bf69 100644 > --- a/kernel/time/timekeeping.c > +++ b/kernel/time/timekeeping.c > @@ -528,6 +528,7 @@ u64 ktime_get_real_fast_ns(void) > { > return __ktime_get_real_fast_ns(&tk_fast_mono); > } > +EXPORT_SYMBOL_GPL(ktime_get_real_fast_ns); > > /** > * halt_fast_timekeeper - Prevent fast timekeeper from accessing clocksource. > -- > 2.9.0 > -- Kees Cook Pixel Security
On Thu, Nov 9, 2017 at 1:55 AM, Kees Cook <keescook@chromium.org> wrote: > On Wed, Nov 8, 2017 at 8:00 AM, Arnd Bergmann <arnd@arndb.de> wrote: >> I noticed that __getnstimeofday() is a rather odd interface, with >> a number of quirks: >> >> - The caller may come from NMI context, but the implementation is not NMI safe >> - The calling conventions are different from any other timekeeping functions >> - The naming doesn't fit into the 'ktime_get_*()' scheme >> >> This addresses the above issues by using a completely different >> method to get the time: ktime_get_real_fast_ns() is NMI safe and >> doesn't print a warning when called with the timekeeping suspended, >> and we can easily transform the result into a timespec structure. Since >> ktime_get_real_fast_ns() was not exported to modules, I also add the >> export. >> >> The behavior for the suspended case changes, we now return the time >> if we can find it out rather than setting it to zero. We could still >> change that but this would require exporting the 'timekeeping_suspended' >> to modules. >> >> I'm not trying to address y2038-safety at this point, but plan to >> do that later with a more complex patch. >> >> Cc: Kees Cook <keescook@chromium.org> >> Signed-off-by: Arnd Bergmann <arnd@arndb.de> >> --- >> It would be helpful to get this merged into v4.15-rc1, since I have >> some other cleanups that depend on this. Please have a look to see if >> this makes sense to you. > > As long as this is sane to call when timekeeping is not running, I'm > happy to take the patch. Maybe John or Thomas can comment on this, I'm not totally sure how reliable it is. My best guess is that it will still produce correct time stamps a lot of the time, but it depends a bit on the clocksource hardware and how long the timekeeping has been suspended. As far as I can tell, - if the clocksource register contents wrap around, the reported time might appear to go back to the time of the last timer interrupt. The shortest time I could find for an overflow is a 16-bit timer running at 32khz on i.MX1, overflowing every two seconds. - if reading a suspended clocksource returns zero (or another incorrect value) the time might be in the far future - if reading a suspended clocksource causes a hang or crash, you lose. the latter two could probably be considered bugs in the respective clocksource drivers, and it's possible that this doesn't actually happen on any of them, but we could in theory work around it by not reading the a suspended clocksource at all and returning the last known time from the interrupt. I suppose we could even make ktime_get_*_fast_ns() always return zero when the timekeeping is suspended, but I don't know if that would hurt the other callers. > Would you prefer I take this via pstore or do > you have another tree it should go through? > > Acked-by: Kees Cook <keescook@chromium.org> The pstore tree is probably best, but going through -tip would work as well I think. Arnd
On Thu, 9 Nov 2017, Arnd Bergmann wrote: > On Thu, Nov 9, 2017 at 1:55 AM, Kees Cook <keescook@chromium.org> wrote: > > On Wed, Nov 8, 2017 at 8:00 AM, Arnd Bergmann <arnd@arndb.de> wrote: > >> I noticed that __getnstimeofday() is a rather odd interface, with > >> a number of quirks: > >> > >> - The caller may come from NMI context, but the implementation is not NMI safe Hmm, no. None of the regular accessor functions can be called from NMI context safely. > >> - The calling conventions are different from any other timekeeping functions Yes, that was done back then to actually return the value (which is possible with TSC) and indicate at the same time that timekeeping is suspended. > >> - The naming doesn't fit into the 'ktime_get_*()' scheme Right, because it's not in the ktime_get_() family of functions. > > As long as this is sane to call when timekeeping is not running, I'm > > happy to take the patch. > > Maybe John or Thomas can comment on this, I'm not totally sure how > reliable it is. My best guess is that it will still produce correct time stamps > a lot of the time, but it depends a bit on the clocksource hardware and > how long the timekeeping has been suspended. As far as I can tell, > > - if the clocksource register contents wrap around, the reported time > might appear to go back to the time of the last timer interrupt. The > shortest time I could find for an overflow is a 16-bit timer running at > 32khz on i.MX1, overflowing every two seconds. > - if reading a suspended clocksource returns zero (or another incorrect > value) the time might be in the far future > - if reading a suspended clocksource causes a hang or crash, you > lose. None of those problems exist for the fast NMI safe accessors. In timekeeping_suspend() we store the current time and any access to the fast timekeeper returns exactly the time at suspend up to the point where timekeeping is resumed. See halt_fast_timekeeper(). So all you get between timekeeping_suspend() and timekeeping_resume() is a stale timestamp. No backwards, no crash and burn. The normal timekeeping accessor functions cannot be called between timekeeping_suspend() and timekeeping_resume() at all. They will emit a warning and can indeed crash and burn in one of the ways you described above. This does not happen on x86 because the TSC will just work on systems with pstore. Thanks, tglx
On Fri, Nov 10, 2017 at 12:00 AM, Thomas Gleixner <tglx@linutronix.de> wrote: > On Thu, 9 Nov 2017, Arnd Bergmann wrote: >> On Thu, Nov 9, 2017 at 1:55 AM, Kees Cook <keescook@chromium.org> wrote: >> > On Wed, Nov 8, 2017 at 8:00 AM, Arnd Bergmann <arnd@arndb.de> wrote: >> >> I noticed that __getnstimeofday() is a rather odd interface, with >> >> a number of quirks: >> >> >> >> - The caller may come from NMI context, but the implementation is not NMI safe > > Hmm, no. None of the regular accessor functions can be called from NMI > context safely. Right, that's what I mean: it must not get called from NMI context, but it currently is, at least for this case: NMI handler: something bad panic() kmsg_dump() pstore_dump() pstore_record_init() __getnstimeofday() I should probably add that to the changelog text ;-) >> >> - The calling conventions are different from any other timekeeping functions > > Yes, that was done back then to actually return the value (which is > possible with TSC) and indicate at the same time that timekeeping is > suspended. > >> >> - The naming doesn't fit into the 'ktime_get_*()' scheme > > Right, because it's not in the ktime_get_() family of functions. For the background on how I got here: I'm currently trying to go through all functions in include/linux/timekeeping32.h with the intention of replacing them with appropriate ktime_get_*() interfaces that don't have the y2038 overflow problem. I considered just using the existing __getnstimeofday64() here, which would be a trivial change, but then I noticed the NMI problem. Also, I have a related patch series that renames getrawmonotonic64(), current_kernel_time64() and get_monotonic_coarse64() to ktime_get_raw_ts64(), ktime_get_coarse_real_ts64() and ktime_get_coarse_ts64(), for consistency, but then I couldn't come up with a good name for __getnstimeofday64(), as the __ktime_get_*() naming is already used for a number of other things and I did not want to overload that more. Completely removing __getnstimeofday64() would be handier here. >> > As long as this is sane to call when timekeeping is not running, I'm >> > happy to take the patch. >> >> Maybe John or Thomas can comment on this, I'm not totally sure how >> reliable it is. My best guess is that it will still produce correct time stamps >> a lot of the time, but it depends a bit on the clocksource hardware and >> how long the timekeeping has been suspended. As far as I can tell, >> >> - if the clocksource register contents wrap around, the reported time >> might appear to go back to the time of the last timer interrupt. The >> shortest time I could find for an overflow is a 16-bit timer running at >> 32khz on i.MX1, overflowing every two seconds. >> - if reading a suspended clocksource returns zero (or another incorrect >> value) the time might be in the far future >> - if reading a suspended clocksource causes a hang or crash, you >> lose. > > None of those problems exist for the fast NMI safe accessors. In > timekeeping_suspend() we store the current time and any access to the fast > timekeeper returns exactly the time at suspend up to the point where > timekeeping is resumed. See halt_fast_timekeeper(). > > So all you get between timekeeping_suspend() and timekeeping_resume() is > a stale timestamp. No backwards, no crash and burn. Ah, perfect, then my patch should be fine. I think I had read that part of the code before, but then forgotten about it. > The normal timekeeping accessor functions cannot be called between > timekeeping_suspend() and timekeeping_resume() at all. They will emit a > warning and can indeed crash and burn in one of the ways you described > above. This does not happen on x86 because the TSC will just work on > systems with pstore. Sure, except for __getnstimeofday64(), which will intentionally not warn but could crash in the clocksource driver (on non-x86). We do ignore the result from __getnstimeofday64() when timekeeping is suspended, but only after we call into the clocksource driver. Arnd
On Fri, 10 Nov 2017, Arnd Bergmann wrote: > On Fri, Nov 10, 2017 at 12:00 AM, Thomas Gleixner <tglx@linutronix.de> wrote: > > Hmm, no. None of the regular accessor functions can be called from NMI > > context safely. > > Right, that's what I mean: it must not get called from NMI context, but it > currently is, at least for this case: > > NMI handler: > something bad > panic() > kmsg_dump() > pstore_dump() > pstore_record_init() > __getnstimeofday() > > I should probably add that to the changelog text ;-) Indeed. > Also, I have a related patch series that renames getrawmonotonic64(), > current_kernel_time64() and get_monotonic_coarse64() to > ktime_get_raw_ts64(), ktime_get_coarse_real_ts64() and > ktime_get_coarse_ts64(), for consistency, but then I couldn't > come up with a good name for __getnstimeofday64(), as the > __ktime_get_*() naming is already used for a number of other > things and I did not want to overload that more. Completely > removing __getnstimeofday64() would be handier here. Oh yes, it's an abomination. > > The normal timekeeping accessor functions cannot be called between > > timekeeping_suspend() and timekeeping_resume() at all. They will emit a > > warning and can indeed crash and burn in one of the ways you described > > above. This does not happen on x86 because the TSC will just work on > > systems with pstore. > > Sure, except for __getnstimeofday64(), which will intentionally not warn but > could crash in the clocksource driver (on non-x86). We do ignore the result > from __getnstimeofday64() when timekeeping is suspended, but only after > we call into the clocksource driver. Right, let's get rid of it before it grows another user. Thanks, tglx
On Thu, Nov 9, 2017 at 4:46 PM, Thomas Gleixner <tglx@linutronix.de> wrote: > On Fri, 10 Nov 2017, Arnd Bergmann wrote: >> On Fri, Nov 10, 2017 at 12:00 AM, Thomas Gleixner <tglx@linutronix.de> wrote: >> > Hmm, no. None of the regular accessor functions can be called from NMI >> > context safely. >> >> Right, that's what I mean: it must not get called from NMI context, but it >> currently is, at least for this case: >> >> NMI handler: >> something bad >> panic() >> kmsg_dump() >> pstore_dump() >> pstore_record_init() >> __getnstimeofday() >> >> I should probably add that to the changelog text ;-) > > Indeed. Er, so, is this safe to call there? I've had to fix this a few times now, so if using ktime_get_real_fast_ns() can be used here (and doesn't return 0) then this is easily an improvement over the existing "maybe read 0" case pstore has now. -Kees -- Kees Cook Pixel Security
On Thu, 9 Nov 2017, Kees Cook wrote: > On Thu, Nov 9, 2017 at 4:46 PM, Thomas Gleixner <tglx@linutronix.de> wrote: > > On Fri, 10 Nov 2017, Arnd Bergmann wrote: > >> On Fri, Nov 10, 2017 at 12:00 AM, Thomas Gleixner <tglx@linutronix.de> wrote: > >> > Hmm, no. None of the regular accessor functions can be called from NMI > >> > context safely. > >> > >> Right, that's what I mean: it must not get called from NMI context, but it > >> currently is, at least for this case: > >> > >> NMI handler: > >> something bad > >> panic() > >> kmsg_dump() > >> pstore_dump() > >> pstore_record_init() > >> __getnstimeofday() > >> > >> I should probably add that to the changelog text ;-) > > > > Indeed. > > Er, so, is this safe to call there? I've had to fix this a few times > now, so if using ktime_get_real_fast_ns() can be used here (and > doesn't return 0) then this is easily an improvement over the existing > "maybe read 0" case pstore has now. ktime_get_real_fast_ns() is NMI safe and returns before timekeeping_suspend(): correct time after timekeeping_suspend(): timestamp which was frozen in timekeeping_suspend() after timekeeping_resume(): correct time Thanks, tglx
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c index ec7199e859d2..086e491faf04 100644 --- a/fs/pstore/platform.c +++ b/fs/pstore/platform.c @@ -482,10 +482,7 @@ void pstore_record_init(struct pstore_record *record, record->psi = psinfo; /* Report zeroed timestamp if called before timekeeping has resumed. */ - if (__getnstimeofday(&record->time)) { - record->time.tv_sec = 0; - record->time.tv_nsec = 0; - } + record->time = ns_to_timespec(ktime_get_real_fast_ns()); } /* diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c index 353f7bd1eeb0..198afa78bf69 100644 --- a/kernel/time/timekeeping.c +++ b/kernel/time/timekeeping.c @@ -528,6 +528,7 @@ u64 ktime_get_real_fast_ns(void) { return __ktime_get_real_fast_ns(&tk_fast_mono); } +EXPORT_SYMBOL_GPL(ktime_get_real_fast_ns); /** * halt_fast_timekeeper - Prevent fast timekeeper from accessing clocksource.
I noticed that __getnstimeofday() is a rather odd interface, with a number of quirks: - The caller may come from NMI context, but the implementation is not NMI safe - The calling conventions are different from any other timekeeping functions - The naming doesn't fit into the 'ktime_get_*()' scheme This addresses the above issues by using a completely different method to get the time: ktime_get_real_fast_ns() is NMI safe and doesn't print a warning when called with the timekeeping suspended, and we can easily transform the result into a timespec structure. Since ktime_get_real_fast_ns() was not exported to modules, I also add the export. The behavior for the suspended case changes, we now return the time if we can find it out rather than setting it to zero. We could still change that but this would require exporting the 'timekeeping_suspended' to modules. I'm not trying to address y2038-safety at this point, but plan to do that later with a more complex patch. Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- It would be helpful to get this merged into v4.15-rc1, since I have some other cleanups that depend on this. Please have a look to see if this makes sense to you. --- fs/pstore/platform.c | 5 +---- kernel/time/timekeeping.c | 1 + 2 files changed, 2 insertions(+), 4 deletions(-) -- 2.9.0