Message ID | 20180718010242.5254-2-nicolas.pitre@linaro.org |
---|---|
State | New |
Headers | show |
Series | follow-up to the unicode vt console series | expand |
On Tue, Jul 17, 2018 at 09:02:40PM -0400, Nicolas Pitre wrote: > The nr argument is typically small: most often nr == 1. However this > could be abused with a very large explicit scroll in a resized screen. > Make the code scroll lines one at a time in all cases to avoid the VLA. > Anything smarter is most likely not warranted here. Even though nr can be 32767 at most, your new version is O(nr*nr) for no reason. Instead of O(n) memory or O(n²) time, a variant of the original that copies values one at a time would be shorter and faster. > Requested-by: Kees Cook <keescook@chromium.org> > Signed-off-by: Nicolas Pitre <nico@linaro.org> > --- > drivers/tty/vt/vt.c | 18 ++++++++++-------- > 1 file changed, 10 insertions(+), 8 deletions(-) > > diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c > index 2d14bb195d..03e79f7787 100644 > --- a/drivers/tty/vt/vt.c > +++ b/drivers/tty/vt/vt.c > @@ -433,20 +433,22 @@ static void vc_uniscr_scroll(struct vc_data *vc, unsigned int t, unsigned int b, > > if (uniscr) { > unsigned int s, d, rescue, clear; > - char32_t *save[nr]; > > s = clear = t; > - d = t + nr; > - rescue = b - nr; > + d = t + 1; > + rescue = b - 1; > if (dir == SM_UP) { > swap(s, d); > swap(clear, rescue); > } > - memcpy(save, uniscr->lines + rescue, nr * sizeof(*save)); > - memmove(uniscr->lines + d, uniscr->lines + s, > - (b - t - nr) * sizeof(*uniscr->lines)); > - memcpy(uniscr->lines + clear, save, nr * sizeof(*save)); > - vc_uniscr_clear_lines(vc, clear, nr); > + while (nr--) { > + char32_t *tmp; > + tmp = uniscr->lines[rescue]; > + memmove(uniscr->lines + d, uniscr->lines + s, > + (b - t - 1) * sizeof(*uniscr->lines)); > + uniscr->lines[clear] = tmp; > + vc_uniscr_clear_lines(vc, clear, 1); > + } > } > } What the function does is rotating an array (slice [t..b) here), by nr if SM_DOWN or by -nr ie (b - t - nr) if SM_UP. A nice problem that almost every "code interview questions" book includes :) Please say if you don't have time for such games, I've just refreshed what's a good answer. :þ Meow. -- // If you believe in so-called "intellectual property", please immediately // cease using counterfeit alphabets. Instead, contact the nearest temple // of Amon, whose priests will provide you with scribal services for all // your writing needs, for Reasonable And Non-Discriminatory prices.
On Wed, 18 Jul 2018, Adam Borowski wrote: > On Tue, Jul 17, 2018 at 09:02:40PM -0400, Nicolas Pitre wrote: > > The nr argument is typically small: most often nr == 1. However this > > could be abused with a very large explicit scroll in a resized screen. > > Make the code scroll lines one at a time in all cases to avoid the VLA. > > Anything smarter is most likely not warranted here. > > Even though nr can be 32767 at most, your new version is O(nr*nr) for no > reason. Instead of O(n) memory or O(n²) time, a variant of the original > that copies values one at a time would be shorter and faster. Well... even though nr _can_ be up to 32766 to be precise, it is most likely to be just 1 in 99.9% of the cases. So in that case, you'll execute the loop only once and the code is currently optimal with O(n). If nr > 1 then the current cost is O(n*nr) where n is the height of the scroll window i.e. relatively small in practice (typically between 25 and 60). There is no point optimizing for 32767 rows as that is rather silly. If we had then the best solution would be a linked list rather than an array. But still, if nr > 2 that means you need a temporary storage because the destination memory has to be preserved before the source memory can be moved there, and that destination memory content cannot be stored in the vacated source memory until the source content is moved. Copying values one at a time cannot work because the destination memory, the source memory, and the area where the previous content from the destination memory will end up don't overlap most of the time. That temporary storage was that VLA. We don't want VLAs. So how do we efficiently allocate and deallocate memory for, say, 25 words? Maybe that doesn't have to be efficient because that doesn't happen very often as we said, at which point we can just do it in a loop with a one-line increment instead, as this patch does. If you still have a more clever way of doing this then please propose it in code form (I'm genuinely curious of what you have in mind). But let's get the baseline working in an obvious "correct" way first. > > Requested-by: Kees Cook <keescook@chromium.org> > > Signed-off-by: Nicolas Pitre <nico@linaro.org> > > --- > > drivers/tty/vt/vt.c | 18 ++++++++++-------- > > 1 file changed, 10 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c > > index 2d14bb195d..03e79f7787 100644 > > --- a/drivers/tty/vt/vt.c > > +++ b/drivers/tty/vt/vt.c > > @@ -433,20 +433,22 @@ static void vc_uniscr_scroll(struct vc_data *vc, unsigned int t, unsigned int b, > > > > if (uniscr) { > > unsigned int s, d, rescue, clear; > > - char32_t *save[nr]; > > > > s = clear = t; > > - d = t + nr; > > - rescue = b - nr; > > + d = t + 1; > > + rescue = b - 1; > > if (dir == SM_UP) { > > swap(s, d); > > swap(clear, rescue); > > } > > - memcpy(save, uniscr->lines + rescue, nr * sizeof(*save)); > > - memmove(uniscr->lines + d, uniscr->lines + s, > > - (b - t - nr) * sizeof(*uniscr->lines)); > > - memcpy(uniscr->lines + clear, save, nr * sizeof(*save)); > > - vc_uniscr_clear_lines(vc, clear, nr); > > + while (nr--) { > > + char32_t *tmp; > > + tmp = uniscr->lines[rescue]; > > + memmove(uniscr->lines + d, uniscr->lines + s, > > + (b - t - 1) * sizeof(*uniscr->lines)); > > + uniscr->lines[clear] = tmp; > > + vc_uniscr_clear_lines(vc, clear, 1); > > + } > > } > > } > > What the function does is rotating an array (slice [t..b) here), by nr if > SM_DOWN or by -nr ie (b - t - nr) if SM_UP. A nice problem that almost every > "code interview questions" book includes :) > > Please say if you don't have time for such games, I've just refreshed what's > a good answer. :þ > > > Meow. > -- > // If you believe in so-called "intellectual property", please immediately > // cease using counterfeit alphabets. Instead, contact the nearest temple > // of Amon, whose priests will provide you with scribal services for all > // your writing needs, for Reasonable And Non-Discriminatory prices. >
On Tue, 17 Jul 2018, Nicolas Pitre wrote: > But still, if nr > 2 that means you need a temporary storage because the > destination memory has to be preserved before the source memory can be > moved there, and that destination memory content cannot be stored in the > vacated source memory until the source content is moved. OK I'm an idiot. After looking in the literature, I found out that there is indeed a better way to do this. So here's an updated patch: ----- >8 Subject: [PATCH v2 1/3] vt: avoid a VLA in the unicode screen scroll function The nr argument is typically small: most often nr == 1. However this could be abused with a very large explicit scroll in a resized screen. Make the code scroll lines by performing an array rotation operation to avoid the need for a large temporary space. Requested-by: Kees Cook <keescook@chromium.org> Suggested-by: Adam Borowski <kilobyte@angband.pl> Signed-off-by: Nicolas Pitre <nico@linaro.org> diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c index 2d14bb195d..d527184579 100644 --- a/drivers/tty/vt/vt.c +++ b/drivers/tty/vt/vt.c @@ -104,6 +104,7 @@ #include <linux/kdb.h> #include <linux/ctype.h> #include <linux/bsearch.h> +#include <linux/gcd.h> #define MAX_NR_CON_DRIVER 16 @@ -432,20 +433,29 @@ static void vc_uniscr_scroll(struct vc_data *vc, unsigned int t, unsigned int b, struct uni_screen *uniscr = get_vc_uniscr(vc); if (uniscr) { - unsigned int s, d, rescue, clear; - char32_t *save[nr]; - - s = clear = t; - d = t + nr; - rescue = b - nr; - if (dir == SM_UP) { - swap(s, d); - swap(clear, rescue); + unsigned int i, j, k, sz, d, clear; + + sz = b - t; + clear = b - nr; + d = nr; + if (dir == SM_DOWN) { + clear = t; + d = sz - nr; + } + for (i = 0; i < gcd(d, sz); i++) { + char32_t *tmp = uniscr->lines[t + i]; + j = i; + while (1) { + k = j + d; + if (k >= sz) + k -= sz; + if (k == i) + break; + uniscr->lines[t + j] = uniscr->lines[t + k]; + j = k; + } + uniscr->lines[t + j] = tmp; } - memcpy(save, uniscr->lines + rescue, nr * sizeof(*save)); - memmove(uniscr->lines + d, uniscr->lines + s, - (b - t - nr) * sizeof(*uniscr->lines)); - memcpy(uniscr->lines + clear, save, nr * sizeof(*save)); vc_uniscr_clear_lines(vc, clear, nr); } }
On Thu, Jul 19, 2018 at 12:05:25AM -0400, Nicolas Pitre wrote: > > The nr argument is typically small: most often nr == 1. However this > could be abused with a very large explicit scroll in a resized screen. > Make the code scroll lines by performing an array rotation operation to > avoid the need for a large temporary space. > > Requested-by: Kees Cook <keescook@chromium.org> > Suggested-by: Adam Borowski <kilobyte@angband.pl> > Signed-off-by: Nicolas Pitre <nico@linaro.org> I've now applied this v2 of the patch, thanks for this. greg k-h
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c index 2d14bb195d..03e79f7787 100644 --- a/drivers/tty/vt/vt.c +++ b/drivers/tty/vt/vt.c @@ -433,20 +433,22 @@ static void vc_uniscr_scroll(struct vc_data *vc, unsigned int t, unsigned int b, if (uniscr) { unsigned int s, d, rescue, clear; - char32_t *save[nr]; s = clear = t; - d = t + nr; - rescue = b - nr; + d = t + 1; + rescue = b - 1; if (dir == SM_UP) { swap(s, d); swap(clear, rescue); } - memcpy(save, uniscr->lines + rescue, nr * sizeof(*save)); - memmove(uniscr->lines + d, uniscr->lines + s, - (b - t - nr) * sizeof(*uniscr->lines)); - memcpy(uniscr->lines + clear, save, nr * sizeof(*save)); - vc_uniscr_clear_lines(vc, clear, nr); + while (nr--) { + char32_t *tmp; + tmp = uniscr->lines[rescue]; + memmove(uniscr->lines + d, uniscr->lines + s, + (b - t - 1) * sizeof(*uniscr->lines)); + uniscr->lines[clear] = tmp; + vc_uniscr_clear_lines(vc, clear, 1); + } } }
The nr argument is typically small: most often nr == 1. However this could be abused with a very large explicit scroll in a resized screen. Make the code scroll lines one at a time in all cases to avoid the VLA. Anything smarter is most likely not warranted here. Requested-by: Kees Cook <keescook@chromium.org> Signed-off-by: Nicolas Pitre <nico@linaro.org> --- drivers/tty/vt/vt.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) -- 2.17.1