Message ID | 20190219214940.391081-1-arnd@arndb.de |
---|---|
State | Superseded |
Headers | show |
Series | kasan: turn off asan-stack for clang-8 and earlier | expand |
On Tue, 2019-02-19 at 22:49 +0100, Arnd Bergmann wrote: > Building an arm64 allmodconfig kernel with clang results in over 140 warnings > about overly large stack frames, the worst ones being: > > drivers/gpu/drm/panel/panel-sitronix-st7789v.c:196:12: error: stack frame size > of 20224 bytes in function 'st7789v_prepare' > drivers/video/fbdev/omap2/omapfb/displays/panel-tpo-td028ttec1.c:196:12: > error: stack frame size of 13120 bytes in function 'td028ttec1_panel_enable' > drivers/usb/host/max3421-hcd.c:1395:1: error: stack frame size of 10048 bytes > in function 'max3421_spi_thread' > drivers/net/wan/slic_ds26522.c:209:12: error: stack frame size of 9664 bytes > in function 'slic_ds26522_probe' > drivers/crypto/ccp/ccp-ops.c:2434:5: error: stack frame size of 8832 bytes in > function 'ccp_run_cmd' > drivers/media/dvb-frontends/stv0367.c:1005:12: error: stack frame size of 7840 > bytes in function 'stv0367ter_algo' > > None of these happen with gcc today, and almost all of these are the result > of a single known bug in llvm. Hopefully it will eventually get fixed with > the > clang-9 release. > > In the meantime, the best idea I have is to turn off asan-stack for clang-8 > and earlier, so we can produce a kernel that is safe to run. > > I have posted three patches that address the frame overflow warnings that are > not addressed by turning off asan-stack, so in combination with this change, > we get much closer to a clean allmodconfig build, which in turn is necessary > to do meaningful build regression testing. Well, I am using clang 8.0 on arm64 and running the kernel just fine for a few weeks now and never trigger a single stack overflow (THREAD_SHIFT = 15) because I never use any of those drivers you mentioned above. I don't think it is a good idea to blankly remove the testing coverage here and affect people don't use all those offensive functions at all.
+ Evgenii, Kostya for KASAN On Tue, Feb 19, 2019 at 2:17 PM Qian Cai <cai@lca.pw> wrote: > > On Tue, 2019-02-19 at 22:49 +0100, Arnd Bergmann wrote: > > Building an arm64 allmodconfig kernel with clang results in over 140 warnings > > about overly large stack frames, the worst ones being: > > > > drivers/gpu/drm/panel/panel-sitronix-st7789v.c:196:12: error: stack frame size > > of 20224 bytes in function 'st7789v_prepare' > > drivers/video/fbdev/omap2/omapfb/displays/panel-tpo-td028ttec1.c:196:12: > > error: stack frame size of 13120 bytes in function 'td028ttec1_panel_enable' > > drivers/usb/host/max3421-hcd.c:1395:1: error: stack frame size of 10048 bytes > > in function 'max3421_spi_thread' > > drivers/net/wan/slic_ds26522.c:209:12: error: stack frame size of 9664 bytes > > in function 'slic_ds26522_probe' > > drivers/crypto/ccp/ccp-ops.c:2434:5: error: stack frame size of 8832 bytes in > > function 'ccp_run_cmd' > > drivers/media/dvb-frontends/stv0367.c:1005:12: error: stack frame size of 7840 > > bytes in function 'stv0367ter_algo' > > > > None of these happen with gcc today, and almost all of these are the result > > of a single known bug in llvm. Hopefully it will eventually get fixed with > > the > > clang-9 release. Would it be better to disable outright if CC_IS_CLANG, then go back and adjust it once we know explicitly which version it lands in? (Maybe I'm being too pessimistic about when it may be fixed). > > > > In the meantime, the best idea I have is to turn off asan-stack for clang-8 > > and earlier, so we can produce a kernel that is safe to run. > > > > I have posted three patches that address the frame overflow warnings that are > > not addressed by turning off asan-stack, so in combination with this change, > > we get much closer to a clean allmodconfig build, which in turn is necessary > > to do meaningful build regression testing. > > Well, I am using clang 8.0 on arm64 and running the kernel just fine for a few > weeks now and never trigger a single stack overflow (THREAD_SHIFT = 15) because > I never use any of those drivers you mentioned above. I don't think it is a good > idea to blankly remove the testing coverage here and affect people don't use all > those offensive functions at all. Thanks for the patch, Arnd! Hopefully we can fix that up in Clang soon. Qian, I guess the alternative would be to add `-mllvm -asan-stack=0` on potentially up to 140 Makefiles? -- Thanks, ~Nick Desaulniers
On Tue, Feb 19, 2019 at 2:43 PM Nick Desaulniers <ndesaulniers@google.com> wrote: > > + Evgenii, Kostya for KASAN > > On Tue, Feb 19, 2019 at 2:17 PM Qian Cai <cai@lca.pw> wrote: > > > > On Tue, 2019-02-19 at 22:49 +0100, Arnd Bergmann wrote: > > > Building an arm64 allmodconfig kernel with clang results in over 140 warnings > > > about overly large stack frames, the worst ones being: > > > > > > drivers/gpu/drm/panel/panel-sitronix-st7789v.c:196:12: error: stack frame size > > > of 20224 bytes in function 'st7789v_prepare' > > > drivers/video/fbdev/omap2/omapfb/displays/panel-tpo-td028ttec1.c:196:12: > > > error: stack frame size of 13120 bytes in function 'td028ttec1_panel_enable' > > > drivers/usb/host/max3421-hcd.c:1395:1: error: stack frame size of 10048 bytes > > > in function 'max3421_spi_thread' > > > drivers/net/wan/slic_ds26522.c:209:12: error: stack frame size of 9664 bytes > > > in function 'slic_ds26522_probe' > > > drivers/crypto/ccp/ccp-ops.c:2434:5: error: stack frame size of 8832 bytes in > > > function 'ccp_run_cmd' > > > drivers/media/dvb-frontends/stv0367.c:1005:12: error: stack frame size of 7840 > > > bytes in function 'stv0367ter_algo' > > > > > > None of these happen with gcc today, and almost all of these are the result > > > of a single known bug in llvm. Hopefully it will eventually get fixed with > > > the > > > clang-9 release. I am not confident we can fix this in clang. The difference between gcc and clang, AFAICT, is not in the asan instrumentation, but in inining. Looking at the reproducer from https://bugs.llvm.org/show_bug.cgi?id=38809#c4, if I change ltv350qv_write_reg to always inline, gcc also produces a huge 10K stack frame, and making it noinline fixes the stack frame for both gcc and clang. > > Would it be better to disable outright if CC_IS_CLANG, then go back > and adjust it once we know explicitly which version it lands in? > (Maybe I'm being too pessimistic about when it may be fixed). > > > > > > > In the meantime, the best idea I have is to turn off asan-stack for clang-8 > > > and earlier, so we can produce a kernel that is safe to run. > > > > > > I have posted three patches that address the frame overflow warnings that are > > > not addressed by turning off asan-stack, so in combination with this change, > > > we get much closer to a clean allmodconfig build, which in turn is necessary > > > to do meaningful build regression testing. > > > > Well, I am using clang 8.0 on arm64 and running the kernel just fine for a few > > weeks now and never trigger a single stack overflow (THREAD_SHIFT = 15) because > > I never use any of those drivers you mentioned above. I don't think it is a good > > idea to blankly remove the testing coverage here and affect people don't use all > > those offensive functions at all. > > Thanks for the patch, Arnd! Hopefully we can fix that up in Clang > soon. Qian, I guess the alternative would be to add `-mllvm > -asan-stack=0` on potentially up to 140 Makefiles? > > -- > Thanks, > ~Nick Desaulniers
On 2/19/19 7:33 PM, Kostya Serebryany wrote: >>> Well, I am using clang 8.0 on arm64 and running the kernel just fine for a few >>> weeks now and never trigger a single stack overflow (THREAD_SHIFT = 15) because >>> I never use any of those drivers you mentioned above. I don't think it is a good >>> idea to blankly remove the testing coverage here and affect people don't use all >>> those offensive functions at all. >> >> Thanks for the patch, Arnd! Hopefully we can fix that up in Clang >> soon. Qian, I guess the alternative would be to add `-mllvm >> -asan-stack=0` on potentially up to 140 Makefiles? Depends on the exact stack consumption of those 140 functions. For example, I don't care if you turn-off -asan-stack if the current stack size is < 32k. For those functions always consume like a lot of stack like 8k or more, fix them in Makefile if not too many of them.
On Wed, Feb 20, 2019 at 1:34 AM Kostya Serebryany <kcc@google.com> wrote: > > On Tue, Feb 19, 2019 at 2:43 PM Nick Desaulniers > <ndesaulniers@google.com> wrote: > > > > + Evgenii, Kostya for KASAN > > > > On Tue, Feb 19, 2019 at 2:17 PM Qian Cai <cai@lca.pw> wrote: > > > > > > On Tue, 2019-02-19 at 22:49 +0100, Arnd Bergmann wrote: > > > > Building an arm64 allmodconfig kernel with clang results in over 140 warnings > > > > about overly large stack frames, the worst ones being: > > > > > > > > drivers/gpu/drm/panel/panel-sitronix-st7789v.c:196:12: error: stack frame size > > > > of 20224 bytes in function 'st7789v_prepare' > > > > drivers/video/fbdev/omap2/omapfb/displays/panel-tpo-td028ttec1.c:196:12: > > > > error: stack frame size of 13120 bytes in function 'td028ttec1_panel_enable' > > > > drivers/usb/host/max3421-hcd.c:1395:1: error: stack frame size of 10048 bytes > > > > in function 'max3421_spi_thread' > > > > drivers/net/wan/slic_ds26522.c:209:12: error: stack frame size of 9664 bytes > > > > in function 'slic_ds26522_probe' > > > > drivers/crypto/ccp/ccp-ops.c:2434:5: error: stack frame size of 8832 bytes in > > > > function 'ccp_run_cmd' > > > > drivers/media/dvb-frontends/stv0367.c:1005:12: error: stack frame size of 7840 > > > > bytes in function 'stv0367ter_algo' > > > > > > > > None of these happen with gcc today, and almost all of these are the result > > > > of a single known bug in llvm. Hopefully it will eventually get fixed with > > > > the > > > > clang-9 release. > > I am not confident we can fix this in clang. > The difference between gcc and clang, AFAICT, is not in the asan > instrumentation, but in inining. > Looking at the reproducer from https://bugs.llvm.org/show_bug.cgi?id=38809#c4, > if I change ltv350qv_write_reg to always inline, gcc also produces a > huge 10K stack frame, > and making it noinline fixes the stack frame for both gcc and clang. Does your gcc have fix for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715 ? For me gcc rejects to inline it: gcc version 7.3.0 (Debian 7.3.0-5) $ gcc /tmp/test.c -Wframe-larger-than=128 -c -fsanitize=kernel-address -O2 --param asan-stack=1 /tmp/test.c:23:13: warning: always_inline function might not be inlinable [-Wattributes] static void ltv350qv_write_reg(void) { ^~~~~~~~~~~~~~~~~~ /tmp/test.c: In function ‘ltv350qv_power_on’: /tmp/test.c:57:1: warning: the frame size of 416 bytes is larger than 128 bytes [-Wframe-larger-than=] } ^
On Wed, Feb 20, 2019 at 7:44 AM Dmitry Vyukov <dvyukov@google.com> wrote: > > On Wed, Feb 20, 2019 at 1:34 AM Kostya Serebryany <kcc@google.com> wrote: > > > > On Tue, Feb 19, 2019 at 2:43 PM Nick Desaulniers > > <ndesaulniers@google.com> wrote: > > > > > > + Evgenii, Kostya for KASAN > > > > > > On Tue, Feb 19, 2019 at 2:17 PM Qian Cai <cai@lca.pw> wrote: > > > > > > > > On Tue, 2019-02-19 at 22:49 +0100, Arnd Bergmann wrote: > > > > > Building an arm64 allmodconfig kernel with clang results in over 140 warnings > > > > > about overly large stack frames, the worst ones being: > > > > > > > > > > drivers/gpu/drm/panel/panel-sitronix-st7789v.c:196:12: error: stack frame size > > > > > of 20224 bytes in function 'st7789v_prepare' > > > > > drivers/video/fbdev/omap2/omapfb/displays/panel-tpo-td028ttec1.c:196:12: > > > > > error: stack frame size of 13120 bytes in function 'td028ttec1_panel_enable' > > > > > drivers/usb/host/max3421-hcd.c:1395:1: error: stack frame size of 10048 bytes > > > > > in function 'max3421_spi_thread' > > > > > drivers/net/wan/slic_ds26522.c:209:12: error: stack frame size of 9664 bytes > > > > > in function 'slic_ds26522_probe' > > > > > drivers/crypto/ccp/ccp-ops.c:2434:5: error: stack frame size of 8832 bytes in > > > > > function 'ccp_run_cmd' > > > > > drivers/media/dvb-frontends/stv0367.c:1005:12: error: stack frame size of 7840 > > > > > bytes in function 'stv0367ter_algo' > > > > > > > > > > None of these happen with gcc today, and almost all of these are the result > > > > > of a single known bug in llvm. Hopefully it will eventually get fixed with > > > > > the > > > > > clang-9 release. > > > > I am not confident we can fix this in clang. > > The difference between gcc and clang, AFAICT, is not in the asan > > instrumentation, but in inining. > > Looking at the reproducer from https://bugs.llvm.org/show_bug.cgi?id=38809#c4, > > if I change ltv350qv_write_reg to always inline, gcc also produces a > > huge 10K stack frame, > > and making it noinline fixes the stack frame for both gcc and clang. > > Does your gcc have fix for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715 ? > > For me gcc rejects to inline it: > > gcc version 7.3.0 (Debian 7.3.0-5) > > $ gcc /tmp/test.c -Wframe-larger-than=128 -c -fsanitize=kernel-address > -O2 --param asan-stack=1 > /tmp/test.c:23:13: warning: always_inline function might not be > inlinable [-Wattributes] I don't see that warning here > static void ltv350qv_write_reg(void) { > ^~~~~~~~~~~~~~~~~~ > /tmp/test.c: In function ‘ltv350qv_power_on’: > /tmp/test.c:57:1: warning: the frame size of 416 bytes is larger than > 128 bytes [-Wframe-larger-than=] > } but I also see the small stack size here: https://godbolt.org/z/Uz5Ruv gcc definitely inlines the function here but only has one copy of spi_message and spi_transfer on the stack, while clang inserts a call to __asan_report_store8_noabort() and has one copy per inlined call to ltv350qv_write_reg(). Arnd
On Tue, Feb 19, 2019 at 10:49 PM Arnd Bergmann <arnd@arndb.de> wrote: > > Building an arm64 allmodconfig kernel with clang results in over 140 warnings > about overly large stack frames, the worst ones being: > > drivers/gpu/drm/panel/panel-sitronix-st7789v.c:196:12: error: stack frame size of 20224 bytes in function 'st7789v_prepare' > drivers/video/fbdev/omap2/omapfb/displays/panel-tpo-td028ttec1.c:196:12: error: stack frame size of 13120 bytes in function 'td028ttec1_panel_enable' > drivers/usb/host/max3421-hcd.c:1395:1: error: stack frame size of 10048 bytes in function 'max3421_spi_thread' > drivers/net/wan/slic_ds26522.c:209:12: error: stack frame size of 9664 bytes in function 'slic_ds26522_probe' > drivers/crypto/ccp/ccp-ops.c:2434:5: error: stack frame size of 8832 bytes in function 'ccp_run_cmd' > drivers/media/dvb-frontends/stv0367.c:1005:12: error: stack frame size of 7840 bytes in function 'stv0367ter_algo' > > None of these happen with gcc today, and almost all of these are the result > of a single known bug in llvm. Hopefully it will eventually get fixed with the > clang-9 release. > > In the meantime, the best idea I have is to turn off asan-stack for clang-8 > and earlier, so we can produce a kernel that is safe to run. Hi Arnd, I don't think it's good to disable KASAN stack instrumentation for the whole kernel by default with clang. It makes more sense to disable stack instrumentation only for these few drivers. Thanks! > > I have posted three patches that address the frame overflow warnings that are > not addressed by turning off asan-stack, so in combination with this change, > we get much closer to a clean allmodconfig build, which in turn is necessary > to do meaningful build regression testing. > > Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> > Cc: Dmitry Vyukov <dvyukov@google.com> > Cc: Nick Desaulniers <ndesaulniers@google.com> > Cc: Mark Brown <broonie@kernel.org> > Cc: Qian Cai <cai@lca.pw> > Link: https://bugs.llvm.org/show_bug.cgi?id=38809 > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > --- > lib/Kconfig.kasan | 13 +++++++++++++ > scripts/Makefile.kasan | 2 +- > 2 files changed, 14 insertions(+), 1 deletion(-) > > diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan > index 67d7d1309c52..219cddc913ac 100644 > --- a/lib/Kconfig.kasan > +++ b/lib/Kconfig.kasan > @@ -103,6 +103,19 @@ config KASAN_INLINE > > endchoice > > +config KASAN_STACK > + int > + default 0 if CC_IS_CLANG && (CLANG_VERSION < 90000) > + default 1 > + help > + The LLVM stack address sanitizer has a know bug that > + causes excessive stack usage in a lot of functions, see > + https://bugs.llvm.org/show_bug.cgi?id=38809 > + Disabling asan-stack makes it safe to run kernels build > + with clang-8 with KASAN enabled, though it loses some of > + the functionality. We assume that clang-9 will have a fix, > + so the feature can be used. > + > config KASAN_S390_4_LEVEL_PAGING > bool "KASan: use 4-level paging" > depends on KASAN && S390 > diff --git a/scripts/Makefile.kasan b/scripts/Makefile.kasan > index f1fb8e502657..6410bd22fe38 100644 > --- a/scripts/Makefile.kasan > +++ b/scripts/Makefile.kasan > @@ -26,7 +26,7 @@ else > CFLAGS_KASAN := $(CFLAGS_KASAN_SHADOW) \ > $(call cc-param,asan-globals=1) \ > $(call cc-param,asan-instrumentation-with-call-threshold=$(call_threshold)) \ > - $(call cc-param,asan-stack=1) \ > + $(call cc-param,asan-stack=$(CONFIG_KASAN_STACK)) \ > $(call cc-param,asan-instrument-allocas=1) > endif > > -- > 2.20.0 >
On Wed, Feb 20, 2019 at 3:45 PM Andrey Konovalov <andreyknvl@google.com> wrote: > > On Tue, Feb 19, 2019 at 10:49 PM Arnd Bergmann <arnd@arndb.de> wrote: > > > > Building an arm64 allmodconfig kernel with clang results in over 140 warnings > > about overly large stack frames, the worst ones being: > > > > drivers/gpu/drm/panel/panel-sitronix-st7789v.c:196:12: error: stack frame size of 20224 bytes in function 'st7789v_prepare' > > drivers/video/fbdev/omap2/omapfb/displays/panel-tpo-td028ttec1.c:196:12: error: stack frame size of 13120 bytes in function 'td028ttec1_panel_enable' > > drivers/usb/host/max3421-hcd.c:1395:1: error: stack frame size of 10048 bytes in function 'max3421_spi_thread' > > drivers/net/wan/slic_ds26522.c:209:12: error: stack frame size of 9664 bytes in function 'slic_ds26522_probe' > > drivers/crypto/ccp/ccp-ops.c:2434:5: error: stack frame size of 8832 bytes in function 'ccp_run_cmd' > > drivers/media/dvb-frontends/stv0367.c:1005:12: error: stack frame size of 7840 bytes in function 'stv0367ter_algo' > > > > None of these happen with gcc today, and almost all of these are the result > > of a single known bug in llvm. Hopefully it will eventually get fixed with the > > clang-9 release. > > > > In the meantime, the best idea I have is to turn off asan-stack for clang-8 > > and earlier, so we can produce a kernel that is safe to run. > > Hi Arnd, > > I don't think it's good to disable KASAN stack instrumentation for the > whole kernel by default with clang. It makes more sense to disable > stack instrumentation only for these few drivers. Do you mean just the 114 files that we get warnings for in allmodconfig, or also those that run into the same bug but stay below the warning limit, and the ones that don't warn in allmodconfig but do warn in other configurations? I would have to some more research, but I expect several hundred patches before we get to a clean randconfig build with a broken compiler. Arnd
On 2/20/19 5:51 PM, Arnd Bergmann wrote: > On Wed, Feb 20, 2019 at 3:45 PM Andrey Konovalov <andreyknvl@google.com> wrote: >> >> On Tue, Feb 19, 2019 at 10:49 PM Arnd Bergmann <arnd@arndb.de> wrote: >>> >>> Building an arm64 allmodconfig kernel with clang results in over 140 warnings >>> about overly large stack frames, the worst ones being: >>> >>> drivers/gpu/drm/panel/panel-sitronix-st7789v.c:196:12: error: stack frame size of 20224 bytes in function 'st7789v_prepare' >>> drivers/video/fbdev/omap2/omapfb/displays/panel-tpo-td028ttec1.c:196:12: error: stack frame size of 13120 bytes in function 'td028ttec1_panel_enable' >>> drivers/usb/host/max3421-hcd.c:1395:1: error: stack frame size of 10048 bytes in function 'max3421_spi_thread' >>> drivers/net/wan/slic_ds26522.c:209:12: error: stack frame size of 9664 bytes in function 'slic_ds26522_probe' >>> drivers/crypto/ccp/ccp-ops.c:2434:5: error: stack frame size of 8832 bytes in function 'ccp_run_cmd' >>> drivers/media/dvb-frontends/stv0367.c:1005:12: error: stack frame size of 7840 bytes in function 'stv0367ter_algo' >>> >>> None of these happen with gcc today, and almost all of these are the result >>> of a single known bug in llvm. Hopefully it will eventually get fixed with the >>> clang-9 release. >>> >>> In the meantime, the best idea I have is to turn off asan-stack for clang-8 >>> and earlier, so we can produce a kernel that is safe to run. >> >> Hi Arnd, >> >> I don't think it's good to disable KASAN stack instrumentation for the >> whole kernel by default with clang. It makes more sense to disable >> stack instrumentation only for these few drivers. > > Do you mean just the 114 files that we get warnings for in allmodconfig, > or also those that run into the same bug but stay below the warning limit, > and the ones that don't warn in allmodconfig but do warn in other > configurations? > > I would have to some more research, but I expect several hundred > patches before we get to a clean randconfig build with a broken > compiler. Manually maintaining asan-stack parameter for the sake of one broken compiler isn't a great idea either. Couple alternative suggestions: 1) If we can't fix the problem or the cost of fixing is too high, maybe just hide it? Disable -Wframe-larger-then on pre clang-9 compilers. 2) Fallback cflags. The idea is to try to compile every the file with "-mllvm -asan-stack=1 -Wframe-larger-than=2048 -Werror" at first, and fallback to "-mllvm -asan-stack=0" if failed. So it would be something similar to $(call cc-option, -mllvm -asan-stack=1 -Wframe-larger-than=2048 -Werror, -mllvm -asan-stack=0) except that "cc-option" tries options only once on some code example while we need to try options on every file that we actually compile. Honestly, I'm not sure that it's worthy to hack Kbuild engine for that particular use-case.
On Wed, Feb 20, 2019 at 6:00 PM Andrey Ryabinin <aryabinin@virtuozzo.com> wrote: > On 2/20/19 5:51 PM, Arnd Bergmann wrote: > > On Wed, Feb 20, 2019 at 3:45 PM Andrey Konovalov <andreyknvl@google.com> wrote: > > I would have to some more research, but I expect several hundred > > patches before we get to a clean randconfig build with a broken > > compiler. > > Manually maintaining asan-stack parameter for the sake of one broken compiler isn't a great idea either. > > Couple alternative suggestions: > > 1) If we can't fix the problem or the cost of fixing is too high, maybe just hide it? Disable -Wframe-larger-then on pre clang-9 compilers. > > 2) Fallback cflags. The idea is to try to compile every the file with "-mllvm -asan-stack=1 -Wframe-larger-than=2048 -Werror" at first, > and fallback to "-mllvm -asan-stack=0" if failed. So it would be something similar to $(call cc-option, -mllvm -asan-stack=1 -Wframe-larger-than=2048 -Werror, -mllvm -asan-stack=0) > except that "cc-option" tries options only once on some code example while we need to try options on every file that we actually compile. > Honestly, I'm not sure that it's worthy to hack Kbuild engine for that particular use-case. My original plan was to put this under CONFIG_KASAN_EXTRA to allow you to still enable it in older compilers, but you just removed that option ;-) Maybe bringing it back would be a compromise? That way it's hidden from all the build testing bots (because of the !CONFIG_COMPILE_TEST dependency), but anyone who really wants it can still have the option, and set CONFIG_FRAME_WARN to whichever value they like. Arnd
+ Evgenii On Wed, Feb 20, 2019 at 9:36 AM Arnd Bergmann <arnd@arndb.de> wrote: > > On Wed, Feb 20, 2019 at 6:00 PM Andrey Ryabinin <aryabinin@virtuozzo.com> wrote: > > On 2/20/19 5:51 PM, Arnd Bergmann wrote: > > > On Wed, Feb 20, 2019 at 3:45 PM Andrey Konovalov <andreyknvl@google.com> wrote: > > > I would have to some more research, but I expect several hundred > > > patches before we get to a clean randconfig build with a broken > > > compiler. > > > > Manually maintaining asan-stack parameter for the sake of one broken compiler isn't a great idea either. > > > > Couple alternative suggestions: > > > > 1) If we can't fix the problem or the cost of fixing is too high, maybe just hide it? Disable -Wframe-larger-then on pre clang-9 compilers. > > > > 2) Fallback cflags. The idea is to try to compile every the file with "-mllvm -asan-stack=1 -Wframe-larger-than=2048 -Werror" at first, > > and fallback to "-mllvm -asan-stack=0" if failed. So it would be something similar to $(call cc-option, -mllvm -asan-stack=1 -Wframe-larger-than=2048 -Werror, -mllvm -asan-stack=0) > > except that "cc-option" tries options only once on some code example while we need to try options on every file that we actually compile. > > Honestly, I'm not sure that it's worthy to hack Kbuild engine for that particular use-case. > > My original plan was to put this under CONFIG_KASAN_EXTRA to allow you > to still enable it in older compilers, but you just removed that option ;-) > > Maybe bringing it back would be a compromise? That way it's hidden from > all the build testing bots (because of the !CONFIG_COMPILE_TEST dependency), > but anyone who really wants it can still have the option, and set > CONFIG_FRAME_WARN > to whichever value they like. > > Arnd I like Evgenii's idea: https://bugs.llvm.org/show_bug.cgi?id=38809#c10 Even though something like that wouldn't make the clang-8 train, I think it's ok. While I myself share Arnd's goal of driving compiler warnings to zero, in general I'd prefer not to disable warning-producing-features or disable warnings outright for cases where we have some ideas of changes we can make to the compiler. There's probably a list now of false warnings produced by old versions of Clang from bugs in Clang that we fixed. I'm not interested in additionally trying to work around those somehow in kernel sources. Qian previously pointed out that most drivers don't produce this warning under KASAN+Clang. While 114 is a lot, what are the chances that someone NEEDS a KASAN+Clang build to compile warning free and happen to include one of these problematic drivers? And if there is a chance they do observe the warning, are we doing a disservice by disabling the feature (-asan-stack=1) outright for the whole kernel, or disabling the warning (`-Wstack-frame-larger-than=`) which can flag issues unrelated to KASAN? To Evgenii's idea, I vote that the compiler is incorrect here, and we shouldn't start turning things off. Evgenii, do you have some sense of how to tune the inliner as you described? -- Thanks, ~Nick Desaulniers
On Wed, Feb 20, 2019 at 10:07:36AM -0800, Nick Desaulniers wrote: > I like Evgenii's idea: > https://bugs.llvm.org/show_bug.cgi?id=38809#c10 That's a suggestion to tune the inlining heuristics. > While I myself share Arnd's goal of driving compiler warnings to zero, > in general I'd prefer not to disable warning-producing-features or > disable warnings outright for cases where we have some ideas of > changes we can make to the compiler. There's probably a list now of > false warnings produced by old versions of Clang from bugs in Clang > that we fixed. I'm not interested in additionally trying to work > around those somehow in kernel sources. We do have infrastructure in the kernel for managing warnings based on compiler version (Arnd was looking at some improvements to that IIRC), if we've got a kernel that builds with a given compiler it's worth looking at tuning what we do with that compiler. If newer versions of the compiler work better or have new options we can turn things on for them. > Qian previously pointed out that most drivers don't produce this > warning under KASAN+Clang. While 114 is a lot, what are the chances > that someone NEEDS a KASAN+Clang build to compile warning free and > happen to include one of these problematic drivers? And if there is a > chance they do observe the warning, are we doing a disservice by > disabling the feature (-asan-stack=1) outright for the whole kernel, > or disabling the warning (`-Wstack-frame-larger-than=`) which can flag > issues unrelated to KASAN? People doing treewide work and subsystem maintainers are a reasonably important target for this sort of thing - for example people looking at the kernelci output. It's a lot easier to pay attention to problems if you don't have to wade through large numbers of false positives.
On Wed, Feb 20, 2019 at 10:44 AM Mark Brown <broonie@kernel.org> wrote: > > On Wed, Feb 20, 2019 at 10:07:36AM -0800, Nick Desaulniers wrote: > > > I like Evgenii's idea: > > https://bugs.llvm.org/show_bug.cgi?id=38809#c10 > > That's a suggestion to tune the inlining heuristics. Yes; but it will also improve KASAN (if feasible). > > While I myself share Arnd's goal of driving compiler warnings to zero, > > in general I'd prefer not to disable warning-producing-features or > > disable warnings outright for cases where we have some ideas of > > changes we can make to the compiler. There's probably a list now of > > false warnings produced by old versions of Clang from bugs in Clang > > that we fixed. I'm not interested in additionally trying to work > > around those somehow in kernel sources. > > We do have infrastructure in the kernel for managing warnings based on > compiler version (Arnd was looking at some improvements to that IIRC), > if we've got a kernel that builds with a given compiler it's worth > looking at tuning what we do with that compiler. If newer versions of > the compiler work better or have new options we can turn things on for > them. so maybe something like (pseudocode): if kasan && clang && clang_version < 9: disable -Wframe-larger-than= If you overrun the stack with KASAN, a warning would be nice, but you'll hopefully find out the hard way at runtime. And that doesn't require up to 114 Makefile changes, which would be kind of obnoxious for this papercut. > > > Qian previously pointed out that most drivers don't produce this > > warning under KASAN+Clang. While 114 is a lot, what are the chances > > that someone NEEDS a KASAN+Clang build to compile warning free and > > happen to include one of these problematic drivers? And if there is a > > chance they do observe the warning, are we doing a disservice by > > disabling the feature (-asan-stack=1) outright for the whole kernel, > > or disabling the warning (`-Wstack-frame-larger-than=`) which can flag > > issues unrelated to KASAN? > > People doing treewide work and subsystem maintainers are a reasonably > important target for this sort of thing - for example people looking at > the kernelci output. It's a lot easier to pay attention to problems if > you don't have to wade through large numbers of false positives. Good point. Current reports are a flood of -Wframe-larger-than= because of KASAN (we've fixed just about everything else), and I have to pick out what's new from that sea of false positives. I would hate for these warnings from KASAN to be the last thing before people start taking clang builds seriously due to false positive warnings. -- Thanks, ~Nick Desaulniers
On Wed, Feb 20, 2019 at 9:02 PM Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Wed, Feb 20, 2019 at 10:44 AM Mark Brown <broonie@kernel.org> wrote: > > > > On Wed, Feb 20, 2019 at 10:07:36AM -0800, Nick Desaulniers wrote: > > > > > I like Evgenii's idea: > > > https://bugs.llvm.org/show_bug.cgi?id=38809#c10 > > > > That's a suggestion to tune the inlining heuristics. > > Yes; but it will also improve KASAN (if feasible). From my experiments that I cited in the bug report, very few cases seem to actually be the result of different inlining decisions, and those that are tend to be kernel bugs (e.g. in drivers/scsi/bfa/bfa_fcs_lport.c clang reports a larger stack than gcc because it combines two functions that individually use 800 bytes each, but if one calls the other, we use even more stack than the combined function). In the example in https://bugs.llvm.org/show_bug.cgi?id=38809#c12 (https://godbolt.org/z/ylsGSQ) there is no inlining, yet clang uses over ten times as much stack space as gcc, for reasons I still can't explain. My assumption right now is that the underlying bug causes most of the problems with excessive stack usage in allmodconfig kernels. > > > While I myself share Arnd's goal of driving compiler warnings to zero, > > > in general I'd prefer not to disable warning-producing-features or > > > disable warnings outright for cases where we have some ideas of > > > changes we can make to the compiler. There's probably a list now of > > > false warnings produced by old versions of Clang from bugs in Clang > > > that we fixed. I'm not interested in additionally trying to work > > > around those somehow in kernel sources. > > > > We do have infrastructure in the kernel for managing warnings based on > > compiler version (Arnd was looking at some improvements to that IIRC), > > if we've got a kernel that builds with a given compiler it's worth > > looking at tuning what we do with that compiler. If newer versions of > > the compiler work better or have new options we can turn things on for > > them. > > so maybe something like (pseudocode): > if kasan && clang && clang_version < 9: > disable -Wframe-larger-than= > > If you overrun the stack with KASAN, a warning would be nice, but > you'll hopefully find out the hard way at runtime. And that doesn't > require up to 114 Makefile changes, which would be kind of obnoxious > for this papercut. That would mean that build testing allmodconfig with clang would ignore all the interesting stack overflow bugs that happen in real user systems without KASAN. Arnd
On Wed, Feb 20, 2019 at 10:13 PM Arnd Bergmann <arnd@arndb.de> wrote: > > In the example in https://bugs.llvm.org/show_bug.cgi?id=38809#c12 > (https://godbolt.org/z/ylsGSQ) there is no inlining, yet clang uses > over ten times as much stack space as gcc, for reasons I still > can't explain. My assumption right now is that the underlying bug > causes most of the problems with excessive stack usage in > allmodconfig kernels. Here is an even more minimal example: struct s { int i[5]; } f(void); void g(void) { f(); f();} https://godbolt.org/z/d_KWkh It's clear that clang does /something/ here when asan-stack=1 is set, but I fail to see what it is, or why that is necessary. The output of clang with asan-stack=0 is the expected code, and basically identical to what gcc produces with or without asan-stack. Arnd
On Wed, Feb 20, 2019 at 1:40 PM Arnd Bergmann <arnd@arndb.de> wrote: > > On Wed, Feb 20, 2019 at 10:13 PM Arnd Bergmann <arnd@arndb.de> wrote: > > > > In the example in https://bugs.llvm.org/show_bug.cgi?id=38809#c12 > > (https://godbolt.org/z/ylsGSQ) there is no inlining, yet clang uses > > over ten times as much stack space as gcc, for reasons I still > > can't explain. My assumption right now is that the underlying bug > > causes most of the problems with excessive stack usage in > > allmodconfig kernels. > > Here is an even more minimal example: > > struct s { int i[5]; } f(void); > void g(void) { f(); f();} On this example I can see some stupidity that clang/asan is doing. Let me try fixing it and see if it helps bigger cases. Thanks for reducing the case! This is the input we get in the asan instrumentation: ; Function Attrs: noinline nounwind optnone sanitize_address uwtable define dso_local void @g() #0 { entry: %tmp = alloca %struct.s, align 4 <<<<<<<<<<<<<<<<<<<<<<< %tmp1 = alloca %struct.s, align 4 %0 = bitcast %struct.s* %tmp to i8* call void @llvm.lifetime.start.p0i8(i64 20, i8* %0) #3 call void @f(%struct.s* sret %tmp) %1 = bitcast %struct.s* %tmp to i8* <<<<<<<<<<<<<<<<<<<<< call void @llvm.lifetime.end.p0i8(i64 20, i8* %1) #3 %2 = bitcast %struct.s* %tmp1 to i8* call void @llvm.lifetime.start.p0i8(i64 20, i8* %2) #3 <<<<<<<<<<<<< call void @f(%struct.s* sret %tmp1) %3 = bitcast %struct.s* %tmp1 to i8* call void @llvm.lifetime.end.p0i8(i64 20, i8* %3) #3 ret void } the stack variables are not *really* used, but since they are "used" inside the lifetime markers they are not eliminated by asan, and so asan instruments them, after which no one can remove them any more... > > https://godbolt.org/z/d_KWkh > > It's clear that clang does /something/ here when asan-stack=1 is > set, but I fail to see what it is, or why that is necessary. > > The output of clang with asan-stack=0 is the expected > code, and basically identical to what gcc produces with or > without asan-stack. > > Arnd
On Wed, Feb 20, 2019 at 2:12 PM Kostya Serebryany <kcc@google.com> wrote: > > On Wed, Feb 20, 2019 at 1:40 PM Arnd Bergmann <arnd@arndb.de> wrote: > > > > On Wed, Feb 20, 2019 at 10:13 PM Arnd Bergmann <arnd@arndb.de> wrote: > > > > > > In the example in https://bugs.llvm.org/show_bug.cgi?id=38809#c12 > > > (https://godbolt.org/z/ylsGSQ) there is no inlining, yet clang uses > > > over ten times as much stack space as gcc, for reasons I still > > > can't explain. My assumption right now is that the underlying bug > > > causes most of the problems with excessive stack usage in > > > allmodconfig kernels. > > > > Here is an even more minimal example: > > > > struct s { int i[5]; } f(void); > > void g(void) { f(); f();} > > On this example I can see some stupidity that clang/asan is doing. > Let me try fixing it and see if it helps bigger cases. > Thanks for reducing the case! > > This is the input we get in the asan instrumentation: > > ; Function Attrs: noinline nounwind optnone sanitize_address uwtable > define dso_local void @g() #0 { > entry: > %tmp = alloca %struct.s, align 4 <<<<<<<<<<<<<<<<<<<<<<< > %tmp1 = alloca %struct.s, align 4 > %0 = bitcast %struct.s* %tmp to i8* > call void @llvm.lifetime.start.p0i8(i64 20, i8* %0) #3 > call void @f(%struct.s* sret %tmp) > %1 = bitcast %struct.s* %tmp to i8* <<<<<<<<<<<<<<<<<<<<< > call void @llvm.lifetime.end.p0i8(i64 20, i8* %1) #3 > %2 = bitcast %struct.s* %tmp1 to i8* > call void @llvm.lifetime.start.p0i8(i64 20, i8* %2) #3 <<<<<<<<<<<<< > call void @f(%struct.s* sret %tmp1) > %3 = bitcast %struct.s* %tmp1 to i8* > call void @llvm.lifetime.end.p0i8(i64 20, i8* %3) #3 > ret void > } Err.. taking my words back. These allocas *are* used other then in lifetime markers, since they are passed to f() as 'sret'. And we can not drop instrumentation for such allocas. Example: static volatile int zero = 0; typedef struct { int ar[5]; } S; S foo() { S s; s.ar[zero + 6] = 42; return s; } int main(int argc, char **argv) { S s = foo(); return s.ar[argc]; } % clang -g -O1 -fsanitize=address sret.c && ./a.out ==5822==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffcad027f78 at pc 0x0000004f878d bp 0x7ffcad027f20 sp 0x7ffcad027f18 WRITE of size 4 at 0x7ffcad027f78 thread T0 #0 0x4f878c in foo sret.c:7:18 #1 0x4f8838 in main sret.c:11:9 Address 0x7ffcad027f78 is located in stack of thread T0 at offset 56 in frame #0 0x4f879f in main sret.c:10 This frame has 1 object(s): [32, 52) 's' (line 11) <== Memory access at offset 56 overflows this variable Here we have a struct return that needs to be instrumented inside main so that a buffer overflow in foo() is detected. Now, I am also not confident that the reduced case reflects the real problem. --kcc > > the stack variables are not *really* used, but since they are "used" > inside the lifetime markers they are not eliminated by asan, > and so asan instruments them, after which no one can remove them any more... > > > > > > > > https://godbolt.org/z/d_KWkh > > > > It's clear that clang does /something/ here when asan-stack=1 is > > set, but I fail to see what it is, or why that is necessary. > > > > The output of clang with asan-stack=0 is the expected > > code, and basically identical to what gcc produces with or > > without asan-stack. > > > > Arnd
On 2/20/19 8:35 PM, Arnd Bergmann wrote: > On Wed, Feb 20, 2019 at 6:00 PM Andrey Ryabinin <aryabinin@virtuozzo.com> wrote: >> On 2/20/19 5:51 PM, Arnd Bergmann wrote: >>> On Wed, Feb 20, 2019 at 3:45 PM Andrey Konovalov <andreyknvl@google.com> wrote: >>> I would have to some more research, but I expect several hundred >>> patches before we get to a clean randconfig build with a broken >>> compiler. >> >> Manually maintaining asan-stack parameter for the sake of one broken compiler isn't a great idea either. >> >> Couple alternative suggestions: >> >> 1) If we can't fix the problem or the cost of fixing is too high, maybe just hide it? Disable -Wframe-larger-then on pre clang-9 compilers. >> >> 2) Fallback cflags. The idea is to try to compile every the file with "-mllvm -asan-stack=1 -Wframe-larger-than=2048 -Werror" at first, >> and fallback to "-mllvm -asan-stack=0" if failed. So it would be something similar to $(call cc-option, -mllvm -asan-stack=1 -Wframe-larger-than=2048 -Werror, -mllvm -asan-stack=0) >> except that "cc-option" tries options only once on some code example while we need to try options on every file that we actually compile. >> Honestly, I'm not sure that it's worthy to hack Kbuild engine for that particular use-case. > > My original plan was to put this under CONFIG_KASAN_EXTRA to allow you > to still enable it in older compilers, but you just removed that option ;-) > > Maybe bringing it back would be a compromise? That way it's hidden from > all the build testing bots (because of the !CONFIG_COMPILE_TEST dependency), > but anyone who really wants it can still have the option, and set > CONFIG_FRAME_WARN > to whichever value they like. > I think there is much simpler solution: --- lib/Kconfig.kasan | 2 ++ 1 file changed, 2 insertions(+) diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan index 219cddc913ac..6cd035f06cee 100644 --- a/lib/Kconfig.kasan +++ b/lib/Kconfig.kasan @@ -105,6 +105,8 @@ endchoice config KASAN_STACK int + range 0 1 + prompt "Enable stack instrumentation (unsafe)" if CC_IS_CLANG && (CLANG_VERSION < 90000) default 0 if CC_IS_CLANG && (CLANG_VERSION < 90000) default 1 help -- AFAIK, randconfig is not able to randomize int config options, so it will be disabled for build robots, but users still will be able to enable it.
On Thu, Feb 21, 2019 at 11:06 AM Andrey Ryabinin <aryabinin@virtuozzo.com> wrote: > On 2/20/19 8:35 PM, Arnd Bergmann wrote: > > On Wed, Feb 20, 2019 at 6:00 PM Andrey Ryabinin <aryabinin@virtuozzo.com> wrote: > >> On 2/20/19 5:51 PM, Arnd Bergmann wrote: > > Maybe bringing it back would be a compromise? That way it's hidden from > > all the build testing bots (because of the !CONFIG_COMPILE_TEST dependency), > > but anyone who really wants it can still have the option, and set > > CONFIG_FRAME_WARN > > to whichever value they like. > > > > > I think there is much simpler solution: > > --- > lib/Kconfig.kasan | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan > index 219cddc913ac..6cd035f06cee 100644 > --- a/lib/Kconfig.kasan > +++ b/lib/Kconfig.kasan > @@ -105,6 +105,8 @@ endchoice > > config KASAN_STACK > int > + range 0 1 > + prompt "Enable stack instrumentation (unsafe)" if CC_IS_CLANG && (CLANG_VERSION < 90000) > default 0 if CC_IS_CLANG && (CLANG_VERSION < 90000) > default 1 > help > -- > > > AFAIK, randconfig is not able to randomize int config options, so it will be disabled for build robots, > but users still will be able to enable it. Right, this will work, but I find it a bit awkward to require users to enter 0 or 1. My assumption is that build bots turn on CONFIG_COMPILE_TEST, so having a bool option that depends on COMPILE_TEST would be more conventional. We can debate whether it should also depend on CONFIG_EXPERT or not. Something like config KASAN_STACK bool "Enable stack instrumentation (unsafe)" if CC_IS_CLANG && !COMPILE_TEST default CC_IS_GCC || (CLANG_VERSION >= 90000) And then a simpler Makefile logic (could also be done in Kconfig) to turn that bool symbol into an integer argument for asan-stack= Arnd
On 2/21/19 6:19 PM, Arnd Bergmann wrote: > On Thu, Feb 21, 2019 at 11:06 AM Andrey Ryabinin > <aryabinin@virtuozzo.com> wrote: >> On 2/20/19 8:35 PM, Arnd Bergmann wrote: >>> On Wed, Feb 20, 2019 at 6:00 PM Andrey Ryabinin <aryabinin@virtuozzo.com> wrote: >>>> On 2/20/19 5:51 PM, Arnd Bergmann wrote: > >>> Maybe bringing it back would be a compromise? That way it's hidden from >>> all the build testing bots (because of the !CONFIG_COMPILE_TEST dependency), >>> but anyone who really wants it can still have the option, and set >>> CONFIG_FRAME_WARN >>> to whichever value they like. >>> >> >> >> I think there is much simpler solution: >> >> --- >> lib/Kconfig.kasan | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan >> index 219cddc913ac..6cd035f06cee 100644 >> --- a/lib/Kconfig.kasan >> +++ b/lib/Kconfig.kasan >> @@ -105,6 +105,8 @@ endchoice >> >> config KASAN_STACK >> int >> + range 0 1 >> + prompt "Enable stack instrumentation (unsafe)" if CC_IS_CLANG && (CLANG_VERSION < 90000) >> default 0 if CC_IS_CLANG && (CLANG_VERSION < 90000) >> default 1 >> help >> -- >> >> >> AFAIK, randconfig is not able to randomize int config options, so it will be disabled for build robots, >> but users still will be able to enable it. > > Right, this will work, but I find it a bit awkward to require users to > enter 0 or 1. > > My assumption is that build bots turn on CONFIG_COMPILE_TEST, so > having a bool option that depends on COMPILE_TEST would be more > conventional. We can debate whether it should also depend on > CONFIG_EXPERT or not. Something like > > config KASAN_STACK > bool "Enable stack instrumentation (unsafe)" if CC_IS_CLANG > && !COMPILE_TEST > default CC_IS_GCC || (CLANG_VERSION >= 90000) > > And then a simpler Makefile logic (could also be done in Kconfig) to turn > that bool symbol into an integer argument for asan-stack= > Sounds good.
On Thu, Feb 21, 2019 at 12:47 AM Kostya Serebryany <kcc@google.com> wrote: > > On Wed, Feb 20, 2019 at 2:12 PM Kostya Serebryany <kcc@google.com> wrote: > > > > On Wed, Feb 20, 2019 at 1:40 PM Arnd Bergmann <arnd@arndb.de> wrote: > > > > > > On Wed, Feb 20, 2019 at 10:13 PM Arnd Bergmann <arnd@arndb.de> wrote: > > > > > > > > In the example in https://bugs.llvm.org/show_bug.cgi?id=38809#c12 > > > > (https://godbolt.org/z/ylsGSQ) there is no inlining, yet clang uses > > > > over ten times as much stack space as gcc, for reasons I still > > > > can't explain. My assumption right now is that the underlying bug > > > > causes most of the problems with excessive stack usage in > > > > allmodconfig kernels. > > > > > > Here is an even more minimal example: > > > > > > struct s { int i[5]; } f(void); > > > void g(void) { f(); f();} > > > > On this example I can see some stupidity that clang/asan is doing. > > Let me try fixing it and see if it helps bigger cases. > > Thanks for reducing the case! > > > > This is the input we get in the asan instrumentation: > > > > ; Function Attrs: noinline nounwind optnone sanitize_address uwtable > > define dso_local void @g() #0 { > > entry: > > %tmp = alloca %struct.s, align 4 <<<<<<<<<<<<<<<<<<<<<<< > > %tmp1 = alloca %struct.s, align 4 > > %0 = bitcast %struct.s* %tmp to i8* > > call void @llvm.lifetime.start.p0i8(i64 20, i8* %0) #3 > > call void @f(%struct.s* sret %tmp) > > %1 = bitcast %struct.s* %tmp to i8* <<<<<<<<<<<<<<<<<<<<< > > call void @llvm.lifetime.end.p0i8(i64 20, i8* %1) #3 > > %2 = bitcast %struct.s* %tmp1 to i8* > > call void @llvm.lifetime.start.p0i8(i64 20, i8* %2) #3 <<<<<<<<<<<<< > > call void @f(%struct.s* sret %tmp1) > > %3 = bitcast %struct.s* %tmp1 to i8* > > call void @llvm.lifetime.end.p0i8(i64 20, i8* %3) #3 > > ret void > > } > > Err.. taking my words back. > These allocas *are* used other then in lifetime markers, since they > are passed to f() as 'sret'. Thanks a lot for the analysis! > And we can not drop instrumentation for such allocas. Example: > > static volatile int zero = 0; > typedef struct { > int ar[5]; > } S; > S foo() { > S s; > s.ar[zero + 6] = 42; > return s; > } > int main(int argc, char **argv) { > S s = foo(); > return s.ar[argc]; > } > > % clang -g -O1 -fsanitize=address sret.c && ./a.out > > ==5822==ERROR: AddressSanitizer: stack-buffer-overflow on address > 0x7ffcad027f78 at pc 0x0000004f878d bp 0x7ffcad027f20 sp > 0x7ffcad027f18 > WRITE of size 4 at 0x7ffcad027f78 thread T0 > #0 0x4f878c in foo sret.c:7:18 > #1 0x4f8838 in main sret.c:11:9 > Address 0x7ffcad027f78 is located in stack of thread T0 at offset 56 in frame > #0 0x4f879f in main sret.c:10 > > This frame has 1 object(s): > [32, 52) 's' (line 11) <== Memory access at offset 56 overflows > this variable > > Here we have a struct return that needs to be instrumented inside main > so that a buffer overflow in foo() is detected. I tried reproducing this with gcc. The example you list above will generate very similar code on gcc. However, if I change main() to not assign the return value of foo() to a local variable, I get a trivial main main function without sanitizer code, but foo() still detects the overflow: ================================================================= ==3442660==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fff5682bd58 at pc 0x0000004009c3 bp 0x7fff5682bd10 sp 0x7fff5682bd00 WRITE of size 4 at 0x7fff5682bd58 thread T0 #0 0x4009c2 in foo (/tmp/a.out+0x4009c2) #1 0x4009dd in main (/tmp/a.out+0x4009dd) #2 0x7fc6f0ce482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f) #3 0x400808 in _start (/tmp/a.out+0x400808) Address 0x7fff5682bd58 is located in stack of thread T0 at offset 56 in frame #0 0x4008c6 in foo (/tmp/a.out+0x4008c6) This frame has 1 object(s): [32, 52) 's' <== Memory access at offset 56 overflows this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext (longjmp and C++ exceptions *are* supported) SUMMARY: AddressSanitizer: stack-buffer-overflow (/tmp/a.out+0x4009c2) in foo The other difference is that with gcc, we only get one copy of each stack variable even when it adds code from asan-stack to check it, while clang generally creates one copy per instance when asan-stack=1 is set (but doesn't otherwise). > Now, I am also not confident that the reduced case reflects the real problem. No, it probably doesn't, but it seems vaguely related. This is a problem of using creduce, it sometimes reduces the test cases too much and runs into a different symptom. The case we frequently see in the kernel is probably more like void extf(long *); static inline __attribute__((always_inline)) void f(void) { long i = 7; extg(&i); } int main(void) { f(); f(); } https://godbolt.org/z/ReEQLo Here, both gcc and clang add sanitizing for 'i', but clang has multiple copies, while gcc only has one. This happens for both scalars and structs, and inline functions as well as open-coded blocks or macros with the same code in them. Using the inline function for illustration above since that is the most common way it appears in the kernel. Arnd
diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan index 67d7d1309c52..219cddc913ac 100644 --- a/lib/Kconfig.kasan +++ b/lib/Kconfig.kasan @@ -103,6 +103,19 @@ config KASAN_INLINE endchoice +config KASAN_STACK + int + default 0 if CC_IS_CLANG && (CLANG_VERSION < 90000) + default 1 + help + The LLVM stack address sanitizer has a know bug that + causes excessive stack usage in a lot of functions, see + https://bugs.llvm.org/show_bug.cgi?id=38809 + Disabling asan-stack makes it safe to run kernels build + with clang-8 with KASAN enabled, though it loses some of + the functionality. We assume that clang-9 will have a fix, + so the feature can be used. + config KASAN_S390_4_LEVEL_PAGING bool "KASan: use 4-level paging" depends on KASAN && S390 diff --git a/scripts/Makefile.kasan b/scripts/Makefile.kasan index f1fb8e502657..6410bd22fe38 100644 --- a/scripts/Makefile.kasan +++ b/scripts/Makefile.kasan @@ -26,7 +26,7 @@ else CFLAGS_KASAN := $(CFLAGS_KASAN_SHADOW) \ $(call cc-param,asan-globals=1) \ $(call cc-param,asan-instrumentation-with-call-threshold=$(call_threshold)) \ - $(call cc-param,asan-stack=1) \ + $(call cc-param,asan-stack=$(CONFIG_KASAN_STACK)) \ $(call cc-param,asan-instrument-allocas=1) endif
Building an arm64 allmodconfig kernel with clang results in over 140 warnings about overly large stack frames, the worst ones being: drivers/gpu/drm/panel/panel-sitronix-st7789v.c:196:12: error: stack frame size of 20224 bytes in function 'st7789v_prepare' drivers/video/fbdev/omap2/omapfb/displays/panel-tpo-td028ttec1.c:196:12: error: stack frame size of 13120 bytes in function 'td028ttec1_panel_enable' drivers/usb/host/max3421-hcd.c:1395:1: error: stack frame size of 10048 bytes in function 'max3421_spi_thread' drivers/net/wan/slic_ds26522.c:209:12: error: stack frame size of 9664 bytes in function 'slic_ds26522_probe' drivers/crypto/ccp/ccp-ops.c:2434:5: error: stack frame size of 8832 bytes in function 'ccp_run_cmd' drivers/media/dvb-frontends/stv0367.c:1005:12: error: stack frame size of 7840 bytes in function 'stv0367ter_algo' None of these happen with gcc today, and almost all of these are the result of a single known bug in llvm. Hopefully it will eventually get fixed with the clang-9 release. In the meantime, the best idea I have is to turn off asan-stack for clang-8 and earlier, so we can produce a kernel that is safe to run. I have posted three patches that address the frame overflow warnings that are not addressed by turning off asan-stack, so in combination with this change, we get much closer to a clean allmodconfig build, which in turn is necessary to do meaningful build regression testing. Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Mark Brown <broonie@kernel.org> Cc: Qian Cai <cai@lca.pw> Link: https://bugs.llvm.org/show_bug.cgi?id=38809 Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- lib/Kconfig.kasan | 13 +++++++++++++ scripts/Makefile.kasan | 2 +- 2 files changed, 14 insertions(+), 1 deletion(-) -- 2.20.0