Message ID | 1545062607-8599-1-git-send-email-yamada.masahiro@socionext.com
---|---
Series | x86, kbuild: revert macrofying inline assembly code
> On Dec 17, 2018, at 8:03 AM, Masahiro Yamada <yamada.masahiro@socionext.com> wrote:
>
> This series reverts the in-kernel workarounds for inlining issues.
>
> The commit description of 77b0bf55bc67 mentioned
> "We also hope that GCC will eventually get fixed,..."
>
> Now, GCC provides a solution.
>
> https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
> explains the new "asm inline" syntax.
>
> The performance issue will be eventually solved.
>
> [About Code cleanups]
>
> I know Nadam Amit is opposed to the full revert.

My name is Nadav.

> He also claims his motivation for macrofying was not only
> performance, but also cleanups.

Masahiro, I understand your concerns and criticism, and indeed various
alternative solutions exist. I do have my reservations about the one you
propose, since it makes coding more complicated to simplify the Make system.
Yet, more important, starting this discussion suddenly now after RC7 is
strange.

Anyhow, since it’s obviously not my call, please don’t make it sound as if
I am a side in the decision.
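As background for readers of the archive, here is a minimal sketch of the "asm inline" qualifier that the GCC documentation linked above introduces. This assumes a GCC 9+ toolchain; the function name and the instruction chosen are invented for illustration and are not code from the series:

/*
 * GCC estimates an inline asm statement's cost from the length of its
 * string, so a one-instruction wrapper padded with section-pushing
 * annotations looks "big" and can defeat inlining of its caller.
 * "asm inline" tells GCC to treat the statement as the smallest
 * possible size for inlining decisions, however long the string is.
 */
static inline void example_annotated_nop(void)
{
	asm inline volatile("nop\n\t"
			    /* imagine many lines of .pushsection
			     * annotations here; with plain asm() they
			     * would inflate the size estimate */
			    ::: "memory");
}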
On Wed, Dec 19, 2018 at 5:26 AM Nadav Amit <namit@vmware.com> wrote:
>
> > On Dec 17, 2018, at 8:03 AM, Masahiro Yamada <yamada.masahiro@socionext.com> wrote:
> >
> > This series reverts the in-kernel workarounds for inlining issues.
> >
> > The commit description of 77b0bf55bc67 mentioned
> > "We also hope that GCC will eventually get fixed,..."
> >
> > Now, GCC provides a solution.
> >
> > https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
> > explains the new "asm inline" syntax.
> >
> > The performance issue will be eventually solved.
> >
> > [About Code cleanups]
> >
> > I know Nadam Amit is opposed to the full revert.
>
> My name is Nadav.

Sorry about that.
(I relied on a spell checker. I should be more careful about typos in
people's names.)

> > He also claims his motivation for macrofying was not only
> > performance, but also cleanups.
>
> Masahiro, I understand your concerns and criticism, and indeed various
> alternative solutions exist. I do have my reservations about the one you
> propose, since it makes coding more complicated to simplify the Make system.
> Yet, more important, starting this discussion suddenly now after RC7 is
> strange.

I did not think this was so sudden.
When I suggested the revert a few weeks ago, Borislav was for it.
I did not hear from the other x86 maintainers.

Anyway, it is true we are running out of time for the release.

> Anyhow, since it’s obviously not my call, please don’t make it
> sound as if I am a side in the decision.

Not my call, either.
That's why I put the x86 maintainers in the TO list, and other people in CC.

The original patch set went in via the x86 tree. So its revert set,
if we want it, should go in via the same path.

Anyway, we have to do something for v4.20.

Maybe discussing short-term / long-term solutions separately could
move things forward.

[1] Solution for v4.20
[2] Solution for future kernels

If we do not want to see the revert for [1], the best I can suggest
is to copy arch/x86/kernel/macros.s to include/generated/macros.h
so that it is kept for the external module build.
(It is not literally included by anyone, but should work at least.)

For [2], what do we want to see?

--
Best Regards
Masahiro Yamada
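As background on why external module builds are affected at all, here is a hedged sketch, not code from the series: under the macrofied scheme, C inline asm emits bare macro names that only the assembler expands, using macro definitions from arch/x86/kernel/macros.s that the build feeds to gas (via a -Wa,... flag in the Makefile, if memory serves). Roughly:

/*
 * Sketch of the macrofied pattern.  LOCK_PREFIX here is an
 * *assembler* macro defined via arch/x86/kernel/macros.s, not a C
 * preprocessor macro: the C compiler copies the name verbatim into
 * the assembly output, and gas expands it.  An external module build
 * therefore needs macros.s (or a generated copy, such as the
 * include/generated/macros.h suggested above) to assemble this.
 */
static inline void lock_inc_sketch(unsigned long *v)
{
	asm volatile("LOCK_PREFIX incq %0"	/* expanded by gas, not cpp */
		     : "+m" (*v));
}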
* Masahiro Yamada <yamada.masahiro@socionext.com> wrote:

> This series reverts the in-kernel workarounds for inlining issues.
>
> The commit description of 77b0bf55bc67 mentioned
> "We also hope that GCC will eventually get fixed,..."
>
> Now, GCC provides a solution.
>
> https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
> explains the new "asm inline" syntax.
>
> The performance issue will be eventually solved.
>
> [About Code cleanups]
>
> I know Nadam Amit is opposed to the full revert.
> He also claims his motivation for macrofying was not only
> performance, but also cleanups.
>
> IIUC, the criticism addresses the code duplication between C and ASM.
>
> If so, I'd like to suggest a different approach for cleanups.
> Please see the last 3 patches.
> IMHO, preprocessor approach is more straight-forward, and readable.
> Basically, this idea should work because it is what we already do for
> __ASM_FORM() etc.
>
> [Quick Test of "asm inline" of GCC 9]
>
> If you want to try "asm inline" feature, the patch is available:
> https://lore.kernel.org/patchwork/patch/1024590/
>
> The number of symbols for arch/x86/configs/x86_64_defconfig:
>
>                           nr_symbols
> [1] v4.20-rc7           :   96502
> [2] [1]+full revert     :   96705   (+203)
> [3] [2]+"asm inline"    :   96568   (-137)
>
> [3]: apply my patch, then replace "asm" -> "asm_inline"
> for _BUG_FLAGS(), refcount_add(), refcount_inc(), refcount_dec(),
> annotate_reachable(), annotate_unreachable()
>
>
> Changes in v3:
>   - Split into per-commit revert (per Nadav Amit)
>   - Add some cleanups with preprocessor approach
>
> Changes in v2:
>   - Revive clean-ups made by 5bdcd510c2ac (per Peter Zijlstra)
>   - Fix commit quoting style (per Peter Zijlstra)
>
> Masahiro Yamada (12):
>   Revert "x86/jump-labels: Macrofy inline assembly code to work around
>     GCC inlining bugs"
>   Revert "x86/cpufeature: Macrofy inline assembly code to work around
>     GCC inlining bugs"
>   Revert "x86/extable: Macrofy inline assembly code to work around GCC
>     inlining bugs"
>   Revert "x86/paravirt: Work around GCC inlining bugs when compiling
>     paravirt ops"
>   Revert "x86/bug: Macrofy the BUG table section handling, to work
>     around GCC inlining bugs"
>   Revert "x86/alternatives: Macrofy lock prefixes to work around GCC
>     inlining bugs"
>   Revert "x86/refcount: Work around GCC inlining bug"
>   Revert "x86/objtool: Use asm macros to work around GCC inlining bugs"
>   Revert "kbuild/Makefile: Prepare for using macros in inline assembly
>     code to work around asm() related GCC inlining bugs"
>   linux/linkage: add ASM() macro to reduce duplication between C/ASM
>     code
>   x86/alternatives: consolidate LOCK_PREFIX macro
>   x86/asm: consolidate ASM_EXTABLE_* macros
>
>  Makefile                                  |  9 +--
>  arch/x86/Makefile                         |  7 ---
>  arch/x86/include/asm/alternative-asm.h    | 22 +------
>  arch/x86/include/asm/alternative-common.h | 47 +++++++++++++++
>  arch/x86/include/asm/alternative.h        | 30 +---------
>  arch/x86/include/asm/asm.h                | 46 +++++----------
>  arch/x86/include/asm/bug.h                | 98 +++++++++++++------------------
>  arch/x86/include/asm/cpufeature.h         | 82 +++++++++++---------------
>  arch/x86/include/asm/jump_label.h         | 22 +++++--
>  arch/x86/include/asm/paravirt_types.h     | 56 +++++++++---------
>  arch/x86/include/asm/refcount.h           | 81 +++++++++++--------------
>  arch/x86/kernel/macros.S                  | 16 -----
>  include/asm-generic/bug.h                 |  8 +--
>  include/linux/compiler.h                  | 56 ++++-------------
>  include/linux/linkage.h                   |  8 +++
>  scripts/Kbuild.include                    |  4 +-
>  scripts/mod/Makefile                      |  2 -
>  17 files changed, 249 insertions(+), 345 deletions(-)
>  create mode 100644 arch/x86/include/asm/alternative-common.h
>  delete mode 100644 arch/x86/kernel/macros.S

I absolutely agree that this needs to be resolved in v4.20.

So I did the 1-9 reverts manually myself as well, because I think the
first commit should be reverted fully to get as close to the starting
point as possible (we are late in the cycle) - and came to the attached
interdiff between your series and mine.

Does this approach look OK to you, or did I miss something?

Thanks,

	Ingo

=============>

 entry/calling.h          |    2 -
 include/asm/jump_label.h |   50 ++++++++++++++++++++++++++++++++++-------------
 2 files changed, 38 insertions(+), 14 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 25e5a6bda8c3..20d0885b00fb 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -352,7 +352,7 @@ For 32-bit we have the following conventions - kernel is built with
 .macro CALL_enter_from_user_mode
 #ifdef CONFIG_CONTEXT_TRACKING
 #ifdef HAVE_JUMP_LABEL
-	STATIC_BRANCH_JMP l_yes=.Lafter_call_\@, key=context_tracking_enabled, branch=1
+	STATIC_JUMP_IF_FALSE .Lafter_call_\@, context_tracking_enabled, def=0
 #endif
 	call enter_from_user_mode
 .Lafter_call_\@:
diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
index cf88ebf9a4ca..21efc9d07ed9 100644
--- a/arch/x86/include/asm/jump_label.h
+++ b/arch/x86/include/asm/jump_label.h
@@ -2,6 +2,19 @@
 #ifndef _ASM_X86_JUMP_LABEL_H
 #define _ASM_X86_JUMP_LABEL_H

+#ifndef HAVE_JUMP_LABEL
+/*
+ * For better or for worse, if jump labels (the gcc extension) are missing,
+ * then the entire static branch patching infrastructure is compiled out.
+ * If that happens, the code in here will malfunction.  Raise a compiler
+ * error instead.
+ *
+ * In theory, jump labels and the static branch patching infrastructure
+ * could be decoupled to fix this.
+ */
+#error asm/jump_label.h included on a non-jump-label kernel
+#endif
+
 #define JUMP_LABEL_NOP_SIZE 5

 #ifdef CONFIG_X86_64
@@ -53,26 +66,37 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool

 #else	/* __ASSEMBLY__ */

-.macro STATIC_BRANCH_NOP l_yes:req key:req branch:req
-.Lstatic_branch_nop_\@:
-	.byte STATIC_KEY_INIT_NOP
-.Lstatic_branch_no_after_\@:
+.macro STATIC_JUMP_IF_TRUE target, key, def
+.Lstatic_jump_\@:
+	.if \def
+	/* Equivalent to "jmp.d32 \target" */
+	.byte 0xe9
+	.long \target - .Lstatic_jump_after_\@
+.Lstatic_jump_after_\@:
+	.else
+	.byte STATIC_KEY_INIT_NOP
+	.endif
 	.pushsection __jump_table, "aw"
 	_ASM_ALIGN
-	.long .Lstatic_branch_nop_\@ - ., \l_yes - .
-	_ASM_PTR \key + \branch - .
+	.long .Lstatic_jump_\@ - ., \target - .
+	_ASM_PTR \key - .
 	.popsection
 .endm

-.macro STATIC_BRANCH_JMP l_yes:req key:req branch:req
-.Lstatic_branch_jmp_\@:
-	.byte 0xe9
-	.long \l_yes - .Lstatic_branch_jmp_after_\@
-.Lstatic_branch_jmp_after_\@:
+.macro STATIC_JUMP_IF_FALSE target, key, def
+.Lstatic_jump_\@:
+	.if \def
+	.byte STATIC_KEY_INIT_NOP
+	.else
+	/* Equivalent to "jmp.d32 \target" */
+	.byte 0xe9
+	.long \target - .Lstatic_jump_after_\@
+.Lstatic_jump_after_\@:
+	.endif
 	.pushsection __jump_table, "aw"
 	_ASM_ALIGN
-	.long .Lstatic_branch_jmp_\@ - ., \l_yes - .
-	_ASM_PTR \key + \branch - .
+	.long .Lstatic_jump_\@ - ., \target - .
+	_ASM_PTR \key + 1 - .
 	.popsection
 .endm
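For readers comparing the two variants above: the assembler macros in the interdiff are the .S-side mirror of the C-side jump-label primitive. Roughly what the C side looks like in this era's arch/x86/include/asm/jump_label.h (reproduced from memory, so treat the details as approximate rather than authoritative):

static __always_inline bool arch_static_branch(struct static_key *key,
					       bool branch)
{
	/*
	 * Emit a 5-byte NOP at a recorded address; the static-key
	 * machinery later patches it to a jmp when the key is toggled.
	 * The __jump_table entry stores relative offsets of the patch
	 * site, the target label, and the key (low bit = branch flag).
	 */
	asm_volatile_goto("1:"
		".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"
		".pushsection __jump_table, \"aw\"\n\t"
		_ASM_ALIGN "\n\t"
		".long 1b - ., %l[l_yes] - .\n\t"
		_ASM_PTR "%c0 + %c1 - .\n\t"
		".popsection\n\t"
		: : "i" (key), "i" (branch) : : l_yes);

	return false;
l_yes:
	return true;
}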
On Wed, Dec 19, 2018 at 9:44 PM Ingo Molnar <mingo@kernel.org> wrote:
>
> * Masahiro Yamada <yamada.masahiro@socionext.com> wrote:
>
> > This series reverts the in-kernel workarounds for inlining issues.
> >
> > The commit description of 77b0bf55bc67 mentioned
> > "We also hope that GCC will eventually get fixed,..."
> >
> > Now, GCC provides a solution.
> >
> > https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
> > explains the new "asm inline" syntax.
> >
> > The performance issue will be eventually solved.
> >
> > [About Code cleanups]
> >
> > I know Nadam Amit is opposed to the full revert.
> > He also claims his motivation for macrofying was not only
> > performance, but also cleanups.
> >
> > IIUC, the criticism addresses the code duplication between C and ASM.
> >
> > If so, I'd like to suggest a different approach for cleanups.
> > Please see the last 3 patches.
> > IMHO, preprocessor approach is more straight-forward, and readable.
> > Basically, this idea should work because it is what we already do for
> > __ASM_FORM() etc.
> >
> > [Quick Test of "asm inline" of GCC 9]
> >
> > If you want to try "asm inline" feature, the patch is available:
> > https://lore.kernel.org/patchwork/patch/1024590/
> >
> > The number of symbols for arch/x86/configs/x86_64_defconfig:
> >
> >                           nr_symbols
> > [1] v4.20-rc7           :   96502
> > [2] [1]+full revert     :   96705   (+203)
> > [3] [2]+"asm inline"    :   96568   (-137)
> >
> > [3]: apply my patch, then replace "asm" -> "asm_inline"
> > for _BUG_FLAGS(), refcount_add(), refcount_inc(), refcount_dec(),
> > annotate_reachable(), annotate_unreachable()
> >
> >
> > Changes in v3:
> >   - Split into per-commit revert (per Nadav Amit)
> >   - Add some cleanups with preprocessor approach
> >
> > Changes in v2:
> >   - Revive clean-ups made by 5bdcd510c2ac (per Peter Zijlstra)
> >   - Fix commit quoting style (per Peter Zijlstra)
> >
> > Masahiro Yamada (12):
> >   Revert "x86/jump-labels: Macrofy inline assembly code to work around
> >     GCC inlining bugs"
> >   Revert "x86/cpufeature: Macrofy inline assembly code to work around
> >     GCC inlining bugs"
> >   Revert "x86/extable: Macrofy inline assembly code to work around GCC
> >     inlining bugs"
> >   Revert "x86/paravirt: Work around GCC inlining bugs when compiling
> >     paravirt ops"
> >   Revert "x86/bug: Macrofy the BUG table section handling, to work
> >     around GCC inlining bugs"
> >   Revert "x86/alternatives: Macrofy lock prefixes to work around GCC
> >     inlining bugs"
> >   Revert "x86/refcount: Work around GCC inlining bug"
> >   Revert "x86/objtool: Use asm macros to work around GCC inlining bugs"
> >   Revert "kbuild/Makefile: Prepare for using macros in inline assembly
> >     code to work around asm() related GCC inlining bugs"
> >   linux/linkage: add ASM() macro to reduce duplication between C/ASM
> >     code
> >   x86/alternatives: consolidate LOCK_PREFIX macro
> >   x86/asm: consolidate ASM_EXTABLE_* macros
> >
> >  Makefile                                  |  9 +--
> >  arch/x86/Makefile                         |  7 ---
> >  arch/x86/include/asm/alternative-asm.h    | 22 +------
> >  arch/x86/include/asm/alternative-common.h | 47 +++++++++++++++
> >  arch/x86/include/asm/alternative.h        | 30 +---------
> >  arch/x86/include/asm/asm.h                | 46 +++++----------
> >  arch/x86/include/asm/bug.h                | 98 +++++++++++++------------------
> >  arch/x86/include/asm/cpufeature.h         | 82 +++++++++++---------------
> >  arch/x86/include/asm/jump_label.h         | 22 +++++--
> >  arch/x86/include/asm/paravirt_types.h     | 56 +++++++++---------
> >  arch/x86/include/asm/refcount.h           | 81 +++++++++++--------------
> >  arch/x86/kernel/macros.S                  | 16 -----
> >  include/asm-generic/bug.h                 |  8 +--
> >  include/linux/compiler.h                  | 56 ++++-------------
> >  include/linux/linkage.h                   |  8 +++
> >  scripts/Kbuild.include                    |  4 +-
> >  scripts/mod/Makefile                      |  2 -
> >  17 files changed, 249 insertions(+), 345 deletions(-)
> >  create mode 100644 arch/x86/include/asm/alternative-common.h
> >  delete mode 100644 arch/x86/kernel/macros.S
>
> I absolutely agree that this needs to be resolved in v4.20.
>
> So I did the 1-9 reverts manually myself as well, because I think the
> first commit should be reverted fully to get as close to the starting
> point as possible (we are late in the cycle) - and came to the attached
> interdiff between your series and mine.
>
> Does this approach look OK to you, or did I miss something?

It looks OK to me.

I thought the diff was a good cleanup, but we can deal with it
later on, so I do not mind it.

Thanks!

> Thanks,
>
> 	Ingo
>
> =============>
>
>  entry/calling.h          |    2 -
>  include/asm/jump_label.h |   50 ++++++++++++++++++++++++++++++++++-------------
>  2 files changed, 38 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
> index 25e5a6bda8c3..20d0885b00fb 100644
> --- a/arch/x86/entry/calling.h
> +++ b/arch/x86/entry/calling.h
> @@ -352,7 +352,7 @@ For 32-bit we have the following conventions - kernel is built with
>  .macro CALL_enter_from_user_mode
>  #ifdef CONFIG_CONTEXT_TRACKING
>  #ifdef HAVE_JUMP_LABEL
> -	STATIC_BRANCH_JMP l_yes=.Lafter_call_\@, key=context_tracking_enabled, branch=1
> +	STATIC_JUMP_IF_FALSE .Lafter_call_\@, context_tracking_enabled, def=0
>  #endif
>  	call enter_from_user_mode
>  .Lafter_call_\@:
> diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
> index cf88ebf9a4ca..21efc9d07ed9 100644
> --- a/arch/x86/include/asm/jump_label.h
> +++ b/arch/x86/include/asm/jump_label.h
> @@ -2,6 +2,19 @@
>  #ifndef _ASM_X86_JUMP_LABEL_H
>  #define _ASM_X86_JUMP_LABEL_H
>
> +#ifndef HAVE_JUMP_LABEL
> +/*
> + * For better or for worse, if jump labels (the gcc extension) are missing,
> + * then the entire static branch patching infrastructure is compiled out.
> + * If that happens, the code in here will malfunction.  Raise a compiler
> + * error instead.
> + *
> + * In theory, jump labels and the static branch patching infrastructure
> + * could be decoupled to fix this.
> + */
> +#error asm/jump_label.h included on a non-jump-label kernel
> +#endif
> +
>  #define JUMP_LABEL_NOP_SIZE 5
>
>  #ifdef CONFIG_X86_64
> @@ -53,26 +66,37 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool
>
>  #else	/* __ASSEMBLY__ */
>
> -.macro STATIC_BRANCH_NOP l_yes:req key:req branch:req
> -.Lstatic_branch_nop_\@:
> -	.byte STATIC_KEY_INIT_NOP
> -.Lstatic_branch_no_after_\@:
> +.macro STATIC_JUMP_IF_TRUE target, key, def
> +.Lstatic_jump_\@:
> +	.if \def
> +	/* Equivalent to "jmp.d32 \target" */
> +	.byte 0xe9
> +	.long \target - .Lstatic_jump_after_\@
> +.Lstatic_jump_after_\@:
> +	.else
> +	.byte STATIC_KEY_INIT_NOP
> +	.endif
>  	.pushsection __jump_table, "aw"
>  	_ASM_ALIGN
> -	.long .Lstatic_branch_nop_\@ - ., \l_yes - .
> -	_ASM_PTR \key + \branch - .
> +	.long .Lstatic_jump_\@ - ., \target - .
> +	_ASM_PTR \key - .
>  	.popsection
>  .endm
>
> -.macro STATIC_BRANCH_JMP l_yes:req key:req branch:req
> -.Lstatic_branch_jmp_\@:
> -	.byte 0xe9
> -	.long \l_yes - .Lstatic_branch_jmp_after_\@
> -.Lstatic_branch_jmp_after_\@:
> +.macro STATIC_JUMP_IF_FALSE target, key, def
> +.Lstatic_jump_\@:
> +	.if \def
> +	.byte STATIC_KEY_INIT_NOP
> +	.else
> +	/* Equivalent to "jmp.d32 \target" */
> +	.byte 0xe9
> +	.long \target - .Lstatic_jump_after_\@
> +.Lstatic_jump_after_\@:
> +	.endif
>  	.pushsection __jump_table, "aw"
>  	_ASM_ALIGN
> -	.long .Lstatic_branch_jmp_\@ - ., \l_yes - .
> -	_ASM_PTR \key + \branch - .
> +	.long .Lstatic_jump_\@ - ., \target - .
> +	_ASM_PTR \key + 1 - .
>  	.popsection
>  .endm

--
Best Regards
Masahiro Yamada
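Postscript for readers: the cover letter's preprocessor approach builds on the existing __ASM_FORM() idiom from arch/x86/include/asm/asm.h, which lets one definition serve both .S files and C inline asm. A minimal self-contained sketch of the pattern; EXAMPLE_INSN and emit_example_nop are invented for illustration and are not part of the series:

/* As in asm.h: in .S files the argument is emitted as-is; in C it is
 * stringified so it can be pasted into an asm() template. */
#ifdef __ASSEMBLY__
# define __ASM_FORM(x)	x
#else
# define __ASM_FORM(x)	" " #x " "
#endif

/* Hypothetical dual-use definition built on top of it: */
#define EXAMPLE_INSN	__ASM_FORM(.byte 0x90)	/* one-byte NOP */

#ifndef __ASSEMBLY__
static inline void emit_example_nop(void)
{
	asm volatile(EXAMPLE_INSN);	/* C side: expands to " .byte 0x90 " */
}
#endif

On the C side the macro expands to a quoted string fragment that drops into an asm() template; in a .S file the same macro expands to the bare directive, so the C/ASM duplication the cleanup patches target disappears without any assembler-level macros.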