diff mbox series

objtool warnings in prerelease clang-9

Message ID CAKwvOdnGL_9cJ+ETNce89+z7CTDctjACS8DFsLu=ev4+vkVkUw@mail.gmail.com
State New
Headers show
Series objtool warnings in prerelease clang-9 | expand

Commit Message

Nick Desaulniers July 2, 2019, 8:53 p.m. UTC
TL;DR
LLVM currently has a bug when unrolling loops containing asm goto and
we have a fix in hand.

Following up from our thread last week (https://lkml.org/lkml/2019/6/29/14).

With an x86 defconfig (6fbc7275c7a9) +clang 9 (e1006259d84d), I see
the following 4 warnings from objtool:
arch/x86/events/intel/core.o: warning: objtool:
intel_pmu_nhm_workaround()+0xa8: unreachable instruction
arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool:
get_fixed_ranges()+0x9b: unreachable instruction
arch/x86/kernel/platform-quirks.o: warning: objtool:
x86_early_init_platform_quirks()+0x84: unreachable instruction
drivers/gpu/drm/i915/i915_gem_execbuffer.o: warning: objtool:
.altinstr_replacement+0x88: redundant UACCESS disable

Peter mentioned seeing 6 cases:
https://lkml.org/lkml/2019/6/27/118

We have 21 open reports (mostly unverified, I think someone was
testing -O3 -march=native):
https://github.com/ClangBuiltLinux/linux/issues?q=is%3Aissue+is%3Aopen+label%3Aobjtool

Taking a look specifically at arch/x86/kernel/cpu/mtrr/generic.o (as
tglx and Peter did in https://lkml.org/lkml/2019/6/27/118:

Disassembly and jump table:
https://gist.github.com/nickdesaulniers/395129e75d9f7afe028db832145594d8
More concise snippet:
https://gist.github.com/nickdesaulniers/a3159782908cb6e50e26ea9563b14c20

Of interest are the disassembled __jump_table entries; in groups of
three, there is a group for which the second element is duplicated
with a previous group.  This is bad because (as explained by Peter in
https://lkml.org/lkml/2019/6/27/118) the triples are in the form (code
location, jump target, pointer to key).  Duplicate or repeated jump
targets are unexpected, and will lead to incorrect control flow after
patching such code locations.  Also, the jump target should be 0x7
bytes ahead of the location, IIUC.

I've annotated the concise snippet:
https://gist.github.com/nickdesaulniers/c0cb8fc67720a5abbadf42fde5440a8e
we can see that two groups in the __jump_table share a similar jump
target, unexpectedly.

Now that we can identify what's bad (duplicate __jump_table entries),
we can try to creduce the input.
(https://nickdesaulniers.github.io/blog/2019/01/18/finding-compiler-bugs-with-c-reduce/)

The preprocessed input is 37497 lines long.  We want to reduce it
down, but I haven't been able to write a correct "interestingness
test" for creduce where I was then able to make sense of the output.
Instead, I'm looking into program slicing:
https://en.wikipedia.org/wiki/Program_slicing.  We'd want to do this
for C code, and it should be possible to implement with libclang, but
since it doesn't exist yet AFAICT and llvm-extract does, we can do the
program slicing on the LLVM IR level.  This will also give us a test
case to add to the compiler's test suite.

$ clang <same c flags kernel uses via `make V=`> \
  -emit-llvm -S -Xclang -disable-llvm- passes -o get_fixed_ranges generic.ll
$ llvm-extract -func get_fixed_ranges generic.ll -o
generic.get_fixed_ranges.ll -S --recursive

is a more manageable 541 lines of LLVM IR.
https://gist.github.com/nickdesaulniers/73aef95e9735a50765f654ebdcbb99dd

`callbr` is the LLVM IR instruction representing `asm goto`.
https://llvm.org/docs/LangRef.html#callbr-instruction
The gist of the instruction is:
callbr ... <list of block addresses from the label list> to
<fallthrough basic block [<list of block addresses from the label
list>].  IIUC, the repeated list of addresses better match.

We can then observe the middle end work its optimizations:
$ opt -O2 -print-before-all -print-after-all
generic.get_fixed_ranges.ll -S -o - 2>&1 | less
(27199 lines of output, omitted)
Here's the relevant snippet of the 27199 lines:
https://gist.github.com/nickdesaulniers/089bf56165675065bca830cd7da767d8

grepping through the output and looking for the first pass which
contains non-matching list of block addresses in the label list, we
come upon:

```
callbr void asm sideeffect "1:.byte
0x0f,0x1f,0x44,0x00,0\0A\09.pushsection __jump_table,  \22aw\22 \0A\09
.balign 8 \0A\09.long 1b - ., ${2:l} - . \0A\09 .quad ${0:c} + ${1:c}
- .\0A\09.popsection \0A\09",
"i,i,X,~{dirflag},~{fpsr},~{flags}"(%struct.static_key* getelementptr
inbounds (%struct.tracepoint, %struct.tracepoint*
@__tracepoint_read_msr, i64 0, i32 1), i1 false, i8*
blockaddress(@get_fixed_ranges, %20)) #6
          to label %native_read_msr.exit31.1 [label %49], !srcloc !4
```

(particularly I would have expected the `%20` and `%49` to match.)

which occurs before "MergedLoadStoreMotion," but due to what looks
like a bug in the pass printer
(https://lists.llvm.org/pipermail/cfe-dev/2019-July/062795.html)
doesn't print the results of loop unrolling, which runs just before
"MergedLoadStoreMotion."

I see that `get_fixed_ranges` is a loop that gets unrolled to have a
trip count of two.
$ opt -O2 -debug-only=loop-unroll generic.get_fixed_ranges.ll -S -o /dev/null
...
Loop Unroll: F[get_fixed_ranges] Loop %
  Loop Size = 27
  Trip Count = 2
  Trip Multiple = 2
COMPLETELY UNROLLING loop % with trip count 2!

If I manually add some print statements to loop unrolling, I can see
that before we unroll the loop, things look good. After unrolling the
loop, things look bad.  Indeed, if I kneecap loop unrolling the
problem goes away:

$ opt -O2 -debug-only=loop-unroll -unroll-full-max-count=0
generic.get_fixed_ranges.ll -S -o -

So this is a bug in our loop unroller when unrolling IR
representations of asm goto.  Peter's guess that this was a bug in
inlining was very close!  Function inlining and loop unrolling are
closely related and probably share a lot of machinery related to
cloning of basic blocks and instructions.  In this case, if we have a
loop with asm goto and we unroll the loop, the referenced labels from
the asm goto statement need to reflect the newly duplicated labels.
In this case, the new duplicated asm goto is referencing the initial
asm goto's label, which would result in non-sensical control flow at
runtime.

In the loop unroller, I can see that we skip loop unrolling in the
event of "indirectbr" instructions.
https://github.com/llvm/llvm-project/blob/97316fff5d6a7c3d2c072301a118ac0cb5094ad8/llvm/lib/Transforms/Utils/LoopUnroll.cpp#L294
https://github.com/llvm/llvm-project/blob/97316fff5d6a7c3d2c072301a118ac0cb5094ad8/llvm/lib/Analysis/LoopInfo.cpp#L432

Conservatively, we can block loop unrolling when we see asm goto in a loop:
     for (Instruction &I : *BB)

This causes objtool to not find any issues in
arch/x86/kernel/cpu/mtrr/generic.o.  I don't observe any duplication
in the __jump_table section of the resulting .o file.  It also cuts
down the objtool warnings I observe in a defconfig (listed at the
beginning of the email) from 4 to 2. (platform-quirks.o predates asm
goto, i915_gem_execbuffer.o is likely a separate bug).

Coincidentally, Alexander and Bill just found what I think is the same bug:
https://bugs.llvm.org/show_bug.cgi?id=42489

So for now we'll be conservative/correct and disable unrolling of
loops that contain asm goto; then we can use the same test case to
permit unrolling with the correct code transformations added.
-- 
Thanks,
~Nick Desaulniers

Comments

Thomas Gleixner July 2, 2019, 9:58 p.m. UTC | #1
Nick,

On Tue, 2 Jul 2019, Nick Desaulniers wrote:

> TL;DR

> LLVM currently has a bug when unrolling loops containing asm goto and

> we have a fix in hand.


<snip>

> 

> Conservatively, we can block loop unrolling when we see asm goto in a loop:


Makes sense in order to make progress.

> This causes objtool to not find any issues in

> arch/x86/kernel/cpu/mtrr/generic.o.  I don't observe any duplication

> in the __jump_table section of the resulting .o file.  It also cuts

> down the objtool warnings I observe in a defconfig (listed at the

> beginning of the email) from 4 to 2. (platform-quirks.o predates asm

> goto,


It does not have asm goto inside :)

> i915_gem_execbuffer.o is likely a separate bug).


platform-quirks.o:

        if (x86_platform.set_legacy_features)
  74:   4c 8b 1d 00 00 00 00    mov    0x0(%rip),%r11        # 7b <x86_early_init_platform_quirks+0x7b>
  7b:   4d 85 db                test   %r11,%r11
  7e:   0f 85 00 00 00 00       jne    84 <x86_early_init_platform_quirks+0x84>
                x86_platform.set_legacy_features();
}
  84:   c3                      retq   

That jne jumps to __x86_indirect_thunk_r11, aka. ratpoutine.

No idea why objtool thinks that the instruction at 0x84 is not
reachable. Josh?

Thanks,

	tglx
Peter Zijlstra July 2, 2019, 10:07 p.m. UTC | #2
Hi Nick,

That is good news; and I'll strive to read the email in more detail
in the morning when there is a better chance of me actually
understanding some of it :-)

But his here is something I felt needed clarification:

On Tue, Jul 02, 2019 at 01:53:51PM -0700, Nick Desaulniers wrote:
> Of interest are the disassembled __jump_table entries; in groups of

> three, there is a group for which the second element is duplicated

> with a previous group.  This is bad because (as explained by Peter in

> https://lkml.org/lkml/2019/6/27/118) the triples are in the form (code

> location, jump target, pointer to key).  Duplicate or repeated jump

> targets are unexpected, and will lead to incorrect control flow after

> patching such code locations.


> Also, the jump target should be 0x7 bytes ahead of the location, IIUC.


Even if you mean 'at least' I'm fairly sure this is not correct. The
instruction at the 'code location' is either a jmp.d32 or a nop5 (both 5
bytes). The target must (obviously) be at an instruction boundary, but
really can be anywhere (it is compiler generated after all).
Nick Desaulniers July 2, 2019, 10:09 p.m. UTC | #3
On Tue, Jul 2, 2019 at 2:58 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> On Tue, 2 Jul 2019, Nick Desaulniers wrote:

> > TL;DR

> > LLVM currently has a bug when unrolling loops containing asm goto and

> > we have a fix in hand.

> >

> > Conservatively, we can block loop unrolling when we see asm goto in a loop:

>

> Makes sense in order to make progress.


I have the conservative fix posted for review: https://reviews.llvm.org/D64101.
I've modified the test case there for what it should look like when we
can and do inline it:
https://gist.github.com/nickdesaulniers/7216f6e5a17c7064285190440cb88f1d
I feel like I'm pretty close to just fixing the block duplication in
unrolling outright and can probably have a working fix in the next day
or two.

> > This causes objtool to not find any issues in

> > arch/x86/kernel/cpu/mtrr/generic.o.  I don't observe any duplication

> > in the __jump_table section of the resulting .o file.  It also cuts

> > down the objtool warnings I observe in a defconfig (listed at the

> > beginning of the email) from 4 to 2. (platform-quirks.o predates asm

> > goto,

>

> It does not have asm goto inside :)


I think you're conflating arch/x86/kernel/cpu/mtrr/generic.o with
arch/x86/kernel/platform-quirks.o.

> platform-quirks.o:

>

>         if (x86_platform.set_legacy_features)

>   74:   4c 8b 1d 00 00 00 00    mov    0x0(%rip),%r11        # 7b <x86_early_init_platform_quirks+0x7b>

>   7b:   4d 85 db                test   %r11,%r11

>   7e:   0f 85 00 00 00 00       jne    84 <x86_early_init_platform_quirks+0x84>

>                 x86_platform.set_legacy_features();

> }

>   84:   c3                      retq

>

> That jne jumps to __x86_indirect_thunk_r11, aka. ratpoutine.


As an American with 100% French Canadian ancestry, I approve of the
term ratpoutine over retpoline.

> No idea why objtool thinks that the instruction at 0x84 is not

> reachable. Josh?



-- 
Thanks,
~Nick Desaulniers
Thomas Gleixner July 2, 2019, 10:20 p.m. UTC | #4
Nick,

On Tue, 2 Jul 2019, Nick Desaulniers wrote:
> On Tue, Jul 2, 2019 at 2:58 PM Thomas Gleixner <tglx@linutronix.de> wrote:

> > On Tue, 2 Jul 2019, Nick Desaulniers wrote:

> 

> > > This causes objtool to not find any issues in

> > > arch/x86/kernel/cpu/mtrr/generic.o.  I don't observe any duplication

> > > in the __jump_table section of the resulting .o file.  It also cuts

> > > down the objtool warnings I observe in a defconfig (listed at the

> > > beginning of the email) from 4 to 2. (platform-quirks.o predates asm

> > > goto,

> >

> > It does not have asm goto inside :)

> 

> I think you're conflating arch/x86/kernel/cpu/mtrr/generic.o with

> arch/x86/kernel/platform-quirks.o.


Nope. I deliberately split the quote after the platform-quirks part so the
reply goes near to it. Seems it wasn't as obvious as I thought :)

Thanks,

	tglx
Josh Poimboeuf July 6, 2019, 3:50 p.m. UTC | #5
On Tue, Jul 02, 2019 at 11:58:27PM +0200, Thomas Gleixner wrote:
> platform-quirks.o:

> 

>         if (x86_platform.set_legacy_features)

>   74:   4c 8b 1d 00 00 00 00    mov    0x0(%rip),%r11        # 7b <x86_early_init_platform_quirks+0x7b>

>   7b:   4d 85 db                test   %r11,%r11

>   7e:   0f 85 00 00 00 00       jne    84 <x86_early_init_platform_quirks+0x84>

>                 x86_platform.set_legacy_features();

> }

>   84:   c3                      retq   

> 

> That jne jumps to __x86_indirect_thunk_r11, aka. ratpoutine.

> 

> No idea why objtool thinks that the instruction at 0x84 is not

> reachable. Josh?


That's a conditional tail call, which is something GCC never does.
Objtool doesn't understand that, so we'll need to fix it.

-- 
Josh
Josh Poimboeuf July 10, 2019, 11:22 p.m. UTC | #6
On Sat, Jul 06, 2019 at 10:50:01AM -0500, Josh Poimboeuf wrote:
> On Tue, Jul 02, 2019 at 11:58:27PM +0200, Thomas Gleixner wrote:

> > platform-quirks.o:

> > 

> >         if (x86_platform.set_legacy_features)

> >   74:   4c 8b 1d 00 00 00 00    mov    0x0(%rip),%r11        # 7b <x86_early_init_platform_quirks+0x7b>

> >   7b:   4d 85 db                test   %r11,%r11

> >   7e:   0f 85 00 00 00 00       jne    84 <x86_early_init_platform_quirks+0x84>

> >                 x86_platform.set_legacy_features();

> > }

> >   84:   c3                      retq   

> > 

> > That jne jumps to __x86_indirect_thunk_r11, aka. ratpoutine.

> > 

> > No idea why objtool thinks that the instruction at 0x84 is not

> > reachable. Josh?

> 

> That's a conditional tail call, which is something GCC never does.

> Objtool doesn't understand that, so we'll need to fix it.


Can somebody test this patch to see if it fixes the platform-quirks.o
warning?

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 27818a93f0b1..42fe09a93593 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -97,6 +97,36 @@ static struct instruction *next_insn_same_func(struct objtool_file *file,
 	for (insn = next_insn_same_sec(file, insn); insn;		\
 	     insn = next_insn_same_sec(file, insn))
 
+static bool is_sibling_call(struct instruction *insn)
+{
+	struct symbol *insn_func, *dest_func;
+
+	/* Only C code does sibling calls. */
+	if (!insn->func)
+		return false;
+
+	/* An indirect jump is either a sibling call or a jump to a table. */
+	if (insn->type == INSN_JUMP_DYNAMIC)
+		return list_empty(&insn->alts);
+
+	if (insn->type != INSN_JUMP_CONDITIONAL &&
+	    insn->type != INSN_JUMP_UNCONDITIONAL)
+		return false;
+
+	/* A direct jump to outside the .o file is a sibling call. */
+	if (!insn->jump_dest)
+		return true;
+
+	if (!insn->jump_dest->func)
+		return false;
+
+	/* A direct jump to the beginning of a function is a sibling call. */
+	insn_func = insn->func->pfunc;
+	dest_func = insn->jump_dest->func->pfunc;
+	return insn_func != dest_func &&
+	       insn->jump_dest->offset == dest_func->offset;
+}
+
 /*
  * This checks to see if the given function is a "noreturn" function.
  *
@@ -169,34 +199,25 @@ static int __dead_end_function(struct objtool_file *file, struct symbol *func,
 	 * of the sibling call returns.
 	 */
 	func_for_each_insn_all(file, func, insn) {
-		if (insn->type == INSN_JUMP_UNCONDITIONAL) {
+		if (is_sibling_call(insn)) {
 			struct instruction *dest = insn->jump_dest;
 
 			if (!dest)
 				/* sibling call to another file */
 				return 0;
 
-			if (dest->func && dest->func->pfunc != insn->func->pfunc) {
-
-				/* local sibling call */
-				if (recursion == 5) {
-					/*
-					 * Infinite recursion: two functions
-					 * have sibling calls to each other.
-					 * This is a very rare case.  It means
-					 * they aren't dead ends.
-					 */
-					return 0;
-				}
-
-				return __dead_end_function(file, dest->func,
-							   recursion + 1);
+			/* local sibling call */
+			if (recursion == 5) {
+				/*
+				 * Infinite recursion: two functions have
+				 * sibling calls to each other.  This is a very
+				 * rare case.  It means they aren't dead ends.
+				 */
+				return 0;
 			}
-		}
 
-		if (insn->type == INSN_JUMP_DYNAMIC && list_empty(&insn->alts))
-			/* sibling call */
-			return 0;
+			return __dead_end_function(file, dest->func, recursion+1);
+		}
 	}
 
 	return 1;
@@ -635,7 +656,7 @@ static int add_jump_destinations(struct objtool_file *file)
 			} else if (insn->jump_dest->func->pfunc != insn->func->pfunc &&
 				   insn->jump_dest->offset == insn->jump_dest->func->offset) {
 
-				/* sibling class */
+				/* sibling calls */
 				insn->call_dest = insn->jump_dest->func;
 				insn->jump_dest = NULL;
 			}
@@ -2100,7 +2121,7 @@ static int validate_branch(struct objtool_file *file, struct instruction *first,
 
 		case INSN_JUMP_CONDITIONAL:
 		case INSN_JUMP_UNCONDITIONAL:
-			if (func && !insn->jump_dest) {
+			if (is_sibling_call(insn)) {
 				ret = validate_sibling_call(insn, &state);
 				if (ret)
 					return ret;
@@ -2123,7 +2144,7 @@ static int validate_branch(struct objtool_file *file, struct instruction *first,
 			break;
 
 		case INSN_JUMP_DYNAMIC:
-			if (func && list_empty(&insn->alts)) {
+			if (is_sibling_call(insn)) {
 				ret = validate_sibling_call(insn, &state);
 				if (ret)
 					return ret;
Nick Desaulniers July 10, 2019, 11:42 p.m. UTC | #7
On Wed, Jul 10, 2019 at 4:22 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>

> On Sat, Jul 06, 2019 at 10:50:01AM -0500, Josh Poimboeuf wrote:

> > On Tue, Jul 02, 2019 at 11:58:27PM +0200, Thomas Gleixner wrote:

> > > platform-quirks.o:

> > >

> > >         if (x86_platform.set_legacy_features)

> > >   74:   4c 8b 1d 00 00 00 00    mov    0x0(%rip),%r11        # 7b <x86_early_init_platform_quirks+0x7b>

> > >   7b:   4d 85 db                test   %r11,%r11

> > >   7e:   0f 85 00 00 00 00       jne    84 <x86_early_init_platform_quirks+0x84>

> > >                 x86_platform.set_legacy_features();

> > > }

> > >   84:   c3                      retq

> > >

> > > That jne jumps to __x86_indirect_thunk_r11, aka. ratpoutine.

> > >

> > > No idea why objtool thinks that the instruction at 0x84 is not

> > > reachable. Josh?

> >

> > That's a conditional tail call, which is something GCC never does.

> > Objtool doesn't understand that, so we'll need to fix it.

>

> Can somebody test this patch to see if it fixes the platform-quirks.o

> warning?


$ make CC=clang -j71 2>&1 | grep platform-quirks
  CC      arch/x86/kernel/platform-quirks.o
arch/x86/kernel/platform-quirks.o: warning: objtool:
x86_early_init_platform_quirks()+0x84: unreachable instruction
$ git am /tmp/objtool.patch
$ make CC=clang -j71 clean
$ make CC=clang -j71 2>&1 | grep platform-quirks
  CC      arch/x86/kernel/platform-quirks.o
arch/x86/kernel/platform-quirks.o: warning: objtool:
x86_early_init_platform_quirks()+0x84: unreachable instruction

:(

$ llvm-objdump -dr arch/x86/kernel/platform-quirks.o

arch/x86/kernel/platform-quirks.o: file format ELF64-x86-64


Disassembly of section .init.text:

0000000000000000 x86_early_init_platform_quirks:
       0: 48 b8 02 00 00 00 01 00 00 00 movabsq $4294967298, %rax
       a: 48 89 05 00 00 00 00          movq %rax, (%rip)
000000000000000d:  R_X86_64_PC32 x86_platform+84
      11: c7 05 00 00 00 00 01 00 00 00 movl $1, (%rip)
0000000000000013:  R_X86_64_PC32 x86_platform+88
      1b: 48 b8 00 00 00 00 01 00 00 00 movabsq $4294967296, %rax
      25: 48 89 05 00 00 00 00          movq %rax, (%rip)
0000000000000028:  R_X86_64_PC32 x86_platform+100
      2c: 8b 05 00 00 00 00            movl (%rip), %eax
000000000000002e:  R_X86_64_PC32 boot_params+568
      32: 8d 48 fd                      leal -3(%rax), %ecx
      35: 83 f9 02                      cmpl $2, %ecx
      38: 72 15                        jb 21
<x86_early_init_platform_quirks+0x4f>
      3a: 83 f8 02                      cmpl $2, %eax
      3d: 74 27                        je 39
<x86_early_init_platform_quirks+0x66>
      3f: 85 c0                        testl %eax, %eax
      41: 75 31                        jne 49
<x86_early_init_platform_quirks+0x74>
      43: c7 05 00 00 00 00 01 00 00 00 movl $1, (%rip)
0000000000000045:  R_X86_64_PC32 x86_platform+96
      4d: eb 25                        jmp 37
<x86_early_init_platform_quirks+0x74>
      4f: c7 05 00 00 00 00 00 00 00 00 movl $0, (%rip)
0000000000000051:  R_X86_64_PC32 x86_platform+100
      59: 48 c7 05 00 00 00 00 00 00 00 00      movq $0, (%rip)
000000000000005c:  R_X86_64_PC32 x86_platform+80
      64: eb 0e                        jmp 14
<x86_early_init_platform_quirks+0x74>
      66: 31 c0                        xorl %eax, %eax
      68: 89 05 00 00 00 00            movl %eax, (%rip)
000000000000006a:  R_X86_64_PC32 x86_platform+104
      6e: 89 05 00 00 00 00            movl %eax, (%rip)
0000000000000070:  R_X86_64_PC32 x86_platform+88
      74: 4c 8b 1d 00 00 00 00          movq (%rip), %r11
0000000000000077:  R_X86_64_PC32 x86_platform+108
      7b: 4d 85 db                      testq %r11, %r11
      7e: 0f 85 00 00 00 00            jne 0
<x86_early_init_platform_quirks+0x84>
0000000000000080:  R_X86_64_PC32 __x86_indirect_thunk_r11-4
      84: c3                            retq

I've sent you the .o file off thread as well.  Thanks for taking a
look into this. :D
-- 
Thanks,
~Nick Desaulniers
Josh Poimboeuf July 11, 2019, 12:04 a.m. UTC | #8
On Wed, Jul 10, 2019 at 04:42:43PM -0700, Nick Desaulniers wrote:
>       7e: 0f 85 00 00 00 00            jne 0

> <x86_early_init_platform_quirks+0x84>

> 0000000000000080:  R_X86_64_PC32 __x86_indirect_thunk_r11-4

>       84: c3                            retq

> 

> I've sent you the .o file off thread as well.  Thanks for taking a

> look into this. :D


Ah, right.  I forgot it was a retpoline jump.  I had a patch for that as
well, will test with the .o and post tomorrow-ish.

-- 
Josh
diff mbox series

Patch

diff --git a/llvm/lib/Analysis/LoopInfo.cpp b/llvm/lib/Analysis/LoopInfo.cpp
index 00dbe30c2b3d..f800c890d1db 100644
--- a/llvm/lib/Analysis/LoopInfo.cpp
+++ b/llvm/lib/Analysis/LoopInfo.cpp
@@ -433,7 +433,8 @@  bool Loop::isSafeToClone() const {
   // Return false if any loop blocks contain indirectbrs, or there
are any calls
   // to noduplicate functions.
   for (BasicBlock *BB : this->blocks()) {
-    if (isa<IndirectBrInst>(BB->getTerminator()))
+    if (isa<IndirectBrInst>(BB->getTerminator()) ||
+        isa<CallBrInst>(BB->getTerminator()))
       return false;