[GOLD] Support --icf=safe with -pie for x86_64

Message ID CAJRD=oqcd2y03pjosB6ifygwGv1wO0VgPFFqvTiSOvFhaqisJA@mail.gmail.com
State New
Headers show

Commit Message

Rahul Chaudhry via binutils Jan. 12, 2017, 9:28 p.m.
--icf=safe folds identical functions only if the functions do not have their
address taken. It uses relocation types to distinguish between a function call
and taking the address of a function. This works fine for all architectures
except x86_64, where the same relocation type (R_X86_64_PC32) may be used for
both function call and taking the address of a function (when compiled with
-fPIE). Currently Target_x86_64::do_can_check_for_function_pointers returns
false if the linker is invoked with -pie, and --icf=safe folds only ctors and
dtors.

This patch adds improved support for --icf=safe when used with -pie. For the
ambiguous case of R_X86_64_PC32 relocation, it checks the instruction opcode to
distinguish between function calls and taking the address of a function.

OK?

        * x86_64.cc (Target_x86_64::do_can_check_for_function_pointers):
        Return true even when building pie binaries.
        (Target_x86_64::possible_function_pointer_reloc): Check opcode
        for R_X86_64_PC32 relocations.
        (Target_x86_64::local_reloc_may_be_function_pointer): Pass
        extra arguments to local_reloc_may_be_function_pointer.
        (Target_x86_64::global_reloc_may_be_function_pointer): Likewise.
        * testsuite/Makefile.am (icf_safe_pie_test): New test case.
        * testsuite/Makefile.in: Regenerate.
        * testsuite/icf_safe_pie_test.sh: New shell script.

-- 
Rahul

Comments

Alan Modra Jan. 13, 2017, 1:23 a.m. | #1
On Thu, Jan 12, 2017 at 01:28:49PM -0800, Rahul Chaudhry via binutils wrote:
> +    case elfcpp::R_X86_64_PC32:

> +      {

> +        // This relocation may be used both for function calls and

> +        // for taking address of a function. We distinguish between

> +        // them by checking the opcodes.

> +        section_size_type stype;

> +        const unsigned char* view = src_obj->section_contents(src_indx,

> +                                                              &stype,

> +                                                              true);

> +

> +        // call

> +        if (r_offset >= 1

> +            && view[r_offset - 1] == 0xe8)

> +          return false;


Is it safe to assume that 0xe8 is really the start of an instruction?

What if instead you are looking at a modrm or sib for a rip-relative read?
It may not match in this case (I'm rusty at x86 and would have to look
at an AMD or Intel manual to know) but your should check this and of
course for the other encodings below.

Also, might you have an R_X86_64_PC32 in data and so be looking at the
high byte of the previous word?

> +

> +        // jmp

> +        if (r_offset >= 1

> +            && view[r_offset - 1] == 0xe9)

> +          return false;

> +

> +        // jo/jno/jb/jnb/je/jne/jna/ja/js/jns/jp/jnp/jl/jge/jle/jg

> +        if (r_offset >= 2

> +            && view[r_offset - 2] == 0x0f

> +            && view[r_offset - 1] >= 0x80

> +            && view[r_offset - 1] <= 0x8f)

> +          return false;

> +

> +        // Be conservative and treat all others as function pointers.

> +        return true;

> +      }

>      }

>    return false;

>  }


-- 
Alan Modra
Australia Development Lab, IBM
Rahul Chaudhry via binutils Jan. 13, 2017, 8:10 p.m. | #2
> Is it safe to assume that 0xe8 is really the start of an instruction?

>

> What if instead you are looking at a modrm or sib for a rip-relative read?

> It may not match in this case (I'm rusty at x86 and would have to look

> at an AMD or Intel manual to know) but your should check this and of

> course for the other encodings below.


You're right, this may result in false positives with some instance getting
classified as a function call when it's not, which in turn might result in some
section getting folded when it shouldn't be.

Ideally, we'd want a way to go from a relocation offset to the beginning of the
instruction containing the relocation. Is there a good way to do that for
x86_64 (short of disassembling the whole section)?

The approach here is inspired by other places in the same file that seem to be
doing something similar (can_convert_mov_to_lea/can_convert_callq_to_direct).
Maybe I'm missing something and there's more context in those other functions
that makes it safe to do so.


> Also, might you have an R_X86_64_PC32 in data and so be looking at the

> high byte of the previous word?


Not sure what you mean here, but this code is only called for text sections, so
we can be sure that the relocation offset is part of an instruction. Did you
mean an instruction with an encoding containning opcode bytes, followed by
immediate operand, which is followed by a R_X86_64_PC32 relocation?

-- 
Rahul


On Thu, Jan 12, 2017 at 5:23 PM, Alan Modra <amodra@gmail.com> wrote:
> On Thu, Jan 12, 2017 at 01:28:49PM -0800, Rahul Chaudhry via binutils wrote:

>> +    case elfcpp::R_X86_64_PC32:

>> +      {

>> +        // This relocation may be used both for function calls and

>> +        // for taking address of a function. We distinguish between

>> +        // them by checking the opcodes.

>> +        section_size_type stype;

>> +        const unsigned char* view = src_obj->section_contents(src_indx,

>> +                                                              &stype,

>> +                                                              true);

>> +

>> +        // call

>> +        if (r_offset >= 1

>> +            && view[r_offset - 1] == 0xe8)

>> +          return false;

>

> Is it safe to assume that 0xe8 is really the start of an instruction?

>

> What if instead you are looking at a modrm or sib for a rip-relative read?

> It may not match in this case (I'm rusty at x86 and would have to look

> at an AMD or Intel manual to know) but your should check this and of

> course for the other encodings below.

>

> Also, might you have an R_X86_64_PC32 in data and so be looking at the

> high byte of the previous word?

>

>> +

>> +        // jmp

>> +        if (r_offset >= 1

>> +            && view[r_offset - 1] == 0xe9)

>> +          return false;

>> +

>> +        // jo/jno/jb/jnb/je/jne/jna/ja/js/jns/jp/jnp/jl/jge/jle/jg

>> +        if (r_offset >= 2

>> +            && view[r_offset - 2] == 0x0f

>> +            && view[r_offset - 1] >= 0x80

>> +            && view[r_offset - 1] <= 0x8f)

>> +          return false;

>> +

>> +        // Be conservative and treat all others as function pointers.

>> +        return true;

>> +      }

>>      }

>>    return false;

>>  }

>

> --

> Alan Modra

> Australia Development Lab, IBM
Alan Modra Jan. 13, 2017, 10:23 p.m. | #3
On Fri, Jan 13, 2017 at 12:10:55PM -0800, Rahul Chaudhry wrote:
> > Also, might you have an R_X86_64_PC32 in data and so be looking at the

> > high byte of the previous word?

> 

> Not sure what you mean here, but this code is only called for text sections, so


OK, if you know it is text then you're a lot safer, but even so..

> we can be sure that the relocation offset is part of an instruction. Did you

> mean an instruction with an encoding containning opcode bytes, followed by

> immediate operand, which is followed by a R_X86_64_PC32 relocation?


.. what if you have a jump table of 32-bit relative offsets in text?
I think such a table would use R_X86_64_PC32 relocs if the
destinations were in other sections or other files.  If your new code
happens to look at one of those relocs then one byte before r_offset
is not part of an instruction using an R_X86_64_PC32 reloc.

-- 
Alan Modra
Australia Development Lab, IBM
Rahul Chaudhry via binutils Jan. 14, 2017, 12:28 a.m. | #4
On Fri, Jan 13, 2017 at 2:23 PM, Alan Modra <amodra@gmail.com> wrote:
> On Fri, Jan 13, 2017 at 12:10:55PM -0800, Rahul Chaudhry wrote:

>> > Also, might you have an R_X86_64_PC32 in data and so be looking at the

>> > high byte of the previous word?

>>

>> Not sure what you mean here, but this code is only called for text sections, so

>

> OK, if you know it is text then you're a lot safer, but even so..

>

>> we can be sure that the relocation offset is part of an instruction. Did you

>> mean an instruction with an encoding containning opcode bytes, followed by

>> immediate operand, which is followed by a R_X86_64_PC32 relocation?

>

> .. what if you have a jump table of 32-bit relative offsets in text?

> I think such a table would use R_X86_64_PC32 relocs if the

> destinations were in other sections or other files.  If your new code

> happens to look at one of those relocs then one byte before r_offset

> is not part of an instruction using an R_X86_64_PC32 reloc.


For a jump table containing relocations, it should be fine to treat all of them
as function calls/jumps, right?

If anything, the check for opcode 1-2 bytes before r_offset is likely to fail
and classify the relocation as address taken instead of a function call. This
is the reverse problem of above, it may result in a function section not
getting folded, that could have been folded safely.

-- 
Rahul
Alan Modra Jan. 16, 2017, 3:01 a.m. | #5
On Fri, Jan 13, 2017 at 04:28:36PM -0800, Rahul Chaudhry wrote:
> On Fri, Jan 13, 2017 at 2:23 PM, Alan Modra <amodra@gmail.com> wrote:

> > On Fri, Jan 13, 2017 at 12:10:55PM -0800, Rahul Chaudhry wrote:

> >> > Also, might you have an R_X86_64_PC32 in data and so be looking at the

> >> > high byte of the previous word?

> >>

> >> Not sure what you mean here, but this code is only called for text sections, so

> >

> > OK, if you know it is text then you're a lot safer, but even so..

> >

> >> we can be sure that the relocation offset is part of an instruction. Did you

> >> mean an instruction with an encoding containning opcode bytes, followed by

> >> immediate operand, which is followed by a R_X86_64_PC32 relocation?

> >

> > .. what if you have a jump table of 32-bit relative offsets in text?

> > I think such a table would use R_X86_64_PC32 relocs if the

> > destinations were in other sections or other files.  If your new code

> > happens to look at one of those relocs then one byte before r_offset

> > is not part of an instruction using an R_X86_64_PC32 reloc.

> 

> For a jump table containing relocations, it should be fine to treat all of them

> as function calls/jumps, right?


I don't know.  I could look at gold source and figure out the answer,
but it's really your job to convince Cary and others that what you're
doing is safe.  ie. Prove to some level of confidence that comparing
the byte before r_offset of an R_X86_64_PC32 against 0xe8 really is a
valid test for a call.  Note that I'm *not* saying your patch was
wrong.  It may in fact be the case that all insns that could use an
R_X86_64_PC32 besides a call will have modrm or sib bytes different
to 0xe8.  Ditto for the other opcodes you test.  It might also be the
case that gcc won't put jump tables using R_X86_64_PC32 in code
sections.  (What crazy assembly programmers do is another story.)

x86 code editing/analysis is difficult.  If I had to do it, I think
I'd define a few more relocs for use in code so that you could find
the start of an instruction.  You could even do that without bloating
objects with extra relocs.  eg. instead of R_X86_64_32 in code you'd
use a series of relocs that all behave like R_X86_64_32 for relocation
but also say something about the insn opcode:
  R_X86_64_32_I1 insn opcode one byte before r_offset
  R_X86_64_32_I2 insn opcode two bytes before r_offset
  etc.

-- 
Alan Modra
Australia Development Lab, IBM
Cary Coutant Jan. 18, 2017, 6:55 p.m. | #6
>> Also, might you have an R_X86_64_PC32 in data and so be looking at the

>> high byte of the previous word?

>

> Not sure what you mean here, but this code is only called for text sections, so

> we can be sure that the relocation offset is part of an instruction. Did you

> mean an instruction with an encoding containning opcode bytes, followed by

> immediate operand, which is followed by a R_X86_64_PC32 relocation?


I looked (maybe not carefully enough), but I didn't find anything in
this call chain that restricts the check to text sections:

gc_process_relocs
 -> scan.[global|local]_reloc_may_be_function_pointer
 -> possible_function_pointer_reloc

You need to check the SHF_EXECINST flag before assuming you're looking
at an opcode. (Look in x86_64.cc for "is_executable".)


>> For a jump table containing relocations, it should be fine to treat all of them

>> as function calls/jumps, right?

>

> I don't know.  I could look at gold source and figure out the answer,

> but it's really your job to convince Cary and others that what you're

> doing is safe.  ie. Prove to some level of confidence that comparing

> the byte before r_offset of an R_X86_64_PC32 against 0xe8 really is a

> valid test for a call.  Note that I'm *not* saying your patch was

> wrong.  It may in fact be the case that all insns that could use an

> R_X86_64_PC32 besides a call will have modrm or sib bytes different

> to 0xe8.  Ditto for the other opcodes you test.  It might also be the

> case that gcc won't put jump tables using R_X86_64_PC32 in code

> sections.  (What crazy assembly programmers do is another story.)


I think you should also be checking that the target symbol is
STT_FUNC; that should rule out most jump table cases. (I see a check
for != STT_OBJECT in the local symbol path, but nothing in the global
symbol path. It may be the case that STT_NOTYPE is used for some
extern function symbols, but you want to be conservative, right?)


> x86 code editing/analysis is difficult.  If I had to do it, I think

> I'd define a few more relocs for use in code so that you could find

> the start of an instruction.  You could even do that without bloating

> objects with extra relocs.  eg. instead of R_X86_64_32 in code you'd

> use a series of relocs that all behave like R_X86_64_32 for relocation

> but also say something about the insn opcode:

>   R_X86_64_32_I1 insn opcode one byte before r_offset

>   R_X86_64_32_I2 insn opcode two bytes before r_offset

>   etc.


Actually, I'd much prefer separate reloc types that just tells the
linker whether it's a call or a materialization of a function pointer.
In an ideal world, the linker should never have to load section
contents until it's applying relocations in pass 2. I really dislike
looking at opcodes. Everything we need to know in pass 1 should be
available by scanning the relocations.

-cary
Rahul Chaudhry via binutils Jan. 21, 2017, 12:53 a.m. | #7
Hi Alan and Cary,

Thanks for your feedback. Some comments below.

> I looked (maybe not carefully enough), but I didn't find anything in

> this call chain that restricts the check to text sections:

>

> gc_process_relocs

>  -> scan.[global|local]_reloc_may_be_function_pointer

>  -> possible_function_pointer_reloc

>

> You need to check the SHF_EXECINST flag before assuming you're looking

> at an opcode. (Look in x86_64.cc for "is_executable".)


I'll take a closer look at this, add a check, and send an updated patch.


> I think you should also be checking that the target symbol is

> STT_FUNC; that should rule out most jump table cases. (I see a check

> for != STT_OBJECT in the local symbol path, but nothing in the global

> symbol path. It may be the case that STT_NOTYPE is used for some

> extern function symbols, but you want to be conservative, right?)


Ditto for this one.


> > x86 code editing/analysis is difficult.  If I had to do it, I think

> > I'd define a few more relocs for use in code so that you could find

> > the start of an instruction.  You could even do that without bloating

> > objects with extra relocs.  eg. instead of R_X86_64_32 in code you'd

> > use a series of relocs that all behave like R_X86_64_32 for relocation

> > but also say something about the insn opcode:

> >   R_X86_64_32_I1 insn opcode one byte before r_offset

> >   R_X86_64_32_I2 insn opcode two bytes before r_offset

> >   etc.

>

> Actually, I'd much prefer separate reloc types that just tells the

> linker whether it's a call or a materialization of a function pointer.

> In an ideal world, the linker should never have to load section

> contents until it's applying relocations in pass 2. I really dislike

> looking at opcodes. Everything we need to know in pass 1 should be

> available by scanning the relocations.


I agree completely. The need for this patch (and any related heuristics) comes
from the fact that the same relocation type is being used for both function
calls and taking the address of the function. The compiler already knows the
difference, and an ideal solution would be to simply use different relocation
types for different use cases.

What would it take to reach that solution? How involved is it to add a new
relocation type for something like this? I would guess that some coordination
is needed with gcc (and other compilers) for this to work. On the binutils
side, a new relocation type would need support in gas/objdump/ld/gold and maybe
other places as well. Can you point me to a recent example for adding a similar
new relocation type so I can get an idea of what it entails?

Thanks,
Rahul
Cary Coutant Jan. 21, 2017, 2 a.m. | #8
>> Actually, I'd much prefer separate reloc types that just tells the

>> linker whether it's a call or a materialization of a function pointer.

>> In an ideal world, the linker should never have to load section

>> contents until it's applying relocations in pass 2. I really dislike

>> looking at opcodes. Everything we need to know in pass 1 should be

>> available by scanning the relocations.

>

> I agree completely. The need for this patch (and any related heuristics) comes

> from the fact that the same relocation type is being used for both function

> calls and taking the address of the function. The compiler already knows the

> difference, and an ideal solution would be to simply use different relocation

> types for different use cases.

>

> What would it take to reach that solution? How involved is it to add a new

> relocation type for something like this? I would guess that some coordination

> is needed with gcc (and other compilers) for this to work. On the binutils

> side, a new relocation type would need support in gas/objdump/ld/gold and maybe

> other places as well. Can you point me to a recent example for adding a similar

> new relocation type so I can get an idea of what it entails?


The x86 psABI is pretty well established, and changing it involves
overcoming a lot of inertia. It's not impossible, though -- there have
been some recent changes, and a few new relocations have been added.
I'd suggest starting with a proposal on the psABI mailing list at
x86-64-abi@googlegroups.com. If you can recruit enough support, it's
possible that others would contribute the GCC and Gnu ld changes.

-cary
Rahul Chaudhry via binutils Feb. 3, 2017, 8:17 p.m. | #9
>> I looked (maybe not carefully enough), but I didn't find anything in

>> this call chain that restricts the check to text sections:

>>

>> gc_process_relocs

>>  -> scan.[global|local]_reloc_may_be_function_pointer

>>  -> possible_function_pointer_reloc

>>

>> You need to check the SHF_EXECINST flag before assuming you're looking

>> at an opcode. (Look in x86_64.cc for "is_executable".)

>

> I'll take a closer look at this, add a check, and send an updated patch.

>

>

>> I think you should also be checking that the target symbol is

>> STT_FUNC; that should rule out most jump table cases. (I see a check

>> for != STT_OBJECT in the local symbol path, but nothing in the global

>> symbol path. It may be the case that STT_NOTYPE is used for some

>> extern function symbols, but you want to be conservative, right?)

>

> Ditto for this one.


Please see the updated patch below. There are two updates:

* Added a check for SHF_EXECINSTR flag before checking the call/jump opcodes.

* Added a check for STT_FUNC in the global symbol path.


Also, to add some background behind this patch, we're linking a large C++
binary (chrome-browser) with -pie.

* With --icf=none, the binary size is 177M.

* With --icf=safe, before this patch, the binary size is 176M.
ICF is able to fold 16730 sections (only ctors and dtors, since
Target_x86_64::do_can_check_for_function_pointers returns
false when using -pie).

* After applying this patch, with --icf=safe, the binary size is 169M.
ICF is able to fold 55623 sections.


        * x86_64.cc (Target_x86_64::do_can_check_for_function_pointers):
        Return true even when building pie binaries.
        (Target_x86_64::possible_function_pointer_reloc): Check opcode
        for R_X86_64_PC32 relocations.
        (Target_x86_64::local_reloc_may_be_function_pointer): Pass
        extra arguments to local_reloc_may_be_function_pointer.
        (Target_x86_64::global_reloc_may_be_function_pointer): Likewise.
        * gc.h (gc_process_relocs): Add check for STT_FUNC.
        * testsuite/Makefile.am (icf_safe_pie_test): New test case.
        * testsuite/Makefile.in: Regenerate.
        * testsuite/icf_safe_pie_test.sh: New shell script.


-- 
Rahuldiff --git gold/ChangeLog gold/ChangeLog
index 56fe9669db..e811b8ea6d 100644
--- gold/ChangeLog
+++ gold/ChangeLog
@@ -1,3 +1,17 @@
+2017-02-03  Rahul Chaudhry  <rahulchaudhry@google.com>
+
+	* x86_64.cc (Target_x86_64::do_can_check_for_function_pointers):
+	Return true even when building pie binaries.
+	(Target_x86_64::possible_function_pointer_reloc): Check opcode
+	for R_X86_64_PC32 relocations.
+	(Target_x86_64::local_reloc_may_be_function_pointer): Pass
+	extra arguments to local_reloc_may_be_function_pointer.
+	(Target_x86_64::global_reloc_may_be_function_pointer): Likewise.
+	* gc.h (gc_process_relocs): Add check for STT_FUNC.
+	* testsuite/Makefile.am (icf_safe_pie_test): New test case.
+	* testsuite/Makefile.in: Regenerate.
+	* testsuite/icf_safe_pie_test.sh: New shell script.
+
 2017-02-03  Alan Modra  <amodra@gmail.com>
 
 	* powerpc.cc (Powerpc_relobj::make_toc_relative): Don't crash
diff --git gold/gc.h gold/gc.h
index 45f8d2af9c..327efc25bf 100644
--- gold/gc.h
+++ gold/gc.h
@@ -295,6 +295,7 @@ gc_process_relocs(
 	  // When doing safe folding, check to see if this relocation is that
 	  // of a function pointer being taken.
 	  if (gsym->source() == Symbol::FROM_OBJECT
+              && gsym->type() == elfcpp::STT_FUNC
               && check_section_for_function_pointers
               && dst_obj != NULL
               && (!is_ordinary
diff --git gold/testsuite/Makefile.am gold/testsuite/Makefile.am
index d9480abb42..d0803d251a 100644
--- gold/testsuite/Makefile.am
+++ gold/testsuite/Makefile.am
@@ -274,6 +274,20 @@ icf_safe_test_1.stdout: icf_safe_test
 icf_safe_test_2.stdout: icf_safe_test
 	$(TEST_READELF) -h $< > $@
 
+check_SCRIPTS += icf_safe_pie_test.sh
+check_DATA += icf_safe_pie_test_1.stdout icf_safe_pie_test_2.stdout icf_safe_pie_test.map
+MOSTLYCLEANFILES += icf_safe_pie_test icf_safe_pie_test.map
+icf_safe_pie_test.o: icf_safe_test.cc
+	$(CXXCOMPILE) -O0 -c -ffunction-sections -fPIE -g -o $@ $<
+icf_safe_pie_test: icf_safe_pie_test.o gcctestdir/ld
+	$(CXXLINK) -o icf_safe_pie_test -Bgcctestdir/ -Wl,--icf=safe,-Map,icf_safe_pie_test.map icf_safe_pie_test.o -pie
+icf_safe_pie_test.map: icf_safe_pie_test
+	@touch icf_safe_pie_test.map
+icf_safe_pie_test_1.stdout: icf_safe_pie_test
+	$(TEST_NM) $< > $@
+icf_safe_pie_test_2.stdout: icf_safe_pie_test
+	$(TEST_READELF) -h $< > $@
+
 check_SCRIPTS += icf_safe_so_test.sh
 check_DATA += icf_safe_so_test_1.stdout icf_safe_so_test_2.stdout icf_safe_so_test.map
 MOSTLYCLEANFILES += icf_safe_so_test icf_safe_so_test.map
diff --git gold/testsuite/Makefile.in gold/testsuite/Makefile.in
index 2c67bf5fe2..133e733cf1 100644
--- gold/testsuite/Makefile.in
+++ gold/testsuite/Makefile.in
@@ -82,6 +82,7 @@ check_PROGRAMS = $(am__EXEEXT_1) $(am__EXEEXT_2) $(am__EXEEXT_3) \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_test.sh \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_keep_unique_test.sh \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_test.sh \
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_pie_test.sh \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_so_test.sh \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	final_layout.sh \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	text_section_grouping.sh \
@@ -103,6 +104,9 @@ check_PROGRAMS = $(am__EXEEXT_1) $(am__EXEEXT_2) $(am__EXEEXT_3) \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_test_1.stdout \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_test_2.stdout \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_test.map \
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_pie_test_1.stdout \
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_pie_test_2.stdout \
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_pie_test.map \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_so_test_1.stdout \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_so_test_2.stdout \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_so_test.map \
@@ -125,6 +129,8 @@ check_PROGRAMS = $(am__EXEEXT_1) $(am__EXEEXT_2) $(am__EXEEXT_3) \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_test icf_test.map \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_keep_unique_test \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_test icf_safe_test.map \
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_pie_test \
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_pie_test.map \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_so_test \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_so_test.map \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	final_layout \
@@ -5093,6 +5099,8 @@ icf_keep_unique_test.sh.log: icf_keep_unique_test.sh
 	@p='icf_keep_unique_test.sh'; $(am__check_pre) $(LOG_COMPILE) "$$tst" $(am__check_post)
 icf_safe_test.sh.log: icf_safe_test.sh
 	@p='icf_safe_test.sh'; $(am__check_pre) $(LOG_COMPILE) "$$tst" $(am__check_post)
+icf_safe_pie_test.sh.log: icf_safe_pie_test.sh
+	@p='icf_safe_pie_test.sh'; $(am__check_pre) $(LOG_COMPILE) "$$tst" $(am__check_post)
 icf_safe_so_test.sh.log: icf_safe_so_test.sh
 	@p='icf_safe_so_test.sh'; $(am__check_pre) $(LOG_COMPILE) "$$tst" $(am__check_post)
 final_layout.sh.log: final_layout.sh
@@ -5910,6 +5918,16 @@ uninstall-am:
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	$(TEST_NM) $< > $@
 @GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_test_2.stdout: icf_safe_test
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	$(TEST_READELF) -h $< > $@
+@GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_pie_test.o: icf_safe_test.cc
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	$(CXXCOMPILE) -O0 -c -ffunction-sections -fPIE -g -o $@ $<
+@GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_pie_test: icf_safe_pie_test.o gcctestdir/ld
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	$(CXXLINK) -o icf_safe_pie_test -Bgcctestdir/ -Wl,--icf=safe,-Map,icf_safe_pie_test.map icf_safe_pie_test.o -pie
+@GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_pie_test.map: icf_safe_pie_test
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	@touch icf_safe_pie_test.map
+@GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_pie_test_1.stdout: icf_safe_pie_test
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	$(TEST_NM) $< > $@
+@GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_pie_test_2.stdout: icf_safe_pie_test
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	$(TEST_READELF) -h $< > $@
 @GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_so_test.o: icf_safe_so_test.cc
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	$(CXXCOMPILE) -O0 -c -ffunction-sections -fPIC -g -o $@ $<
 @GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_so_test: icf_safe_so_test.o gcctestdir/ld
diff --git gold/testsuite/icf_safe_pie_test.sh gold/testsuite/icf_safe_pie_test.sh
new file mode 100755
index 0000000000..f91a80c0b9
--- /dev/null
+++ gold/testsuite/icf_safe_pie_test.sh
@@ -0,0 +1,76 @@
+#!/bin/sh
+
+# icf_safe_pie_test.sh -- test --icf=safe -pie
+
+# Copyright (C) 2009-2017 Free Software Foundation, Inc.
+# Written by Sriraman Tallam <tmsriram@google.com>.
+# Modified by Rahul Chaudhry <rahulchaudhry@google.com>.
+
+# This file is part of gold.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston,
+# MA 02110-1301, USA.
+
+# The goal of this program is to verify if --icf=safe works with
+# -pie as expected. File icf_safe_test.cc is in this test. This
+# program checks if only ctors and dtors are folded, except for
+# the architectures which use relocation types and instruction
+# opcodes to detect if function pointers are taken.
+
+set -e
+
+check_nofold()
+{
+    func_addr_1=`grep $2 $1 | awk '{print $1}'`
+    func_addr_2=`grep $3 $1 | awk '{print $1}'`
+    if [ $func_addr_1 = $func_addr_2 ]
+    then
+        echo "Safe Identical Code Folding folded" $2 "and" $3
+	exit 1
+    fi
+}
+
+check_fold()
+{
+    awk "
+BEGIN { discard = 0; }
+/^Discarded input/ { discard = 1; }
+/^Memory map/ { discard = 0; }
+/.*\\.text\\..*($2|$3).*/ { act[discard] = act[discard] \" \" \$0; }
+END {
+      # printf \"kept\" act[0] \"\\nfolded\" act[1] \"\\n\";
+      if (length(act[0]) == 0 || length(act[1]) == 0)
+	{
+	  printf \"Safe Identical Code Folding did not fold $2 and $3\\n\"
+	  exit 1;
+	}
+    }" $1
+}
+
+arch_specific_safe_fold()
+{
+    if grep -q -e "Advanced Micro Devices X86-64" -e "Intel 80386" -e "ARM" -e "TILE" -e "PowerPC" -e "AArch64" -e "IBM S/390" $2;
+    then
+      check_fold $3 $4 $5
+    else
+      check_nofold $1 $4 $5
+    fi
+}
+
+arch_specific_safe_fold icf_safe_pie_test_1.stdout icf_safe_pie_test_2.stdout \
+  icf_safe_pie_test.map "kept_func_1" "kept_func_2"
+check_fold   icf_safe_pie_test.map "_ZN1AD2Ev" "_ZN1AC2Ev"
+check_nofold icf_safe_pie_test_1.stdout "kept_func_3" "kept_func_1"
+check_nofold icf_safe_pie_test_1.stdout "kept_func_3" "kept_func_2"
diff --git gold/x86_64.cc gold/x86_64.cc
index d21c26813c..7f1742dd5f 100644
--- gold/x86_64.cc
+++ gold/x86_64.cc
@@ -729,10 +729,13 @@ class Target_x86_64 : public Sized_target<size, false>
   // and global_reloc_may_be_function_pointer)
   // if a function's pointer is taken.  ICF uses this in safe mode to only
   // fold those functions whose pointer is defintely not taken.  For x86_64
-  // pie binaries, safe ICF cannot be done by looking at relocation types.
+  // pie binaries, safe ICF cannot be done by looking at only relocation
+  // types, and for certain cases (e.g. R_X86_64_PC32), the instruction
+  // opcode is checked as well to distinguish a function call from taking
+  // a function's pointer.
   bool
   do_can_check_for_function_pointers() const
-  { return !parameters->options().pie(); }
+  { return true; }
 
   // Return the base for a DW_EH_PE_datarel encoding.
   uint64_t
@@ -924,7 +927,10 @@ class Target_x86_64 : public Sized_target<size, false>
     check_non_pic(Relobj*, unsigned int r_type, Symbol*);
 
     inline bool
-    possible_function_pointer_reloc(unsigned int r_type);
+    possible_function_pointer_reloc(Sized_relobj_file<size, false>* src_obj,
+                                    unsigned int src_indx,
+                                    unsigned int r_offset,
+                                    unsigned int r_type);
 
     bool
     reloc_needs_plt_for_ifunc(Sized_relobj_file<size, false>*,
@@ -3277,7 +3283,11 @@ Target_x86_64<size>::Scan::unsupported_reloc_global(
 // Returns true if this relocation type could be that of a function pointer.
 template<int size>
 inline bool
-Target_x86_64<size>::Scan::possible_function_pointer_reloc(unsigned int r_type)
+Target_x86_64<size>::Scan::possible_function_pointer_reloc(
+    Sized_relobj_file<size, false>* src_obj,
+    unsigned int src_indx,
+    unsigned int r_offset,
+    unsigned int r_type)
 {
   switch (r_type)
     {
@@ -3296,6 +3306,41 @@ Target_x86_64<size>::Scan::possible_function_pointer_reloc(unsigned int r_type)
       {
 	return true;
       }
+    case elfcpp::R_X86_64_PC32:
+      {
+        // This relocation may be used both for function calls and
+        // for taking address of a function. We distinguish between
+        // them by checking the opcodes.
+        uint64_t sh_flags = src_obj->section_flags(src_indx);
+        bool is_executable = (sh_flags & elfcpp::SHF_EXECINSTR) != 0;
+        if (is_executable)
+          {
+            section_size_type stype;
+            const unsigned char* view = src_obj->section_contents(src_indx,
+                                                                  &stype,
+                                                                  true);
+
+            // call
+            if (r_offset >= 1
+                && view[r_offset - 1] == 0xe8)
+              return false;
+
+            // jmp
+            if (r_offset >= 1
+                && view[r_offset - 1] == 0xe9)
+              return false;
+
+            // jo/jno/jb/jnb/je/jne/jna/ja/js/jns/jp/jnp/jl/jge/jle/jg
+            if (r_offset >= 2
+                && view[r_offset - 2] == 0x0f
+                && view[r_offset - 1] >= 0x80
+                && view[r_offset - 1] <= 0x8f)
+              return false;
+          }
+
+        // Be conservative and treat all others as function pointers.
+        return true;
+      }
     }
   return false;
 }
@@ -3310,18 +3355,21 @@ Target_x86_64<size>::Scan::local_reloc_may_be_function_pointer(
   Symbol_table* ,
   Layout* ,
   Target_x86_64<size>* ,
-  Sized_relobj_file<size, false>* ,
-  unsigned int ,
+  Sized_relobj_file<size, false>* src_obj,
+  unsigned int src_indx,
   Output_section* ,
-  const elfcpp::Rela<size, false>& ,
+  const elfcpp::Rela<size, false>& reloc,
   unsigned int r_type,
   const elfcpp::Sym<size, false>&)
 {
   // When building a shared library, do not fold any local symbols as it is
   // not possible to distinguish pointer taken versus a call by looking at
   // the relocation types.
-  return (parameters->options().shared()
-	  || possible_function_pointer_reloc(r_type));
+  if (parameters->options().shared())
+    return true;
+
+  return possible_function_pointer_reloc(src_obj, src_indx,
+                                         reloc.get_r_offset(), r_type);
 }
 
 // For safe ICF, scan a relocation for a global symbol to check if it
@@ -3334,20 +3382,23 @@ Target_x86_64<size>::Scan::global_reloc_may_be_function_pointer(
   Symbol_table*,
   Layout* ,
   Target_x86_64<size>* ,
-  Sized_relobj_file<size, false>* ,
-  unsigned int ,
+  Sized_relobj_file<size, false>* src_obj,
+  unsigned int src_indx,
   Output_section* ,
-  const elfcpp::Rela<size, false>& ,
+  const elfcpp::Rela<size, false>& reloc,
   unsigned int r_type,
   Symbol* gsym)
 {
   // When building a shared library, do not fold symbols whose visibility
   // is hidden, internal or protected.
-  return ((parameters->options().shared()
-	   && (gsym->visibility() == elfcpp::STV_INTERNAL
-	       || gsym->visibility() == elfcpp::STV_PROTECTED
-	       || gsym->visibility() == elfcpp::STV_HIDDEN))
-	  || possible_function_pointer_reloc(r_type));
+  if (parameters->options().shared()
+      && (gsym->visibility() == elfcpp::STV_INTERNAL
+	  || gsym->visibility() == elfcpp::STV_PROTECTED
+	  || gsym->visibility() == elfcpp::STV_HIDDEN))
+    return true;
+
+  return possible_function_pointer_reloc(src_obj, src_indx,
+                                         reloc.get_r_offset(), r_type);
 }
 
 // Scan a relocation for a global symbol.

Cary Coutant Feb. 15, 2017, 8:39 a.m. | #10
>         * x86_64.cc (Target_x86_64::do_can_check_for_function_pointers):

>         Return true even when building pie binaries.

>         (Target_x86_64::possible_function_pointer_reloc): Check opcode

>         for R_X86_64_PC32 relocations.

>         (Target_x86_64::local_reloc_may_be_function_pointer): Pass

>         extra arguments to local_reloc_may_be_function_pointer.

>         (Target_x86_64::global_reloc_may_be_function_pointer): Likewise.

>         * gc.h (gc_process_relocs): Add check for STT_FUNC.

>         * testsuite/Makefile.am (icf_safe_pie_test): New test case.

>         * testsuite/Makefile.in: Regenerate.

>         * testsuite/icf_safe_pie_test.sh: New shell script.


This is OK. I've committed it on your behalf.

Thanks!

-cary

Patch hide | download patch | download mbox

diff --git gold/ChangeLog gold/ChangeLog
index 6cc97528a8..05e081186a 100644
--- gold/ChangeLog
+++ gold/ChangeLog
@@ -1,3 +1,16 @@ 
+2017-01-12  Rahul Chaudhry  <rahulchaudhry@google.com>
+
+	* x86_64.cc (Target_x86_64::do_can_check_for_function_pointers):
+	Return true even when building pie binaries.
+	(Target_x86_64::possible_function_pointer_reloc): Check opcode
+	for R_X86_64_PC32 relocations.
+	(Target_x86_64::local_reloc_may_be_function_pointer): Pass
+	extra arguments to local_reloc_may_be_function_pointer.
+	(Target_x86_64::global_reloc_may_be_function_pointer): Likewise.
+	* testsuite/Makefile.am (icf_safe_pie_test): New test case.
+	* testsuite/Makefile.in: Regenerate.
+	* testsuite/icf_safe_pie_test.sh: New shell script.
+
 2017-01-11  Cary Coutant  <ccoutant@gmail.com>
 
 	PR gold/21040
diff --git gold/testsuite/Makefile.am gold/testsuite/Makefile.am
index d9480abb42..d0803d251a 100644
--- gold/testsuite/Makefile.am
+++ gold/testsuite/Makefile.am
@@ -274,6 +274,20 @@  icf_safe_test_1.stdout: icf_safe_test
 icf_safe_test_2.stdout: icf_safe_test
 	$(TEST_READELF) -h $< > $@
 
+check_SCRIPTS += icf_safe_pie_test.sh
+check_DATA += icf_safe_pie_test_1.stdout icf_safe_pie_test_2.stdout icf_safe_pie_test.map
+MOSTLYCLEANFILES += icf_safe_pie_test icf_safe_pie_test.map
+icf_safe_pie_test.o: icf_safe_test.cc
+	$(CXXCOMPILE) -O0 -c -ffunction-sections -fPIE -g -o $@ $<
+icf_safe_pie_test: icf_safe_pie_test.o gcctestdir/ld
+	$(CXXLINK) -o icf_safe_pie_test -Bgcctestdir/ -Wl,--icf=safe,-Map,icf_safe_pie_test.map icf_safe_pie_test.o -pie
+icf_safe_pie_test.map: icf_safe_pie_test
+	@touch icf_safe_pie_test.map
+icf_safe_pie_test_1.stdout: icf_safe_pie_test
+	$(TEST_NM) $< > $@
+icf_safe_pie_test_2.stdout: icf_safe_pie_test
+	$(TEST_READELF) -h $< > $@
+
 check_SCRIPTS += icf_safe_so_test.sh
 check_DATA += icf_safe_so_test_1.stdout icf_safe_so_test_2.stdout icf_safe_so_test.map
 MOSTLYCLEANFILES += icf_safe_so_test icf_safe_so_test.map
diff --git gold/testsuite/Makefile.in gold/testsuite/Makefile.in
index 2c67bf5fe2..133e733cf1 100644
--- gold/testsuite/Makefile.in
+++ gold/testsuite/Makefile.in
@@ -82,6 +82,7 @@  check_PROGRAMS = $(am__EXEEXT_1) $(am__EXEEXT_2) $(am__EXEEXT_3) \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_test.sh \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_keep_unique_test.sh \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_test.sh \
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_pie_test.sh \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_so_test.sh \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	final_layout.sh \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	text_section_grouping.sh \
@@ -103,6 +104,9 @@  check_PROGRAMS = $(am__EXEEXT_1) $(am__EXEEXT_2) $(am__EXEEXT_3) \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_test_1.stdout \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_test_2.stdout \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_test.map \
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_pie_test_1.stdout \
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_pie_test_2.stdout \
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_pie_test.map \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_so_test_1.stdout \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_so_test_2.stdout \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_so_test.map \
@@ -125,6 +129,8 @@  check_PROGRAMS = $(am__EXEEXT_1) $(am__EXEEXT_2) $(am__EXEEXT_3) \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_test icf_test.map \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_keep_unique_test \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_test icf_safe_test.map \
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_pie_test \
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_pie_test.map \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_so_test \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	icf_safe_so_test.map \
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	final_layout \
@@ -5093,6 +5099,8 @@  icf_keep_unique_test.sh.log: icf_keep_unique_test.sh
 	@p='icf_keep_unique_test.sh'; $(am__check_pre) $(LOG_COMPILE) "$$tst" $(am__check_post)
 icf_safe_test.sh.log: icf_safe_test.sh
 	@p='icf_safe_test.sh'; $(am__check_pre) $(LOG_COMPILE) "$$tst" $(am__check_post)
+icf_safe_pie_test.sh.log: icf_safe_pie_test.sh
+	@p='icf_safe_pie_test.sh'; $(am__check_pre) $(LOG_COMPILE) "$$tst" $(am__check_post)
 icf_safe_so_test.sh.log: icf_safe_so_test.sh
 	@p='icf_safe_so_test.sh'; $(am__check_pre) $(LOG_COMPILE) "$$tst" $(am__check_post)
 final_layout.sh.log: final_layout.sh
@@ -5910,6 +5918,16 @@  uninstall-am:
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	$(TEST_NM) $< > $@
 @GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_test_2.stdout: icf_safe_test
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	$(TEST_READELF) -h $< > $@
+@GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_pie_test.o: icf_safe_test.cc
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	$(CXXCOMPILE) -O0 -c -ffunction-sections -fPIE -g -o $@ $<
+@GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_pie_test: icf_safe_pie_test.o gcctestdir/ld
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	$(CXXLINK) -o icf_safe_pie_test -Bgcctestdir/ -Wl,--icf=safe,-Map,icf_safe_pie_test.map icf_safe_pie_test.o -pie
+@GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_pie_test.map: icf_safe_pie_test
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	@touch icf_safe_pie_test.map
+@GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_pie_test_1.stdout: icf_safe_pie_test
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	$(TEST_NM) $< > $@
+@GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_pie_test_2.stdout: icf_safe_pie_test
+@GCC_TRUE@@NATIVE_LINKER_TRUE@	$(TEST_READELF) -h $< > $@
 @GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_so_test.o: icf_safe_so_test.cc
 @GCC_TRUE@@NATIVE_LINKER_TRUE@	$(CXXCOMPILE) -O0 -c -ffunction-sections -fPIC -g -o $@ $<
 @GCC_TRUE@@NATIVE_LINKER_TRUE@icf_safe_so_test: icf_safe_so_test.o gcctestdir/ld
diff --git gold/testsuite/icf_safe_pie_test.sh gold/testsuite/icf_safe_pie_test.sh
new file mode 100755
index 0000000000..f91a80c0b9
--- /dev/null
+++ gold/testsuite/icf_safe_pie_test.sh
@@ -0,0 +1,76 @@ 
+#!/bin/sh
+
+# icf_safe_pie_test.sh -- test --icf=safe -pie
+
+# Copyright (C) 2009-2017 Free Software Foundation, Inc.
+# Written by Sriraman Tallam <tmsriram@google.com>.
+# Modified by Rahul Chaudhry <rahulchaudhry@google.com>.
+
+# This file is part of gold.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston,
+# MA 02110-1301, USA.
+
+# The goal of this program is to verify if --icf=safe works with
+# -pie as expected. File icf_safe_test.cc is in this test. This
+# program checks if only ctors and dtors are folded, except for
+# the architectures which use relocation types and instruction
+# opcodes to detect if function pointers are taken.
+
+set -e
+
+check_nofold()
+{
+    func_addr_1=`grep $2 $1 | awk '{print $1}'`
+    func_addr_2=`grep $3 $1 | awk '{print $1}'`
+    if [ $func_addr_1 = $func_addr_2 ]
+    then
+        echo "Safe Identical Code Folding folded" $2 "and" $3
+	exit 1
+    fi
+}
+
+check_fold()
+{
+    awk "
+BEGIN { discard = 0; }
+/^Discarded input/ { discard = 1; }
+/^Memory map/ { discard = 0; }
+/.*\\.text\\..*($2|$3).*/ { act[discard] = act[discard] \" \" \$0; }
+END {
+      # printf \"kept\" act[0] \"\\nfolded\" act[1] \"\\n\";
+      if (length(act[0]) == 0 || length(act[1]) == 0)
+	{
+	  printf \"Safe Identical Code Folding did not fold $2 and $3\\n\"
+	  exit 1;
+	}
+    }" $1
+}
+
+arch_specific_safe_fold()
+{
+    if grep -q -e "Advanced Micro Devices X86-64" -e "Intel 80386" -e "ARM" -e "TILE" -e "PowerPC" -e "AArch64" -e "IBM S/390" $2;
+    then
+      check_fold $3 $4 $5
+    else
+      check_nofold $1 $4 $5
+    fi
+}
+
+arch_specific_safe_fold icf_safe_pie_test_1.stdout icf_safe_pie_test_2.stdout \
+  icf_safe_pie_test.map "kept_func_1" "kept_func_2"
+check_fold   icf_safe_pie_test.map "_ZN1AD2Ev" "_ZN1AC2Ev"
+check_nofold icf_safe_pie_test_1.stdout "kept_func_3" "kept_func_1"
+check_nofold icf_safe_pie_test_1.stdout "kept_func_3" "kept_func_2"
diff --git gold/x86_64.cc gold/x86_64.cc
index ffa876116a..8360033d8c 100644
--- gold/x86_64.cc
+++ gold/x86_64.cc
@@ -729,10 +729,13 @@  class Target_x86_64 : public Sized_target<size, false>
   // and global_reloc_may_be_function_pointer)
   // if a function's pointer is taken.  ICF uses this in safe mode to only
   // fold those functions whose pointer is defintely not taken.  For x86_64
-  // pie binaries, safe ICF cannot be done by looking at relocation types.
+  // pie binaries, safe ICF cannot be done by looking at only relocation
+  // types, and for certain cases (e.g. R_X86_64_PC32), the instruction
+  // opcode is checked as well to distinguish a function call from taking
+  // a function's pointer.
   bool
   do_can_check_for_function_pointers() const
-  { return !parameters->options().pie(); }
+  { return true; }
 
   // Return the base for a DW_EH_PE_datarel encoding.
   uint64_t
@@ -924,7 +927,10 @@  class Target_x86_64 : public Sized_target<size, false>
     check_non_pic(Relobj*, unsigned int r_type, Symbol*);
 
     inline bool
-    possible_function_pointer_reloc(unsigned int r_type);
+    possible_function_pointer_reloc(Sized_relobj_file<size, false>* src_obj,
+                                    unsigned int src_indx,
+                                    unsigned int r_offset,
+                                    unsigned int r_type);
 
     bool
     reloc_needs_plt_for_ifunc(Sized_relobj_file<size, false>*,
@@ -3274,7 +3280,11 @@  Target_x86_64<size>::Scan::unsupported_reloc_global(
 // Returns true if this relocation type could be that of a function pointer.
 template<int size>
 inline bool
-Target_x86_64<size>::Scan::possible_function_pointer_reloc(unsigned int r_type)
+Target_x86_64<size>::Scan::possible_function_pointer_reloc(
+    Sized_relobj_file<size, false>* src_obj,
+    unsigned int src_indx,
+    unsigned int r_offset,
+    unsigned int r_type)
 {
   switch (r_type)
     {
@@ -3293,6 +3303,36 @@  Target_x86_64<size>::Scan::possible_function_pointer_reloc(unsigned int r_type)
       {
 	return true;
       }
+    case elfcpp::R_X86_64_PC32:
+      {
+        // This relocation may be used both for function calls and
+        // for taking address of a function. We distinguish between
+        // them by checking the opcodes.
+        section_size_type stype;
+        const unsigned char* view = src_obj->section_contents(src_indx,
+                                                              &stype,
+                                                              true);
+
+        // call
+        if (r_offset >= 1
+            && view[r_offset - 1] == 0xe8)
+          return false;
+
+        // jmp
+        if (r_offset >= 1
+            && view[r_offset - 1] == 0xe9)
+          return false;
+
+        // jo/jno/jb/jnb/je/jne/jna/ja/js/jns/jp/jnp/jl/jge/jle/jg
+        if (r_offset >= 2
+            && view[r_offset - 2] == 0x0f
+            && view[r_offset - 1] >= 0x80
+            && view[r_offset - 1] <= 0x8f)
+          return false;
+
+        // Be conservative and treat all others as function pointers.
+        return true;
+      }
     }
   return false;
 }
@@ -3307,18 +3347,21 @@  Target_x86_64<size>::Scan::local_reloc_may_be_function_pointer(
   Symbol_table* ,
   Layout* ,
   Target_x86_64<size>* ,
-  Sized_relobj_file<size, false>* ,
-  unsigned int ,
+  Sized_relobj_file<size, false>* src_obj,
+  unsigned int src_indx,
   Output_section* ,
-  const elfcpp::Rela<size, false>& ,
+  const elfcpp::Rela<size, false>& reloc,
   unsigned int r_type,
   const elfcpp::Sym<size, false>&)
 {
   // When building a shared library, do not fold any local symbols as it is
   // not possible to distinguish pointer taken versus a call by looking at
   // the relocation types.
-  return (parameters->options().shared()
-	  || possible_function_pointer_reloc(r_type));
+  if (parameters->options().shared())
+    return true;
+
+  return possible_function_pointer_reloc(src_obj, src_indx,
+                                         reloc.get_r_offset(), r_type);
 }
 
 // For safe ICF, scan a relocation for a global symbol to check if it
@@ -3331,20 +3374,23 @@  Target_x86_64<size>::Scan::global_reloc_may_be_function_pointer(
   Symbol_table*,
   Layout* ,
   Target_x86_64<size>* ,
-  Sized_relobj_file<size, false>* ,
-  unsigned int ,
+  Sized_relobj_file<size, false>* src_obj,
+  unsigned int src_indx,
   Output_section* ,
-  const elfcpp::Rela<size, false>& ,
+  const elfcpp::Rela<size, false>& reloc,
   unsigned int r_type,
   Symbol* gsym)
 {
   // When building a shared library, do not fold symbols whose visibility
   // is hidden, internal or protected.
-  return ((parameters->options().shared()
-	   && (gsym->visibility() == elfcpp::STV_INTERNAL
-	       || gsym->visibility() == elfcpp::STV_PROTECTED
-	       || gsym->visibility() == elfcpp::STV_HIDDEN))
-	  || possible_function_pointer_reloc(r_type));
+  if (parameters->options().shared()
+      && (gsym->visibility() == elfcpp::STV_INTERNAL
+	  || gsym->visibility() == elfcpp::STV_PROTECTED
+	  || gsym->visibility() == elfcpp::STV_HIDDEN))
+    return true;
+
+  return possible_function_pointer_reloc(src_obj, src_indx,
+                                         reloc.get_r_offset(), r_type);
 }
 
 // Scan a relocation for a global symbol.