[v3,14/30] perf clang: Support compile IR to BPF object and add testcase

Message ID 583C076F.8000505@huawei.com
State New

Commit Message

Wang Nan Nov. 28, 2016, 10:31 a.m.
On 2016/11/28 14:32, Wangnan (F) wrote:
>
>
> On 2016/11/27 1:25, Alexei Starovoitov wrote:
>> On Sat, Nov 26, 2016 at 07:03:38AM +0000, Wang Nan wrote:
>>> getBPFObjectFromModule() is introduced to compile LLVM IR(Module)
>>> to BPF object. Add new testcase for it.
>>>
>>> Test result:
>>>    $ ./buildperf/perf test -v clang
>>>    51: Test builtin clang support                               :
>>>    51.1: Test builtin clang compile C source to IR              :
>>>    --- start ---
>>>    test child forked, pid 21822
>>>    test child finished with 0
>>>    ---- end ----
>>>    Test builtin clang support subtest 0: Ok
>>>    51.2: Test builtin clang compile C source to ELF object      :
>>>    --- start ---
>>>    test child forked, pid 21823
>>>    test child finished with 0
>>>    ---- end ----
>>>    Test builtin clang support subtest 1: Ok
>>>
>>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>> ...
>>> +    legacy::PassManager PM;
>>> +    if (TargetMachine->addPassesToEmitFile(PM, ostream,
>>> +                           TargetMachine::CGFT_ObjectFile)) {
>>> +        llvm::errs() << "TargetMachine can't emit a file of this type\n";
>>> +        return std::unique_ptr<llvm::SmallVectorImpl<char>>(nullptr);;
>>> +    }
>>> +    PM.run(*Module);
>> I'm pretty sure you want to add FunctionInlingPass as well otherwise
>> I think llvm won't be doing much inlining and only very very simple
>> programs will compile fine. See what we did on bcc side.
>
> Thank you for your information. I though inlining should be done during
> C to IR phase, and we have use -O2 for it. Let me check it.
>

I did a simple test. It seems that even without FunctionInliningPass,
clang/llvm can inline static functions with no problem. For example, in the
sample code in the cover letter, I extracted a static function like this:

   static void inc_counter(u64 id)
   {
       u64 *counter;

       counter = bpf_map_lookup_elem(&syscall_counter, &id);
       if (!counter) {
           u64 value = 1;
           bpf_map_update_elem(&syscall_counter, &id, &value, 0);
           return;
       }
       __sync_fetch_and_add(counter, 1);
       return;
   }

Then I enabled llvm.dump-obj = true in ~/.perfconfig so we can see the
resulting ELF object.

The script worked correctly. readelf reports no inc_counter symbol:

  $ readelf -a ./count_syscalls.o | grep inc_counter
  $

I also inserted output commands into PerfModule::prepareBPF and
PerfModule::prepareJIT to print the names of functions, and inc_counter
does not show up there either.

Then I removed -O2 from cflags in createCompilerInvocation. Result:

# ./perf record -e ./count_syscalls.c -a sleep 1
LLVM ERROR: Cannot select: t38: ch,glue = BPFISD::CALL t37, t31, Register:i64 %R1, Register:i64 %R2, t37:1
   t31: i64,ch = load<LD8[@bpf_map_lookup_elem]> t51, t58, undef:i64
     t58: i64 = BPFISD::Wrapper TargetGlobalAddress:i64<i8* (i8*, i8*)** @bpf_map_lookup_elem> 0
       t57: i64 = TargetGlobalAddress<i8* (i8*, i8*)** @bpf_map_lookup_elem> 0
     t5: i64 = undef
   t34: i64 = Register %R1
   t36: i64 = Register %R2
   t37: ch,glue = CopyToReg t35, Register:i64 %R2, FrameIndex:i64<5>, t35:1
     t36: i64 = Register %R2
     t8: i64 = FrameIndex<5>
     t35: ch,glue = CopyToReg t33, Register:i64 %R1, t56
       t34: i64 = Register %R1
       t56: i64 = BPFISD::Wrapper TargetGlobalAddress:i64<%struct.bpf_map_def* @GVALS> 0
         t55: i64 = TargetGlobalAddress<%struct.bpf_map_def* @GVALS> 0
In function: func

I don't know whether -O2 implies inlining.

In bcc, you not only use FunctionInlining, but also add AlwaysInlinerPass
and use populateModulePassManager to append other optimizations. I tried to
mimic your code, but it seems the perfhook functions are optimized out
by some optimization added by populateModulePassManager.
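
Editorial sketch, not part of the original message: the question above is whether the helper is inlined only because of -O2. One common way to make that independent of the optimization level is to mark the helper with the GCC/Clang always_inline attribute, which is what an always-inliner pass acts on. The example below is a plain userspace program under stated assumptions: lookup_elem(), update_elem(), count_three_hits() and NR_SLOTS are hypothetical stand-ins for the BPF map helpers, not the real BPF API, and the array replaces the syscall_counter map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;

#define NR_SLOTS 16 /* hypothetical size of the stand-in map */

/* Hypothetical stand-in for the syscall_counter BPF map: a plain array. */
static u64 syscall_counter[NR_SLOTS];
static unsigned char slot_used[NR_SLOTS];

/* Stand-in for bpf_map_lookup_elem(): NULL until the slot was created. */
static u64 *lookup_elem(u64 id)
{
    if (id >= NR_SLOTS || !slot_used[id])
        return NULL;
    return &syscall_counter[id];
}

/* Stand-in for bpf_map_update_elem(): create or overwrite a slot. */
static int update_elem(u64 id, u64 value)
{
    if (id >= NR_SLOTS)
        return -1;
    syscall_counter[id] = value;
    slot_used[id] = 1;
    return 0;
}

/*
 * Same logic as inc_counter() in the mail, but marked always_inline
 * (a GCC/Clang extension) so that inlining is requested regardless of
 * the optimization level instead of being left to the -O2 heuristics.
 */
static inline __attribute__((always_inline)) void inc_counter(u64 id)
{
    u64 *counter = lookup_elem(id);

    if (!counter) {
        update_elem(id, 1);
        return;
    }
    __sync_fetch_and_add(counter, 1);
}

/* Record three hits for one id and return the resulting counter value. */
u64 count_three_hits(u64 id)
{
    u64 *counter;

    assert(id < NR_SLOTS);
    inc_counter(id);
    inc_counter(id);
    inc_counter(id);
    counter = lookup_elem(id);
    return counter ? *counter : 0;
}
```

Whether the symbol survives can then be checked the same way as above, by grepping the symbol table of the resulting object for inc_counter at different optimization levels.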

Although this is not entirely clear to me, I'll make the following change.
Please help me check it.

Thank you.

Comments

Alexei Starovoitov Nov. 28, 2016, 7:33 p.m. | #1
On Mon, Nov 28, 2016 at 06:31:11PM +0800, Wangnan (F) wrote:
>
>
> On 2016/11/28 14:32, Wangnan (F) wrote:
> >
> >
> >On 2016/11/27 1:25, Alexei Starovoitov wrote:
> >>On Sat, Nov 26, 2016 at 07:03:38AM +0000, Wang Nan wrote:
> >>>getBPFObjectFromModule() is introduced to compile LLVM IR(Module)
> >>>to BPF object. Add new testcase for it.
> >>>
> >>>Test result:
> >>>   $ ./buildperf/perf test -v clang
> >>>   51: Test builtin clang support                               :
> >>>   51.1: Test builtin clang compile C source to IR              :
> >>>   --- start ---
> >>>   test child forked, pid 21822
> >>>   test child finished with 0
> >>>   ---- end ----
> >>>   Test builtin clang support subtest 0: Ok
> >>>   51.2: Test builtin clang compile C source to ELF object      :
> >>>   --- start ---
> >>>   test child forked, pid 21823
> >>>   test child finished with 0
> >>>   ---- end ----
> >>>   Test builtin clang support subtest 1: Ok
> >>>
> >>>Signed-off-by: Wang Nan <wangnan0@huawei.com>
> >>...
> >>>+    legacy::PassManager PM;
> >>>+    if (TargetMachine->addPassesToEmitFile(PM, ostream,
> >>>+                           TargetMachine::CGFT_ObjectFile)) {
> >>>+        llvm::errs() << "TargetMachine can't emit a file of this type\n";
> >>>+        return std::unique_ptr<llvm::SmallVectorImpl<char>>(nullptr);;
> >>>+    }
> >>>+    PM.run(*Module);
> >>I'm pretty sure you want to add FunctionInlingPass as well otherwise
> >>I think llvm won't be doing much inlining and only very very simple
> >>programs will compile fine. See what we did on bcc side.
> >
> >Thank you for your information. I though inlining should be done during
> >C to IR phase, and we have use -O2 for it. Let me check it.
> >
>
> I did a simple test. It seems even without FunctionInliningPass clang/llvm
> can inline static function with no problem. For example, in the sample code
> in the cover letter, extract a static function like this:
>
>   static void inc_counter(u64 id)
>   {
>       u64 *counter;
>
>       counter = bpf_map_lookup_elem(&syscall_counter, &id);
>       if (!counter) {
>           u64 value = 1;
>           bpf_map_update_elem(&syscall_counter, &id, &value, 0);
>           return;
>       }
>       __sync_fetch_and_add(counter, 1);
>       return;
>   }
>
> Then enable llvm.dump-obj = true in ~/.perfconfig so we can see the
> resuling ELF object.
>
> The script worked correctly. readelf report:
>
>  $ readelf -a ./count_syscalls.o | grep inc_counter
>  $
>
> Inserting output command into PerfModule::prepareBPF and
> PerfModule::prepareJIT
> to print names of functions, can't see inc_counter.
>
> Then remove -O2 in cflags in createCompilerInvocation. Result:
>
> # ./perf record -e ./count_syscalls.c -a sleep 1
> LLVM ERROR: Cannot select: t38: ch,glue = BPFISD::CALL t37, t31, Register:i64 %R1, Register:i64 %R2, t37:1
>   t31: i64,ch = load<LD8[@bpf_map_lookup_elem]> t51, t58, undef:i64
>     t58: i64 = BPFISD::Wrapper TargetGlobalAddress:i64<i8* (i8*, i8*)** @bpf_map_lookup_elem> 0
>       t57: i64 = TargetGlobalAddress<i8* (i8*, i8*)** @bpf_map_lookup_elem> 0
>     t5: i64 = undef
>   t34: i64 = Register %R1
>   t36: i64 = Register %R2
>   t37: ch,glue = CopyToReg t35, Register:i64 %R2, FrameIndex:i64<5>, t35:1
>     t36: i64 = Register %R2
>     t8: i64 = FrameIndex<5>
>     t35: ch,glue = CopyToReg t33, Register:i64 %R1, t56
>       t34: i64 = Register %R1
>       t56: i64 = BPFISD::Wrapper TargetGlobalAddress:i64<%struct.bpf_map_def* @GVALS> 0
>         t55: i64 = TargetGlobalAddress<%struct.bpf_map_def* @GVALS> 0
> In function: func


yeah. we need to improve the above 'cannot select' error.
please send a patch if you have spare cycles.
I'm planning to add more meaningful warnings to the backend.
In most cases the backend knows that a certain program will not pass
the verifier, so it's better to warn about that early.

> Don't know whether -O2 imply inlining.
>
> In bcc, you not only use FunctionInlining, but also add AlwaysInlinerPass
> and use populateModulePassManager to append other optimization. I tried to
> minimic your code, but it seems the perfhook functions are optimized out
> by some optimization added by populateModulePassManager.
>
> Although not quite clear, I'll make following change. Please help me
> check it.


well, if you see inlining happening without explicitly specifying
the pass, then just keep it as-is.
Though please double check that the backend is doing the inlining.
Maybe in your simple test the front-end inlined it.

Patch

diff --git a/tools/perf/util/c++/clang.cpp b/tools/perf/util/c++/clang.cpp
index d05ab6f..d6d1959 100644
--- a/tools/perf/util/c++/clang.cpp
+++ b/tools/perf/util/c++/clang.cpp
@@ -22,6 +22,8 @@ 
  #include "llvm/Support/TargetSelect.h"
  #include "llvm/Target/TargetMachine.h"
  #include "llvm/Target/TargetOptions.h"
+#include "llvm-c/Transforms/IPO.h"
+#include "llvm/Transforms/IPO.h"
  #include <memory>

  #include "clang.h"
@@ -133,6 +135,13 @@  getBPFObjectFromModule(llvm::Module *Module)
         raw_svector_ostream ostream(*Buffer);

         legacy::PassManager PM;
+
+    PM.add(createFunctionInliningPass());
+    /*
+     * LLVM is changing its interface. Use a stable workaround.
+     */
+    LLVMAddAlwaysInlinerPass(reinterpret_cast<LLVMPassManagerRef>(&PM));
+
      if (TargetMachine->addPassesToEmitFile(PM, ostream,
                             TargetMachine::CGFT_ObjectFile)) {
          llvm::errs() << "TargetMachine can't emit a file of this type\n";