[v2,3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

Message ID 20190606094845.4800-4-leo.yan@linaro.org
State New
Headers show
Series
  • perf augmented_raw_syscalls: Support for arm64
Related show

Commit Message

Leo Yan June 6, 2019, 9:48 a.m.
This patch adds support for arm64 raw syscall numbers so that we can use
it on arm64 platform.

After applied this patch, we need to specify macro -D__aarch64__ or
-D__x86_64__ in compilation option so Clang can use the corresponding
syscall numbers for arm64 or x86_64 respectively, other architectures
will report failure when compilation.

Signed-off-by: Leo Yan <leo.yan@linaro.org>

---
 .../examples/bpf/augmented_raw_syscalls.c     | 81 +++++++++++++++++++
 1 file changed, 81 insertions(+)

-- 
2.17.1

Comments

Leo Yan June 7, 2019, 9:58 a.m. | #1
Hi Arnaldo,

On Thu, Jun 06, 2019 at 11:44:12AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jun 06, 2019 at 10:12:31PM +0800, Leo Yan escreveu:

> > Hi Arnaldo,

> > 

> > On Thu, Jun 06, 2019 at 10:38:38AM -0300, Arnaldo Carvalho de Melo wrote:

> > > Em Thu, Jun 06, 2019 at 05:48:44PM +0800, Leo Yan escreveu:

> > > > This patch adds support for arm64 raw syscall numbers so that we can use

> > > > it on arm64 platform.

> > > > 

> > > > After applied this patch, we need to specify macro -D__aarch64__ or

> > > > -D__x86_64__ in compilation option so Clang can use the corresponding

> > > > syscall numbers for arm64 or x86_64 respectively, other architectures

> > > > will report failure when compilation.

> > > 

> > > So, please check what I have in my perf/core branch, I've completely

> > > removed arch specific stuff from augmented_raw_syscalls.c.

> > > 

> > > What is done now is use a map to specify what to copy, that same map

> > > that is used to state which syscalls should be traced.

> > > 

> > > It uses that tools/perf/arch/arm64/entry/syscalls/mksyscalltbl to figure

> > > out the mapping of syscall names to ids, just like is done for x86_64

> > > and other arches, falling back to audit-libs when that syscalltbl thing

> > > is not present.

> > 

> > Actually I have noticed mksyscalltbl has been enabled for arm64, and

> > had to say your approach is much better :)

> > 

> > Thanks for the info and I will try your patch at my side.

> 

> That is excellent news! I'm eager to hear from you if this perf+BPF

> integration experiment works for arm64.


I tested with the lastest perf/core branch which contains the patch:
'perf augmented_raw_syscalls: Tell which args are filenames and how
many bytes to copy' and got the error as below:

# perf trace -e string -e /mnt/linux-kernel/linux-cs-dev/tools/perf/examples/bpf/augmented_raw_syscalls.c
Error:  Invalid syscall access, chmod, chown, creat, futimesat, lchown, link, lstat, mkdir, mknod, newfstatat, open, readlink, rename,
rmdir, stat, statfs, symlink, truncate, unlink
Hint:   try 'perf list syscalls:sys_enter_*'
Hint:   and: 'man syscalls'

So seems mksyscalltbl has not included completely for syscalls, I
use below command to generate syscalltbl_arm64[] array and it don't
include related entries for access, chmod, chown, etc ...

You could refer the generated syscalltbl_arm64 in:
http://paste.ubuntu.com/p/8Bj7Jkm2mP/

> I'm now trying to get past the verifier when checking if more than one

> syscall arg is a filename, i.e. things like the rename* family, that

> take two filenames.

> 

> An exercise in loop unrolling, providing the right hints to the

> verifier, making sure clang don't trash those via explicit barriers, and

> a lot of patience, limitless fun! ;-)


Cool!  Please feel free let me know if need me to do testing for this.

Thanks,
Leo Yan
Arnaldo Carvalho de Melo June 10, 2019, 6:47 p.m. | #2
Em Sun, Jun 09, 2019 at 09:18:49PM +0800, Leo Yan escreveu:
> On Fri, Jun 07, 2019 at 05:58:31PM +0800, Leo Yan wrote:

> > Hi Arnaldo,

> > 

> > On Thu, Jun 06, 2019 at 11:44:12AM -0300, Arnaldo Carvalho de Melo wrote:

> > > Em Thu, Jun 06, 2019 at 10:12:31PM +0800, Leo Yan escreveu:

> > > > Hi Arnaldo,

> > > > 

> > > > On Thu, Jun 06, 2019 at 10:38:38AM -0300, Arnaldo Carvalho de Melo wrote:

> > > > > Em Thu, Jun 06, 2019 at 05:48:44PM +0800, Leo Yan escreveu:

> > > > > > This patch adds support for arm64 raw syscall numbers so that we can use

> > > > > > it on arm64 platform.

> > > > > > 

> > > > > > After applied this patch, we need to specify macro -D__aarch64__ or

> > > > > > -D__x86_64__ in compilation option so Clang can use the corresponding

> > > > > > syscall numbers for arm64 or x86_64 respectively, other architectures

> > > > > > will report failure when compilation.

> > > > > 

> > > > > So, please check what I have in my perf/core branch, I've completely

> > > > > removed arch specific stuff from augmented_raw_syscalls.c.

> > > > > 

> > > > > What is done now is use a map to specify what to copy, that same map

> > > > > that is used to state which syscalls should be traced.

> > > > > 

> > > > > It uses that tools/perf/arch/arm64/entry/syscalls/mksyscalltbl to figure

> > > > > out the mapping of syscall names to ids, just like is done for x86_64

> > > > > and other arches, falling back to audit-libs when that syscalltbl thing

> > > > > is not present.

> > > > 

> > > > Actually I have noticed mksyscalltbl has been enabled for arm64, and

> > > > had to say your approach is much better :)

> > > > 

> > > > Thanks for the info and I will try your patch at my side.

> > > 

> > > That is excellent news! I'm eager to hear from you if this perf+BPF

> > > integration experiment works for arm64.

> > 

> > I tested with the lastest perf/core branch which contains the patch:

> > 'perf augmented_raw_syscalls: Tell which args are filenames and how

> > many bytes to copy' and got the error as below:

> > 

> > # perf trace -e string -e /mnt/linux-kernel/linux-cs-dev/tools/perf/examples/bpf/augmented_raw_syscalls.c

> > Error:  Invalid syscall access, chmod, chown, creat, futimesat, lchown, link, lstat, mkdir, mknod, newfstatat, open, readlink, rename,

> > rmdir, stat, statfs, symlink, truncate, unlink


Humm, I think that we can just make the code that parses the
tools/perf/trace/strace/groups/string file to ignore syscalls it can't
find in the syscall_tbl, i.e. trace those if they exist in the arch.

> > Hint:   try 'perf list syscalls:sys_enter_*'

> > Hint:   and: 'man syscalls'

> > 

> > So seems mksyscalltbl has not included completely for syscalls, I

> > use below command to generate syscalltbl_arm64[] array and it don't

> > include related entries for access, chmod, chown, etc ...


So, we need to investigate why is that these are missing, good thing we
have this 'strings' group :-)

> > You could refer the generated syscalltbl_arm64 in:

> > http://paste.ubuntu.com/p/8Bj7Jkm2mP/

> 

> After digging into this issue on Arm64, below is summary info:

> 

> - arm64 uses the header include/uapi/linux/unistd.h to define system

>   call numbers, in this header some system calls are not defined (I

>   think the reason is these system calls are obsolete at the end) so the

>   corresponding strings are missed in the array syscalltbl_native,

>   for arm64 the array is defined in the file:

>   tools/perf/arch/arm64/include/generated/asm/syscalls.c.


Yeah, I looked at the 'access' case and indeed it is not present in
include/uapi/asm-generic/unistd.h, which is the place
include/uapi/linux/unistd.h ends up.

Ok please take a look at the patch at the end of this message, should be ok?

I tested it by changing the strace/gorups/string file to have a few
unknown syscalls, running it with -v we see:

[root@quaco perf]# perf trace -v -e string ls
Skipping unknown syscalls: access99, acct99, add_key99
<SNIP other verbose messages>
normal operation not considering those unknown syscalls.

>   On the other hand, the file tools/perf/trace/strace/groups/string

>   stores the required system call strings, these system call strings

>   are based on x86_64 platform but not for arm64, the strings mismatch

>   with the system call defined in the array syscalltbl_native.  This

>   is the reason why reports the fail: "Error:  Invalid syscall access,

>   chmod, chown, creat, futimesat, lchown, link, lstat, mkdir, mknod,

>   newfstatat, open, readlink, rename, rmdir, stat, statfs, symlink,

>   truncate, unlink".

> 

>   I tried to manually remove these reported strings from

>   tools/perf/trace/strace/groups/string, then 'perf trace' can work

>   well.

> 

>   But I don't know what's a good way to proceed.  Seems to me, we can

>   create a dedicated string file

>   tools/perf/trace/strace/groups/uapi_string which can be used to

>   match with system calls definitions in include/uapi/linux/unistd.h.

>   If there have other more general methods, will be great.

 
> - As a side topic, arm64 also supports aarch32 compat system call

>   which are defined in header arch/arm64/include/asm/unistd32.h.

> 

>   For either aarch64 or aarch32 system call, both of them finally will

>   invoke function el0_svc_common() to handle system call [1].  But so

>   far we don't distinguish the system call numbers is for aarch64 or

>   aarch32 and always consider it's aarch64 system call.

> 

>   I think we can set an extra bit (e.g. use the 16th bit in 32 bits

>   signed int) to indicate it's a aarch32 compat system call, but not

>   sure if this is general method or not.


compat syscalls were not supported because of limitations in the
raw_syscalls:sys_{enter,exit} tracepoints I can't recall from the top of
my head right now, I think that looking at that code
(raw_syscalls:sys_{enter,exit}) git log history may provide the
explanation.
 
>   Maybe there have existed solution in other architectures for this,

>   especially other platforms also should support 32 bits and 64 bits

>   system calls along with the architecture evoluation, so want to

>   inquiry firstly to avoid duplicate works.

> 

> Thanks a lot for suggestions!

> Leo.

> 

> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/kernel/syscall.c#n93


- Arnaldo


commit e0b34a78c4ed0a6422f5b2dafa0c8936e537ee41
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Mon Jun 10 15:37:45 2019 -0300

    perf trace: Skip unknown syscalls when expanding strace like syscall groups
    
    We have $INSTALL_DIR/share/perf-core/strace/groups/string files with
    syscalls that should be selected when 'string' is used, meaning, in this
    case, syscalls that receive as one of its arguments a string, like a
    pathname.
    
    But those were first selected and tested on x86_64, and end up failing
    in architectures where some of those syscalls are not available, like
    the 'access' syscall on arm64, which makes using 'perf trace -e string'
    in such archs to fail.
    
    Since this the routine doing the validation is used only when reading
    such files, do not fail when some syscall is not found in the
    syscalltbl, instead just use pr_debug() to register that in case people
    are suspicious of problems.
    
    Now using 'perf trace -e string' should work on arm64, selecting only
    the syscalls that have a string and are available on that architecture.
    
    Reported-by: Leo Yan <leo.yan@linaro.org>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Alexei Starovoitov <ast@kernel.org>
    Cc: Daniel Borkmann <daniel@iogearbox.net>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Martin KaFai Lau <kafai@fb.com>
    Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
    Cc: Mike Leach <mike.leach@linaro.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Song Liu <songliubraving@fb.com>
    Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
    Cc: Yonghong Song <yhs@fb.com>
    Link: https://lkml.kernel.org/n/tip-oa4c2x8p3587jme0g89fyg18@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>


diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 1a2a605cf068..eb70a4b71755 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1529,6 +1529,7 @@ static int trace__read_syscall_info(struct trace *trace, int id)
 static int trace__validate_ev_qualifier(struct trace *trace)
 {
 	int err = 0, i;
+	bool printed_invalid_prefix = false;
 	size_t nr_allocated;
 	struct str_node *pos;
 
@@ -1555,14 +1556,15 @@ static int trace__validate_ev_qualifier(struct trace *trace)
 			if (id >= 0)
 				goto matches;
 
-			if (err == 0) {
-				fputs("Error:\tInvalid syscall ", trace->output);
-				err = -EINVAL;
+			if (!printed_invalid_prefix) {
+				pr_debug("Skipping unknown syscalls: ");
+				printed_invalid_prefix = true;
 			} else {
-				fputs(", ", trace->output);
+				pr_debug(", ");
 			}
 
-			fputs(sc, trace->output);
+			pr_debug("%s", sc);
+			continue;
 		}
 matches:
 		trace->ev_qualifier_ids.entries[i++] = id;
@@ -1591,15 +1593,14 @@ static int trace__validate_ev_qualifier(struct trace *trace)
 		}
 	}
 
-	if (err < 0) {
-		fputs("\nHint:\ttry 'perf list syscalls:sys_enter_*'"
-		      "\nHint:\tand: 'man syscalls'\n", trace->output);
-out_free:
-		zfree(&trace->ev_qualifier_ids.entries);
-		trace->ev_qualifier_ids.nr = 0;
-	}
 out:
+	if (printed_invalid_prefix)
+		pr_debug("\n");
 	return err;
+out_free:
+	zfree(&trace->ev_qualifier_ids.entries);
+	trace->ev_qualifier_ids.nr = 0;
+	goto out;
 }
 
 /*

Patch

diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
index 5c4a4e715ae6..a3701a4daf2e 100644
--- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
+++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
@@ -45,6 +45,83 @@  struct augmented_filename {
 	char		value[PATH_MAX];
 };
 
+#if defined(__aarch64__)
+
+/* syscalls where the first arg is a string */
+#define SYS_OPEN               1024
+#define SYS_STAT               1038
+#define SYS_LSTAT              1039
+#define SYS_ACCESS             1033
+#define SYS_EXECVE              221
+#define SYS_TRUNCATE             45
+#define SYS_CHDIR                49
+#define SYS_RENAME             1034
+#define SYS_MKDIR              1030
+#define SYS_RMDIR              1031
+#define SYS_CREAT              1064
+#define SYS_LINK               1025
+#define SYS_UNLINK             1026
+#define SYS_SYMLINK            1036
+#define SYS_READLINK           1035
+#define SYS_CHMOD              1028
+#define SYS_CHOWN              1029
+#define SYS_LCHOWN             1032
+#define SYS_MKNOD              1027
+#define SYS_STATFS             1056
+#define SYS_PIVOT_ROOT           41
+#define SYS_CHROOT               51
+#define SYS_ACCT                 89
+#define SYS_SWAPON              224
+#define SYS_SWAPOFF             225
+#define SYS_DELETE_MODULE       106
+#define SYS_SETXATTR              5
+#define SYS_LSETXATTR             6
+#define SYS_GETXATTR              8
+#define SYS_LGETXATTR             9
+#define SYS_LISTXATTR            11
+#define SYS_LLISTXATTR           12
+#define SYS_REMOVEXATTR          14
+#define SYS_LREMOVEXATTR         15
+#define SYS_MQ_OPEN             180
+#define SYS_MQ_UNLINK           181
+#define SYS_ADD_KEY             217
+#define SYS_REQUEST_KEY         218
+#define SYS_SYMLINKAT            36
+#define SYS_MEMFD_CREATE        279
+
+/* syscalls where the second arg is a string */
+#define SYS_PWRITE64             68
+#define SYS_RENAME             1034
+#define SYS_QUOTACTL             60
+#define SYS_FSETXATTR             7
+#define SYS_FGETXATTR            10
+#define SYS_FREMOVEXATTR         16
+#define SYS_MQ_TIMEDSEND        182
+#define SYS_REQUEST_KEY         218
+#define SYS_INOTIFY_ADD_WATCH    27
+#define SYS_OPENAT               56
+#define SYS_MKDIRAT              34
+#define SYS_MKNODAT              33
+#define SYS_FCHOWNAT             54
+#define SYS_FUTIMESAT          1066
+#define SYS_NEWFSTATAT         1054
+#define SYS_UNLINKAT             35
+#define SYS_RENAMEAT             38
+#define SYS_LINKAT               37
+#define SYS_READLINKAT           78
+#define SYS_FCHMODAT             53
+#define SYS_FACCESSAT            48
+#define SYS_UTIMENSAT            88
+#define SYS_NAME_TO_HANDLE_AT   264
+#define SYS_FINIT_MODULE        273
+#define SYS_RENAMEAT2           276
+#define SYS_EXECVEAT            281
+#define SYS_STATX               291
+#define SYS_MOVE_MOUNT          429
+#define SYS_FSPICK              433
+
+#elif defined(__x86_64__)
+
 /* syscalls where the first arg is a string */
 #define SYS_OPEN                 2
 #define SYS_STAT                 4
@@ -119,6 +196,10 @@  struct augmented_filename {
 #define SYS_MOVE_MOUNT         429
 #define SYS_FSPICK             433
 
+#else
+#error "unsupported architecture"
+#endif
+
 pid_filter(pids_filtered);
 
 struct augmented_args_filename {