diff mbox series

arm: always update thread_info->syscall

Message ID 20181126225335.10477-1-rafael.tinoco@linaro.org
State New
Headers show
Series arm: always update thread_info->syscall | expand

Commit Message

Rafael David Tinoco Nov. 26, 2018, 10:53 p.m. UTC
Right now, only way for task->thread_info->syscall to be updated is if
if _TIF_SYSCALL_WORK is set in current's task thread_info->flags
(similar to what has_syscall_work() checks for arm64).

This means that "->syscall" will only be updated if we are tracing the
syscalls through ptrace, for example. This is NOT the same behavior as
arm64, when pt_regs->syscallno is updated in the beginning of svc0
handler for *every* syscall entry.

This patch fixes the issue since this behavior is needed for
/proc/<pid>/syscall 1st argument to be correctly updated.

Link: https://bugs.linaro.org/show_bug.cgi?id=3783
Cc: <stable@vger.kernel.org> # v4.4 v4.9 v4.14 v4.19
Signed-off-by: Rafael David Tinoco <rafael.tinoco@linaro.org>

---
 arch/arm/kernel/asm-offsets.c  | 1 +
 arch/arm/kernel/entry-common.S | 2 ++
 2 files changed, 3 insertions(+)

-- 
2.20.0.rc1

Comments

Russell King (Oracle) Nov. 26, 2018, 11:33 p.m. UTC | #1
On Mon, Nov 26, 2018 at 08:53:35PM -0200, Rafael David Tinoco wrote:
> Right now, only way for task->thread_info->syscall to be updated is if

> if _TIF_SYSCALL_WORK is set in current's task thread_info->flags

> (similar to what has_syscall_work() checks for arm64).

> 

> This means that "->syscall" will only be updated if we are tracing the

> syscalls through ptrace, for example. This is NOT the same behavior as

> arm64, when pt_regs->syscallno is updated in the beginning of svc0

> handler for *every* syscall entry.


So when was it decided that the syscall number will always be required
(we need it to know how far back this has to be backported).

> This patch fixes the issue since this behavior is needed for

> /proc/<pid>/syscall 1st argument to be correctly updated.

> 

> Link: https://bugs.linaro.org/show_bug.cgi?id=3783

> Cc: <stable@vger.kernel.org> # v4.4 v4.9 v4.14 v4.19

> Signed-off-by: Rafael David Tinoco <rafael.tinoco@linaro.org>

> ---

>  arch/arm/kernel/asm-offsets.c  | 1 +

>  arch/arm/kernel/entry-common.S | 2 ++

>  2 files changed, 3 insertions(+)

> 

> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c

> index 3968d6c22455..bfe68a98e1c6 100644

> --- a/arch/arm/kernel/asm-offsets.c

> +++ b/arch/arm/kernel/asm-offsets.c

> @@ -64,6 +64,7 @@ int main(void)

>    DEFINE(TI_USED_CP,		offsetof(struct thread_info, used_cp));

>    DEFINE(TI_TP_VALUE,		offsetof(struct thread_info, tp_value));

>    DEFINE(TI_FPSTATE,		offsetof(struct thread_info, fpstate));

> +  DEFINE(TI_SYSCALL,		offsetof(struct thread_info, syscall));

>  #ifdef CONFIG_VFP

>    DEFINE(TI_VFPSTATE,		offsetof(struct thread_info, vfpstate));

>  #ifdef CONFIG_SMP

> diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S

> index 0465d65d23de..557e2add4e83 100644

> --- a/arch/arm/kernel/entry-common.S

> +++ b/arch/arm/kernel/entry-common.S

> @@ -257,6 +257,8 @@ local_restart:

>  	tst	r10, #_TIF_SYSCALL_WORK		@ are we tracing syscalls?

>  	bne	__sys_trace

>  

> +	str	r7, [tsk, #TI_SYSCALL]		@ update thread_info->syscall


"scno" is the systemcall number, not "r7".

> +

>  	invoke_syscall tbl, scno, r10, __ret_fast_syscall

>  

>  	add	r1, sp, #S_OFF

> -- 

> 2.20.0.rc1

> 

> 

> _______________________________________________

> linux-arm-kernel mailing list

> linux-arm-kernel@lists.infradead.org

> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
Russell King (Oracle) Nov. 26, 2018, 11:41 p.m. UTC | #2
On Mon, Nov 26, 2018 at 11:33:03PM +0000, Russell King - ARM Linux wrote:
> On Mon, Nov 26, 2018 at 08:53:35PM -0200, Rafael David Tinoco wrote:

> > Right now, only way for task->thread_info->syscall to be updated is if

> > if _TIF_SYSCALL_WORK is set in current's task thread_info->flags

> > (similar to what has_syscall_work() checks for arm64).

> > 

> > This means that "->syscall" will only be updated if we are tracing the

> > syscalls through ptrace, for example. This is NOT the same behavior as

> > arm64, when pt_regs->syscallno is updated in the beginning of svc0

> > handler for *every* syscall entry.

> 

> So when was it decided that the syscall number will always be required

> (we need it to know how far back this has to be backported).


PS, I rather object to the fact that the required behaviour seems to
change, arch maintainers aren't told about it until... some test is
created at some random point in the future which then fails.

Surely there's a better way to communicate changes in requirements
than discovery-by-random-bug-report ?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
Russell King (Oracle) Nov. 26, 2018, 11:44 p.m. UTC | #3
On Mon, Nov 26, 2018 at 11:41:11PM +0000, Russell King - ARM Linux wrote:
> On Mon, Nov 26, 2018 at 11:33:03PM +0000, Russell King - ARM Linux wrote:

> > On Mon, Nov 26, 2018 at 08:53:35PM -0200, Rafael David Tinoco wrote:

> > > Right now, only way for task->thread_info->syscall to be updated is if

> > > if _TIF_SYSCALL_WORK is set in current's task thread_info->flags

> > > (similar to what has_syscall_work() checks for arm64).

> > > 

> > > This means that "->syscall" will only be updated if we are tracing the

> > > syscalls through ptrace, for example. This is NOT the same behavior as

> > > arm64, when pt_regs->syscallno is updated in the beginning of svc0

> > > handler for *every* syscall entry.

> > 

> > So when was it decided that the syscall number will always be required

> > (we need it to know how far back this has to be backported).

> 

> PS, I rather object to the fact that the required behaviour seems to

> change, arch maintainers aren't told about it until... some test is

> created at some random point in the future which then fails.

> 

> Surely there's a better way to communicate changes in requirements

> than discovery-by-random-bug-report ?


Final comment for tonight - the commit introducing /proc/*/syscall says:

    This adds /proc/PID/syscall and /proc/PID/task/TID/syscall magic files.
    These use task_current_syscall() to show the task's current system call
    number and argument registers, stack pointer and PC.  For a task blocked
    but not in a syscall, the file shows "-1" in place of the syscall number,
    followed by only the SP and PC.  For a task that's not blocked, it shows
    "running".

Please validate that a blocked task does indeed show -1 with your patch
applied.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
Rafael David Tinoco Nov. 27, 2018, 10:54 a.m. UTC | #4
On 11/27/18 8:43 AM, Sasha Levin wrote:
> Hi,

> 

> [This is an automated email]

> 

> This commit has been processed because it contains a -stable tag.

> The stable tag indicates that it's relevant for the following trees: v4.19,

> v4.14, v4.9, v4.4.

> 

> The bot has tested the following trees: v4.19.4, v4.14.83, v4.9.140, v4.4.164,

> v3.18.126. 

> 

> v4.19.4: Build OK!

> v4.14.83: Failed to apply! Possible dependencies:

>     afc9f65e01cd ("ARM: 8781/1: Fix Thumb-2 syscall return for binutils 2.29+")

> 

> v4.9.140: Failed to apply! Possible dependencies:

>     afc9f65e01cd ("ARM: 8781/1: Fix Thumb-2 syscall return for binutils 2.29+")

> 

> v4.4.164: Failed to apply! Possible dependencies:

>     10573ae547c8 ("ARM: spectre-v1: fix syscall entry")

>     afc9f65e01cd ("ARM: 8781/1: Fix Thumb-2 syscall return for binutils 2.29+")

> 

> 

> How should we proceed with this patch?


I'm sending a v2 for the same patch. Please hold.

Thank you!
Russell King (Oracle) Nov. 27, 2018, 10:56 a.m. UTC | #5
On Tue, Nov 27, 2018 at 08:30:32AM -0200, Rafael David Tinoco wrote:
> On 11/26/18 9:44 PM, Russell King - ARM Linux wrote:

> >On Mon, Nov 26, 2018 at 11:41:11PM +0000, Russell King - ARM Linux wrote:

> >>On Mon, Nov 26, 2018 at 11:33:03PM +0000, Russell King - ARM Linux wrote:

> >>>On Mon, Nov 26, 2018 at 08:53:35PM -0200, Rafael David Tinoco wrote:

> >>>>Right now, only way for task->thread_info->syscall to be updated is if

> >>>>if _TIF_SYSCALL_WORK is set in current's task thread_info->flags

> >>>>(similar to what has_syscall_work() checks for arm64).

> >>>>

> >>>>This means that "->syscall" will only be updated if we are tracing the

> >>>>syscalls through ptrace, for example. This is NOT the same behavior as

> >>>>arm64, when pt_regs->syscallno is updated in the beginning of svc0

> >>>>handler for *every* syscall entry.

> >>>

> >>>So when was it decided that the syscall number will always be required

> >>>(we need it to know how far back this has to be backported).

> >>

> >>PS, I rather object to the fact that the required behaviour seems to

> >>change, arch maintainers aren't told about it until... some test is

> >>created at some random point in the future which then fails.

> >>

> >>Surely there's a better way to communicate changes in requirements

> >>than discovery-by-random-bug-report ?

> >

> >Final comment for tonight - the commit introducing /proc/*/syscall says:

> >

> >     This adds /proc/PID/syscall and /proc/PID/task/TID/syscall magic files.

> >     These use task_current_syscall() to show the task's current system call

> >     number and argument registers, stack pointer and PC.  For a task blocked

> >     but not in a syscall, the file shows "-1" in place of the syscall number,

> >     followed by only the SP and PC.  For a task that's not blocked, it shows

> >     "running".

> >

> >Please validate that a blocked task does indeed show -1 with your patch

> >applied.

> 

> Will do. This is done in an upper level (collect_syscall <-

> task_current_syscall <- proc_pid_syscall):

> 

> 	if (!try_get_task_stack(target)) {

> 		/* Task has no stack, so the task isn't in a syscall. */

> 		*sp = *pc = 0;

> 		*callno = -1;

> 		return 0;

> 	}

> 

> I think only missing part for arm was that one, but will confirm, after

> fixing usage of "r7" for obtaining "scno". Will send a v2 in this thread.


There's another question - what's the expected behaviour when we
restart a syscall using the restartblock mechanism?  Is the syscall
number expected to be __NR_restart_syscall or the original syscall
number?

I can't find anywhere that this detail is specified (damn the lack
of API documentation - I'm tempted to say that we won't implement
this until it gets documented properly, and that test can continue
failing until such time that happens.)

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
Russell King (Oracle) Nov. 27, 2018, 3:35 p.m. UTC | #6
On Tue, Nov 27, 2018 at 10:56:20AM +0000, Russell King - ARM Linux wrote:
> On Tue, Nov 27, 2018 at 08:30:32AM -0200, Rafael David Tinoco wrote:

> > On 11/26/18 9:44 PM, Russell King - ARM Linux wrote:

> > >On Mon, Nov 26, 2018 at 11:41:11PM +0000, Russell King - ARM Linux wrote:

> > >>On Mon, Nov 26, 2018 at 11:33:03PM +0000, Russell King - ARM Linux wrote:

> > >>>On Mon, Nov 26, 2018 at 08:53:35PM -0200, Rafael David Tinoco wrote:

> > >>>>Right now, only way for task->thread_info->syscall to be updated is if

> > >>>>if _TIF_SYSCALL_WORK is set in current's task thread_info->flags

> > >>>>(similar to what has_syscall_work() checks for arm64).

> > >>>>

> > >>>>This means that "->syscall" will only be updated if we are tracing the

> > >>>>syscalls through ptrace, for example. This is NOT the same behavior as

> > >>>>arm64, when pt_regs->syscallno is updated in the beginning of svc0

> > >>>>handler for *every* syscall entry.

> > >>>

> > >>>So when was it decided that the syscall number will always be required

> > >>>(we need it to know how far back this has to be backported).

> > >>

> > >>PS, I rather object to the fact that the required behaviour seems to

> > >>change, arch maintainers aren't told about it until... some test is

> > >>created at some random point in the future which then fails.

> > >>

> > >>Surely there's a better way to communicate changes in requirements

> > >>than discovery-by-random-bug-report ?

> > >

> > >Final comment for tonight - the commit introducing /proc/*/syscall says:

> > >

> > >     This adds /proc/PID/syscall and /proc/PID/task/TID/syscall magic files.

> > >     These use task_current_syscall() to show the task's current system call

> > >     number and argument registers, stack pointer and PC.  For a task blocked

> > >     but not in a syscall, the file shows "-1" in place of the syscall number,

> > >     followed by only the SP and PC.  For a task that's not blocked, it shows

> > >     "running".

> > >

> > >Please validate that a blocked task does indeed show -1 with your patch

> > >applied.

> > 

> > Will do. This is done in an upper level (collect_syscall <-

> > task_current_syscall <- proc_pid_syscall):

> > 

> > 	if (!try_get_task_stack(target)) {

> > 		/* Task has no stack, so the task isn't in a syscall. */

> > 		*sp = *pc = 0;

> > 		*callno = -1;

> > 		return 0;

> > 	}

> > 

> > I think only missing part for arm was that one, but will confirm, after

> > fixing usage of "r7" for obtaining "scno". Will send a v2 in this thread.

> 

> There's another question - what's the expected behaviour when we

> restart a syscall using the restartblock mechanism?  Is the syscall

> number expected to be __NR_restart_syscall or the original syscall

> number?

> 

> I can't find anywhere that this detail is specified (damn the lack

> of API documentation - I'm tempted to say that we won't implement

> this until it gets documented properly, and that test can continue

> failing until such time that happens.)


Having looked around, it seems that the /proc/<PID>/syscall interface
was sneaked into the kernel.  The patch series which added it was
sent in 2008 with a covering message that made no mention of this new
interface, instead stating:

  http://lkml.iu.edu/hypermail/linux/kernel/0807.2/0551.html

   Most of these changes move code around with little or no change,
   and they should not break anything or change any behavior.

While that statement is absolutely correct, it doesn't highlight the
fact that the set of patches _also_ include a brand new userspace
interface exposing things like syscall numbers and arguments in /proc.

There appears to be no documentation at all of this interface, so there
is no definition of how it is supposed to work or what it is supposed
to expose beyond what little information is in the original patch:

  http://lkml.iu.edu/hypermail/linux/kernel/0807.2/0577.html

   This adds /proc/PID/syscall and /proc/PID/task/TID/syscall magic files.
   These use task_current_syscall() to show the task's current system call
   number and argument registers, stack pointer and PC. For a task blocked
   but not in a syscall, the file shows "-1" in place of the syscall number,
   followed by only the SP and PC. For a task that's not blocked, it shows
   "running".

This really isn't a good place to be - this is why commit messages
should _not_ just describe what the changes are doing, also _why_ they
are being made.  Also, any new user interface needs to be fully and
properly documented, because years later, people will move away,
knowledge will be lost, and that leaves us with a maintainability
problem, exactly like we have right now with this.

With the lack of interface documentation, how do we even know whether
the /proc/*/syscall is supposed to show the syscall number of non-traced
threads?  How do we know that the test that found this is actually
correct in reporting a failure?  How do we know whether it's supposed to
expose __NR_restart_syscall?

So, I thought I'd write a test program:

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/fcntl.h>
#include <unistd.h>

static int read_file(const char *fn, char *buf, size_t size)
{
	int fd, ret, nr;

	fd = open(fn, O_RDONLY);
	if (fd == -1)
		return -1;

	for (nr = 0; nr < size; nr += ret) {
		ret = read(fd, buf + nr, size - nr);
		if (ret <= 0)
			break;
	}

	close(fd);

	return nr ? nr : ret;
}

int main()
{
	char fn[64], buf[256];
	int pid, ret;

	pid = fork();
	if (pid == 0) {
		/* child */
		sleep(5);
		exit(0);
	}

	/* parent */
	sleep(1);
	snprintf(fn, sizeof(fn), "/proc/%d/syscall", pid);
	ret = read_file(fn, buf, sizeof(buf));

	printf("%.*s", ret, buf);

	kill(pid, SIGCONT);
	sleep(1);

	ret = read_file(fn, buf, sizeof(buf));

	printf("%.*s", ret, buf);

	return 0;
}

On x86 (32-bit app on 64-bit kernel), it has this behaviour:

$ ./syscall-test
162 0xffcc5a6c 0xffcc5a6c 0x48d09000 0x0 0xffcc5af4 0xffcc5a74 0xffcc5a2c 0xf77dfa59
162 0xffcc5a6c 0xffcc5a6c 0x48d09000 0x0 0xffcc5af4 0xffcc5a74 0xffcc5a2c 0xf77dfa59

which looks good, except:

$ strace -o /dev/null -f ./syscall-test
162 0xffc0070c 0xffc0070c 0x48d09000 0x0 0xffc00794 0xffc00714 0xffc006cc 0xf77f3a59
0 0xffc0070c 0xffc0070c 0x48d09000 0x0 0xffc00794 0xffc00714 0xffc006cc 0xf77f3a59

So, if we're syscall ptracing a program, __NR_restart_syscall gets
exposed through this interface, but if we aren't, it isn't exposed.
Which version is correct?  *shrug*, no documentation...

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
David Laight Nov. 27, 2018, 3:48 p.m. UTC | #7
From: Russell King - ARM Linux

> Sent: 27 November 2018 15:36

...
> There appears to be no documentation at all of this interface, so there

> is no definition of how it is supposed to work or what it is supposed

> to expose beyond what little information is in the original patch:

> 

>   http://lkml.iu.edu/hypermail/linux/kernel/0807.2/0577.html

> 

>    This adds /proc/PID/syscall and /proc/PID/task/TID/syscall magic files.

>    These use task_current_syscall() to show the task's current system call

>    number and argument registers, stack pointer and PC. For a task blocked

>    but not in a syscall, the file shows "-1" in place of the syscall number,

>    followed by only the SP and PC. For a task that's not blocked, it shows

>    "running".


I 'like' the way the columns for sp and pc jump about ...

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Rafael David Tinoco Nov. 27, 2018, 8:52 p.m. UTC | #8
On 11/27/18 1:35 PM, Russell King - ARM Linux wrote:
> On x86 (32-bit app on 64-bit kernel), it has this behaviour:

> 

> $ ./syscall-test

> 162 0xffcc5a6c 0xffcc5a6c 0x48d09000 0x0 0xffcc5af4 0xffcc5a74 0xffcc5a2c 0xf77dfa59

> 162 0xffcc5a6c 0xffcc5a6c 0x48d09000 0x0 0xffcc5af4 0xffcc5a74 0xffcc5a2c 0xf77dfa59

> 

> which looks good, except:

> 

> $ strace -o /dev/null -f ./syscall-test

> 162 0xffc0070c 0xffc0070c 0x48d09000 0x0 0xffc00794 0xffc00714 0xffc006cc 0xf77f3a59

> 0 0xffc0070c 0xffc0070c 0x48d09000 0x0 0xffc00794 0xffc00714 0xffc006cc 0xf77f3a59

> 

> So, if we're syscall ptracing a program, __NR_restart_syscall gets

> exposed through this interface, but if we aren't, it isn't exposed.

> Which version is correct?  *shrug*, no documentation...


I looked around and could only find people using this interface as an
alternate mechanism - than ptracing - for discovering what a task was
doing a certain moment (Mostly during hangs. I haven't found a good
schematic way - or tool - that uses it, but I might be wrong there). For
that purpose, it would be better *not* to update when restart_syscall
happens, unless the hang is right there =o).

If you check:
https://bugs.linaro.org/show_bug.cgi?id=3783#c11

The only syscalls being updated now - that I could get, without any
workload - were 252 (epoll_wait) or 252 (sched_getaffinity):

252 0x7 0xbee80578 0x1b 0xffffffff 0xbee80578 0x7 0xbee80550 0xb6e9d286
252 0xa 0xbef3d930 0x9 0xffffffff 0x0 0x6c 0xbef3d908 0xb6e6f286
142 0x4 0x0 0xbea3eb2c 0x0 0x0 0x6c 0xbea3ea88 0xb6e40286
252 0x4 0xbe8d59e8 0x9 0xffffffff 0xbe8d59e8 0x4 0xbe8d59c0 0xb6e11286

All others I got zeroed, and that's where everything started.

Please, let me know if you still want a v2, or something else... like
fixing it and making it to ignore the restart_syscall for *at least*
having the same behavior across diff archs.

If it is not worth, I'll just blacklist those tests in our functional
tests environment and move on =).

Thanks a lot, very best rgds,
-- 
Rafael D. Tinoco
Linaro Kernel Validation
diff mbox series

Patch

diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 3968d6c22455..bfe68a98e1c6 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -64,6 +64,7 @@  int main(void)
   DEFINE(TI_USED_CP,		offsetof(struct thread_info, used_cp));
   DEFINE(TI_TP_VALUE,		offsetof(struct thread_info, tp_value));
   DEFINE(TI_FPSTATE,		offsetof(struct thread_info, fpstate));
+  DEFINE(TI_SYSCALL,		offsetof(struct thread_info, syscall));
 #ifdef CONFIG_VFP
   DEFINE(TI_VFPSTATE,		offsetof(struct thread_info, vfpstate));
 #ifdef CONFIG_SMP
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 0465d65d23de..557e2add4e83 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -257,6 +257,8 @@  local_restart:
 	tst	r10, #_TIF_SYSCALL_WORK		@ are we tracing syscalls?
 	bne	__sys_trace
 
+	str	r7, [tsk, #TI_SYSCALL]		@ update thread_info->syscall
+
 	invoke_syscall tbl, scno, r10, __ret_fast_syscall
 
 	add	r1, sp, #S_OFF