diff mbox

[RFC] ptrace: add generic SET_SYSCALL request

Message ID 1415346443-28915-1-git-send-email-takahiro.akashi@linaro.org
State New
Headers show

Commit Message

AKASHI Takahiro Nov. 7, 2014, 7:47 a.m. UTC
This patch adds a new generic ptrace request, PTRACE_SET_SYSCALL.
It can be used to change a system call number as follows:
    ret = ptrace(pid, PTRACE_SET_SYSCALL, null, new_syscall_no);
'new_syscall_no' can be -1 to skip this system call, you need to modify
a register's value, in arch-specific way, as return value though.

Please note that we can't define PTRACE_SET_SYSCALL macro in
uapi/linux/ptrace.h partly because its value on arm, 23, is used as another
request on sparc.

This patch also contains an example of change on arch side, arm.
Only syscall_set_nr() is required to be defined in asm/syscall.h.

Currently only arm has this request, while arm64 would also have it
once my patch series of seccomp for arm64 is merged. It will also be
usable for most of other arches.
See the discussions in lak-ml:
http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/300167.html

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
---
 arch/arm/include/asm/syscall.h |    7 +++++++
 arch/arm/kernel/ptrace.c       |    5 -----
 kernel/ptrace.c                |    6 ++++++
 3 files changed, 13 insertions(+), 5 deletions(-)

Comments

Arnd Bergmann Nov. 7, 2014, 9:30 a.m. UTC | #1
On Friday 07 November 2014 16:47:23 AKASHI Takahiro wrote:
> This patch adds a new generic ptrace request, PTRACE_SET_SYSCALL.
> It can be used to change a system call number as follows:
>     ret = ptrace(pid, PTRACE_SET_SYSCALL, null, new_syscall_no);
> 'new_syscall_no' can be -1 to skip this system call, you need to modify
> a register's value, in arch-specific way, as return value though.
> 
> Please note that we can't define PTRACE_SET_SYSCALL macro in
> uapi/linux/ptrace.h partly because its value on arm, 23, is used as another
> request on sparc.
> 
> This patch also contains an example of change on arch side, arm.
> Only syscall_set_nr() is required to be defined in asm/syscall.h.
> 
> Currently only arm has this request, while arm64 would also have it
> once my patch series of seccomp for arm64 is merged. It will also be
> usable for most of other arches.
> See the discussions in lak-ml:
> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/300167.html
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> 

Can you describe why you are moving the implementation? Is this a feature
that we want to have on all architectures in the future? As you say,
only arm32 implements is at the moment.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Will Deacon Nov. 7, 2014, 11:55 a.m. UTC | #2
On Fri, Nov 07, 2014 at 09:30:53AM +0000, Arnd Bergmann wrote:
> On Friday 07 November 2014 16:47:23 AKASHI Takahiro wrote:
> > This patch adds a new generic ptrace request, PTRACE_SET_SYSCALL.
> > It can be used to change a system call number as follows:
> >     ret = ptrace(pid, PTRACE_SET_SYSCALL, null, new_syscall_no);
> > 'new_syscall_no' can be -1 to skip this system call, you need to modify
> > a register's value, in arch-specific way, as return value though.
> > 
> > Please note that we can't define PTRACE_SET_SYSCALL macro in
> > uapi/linux/ptrace.h partly because its value on arm, 23, is used as another
> > request on sparc.
> > 
> > This patch also contains an example of change on arch side, arm.
> > Only syscall_set_nr() is required to be defined in asm/syscall.h.
> > 
> > Currently only arm has this request, while arm64 would also have it
> > once my patch series of seccomp for arm64 is merged. It will also be
> > usable for most of other arches.
> > See the discussions in lak-ml:
> > http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/300167.html
> > 
> > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > 
> 
> Can you describe why you are moving the implementation? Is this a feature
> that we want to have on all architectures in the future? As you say,
> only arm32 implements is at the moment.

We need this for arm64 and, since all architectures seem to have a mechanism
for setting a system call via ptrace, moving it to generic code should make
sense for new architectures too, no?

We don't have any arch-specific ptrace requests on arm64, so it would be
a shame if we had to add one now, especially since there's nothing
conceptually arch-specific about setting a syscall number.

Will
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Arnd Bergmann Nov. 7, 2014, 12:03 p.m. UTC | #3
On Friday 07 November 2014 11:55:51 Will Deacon wrote:
> On Fri, Nov 07, 2014 at 09:30:53AM +0000, Arnd Bergmann wrote:
> > On Friday 07 November 2014 16:47:23 AKASHI Takahiro wrote:
> > > This patch adds a new generic ptrace request, PTRACE_SET_SYSCALL.
> > > It can be used to change a system call number as follows:
> > >     ret = ptrace(pid, PTRACE_SET_SYSCALL, null, new_syscall_no);
> > > 'new_syscall_no' can be -1 to skip this system call, you need to modify
> > > a register's value, in arch-specific way, as return value though.
> > > 
> > > Please note that we can't define PTRACE_SET_SYSCALL macro in
> > > uapi/linux/ptrace.h partly because its value on arm, 23, is used as another
> > > request on sparc.
> > > 
> > > This patch also contains an example of change on arch side, arm.
> > > Only syscall_set_nr() is required to be defined in asm/syscall.h.
> > > 
> > > Currently only arm has this request, while arm64 would also have it
> > > once my patch series of seccomp for arm64 is merged. It will also be
> > > usable for most of other arches.
> > > See the discussions in lak-ml:
> > > http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/300167.html
> > > 
> > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > > 
> > 
> > Can you describe why you are moving the implementation? Is this a feature
> > that we want to have on all architectures in the future? As you say,
> > only arm32 implements is at the moment.
> 
> We need this for arm64 and, since all architectures seem to have a mechanism
> for setting a system call via ptrace, moving it to generic code should make
> sense for new architectures too, no?

It makes a little more sense now, but I still don't understand why you
need to set the system call number via ptrace. What is this used for,
and why doesn't any other architecture have this?

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Russell King - ARM Linux Nov. 7, 2014, 12:11 p.m. UTC | #4
On Fri, Nov 07, 2014 at 01:03:00PM +0100, Arnd Bergmann wrote:
> On Friday 07 November 2014 11:55:51 Will Deacon wrote:
> > We need this for arm64 and, since all architectures seem to have a mechanism
> > for setting a system call via ptrace, moving it to generic code should make
> > sense for new architectures too, no?
> 
> It makes a little more sense now, but I still don't understand why you
> need to set the system call number via ptrace. What is this used for,
> and why doesn't any other architecture have this?

All other architectures have a way.  x86, for example, you set orig_eax
(or orig_rax) to change the syscall number.  On ARM, that doesn't work
because we don't always pass the syscall number in a register.
Will Deacon Nov. 7, 2014, 12:27 p.m. UTC | #5
On Fri, Nov 07, 2014 at 12:03:00PM +0000, Arnd Bergmann wrote:
> On Friday 07 November 2014 11:55:51 Will Deacon wrote:
> > On Fri, Nov 07, 2014 at 09:30:53AM +0000, Arnd Bergmann wrote:
> > > On Friday 07 November 2014 16:47:23 AKASHI Takahiro wrote:
> > > > This patch adds a new generic ptrace request, PTRACE_SET_SYSCALL.
> > > > It can be used to change a system call number as follows:
> > > >     ret = ptrace(pid, PTRACE_SET_SYSCALL, null, new_syscall_no);
> > > > 'new_syscall_no' can be -1 to skip this system call, you need to modify
> > > > a register's value, in arch-specific way, as return value though.
> > > > 
> > > > Please note that we can't define PTRACE_SET_SYSCALL macro in
> > > > uapi/linux/ptrace.h partly because its value on arm, 23, is used as another
> > > > request on sparc.
> > > > 
> > > > This patch also contains an example of change on arch side, arm.
> > > > Only syscall_set_nr() is required to be defined in asm/syscall.h.
> > > > 
> > > > Currently only arm has this request, while arm64 would also have it
> > > > once my patch series of seccomp for arm64 is merged. It will also be
> > > > usable for most of other arches.
> > > > See the discussions in lak-ml:
> > > > http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/300167.html
> > > > 
> > > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> > > > 
> > > 
> > > Can you describe why you are moving the implementation? Is this a feature
> > > that we want to have on all architectures in the future? As you say,
> > > only arm32 implements is at the moment.
> > 
> > We need this for arm64 and, since all architectures seem to have a mechanism
> > for setting a system call via ptrace, moving it to generic code should make
> > sense for new architectures too, no?
> 
> It makes a little more sense now, but I still don't understand why you
> need to set the system call number via ptrace. What is this used for,
> and why doesn't any other architecture have this?

I went through the same thought process back in August, and Akashi
eventually convinced me that this was the best thing to do:

  http://lists.infradead.org/pipermail/linux-arm-kernel/2014-August/278692.html

It comes down to a debugger (which could be GDB, seccomp, tracer ...)
wanting to change the system call number. This is also used as a mechanism
to skip a system call by setting it to '-1' (yeah, it's gross, and the
interaction between all of these syscall hooks is horrible too).

If we update w8 directly instead, we run into a couple of issues:

  - Needing to restore the original w8 if the value is set to '-1' for
    skip, but continuing to return -ENOSYS for syscall(-1) when not on a
    tracer path

  - seccomp assumes that syscall_get_nr will return the version set by
    the most recent tracer, so then we need hacks in ptrace to route
    register writes to w8 to syscallno in pt_regs, but again, only in the
    case that we're tracing.

Akashi might be able to elaborate on other problems, since this was a
couple of months ago and I take every opportunity I can to avoid looking
at this part of the kernel.

Will
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Arnd Bergmann Nov. 7, 2014, 12:44 p.m. UTC | #6
On Friday 07 November 2014 12:11:19 Russell King - ARM Linux wrote:
> On Fri, Nov 07, 2014 at 01:03:00PM +0100, Arnd Bergmann wrote:
> > On Friday 07 November 2014 11:55:51 Will Deacon wrote:
> > > We need this for arm64 and, since all architectures seem to have a mechanism
> > > for setting a system call via ptrace, moving it to generic code should make
> > > sense for new architectures too, no?
> > 
> > It makes a little more sense now, but I still don't understand why you
> > need to set the system call number via ptrace. What is this used for,
> > and why doesn't any other architecture have this?
> 
> All other architectures have a way.  x86, for example, you set orig_eax
> (or orig_rax) to change the syscall number.  On ARM, that doesn't work
> because we don't always pass the syscall number in a register.
> 

Sorry for being slow today, but why can't we use the same interface that
s390 has on arm64:

static int s390_system_call_get(struct task_struct *target,
                                const struct user_regset *regset,
                                unsigned int pos, unsigned int count,
                                void *kbuf, void __user *ubuf)
{
        unsigned int *data = &task_thread_info(target)->system_call;
        return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
                                   data, 0, sizeof(unsigned int));
}

static int s390_system_call_set(struct task_struct *target,
                                const struct user_regset *regset,
                                unsigned int pos, unsigned int count,
                                const void *kbuf, const void __user *ubuf)
{
        unsigned int *data = &task_thread_info(target)->system_call;
        return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
                                  data, 0, sizeof(unsigned int));
}

static const struct user_regset s390_regsets[] = {
	...
        {
                .core_note_type = NT_S390_SYSTEM_CALL,
                .n = 1,
                .size = sizeof(unsigned int),
                .align = sizeof(unsigned int),
                .get = s390_system_call_get,
                .set = s390_system_call_set,
        },
	...
};

Is it just preference for being consistent with ARM32, or is there a
reason this won't work?

It's not that I care strongly about the interface, my main point is
that the changelog doesn't describe why one interface was used instead
the other.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Will Deacon Nov. 7, 2014, 1:11 p.m. UTC | #7
On Fri, Nov 07, 2014 at 12:44:07PM +0000, Arnd Bergmann wrote:
> On Friday 07 November 2014 12:11:19 Russell King - ARM Linux wrote:
> > On Fri, Nov 07, 2014 at 01:03:00PM +0100, Arnd Bergmann wrote:
> > > On Friday 07 November 2014 11:55:51 Will Deacon wrote:
> > > > We need this for arm64 and, since all architectures seem to have a mechanism
> > > > for setting a system call via ptrace, moving it to generic code should make
> > > > sense for new architectures too, no?
> > > 
> > > It makes a little more sense now, but I still don't understand why you
> > > need to set the system call number via ptrace. What is this used for,
> > > and why doesn't any other architecture have this?
> > 
> > All other architectures have a way.  x86, for example, you set orig_eax
> > (or orig_rax) to change the syscall number.  On ARM, that doesn't work
> > because we don't always pass the syscall number in a register.
> > 
> 
> Sorry for being slow today, but why can't we use the same interface that
> s390 has on arm64:
> 
> static int s390_system_call_get(struct task_struct *target,
>                                 const struct user_regset *regset,
>                                 unsigned int pos, unsigned int count,
>                                 void *kbuf, void __user *ubuf)
> {
>         unsigned int *data = &task_thread_info(target)->system_call;
>         return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
>                                    data, 0, sizeof(unsigned int));
> }
> 
> static int s390_system_call_set(struct task_struct *target,
>                                 const struct user_regset *regset,
>                                 unsigned int pos, unsigned int count,
>                                 const void *kbuf, const void __user *ubuf)
> {
>         unsigned int *data = &task_thread_info(target)->system_call;
>         return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
>                                   data, 0, sizeof(unsigned int));
> }
> 
> static const struct user_regset s390_regsets[] = {
> 	...
>         {
>                 .core_note_type = NT_S390_SYSTEM_CALL,
>                 .n = 1,
>                 .size = sizeof(unsigned int),
>                 .align = sizeof(unsigned int),
>                 .get = s390_system_call_get,
>                 .set = s390_system_call_set,
>         },
> 	...
> };
> 
> Is it just preference for being consistent with ARM32, or is there a
> reason this won't work?

Interesting, I hadn't considered a unit-length regset.

> It's not that I care strongly about the interface, my main point is
> that the changelog doesn't describe why one interface was used instead
> the other.

I suspect the current approach was taken because it follows the same scheme
as 32-bit ARM. If both methods are sufficient (Kees would have a better idea
than me on that), then I don't have a strong preference.

Will
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Oleg Nesterov Nov. 7, 2014, 2:04 p.m. UTC | #8
On 11/07, AKASHI Takahiro wrote:
>
> --- a/arch/arm/kernel/ptrace.c
> +++ b/arch/arm/kernel/ptrace.c
> @@ -853,11 +853,6 @@ long arch_ptrace(struct task_struct *child, long request,
>  				       datap);
>  			break;
>  
> -		case PTRACE_SET_SYSCALL:
> -			task_thread_info(child)->syscall = data;
> -			ret = 0;
> -			break;
> -
>  #ifdef CONFIG_CRUNCH
>  		case PTRACE_GETCRUNCHREGS:
>  			ret = ptrace_getcrunchregs(child, datap);
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index 54e7522..d7048fa 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -1001,6 +1001,12 @@ int ptrace_request(struct task_struct *child, long request,
>  		break;
>  	}
>  #endif
> +
> +#ifdef PTRACE_SET_SYSCALL
> +	case PTRACE_SET_SYSCALL:
> +		ret = syscall_set_nr(child, task_pt_regs(child), data);
> +		break;
> +#endif

I too do not understand why it makes sense to move PTRACE_SET_SYSCALL into
the common kernel/ptrace.c.

To me the fact that PTRACE_SET_SYSCALL can be undefined and syscall_set_nr()
is very much arch-dependant (but most probably trivial) means that this code
should live in arch_ptrace().

In any case, I think it doesn't make sense to pass task_pt_regs(child), this
helper can do this itself if it needs struct pt_regs.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Arnd Bergmann Nov. 7, 2014, 2:30 p.m. UTC | #9
On Friday 07 November 2014 13:11:30 Will Deacon wrote:
> 
> > It's not that I care strongly about the interface, my main point is
> > that the changelog doesn't describe why one interface was used instead
> > the other.
> 
> I suspect the current approach was taken because it follows the same scheme
> as 32-bit ARM. If both methods are sufficient (Kees would have a better idea
> than me on that), then I don't have a strong preference.

Using the regset would probably address Oleg's comment, and would keep the
implementation architecture specific. You could even share the NT_S390_SYSTEM_CALL
number, but I don't know if there any downsides to doing that.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Kees Cook Nov. 7, 2014, 4:44 p.m. UTC | #10
On Fri, Nov 7, 2014 at 6:30 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Friday 07 November 2014 13:11:30 Will Deacon wrote:
>>
>> > It's not that I care strongly about the interface, my main point is
>> > that the changelog doesn't describe why one interface was used instead
>> > the other.
>>
>> I suspect the current approach was taken because it follows the same scheme
>> as 32-bit ARM. If both methods are sufficient (Kees would have a better idea
>> than me on that), then I don't have a strong preference.
>
> Using the regset would probably address Oleg's comment, and would keep the
> implementation architecture specific. You could even share the NT_S390_SYSTEM_CALL
> number, but I don't know if there any downsides to doing that.

That's fine by me -- I only want an interface. :) I think it'd be nice
to keep it the same between arm32 and arm64, but using a specific
regset does seem to be the better approach.

-Kees
AKASHI Takahiro Nov. 10, 2014, 6:36 a.m. UTC | #11
On 11/07/2014 09:27 PM, Will Deacon wrote:
> On Fri, Nov 07, 2014 at 12:03:00PM +0000, Arnd Bergmann wrote:
>> On Friday 07 November 2014 11:55:51 Will Deacon wrote:
>>> On Fri, Nov 07, 2014 at 09:30:53AM +0000, Arnd Bergmann wrote:
>>>> On Friday 07 November 2014 16:47:23 AKASHI Takahiro wrote:
>>>>> This patch adds a new generic ptrace request, PTRACE_SET_SYSCALL.
>>>>> It can be used to change a system call number as follows:
>>>>>      ret = ptrace(pid, PTRACE_SET_SYSCALL, null, new_syscall_no);
>>>>> 'new_syscall_no' can be -1 to skip this system call, you need to modify
>>>>> a register's value, in arch-specific way, as return value though.
>>>>>
>>>>> Please note that we can't define PTRACE_SET_SYSCALL macro in
>>>>> uapi/linux/ptrace.h partly because its value on arm, 23, is used as another
>>>>> request on sparc.
>>>>>
>>>>> This patch also contains an example of change on arch side, arm.
>>>>> Only syscall_set_nr() is required to be defined in asm/syscall.h.
>>>>>
>>>>> Currently only arm has this request, while arm64 would also have it
>>>>> once my patch series of seccomp for arm64 is merged. It will also be
>>>>> usable for most of other arches.
>>>>> See the discussions in lak-ml:
>>>>> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/300167.html
>>>>>
>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
>>>>>
>>>>
>>>> Can you describe why you are moving the implementation? Is this a feature
>>>> that we want to have on all architectures in the future? As you say,
>>>> only arm32 implements is at the moment.
>>>
>>> We need this for arm64 and, since all architectures seem to have a mechanism
>>> for setting a system call via ptrace, moving it to generic code should make
>>> sense for new architectures too, no?
>>
>> It makes a little more sense now, but I still don't understand why you
>> need to set the system call number via ptrace. What is this used for,
>> and why doesn't any other architecture have this?
>
> I went through the same thought process back in August, and Akashi
> eventually convinced me that this was the best thing to do:
>
>    http://lists.infradead.org/pipermail/linux-arm-kernel/2014-August/278692.html
>
> It comes down to a debugger (which could be GDB, seccomp, tracer ...)
> wanting to change the system call number. This is also used as a mechanism
> to skip a system call by setting it to '-1' (yeah, it's gross, and the
> interaction between all of these syscall hooks is horrible too).
>
> If we update w8 directly instead, we run into a couple of issues:
>
>    - Needing to restore the original w8 if the value is set to '-1' for
>      skip, but continuing to return -ENOSYS for syscall(-1) when not on a
>      tracer path

Yeah, this restriction still exists on my recent patch, v7.
(this is because arm64 uses the same register, x0, as the first argument
and a return value.)

>    - seccomp assumes that syscall_get_nr will return the version set by
>      the most recent tracer, so then we need hacks in ptrace to route
>      register writes to w8 to syscallno in pt_regs, but again, only in the
>      case that we're tracing.

The problem here is that, if we had a hack of replacinging syscallno with w8
in ptrace (ptrace_syscall_enter()), secure_computing() (actually, seccomp_phase2()
on v3.18-rc) would have no chance of seeing a modified syscall number because
the hack would be executed after secure_computing().
(Please note that a tracer simply modifies w8, not syscallno directly).

This eventually results in missing a special case of -1 (skipping this system call).
     http://lists.infradead.org/pipermail/linux-arm-kernel/2014-August/280846.html

That is why we needed to have a dedicated new interface.

-Takahiro AKASHI

> Akashi might be able to elaborate on other problems, since this was a
> couple of months ago and I take every opportunity I can to avoid looking
> at this part of the kernel.
>
> Will
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
AKASHI Takahiro Nov. 12, 2014, 10:46 a.m. UTC | #12
Will,

On 11/07/2014 11:04 PM, Oleg Nesterov wrote:
> On 11/07, AKASHI Takahiro wrote:
>>
>> --- a/arch/arm/kernel/ptrace.c
>> +++ b/arch/arm/kernel/ptrace.c
>> @@ -853,11 +853,6 @@ long arch_ptrace(struct task_struct *child, long request,
>>   				       datap);
>>   			break;
>>
>> -		case PTRACE_SET_SYSCALL:
>> -			task_thread_info(child)->syscall = data;
>> -			ret = 0;
>> -			break;
>> -
>>   #ifdef CONFIG_CRUNCH
>>   		case PTRACE_GETCRUNCHREGS:
>>   			ret = ptrace_getcrunchregs(child, datap);
>> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
>> index 54e7522..d7048fa 100644
>> --- a/kernel/ptrace.c
>> +++ b/kernel/ptrace.c
>> @@ -1001,6 +1001,12 @@ int ptrace_request(struct task_struct *child, long request,
>>   		break;
>>   	}
>>   #endif
>> +
>> +#ifdef PTRACE_SET_SYSCALL
>> +	case PTRACE_SET_SYSCALL:
>> +		ret = syscall_set_nr(child, task_pt_regs(child), data);
>> +		break;
>> +#endif
>
> I too do not understand why it makes sense to move PTRACE_SET_SYSCALL into
> the common kernel/ptrace.c.

I think I explained why we need a new (atomic) interface of changing a system
call number while tracing with ptrace. But I don't have a strong preference,
either ptrace(SET_SYSCALL) or ptrace(SETREGSET, NT_SYSTEM_CALL).

> To me the fact that PTRACE_SET_SYSCALL can be undefined and syscall_set_nr()
> is very much arch-dependant (but most probably trivial) means that this  code
> should live in arch_ptrace().

Thinking of Oleg's comment above, it doesn't make sense neither to define generic
NT_SYSTEM_CALL (user_regset) in uapi/linux/elf.h and implement it in ptrace_regset()
in kernel/ptrace.c with arch-defined syscall_(g)set_nr().

Since we should have the same interface on arm and arm64, we'd better implement
ptrace(PTRACE_SET_SYSCALL) locally on arm64 for now (as I originally submitted).

-Takahiro AKASHI

> In any case, I think it doesn't make sense to pass task_pt_regs(child), this
> helper can do this itself if it needs struct pt_regs.
>
> Oleg.
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Will Deacon Nov. 12, 2014, 11 a.m. UTC | #13
On Wed, Nov 12, 2014 at 10:46:01AM +0000, AKASHI Takahiro wrote:
> On 11/07/2014 11:04 PM, Oleg Nesterov wrote:
> > To me the fact that PTRACE_SET_SYSCALL can be undefined and syscall_set_nr()
> > is very much arch-dependant (but most probably trivial) means that this  code
> > should live in arch_ptrace().
> 
> Thinking of Oleg's comment above, it doesn't make sense neither to define generic
> NT_SYSTEM_CALL (user_regset) in uapi/linux/elf.h and implement it in ptrace_regset()
> in kernel/ptrace.c with arch-defined syscall_(g)set_nr().
> 
> Since we should have the same interface on arm and arm64, we'd better implement
> ptrace(PTRACE_SET_SYSCALL) locally on arm64 for now (as I originally submitted).

I think the regset approach is cleaner. We already do something similar for
TLS. That would be implemented under arch/arm64/ with it's own NT type.

Will
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
AKASHI Takahiro Nov. 12, 2014, 11:06 a.m. UTC | #14
On 11/12/2014 08:00 PM, Will Deacon wrote:
> On Wed, Nov 12, 2014 at 10:46:01AM +0000, AKASHI Takahiro wrote:
>> On 11/07/2014 11:04 PM, Oleg Nesterov wrote:
>>> To me the fact that PTRACE_SET_SYSCALL can be undefined and syscall_set_nr()
>>> is very much arch-dependant (but most probably trivial) means that this  code
>>> should live in arch_ptrace().
>>
>> Thinking of Oleg's comment above, it doesn't make sense neither to define generic
>> NT_SYSTEM_CALL (user_regset) in uapi/linux/elf.h and implement it in ptrace_regset()
>> in kernel/ptrace.c with arch-defined syscall_(g)set_nr().
>>
>> Since we should have the same interface on arm and arm64, we'd better implement
>> ptrace(PTRACE_SET_SYSCALL) locally on arm64 for now (as I originally submitted).
>
> I think the regset approach is cleaner. We already do something similar for
> TLS. That would be implemented under arch/arm64/ with it's own NT type.

Okey, so arm64 goes its own way :)
Or do you want to have a similar regset on arm, too?
(In this case, NT_ARM_SYSTEM_CALL can be shared in uapi/linux/elf.h)

-Takahiro AKASHI

> Will
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Will Deacon Nov. 12, 2014, 11:13 a.m. UTC | #15
On Wed, Nov 12, 2014 at 11:06:59AM +0000, AKASHI Takahiro wrote:
> On 11/12/2014 08:00 PM, Will Deacon wrote:
> > On Wed, Nov 12, 2014 at 10:46:01AM +0000, AKASHI Takahiro wrote:
> >> On 11/07/2014 11:04 PM, Oleg Nesterov wrote:
> >>> To me the fact that PTRACE_SET_SYSCALL can be undefined and syscall_set_nr()
> >>> is very much arch-dependant (but most probably trivial) means that this  code
> >>> should live in arch_ptrace().
> >>
> >> Thinking of Oleg's comment above, it doesn't make sense neither to define generic
> >> NT_SYSTEM_CALL (user_regset) in uapi/linux/elf.h and implement it in ptrace_regset()
> >> in kernel/ptrace.c with arch-defined syscall_(g)set_nr().
> >>
> >> Since we should have the same interface on arm and arm64, we'd better implement
> >> ptrace(PTRACE_SET_SYSCALL) locally on arm64 for now (as I originally submitted).
> >
> > I think the regset approach is cleaner. We already do something similar for
> > TLS. That would be implemented under arch/arm64/ with it's own NT type.
> 
> Okey, so arm64 goes its own way :)
> Or do you want to have a similar regset on arm, too?
> (In this case, NT_ARM_SYSTEM_CALL can be shared in uapi/linux/elf.h)

Just do arm64. We already have the dedicated request for arch/arm/.

Will
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
diff mbox

Patch

diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
index e86c985..3e1d9c0 100644
--- a/arch/arm/include/asm/syscall.h
+++ b/arch/arm/include/asm/syscall.h
@@ -24,6 +24,13 @@  static inline int syscall_get_nr(struct task_struct *task,
 	return task_thread_info(task)->syscall;
 }
 
+static inline int syscall_set_nr(struct task_struct *task,
+				 struct pt_regs *regs, int syscall)
+{
+	task_thread_info(task)->syscall = syscall;
+	return 0;
+}
+
 static inline void syscall_rollback(struct task_struct *task,
 				    struct pt_regs *regs)
 {
diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index ef9119f..908bae8 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -853,11 +853,6 @@  long arch_ptrace(struct task_struct *child, long request,
 				       datap);
 			break;
 
-		case PTRACE_SET_SYSCALL:
-			task_thread_info(child)->syscall = data;
-			ret = 0;
-			break;
-
 #ifdef CONFIG_CRUNCH
 		case PTRACE_GETCRUNCHREGS:
 			ret = ptrace_getcrunchregs(child, datap);
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 54e7522..d7048fa 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -1001,6 +1001,12 @@  int ptrace_request(struct task_struct *child, long request,
 		break;
 	}
 #endif
+
+#ifdef PTRACE_SET_SYSCALL
+	case PTRACE_SET_SYSCALL:
+		ret = syscall_set_nr(child, task_pt_regs(child), data);
+		break;
+#endif
 	default:
 		break;
 	}