diff mbox

arm: ptrace: fix syscall modification under PTRACE_O_TRACESECCOMP

Message ID 20140620172330.GA30656@arm.com
State New
Headers show

Commit Message

Will Deacon June 20, 2014, 5:23 p.m. UTC
On Fri, Jun 20, 2014 at 05:44:52PM +0100, Kees Cook wrote:
> On Fri, Jun 20, 2014 at 3:22 AM, Will Deacon <will.deacon@arm.com> wrote:
> > I'm struggling to see the bug in the current code, so apologies if my
> > questions aren't helpful.
> >
> > On Wed, Jun 18, 2014 at 09:27:48PM +0100, Kees Cook wrote:
> >> An x86 tracer wanting to change the syscall uses PTRACE_SETREGS
> >> (stored to regs->orig_ax), and an ARM tracer uses PTRACE_SET_SYSCALL
> >> (stored to current_thread_info()->syscall). When this happens, the
> >> syscall can change across the call to secure_computing(), since it may
> >> block on tracer notification, and the tracer can then make changes
> >> to the process, before we return from secure_computing(). This
> >> means the code must respect the changed syscall after the
> >> secure_computing() call in syscall_trace_enter(). The same is true
> >> for tracehook_report_syscall_entry() which may also block and change
> >> the syscall.
> >
> > I don't think I understand what you mean by `the code must respect the
> > changed syscall'. The current code does indeed issue the new syscall, so are
> > you more concerned with secure_computing changing ->syscall, then the
> > debugger can't see the new syscall when it sees the trap from tracehook?
> > Are these even supposed to inter-operate?
> 
> The problem is the use of "scno" in the call -- it results in ignoring
> the value that may be set up in ->syscall by a tracer:
> 
> syscall_trace_enter(regs, scno):
>     current_thread_info()->syscall = scno;
>     secure_computing(scno):
>         [block on ptrace]
>         [ptracer changes current_thread_info()->syscall vis PTRACE_SET_SYSCALL]
>     ...
>     return scno;
> 
> This means the tracer's changed syscall value will be ignored
> (syscall_trace_enter returns original "scno" instead of actual
> current_thread_info()->syscall). In the original code, failure cases
> were propagated correctly, but not tracer-induced changes.
> 
> Is that more clear? It's not an obvious state (due to the external
> modification of process state during the ptrace blocking). I've also
> got tests for this, if that's useful to further illustrate:
> 
> https://github.com/kees/seccomp/commit/bd24e174593f79784b97178b583f17e0ea9d2aa7

Right, gotcha. Thanks for the explanation. I was confused, because
tracehook_report_syscall does the right thing (returns
current_thread_info()->syscall), but if we don't have TIF_SYSCALL_TRACE set,
then updates during the secure_computing callback will be ignored.

However, my fix to this is significantly smaller than your patch, so I fear
I'm still missing something.

Will

--->8

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Comments

Kees Cook June 20, 2014, 5:36 p.m. UTC | #1
On Fri, Jun 20, 2014 at 10:23 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Fri, Jun 20, 2014 at 05:44:52PM +0100, Kees Cook wrote:
>> On Fri, Jun 20, 2014 at 3:22 AM, Will Deacon <will.deacon@arm.com> wrote:
>> > I'm struggling to see the bug in the current code, so apologies if my
>> > questions aren't helpful.
>> >
>> > On Wed, Jun 18, 2014 at 09:27:48PM +0100, Kees Cook wrote:
>> >> An x86 tracer wanting to change the syscall uses PTRACE_SETREGS
>> >> (stored to regs->orig_ax), and an ARM tracer uses PTRACE_SET_SYSCALL
>> >> (stored to current_thread_info()->syscall). When this happens, the
>> >> syscall can change across the call to secure_computing(), since it may
>> >> block on tracer notification, and the tracer can then make changes
>> >> to the process, before we return from secure_computing(). This
>> >> means the code must respect the changed syscall after the
>> >> secure_computing() call in syscall_trace_enter(). The same is true
>> >> for tracehook_report_syscall_entry() which may also block and change
>> >> the syscall.
>> >
>> > I don't think I understand what you mean by `the code must respect the
>> > changed syscall'. The current code does indeed issue the new syscall, so are
>> > you more concerned with secure_computing changing ->syscall, then the
>> > debugger can't see the new syscall when it sees the trap from tracehook?
>> > Are these even supposed to inter-operate?
>>
>> The problem is the use of "scno" in the call -- it results in ignoring
>> the value that may be set up in ->syscall by a tracer:
>>
>> syscall_trace_enter(regs, scno):
>>     current_thread_info()->syscall = scno;
>>     secure_computing(scno):
>>         [block on ptrace]
>>         [ptracer changes current_thread_info()->syscall vis PTRACE_SET_SYSCALL]
>>     ...
>>     return scno;
>>
>> This means the tracer's changed syscall value will be ignored
>> (syscall_trace_enter returns original "scno" instead of actual
>> current_thread_info()->syscall). In the original code, failure cases
>> were propagated correctly, but not tracer-induced changes.
>>
>> Is that more clear? It's not an obvious state (due to the external
>> modification of process state during the ptrace blocking). I've also
>> got tests for this, if that's useful to further illustrate:
>>
>> https://github.com/kees/seccomp/commit/bd24e174593f79784b97178b583f17e0ea9d2aa7
>
> Right, gotcha. Thanks for the explanation. I was confused, because
> tracehook_report_syscall does the right thing (returns
> current_thread_info()->syscall), but if we don't have TIF_SYSCALL_TRACE set,
> then updates during the secure_computing callback will be ignored.
>
> However, my fix to this is significantly smaller than your patch, so I fear
> I'm still missing something.

Oh, yes, that's much smaller. Nice! I will test this and report back.

>
> Will
>
> --->8
>
> diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
> index 0dd3b79b15c3..0c27ed6f3f23 100644
> --- a/arch/arm/kernel/ptrace.c
> +++ b/arch/arm/kernel/ptrace.c
> @@ -908,7 +908,7 @@ enum ptrace_syscall_dir {
>         PTRACE_SYSCALL_EXIT,
>  };
>
> -static int tracehook_report_syscall(struct pt_regs *regs,
> +static void tracehook_report_syscall(struct pt_regs *regs,
>                                     enum ptrace_syscall_dir dir)
>  {
>         unsigned long ip;
> @@ -926,7 +926,6 @@ static int tracehook_report_syscall(struct pt_regs *regs,
>                 current_thread_info()->syscall = -1;
>
>         regs->ARM_ip = ip;
> -       return current_thread_info()->syscall;
>  }
>
>  asmlinkage int syscall_trace_enter(struct pt_regs *regs, int scno)
> @@ -938,7 +937,9 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs, int scno)
>                 return -1;
>
>         if (test_thread_flag(TIF_SYSCALL_TRACE))
> -               scno = tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
> +               tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
> +
> +       scno = current_thread_info()->syscall;

Perhaps it'd be worth adding a comment above this line just for people
looking at this in the future. Something like:

/* secure_computing and tracehook_report_syscall may have changed syscall */

-Kees

>
>         if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
>                 trace_sys_enter(regs, scno);
Kees Cook June 20, 2014, 6:10 p.m. UTC | #2
On Fri, Jun 20, 2014 at 10:36 AM, Kees Cook <keescook@chromium.org> wrote:
> On Fri, Jun 20, 2014 at 10:23 AM, Will Deacon <will.deacon@arm.com> wrote:
>> On Fri, Jun 20, 2014 at 05:44:52PM +0100, Kees Cook wrote:
>>> On Fri, Jun 20, 2014 at 3:22 AM, Will Deacon <will.deacon@arm.com> wrote:
>>> > I'm struggling to see the bug in the current code, so apologies if my
>>> > questions aren't helpful.
>>> >
>>> > On Wed, Jun 18, 2014 at 09:27:48PM +0100, Kees Cook wrote:
>>> >> An x86 tracer wanting to change the syscall uses PTRACE_SETREGS
>>> >> (stored to regs->orig_ax), and an ARM tracer uses PTRACE_SET_SYSCALL
>>> >> (stored to current_thread_info()->syscall). When this happens, the
>>> >> syscall can change across the call to secure_computing(), since it may
>>> >> block on tracer notification, and the tracer can then make changes
>>> >> to the process, before we return from secure_computing(). This
>>> >> means the code must respect the changed syscall after the
>>> >> secure_computing() call in syscall_trace_enter(). The same is true
>>> >> for tracehook_report_syscall_entry() which may also block and change
>>> >> the syscall.
>>> >
>>> > I don't think I understand what you mean by `the code must respect the
>>> > changed syscall'. The current code does indeed issue the new syscall, so are
>>> > you more concerned with secure_computing changing ->syscall, then the
>>> > debugger can't see the new syscall when it sees the trap from tracehook?
>>> > Are these even supposed to inter-operate?
>>>
>>> The problem is the use of "scno" in the call -- it results in ignoring
>>> the value that may be set up in ->syscall by a tracer:
>>>
>>> syscall_trace_enter(regs, scno):
>>>     current_thread_info()->syscall = scno;
>>>     secure_computing(scno):
>>>         [block on ptrace]
>>>         [ptracer changes current_thread_info()->syscall vis PTRACE_SET_SYSCALL]
>>>     ...
>>>     return scno;
>>>
>>> This means the tracer's changed syscall value will be ignored
>>> (syscall_trace_enter returns original "scno" instead of actual
>>> current_thread_info()->syscall). In the original code, failure cases
>>> were propagated correctly, but not tracer-induced changes.
>>>
>>> Is that more clear? It's not an obvious state (due to the external
>>> modification of process state during the ptrace blocking). I've also
>>> got tests for this, if that's useful to further illustrate:
>>>
>>> https://github.com/kees/seccomp/commit/bd24e174593f79784b97178b583f17e0ea9d2aa7
>>
>> Right, gotcha. Thanks for the explanation. I was confused, because
>> tracehook_report_syscall does the right thing (returns
>> current_thread_info()->syscall), but if we don't have TIF_SYSCALL_TRACE set,
>> then updates during the secure_computing callback will be ignored.
>>
>> However, my fix to this is significantly smaller than your patch, so I fear
>> I'm still missing something.
>
> Oh, yes, that's much smaller. Nice! I will test this and report back.

Yup, I can confirm this works. Thanks!

Tested-by: Kees Cook <keescook@chromium.org>

-Kees
Will Deacon June 23, 2014, 8:46 a.m. UTC | #3
On Fri, Jun 20, 2014 at 07:10:46PM +0100, Kees Cook wrote:
> On Fri, Jun 20, 2014 at 10:36 AM, Kees Cook <keescook@chromium.org> wrote:
> > On Fri, Jun 20, 2014 at 10:23 AM, Will Deacon <will.deacon@arm.com> wrote:
> >> Right, gotcha. Thanks for the explanation. I was confused, because
> >> tracehook_report_syscall does the right thing (returns
> >> current_thread_info()->syscall), but if we don't have TIF_SYSCALL_TRACE set,
> >> then updates during the secure_computing callback will be ignored.
> >>
> >> However, my fix to this is significantly smaller than your patch, so I fear
> >> I'm still missing something.
> >
> > Oh, yes, that's much smaller. Nice! I will test this and report back.
> 
> Yup, I can confirm this works. Thanks!
> 
> Tested-by: Kees Cook <keescook@chromium.org>

Thanks, Kees. I'll post a patch shortly. I'll try and remember to keep an
eye out for this when seccomp lands for arm64 too.

Will
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Kees Cook June 23, 2014, 7:46 p.m. UTC | #4
On Mon, Jun 23, 2014 at 1:46 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Fri, Jun 20, 2014 at 07:10:46PM +0100, Kees Cook wrote:
>> On Fri, Jun 20, 2014 at 10:36 AM, Kees Cook <keescook@chromium.org> wrote:
>> > On Fri, Jun 20, 2014 at 10:23 AM, Will Deacon <will.deacon@arm.com> wrote:
>> >> Right, gotcha. Thanks for the explanation. I was confused, because
>> >> tracehook_report_syscall does the right thing (returns
>> >> current_thread_info()->syscall), but if we don't have TIF_SYSCALL_TRACE set,
>> >> then updates during the secure_computing callback will be ignored.
>> >>
>> >> However, my fix to this is significantly smaller than your patch, so I fear
>> >> I'm still missing something.
>> >
>> > Oh, yes, that's much smaller. Nice! I will test this and report back.
>>
>> Yup, I can confirm this works. Thanks!
>>
>> Tested-by: Kees Cook <keescook@chromium.org>
>
> Thanks, Kees. I'll post a patch shortly. I'll try and remember to keep an
> eye out for this when seccomp lands for arm64 too.

Great, thanks!

What's the state of seccomp on arm64? I saw a series back in March,
but nothing since then? It looked complete, but I haven't set up a
test environment yet to verify.

-Kees
Will Deacon June 24, 2014, 8:54 a.m. UTC | #5
On Mon, Jun 23, 2014 at 08:46:52PM +0100, Kees Cook wrote:
> On Mon, Jun 23, 2014 at 1:46 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Fri, Jun 20, 2014 at 07:10:46PM +0100, Kees Cook wrote:
> >> On Fri, Jun 20, 2014 at 10:36 AM, Kees Cook <keescook@chromium.org> wrote:
> >> > On Fri, Jun 20, 2014 at 10:23 AM, Will Deacon <will.deacon@arm.com> wrote:
> >> >> Right, gotcha. Thanks for the explanation. I was confused, because
> >> >> tracehook_report_syscall does the right thing (returns
> >> >> current_thread_info()->syscall), but if we don't have TIF_SYSCALL_TRACE set,
> >> >> then updates during the secure_computing callback will be ignored.
> >> >>
> >> >> However, my fix to this is significantly smaller than your patch, so I fear
> >> >> I'm still missing something.
> >> >
> >> > Oh, yes, that's much smaller. Nice! I will test this and report back.
> >>
> >> Yup, I can confirm this works. Thanks!
> >>
> >> Tested-by: Kees Cook <keescook@chromium.org>
> >
> > Thanks, Kees. I'll post a patch shortly. I'll try and remember to keep an
> > eye out for this when seccomp lands for arm64 too.
> 
> Great, thanks!
> 
> What's the state of seccomp on arm64? I saw a series back in March,
> but nothing since then? It looked complete, but I haven't set up a
> test environment yet to verify.

I think Akashi was going to repost `real soon now' so we can include them
for 3.17. He missed the merge window last time around.

Will
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
AKASHI Takahiro June 24, 2014, 9:20 a.m. UTC | #6
On 06/24/2014 05:54 PM, Will Deacon wrote:
> On Mon, Jun 23, 2014 at 08:46:52PM +0100, Kees Cook wrote:
>> On Mon, Jun 23, 2014 at 1:46 AM, Will Deacon <will.deacon@arm.com> wrote:
>>> On Fri, Jun 20, 2014 at 07:10:46PM +0100, Kees Cook wrote:
>>>> On Fri, Jun 20, 2014 at 10:36 AM, Kees Cook <keescook@chromium.org> wrote:
>>>>> On Fri, Jun 20, 2014 at 10:23 AM, Will Deacon <will.deacon@arm.com> wrote:
>>>>>> Right, gotcha. Thanks for the explanation. I was confused, because
>>>>>> tracehook_report_syscall does the right thing (returns
>>>>>> current_thread_info()->syscall), but if we don't have TIF_SYSCALL_TRACE set,
>>>>>> then updates during the secure_computing callback will be ignored.
>>>>>>
>>>>>> However, my fix to this is significantly smaller than your patch, so I fear
>>>>>> I'm still missing something.
>>>>>
>>>>> Oh, yes, that's much smaller. Nice! I will test this and report back.
>>>>
>>>> Yup, I can confirm this works. Thanks!
>>>>
>>>> Tested-by: Kees Cook <keescook@chromium.org>
>>>
>>> Thanks, Kees. I'll post a patch shortly. I'll try and remember to keep an
>>> eye out for this when seccomp lands for arm64 too.
>>
>> Great, thanks!
>>
>> What's the state of seccomp on arm64? I saw a series back in March,
>> but nothing since then? It looked complete, but I haven't set up a
>> test environment yet to verify.
>
> I think Akashi was going to repost `real soon now' so we can include them
> for 3.17. He missed the merge window last time around.

Do I really have to repost my patch even without any changes since the last one?
Just kidding. I will do so once I confirm the issue discussed here.
(I've finally found out a user of my patch :)

-Takahiro AKASHI

> Will
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
AKASHI Takahiro July 3, 2014, 7:43 a.m. UTC | #7
Hi Will,

On 06/24/2014 05:54 PM, Will Deacon wrote:
> On Mon, Jun 23, 2014 at 08:46:52PM +0100, Kees Cook wrote:
>> On Mon, Jun 23, 2014 at 1:46 AM, Will Deacon <will.deacon@arm.com> wrote:
>>> On Fri, Jun 20, 2014 at 07:10:46PM +0100, Kees Cook wrote:
>>>> On Fri, Jun 20, 2014 at 10:36 AM, Kees Cook <keescook@chromium.org> wrote:
>>>>> On Fri, Jun 20, 2014 at 10:23 AM, Will Deacon <will.deacon@arm.com> wrote:
>>>>>> Right, gotcha. Thanks for the explanation. I was confused, because
>>>>>> tracehook_report_syscall does the right thing (returns
>>>>>> current_thread_info()->syscall), but if we don't have TIF_SYSCALL_TRACE set,
>>>>>> then updates during the secure_computing callback will be ignored.re
>>>>>>
>>>>>> However, my fix to this is significantly smaller than your patch, so I fear
>>>>>> I'm still missing something.
>>>>>
>>>>> Oh, yes, that's much smaller. Nice! I will test this and report back.
>>>>
>>>> Yup, I can confirm this works. Thanks!
>>>>
>>>> Tested-by: Kees Cook <keescook@chromium.org>
>>>
>>> Thanks, Kees. I'll post a patch shortly. I'll try and remember to keep an
>>> eye out for this when seccomp lands for arm64 too.
>>
>> Great, thanks!
>>
>> What's the state of seccomp on arm64? I saw a series back in March,
>> but nothing since then? It looked complete, but I haven't set up a
>> test environment yet to verify.
>
> I think Akashi was going to repost `real soon now' so we can include them
> for 3.17. He missed the merge window last time around.

I took a quick look at the current implementation of ptrace.
ptrace(PTRACE_GETREGSET/SETREGSET), eventually gpr_get/set(), handles only 'struct user_pt_regs',
and we have no way to modify orig_x0 nor syscallno in 'struct pt_regs' directly.
So it seems to me that we can't change a system call by ptrace().
Do I misunderstand anything?

-Takahiro AKASHI


> Will
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Will Deacon July 3, 2014, 10:24 a.m. UTC | #8
On Thu, Jul 03, 2014 at 08:43:07AM +0100, AKASHI Takahiro wrote:
> Hi Will,

Hi Akashi,

> On 06/24/2014 05:54 PM, Will Deacon wrote:
> > On Mon, Jun 23, 2014 at 08:46:52PM +0100, Kees Cook wrote:
> >> What's the state of seccomp on arm64? I saw a series back in March,
> >> but nothing since then? It looked complete, but I haven't set up a
> >> test environment yet to verify.
> >
> > I think Akashi was going to repost `real soon now' so we can include them
> > for 3.17. He missed the merge window last time around.
> 
> I took a quick look at the current implementation of ptrace.
> ptrace(PTRACE_GETREGSET/SETREGSET), eventually gpr_get/set(), handles only
> 'struct user_pt_regs', and we have no way to modify orig_x0 nor syscallno
> in 'struct pt_regs' directly.
> So it seems to me that we can't change a system call by ptrace().
> Do I misunderstand anything?

No, it looks like you have a point here. I don't think userspace has any
business with orig_x0, but changing syscallno is certainly useful. I can
think of two ways to fix this:

  (1) Updating syscallno based on w8, but this ties us to the current ABI
      and could get messy if this register changes in the future.

  (2) Adding a PTRACE_SET_SYSCALL request, like we have for arch/arm/,
      but that means adding arch-specific stuff to arch_ptrace (which
      currently goes straight to ptrace_request on arm64).

It looks like x86 uses orig_ax, which I *think* means we would go with
(1) above if we followed their lead.

Anybody else have an opinion about this?

Will
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Andy Lutomirski July 3, 2014, 3:39 p.m. UTC | #9
On Thu, Jul 3, 2014 at 3:24 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Thu, Jul 03, 2014 at 08:43:07AM +0100, AKASHI Takahiro wrote:
>> Hi Will,
>
> Hi Akashi,
>
>> On 06/24/2014 05:54 PM, Will Deacon wrote:
>> > On Mon, Jun 23, 2014 at 08:46:52PM +0100, Kees Cook wrote:
>> >> What's the state of seccomp on arm64? I saw a series back in March,
>> >> but nothing since then? It looked complete, but I haven't set up a
>> >> test environment yet to verify.
>> >
>> > I think Akashi was going to repost `real soon now' so we can include them
>> > for 3.17. He missed the merge window last time around.
>>
>> I took a quick look at the current implementation of ptrace.
>> ptrace(PTRACE_GETREGSET/SETREGSET), eventually gpr_get/set(), handles only
>> 'struct user_pt_regs', and we have no way to modify orig_x0 nor syscallno
>> in 'struct pt_regs' directly.
>> So it seems to me that we can't change a system call by ptrace().
>> Do I misunderstand anything?
>
> No, it looks like you have a point here. I don't think userspace has any
> business with orig_x0, but changing syscallno is certainly useful. I can
> think of two ways to fix this:
>
>   (1) Updating syscallno based on w8, but this ties us to the current ABI
>       and could get messy if this register changes in the future.
>
>   (2) Adding a PTRACE_SET_SYSCALL request, like we have for arch/arm/,
>       but that means adding arch-specific stuff to arch_ptrace (which
>       currently goes straight to ptrace_request on arm64).
>
> It looks like x86 uses orig_ax, which I *think* means we would go with
> (1) above if we followed their lead.

w8 is a real register, right?  On x86, at least orig_ax isn't a real
register, so it's quite unlikely to conflict with hardware stuff.

On x86, the "user_struct" thing has nothing to do with any real kernel
data structure, so it's extensible.  Can you just add syscallno to it?

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Will Deacon July 3, 2014, 4:11 p.m. UTC | #10
On Thu, Jul 03, 2014 at 04:39:21PM +0100, Andy Lutomirski wrote:
> On Thu, Jul 3, 2014 at 3:24 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Thu, Jul 03, 2014 at 08:43:07AM +0100, AKASHI Takahiro wrote:
> >> On 06/24/2014 05:54 PM, Will Deacon wrote:
> >> > On Mon, Jun 23, 2014 at 08:46:52PM +0100, Kees Cook wrote:
> >> >> What's the state of seccomp on arm64? I saw a series back in March,
> >> >> but nothing since then? It looked complete, but I haven't set up a
> >> >> test environment yet to verify.
> >> >
> >> > I think Akashi was going to repost `real soon now' so we can include them
> >> > for 3.17. He missed the merge window last time around.
> >>
> >> I took a quick look at the current implementation of ptrace.
> >> ptrace(PTRACE_GETREGSET/SETREGSET), eventually gpr_get/set(), handles only
> >> 'struct user_pt_regs', and we have no way to modify orig_x0 nor syscallno
> >> in 'struct pt_regs' directly.
> >> So it seems to me that we can't change a system call by ptrace().
> >> Do I misunderstand anything?
> >
> > No, it looks like you have a point here. I don't think userspace has any
> > business with orig_x0, but changing syscallno is certainly useful. I can
> > think of two ways to fix this:
> >
> >   (1) Updating syscallno based on w8, but this ties us to the current ABI
> >       and could get messy if this register changes in the future.
> >
> >   (2) Adding a PTRACE_SET_SYSCALL request, like we have for arch/arm/,
> >       but that means adding arch-specific stuff to arch_ptrace (which
> >       currently goes straight to ptrace_request on arm64).
> >
> > It looks like x86 uses orig_ax, which I *think* means we would go with
> > (1) above if we followed their lead.
> 
> w8 is a real register, right?  On x86, at least orig_ax isn't a real
> register, so it's quite unlikely to conflict with hardware stuff.

Yeah, w8 is the hardware register which the Linux ABI uses for the system
call number. I was thinking We could allow the debugger/tracer to update
the syscall number by updating that register, or do you see an issue with
that? (other than tying us to the current ABI).

> On x86, the "user_struct" thing has nothing to do with any real kernel
> data structure, so it's extensible.  Can you just add syscallno to it?

I'm really not keen on changing user-facing structures like that. For
example, KVM embeds user_pt_regs into kvm_regs.

We can add a new ptrace request if we have to.

Will
Andy Lutomirski July 3, 2014, 4:13 p.m. UTC | #11
On Thu, Jul 3, 2014 at 9:11 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Thu, Jul 03, 2014 at 04:39:21PM +0100, Andy Lutomirski wrote:
>> On Thu, Jul 3, 2014 at 3:24 AM, Will Deacon <will.deacon@arm.com> wrote:
>> > On Thu, Jul 03, 2014 at 08:43:07AM +0100, AKASHI Takahiro wrote:
>> >> On 06/24/2014 05:54 PM, Will Deacon wrote:
>> >> > On Mon, Jun 23, 2014 at 08:46:52PM +0100, Kees Cook wrote:
>> >> >> What's the state of seccomp on arm64? I saw a series back in March,
>> >> >> but nothing since then? It looked complete, but I haven't set up a
>> >> >> test environment yet to verify.
>> >> >
>> >> > I think Akashi was going to repost `real soon now' so we can include them
>> >> > for 3.17. He missed the merge window last time around.
>> >>
>> >> I took a quick look at the current implementation of ptrace.
>> >> ptrace(PTRACE_GETREGSET/SETREGSET), eventually gpr_get/set(), handles only
>> >> 'struct user_pt_regs', and we have no way to modify orig_x0 nor syscallno
>> >> in 'struct pt_regs' directly.
>> >> So it seems to me that we can't change a system call by ptrace().
>> >> Do I misunderstand anything?
>> >
>> > No, it looks like you have a point here. I don't think userspace has any
>> > business with orig_x0, but changing syscallno is certainly useful. I can
>> > think of two ways to fix this:
>> >
>> >   (1) Updating syscallno based on w8, but this ties us to the current ABI
>> >       and could get messy if this register changes in the future.
>> >
>> >   (2) Adding a PTRACE_SET_SYSCALL request, like we have for arch/arm/,
>> >       but that means adding arch-specific stuff to arch_ptrace (which
>> >       currently goes straight to ptrace_request on arm64).
>> >
>> > It looks like x86 uses orig_ax, which I *think* means we would go with
>> > (1) above if we followed their lead.
>>
>> w8 is a real register, right?  On x86, at least orig_ax isn't a real
>> register, so it's quite unlikely to conflict with hardware stuff.
>
> Yeah, w8 is the hardware register which the Linux ABI uses for the system
> call number. I was thinking We could allow the debugger/tracer to update
> the syscall number by updating that register, or do you see an issue with
> that? (other than tying us to the current ABI).

Not immediately, but I'm not super-familiar with ptrace.

Is w8 clobbered or otherwise changed by syscalls?  Using w8 for this
has the odd effect that tracers can't force a return with a specific
value of w8 without executing the corresponding syscall.  If that's a
meaningful limitation, then presumably some other channel should be
used.

>
>> On x86, the "user_struct" thing has nothing to do with any real kernel
>> data structure, so it's extensible.  Can you just add syscallno to it?
>
> I'm really not keen on changing user-facing structures like that. For
> example, KVM embeds user_pt_regs into kvm_regs.

Fair enough.

>
> We can add a new ptrace request if we have to.
>
> Will
Will Deacon July 3, 2014, 4:32 p.m. UTC | #12
On Thu, Jul 03, 2014 at 05:13:50PM +0100, Andy Lutomirski wrote:
> On Thu, Jul 3, 2014 at 9:11 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Thu, Jul 03, 2014 at 04:39:21PM +0100, Andy Lutomirski wrote:
> >> On Thu, Jul 3, 2014 at 3:24 AM, Will Deacon <will.deacon@arm.com> wrote:
> >> > On Thu, Jul 03, 2014 at 08:43:07AM +0100, AKASHI Takahiro wrote:
> >> >> So it seems to me that we can't change a system call by ptrace().
> >> >> Do I misunderstand anything?
> >> >
> >> > No, it looks like you have a point here. I don't think userspace has any
> >> > business with orig_x0, but changing syscallno is certainly useful. I can
> >> > think of two ways to fix this:
> >> >
> >> >   (1) Updating syscallno based on w8, but this ties us to the current ABI
> >> >       and could get messy if this register changes in the future.
> >> >
> >> >   (2) Adding a PTRACE_SET_SYSCALL request, like we have for arch/arm/,
> >> >       but that means adding arch-specific stuff to arch_ptrace (which
> >> >       currently goes straight to ptrace_request on arm64).
> >> >
> >> > It looks like x86 uses orig_ax, which I *think* means we would go with
> >> > (1) above if we followed their lead.
> >>
> >> w8 is a real register, right?  On x86, at least orig_ax isn't a real
> >> register, so it's quite unlikely to conflict with hardware stuff.
> >
> > Yeah, w8 is the hardware register which the Linux ABI uses for the system
> > call number. I was thinking We could allow the debugger/tracer to update
> > the syscall number by updating that register, or do you see an issue with
> > that? (other than tying us to the current ABI).
> 
> Not immediately, but I'm not super-familiar with ptrace.
> 
> Is w8 clobbered or otherwise changed by syscalls?  Using w8 for this
> has the odd effect that tracers can't force a return with a specific
> value of w8 without executing the corresponding syscall.  If that's a
> meaningful limitation, then presumably some other channel should be
> used.

Hmm, that's true. Currently w8 is preserved across a syscall by the kernel,
so it would be pretty bizarre for somebody to try and modify it but I guess
they could do it if they wanted to. However, they could just as easily
modify it on the syscall return path and have the same effect...

Furthermore, glibc unconditionally emits a mov into w8 prior to the svc
instruction, so from a user's perspective that register always contains
the system call number.

FWIW: the role of w8 in the PCS is `Indirect result location register', so
I'd expect it to be saved across the syscall anyway.

Will
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Andy Lutomirski July 4, 2014, 11:05 p.m. UTC | #13
On Thu, Jul 3, 2014 at 9:32 AM, Will Deacon <will.deacon@arm.com> wrote:
> On Thu, Jul 03, 2014 at 05:13:50PM +0100, Andy Lutomirski wrote:
>> On Thu, Jul 3, 2014 at 9:11 AM, Will Deacon <will.deacon@arm.com> wrote:
>> > On Thu, Jul 03, 2014 at 04:39:21PM +0100, Andy Lutomirski wrote:
>> >> On Thu, Jul 3, 2014 at 3:24 AM, Will Deacon <will.deacon@arm.com> wrote:
>> >> > On Thu, Jul 03, 2014 at 08:43:07AM +0100, AKASHI Takahiro wrote:
>> >> >> So it seems to me that we can't change a system call by ptrace().
>> >> >> Do I misunderstand anything?
>> >> >
>> >> > No, it looks like you have a point here. I don't think userspace has any
>> >> > business with orig_x0, but changing syscallno is certainly useful. I can
>> >> > think of two ways to fix this:
>> >> >
>> >> >   (1) Updating syscallno based on w8, but this ties us to the current ABI
>> >> >       and could get messy if this register changes in the future.
>> >> >
>> >> >   (2) Adding a PTRACE_SET_SYSCALL request, like we have for arch/arm/,
>> >> >       but that means adding arch-specific stuff to arch_ptrace (which
>> >> >       currently goes straight to ptrace_request on arm64).
>> >> >
>> >> > It looks like x86 uses orig_ax, which I *think* means we would go with
>> >> > (1) above if we followed their lead.
>> >>
>> >> w8 is a real register, right?  On x86, at least orig_ax isn't a real
>> >> register, so it's quite unlikely to conflict with hardware stuff.
>> >
>> > Yeah, w8 is the hardware register which the Linux ABI uses for the system
>> > call number. I was thinking We could allow the debugger/tracer to update
>> > the syscall number by updating that register, or do you see an issue with
>> > that? (other than tying us to the current ABI).
>>
>> Not immediately, but I'm not super-familiar with ptrace.
>>
>> Is w8 clobbered or otherwise changed by syscalls?  Using w8 for this
>> has the odd effect that tracers can't force a return with a specific
>> value of w8 without executing the corresponding syscall.  If that's a
>> meaningful limitation, then presumably some other channel should be
>> used.
>
> Hmm, that's true. Currently w8 is preserved across a syscall by the kernel,
> so it would be pretty bizarre for somebody to try and modify it but I guess
> they could do it if they wanted to. However, they could just as easily
> modify it on the syscall return path and have the same effect...
>
> Furthermore, glibc unconditionally emits a mov into w8 prior to the svc
> instruction, so from a user's perspective that register always contains
> the system call number.

That means that, if someone uses a seccomp trace action to skip or
emulate a syscall by writing -1 to w8, then user code will see an
unexpected -1 in w8.  I don't know how much this matters.

>
> FWIW: the role of w8 in the PCS is `Indirect result location register', so
> I'd expect it to be saved across the syscall anyway.
>
> Will
diff mbox

Patch

diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 0dd3b79b15c3..0c27ed6f3f23 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -908,7 +908,7 @@  enum ptrace_syscall_dir {
        PTRACE_SYSCALL_EXIT,
 };
 
-static int tracehook_report_syscall(struct pt_regs *regs,
+static void tracehook_report_syscall(struct pt_regs *regs,
                                    enum ptrace_syscall_dir dir)
 {
        unsigned long ip;
@@ -926,7 +926,6 @@  static int tracehook_report_syscall(struct pt_regs *regs,
                current_thread_info()->syscall = -1;
 
        regs->ARM_ip = ip;
-       return current_thread_info()->syscall;
 }
 
 asmlinkage int syscall_trace_enter(struct pt_regs *regs, int scno)
@@ -938,7 +937,9 @@  asmlinkage int syscall_trace_enter(struct pt_regs *regs, int scno)
                return -1;
 
        if (test_thread_flag(TIF_SYSCALL_TRACE))
-               scno = tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
+               tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
+
+       scno = current_thread_info()->syscall;
 
        if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
                trace_sys_enter(regs, scno);