ocfs2: use get_task_comm

Message ID	20171205152110.2050975-1-arnd@arndb.de
State	New
Headers	show Delivered-To: patch@linaro.org Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; From: Arnd Bergmann <arnd@arndb.de> To: Mark Fasheh <mfasheh@versity.com>, Joel Becker <jlbec@evilplan.org> Cc: Kees Cook <keescook@chromium.org>, Arnd Bergmann <arnd@arndb.de>, Andrew Morton <akpm@linux-foundation.org>, Ingo Molnar <mingo@kernel.org>, Peter Zijlstra <peterz@infradead.org>, Joseph Qi <jiangqi903@gmail.com>, piaojun <piaojun@huawei.com>, ocfs2-devel@oss.oracle.com, linux-kernel@vger.kernel.org Subject: [PATCH] ocfs2: use get_task_comm Date: Tue, 5 Dec 2017 16:20:50 +0100 Message-Id: <20171205152110.2050975-1-arnd@arndb.de> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk
Series	ocfs2: use get_task_comm \| expand ocfs2: use get_task_comm

Arnd Bergmann Dec. 5, 2017, 3:20 p.m. UTC

While reviewing all callers of get_task_comm(), I stumbled
over this one that claimed it was not exported, when in fact
it is. Accessing task->comm directly is not safe, so better
convert this one to using get_task_comm as well.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

---
 fs/ocfs2/cluster/netdebug.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

-- 
2.9.0

Kees Cook Dec. 5, 2017, 7:19 p.m. UTC | #1

On Tue, Dec 5, 2017 at 7:20 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> While reviewing all callers of get_task_comm(), I stumbled

> over this one that claimed it was not exported, when in fact

> it is. Accessing task->comm directly is not safe, so better

> convert this one to using get_task_comm as well.


Using get_task_comm() in cases like this is actually overkill (i.e.
using up stack space), since there's (currently) no benefit. Nothing
protects getting a "correct" view of task->comm (i.e. it could get
updated in the middle of a copy), but it _is_ always NULL terminated,
so it's safe to use with %s like this. While it does make me slightly
uncomfortable to _depend_ on this NULL termination, but there are lots
of open-coded %s users of task->comm. When we're trying to save a
_copy_ of task->comm, then we want get_task_comm(), just to make sure
we're doing it right.

So, while I don't oppose this patch, it might be seen as a wasteful
use of stack space.

-Kees

>

> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

> ---

>  fs/ocfs2/cluster/netdebug.c | 5 +++--

>  1 file changed, 3 insertions(+), 2 deletions(-)

>

> diff --git a/fs/ocfs2/cluster/netdebug.c b/fs/ocfs2/cluster/netdebug.c

> index 74a21f6695c8..a51d001b89b1 100644

> --- a/fs/ocfs2/cluster/netdebug.c

> +++ b/fs/ocfs2/cluster/netdebug.c

> @@ -130,6 +130,7 @@ static void *nst_seq_next(struct seq_file *seq, void *v, loff_t *pos)

>  static int nst_seq_show(struct seq_file *seq, void *v)

>  {

>         struct o2net_send_tracking *nst, *dummy_nst = seq->private;

> +       char comm[TASK_COMM_LEN];

>         ktime_t now;

>         s64 sock, send, status;

>

> @@ -142,8 +143,8 @@ static int nst_seq_show(struct seq_file *seq, void *v)

>         sock = ktime_to_us(ktime_sub(now, nst->st_sock_time));

>         send = ktime_to_us(ktime_sub(now, nst->st_send_time));

>         status = ktime_to_us(ktime_sub(now, nst->st_status_time));

> +       get_task_comm(comm, nst->st_task);

>

> -       /* get_task_comm isn't exported.  oh well. */

>         seq_printf(seq, "%p:\n"

>                    "  pid:          %lu\n"

>                    "  tgid:         %lu\n"

> @@ -158,7 +159,7 @@ static int nst_seq_show(struct seq_file *seq, void *v)

>                    "  wait start:   %lld usecs ago\n",

>                    nst, (unsigned long)task_pid_nr(nst->st_task),

>                    (unsigned long)nst->st_task->tgid,

> -                  nst->st_task->comm, nst->st_node,

> +                  comm, nst->st_node,

>                    nst->st_sc, nst->st_id, nst->st_msg_type,

>                    nst->st_msg_key,

>                    (long long)sock,

> --

> 2.9.0

>




-- 
Kees Cook
Pixel Security

Arnd Bergmann Dec. 5, 2017, 8:27 p.m. UTC | #2

On Tue, Dec 5, 2017 at 8:19 PM, Kees Cook <keescook@chromium.org> wrote:
> On Tue, Dec 5, 2017 at 7:20 AM, Arnd Bergmann <arnd@arndb.de> wrote:

>> While reviewing all callers of get_task_comm(), I stumbled

>> over this one that claimed it was not exported, when in fact

>> it is. Accessing task->comm directly is not safe, so better

>> convert this one to using get_task_comm as well.

>

> Using get_task_comm() in cases like this is actually overkill (i.e.

> using up stack space), since there's (currently) no benefit. Nothing

> protects getting a "correct" view of task->comm (i.e. it could get

> updated in the middle of a copy), but it _is_ always NULL terminated,

> so it's safe to use with %s like this. While it does make me slightly

> uncomfortable to _depend_ on this NULL termination, but there are lots

> of open-coded %s users of task->comm. When we're trying to save a

> _copy_ of task->comm, then we want get_task_comm(), just to make sure

> we're doing it right.

>

> So, while I don't oppose this patch, it might be seen as a wasteful

> use of stack space.

It's only a few bytes of stack space in a leaf function, I'd not be
worried about that.

More generally speaking though, how exactly do we guarantee that
there is NUL-termination on tsk->comm during a concurrent update?
Could we ever get into a situation where overwrite the NUL byte
while setting tsk->comm to a longer string, and read the new start
of the string together with an unterminated end, or do we strictly
guarantee that the last byte is still NUL? I assume the latter is
true, just haven't found exactly where that guarantee is made.

      Arnd

Kees Cook Dec. 5, 2017, 8:32 p.m. UTC | #3

On Tue, Dec 5, 2017 at 12:27 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Tue, Dec 5, 2017 at 8:19 PM, Kees Cook <keescook@chromium.org> wrote:

>> On Tue, Dec 5, 2017 at 7:20 AM, Arnd Bergmann <arnd@arndb.de> wrote:

>>> While reviewing all callers of get_task_comm(), I stumbled

>>> over this one that claimed it was not exported, when in fact

>>> it is. Accessing task->comm directly is not safe, so better

>>> convert this one to using get_task_comm as well.

>>

>> Using get_task_comm() in cases like this is actually overkill (i.e.

>> using up stack space), since there's (currently) no benefit. Nothing

>> protects getting a "correct" view of task->comm (i.e. it could get

>> updated in the middle of a copy), but it _is_ always NULL terminated,

>> so it's safe to use with %s like this. While it does make me slightly

>> uncomfortable to _depend_ on this NULL termination, but there are lots

>> of open-coded %s users of task->comm. When we're trying to save a

>> _copy_ of task->comm, then we want get_task_comm(), just to make sure

>> we're doing it right.

>>

>> So, while I don't oppose this patch, it might be seen as a wasteful

>> use of stack space.

>

> It's only a few bytes of stack space in a leaf function, I'd not be

> worried about that.

>

> More generally speaking though, how exactly do we guarantee that

> there is NUL-termination on tsk->comm during a concurrent update?

> Could we ever get into a situation where overwrite the NUL byte

> while setting tsk->comm to a longer string, and read the new start

> of the string together with an unterminated end, or do we strictly

> guarantee that the last byte is still NUL? I assume the latter is

> true, just haven't found exactly where that guarantee is made.


strncpy will zero pad with the trailing NULL, so it's supposed to
always be safe... still gives me the creeps, though.

-Kees

-- 
Kees Cook
Pixel Security

Arnd Bergmann Dec. 5, 2017, 9:44 p.m. UTC | #4

On Tue, Dec 5, 2017 at 9:32 PM, Kees Cook <keescook@chromium.org> wrote:
> On Tue, Dec 5, 2017 at 12:27 PM, Arnd Bergmann <arnd@arndb.de> wrote:

>>

>> More generally speaking though, how exactly do we guarantee that

>> there is NUL-termination on tsk->comm during a concurrent update?

>> Could we ever get into a situation where overwrite the NUL byte

>> while setting tsk->comm to a longer string, and read the new start

>> of the string together with an unterminated end, or do we strictly

>> guarantee that the last byte is still NUL? I assume the latter is

>> true, just haven't found exactly where that guarantee is made.

>

> strncpy will zero pad with the trailing NULL, so it's supposed to

> always be safe... still gives me the creeps, though.

But set_task_comm uses strlcpy(), not strncpy(), so you might
get some of the old data back, the question is just whether it could
leak uninitialized data or part of the task_struct up to the next
NUL byte. I could not come up with any code path that would leave
a non-NUL byte in at the end of task->comm though, so it's
probably still safe.

       Arnd

Peter Zijlstra Dec. 5, 2017, 10:01 p.m. UTC | #5

On Tue, Dec 05, 2017 at 10:44:17PM +0100, Arnd Bergmann wrote:
> On Tue, Dec 5, 2017 at 9:32 PM, Kees Cook <keescook@chromium.org> wrote:

> > On Tue, Dec 5, 2017 at 12:27 PM, Arnd Bergmann <arnd@arndb.de> wrote:

> >>

> >> More generally speaking though, how exactly do we guarantee that

> >> there is NUL-termination on tsk->comm during a concurrent update?

> >> Could we ever get into a situation where overwrite the NUL byte

> >> while setting tsk->comm to a longer string, and read the new start

> >> of the string together with an unterminated end, or do we strictly

> >> guarantee that the last byte is still NUL? I assume the latter is

> >> true, just haven't found exactly where that guarantee is made.

> >

> > strncpy will zero pad with the trailing NULL, so it's supposed to

> > always be safe... still gives me the creeps, though.

> 

> But set_task_comm uses strlcpy(), not strncpy(), so you might

> get some of the old data back, the question is just whether it could

> leak uninitialized data or part of the task_struct up to the next

> NUL byte. I could not come up with any code path that would leave

> a non-NUL byte in at the end of task->comm though, so it's

> probably still safe.


So we used to have some magic code set_task_comm() which even included a
memory barrier etc.. But since none of the reading sites include a
memory barrier its all pointless.

There is no guarantee that a tsk->comm user reads the bytes in string
order.

The only thing that ensures we never run over, is the hard guarantee
that ->comm[TSK_COMM_LEN-1] == 0 at all times. If we don't trust
str*cpy() to do the right thing here, we could simply open code the
thing.

ocfs2: use get_task_comm

Commit Message

Comments

Patch