[RFCv2,2/3] lib/vsprintf.c: make %pD print full path for file

Message ID 20210528113951.6225-3-justin.he@arm.com
State New

Commit Message

Justin He May 28, 2021, 11:39 a.m. UTC
We have '%pD' for printing a filename. It may not be perfect (by
default it only prints one component.)

As suggested by Linus at [1]:
A dentry has a parent, but at the same time, a dentry really does
inherently have "one name" (and given just the dentry pointers, you
can't show mount-related parenthood, so in many ways the "show just
one name" makes sense for "%pd" in ways it doesn't necessarily for
"%pD"). But while a dentry arguably has that "one primary component",
a _file_ is certainly not exclusively about that last component.

Hence "file_dentry_name()" simply shouldn't use "dentry_name()" at all.
Despite that shared code origin, and despite that similar letter
choice (lower-vs-upper case), a dentry and a file really are very
different from a name standpoint.

Here stack space is preferred for file_d_path_name() because it is
much safer. The 256-byte stack buffer is a compromise between risking
stack overflow and being too short to hold a full path.

[1] https://lore.kernel.org/lkml/CAHk-=wimsMqGdzik187YWLb-ru+iktb4MYbMQG1rnZ81dXYFVg@mail.gmail.com/

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jia He <justin.he@arm.com>
---
 Documentation/core-api/printk-formats.rst |  5 +++--
 lib/vsprintf.c                            | 21 +++++++++++++++++----
 2 files changed, 20 insertions(+), 6 deletions(-)

Comments

Matthew Wilcox May 28, 2021, 12:59 p.m. UTC | #1
On Fri, May 28, 2021 at 07:39:50PM +0800, Jia He wrote:
> We have '%pD' for printing a filename. It may not be perfect (by
> default it only prints one component.)
> 
> As suggested by Linus at [1]:
> A dentry has a parent, but at the same time, a dentry really does
> inherently have "one name" (and given just the dentry pointers, you
> can't show mount-related parenthood, so in many ways the "show just
> one name" makes sense for "%pd" in ways it doesn't necessarily for
> "%pD"). But while a dentry arguably has that "one primary component",
> a _file_ is certainly not exclusively about that last component.
> 
> Hence "file_dentry_name()" simply shouldn't use "dentry_name()" at all.
> Despite that shared code origin, and despite that similar letter
> choice (lower-vs-upper case), a dentry and a file really are very
> different from a name standpoint.
> 
> Here stack space is preferred for file_d_path_name() because it is
> much safer. The stack size 256 is a compromise between stack overflow
> and too short full path.

How is it "safer"?  You already have a buffer passed from the caller.
Are you saying that d_path_fast() might overrun a really small buffer
but won't overrun a 256 byte buffer?

> @@ -920,13 +921,25 @@ char *dentry_name(char *buf, char *end, const struct dentry *d, struct printf_sp
>  }
>  
>  static noinline_for_stack
> -char *file_dentry_name(char *buf, char *end, const struct file *f,
> +char *file_d_path_name(char *buf, char *end, const struct file *f,
>  			struct printf_spec spec, const char *fmt)
>  {
> +	const struct path *path;
> +	char *p;
> +	char full_path[256];
> +
>  	if (check_pointer(&buf, end, f, spec))
>  		return buf;
>  
> -	return dentry_name(buf, end, f->f_path.dentry, spec, fmt);
> +	path = &f->f_path;
> +	if (check_pointer(&buf, end, path, spec))
> +		return buf;
> +
> +	p = d_path_fast(path, full_path, sizeof(full_path));
> +	if (IS_ERR(p))
> +		return err_ptr(buf, end, p, spec);
> +
> +	return string_nocheck(buf, end, p, spec);
>  }
>  #ifdef CONFIG_BLOCK
>  static noinline_for_stack
Justin He May 28, 2021, 2:22 p.m. UTC | #2
Hi Matthew
> -----Original Message-----
> From: Matthew Wilcox <willy@infradead.org>
> Sent: Friday, May 28, 2021 9:00 PM
> To: Justin He <Justin.He@arm.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>; Petr Mladek
> <pmladek@suse.com>; Steven Rostedt <rostedt@goodmis.org>; Sergey
> Senozhatsky <senozhatsky@chromium.org>; Andy Shevchenko
> <andriy.shevchenko@linux.intel.com>; Rasmus Villemoes
> <linux@rasmusvillemoes.dk>; Jonathan Corbet <corbet@lwn.net>; Alexander
> Viro <viro@zeniv.linux.org.uk>; Luca Coelho <luciano.coelho@intel.com>;
> Kalle Valo <kvalo@codeaurora.org>; David S. Miller <davem@davemloft.net>;
> Jakub Kicinski <kuba@kernel.org>; Heiko Carstens <hca@linux.ibm.com>;
> Vasily Gorbik <gor@linux.ibm.com>; Christian Borntraeger
> <borntraeger@de.ibm.com>; Johannes Berg <johannes.berg@intel.com>; linux-
> doc@vger.kernel.org; linux-kernel@vger.kernel.org; linux-
> wireless@vger.kernel.org; netdev@vger.kernel.org; linux-
> s390@vger.kernel.org
> Subject: Re: [PATCH RFCv2 2/3] lib/vsprintf.c: make %pD print full path
> for file
>
> On Fri, May 28, 2021 at 07:39:50PM +0800, Jia He wrote:
> > We have '%pD' for printing a filename. It may not be perfect (by
> > default it only prints one component.)
> >
> > As suggested by Linus at [1]:
> > A dentry has a parent, but at the same time, a dentry really does
> > inherently have "one name" (and given just the dentry pointers, you
> > can't show mount-related parenthood, so in many ways the "show just
> > one name" makes sense for "%pd" in ways it doesn't necessarily for
> > "%pD"). But while a dentry arguably has that "one primary component",
> > a _file_ is certainly not exclusively about that last component.
> >
> > Hence "file_dentry_name()" simply shouldn't use "dentry_name()" at all.
> > Despite that shared code origin, and despite that similar letter
> > choice (lower-vs-upper case), a dentry and a file really are very
> > different from a name standpoint.
> >
> > Here stack space is preferred for file_d_path_name() because it is
> > much safer. The stack size 256 is a compromise between stack overflow
> > and too short full path.
>
> How is it "safer"?  You already have a buffer passed from the caller.
> Are you saying that d_path_fast() might overrun a really small buffer
> but won't overrun a 256 byte buffer?
No, it won't overrun a 256-byte buffer. When the full path is longer than
256 bytes, p->len goes negative in prepend_name(), and this overrun will be
detected in extract_string() with "-ENAMETOOLONG".

Each printk() involves two vsnprintf() calls. vsnprintf() returns the size required to format the string.
1. vprintk_store() invokes the first vsnprintf() with 8 bytes of space to get the reserve_size. In this case, _buf_ could be less than _end_ by design.
2. Then it invokes the second one, via printk_sprint()->vscnprintf()->vsnprintf(), to really fill the space.
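
As a rough userspace illustration of the two-pass pattern described above (not the kernel code itself), the sketch below uses standard C vsnprintf(): a first call into a tiny 8-byte buffer only learns the required length, and a second call fills a buffer of that size. The helper name two_pass_format() is hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Userspace illustration only; two_pass_format() is a hypothetical helper,
 * not a kernel function.
 * Pass 1 formats into a tiny buffer just to learn the required length.
 * Pass 2 formats into a buffer of exactly that size.
 */
static char *two_pass_format(const char *fmt, ...)
{
	char probe[8];
	va_list ap;
	int needed, written;
	char *out;

	/* Pass 1: the output is truncated, but the return value is the full length. */
	va_start(ap, fmt);
	needed = vsnprintf(probe, sizeof(probe), fmt, ap);
	va_end(ap);
	if (needed < 0)
		return NULL;

	out = malloc((size_t)needed + 1);
	if (!out)
		return NULL;

	/* Pass 2: really fill the space. */
	va_start(ap, fmt);
	written = vsnprintf(out, (size_t)needed + 1, fmt, ap);
	va_end(ap);

	/* Holds only because the arguments did not change between the passes. */
	assert(written == needed);
	return out;
}
```

For example, two_pass_format("file '%s'", "/root/some/long/path") returns a heap copy of the whole string even though the first pass had room for only 7 characters. The size from the first pass is only valid if nothing being formatted changes before the second pass.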

Using stack space handles both of the cases above.

If we instead pass the caller's buffer, like:
p = d_path_fast(path, buf, end - buf);
we need to handle complicated logic in prepend_name().
I tried this approach in a local test; the code is very complicated
and not very graceful.
E.g. I first need to go through the loop to get the full path size of
that file, and then return reserved_size for the first vsnprintf().

Thanks for any suggestions.

--
Cheers,
Justin (Jia He)

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Matthew Wilcox May 28, 2021, 2:52 p.m. UTC | #3
On Fri, May 28, 2021 at 02:22:01PM +0000, Justin He wrote:
> > On Fri, May 28, 2021 at 07:39:50PM +0800, Jia He wrote:
> > > We have '%pD' for printing a filename. It may not be perfect (by
> > > default it only prints one component.)
> > >
> > > As suggested by Linus at [1]:
> > > A dentry has a parent, but at the same time, a dentry really does
> > > inherently have "one name" (and given just the dentry pointers, you
> > > can't show mount-related parenthood, so in many ways the "show just
> > > one name" makes sense for "%pd" in ways it doesn't necessarily for
> > > "%pD"). But while a dentry arguably has that "one primary component",
> > > a _file_ is certainly not exclusively about that last component.
> > >
> > > Hence "file_dentry_name()" simply shouldn't use "dentry_name()" at all.
> > > Despite that shared code origin, and despite that similar letter
> > > choice (lower-vs-upper case), a dentry and a file really are very
> > > different from a name standpoint.
> > >
> > > Here stack space is preferred for file_d_path_name() because it is
> > > much safer. The stack size 256 is a compromise between stack overflow
> > > and too short full path.
> >
> > How is it "safer"?  You already have a buffer passed from the caller.
> > Are you saying that d_path_fast() might overrun a really small buffer
> > but won't overrun a 256 byte buffer?
> No, it won't overrun a 256 byte buf. When the full path size is larger than 256, the p->len is < 0 in prepend_name, and this overrun will be
> detected in extract_string() with "-ENAMETOOLONG".
> 
> Each printk contains 2 vsnprintf. vsnprintf() returns the required size after formatting the string.
> 1. vprintk_store() will invoke 1st vsnprintf() with 8 bytes space to get the reserve_size. In this case, the _buf_ could be less than _end_ by design.
> 2. Then it invokes 2nd printk_sprint()->vscnprintf()->vsnprintf() to really fill the space.

I think you need to explain _that_ in the commit log, not make some
nebulous claim of "safer".

> If we choose the stack space, it can meet above 2 cases.
> 
> If we pass the parameter like:
> p = d_path_fast(path, buf, end - buf);
> We need to handle the complicated logic in prepend_name()
> I have tried this way in local test, the code logic is very complicated
> and not so graceful.
> e.g. I need to firstly go through the loop and get the full path size of
> that file. And then return reserved_size for that 1st vsnprintf

I'm not sure why it's so complicated.  p->len records how many bytes
are needed for the entire path; can't you just return -p->len ?
Matthew Wilcox May 28, 2021, 3:22 p.m. UTC | #4
On Fri, May 28, 2021 at 03:09:28PM +0000, Justin He wrote:
> > I'm not sure why it's so complicated.  p->len records how many bytes
> > are needed for the entire path; can't you just return -p->len ?
> 
> prepend_name() will return at the beginning if p->len is <0 in this case,
> we can't even get the correct full path size if keep __prepend_path unchanged.
> We need another new helper __prepend_path_size() to get the full path size
> regardless of the negative value p->len.

It's a little hard to follow, based on just the patches.  Is there a
git tree somewhere of Al's patches that you're based on?

Seems to me that prepend_name() is just fine because it updates p->len
before returning false:

 static bool prepend_name(struct prepend_buffer *p, const struct qstr *name)
 {
 	const char *dname = smp_load_acquire(&name->name); /* ^^^ */
 	u32 dlen = READ_ONCE(name->len);
 	char *s;

 	p->len -= dlen + 1;
 	if (unlikely(p->len < 0))
 		return false;

I think the only change you'd need to make for vsnprintf() is in
prepend_path():

-		if (!prepend_name(&b, &dentry->d_name))
-			break;
+		prepend_name(&b, &dentry->d_name);

Would that hurt anything else?

> More than that, even the 1st vsnprintf could have _end_ > _buf_ in some case:
> What if printk("%pD", filp) ? The 1st vsnprintf has positive (end-buf).

I don't understand the problem ... if p->len is positive, then you
succeeded.  if p->len is negative then -p->len is the expected return
value from vsnprintf().  No?
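
To make the accounting argument concrete, here is a minimal userspace sketch, assuming a simplified stand-in for struct prepend_buffer (all demo_* names are hypothetical). It keeps subtracting from len after the buffer is exhausted, so -len ends up as the number of bytes that were missing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct prepend_buffer. */
struct demo_prepend_buffer {
	char *buf;	/* current start of the string being built */
	int len;	/* bytes still free; negative once we overflow */
};

/* Prepend "/name"; once there is no room, only count and copy nothing. */
static void demo_prepend_name(struct demo_prepend_buffer *p, const char *name)
{
	int dlen = (int)strlen(name);

	p->len -= dlen + 1;
	if (p->len < 0)
		return;

	p->buf -= dlen + 1;
	p->buf[0] = '/';
	memcpy(p->buf + 1, name, (size_t)dlen);
}

/*
 * Build the path from its components, last component first.
 * Returns 0 and sets *start on success; otherwise returns the number of
 * bytes the buffer was short by and sets *start to NULL.
 */
static int demo_build_path(char *buf, int buflen,
			   const char *const *names, int n, char **start)
{
	struct demo_prepend_buffer p = { buf + buflen, buflen };
	int i;

	/* Account for the terminating NUL first. */
	p.len -= 1;
	if (p.len >= 0) {
		p.buf -= 1;
		*p.buf = '\0';
	}

	/* Never stop early, so len keeps recording the shortfall. */
	for (i = n - 1; i >= 0; i--)
		demo_prepend_name(&p, names[i]);

	*start = p.len >= 0 ? p.buf : NULL;
	return p.len < 0 ? -p.len : 0;
}
```

With the components "root" and "long_name_here" (21 bytes including the NUL), an 8-byte buffer reports a shortfall of 13, and 8 + 13 is exactly the size that would have been needed.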
Rasmus Villemoes May 28, 2021, 8:06 p.m. UTC | #5
On 28/05/2021 16.22, Justin He wrote:
> 
>> From: Matthew Wilcox <willy@infradead.org>

>> How is it "safer"?  You already have a buffer passed from the caller.
>> Are you saying that d_path_fast() might overrun a really small buffer
>> but won't overrun a 256 byte buffer?
> No, it won't overrun a 256 byte buf. When the full path size is larger than 256, the p->len is < 0 in prepend_name, and this overrun will be
> detected in extract_string() with "-ENAMETOOLONG".
> 
> Each printk contains 2 vsnprintf. vsnprintf() returns the required size after formatting the string.
> 1. vprintk_store() will invoke 1st vsnprintf() with 8 bytes space to get the reserve_size. In this case, the _buf_ could be less than _end_ by design.
> 2. Then it invokes 2nd printk_sprint()->vscnprintf()->vsnprintf() to really fill the space.

Please do not assume that printk is the only user of vsnprintf() or the
only one that would use a given %p<foo> extension.

Also, is it clear that nothing can change underneath you in between two
calls to vsnprintf()? IOW, is it certain that the path will fit upon a
second call using the size returned from the first?

Rasmus
Matthew Wilcox May 30, 2021, 3:18 p.m. UTC | #6
On Fri, May 28, 2021 at 10:06:37PM +0200, Rasmus Villemoes wrote:
> On 28/05/2021 16.22, Justin He wrote:
> >
> >> From: Matthew Wilcox <willy@infradead.org>
>
> >> How is it "safer"?  You already have a buffer passed from the caller.
> >> Are you saying that d_path_fast() might overrun a really small buffer
> >> but won't overrun a 256 byte buffer?
> > No, it won't overrun a 256 byte buf. When the full path size is larger than 256, the p->len is < 0 in prepend_name, and this overrun will be
> > detected in extract_string() with "-ENAMETOOLONG".
> >
> > Each printk contains 2 vsnprintf. vsnprintf() returns the required size after formatting the string.
> > 1. vprintk_store() will invoke 1st vsnprintf() with 8 bytes space to get the reserve_size. In this case, the _buf_ could be less than _end_ by design.
> > 2. Then it invokes 2nd printk_sprint()->vscnprintf()->vsnprintf() to really fill the space.
>
> Please do not assume that printk is the only user of vsnprintf() or the
> only one that would use a given %p<foo> extension.
>
> Also, is it clear that nothing can change underneath you in between two
> calls to vsnprintf()? IOW, is it certain that the path will fit upon a
> second call using the size returned from the first?

No, but that's also true of %s.  I think vprintk_store() is foolish to
do it this way.
Justin He May 31, 2021, 12:39 a.m. UTC | #7
> On Fri, May 28, 2021 at 03:09:28PM +0000, Justin He wrote:
> > > I'm not sure why it's so complicated.  p->len records how many bytes
> > > are needed for the entire path; can't you just return -p->len ?
> >
> > prepend_name() will return at the beginning if p->len is <0 in this case,
> > we can't even get the correct full path size if keep __prepend_path unchanged.
> > We need another new helper __prepend_path_size() to get the full path size
> > regardless of the negative value p->len.
>
> It's a little hard to follow, based on just the patches.  Is there a
> git tree somewhere of Al's patches that you're based on?

The git tree of Al's patches is at:
https://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git/log/?h=work.d_path

>
> Seems to me that prepend_name() is just fine because it updates p->len
> before returning false:
>
>  static bool prepend_name(struct prepend_buffer *p, const struct qstr *name)
>  {
>       const char *dname = smp_load_acquire(&name->name); /* ^^^ */
>       u32 dlen = READ_ONCE(name->len);
>       char *s;
>
>       p->len -= dlen + 1;
>       if (unlikely(p->len < 0))
>               return false;
>
> I think the only change you'd need to make for vsnprintf() is in
> prepend_path():
>
> -             if (!prepend_name(&b, &dentry->d_name))
> -                     break;
> +             prepend_name(&b, &dentry->d_name);
>
> Would that hurt anything else?

I will try your suggestion soon.

>
> > More than that, even the 1st vsnprintf could have _end_ > _buf_ in some case:
> > What if printk("%pD", filp) ? The 1st vsnprintf has positive (end-buf).
>
> I don't understand the problem ... if p->len is positive, then you
> succeeded.  if p->len is negative then -p->len is the expected return
> value from vsnprintf().  No?

There are 3 cases I ran into while debugging:
1. p->len is positive but too small (e.g. end-buf is 6). In the first
prepend_name() iteration p->len -= dlen, and then p->len is negative.

2. p->len is negative from the very beginning (i.e. end-buf is negative).

3. p->len is positive and large enough. Typically the 2nd vsnprintf of printk.


--
Cheers,
Justin (Jia He)


Justin He June 1, 2021, 2:42 p.m. UTC | #8
Hi Matthew

> On Fri, May 28, 2021 at 03:09:28PM +0000, Justin He wrote:
> > > I'm not sure why it's so complicated.  p->len records how many bytes
> > > are needed for the entire path; can't you just return -p->len ?
> >
> > prepend_name() will return at the beginning if p->len is <0 in this case,
> > we can't even get the correct full path size if keep __prepend_path unchanged.
> > We need another new helper __prepend_path_size() to get the full path size
> > regardless of the negative value p->len.
>
> It's a little hard to follow, based on just the patches.  Is there a
> git tree somewhere of Al's patches that you're based on?
>
> Seems to me that prepend_name() is just fine because it updates p->len
> before returning false:
>
>  static bool prepend_name(struct prepend_buffer *p, const struct qstr *name)
>  {
>       const char *dname = smp_load_acquire(&name->name); /* ^^^ */
>       u32 dlen = READ_ONCE(name->len);
>       char *s;
>
>       p->len -= dlen + 1;
>       if (unlikely(p->len < 0))
>               return false;
>
> I think the only change you'd need to make for vsnprintf() is in
> prepend_path():
>
> -             if (!prepend_name(&b, &dentry->d_name))
> -                     break;
> +             prepend_name(&b, &dentry->d_name);
>
> Would that hurt anything else?
>

It almost works, except for the snprintf() case.
Consider a filp whose path is 256 bytes, with 2 dentries "/root/$long_string":
snprintf(buffer, 128, "%pD", filp);
p->len is positive at first, but negative after the prepend_name() loop,
so it will not fill any bytes in _buffer_.
But in theory, it should fill the first 127 bytes and '\0'.

What do you think?

--
Cheers,
Justin (Jia He)


> > More than that, even the 1st vsnprintf could have _end_ > _buf_ in some case:
> > What if printk("%pD", filp) ? The 1st vsnprintf has positive (end-buf).
>
> I don't understand the problem ... if p->len is positive, then you
> succeeded.  if p->len is negative then -p->len is the expected return
> value from vsnprintf().  No?

Matthew Wilcox June 1, 2021, 3:30 p.m. UTC | #9
Somehow the linux-fsdevel mailing list got dropped from this revision
of the patch set.  Anyone who's following along may wish to refer to
the archives:
https://lore.kernel.org/linux-doc/20210528113951.6225-1-justin.he@arm.com/

On Tue, Jun 01, 2021 at 02:42:15PM +0000, Justin He wrote:
> > On Fri, May 28, 2021 at 03:09:28PM +0000, Justin He wrote:
> > > > I'm not sure why it's so complicated.  p->len records how many bytes
> > > > are needed for the entire path; can't you just return -p->len ?
> > >
> > > prepend_name() will return at the beginning if p->len is <0 in this case,
> > > we can't even get the correct full path size if keep __prepend_path unchanged.
> > > We need another new helper __prepend_path_size() to get the full path size
> > > regardless of the negative value p->len.
> >
> > It's a little hard to follow, based on just the patches.  Is there a
> > git tree somewhere of Al's patches that you're based on?
> >
> > Seems to me that prepend_name() is just fine because it updates p->len
> > before returning false:
> >
> >  static bool prepend_name(struct prepend_buffer *p, const struct qstr *name)
> >  {
> >       const char *dname = smp_load_acquire(&name->name); /* ^^^ */
> >       u32 dlen = READ_ONCE(name->len);
> >       char *s;
> >
> >       p->len -= dlen + 1;
> >       if (unlikely(p->len < 0))
> >               return false;
> >
> > I think the only change you'd need to make for vsnprintf() is in
> > prepend_path():
> >
> > -             if (!prepend_name(&b, &dentry->d_name))
> > -                     break;
> > +             prepend_name(&b, &dentry->d_name);
> >
> > Would that hurt anything else?
> >
>
> It almost works except the snprintf case,
> Consider,assuming filp path is 256 bytes, 2 dentries "/root/$long_string":
> snprintf(buffer, 128, "%pD", filp);
> p->len is positive at first, but negative after prepend_name loop.
> So, it will not fill any bytes in _buffer_.
> But in theory, it should fill the beginning 127 bytes and '\0'.

I have a few thoughts ...

1. Do we actually depend on that anywhere?
2. Is that something we should support?
3. We could print the start of the filename, if we do.  So something like
this ...

static void prepend(struct prepend_buffer *p, const char *str, int namelen)
{
	p->len -= namelen;
	if (likely(p->len >= 0)) {
		p->buf -= namelen;
		memcpy(p->buf, str, namelen);
	} else {
		char *s = p->buf;
		int buflen = strlen(p->buf);

		/* The first time we overflow the buffer */
		if (p->len + namelen > 0) {
			p->buf -= p->len + namelen;
			buflen += p->len + namelen;
		}

		if (buflen > namelen) {
			memmove(p->buf + namelen, s, buflen - namelen);
			memcpy(p->buf, str, namelen);
		} else {
			memcpy(p->buf, str, buflen);
		}
	}
}

I haven't tested this; it's probably full of confusion and off-by-one
errors.  But I hope you get the point -- we continue to accumulate
p->len to indicate how many characters we shifted off the right of the
buffer while adding the (start of) the filename on the left.

4. If we want the end of the filename instead, that looks easier:

static void prepend(struct prepend_buffer *p, const char *str, int namelen)
{
	p->len -= namelen;
	if (likely(p->len >= 0)) {
		p->buf -= namelen;
		memcpy(p->buf, str, namelen);
	} else if (p->len + namelen > 0) {
		p->buf -= p->len + namelen;
		memcpy(p->buf, str - p->len, p->len + namelen);
	}
}
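
For anyone who wants to try the tail-keeping idea outside the kernel, the following is a small userspace sketch under stated assumptions: struct tail_buffer is a hypothetical stand-in for struct prepend_buffer, and demo_tail_path() is an invented driver that prepends components the way a path walk would. The prepend logic itself mirrors the function just above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct prepend_buffer. */
struct tail_buffer {
	char *buf;	/* current start of the string being built */
	int len;	/* bytes still free; negative once we overflow */
};

/* Same logic as the tail-keeping prepend() above. */
static void demo_prepend(struct tail_buffer *p, const char *str, int namelen)
{
	p->len -= namelen;
	if (p->len >= 0) {
		p->buf -= namelen;
		memcpy(p->buf, str, (size_t)namelen);
	} else if (p->len + namelen > 0) {
		/* Only "room" bytes were still free: keep the end of str. */
		int room = p->len + namelen;

		p->buf -= room;
		memcpy(p->buf, str + namelen - room, (size_t)room);
	}
}

/*
 * Build "/a/b" from components into buf (size bytes, including the NUL).
 * The result is moved to the start of buf. Returns the final len:
 * non-negative if everything fit, negative by the number of bytes lost.
 */
static int demo_tail_path(char *buf, int size, const char *const *names, int n)
{
	struct tail_buffer p;
	int i;

	assert(size > 0);
	buf[size - 1] = '\0';
	p.buf = buf + size - 1;
	p.len = size - 1;

	for (i = n - 1; i >= 0; i--) {
		demo_prepend(&p, names[i], (int)strlen(names[i]));
		demo_prepend(&p, "/", 1);
	}

	memmove(buf, p.buf, strlen(p.buf) + 1);
	return p.len;
}
```

With the components "root" and "some_long_name" and an 8-byte buffer, this leaves "ng_name" in the buffer and returns -13, i.e. the 20-character path was 13 characters too long for the 7 available.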

But I don't think we want any of this at all.  Just don't put anything
in the buffer if the user didn't supply enough space.  As long as you
get the return value right, they know the string is bad (or they don't
care if the string is bad)
Andy Shevchenko June 1, 2021, 3:36 p.m. UTC | #10
On Tue, Jun 1, 2021 at 6:32 PM Matthew Wilcox <willy@infradead.org> wrote:
> On Tue, Jun 01, 2021 at 02:42:15PM +0000, Justin He wrote:

...

> Just don't put anything
> in the buffer if the user didn't supply enough space.  As long as you
> get the return value right, they know the string is bad (or they don't
> care if the string is bad)

It might be that I'm out of context here, but printf() functionality
in the kernel (vsprintf() if being precise)  and its users consider
that it should fill buffer up to the end of whatever space is
available.

-- 
With Best Regards,
Andy Shevchenko
Matthew Wilcox June 1, 2021, 3:44 p.m. UTC | #11
On Tue, Jun 01, 2021 at 06:36:41PM +0300, Andy Shevchenko wrote:
> On Tue, Jun 1, 2021 at 6:32 PM Matthew Wilcox <willy@infradead.org> wrote:
> > On Tue, Jun 01, 2021 at 02:42:15PM +0000, Justin He wrote:
>
> ...
>
> > Just don't put anything
> > in the buffer if the user didn't supply enough space.  As long as you
> > get the return value right, they know the string is bad (or they don't
> > care if the string is bad)
>
> It might be that I'm out of context here, but printf() functionality
> in the kernel (vsprintf() if being precise)  and its users consider
> that it should fill buffer up to the end of whatever space is
> available.

Do they though?  What use is it to specify a small buffer, print a
large filename into it and then use that buffer, knowing that it wasn't
big enough?  That would help decide whether we should print the
start or the end of the filename.

Remember, we're going for usefulness here, not abiding by the letter of
the standard under all circumstances, no matter the cost.  At least
partially because we're far outside the standard here; POSIX does
not specify what %pD does.

"The argument shall be a pointer to void. The value of the
pointer is converted to a sequence of printable characters, in an
implementation-defined manner."
Andy Shevchenko June 1, 2021, 3:53 p.m. UTC | #12
On Tue, Jun 01, 2021 at 04:44:00PM +0100, Matthew Wilcox wrote:
> On Tue, Jun 01, 2021 at 06:36:41PM +0300, Andy Shevchenko wrote:
> > On Tue, Jun 1, 2021 at 6:32 PM Matthew Wilcox <willy@infradead.org> wrote:
> > > On Tue, Jun 01, 2021 at 02:42:15PM +0000, Justin He wrote:
> >
> > ...
> >
> > > Just don't put anything
> > > in the buffer if the user didn't supply enough space.  As long as you
> > > get the return value right, they know the string is bad (or they don't
> > > care if the string is bad)
> >
> > It might be that I'm out of context here, but printf() functionality
> > in the kernel (vsprintf() if being precise)  and its users consider
> > that it should fill buffer up to the end of whatever space is
> > available.
>
> Do they though?  What use is it to specify a small buffer, print a
> large filename into it and then use that buffer, knowing that it wasn't
> big enough?  That would help decide whether we should print the
> start or the end of the filename.
>
> Remember, we're going for usefulness here, not abiding by the letter of
> the standard under all circumstances, no matter the cost.  At least
> partially because we're far outside the standard here; POSIX does
> not specify what %pD does.
>
> "The argument shall be a pointer to void. The value of the
> pointer is converted to a sequence of printable characters, in an
> implementation-defined manner."

All nice words, but don't forget kasprintf() or other usages like this.
For the same input we have to have the same result independently of the room in
the buffer.

So, if I print "Hello, World" I should always get it, not "Monkey's Paw".
I.o.w.

 snprintf(10) ==> "Hello, Wor"
 snprintf(5)  ==> "Hello"
 snprintf(2)  !=> "Mo"
 snprintf(1)  !=> "M"
 snprintf(1)  ==> "H"

Inconsistency here is really not what we want.
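
The property is easy to check for plain %s with standard C snprintf(). The sketch below is a userspace illustration only, and truncation_is_prefix() is a hypothetical helper. Note that the size argument of snprintf() includes the terminating NUL, so a size of 6 yields "Hello".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if formatting s into a buffer of "size" bytes yields exactly
 * the first size - 1 characters of s (or all of s if it fits), and the
 * return value is the full length regardless of the buffer size.
 */
static int truncation_is_prefix(const char *s, size_t size)
{
	char buf[64];
	size_t full = strlen(s);
	size_t expect;
	int ret;

	assert(size > 0 && size <= sizeof(buf));
	expect = full < size - 1 ? full : size - 1;

	ret = snprintf(buf, size, "%s", s);

	return ret >= 0 && (size_t)ret == full &&
	       strlen(buf) == expect &&
	       strncmp(buf, s, expect) == 0;
}
```

Calling truncation_is_prefix("Hello, World", n) for every n from 1 to 20 returns 1 each time: a smaller buffer only ever shortens the output, it never changes what the kept characters are.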


-- 
With Best Regards,
Andy Shevchenko
Andy Shevchenko June 1, 2021, 4:10 p.m. UTC | #13
On Tue, Jun 01, 2021 at 06:53:26PM +0300, Andy Shevchenko wrote:
> On Tue, Jun 01, 2021 at 04:44:00PM +0100, Matthew Wilcox wrote:
> > On Tue, Jun 01, 2021 at 06:36:41PM +0300, Andy Shevchenko wrote:
> > > On Tue, Jun 1, 2021 at 6:32 PM Matthew Wilcox <willy@infradead.org> wrote:
> > > > On Tue, Jun 01, 2021 at 02:42:15PM +0000, Justin He wrote:
> > >
> > > ...
> > >
> > > > Just don't put anything
> > > > in the buffer if the user didn't supply enough space.  As long as you
> > > > get the return value right, they know the string is bad (or they don't
> > > > care if the string is bad)
> > >
> > > It might be that I'm out of context here, but printf() functionality
> > > in the kernel (vsprintf() if being precise)  and its users consider
> > > that it should fill buffer up to the end of whatever space is
> > > available.
> >
> > Do they though?  What use is it to specify a small buffer, print a
> > large filename into it and then use that buffer, knowing that it wasn't
> > big enough?  That would help decide whether we should print the
> > start or the end of the filename.
> >
> > Remember, we're going for usefulness here, not abiding by the letter of
> > the standard under all circumstances, no matter the cost.  At least
> > partially because we're far outside the standard here; POSIX does
> > not specify what %pD does.
> >
> > "The argument shall be a pointer to void. The value of the
> > pointer is converted to a sequence of printable characters, in an
> > implementation-defined manner."
>
> All nice words, but don't forget kasprintf() or other usages like this.
> For the same input we have to have the same result independently of the room in
> the buffer.
>
> So, if I print "Hello, World" I should always get it, not "Monkey's Paw".
> I.o.w.
>
>  snprintf(10) ==> "Hello, Wor"
>  snprintf(5)  ==> "Hello"
>  snprintf(2)  !=> "Mo"
>  snprintf(1)  !=> "M"
>  snprintf(1)  ==> "H"
>
> Inconsistency here is really not what we want.

I have to add that, in light of the topic, those characters should be counted
from the end of the filename, so that we give the user as much useful
information as possible. I.o.w. always print the last part of the filename up
to the buffer size, or the filename in full if it is shorter than the buffer.

-- 
With Best Regards,
Andy Shevchenko
Matthew Wilcox June 1, 2021, 5:05 p.m. UTC | #14
On Tue, Jun 01, 2021 at 07:10:41PM +0300, Andy Shevchenko wrote:
> On Tue, Jun 01, 2021 at 06:53:26PM +0300, Andy Shevchenko wrote:
> > On Tue, Jun 01, 2021 at 04:44:00PM +0100, Matthew Wilcox wrote:
> > > On Tue, Jun 01, 2021 at 06:36:41PM +0300, Andy Shevchenko wrote:
> > > > On Tue, Jun 1, 2021 at 6:32 PM Matthew Wilcox <willy@infradead.org> wrote:
> > > > > On Tue, Jun 01, 2021 at 02:42:15PM +0000, Justin He wrote:
> > > > 
> > > > ...
> > > > 
> > > > > Just don't put anything
> > > > > in the buffer if the user didn't supply enough space.  As long as you
> > > > > get the return value right, they know the string is bad (or they don't
> > > > > care if the string is bad)
> > > > 
> > > > It might be that I'm out of context here, but printf() functionality
> > > > in the kernel (vsprintf() if being precise)  and its users consider
> > > > that it should fill buffer up to the end of whatever space is
> > > > available.
> > > 
> > > Do they though?  What use is it to specify a small buffer, print a
> > > large filename into it and then use that buffer, knowing that it wasn't
> > > big enough?  That would help decide whether we should print the
> > > start or the end of the filename.
> > > 
> > > Remember, we're going for usefulness here, not abiding by the letter of
> > > the standard under all circumstances, no matter the cost.  At least
> > > partially because we're far outside the standard here; POSIX does
> > > not specify what %pD does.
> > > 
> > > "The argument shall be a pointer to void. The value of the
> > > pointer is converted to a sequence of printable characters, in an
> > > implementation-defined manner."
> > 
> > All nice words, but don't forget kasprintf() or other usages like this.
> > For the same input we have to have the same result independently on the room in
> > the buffer.
> > 
> > So, if I print "Hello, World" I should always get it, not "Monkey's Paw".
> > I.o.w.
> > 
> >  snprintf(10) ==> "Hello, Wor"
> >  snprintf(5)  ==> "Hello"
> >  snprintf(2)  !=> "Mo"
> >  snprintf(1)  !=> "M"
> >  snprintf(1)  ==> "H"
> > 
> > Inconsistency here is really not what we want.
> 
> I have to add that in light of the topic those characters should be counted
> from the end of the filename. So, we will give user as much as possible of useful
> information. I.o.w. always print the last part of filename up to the buffer
> size or if the filename is shorter than buffer we will have it in full.

Ah, not monkey's paw, but donkey hoof then ...

Here's some examples, what do you think makes sense?

snprintf(buf, 16, "bad file '%pD'\n", q);

what content do you want buf to have when q is variously:

1. /abcd/efgh
2. /a/bcdefgh.iso
3. /abcdef/gh

I would argue that
"bad file ''\n"
is actually a better string to have than any of (case 2)
"bad file '/a/bc"
"bad file 'bcdef"
"bad file 'h.iso"
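
The candidate behaviours for the path argument can be compared with a small userspace sketch. This is only an illustration under stated assumptions: '%s' stands in for '%pD', and bad_file_msg() and enum trunc_policy are hypothetical names, not kernel code. The "bcdef" variant (start of the last component) is left out for brevity.

```c
#include <stdio.h>
#include <string.h>

#define MSG_FMT    "bad file '%s'\n"
#define MSG_PREFIX 10	/* strlen("bad file '") */
#define MSG_SUFFIX 2	/* strlen("'\n") */

/* Hypothetical policies for a path that does not fit in the buffer. */
enum trunc_policy { KEEP_HEAD, KEEP_TAIL, KEEP_NOTHING };

/*
 * Userspace stand-in for snprintf(buf, size, "bad file '%pD'\n", q).
 * Always returns the length of the complete, untruncated message, so the
 * caller can detect truncation the same way as with snprintf().
 */
static int bad_file_msg(char *buf, size_t size, const char *path,
			enum trunc_policy policy)
{
	size_t plen = strlen(path);
	size_t full = MSG_PREFIX + plen + MSG_SUFFIX;
	/* Bytes left for the path after the prefix and the trailing NUL. */
	size_t room = size > MSG_PREFIX + 1 ? size - MSG_PREFIX - 1 : 0;

	if (full < size || policy == KEEP_HEAD)
		/* Fits, or plain snprintf() truncation: keep the start. */
		snprintf(buf, size, MSG_FMT, path);
	else if (policy == KEEP_TAIL)
		/* Skip leading characters so the end of the path survives. */
		snprintf(buf, size, MSG_FMT,
			 plen > room ? path + (plen - room) : path);
	else
		/* Print nothing for the path when it does not fit. */
		snprintf(buf, size, MSG_FMT, "");

	return (int)full;
}
```

With a 16-byte buffer and case 2 (/a/bcdefgh.iso), the three policies leave "bad file '/a/bc", "bad file 'h.iso" and "bad file ''\n" in buf respectively, and each call returns 26.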
Rasmus Villemoes June 1, 2021, 7:01 p.m. UTC | #15
On 01/06/2021 19.05, Matthew Wilcox wrote:

> Here's some examples, what do you think makes sense?
> 
> snprintf(buf, 16, "bad file '%pD'\n", q);
> 
> what content do you want buf to have when q is variously:
> 
> 1. /abcd/efgh
> 2. /a/bcdefgh.iso
> 3. /abcdef/gh
> 
> I would argue that
> "bad file ''\n"
> is actually a better string to have than any of (case 2)
> "bad file '/a/bc"
> "bad file 'bcdef"
> "bad file 'h.iso"
> 

Whatever ends up being decided, _please_ document that in
machine-readable and -verifiable form. I.e., update lib/test_printf.c
accordingly.

Currently (and originally) it only tests %pd because %pD is/was
essentially just %pd with an indirection to get the struct dentry* from
a struct file*.

The existing framework is strongly centered around expecting '/a/bc (see
all the logic where we do multiple checks with size 0, size random, size
plenty, and for the random case check that the buffer contents match the
complete output up till the randomly chosen size), so adding tests for
some other semantics would require a bit more juggling.

Not that that should be an argument in favor of that behaviour. But FWIW
that would be my preference.

Rasmus
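
The size-0 / random-size / plenty-of-room checks described above can be sketched in userspace roughly as follows. This is an assumption-laden illustration, not the code in lib/test_printf.c: check_one() and check_all_sizes() are hypothetical helpers, and vsnprintf() from libc stands in for the kernel implementation.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 256

/* Format into 'size' bytes and compare against 'expect'; 0 on success. */
static int check_one(size_t size, const char *expect, const char *fmt,
		     va_list ap)
{
	char buf[BUF_SIZE];
	size_t elen = strlen(expect);
	size_t written;
	va_list aq;
	int ret;

	if (size > sizeof(buf))
		return -1;

	va_copy(aq, ap);
	ret = vsnprintf(buf, size, fmt, aq);
	va_end(aq);

	/* The return value must be the full length, whatever the size. */
	if (ret < 0 || (size_t)ret != elen)
		return -1;
	if (size == 0)
		return 0;	/* nothing was written, nothing to compare */

	/* Whatever fit must be a NUL-terminated prefix of the full output. */
	written = elen < size - 1 ? elen : size - 1;
	if (buf[written] != '\0' || memcmp(buf, expect, written) != 0)
		return -1;
	return 0;
}

/* Run the same format with size 0, a random truncating size and plenty. */
static int check_all_sizes(const char *expect, const char *fmt, ...)
{
	size_t elen = strlen(expect);
	va_list ap;
	int err = 0;

	va_start(ap, fmt);
	err |= check_one(0, expect, fmt, ap);
	/* A size in [1, elen] always truncates; fall back to 1 if empty. */
	err |= check_one(elen ? 1 + (size_t)rand() % elen : 1, expect, fmt, ap);
	err |= check_one(elen + 1, expect, fmt, ap);
	va_end(ap);

	return err ? -1 : 0;
}
```

A check built this way only passes for the "print the start" behaviour; keeping the tail or printing an empty string would fail the prefix comparison in the random-size case, which is the extra juggling mentioned above.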
Justin He June 2, 2021, 5:47 a.m. UTC | #16
Hi Rasmus

> -----Original Message-----
> From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Sent: Wednesday, June 2, 2021 3:02 AM
> To: Matthew Wilcox <willy@infradead.org>; Andy Shevchenko
> <andy.shevchenko@gmail.com>
> Cc: Justin He <Justin.He@arm.com>; Linus Torvalds <torvalds@linux-
> foundation.org>; Petr Mladek <pmladek@suse.com>; Steven Rostedt
> <rostedt@goodmis.org>; Sergey Senozhatsky <senozhatsky@chromium.org>;
> Jonathan Corbet <corbet@lwn.net>; Alexander Viro <viro@zeniv.linux.org.uk>;
> Luca Coelho <luciano.coelho@intel.com>; Kalle Valo <kvalo@codeaurora.org>;
> David S. Miller <davem@davemloft.net>; Jakub Kicinski <kuba@kernel.org>;
> Heiko Carstens <hca@linux.ibm.com>; Vasily Gorbik <gor@linux.ibm.com>;
> Christian Borntraeger <borntraeger@de.ibm.com>; Johannes Berg
> <johannes.berg@intel.com>; linux-doc@vger.kernel.org; linux-
> kernel@vger.kernel.org; linux-wireless@vger.kernel.org;
> netdev@vger.kernel.org; linux-s390@vger.kernel.org; Linux FS Devel <linux-
> fsdevel@vger.kernel.org>
> Subject: Re: [PATCH RFCv2 2/3] lib/vsprintf.c: make %pD print full path for
> file
>
> On 01/06/2021 19.05, Matthew Wilcox wrote:
>
> > Here's some examples, what do you think makes sense?
> >
> > snprintf(buf, 16, "bad file '%pD'\n", q);
> >
> > what content do you want buf to have when q is variously:
> >
> > 1. /abcd/efgh
> > 2. /a/bcdefgh.iso
> > 3. /abcdef/gh
> >
> > I would argue that
> > "bad file ''\n"
> > is actually a better string to have than any of (case 2)
> > "bad file '/a/bc"
> > "bad file 'bcdef"
> > "bad file 'h.iso"
> >
>
> Whatever ends up being decided, _please_ document that in
> machine-readable and -verifiable form. I.e., update lib/test_printf.c
> accordingly.
>
> Currently (and originally) it only tests %pd because %pD is/was
> essentially just %pd with an indirection to get the struct dentry* from
> a struct file*.

Okay, I can add more test_printf cases for '%pD'

>
> The existing framework is strongly centered around expecting '/a/bc (see
> all the logic where we do multiple checks with size 0, size random, size
> plenty, and for the random case check that the buffer contents match the
> complete output up till the randomly chosen size), so adding tests for
> some other semantics would require a bit more juggling.
>

Yes, agree.
Put another way, if the user does:
char *full_path = d_path(...);
snprintf(buf, limited_size, "%s", full_path);

they will get a truncated path, which is inconsistent with '%pD' if we make
it return "".

--
Cheers,
Justin (Jia He)

> Not that that should be an argument in favor of that behaviour. But FWIW
> that would be my preference.
>
> Rasmus
>


Patch

diff --git a/Documentation/core-api/printk-formats.rst b/Documentation/core-api/printk-formats.rst
index f063a384c7c8..95ba14dc529b 100644
--- a/Documentation/core-api/printk-formats.rst
+++ b/Documentation/core-api/printk-formats.rst
@@ -408,12 +408,13 @@  dentry names
 ::
 
 	%pd{,2,3,4}
-	%pD{,2,3,4}
+	%pD
 
 For printing dentry name; if we race with :c:func:`d_move`, the name might
 be a mix of old and new ones, but it won't oops.  %pd dentry is a safer
 equivalent of %s dentry->d_name.name we used to use, %pd<n> prints ``n``
-last components.  %pD does the same thing for struct file.
+last components.  %pD prints full file path together with mount-related
+parenthood.
 
 Passed by reference.
 
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index f0c35d9b65bf..2e5387b08d67 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -27,6 +27,7 @@ 
 #include <linux/string.h>
 #include <linux/ctype.h>
 #include <linux/kernel.h>
+#include <linux/dcache.h>
 #include <linux/kallsyms.h>
 #include <linux/math64.h>
 #include <linux/uaccess.h>
@@ -920,13 +921,25 @@  char *dentry_name(char *buf, char *end, const struct dentry *d, struct printf_sp
 }
 
 static noinline_for_stack
-char *file_dentry_name(char *buf, char *end, const struct file *f,
+char *file_d_path_name(char *buf, char *end, const struct file *f,
 			struct printf_spec spec, const char *fmt)
 {
+	const struct path *path;
+	char *p;
+	char full_path[256];
+
 	if (check_pointer(&buf, end, f, spec))
 		return buf;
 
-	return dentry_name(buf, end, f->f_path.dentry, spec, fmt);
+	path = &f->f_path;
+	if (check_pointer(&buf, end, path, spec))
+		return buf;
+
+	p = d_path_fast(path, full_path, sizeof(full_path));
+	if (IS_ERR(p))
+		return err_ptr(buf, end, p, spec);
+
+	return string_nocheck(buf, end, p, spec);
 }
 #ifdef CONFIG_BLOCK
 static noinline_for_stack
@@ -2296,7 +2309,7 @@  early_param("no_hash_pointers", no_hash_pointers_enable);
  * - 'a[pd]' For address types [p] phys_addr_t, [d] dma_addr_t and derivatives
  *           (default assumed to be phys_addr_t, passed by reference)
  * - 'd[234]' For a dentry name (optionally 2-4 last components)
- * - 'D[234]' Same as 'd' but for a struct file
+ * - 'D' For full path name of a struct file
  * - 'g' For block_device name (gendisk + partition number)
  * - 't[RT][dt][r]' For time and date as represented by:
  *      R    struct rtc_time
@@ -2395,7 +2408,7 @@  char *pointer(const char *fmt, char *buf, char *end, void *ptr,
 	case 'C':
 		return clock(buf, end, ptr, spec, fmt);
 	case 'D':
-		return file_dentry_name(buf, end, ptr, spec, fmt);
+		return file_d_path_name(buf, end, ptr, spec, fmt);
 #ifdef CONFIG_BLOCK
 	case 'g':
 		return bdev_name(buf, end, ptr, spec, fmt);