debugfs: don't access 4 bytes for a boolean

Message ID 3d6f65fa15363650f2d10ca58b9d9d243e98980f.1441961769.git.viresh.kumar@linaro.org
State New

Commit Message

Viresh Kumar Sept. 11, 2015, 9:06 a.m.
Long ago, 'bool' used to be a typedef for 'int', but that changed in
v2.6.19. It is now a typedef for _Bool, which (mostly) takes just a
byte. In any case, the bool type in the kernel is only used to store
true/false or 1/0, so accessing a single byte should be enough.

The problem with the current code is that it reads/writes 4 bytes for a
boolean, which also reads/updates the 3 bytes following the boolean
variable. That can lead to hard-to-fix bugs. It was a nightmare to
crack this one.

The debugfs code has had this bug since it was first introduced but,
strangely, it was never caught. Maybe the bool variables (monitored by
debugfs) were followed by an 'int' or something bigger, and the pad
bytes made sure we never saw this issue.

But the OPP (Operating Performance Points) library has three booleans
allocated in contiguous bytes, and this bug was hit quite soon (the
debugfs support for OPP is yet to be merged).

Fix this by changing the type of the 'val' pointer to u8, so that we
only access a single byte.
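
As a rough userspace illustration of the failure mode described above
(this is not part of the patch), the sketch below emulates a 4-byte store
at the address of the first of three adjacent booleans. The structure
`struct flags` and both helper names are hypothetical; memcpy() stands in
for the u32 store so that the demo itself avoids aliasing and alignment
problems. It assumes sizeof(bool) == 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a structure holding three adjacent booleans. */
struct flags {
	bool a;
	bool b;
	bool c;
	uint8_t guard;	/* keeps the 4-byte store below inside the object */
};

/* The demo only makes sense if the members are packed into 4 bytes. */
static_assert(sizeof(struct flags) == 4, "demo assumes sizeof(bool) == 1");

/* Emulates the old write path: a 4-byte store at the address of 'a'. */
void store_first_as_u32(struct flags *f, bool bv)
{
	uint32_t v = bv;

	memcpy(f, &v, sizeof(v));	/* also overwrites b, c and guard */
}

/* Emulates a single-object write path: only 'a' itself is written. */
void store_first_as_bool(struct flags *f, bool bv)
{
	f->a = bv;
}
```

Starting from { false, true, true, 0 }, the 4-byte variant clears b and c
on either byte order, while the single-object variant leaves them alone.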

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
Greg,

I wasn't sure what to put in the stable tag. This bug has been around
for a really long time now.

And this also gets me worrying whether any other parts of the kernel
are treating booleans in a similar way :)

Also, there is another problem I see, which probably should be fixed as
well. But I wanted to hear from you before trying to patch the kernel
for this.

debugfs_create_bool() declares the pointer to be of type u32 *.
Shouldn't that be changed to u8 *? There are many users which are
typecasting the variables to make debugfs API happy :)

 fs/debugfs/file.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Rasmus Villemoes Sept. 11, 2015, 11:18 a.m. | #1
On Fri, Sep 11 2015, Viresh Kumar <viresh.kumar@linaro.org> wrote:

> Long ago, 'bool' used to be a typedef for 'int', but that changed in
> v2.6.19. It is now a typedef for _Bool, which (mostly) takes just a
> byte. In any case, the bool type in the kernel is only used to store
> true/false or 1/0, so accessing a single byte should be enough.
>
> The problem with the current code is that it reads/writes 4 bytes for a
> boolean, which also reads/updates the 3 bytes following the boolean
> variable. That can lead to hard-to-fix bugs. It was a nightmare to
> crack this one.
>
> The debugfs code has had this bug since it was first introduced but,
> strangely, it was never caught. Maybe the bool variables (monitored by
> debugfs) were followed by an 'int' or something bigger, and the pad
> bytes made sure we never saw this issue.
>
> But the OPP (Operating Performance Points) library has three booleans
> allocated in contiguous bytes, and this bug was hit quite soon (the
> debugfs support for OPP is yet to be merged).
>
> Fix this by changing the type of the 'val' pointer to u8, so that we
> only access a single byte.

If the pointed-to type is supposed to be a bool aka _Bool, shouldn't you
cast to bool* instead of assuming sizeof(bool)==1? It's probably
non-existing, but imagine a big-endian architecture where
sizeof(bool)==4; you'd end up reading/writing the wrong byte.

> Also, there is another problem I see, which probably should be fixed as
> well. But I wanted to hear from you before trying to patch the kernel
> for this.
>
> debugfs_create_bool() declares the pointer to be of type u32 *.
> Shouldn't that be changed to u8 *? There are many users which are
> typecasting the variables to make debugfs API happy :)

Hm, yes, that's annoying. But since most people currently do pass an
u32, treating the pointer as u8* is wrong on big-endian (though of
course it doesn't matter if the value is only ever checked for being
zero/non-zero). So it would probably be better to change the
debugfs_create_bool to actually expect a bool* - there aren't _that_
many current callers, and some are obviously aware of the weirdness
(with comments such as 'must be u32 for debugfs_create_bool').
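
To make the big-endian concern concrete, here is a small sketch (an
illustration only, not code from the thread). It lays a 32-bit value out
in memory in each byte order and then checks what a u8 * access at the
same address would see. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Lay out a 32-bit value in memory in big- or little-endian byte order. */
void layout_u32(uint32_t v, bool big_endian, uint8_t out[4])
{
	assert(out);
	for (int i = 0; i < 4; i++) {
		int shift = big_endian ? 8 * (3 - i) : 8 * i;

		out[i] = (uint8_t)(v >> shift);
	}
}

/* What a u8 * access at the address of the u32 would observe. */
bool read_via_u8(const uint8_t bytes[4])
{
	return bytes[0] != 0;
}
```

With the value 1, the little-endian layout reads back as true through the
first byte, but the big-endian layout reads back as false, because the
non-zero byte is the last one.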

Rasmus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
gregkh@linuxfoundation.org Sept. 11, 2015, 4:41 p.m. | #2
On Fri, Sep 11, 2015 at 01:18:37PM +0200, Rasmus Villemoes wrote:
> On Fri, Sep 11 2015, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> 
> > Long ago, 'bool' used to be a typedef for 'int', but that changed in
> > v2.6.19. It is now a typedef for _Bool, which (mostly) takes just a
> > byte. In any case, the bool type in the kernel is only used to store
> > true/false or 1/0, so accessing a single byte should be enough.
> >
> > The problem with the current code is that it reads/writes 4 bytes for a
> > boolean, which also reads/updates the 3 bytes following the boolean
> > variable. That can lead to hard-to-fix bugs. It was a nightmare to
> > crack this one.
> >
> > The debugfs code has had this bug since it was first introduced but,
> > strangely, it was never caught. Maybe the bool variables (monitored by
> > debugfs) were followed by an 'int' or something bigger, and the pad
> > bytes made sure we never saw this issue.
> >
> > But the OPP (Operating Performance Points) library has three booleans
> > allocated in contiguous bytes, and this bug was hit quite soon (the
> > debugfs support for OPP is yet to be merged).
> >
> > Fix this by changing the type of the 'val' pointer to u8, so that we
> > only access a single byte.

Nice catch, but let's do this a bit differently (see below).

> If the pointed-to type is supposed to be a bool aka _Bool, shouldn't you
> cast to bool* instead of assuming sizeof(bool)==1? It's probably
> non-existing, but imagine a big-endian architecture where
> sizeof(bool)==4; you'd end up reading/writing the wrong byte.
> 
> > Also, there is another problem I see, which probably should be fixed as
> > well. But I wanted to hear from you before trying to patch the kernel
> > for this.
> >
> > debugfs_create_bool() declares the pointer to be of type u32 *.
> > Shouldn't that be changed to u8 *? There are many users which are
> > typecasting the variables to make debugfs API happy :)
> 
> Hm, yes, that's annoying. But since most people currently do pass an
> u32, treating the pointer as u8* is wrong on big-endian (though of
> course it doesn't matter if the value is only ever checked for being
> zero/non-zero). So it would probably be better to change the
> debugfs_create_bool to actually expect a bool* - there aren't _that_
> many current callers, and some are obviously aware of the weirdness
> (with comments such as 'must be u32 for debugfs_create_bool').

I agree, let's just fix up the api to have the correct type.  I think
when I originally wrote the function, we didn't have a 'bool' type that
was "native" in the c standard.

thanks,

greg k-h
Arnd Bergmann Sept. 14, 2015, 3:25 p.m. | #3
On Friday 11 September 2015 14:36:06 Viresh Kumar wrote:
> 
> debugfs_create_bool() declares the pointer to be of type u32 *.
> Shouldn't that be changed to u8 *? There are many users which are
> typecasting the variables to make debugfs API happy 

I'd say that the argument to debugfs_create_bool() has to match the
access in the functions you are modifying, as well as whatever
gets passed into it by callers.

By accessing only the first byte, you break all drivers that
call debugfs_create_bool() with a four-byte argument, at least
on big-endian systems!

If we change any part of this, we need to audit the existing 31 callers
of the function and change them all to use a bool type.

In the problem that you saw, what prevented gcc from printing a
compile-time warning about debugfs_create_bool() being called with
a bool argument?

	Arnd
Viresh Kumar Sept. 14, 2015, 3:31 p.m. | #4
On 14-09-15, 17:25, Arnd Bergmann wrote:
> I'd say that the argument to debugfs_create_bool() has to match the
> access in the functions you are modifying, as well as whatever
> gets passed into it by callers.
> 
> By accessing only the first byte, you break all drivers that
> call debugfs_create_bool() with a four-byte argument, at least
> on big-endian systems!
> 
> If we change any part of this, we need to audit the existing 31 callers
> of the function and change them all to use a bool type.

Right, so I have already sent a new version of this patch which should
be able to take care of stuff you pointed out.

> In the problem that you saw, what prevented gcc from printing a
> compile-time warning about debugfs_create_bool() being called with
> a bool argument?

A forced cast to u32 * :)
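
A minimal sketch of why the cast hides the problem, using hypothetical
stand-ins (create_bool_old(), create_bool_new(), struct bool_file) rather
than the real debugfs prototypes. Each stand-in just records how many
bytes its read/write handlers would access. The _Alignas is only there so
that the pointer conversion in the demo is well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static_assert(sizeof(bool) < sizeof(uint32_t), "demo assumes a narrow bool");

/* Hypothetical record of what a registered file would access. */
struct bool_file {
	const char *name;
	void *data;
	size_t access_width;	/* bytes touched by the read/write handlers */
};

/* Stand-in for an API that takes a 32-bit pointer. */
struct bool_file create_bool_old(const char *name, uint32_t *value)
{
	return (struct bool_file){ name, value, sizeof(*value) };
}

/* Stand-in for an API that takes the natural type. */
struct bool_file create_bool_new(const char *name, bool *value)
{
	return (struct bool_file){ name, value, sizeof(*value) };
}

struct state {
	_Alignas(uint32_t) bool enabled;
};

struct bool_file register_with_cast(struct state *s)
{
	/*
	 * Passing &s->enabled directly would draw an incompatible pointer
	 * type diagnostic; the explicit cast silences it.
	 */
	return create_bool_old("enabled", (uint32_t *)&s->enabled);
}
```

With a bool * parameter no cast is needed, and the recorded access width
matches the size of the variable that was actually passed in.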

Patch

diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
index 284f9aa0028b..c123185a296a 100644
--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -439,7 +439,7 @@  static ssize_t read_file_bool(struct file *file, char __user *user_buf,
 			      size_t count, loff_t *ppos)
 {
 	char buf[3];
-	u32 *val = file->private_data;
+	u8 *val = file->private_data;
 
 	if (*val)
 		buf[0] = 'Y';
@@ -456,7 +456,7 @@  static ssize_t write_file_bool(struct file *file, const char __user *user_buf,
 	char buf[32];
 	size_t buf_size;
 	bool bv;
-	u32 *val = file->private_data;
+	u8 *val = file->private_data;
 
 	buf_size = min(count, (sizeof(buf)-1));
 	if (copy_from_user(buf, user_buf, buf_size))