Message ID | 20210818060533.3569517-6-keescook@chromium.org |
---|---|
State | New |
Headers | show |
Series | Introduce strict memcpy() bounds checking | expand |
On Tue, Aug 17, 2021 at 11:06 PM Kees Cook <keescook@chromium.org> wrote: > > Kernel code has a regular need to describe groups of members within a > structure usually when they need to be copied or initialized separately > from the rest of the surrounding structure. The generally accepted design > pattern in C is to use a named sub-struct: > > struct foo { > int one; > struct { > int two; > int three, four; > } thing; > int five; > }; > > This would allow for traditional references and sizing: > > memcpy(&dst.thing, &src.thing, sizeof(dst.thing)); > > However, doing this would mean that referencing struct members enclosed > by such named structs would always require including the sub-struct name > in identifiers: > > do_something(dst.thing.three); > > This has tended to be quite inflexible, especially when such groupings > need to be added to established code which causes huge naming churn. > Three workarounds exist in the kernel for this problem, and each have > other negative properties. > > To avoid the naming churn, there is a design pattern of adding macro > aliases for the named struct: > > #define f_three thing.three > > This ends up polluting the global namespace, and makes it difficult to > search for identifiers. > > Another common work-around in kernel code avoids the pollution by avoiding > the named struct entirely, instead identifying the group's boundaries using > either a pair of empty anonymous structs of a pair of zero-element arrays: > > struct foo { > int one; > struct { } start; > int two; > int three, four; > struct { } finish; > int five; > }; > > struct foo { > int one; > int start[0]; > int two; > int three, four; > int finish[0]; > int five; > }; > > This allows code to avoid needing to use a sub-struct named for member > references within the surrounding structure, but loses the benefits of > being able to actually use such a struct, making it rather fragile. Using > these requires open-coded calculation of sizes and offsets. The efforts > made to avoid common mistakes include lots of comments, or adding various > BUILD_BUG_ON()s. Such code is left with no way for the compiler to reason > about the boundaries (e.g. the "start" object looks like it's 0 bytes > in length), making bounds checking depend on open-coded calculations: > > if (length > offsetof(struct foo, finish) - > offsetof(struct foo, start)) > return -EINVAL; > memcpy(&dst.start, &src.start, offsetof(struct foo, finish) - > offsetof(struct foo, start)); > > However, the vast majority of places in the kernel that operate on > groups of members do so without any identification of the grouping, > relying either on comments or implicit knowledge of the struct contents, > which is even harder for the compiler to reason about, and results in > even more fragile manual sizing, usually depending on member locations > outside of the region (e.g. to copy "two" and "three", use the start of > "four" to find the size): > > BUILD_BUG_ON((offsetof(struct foo, four) < > offsetof(struct foo, two)) || > (offsetof(struct foo, four) < > offsetof(struct foo, three)); > if (length > offsetof(struct foo, four) - > offsetof(struct foo, two)) > return -EINVAL; > memcpy(&dst.two, &src.two, length); > > In order to have a regular programmatic way to describe a struct > region that can be used for references and sizing, can be examined for > bounds checking, avoids forcing the use of intermediate identifiers, > and avoids polluting the global namespace, introduce the struct_group() > macro. This macro wraps the member declarations to create an anonymous > union of an anonymous struct (no intermediate name) and a named struct > (for references and sizing): > > struct foo { > int one; > struct_group(thing, > int two; > int three, four; > ); > int five; > }; > > if (length > sizeof(src.thing)) > return -EINVAL; > memcpy(&dst.thing, &src.thing, length); > do_something(dst.three); > > There are some rare cases where the resulting struct_group() needs > attributes added, so struct_group_attr() is also introduced to allow > for specifying struct attributes (e.g. __align(x) or __packed). > Additionally, there are places where such declarations would like to > have the struct be typed, so struct_group_typed() is added. > > Given there is a need for a handful of UAPI uses too, the underlying > __struct_group() macro has been defined in UAPI so it can be used there > too. > > Co-developed-by: Keith Packard <keithp@keithp.com> > Signed-off-by: Keith Packard <keithp@keithp.com> > Signed-off-by: Kees Cook <keescook@chromium.org> > Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org> > Link: https://lore.kernel.org/lkml/20210728023217.GC35706@embeddedor > Enhanced-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> > Link: https://lore.kernel.org/lkml/41183a98-bdb9-4ad6-7eab-5a7292a6df84@rasmusvillemoes.dk > Enhanced-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com>
diff --git a/include/linux/stddef.h b/include/linux/stddef.h index 998a4ba28eba..f2aefdb22d1d 100644 --- a/include/linux/stddef.h +++ b/include/linux/stddef.h @@ -36,4 +36,51 @@ enum { #define offsetofend(TYPE, MEMBER) \ (offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER)) +/** + * struct_group(NAME, MEMBERS) + * + * Used to create an anonymous union of two structs with identical + * layout and size: one anonymous and one named. The former can be + * used normally without sub-struct naming, and the latter can be + * used to reason about the start, end, and size of the group of + * struct members. + * + * @NAME: The identifier name of the mirrored sub-struct + * @MEMBERS: The member declarations for the mirrored structs + */ +#define struct_group(NAME, MEMBERS...) \ + __struct_group(/* no tag */, NAME, /* no attrs */, MEMBERS) + +/** + * struct_group_attr(NAME, ATTRS, MEMBERS) + * + * Used to create an anonymous union of two structs with identical + * layout and size: one anonymous and one named. The former can be + * used normally without sub-struct naming, and the latter can be + * used to reason about the start, end, and size of the group of + * struct members. Includes structure attributes argument. + * + * @NAME: The identifier name of the mirrored sub-struct + * @ATTRS: Any struct attributes + * @MEMBERS: The member declarations for the mirrored structs + */ +#define struct_group_attr(NAME, ATTRS, MEMBERS...) \ + __struct_group(/* no tag */, NAME, ATTRS, MEMBERS) + +/** + * struct_group_tagged(TAG, NAME, MEMBERS) + * + * Used to create an anonymous union of two structs with identical + * layout and size: one anonymous and one named. The former can be + * used normally without sub-struct naming, and the latter can be + * used to reason about the start, end, and size of the group of + * struct members. Includes struct tag argument for the named copy. + * + * @TAG: The tag name for the named sub-struct + * @NAME: The identifier name of the mirrored sub-struct + * @MEMBERS: The member declarations for the mirrored structs + */ +#define struct_group_tagged(TAG, NAME, MEMBERS...) \ + __struct_group(TAG, NAME, /* no attrs */, MEMBERS) + #endif diff --git a/include/uapi/linux/stddef.h b/include/uapi/linux/stddef.h index ee8220f8dcf5..0fbdf2f711aa 100644 --- a/include/uapi/linux/stddef.h +++ b/include/uapi/linux/stddef.h @@ -4,3 +4,24 @@ #ifndef __always_inline #define __always_inline inline #endif + +/** + * __struct_group(TAG, NAME, ATTRS, MEMBERS) + * + * Used to create an anonymous union of two structs with identical layout + * and size: one anonymous and one named. The former's members can be used + * normally without sub-struct naming, and the latter can be used to + * reason about the start, end, and size of the group of struct members. + * The named struct can also be explicitly tagged, as well as both having + * struct attributes. + * + * @TAG: The tag name for the named sub-struct (usually empty) + * @NAME: The identifier name of the mirrored sub-struct + * @ATTRS: Any struct attributes (usually empty) + * @MEMBERS: The member declarations for the mirrored structs + */ +#define __struct_group(TAG, NAME, ATTRS, MEMBERS...) \ + union { \ + struct { MEMBERS } ATTRS; \ + struct TAG { MEMBERS } ATTRS NAME; \ + }