mbox series

[v8,0/9] Generate modules.builtin.alias from match ids

Message ID 20221219191855.2010466-1-allenwebb@google.com
Headers show
Series Generate modules.builtin.alias from match ids | expand

Message

Allen Webb Dec. 19, 2022, 7:18 p.m. UTC
Generate modules.builtin.alias from match ids

This patch series (v8) generates `modules.builtin.alias` during modpost.
The goal is for tools like USBGuard to leverage not only modules.aliases
but also `modules.builtin.aliases` to associate devices with the modules
that may be bound before deciding to authorize a device or not. This is
particularly useful in cases when new devices of a particular type
shouldn't be allowed part of the time like for lock screens.

Also included in this series are style fixes and fixes for build
breakages for built-in modules that relied on MODULE_DEVICE_TABLE being
a no-op. Some of these were typos in the device table name and one
ifdef-ed out the device table.

--

# Generate modules.builtin.alias from match ids

This series (v7) has incremental improvements over the previous series.
One big positive of this patch series is it makes it harder for bugs
in kernel modules related to MODULE_DEVICE_TABLE to hide when a module
is only ever tested as a built-in module. This is demonstrated by all
the required fixes at the beginning of the series.

Note, cover letters were first added in v5.

  RFC (broken patch): https://lore.kernel.org/lkml/CAJzde042-M4UbpNYKw0eDVg4JqYmwmPYSsmgK+kCMTqsi+-2Yw@mail.gmail.com/
  v1 (missing v1 label): https://lore.kernel.org/lkml/20221111152852.2837363-1-allenwebb@google.com/
  v2 (missing v2 label): https://lore.kernel.org/lkml/20221128201332.3482092-1-allenwebb@google.com/
  v3: https://lore.kernel.org/lkml/20221129224313.455862-1-allenwebb@google.com/
  v4: https://lore.kernel.org/lkml/20221130221447.1202206-1-allenwebb@google.com/
  v5: https://lore.kernel.org/lkml/20221201211630.101541-1-allenwebb@google.com/
  v6: https://lore.kernel.org/lkml/20221202224540.1446952-1-allenwebb@google.com/
  v7: https://lore.kernel.org/lkml/20221216221703.294683-1-allenwebb@google.com/
  v8: This version

## Patch series status
123456789012345678901234567890123456789012345678901234567890123456789012
This series is still going through revisions in response to comments.

I believe there is potential to improve the Makefile part of the patch
series as well as an open question of whether modpost should generate
`modules.built.alias` directly or create a vmlinuz.mod.c containing the
missing module info for the match-id based aliases for built-in modules.

## Acknowledgements

Thanks to Greg Kroah-Hartman, Christophe Leroy, and the Linux
maintainers for being patient with me as I have worked through learning
the kernel workflow to get this series into a more presentable state.

Thanks to Luis Chamberlain for raising the alternative of using kmod to
address the primary motivation of the patch series.

Also, thanks to Intel's kernel test robot <lkp@intel.com> for catching
issues that showed up on different kernel configurations.


Allen Webb (9):
  imx: Fix typo
  rockchip-mailbox: Fix typo
  scsi/BusLogic: Always include device id table
  stmpe-spi: Fix typo
  module.h: MODULE_DEVICE_TABLE for built-in modules
  modpost: Track module name for built-in modules
  modpost: Add -b option for emitting built-in aliases
  file2alias.c: Implement builtin.alias generation
  build: Add modules.builtin.alias

 .gitignore                         |  1 +
 Makefile                           |  1 +
 drivers/mailbox/rockchip-mailbox.c |  2 +-
 drivers/mfd/stmpe-spi.c            |  2 +-
 drivers/scsi/BusLogic.c            |  2 -
 drivers/soc/imx/imx8mp-blk-ctrl.c  |  2 +-
 include/linux/module.h             | 15 ++++-
 scripts/Makefile.modpost           | 17 +++++-
 scripts/mod/file2alias.c           | 94 +++++++++++++++++++++++-------
 scripts/mod/modpost.c              | 23 +++++++-
 scripts/mod/modpost.h              |  2 +
 11 files changed, 131 insertions(+), 30 deletions(-)

Comments

Allen Webb Dec. 19, 2022, 8:49 p.m. UTC | #1
Please disregard this patch. I updated the commit message and this was
hanging around in my outgoing directory afterward.

On Mon, Dec 19, 2022 at 2:46 PM Allen Webb <allenwebb@google.com> wrote:
>
> Update the documentation to include the presense and use case of
> modules.builtin.alias.
>
> Signed-off-by: Allen Webb <allenwebb@google.com>
> ---
>  Documentation/kbuild/kbuild.rst | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/Documentation/kbuild/kbuild.rst b/Documentation/kbuild/kbuild.rst
> index 08f575e6236c..1c7c02040a54 100644
> --- a/Documentation/kbuild/kbuild.rst
> +++ b/Documentation/kbuild/kbuild.rst
> @@ -17,6 +17,12 @@ modules.builtin
>  This file lists all modules that are built into the kernel. This is used
>  by modprobe to not fail when trying to load something builtin.
>
> +modules.builtin.alias
> +---------------------
> +This file lists all match-id based aliases for modules built into the kernel.
> +These are intended to enable userspace to make authorization decisions based
> +on which modules are likely to be bound to a device after it is authorized.
> +
>  modules.builtin.modinfo
>  -----------------------
>  This file contains modinfo from all modules that are built into the kernel.
> --
> 2.37.3
>
Luis Chamberlain Dec. 19, 2022, 9:23 p.m. UTC | #2
On Mon, Dec 19, 2022 at 02:46:18PM -0600, Allen Webb wrote:
> Update the documentation to include the presense and use case of
> modules.builtin.alias.
> 
> Signed-off-by: Allen Webb <allenwebb@google.com>
> ---
>  Documentation/kbuild/kbuild.rst | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/Documentation/kbuild/kbuild.rst b/Documentation/kbuild/kbuild.rst
> index 08f575e6236c..1c7c02040a54 100644
> --- a/Documentation/kbuild/kbuild.rst
> +++ b/Documentation/kbuild/kbuild.rst
> @@ -17,6 +17,12 @@ modules.builtin
>  This file lists all modules that are built into the kernel. This is used
>  by modprobe to not fail when trying to load something builtin.
>  
> +modules.builtin.alias
> +---------------------
> +This file lists all match-id based aliases for modules built into the kernel.
> +These are intended to enable userspace to make authorization decisions based
> +on which modules are likely to be bound to a device after it is authorized.

What is an example? This sounds obscure.

  Luis
Allen Webb Dec. 19, 2022, 10:20 p.m. UTC | #3
On Mon, Dec 19, 2022 at 4:07 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Mon, Dec 19, 2022 at 03:40:42PM -0600, Allen Webb wrote:
> > On Mon, Dec 19, 2022 at 3:23 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> > >
> > > On Mon, Dec 19, 2022 at 02:46:18PM -0600, Allen Webb wrote:
> > > > Update the documentation to include the presense and use case of
> > > > modules.builtin.alias.
> > > >
> > > > Signed-off-by: Allen Webb <allenwebb@google.com>
> > > > ---
> > > >  Documentation/kbuild/kbuild.rst | 6 ++++++
> > > >  1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/Documentation/kbuild/kbuild.rst b/Documentation/kbuild/kbuild.rst
> > > > index 08f575e6236c..1c7c02040a54 100644
> > > > --- a/Documentation/kbuild/kbuild.rst
> > > > +++ b/Documentation/kbuild/kbuild.rst
> > > > @@ -17,6 +17,12 @@ modules.builtin
> > > >  This file lists all modules that are built into the kernel. This is used
> > > >  by modprobe to not fail when trying to load something builtin.
> > > >
> > > > +modules.builtin.alias
> > > > +---------------------
> > > > +This file lists all match-id based aliases for modules built into the kernel.
> > > > +These are intended to enable userspace to make authorization decisions based
> > > > +on which modules are likely to be bound to a device after it is authorized.
> > >
> > > What is an example? This sounds obscure.
> >
> > Many of the devices that match the usb_storage driver only specify the
> > vendor id, product id, and device id (VID:PID:D) and do not match
> > against device class, interface class, etc. Here are some examples
> > from modules.alias: A grep for wildcards in these fields yields 6136
> > matches:
> > grep 'dc\*dsc\*dp\*ic\*isc\*ip\*in\*'
> > /lib/modules/5.19.11-1rodete1-amd64/modules.alias | wc -l
> > 6136
> >
> > To write USBGuard policy that only authorizes devices that bind to a
> > particular module the policy needs to be aware of all these VID:PID:D
> > which can change between kernel versions.
> >
> > This is done at runtime rather than excluding modules from the build
> > because some devices are not needed at or before login or when a
> > device is locked. By not authorizing new devices that would bind to a
> > set of modules, these modules become unreachable to an attacker who
> > seeks to exploit kernel bugs in those modules.
> >
> > I could add this detail to the documentation file, but I was trying to
> > keep the description to about the same length as the others around it.
>
> How about the second sentence you wrote say something like:
>
> An example usage of the built-in aliases is to enable software such as
> USBGuard to enable / disable specific devices outside of just the
> vendor, product and device ID. This allows more flexible security policies
> in userspace.

I tweaked it a tiny bit, but that makes the whole description:

This file lists all match-id based aliases for modules built into the kernel.
An example usage of the built-in aliases is to enable software such as
USBGuard to allow or block devices outside of just the vendor, product, and
device ID. This enables more flexible security policies in userspace.

>
>   Luis
Greg Kroah-Hartman Dec. 20, 2022, 6:46 a.m. UTC | #4
On Mon, Dec 19, 2022 at 02:46:10PM -0600, Allen Webb wrote:
> A one character difference in the name supplied to MODULE_DEVICE_TABLE
> breaks a future patch set, so fix the typo.
> 
> Cc: stable@vger.kernel.org
> Fixes: f70ed3b5dc8b ("mailbox: rockchip: Add Rockchip mailbox driver")

How has this been an issue since the 4.6 kernel and no one has noticed
it?  Can this code not be built as a module?  If not, then please
explain this.

thanks,

greg k-h
Allen Webb Dec. 20, 2022, 2:58 p.m. UTC | #5
On Tue, Dec 20, 2022 at 12:46 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Mon, Dec 19, 2022 at 02:46:10PM -0600, Allen Webb wrote:
> > A one character difference in the name supplied to MODULE_DEVICE_TABLE
> > breaks a future patch set, so fix the typo.
> >
> > Cc: stable@vger.kernel.org
> > Fixes: f70ed3b5dc8b ("mailbox: rockchip: Add Rockchip mailbox driver")
>
> How has this been an issue since the 4.6 kernel and no one has noticed
> it?  Can this code not be built as a module?  If not, then please
> explain this.

As mentioned in a different sub-thread this cannot be built as a
module so I updated the commit message to:

imx: Fix typo

A one character difference in the name supplied to MODULE_DEVICE_TABLE
breaks compilation for SOC_IMX8M after built-in modules can generate
match-id based module aliases, so fix the typo.

This was not caught earlier because SOC_IMX8M can not be built as a
module and MODULE_DEVICE_TABLE is a no-op for built-in modules.

Fixes: 556f5cf9568a ("soc: imx: add i.MX8MP HSIO blk-ctrl")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Allen Webb <allenwebb@google.com>

>
> thanks,
>
> greg k-h
Allen Webb Dec. 20, 2022, 4:36 p.m. UTC | #6
On Tue, Dec 20, 2022 at 12:45 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Mon, Dec 19, 2022 at 02:46:13PM -0600, Allen Webb wrote:
> > Implement MODULE_DEVICE_TABLE for build-in modules to make it possible
> > to generate a builtin.alias file to complement modules.alias.
> >
> > Signed-off-by: Allen Webb <allenwebb@google.com>
> > ---
> >  include/linux/module.h | 15 ++++++++++++++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/module.h b/include/linux/module.h
> > index ec61fb53979a..3d1b04ca6350 100644
> > --- a/include/linux/module.h
> > +++ b/include/linux/module.h
> > @@ -243,7 +243,20 @@ extern void cleanup_module(void);
> >  extern typeof(name) __mod_##type##__##name##_device_table            \
> >    __attribute__ ((unused, alias(__stringify(name))))
> >  #else  /* !MODULE */
> > -#define MODULE_DEVICE_TABLE(type, name)
> > +/*
> > + * The names may not be unique for built-in modules, so include the module name
> > + * to guarantee uniqueness.
>
> What "names" are you referring to here with the words, "The names"?
>
> And built-in modules have the same rules as external names, they have to
> be unique so I do not understand the problem you are trying to solve
> here, which means you need to describe it better in both the changelog
> text and the comment.

I changed the comment to:
/*
 * Creates an alias so file2alias.c can find device table for built in modules.
 *
 * The module name is included for two reasons:
 *   - Adding the module name to the alias avoids creating two aliases with the
 *     same name. Historically MODULE_DEVICE_TABLE was a no-op for built-in
 *     modules, so there was nothing to stop different modules from having the
 *     same device table name and consequently the same alias when building as a
 *     module.
 *   - The module name is needed by files2alias.c to associate a particular
 *     device table with its associated module since files2alias would otherwise
 *     see the module name as `vmlinuz.o` for built-in modules.
 */

>
> > + *
> > + * Note that extern is needed because modpost reads these symbols to generate
> > + * modalias entries for each match id in each device table. They are not used
> > + * at runtime.
>
> This comment isn't explaining much about what the #define is to be used
> for, is it?

I will drop this. I originally added the comment because Christophe Leroy said:
"'extern' keyword is pointless of function prototypes and deprecated.
Don't add new occurences."

This is clearly not a typical function prototype and the guidance from:
https://www.kernel.org/doc/html/latest/process/coding-style.html#function-prototypes
should not apply.

>
> confused,
>
> greg k-h
Luis Chamberlain Dec. 20, 2022, 6:12 p.m. UTC | #7
On Tue, Dec 20, 2022 at 08:58:36AM -0600, Allen Webb wrote:
> As mentioned in a different sub-thread this cannot be built as a
> module so I updated the commit message to:
> 
> imx: Fix typo
>
> A one character difference in the name supplied to MODULE_DEVICE_TABLE
> breaks compilation for SOC_IMX8M after built-in modules can generate
> match-id based module aliases, so fix the typo.

Are you saying that it is a typo *now* only, and fixing it does not fix
compilation now, but that fixing the typo will fix a future compilation
issue once your patches get merged for built-in module aliases?

> This was not caught earlier because SOC_IMX8M can not be built as a
> module and MODULE_DEVICE_TABLE is a no-op for built-in modules.

Odd, so why did it use MODULE_DEVICE_TABLE then? What was the reason for
the driver having MODULE_DEVICE_TABLE if it was a no-op?

  Luis
Allen Webb Dec. 20, 2022, 7:49 p.m. UTC | #8
On Tue, Dec 20, 2022 at 12:47 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Tue, Dec 20, 2022 at 12:19:49PM -0600, Allen Webb wrote:
> > On Tue, Dec 20, 2022 at 12:12 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> > >
> > > On Tue, Dec 20, 2022 at 08:58:36AM -0600, Allen Webb wrote:
> > > > As mentioned in a different sub-thread this cannot be built as a
> > > > module so I updated the commit message to:
> > > >
> > > > imx: Fix typo
> > > >
> > > > A one character difference in the name supplied to MODULE_DEVICE_TABLE
> > > > breaks compilation for SOC_IMX8M after built-in modules can generate
> > > > match-id based module aliases, so fix the typo.
> > >
> > > Are you saying that it is a typo *now* only, and fixing it does not fix
> > > compilation now, but that fixing the typo will fix a future compilation
> > > issue once your patches get merged for built-in module aliases?
> >
> > It was always a typo, it just doesn't affect the build because
> > MODULE_DEVICE_TABLE is not doing anything for built-in modules before
> > this patch series.
> >
> > >
> > > > This was not caught earlier because SOC_IMX8M can not be built as a
> > > > module and MODULE_DEVICE_TABLE is a no-op for built-in modules.
> > >
> > > Odd, so why did it use MODULE_DEVICE_TABLE then? What was the reason for
> > > the driver having MODULE_DEVICE_TABLE if it was a no-op?
> >
> > That is a good question. I can only speculate as to the answer
>
> You can use git blame to trace back to the original commit that added
> it, then  use git blame foo.c commit-id~1  on the file to keep going
> back until you get to the first commit that added that entry, check out
> that as a branch and see if the driver was still not a module then
> (tristate). If so then your speculation is very likely accurate and
> can be spelled out in the commit log message.

All three cases are bool configs.

>
> It begs the inverse question too though, you are finding uses of
> built-in-always code (never tristate) which uses MODULE_DEVICE_TABLE().
> Although today that's a no-op, after your changes this becomes useful
> information, so do you need to scrape to see what built-in-aways code
> *do* not use MODULE_DEVICE_TABLE() where after your patches it would
> have become useful?
>
> Determing if there is value to that endeavour should be easily grasped by
> reading the description you are adding to MODULE_DEVICE_TABLE() --
> in your patch 5 "module.h: MODULE_DEVICE_TABLE for built-in modules".
> Driver developers for built-in-always code should read that description
> and be able to decide if they should use it or not. But even your latest
> replies to Greg do not make that clear, *I personally gather* rather that
> this would in no way shape or form be useful. But is that true?

I took another stab at clarifying (and also dropped the ifdev since
the same macro works both for separate and built-in modules:

/*
 * Creates an alias so file2alias.c can find device table.
 *
 * Use this in cases where a device table is used to match devices because it
 * surfaces match-id based module aliases to userspace for:
 *   - Automatic module loading.
 *   - Tools like USBGuard which allow or block devices based on policy such as
 *     which modules match a device.
 *
 * The module name is included in the alias for two reasons:
 *   - It avoids creating two aliases with the same name for built-in modules.
 *     Historically MODULE_DEVICE_TABLE was a no-op for built-in modules, so
 *     there was nothing to stop different modules from having the same device
 *     table name and consequently the same alias when building as a module.
 *   - The module name is needed by files2alias.c to associate a particular
 *     device table with its associated module for built-in modules since
 *     files2alias would otherwise see the module name as `vmlinuz.o`.
 */
#define MODULE_DEVICE_TABLE(type, name) \
extern void *CONCATENATE( \
CONCATENATE(__mod_##type##__##name##__, \
__KBUILD_MODNAME), \
_device_table) \
__attribute__ ((unused, alias(__stringify(name))))

>
> So why not just remove MODULE_DEVICE_TABLE() from code we know is
> built-in-always code instead of fixing a typo just to fix a future
> compile issue?
>
> Then your commit log is not about "fix typo", but rather remove a no-op
> macro as the driver is always built-in and keeping that macro would not
> help built-in code.

The deciding factor in whether it makes sense to remove these vs fix
them seems to be, "How complete do we want modules.builtin.alias to
be?"

Arguably we should just drop these in cases where there isn't an
"authorized" sysfs attribute but following that logic there is not any
reason to generate built-in aliases for anything except USB and
thunderbolt.

On the flip side, if we are going to the effort to make this a generic
solution that covers everything, the built-in aliases are only as
useful as they are complete, so we would want everything that defines
a device table to call the macro correctly.

>
> > but it
> > is plausible people copied a common pattern and since no breakage was
> > noticed left it as is.
>
> This level of clarity is important to spell out in the commit log
> message, specially if you are suggesting it is just a typo fix. Because
> I will take it for granted that it is just that.
>
> If it fixes a future use case where the typo would be more of an issue,
> you can mention that in a secondary paragraph or sentence.
>
> > It also raises the question how many modules have device tables, but
> > do not call MODULE_DEVICE_TABLE since they are only ever built-in.
> > Maybe there should be some build time enforcement mechanism to make
> > sure that these are consistent.
>
> Definitely, Nick Alcock is doing some related work where the semantics
> of built-in modules needs to be clearer, he for instance is now removing
> a few MODULE_() macros for things which *are never* modules, and this is
> because after commit 8b41fc4454e ("kbuild: create modules.builtin
> without Makefile.modbuiltin or tristate.conf") we rely on the module
> license tag to generate the modules.builtin file. Without that commit
> we end up traversing the source tree twice. Nick's work builds on
> that work and futher clarifies these semantics by adding tooling which
> complains when something which is *never* capable of being a module
> uses module macros. The macro you are extending, MODULE_DEVICE_TABLE(),
> today is a no-op for built-in, but you are adding support to extend it
> for built-in stuff. Nick's work will help with clarifying symbol locality
> and so he may be interested in your association for the data in
> MODULE_DEVICE_TABLE and how you associate to a respective would-be
> module. His work is useful for making tracing more accurate with respect
> to symbol associations, so the data in MODULE_DEVICE_TABLE() may be
> useful as well to him.

Thanks, I will look through what I can find.

>
> You folks may want to Cc each other on your patches.
>
> If we know for certain things *will never* be used or *should not be
> used*, as in the case of the module license tag, we should certainly
> have tooling to pick up on that crap to help us tidy it up.
>
> If you determine MODULE_DEVICE_TABLE() *should* not be used for built-in
> always code (never tristate) then you and Nick likely have overlap of
> macros to tidy up and tooling to share to spot these sort of issues.

It definitely is needed for never-tristate modules that match devices
in subsystems that surface the authorized attribute.

>
>   Luis
Luis Chamberlain Dec. 20, 2022, 8:03 p.m. UTC | #9
On Tue, Dec 20, 2022 at 01:49:04PM -0600, Allen Webb wrote:
> I took another stab at clarifying (and also dropped the ifdev since
> the same macro works both for separate and built-in modules:
> 
> /*
>  * Creates an alias so file2alias.c can find device table.
>  *
>  * Use this in cases where a device table is used to match devices because it
>  * surfaces match-id based module aliases to userspace for:
>  *   - Automatic module loading.
>  *   - Tools like USBGuard which allow or block devices based on policy such as
>  *     which modules match a device.
>  *
>  * The module name is included in the alias for two reasons:
>  *   - It avoids creating two aliases with the same name for built-in modules.
>  *     Historically MODULE_DEVICE_TABLE was a no-op for built-in modules, so
>  *     there was nothing to stop different modules from having the same device
>  *     table name and consequently the same alias when building as a module.
>  *   - The module name is needed by files2alias.c to associate a particular
>  *     device table with its associated module for built-in modules since
>  *     files2alias would otherwise see the module name as `vmlinuz.o`.
>  */

This is still weak in light of the questions I had. It does not make it
easy for a driver developer who is going to support only built-in only
if they need to define this or not. And it seems we're still discussing
the merits of this, so I'd wait until this is fleshed out, but I think
we are on the right track finally.

> The deciding factor in whether it makes sense to remove these vs fix
> them seems to be, "How complete do we want modules.builtin.alias to
> be?"
> 
> Arguably we should just drop these in cases where there isn't an
> "authorized" sysfs attribute but following that logic there is not any
> reason to generate built-in aliases for anything except USB and
> thunderbolt.

There we go, now we have a *real* use case for this for built-in stuff
to consider. Is USBGuard effective even for built-in stuff?

Given everything discussed so far I'd like to get clarification if it
even help for built-in USB / thunderbolt. Does it? If so how? What could
userspace do with this information if the driver is already built-in?

> On the flip side, if we are going to the effort to make this a generic
> solution that covers everything, the built-in aliases are only as
> useful as they are complete, so we would want everything that defines
> a device table to call the macro correctly.

It is the ambiguity which is terrible to add. If the only use case is
for USB and Thunderbolt then we can spell it out, then only those driver
developers would care to consider it if the driver is bool. And, a
respective tooling would scrape only those drivers to verify if the
table is missing for built-in too.

> It definitely is needed for never-tristate modules that match devices
> in subsystems that surface the authorized attribute.

What is this "authorized attribute" BTW exactly? Do have some
documentation reference?

  Luis
Allen Webb Dec. 27, 2022, 5:42 p.m. UTC | #10
On Tue, Dec 20, 2022 at 5:09 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Tue, Dec 20, 2022 at 03:57:57PM -0600, Allen Webb wrote:
> > On Tue, Dec 20, 2022 at 2:03 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> > >
> > > On Tue, Dec 20, 2022 at 01:49:04PM -0600, Allen Webb wrote:
> > > > I took another stab at clarifying (and also dropped the ifdev since
> > > > the same macro works both for separate and built-in modules:
> > > >
> > > > /*
> > > >  * Creates an alias so file2alias.c can find device table.
> > > >  *
> > > >  * Use this in cases where a device table is used to match devices because it
> > > >  * surfaces match-id based module aliases to userspace for:
> > > >  *   - Automatic module loading.
> > > >  *   - Tools like USBGuard which allow or block devices based on policy such as
> > > >  *     which modules match a device.
> > > >  *
> > > >  * The module name is included in the alias for two reasons:
> > > >  *   - It avoids creating two aliases with the same name for built-in modules.
> > > >  *     Historically MODULE_DEVICE_TABLE was a no-op for built-in modules, so
> > > >  *     there was nothing to stop different modules from having the same device
> > > >  *     table name and consequently the same alias when building as a module.
> > > >  *   - The module name is needed by files2alias.c to associate a particular
> > > >  *     device table with its associated module for built-in modules since
> > > >  *     files2alias would otherwise see the module name as `vmlinuz.o`.
> > > >  */
> > >
> > > This is still weak in light of the questions I had. It does not make it
> > > easy for a driver developer who is going to support only built-in only
> > > if they need to define this or not. And it seems we're still discussing
> > > the merits of this, so I'd wait until this is fleshed out, but I think
> > > we are on the right track finally.
> > >
> > > > The deciding factor in whether it makes sense to remove these vs fix
> > > > them seems to be, "How complete do we want modules.builtin.alias to
> > > > be?"
> > > >
> > > > Arguably we should just drop these in cases where there isn't an
> > > > "authorized" sysfs attribute but following that logic there is not any
> > > > reason to generate built-in aliases for anything except USB and
> > > > thunderbolt.
> > >
> > > There we go, now we have a *real* use case for this for built-in stuff
> > > to consider. Is USBGuard effective even for built-in stuff?
> >
> > Yes, just because a module is loaded doesn't mean a specific device
> > has probed the driver yet.
> >
> > >
> > > Given everything discussed so far I'd like to get clarification if it
> > > even help for built-in USB / thunderbolt. Does it? If so how? What could
> > > userspace do with this information if the driver is already built-in?
> >
> > We are not trying to stop the module from being loaded (which is
> > always the case for built-in modules) and in fact it is possible to
> > have devices already using the module and still not authorize (and by
> > extension probe the module for) newly connected devices.
> >
> > For example someone might have an unattended computer downloading
> > installation media to a USB drive. Presumably this computer would be
> > locked to make it more difficult for a bad actor to access the
> > computer. Since USB storage devices are not needed to interact with
> > the lock screen, we can use the authorized_default sysfs attribute to
> > not allow new USB devices to probe modules by default and have
> > USBGuard vet the devices. Mice, keyboards, etc can be allowed so that
> > the lock screen can still be used (this important in cases like
> > suspend+resume or docks).
>
> I see thanks!
>
> > > > On the flip side, if we are going to the effort to make this a generic
> > > > solution that covers everything, the built-in aliases are only as
> > > > useful as they are complete, so we would want everything that defines
> > > > a device table to call the macro correctly.
> > >
> > > It is the ambiguity which is terrible to add. If the only use case is
> > > for USB and Thunderbolt then we can spell it out, then only those driver
> > > developers would care to consider it if the driver is bool. And, a
> > > respective tooling would scrape only those drivers to verify if the
> > > table is missing for built-in too.
> >
> > I was aiming to write it so that it wouldn't easily become obsolete by
> > later changes, so tying it to the authorized and authorized_default
> > sysfs attributes is probably the ideal deciding factor and listing USB
> > and thunderbolt as examples makes sense.
>
> I think it would make sense then to be explicit about this for now, even
> if it seems we can obsolete this. Right now the justification for having
> this for built-in is *very* specific to this feature for USB, which
> makes use of special USB sysfs attributes which as you explained, allows
> to restrict probe of devices even though the respective driver is already
> loaded.

The thing we might obsolete is limiting it to just the USB subsystem.
I am fine with expanding the documentation and limiting the scope of
the feature to USB/thunderbolt for now.

>
> > There are sysfs attributes called  authorized and authorized_default
> > that together can prevent devices from being fully enumerated and
> > probed.
>
> Although these attributes are USB specfic today it gets me wondering if
> other subsystems may benefit from a similar feature.

The subsystems that would likely benefit the most are ones that are
externally reachable. The external ports that come to mind are USB /
thunderbolt, firewire, PCMCIA / expresscard, eSATA, serial and
parallel ports. Supporting PCMCIA / expresscard seems like it would
require adding the authorized sysfs attribute to pci. eSATA would be
covered by ata.

>
> > authorized_default gets set to 0 for the hub and any devices
> > connected after that will show in sysfs, but not fully enumerate or
> > probe until the device's authorized attribute is set to 1. There are
> > some edge cases like internal devices which have some extra
> > complexity.
> >
> > As for documentation, I wasn't able to find much other than:
> > https://github.com/torvalds/linux/blob/v6.1/drivers/usb/core/hcd.c#L370
> > /* authorized_default behaviour:
> > * -1 is authorized for all devices except wireless (old behaviour)
> > * 0 is unauthorized for all devices
> > * 1 is authorized for all devices
> > * 2 is authorized for internal devices
> > */
> > ...
> > and
> > https://github.com/torvalds/linux/blob/v6.1/Documentation/admin-guide/kernel-parameters.txt#L6424
> > usbcore.authorized_default=
> >    [USB] Default USB device authorization:
> >    (default -1 = authorized except for wireless USB,
> >    0 = not authorized, 1 = authorized, 2 = authorized
> >    if device connected to internal port)
> > ...
> > The feature looks like it was originally introduced for wireless USB in:
> > https://www.mail-archive.com/linux-usb-devel@lists.sourceforge.net/msg54289.html
> > and later adapted for use cases like USBGuard here:
> > https://github.com/torvalds/linux/commit/c4fc2342cb611f945fa468e742759e25984005ad
>
> Thanks for digging all this up. Can you extend the docs on
> Documentation/driver-api/usb/ somewhere about this attribute as part of
> your changes so its clear the motivation, *then* you make your changes.
> The documentation for MODULE_DEVICE_TABLE() can just say:
>
> The only use-case for built-in drivers today is enable userspace to
> prevent / allow probe for devices on certain subsystems even if the
> driver is already loaded. An example is the USB subsystem with its
> authorized_default sysfs attribute. For more details refer to the
> kernel's Documentation for USB about authorized_default.
>
> That should be clear enough for both USB driver writers and others.
>
> Please also extend the docs for MODULE_DEVICE_TABLE() on
> Documentation/driver-api/usb/writing_usb_driver.rst or where you see
> fit for your changes. That can go into depth about the USBGuard stuff.
>
>   Luis

How do you feel about only having one version of the macro for both
cases and merging the documentation so things are kept simple? Here is
what I have locally for the macro without the ifdef and the updated
documentation:

/*
 * Creates an alias so file2alias.c can find device table.
 *
 * Use this in cases where a device table is used to match devices because it
 * surfaces match-id based module aliases to userspace for:
 *   - Automatic module loading through modules.alias.
 *   - Tools like USBGuard which allow or block devices based on policy such as
 *     which modules match a device.
 *
 * The only use-case for built-in drivers today is enable userspace to prevent /
 * allow probe for devices on certain subsystems even if the driver is already
 * loaded. An example is the USB subsystem with its authorized_default sysfs
 * attribute. For more details refer to the kernel's Documentation for USB about
 * authorized_default.
 *
 * The module name is included in the alias for two reasons:
 *   - It avoids creating two aliases with the same name for built-in modules.
 *     Historically MODULE_DEVICE_TABLE was a no-op for built-in modules, so
 *     there was nothing to stop different modules from having the same device
 *     table name and consequently the same alias when building as a module.
 *   - The module name is needed by files2alias.c to associate a particular
 *     device table with its associated module for built-in modules since
 *     files2alias would otherwise see the module name as `vmlinuz.o`.
 */
#define MODULE_DEVICE_TABLE(type, name) \
extern void *CONCATENATE( \
CONCATENATE(__mod_##type##__##name##__, \
__KBUILD_MODNAME), \
_device_table) \
__attribute__ ((unused, alias(__stringify(name))))


Here is a draft version for an updated to
Documentation/driver-api/usb/ (I will add the 80 char line breaks
later) in case you have feedback:


# Authorization

Authorization provides userspace a way to allow or block configuring
devices early during enumeration before any modules are probed for the
device. While it is possible to block a device by not loading the
required modules, this also prevents other devices from using the
module as well. For example someone might have an unattended computer
downloading installation media to a USB drive. Presumably this
computer would be locked to make it more difficult for a bad actor to
access the computer. Since USB storage devices are not needed to
interact with the lock screen, the authorized_default sysfs attribute
can be set to not authorize new USB devices by default. A userspace
tool like USBGuard can then vet the devices. Mice, keyboards, etc can
be allowed by writing to their authorized sysfs attribute so that the
lock screen can still be used (this important in cases like
suspend+resume or docks) while other devices can be blocked as long as
the lock screen is shown.

## Sysfs Attributes

Userspace can control USB device authorization through the
authorized_default and authorized sysfs attributes.

### authorized_default

.. kernel-doc:: drivers/usb/core/hcd.c
   :export:

The authorized_default sysfs attribute is only present for host
controllers. It determines the initial state of the authorized sysfs
attribute of USB devices newly connected to the corresponding host
controller. It can take on the following values:

+---------------------------------------------------+
| Value | Behavior                                  |
+=======+===========================================+
|    -1 | Authorize all devices except wireless USB |
+-------+-------------------------------------------+
|     0 | Do not authorize new devices              |
+-------+-------------------------------------------+
|     1 | Authorize new devices                     |
+-------+-------------------------------------------+
|     2 | Authorize new internal devices only       |
+---------------------------------------------------+

Note that firmware platform code determines if a device is internal or
not and this is reported as the connect_type sysfs attribute of the
USB port. This is currently supported by ACPI, but device tree still
needs an implementation. Authorizing new internal devices only can be
useful to work around issues with devices that misbehave if there are
delays in probing their module.

### authorized

.. kernel-doc:: drivers/usb/core/sysfs.c
   :export:

Every USB device has an authorized sysfs attribute which can take the
values 0 and 1. When authorized is 0, the device still is present in
sysfs, but none of its interfaces can be associated with drivers and
modules will not be probed. When authorized is 1 (or set to one) a
configuration is chosen for the device and its interfaces are
registered allowing drivers to bind to them.