diff mbox series

[03/10] scsi: NCR5380: Replace snprintf() with the safer scnprintf() variant

Message ID 20240208084512.3803250-4-lee@kernel.org
State New
Headers show
Series scsi: Replace {v}snprintf() variants with safer alternatives | expand

Commit Message

Lee Jones Feb. 8, 2024, 8:44 a.m. UTC
There is a general misunderstanding amongst engineers that {v}snprintf()
returns the length of the data *actually* encoded into the destination
array.  However, as per the C99 standard {v}snprintf() really returns
the length of the data that *would have been* written if there were
enough space for it.  This misunderstanding has led to buffer-overruns
in the past.  It's generally considered safer to use the {v}scnprintf()
variants in their place (or even sprintf() in simple cases).  So let's
do that.

Link: https://lwn.net/Articles/69419/
Link: https://github.com/KSPP/linux/issues/105
Signed-off-by: Lee Jones <lee@kernel.org>
---
Cc: Finn Thain <fthain@linux-m68k.org>
Cc: Michael Schmitz <schmitzmic@gmail.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: drew@colorado.edu
Cc: Tnx to <Thomas_Roesch@m2.maus.de>
Cc: linux-scsi@vger.kernel.org
---
 drivers/scsi/NCR5380.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

Comments

Geert Uytterhoeven Feb. 8, 2024, 10:22 a.m. UTC | #1
Hi Lee,

Thanks for your patch!

On Thu, Feb 8, 2024 at 9:48 AM Lee Jones <lee@kernel.org> wrote:
> There is a general misunderstanding amongst engineers that {v}snprintf()
> returns the length of the data *actually* encoded into the destination
> array.  However, as per the C99 standard {v}snprintf() really returns
> the length of the data that *would have been* written if there were
> enough space for it.  This misunderstanding has led to buffer-overruns
> in the past.  It's generally considered safer to use the {v}scnprintf()
> variants in their place (or even sprintf() in simple cases).  So let's
> do that.

Confused... The return value is not used at all?

> --- a/drivers/scsi/NCR5380.c
> +++ b/drivers/scsi/NCR5380.c
> @@ -421,14 +421,14 @@ static int NCR5380_init(struct Scsi_Host *instance, int flags)
>         if (!hostdata->work_q)
>                 return -ENOMEM;
>
> -       snprintf(hostdata->info, sizeof(hostdata->info),
> -               "%s, irq %d, io_port 0x%lx, base 0x%lx, can_queue %d, cmd_per_lun %d, sg_tablesize %d, this_id %d, flags { %s%s%s}",
> -               instance->hostt->name, instance->irq, hostdata->io_port,
> -               hostdata->base, instance->can_queue, instance->cmd_per_lun,
> -               instance->sg_tablesize, instance->this_id,
> -               hostdata->flags & FLAG_DMA_FIXUP     ? "DMA_FIXUP "     : "",
> -               hostdata->flags & FLAG_NO_PSEUDO_DMA ? "NO_PSEUDO_DMA " : "",
> -               hostdata->flags & FLAG_TOSHIBA_DELAY ? "TOSHIBA_DELAY " : "");
> +       scnprintf(hostdata->info, sizeof(hostdata->info),
> +                "%s, irq %d, io_port 0x%lx, base 0x%lx, can_queue %d, cmd_per_lun %d, sg_tablesize %d, this_id %d, flags { %s%s%s}",
> +                instance->hostt->name, instance->irq, hostdata->io_port,
> +                hostdata->base, instance->can_queue, instance->cmd_per_lun,
> +                instance->sg_tablesize, instance->this_id,
> +                hostdata->flags & FLAG_DMA_FIXUP     ? "DMA_FIXUP "     : "",
> +                hostdata->flags & FLAG_NO_PSEUDO_DMA ? "NO_PSEUDO_DMA " : "",
> +                hostdata->flags & FLAG_TOSHIBA_DELAY ? "TOSHIBA_DELAY " : "");
>
>         NCR5380_write(INITIATOR_COMMAND_REG, ICR_BASE);
>         NCR5380_write(MODE_REG, MR_BASE);

Gr{oetje,eeting}s,

                        Geert
Lee Jones Feb. 8, 2024, 10:29 a.m. UTC | #2
On Thu, 08 Feb 2024, Geert Uytterhoeven wrote:

> Hi Lee,
> 
> Thanks for your patch!
> 
> On Thu, Feb 8, 2024 at 9:48 AM Lee Jones <lee@kernel.org> wrote:
> > There is a general misunderstanding amongst engineers that {v}snprintf()
> > returns the length of the data *actually* encoded into the destination
> > array.  However, as per the C99 standard {v}snprintf() really returns
> > the length of the data that *would have been* written if there were
> > enough space for it.  This misunderstanding has led to buffer-overruns
> > in the past.  It's generally considered safer to use the {v}scnprintf()
> > variants in their place (or even sprintf() in simple cases).  So let's
> > do that.
> 
> Confused... The return value is not used at all?

Future proofing.  The idea of the effort is to rid the use entirely.

 - Usage is inside a sysfs handler passing PAGE_SIZE as the size
   - s/snprintf/sysfs_emit/
 - Usage is inside a sysfs handler passing a bespoke value as the size
   - s/snprintf/scnprintf/
 - Return value used, but does *not* care about overflow
   - s/snprintf/scnprintf/
 - Return value used, caller *does* care about overflow
   - s/snprintf/seq_buf/
 - Return value not used
   - s/snprintf/scnprintf/

This is the final case.

> > --- a/drivers/scsi/NCR5380.c
> > +++ b/drivers/scsi/NCR5380.c
> > @@ -421,14 +421,14 @@ static int NCR5380_init(struct Scsi_Host *instance, int flags)
> >         if (!hostdata->work_q)
> >                 return -ENOMEM;
> >
> > -       snprintf(hostdata->info, sizeof(hostdata->info),
> > -               "%s, irq %d, io_port 0x%lx, base 0x%lx, can_queue %d, cmd_per_lun %d, sg_tablesize %d, this_id %d, flags { %s%s%s}",
> > -               instance->hostt->name, instance->irq, hostdata->io_port,
> > -               hostdata->base, instance->can_queue, instance->cmd_per_lun,
> > -               instance->sg_tablesize, instance->this_id,
> > -               hostdata->flags & FLAG_DMA_FIXUP     ? "DMA_FIXUP "     : "",
> > -               hostdata->flags & FLAG_NO_PSEUDO_DMA ? "NO_PSEUDO_DMA " : "",
> > -               hostdata->flags & FLAG_TOSHIBA_DELAY ? "TOSHIBA_DELAY " : "");
> > +       scnprintf(hostdata->info, sizeof(hostdata->info),
> > +                "%s, irq %d, io_port 0x%lx, base 0x%lx, can_queue %d, cmd_per_lun %d, sg_tablesize %d, this_id %d, flags { %s%s%s}",
> > +                instance->hostt->name, instance->irq, hostdata->io_port,
> > +                hostdata->base, instance->can_queue, instance->cmd_per_lun,
> > +                instance->sg_tablesize, instance->this_id,
> > +                hostdata->flags & FLAG_DMA_FIXUP     ? "DMA_FIXUP "     : "",
> > +                hostdata->flags & FLAG_NO_PSEUDO_DMA ? "NO_PSEUDO_DMA " : "",
> > +                hostdata->flags & FLAG_TOSHIBA_DELAY ? "TOSHIBA_DELAY " : "");
> >
> >         NCR5380_write(INITIATOR_COMMAND_REG, ICR_BASE);
> >         NCR5380_write(MODE_REG, MR_BASE);
Kees Cook Feb. 10, 2024, 7:14 a.m. UTC | #3
On Thu, Feb 08, 2024 at 08:44:15AM +0000, Lee Jones wrote:
> There is a general misunderstanding amongst engineers that {v}snprintf()
> returns the length of the data *actually* encoded into the destination
> array.  However, as per the C99 standard {v}snprintf() really returns
> the length of the data that *would have been* written if there were
> enough space for it.  This misunderstanding has led to buffer-overruns
> in the past.  It's generally considered safer to use the {v}scnprintf()
> variants in their place (or even sprintf() in simple cases).  So let's
> do that.
> 
> Link: https://lwn.net/Articles/69419/
> Link: https://github.com/KSPP/linux/issues/105
> Signed-off-by: Lee Jones <lee@kernel.org>

Reviewed-by: Kees Cook <keescook@chromium.org>
Finn Thain Feb. 10, 2024, 9:32 a.m. UTC | #4
On Thu, 8 Feb 2024, Lee Jones wrote:

> On Thu, 08 Feb 2024, Geert Uytterhoeven wrote:
> 
> > 
> > Confused... The return value is not used at all?
> 
> Future proofing. 
> 

Surely a better way to prevent potential future API abuse is by adding 
checkpatch.pl rules. That way does not generate churn.

James or Martin, if you can find some value in this patch, go ahead and 
apply it. I'm afraid I can't see it.
James Bottomley Feb. 10, 2024, 12:56 p.m. UTC | #5
On Thu, 2024-02-08 at 10:29 +0000, Lee Jones wrote:
> On Thu, 08 Feb 2024, Geert Uytterhoeven wrote:
> 
> > Hi Lee,
> > 
> > Thanks for your patch!
> > 
> > On Thu, Feb 8, 2024 at 9:48 AM Lee Jones <lee@kernel.org> wrote:
> > > There is a general misunderstanding amongst engineers that
> > > {v}snprintf()
> > > returns the length of the data *actually* encoded into the
> > > destination
> > > array.  However, as per the C99 standard {v}snprintf() really
> > > returns
> > > the length of the data that *would have been* written if there
> > > were
> > > enough space for it.  This misunderstanding has led to buffer-
> > > overruns
> > > in the past.  It's generally considered safer to use the
> > > {v}scnprintf()
> > > variants in their place (or even sprintf() in simple cases).  So
> > > let's
> > > do that.
> > 
> > Confused... The return value is not used at all?
> 
> Future proofing.  The idea of the effort is to rid the use entirely.
> 
>  - Usage is inside a sysfs handler passing PAGE_SIZE as the size
>    - s/snprintf/sysfs_emit/
>  - Usage is inside a sysfs handler passing a bespoke value as the
> size
>    - s/snprintf/scnprintf/
>  - Return value used, but does *not* care about overflow
>    - s/snprintf/scnprintf/
>  - Return value used, caller *does* care about overflow
>    - s/snprintf/seq_buf/
>  - Return value not used
>    - s/snprintf/scnprintf/
> 
> This is the final case.

To re-ask Geert's question: the last case can't ever lead to a bug or
problem, what value does churning the kernel to change it provide?  As
Finn said, if we want to deprecate it as a future pattern, put it in
checkpatch.

James
Lee Jones Feb. 19, 2024, 3:23 p.m. UTC | #6
On Sat, 10 Feb 2024, James Bottomley wrote:

> On Thu, 2024-02-08 at 10:29 +0000, Lee Jones wrote:
> > On Thu, 08 Feb 2024, Geert Uytterhoeven wrote:
> > 
> > > Hi Lee,
> > > 
> > > Thanks for your patch!
> > > 
> > > On Thu, Feb 8, 2024 at 9:48 AM Lee Jones <lee@kernel.org> wrote:
> > > > There is a general misunderstanding amongst engineers that
> > > > {v}snprintf()
> > > > returns the length of the data *actually* encoded into the
> > > > destination
> > > > array.  However, as per the C99 standard {v}snprintf() really
> > > > returns
> > > > the length of the data that *would have been* written if there
> > > > were
> > > > enough space for it.  This misunderstanding has led to buffer-
> > > > overruns
> > > > in the past.  It's generally considered safer to use the
> > > > {v}scnprintf()
> > > > variants in their place (or even sprintf() in simple cases).  So
> > > > let's
> > > > do that.
> > > 
> > > Confused... The return value is not used at all?
> > 
> > Future proofing.  The idea of the effort is to rid the use entirely.
> > 
> >  - Usage is inside a sysfs handler passing PAGE_SIZE as the size
> >    - s/snprintf/sysfs_emit/
> >  - Usage is inside a sysfs handler passing a bespoke value as the
> > size
> >    - s/snprintf/scnprintf/
> >  - Return value used, but does *not* care about overflow
> >    - s/snprintf/scnprintf/
> >  - Return value used, caller *does* care about overflow
> >    - s/snprintf/seq_buf/
> >  - Return value not used
> >    - s/snprintf/scnprintf/
> > 
> > This is the final case.
> 
> To re-ask Geert's question: the last case can't ever lead to a bug or
> problem, what value does churning the kernel to change it provide?  As
> Finn said, if we want to deprecate it as a future pattern, put it in
> checkpatch.

Adding this to checkpatch is a good idea.

What if we also take Kees's suggestion and hit all of these found in
SCSI in one patch to keep the churn down to a minimum?
James Bottomley Feb. 19, 2024, 4:25 p.m. UTC | #7
On Mon, 2024-02-19 at 15:23 +0000, Lee Jones wrote:
> On Sat, 10 Feb 2024, James Bottomley wrote:
> 
> > On Thu, 2024-02-08 at 10:29 +0000, Lee Jones wrote:
> > > On Thu, 08 Feb 2024, Geert Uytterhoeven wrote:
> > > 
> > > > Hi Lee,
> > > > 
> > > > Thanks for your patch!
> > > > 
> > > > On Thu, Feb 8, 2024 at 9:48 AM Lee Jones <lee@kernel.org>
> > > > wrote:
> > > > > There is a general misunderstanding amongst engineers that
> > > > > {v}snprintf() returns the length of the data *actually*
> > > > > encoded into the destination array.  However, as per the C99
> > > > > standard {v}snprintf() really returns the length of the data
> > > > > that *would have been* written if there were enough space for
> > > > > it.  This misunderstanding has led to buffer-overruns in the
> > > > > past.  It's generally considered safer to use the
> > > > > {v}scnprintf() variants in their place (or even sprintf() in
> > > > > simple cases).  So let's do that.
> > > > 
> > > > Confused... The return value is not used at all?
> > > 
> > > Future proofing.  The idea of the effort is to rid the use
> > > entirely.
> > > 
> > >  - Usage is inside a sysfs handler passing PAGE_SIZE as the size
> > >    - s/snprintf/sysfs_emit/
> > >  - Usage is inside a sysfs handler passing a bespoke value as the
> > > size
> > >    - s/snprintf/scnprintf/
> > >  - Return value used, but does *not* care about overflow
> > >    - s/snprintf/scnprintf/
> > >  - Return value used, caller *does* care about overflow
> > >    - s/snprintf/seq_buf/
> > >  - Return value not used
> > >    - s/snprintf/scnprintf/
> > > 
> > > This is the final case.
> > 
> > To re-ask Geert's question: the last case can't ever lead to a bug
> > orproblem, what value does churning the kernel to change it
> > provide? As Finn said, if we want to deprecate it as a future
> > pattern, put it in checkpatch.
> 
> Adding this to checkpatch is a good idea.
> 
> What if we also take Kees's suggestion and hit all of these found in
> SCSI in one patch to keep the churn down to a minimum?

That doesn't fix the churn problem because you're still changing the
source.  For ancient drivers, we keep the changes to a minimum to avoid
introducing inadvertent bugs which aren't discovered until months
later.  If there's no actual bug in the driver, there's no reason to
change the code.

Regards,

James
Kees Cook Feb. 19, 2024, 9:30 p.m. UTC | #8
On Mon, Feb 19, 2024 at 03:23:12PM +0000, Lee Jones wrote:
> Adding this to checkpatch is a good idea.

Yeah, please do. You can look at the "strncpy -> strscpy" check that is
already in there for an example.

> 
> What if we also take Kees's suggestion and hit all of these found in
> SCSI in one patch to keep the churn down to a minimum?

We don't have to focus on SCSI even. At the end of the next -rc1, I can
send a tree-wide patch (from Coccinelle) that'll convert all snprintf()
uses that don't check a return value into scnprintf(). For example,
this seems to do the trick:

@scnprintf depends on !(file in "tools") && !(file in "samples")@
@@

-snprintf
+scnprintf
 (...);


Results in:

 2252 files changed, 4795 insertions(+), 4795 deletions(-)

-Kees
Lee Jones Feb. 20, 2024, 8:24 a.m. UTC | #9
On Mon, 19 Feb 2024, Kees Cook wrote:

> On Mon, Feb 19, 2024 at 03:23:12PM +0000, Lee Jones wrote:
> > Adding this to checkpatch is a good idea.
> 
> Yeah, please do. You can look at the "strncpy -> strscpy" check that is
> already in there for an example.
> 
> > 
> > What if we also take Kees's suggestion and hit all of these found in
> > SCSI in one patch to keep the churn down to a minimum?
> 
> We don't have to focus on SCSI even. At the end of the next -rc1, I can

When I've conducted similar work before, I've taken it subsystem by
subsystem.  However, if you're happy to co-ordinate with the big penguin
et al. and get them all with a treewide patch, please go for it.

> send a tree-wide patch (from Coccinelle) that'll convert all snprintf()
> uses that don't check a return value into scnprintf(). For example,
> this seems to do the trick:
> 
> @scnprintf depends on !(file in "tools") && !(file in "samples")@
> @@
> 
> -snprintf
> +scnprintf
>  (...);
> 
> 
> Results in:
> 
>  2252 files changed, 4795 insertions(+), 4795 deletions(-)

Super!
Lee Jones Feb. 20, 2024, 8:28 a.m. UTC | #10
On Mon, 19 Feb 2024, James Bottomley wrote:

> On Mon, 2024-02-19 at 15:23 +0000, Lee Jones wrote:
> > On Sat, 10 Feb 2024, James Bottomley wrote:
> > 
> > > On Thu, 2024-02-08 at 10:29 +0000, Lee Jones wrote:
> > > > On Thu, 08 Feb 2024, Geert Uytterhoeven wrote:
> > > > 
> > > > > Hi Lee,
> > > > > 
> > > > > Thanks for your patch!
> > > > > 
> > > > > On Thu, Feb 8, 2024 at 9:48 AM Lee Jones <lee@kernel.org>
> > > > > wrote:
> > > > > > There is a general misunderstanding amongst engineers that
> > > > > > {v}snprintf() returns the length of the data *actually*
> > > > > > encoded into the destination array.  However, as per the C99
> > > > > > standard {v}snprintf() really returns the length of the data
> > > > > > that *would have been* written if there were enough space for
> > > > > > it.  This misunderstanding has led to buffer-overruns in the
> > > > > > past.  It's generally considered safer to use the
> > > > > > {v}scnprintf() variants in their place (or even sprintf() in
> > > > > > simple cases).  So let's do that.
> > > > > 
> > > > > Confused... The return value is not used at all?
> > > > 
> > > > Future proofing.  The idea of the effort is to rid the use
> > > > entirely.
> > > > 
> > > >  - Usage is inside a sysfs handler passing PAGE_SIZE as the size
> > > >    - s/snprintf/sysfs_emit/
> > > >  - Usage is inside a sysfs handler passing a bespoke value as the
> > > > size
> > > >    - s/snprintf/scnprintf/
> > > >  - Return value used, but does *not* care about overflow
> > > >    - s/snprintf/scnprintf/
> > > >  - Return value used, caller *does* care about overflow
> > > >    - s/snprintf/seq_buf/
> > > >  - Return value not used
> > > >    - s/snprintf/scnprintf/
> > > > 
> > > > This is the final case.
> > > 
> > > To re-ask Geert's question: the last case can't ever lead to a bug
> > > orproblem, what value does churning the kernel to change it
> > > provide? As Finn said, if we want to deprecate it as a future
> > > pattern, put it in checkpatch.
> > 
> > Adding this to checkpatch is a good idea.
> > 
> > What if we also take Kees's suggestion and hit all of these found in
> > SCSI in one patch to keep the churn down to a minimum?
> 
> That doesn't fix the churn problem because you're still changing the
> source.  For ancient drivers, we keep the changes to a minimum to avoid
> introducing inadvertent bugs which aren't discovered until months
> later.  If there's no actual bug in the driver, there's no reason to
> change the code.

Okay, no problem.  Would you like me to drop these from the set and
resubmit or are you happy to cherry-pick the remainder?
diff mbox series

Patch

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index cea3a79d538e4..ea565e843c765 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -421,14 +421,14 @@  static int NCR5380_init(struct Scsi_Host *instance, int flags)
 	if (!hostdata->work_q)
 		return -ENOMEM;
 
-	snprintf(hostdata->info, sizeof(hostdata->info),
-		"%s, irq %d, io_port 0x%lx, base 0x%lx, can_queue %d, cmd_per_lun %d, sg_tablesize %d, this_id %d, flags { %s%s%s}",
-		instance->hostt->name, instance->irq, hostdata->io_port,
-		hostdata->base, instance->can_queue, instance->cmd_per_lun,
-		instance->sg_tablesize, instance->this_id,
-		hostdata->flags & FLAG_DMA_FIXUP     ? "DMA_FIXUP "     : "",
-		hostdata->flags & FLAG_NO_PSEUDO_DMA ? "NO_PSEUDO_DMA " : "",
-		hostdata->flags & FLAG_TOSHIBA_DELAY ? "TOSHIBA_DELAY " : "");
+	scnprintf(hostdata->info, sizeof(hostdata->info),
+		 "%s, irq %d, io_port 0x%lx, base 0x%lx, can_queue %d, cmd_per_lun %d, sg_tablesize %d, this_id %d, flags { %s%s%s}",
+		 instance->hostt->name, instance->irq, hostdata->io_port,
+		 hostdata->base, instance->can_queue, instance->cmd_per_lun,
+		 instance->sg_tablesize, instance->this_id,
+		 hostdata->flags & FLAG_DMA_FIXUP     ? "DMA_FIXUP "     : "",
+		 hostdata->flags & FLAG_NO_PSEUDO_DMA ? "NO_PSEUDO_DMA " : "",
+		 hostdata->flags & FLAG_TOSHIBA_DELAY ? "TOSHIBA_DELAY " : "");
 
 	NCR5380_write(INITIATOR_COMMAND_REG, ICR_BASE);
 	NCR5380_write(MODE_REG, MR_BASE);