scsi: NCR5380: Prevent potential out-of-bounds read in spi_print_msg()

Message ID 20250430115926.6335-1-rand.sec96@gmail.com
State New
Series scsi: NCR5380: Prevent potential out-of-bounds read in spi_print_msg()

Commit Message

Rand Deeb April 30, 2025, 11:59 a.m. UTC
spi_print_msg() assumes that the input buffer is large enough to
contain the full SCSI message, including extended messages which may
access msg[2], msg[3], msg[7], and beyond based on message type.

NCR5380_reselect() currently allocates a 3-byte buffer for 'msg'
and reads only a single byte from the SCSI bus before passing it to
spi_print_msg(), which can result in a potential out-of-bounds read
if the message is malformed or declares a longer length.

This patch increases the buffer size to 16 bytes and reads up to
16 bytes from the SCSI bus. A length check is also added to ensure
the message is well-formed before passing it to spi_print_msg().

This ensures safe handling of all valid SCSI messages and prevents
undefined behavior due to malformed or malicious input.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Rand Deeb <rand.sec96@gmail.com>
---
 drivers/scsi/NCR5380.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

Comments

James Bottomley April 30, 2025, 12:59 p.m. UTC | #1
On Wed, 2025-04-30 at 14:59 +0300, Rand Deeb wrote:
> spi_print_msg() assumes that the input buffer is large enough to
> contain the full SCSI message, including extended messages which may
> access msg[2], msg[3], msg[7], and beyond based on message type.

That's true because it's a generic function designed to work for all
parallel cards.  However, this card is only a narrow non-HVD low frequency
one, so it only really speaks a tiny subset of this (in particular it
would never speak messages over 3 bytes).

> NCR5380_reselect() currently allocates a 3-byte buffer for 'msg'
> and reads only a single byte from the SCSI bus before passing it to
> spi_print_msg(), which can result in a potential out-of-bounds read
> if the message is malformed or declares a longer length.

The reselect protocol *requires* the next message to be an identify.
Since these cards and the devices they attach to are all decades old, I
think if they were going to behave like this we'd have seen it by now.

The bottom line is we don't add this type of thing to a device facing
interface unless there's evidence of an actual negotiation problem.

[...] 
> @@ -2084,7 +2084,7 @@ static void NCR5380_reselect(struct Scsi_Host
> *instance)
>  	msg[0] = NCR5380_read(CURRENT_SCSI_DATA_REG);
>  #else
>  	{
> -		int len = 1;
> +		int len = sizeof(msg);

You didn't test this, did you?  The above code instructs the card to
wait for 16 bytes on reselection and abort if they aren't found ...
i.e. every reselection now aborts because the device is only sending a
one byte message.

Regards,

James
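
To illustrate the point about the read length, here is a minimal userspace sketch. It is not the driver code: toy_target, toy_transfer() and toy_reselect() are hypothetical names, and the model returns immediately where real hardware would wait for bytes that never arrive. It only assumes what the review states, namely that the transfer is asked for a fixed number of bytes and that any bytes left untransferred lead to an abort.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a target that offers `avail` message bytes. */
struct toy_target {
	const unsigned char *bytes;
	size_t avail;
};

/*
 * Toy model of a programmed-I/O transfer: copy up to *len bytes and leave
 * the number of bytes that never arrived in *len.
 */
static void toy_transfer(struct toy_target *t, unsigned char *dst, int *len)
{
	assert(t && dst && len && *len >= 0);
	while (*len > 0 && t->avail > 0) {
		*dst++ = *t->bytes++;
		t->avail--;
		(*len)--;
	}
}

/* Returns 1 if reselection would proceed, 0 if it would be aborted. */
static int toy_reselect(int requested, unsigned char *msg)
{
	static const unsigned char identify[] = { 0x80 }; /* IDENTIFY, LUN 0 */
	struct toy_target t = { identify, sizeof(identify) };
	int len = requested;

	toy_transfer(&t, msg, &len);
	return len == 0; /* a nonzero residual is treated as failure */
}
```

With a one-byte request the IDENTIFY byte is consumed and the residual is zero. With a 16-byte request against the same one-byte message, 15 bytes remain outstanding, so in this model every reselection fails.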

Finn Thain May 1, 2025, 3:40 a.m. UTC | #2
On Wed, 30 Apr 2025, Rand Deeb wrote:

> spi_print_msg() assumes that the input buffer is large enough to
> contain the full SCSI message, including extended messages which may
> access msg[2], msg[3], msg[7], and beyond based on message type.
> 
> NCR5380_reselect() currently allocates a 3-byte buffer for 'msg'
> and reads only a single byte from the SCSI bus before passing it to
> spi_print_msg(), which can result in a potential out-of-bounds read
> if the message is malformed or declares a longer length.
> 
> This patch increases the buffer size to 16 bytes and reads up to
> 16 bytes from the SCSI bus. A length check is also added to ensure
> the message is well-formed before passing it to spi_print_msg().
> 
> This ensures safe handling of all valid SCSI messages and prevents
> undefined behavior due to malformed or malicious input.
> 
> Found by Linux Verification Center (linuxtesting.org) with SVACE.
> 

I happen to agree with James that there is no value in trying to defend 
against hostile SPI controllers, buses and targets. But I see a lot of 
value in static checking so I'm not against removing theoretical issues 
from the code if it makes static checking easier.

AFAIK the error path in question doesn't get executed in practice, like 
James said. So you could drop the spi_print_msg() call in favour of this:

shost_printk(KERN_ERR, instance,
             "expecting IDENTIFY message, got 0x%02x\n", msg[0]);

But it's not clear to me that you can sidestep the API issue that way. Do 
the other callers of spi_print_msg() not have the same issue?
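
The API question can be made concrete with a small sketch. spi_print_msg() is passed only the buffer, so a caller that wants a bound has to work out the claimed message length itself. The helpers below are hypothetical (toy_claimed_msg_len() and toy_msg_fits() do not exist in the kernel) and are offered only as one possible shape for such a check. The extended-message rule, 2 + msg[1] with a length byte of 0 meaning 256, is the one used in the patch under review; the two-byte range 0x20 to 0x2f follows the SCSI-2 message table.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_EXTENDED_MESSAGE 0x01 /* SCSI EXTENDED MESSAGE code */

/*
 * Hypothetical helper: number of bytes the message claims to occupy,
 * or 0 if that cannot be determined from the bytes available.
 */
static size_t toy_claimed_msg_len(const unsigned char *msg, size_t avail)
{
	assert(msg != NULL);
	if (avail == 0)
		return 0;
	if (msg[0] == TOY_EXTENDED_MESSAGE) {
		size_t len;

		if (avail < 2)
			return 0; /* length byte not available */
		len = 2 + (size_t)msg[1];
		if (len == 2)
			len += 256; /* a length byte of 0 means 256 */
		return len;
	}
	if (msg[0] >= 0x20 && msg[0] <= 0x2f)
		return 2; /* two-byte messages */
	return 1; /* IDENTIFY and other one-byte messages */
}

/* Returns 1 if the whole claimed message lies within the first `avail` bytes. */
static int toy_msg_fits(const unsigned char *msg, size_t avail)
{
	size_t claimed = toy_claimed_msg_len(msg, avail);

	return claimed != 0 && claimed <= avail;
}
```

A caller that has received only one byte would get 0 from toy_msg_fits() for any extended or two-byte message and could fall back to printing msg[0] in hex, as suggested above.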

Rand Deeb May 5, 2025, 5 a.m. UTC | #3
On Wed, Apr 30, 2025 at 3:59 PM James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
>
> On Wed, 2025-04-30 at 14:59 +0300, Rand Deeb wrote:
> > spi_print_msg() assumes that the input buffer is large enough to
> > contain the full SCSI message, including extended messages which may
> > access msg[2], msg[3], msg[7], and beyond based on message type.
>
> That's true because it's a generic function designed to work for all
> parallel cards.  However, this card is only a narrow non-HVD low frequency
> one, so it only really speaks a tiny subset of this (in particular it
> would never speak messages over 3 bytes).

Thank you for clarifying this. I wasn’t aware that the NCR5380 is so
strictly limited in terms of message support. I assumed a more generic
scenario when applying defensive checks, without considering the practical
behavior of this specific hardware.

> > NCR5380_reselect() currently allocates a 3-byte buffer for 'msg'
> > and reads only a single byte from the SCSI bus before passing it to
> > spi_print_msg(), which can result in a potential out-of-bounds read
> > if the message is malformed or declares a longer length.

That makes sense. My initial assumption was that, even if unlikely, a
malformed or non-compliant message could theoretically appear. But I now
see that this isn’t realistic for these devices, and no evidence suggests
this has ever occurred in the field.

> The reselect protocol *requires* the next message to be an identify.
> Since these cards and the devices they attach to are all decades old, I
> think if they were going to behave like this we'd have seen it by now.
>
> The bottom line is we don't add this type of thing to a device facing
> interface unless there's evidence of an actual negotiation problem.

Understood and I agree. Defensive programming without a known issue on
hardware-level interfaces introduces unnecessary complexity.

> You didn't test this, did you?  The above code instructs the card to
> wait for 16 bytes on reselection and abort if they aren't found ...

You’re absolutely right. I misjudged the effect of changing the read length.

Given this, I’ll drop the patch entirely, as there’s no actual problem to
fix. The intent was only to silence static analysis tools, but I now
realize this isn’t a valid justification for modifying a stable hardware-
facing path.

Thanks again for the review and your insights.

Best regards,
Rand Deeb


kernel test robot May 7, 2025, 7:31 a.m. UTC | #4
Hi Rand,

kernel test robot noticed the following build errors:

[auto build test ERROR on jejb-scsi/for-next]
[also build test ERROR on mkp-scsi/for-next linus/master v6.15-rc5 next-20250506]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Rand-Deeb/scsi-NCR5380-Prevent-potential-out-of-bounds-read-in-spi_print_msg/20250430-200221
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
patch link:    https://lore.kernel.org/r/20250430115926.6335-1-rand.sec96%40gmail.com
patch subject: [PATCH] scsi: NCR5380: Prevent potential out-of-bounds read in spi_print_msg()
config: alpha-randconfig-r072-20250501 (https://download.01.org/0day-ci/archive/20250507/202505071504.SVF8vs1h-lkp@intel.com/config)
compiler: alpha-linux-gcc (GCC) 11.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250507/202505071504.SVF8vs1h-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202505071504.SVF8vs1h-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from drivers/scsi/g_NCR5380.c:691:
   drivers/scsi/NCR5380.c: In function 'NCR5380_reselect':
>> drivers/scsi/NCR5380.c:2107:51: error: 'len' undeclared (first use in this function); did you mean 'lun'?
    2107 |                 if (msg[0] == EXTENDED_MESSAGE && len >= 3) {
         |                                                   ^~~
         |                                                   lun
   drivers/scsi/NCR5380.c:2107:51: note: each undeclared identifier is reported only once for each function it appears in


vim +2107 drivers/scsi/NCR5380.c

  2099	
  2100		if (!(msg[0] & 0x80)) {
  2101			shost_printk(KERN_ERR, instance, "expecting IDENTIFY message, got ");
  2102	
  2103			/*
  2104			 * Defensive check before calling spi_print_msg():
  2105			 * Avoid buffer overrun if msg claims extended length.
  2106			 */
> 2107			if (msg[0] == EXTENDED_MESSAGE && len >= 3) {
  2108				int expected_len = 2 + msg[1];
  2109	
  2110				if (expected_len == 2)
  2111					expected_len += 256;
  2112	
  2113				if (len >= expected_len)
  2114					spi_print_msg(msg);
  2115				else
  2116					pr_warn("spi_print_msg: skipping malformed extended message (len=%d, expected=%d)\n",
  2117						len, expected_len);
  2118			} else {
  2119				spi_print_msg(msg);
  2120			}
  2121	
  2122			printk("\n");
  2123			do_abort(instance, 0);
  2124			return;
  2125		}
  2126		lun = msg[0] & 0x07;
  2127	
  2128		/*
  2129		 * We need to add code for SCSI-II to track which devices have
  2130		 * I_T_L_Q nexuses established, and which have simple I_T_L
  2131		 * nexuses so we can chose to do additional data transfer.
  2132		 */
  2133	
  2134		/*
  2135		 * Find the command corresponding to the I_T_L or I_T_L_Q  nexus we
  2136		 * just reestablished, and remove it from the disconnected queue.
  2137		 */
  2138	
  2139		tmp = NULL;
  2140		list_for_each_entry(ncmd, &hostdata->disconnected, list) {
  2141			struct scsi_cmnd *cmd = NCR5380_to_scmd(ncmd);
  2142	
  2143			if (target_mask == (1 << scmd_id(cmd)) &&
  2144			    lun == (u8)cmd->device->lun) {
  2145				list_del(&ncmd->list);
  2146				tmp = cmd;
  2147				break;
  2148			}
  2149		}
  2150	
  2151		if (tmp) {
  2152			dsprintk(NDEBUG_RESELECTION | NDEBUG_QUEUES, instance,
  2153			         "reselect: removed %p from disconnected queue\n", tmp);
  2154		} else {
  2155			int target = ffs(target_mask) - 1;
  2156	
  2157			shost_printk(KERN_ERR, instance, "target bitmask 0x%02x lun %d not in disconnected queue.\n",
  2158			             target_mask, lun);
  2159			/*
  2160			 * Since we have an established nexus that we can't do anything
  2161			 * with, we must abort it.
  2162			 */
  2163			if (do_abort(instance, 0) == 0)
  2164				hostdata->busy[target] &= ~(1 << lun);
  2165			return;
  2166		}
  2167
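
The 'len' undeclared error reported above is consistent with the hunk at line 2084 of the patch, where len is declared inside a braced block in the #else branch and so is no longer in scope at line 2107. A minimal sketch of the scoping rule follows; toy_receive_identify() is a hypothetical function, not driver code. It shows a count kept at function scope surviving the inner block while len does not.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the message-in step: copies at most one byte
 * from `bus` into `msg` and returns how many bytes were received.
 */
static int toy_receive_identify(const unsigned char *bus, size_t bus_len,
				unsigned char *msg)
{
	int received = 0; /* function scope: still visible after the block */

	assert(bus != NULL && msg != NULL);
	{
		int len = 1; /* block scope: ends at the closing brace below */

		if (bus_len > 0) {
			msg[0] = bus[0];
			len--;
		}
		received = 1 - len;
	}
	/* Naming `len` here would fail to compile; `received` is fine. */
	return received;
}
```

Note that in the driver the other preprocessor branch reads msg[0] directly and declares no len at all, so any later check on a byte count would need a variable that both branches set.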

Patch

diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index 0e10502660de..2d2a1244af62 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -2026,7 +2026,7 @@  static void NCR5380_reselect(struct Scsi_Host *instance)
 	struct NCR5380_hostdata *hostdata = shost_priv(instance);
 	unsigned char target_mask;
 	unsigned char lun;
-	unsigned char msg[3];
+	unsigned char msg[16];
 	struct NCR5380_cmd *ncmd;
 	struct scsi_cmnd *tmp;
 
@@ -2084,7 +2084,7 @@  static void NCR5380_reselect(struct Scsi_Host *instance)
 	msg[0] = NCR5380_read(CURRENT_SCSI_DATA_REG);
 #else
 	{
-		int len = 1;
+		int len = sizeof(msg);
 		unsigned char *data = msg;
 		unsigned char phase = PHASE_MSGIN;
 
@@ -2099,7 +2099,26 @@  static void NCR5380_reselect(struct Scsi_Host *instance)
 
 	if (!(msg[0] & 0x80)) {
 		shost_printk(KERN_ERR, instance, "expecting IDENTIFY message, got ");
-		spi_print_msg(msg);
+
+		/*
+		 * Defensive check before calling spi_print_msg():
+		 * Avoid buffer overrun if msg claims extended length.
+		 */
+		if (msg[0] == EXTENDED_MESSAGE && len >= 3) {
+			int expected_len = 2 + msg[1];
+
+			if (expected_len == 2)
+				expected_len += 256;
+
+			if (len >= expected_len)
+				spi_print_msg(msg);
+			else
+				pr_warn("spi_print_msg: skipping malformed extended message (len=%d, expected=%d)\n",
+					len, expected_len);
+		} else {
+			spi_print_msg(msg);
+		}
+
 		printk("\n");
 		do_abort(instance, 0);
 		return;