Message ID | 20250430115926.6335-1-rand.sec96@gmail.com |
---|---|
State | New |
Series | scsi: NCR5380: Prevent potential out-of-bounds read in spi_print_msg() |
On Wed, 2025-04-30 at 14:59 +0300, Rand Deeb wrote:
> spi_print_msg() assumes that the input buffer is large enough to
> contain the full SCSI message, including extended messages which may
> access msg[2], msg[3], msg[7], and beyond based on message type.

That's true because it's a generic function designed to work for all
parallel cards. However, this card is only a narrow non-HVD low
frequency one, so it only really speaks a tiny subset of this (in
particular it would never speak messages over 3 bytes).

> NCR5380_reselect() currently allocates a 3-byte buffer for 'msg'
> and reads only a single byte from the SCSI bus before passing it to
> spi_print_msg(), which can result in a potential out-of-bounds read
> if the message is malformed or declares a longer length.

The reselect protocol *requires* the next message to be an identify.
Since these cards and the devices they attach to are all decades old, I
think if they were going to behave like this we'd have seen it by now.

The bottom line is we don't add this type of thing to a device facing
interface unless there's evidence of an actual negotiation problem.

[...]
> @@ -2084,7 +2084,7 @@ static void NCR5380_reselect(struct Scsi_Host *instance)
>  	msg[0] = NCR5380_read(CURRENT_SCSI_DATA_REG);
>  #else
>  	{
> -		int len = 1;
> +		int len = sizeof(msg);

You didn't test this, did you? The above code instructs the card to
wait for 16 bytes on reselection and abort if they aren't found ...
i.e. every reselection now aborts because the device is only sending a
one byte message.

Regards,

James
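To illustrate James's last point, here is a minimal user-space sketch, not the driver's code. `mock_target`, `mock_transfer()` and `mock_reselect_read()` are hypothetical names invented for this example; the model only assumes that a read request fails when the target supplies fewer bytes than were asked for. The real hardware would wait on the bus before giving up, which the mock does not attempt to reproduce.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a target that offers `avail` message bytes. */
struct mock_target {
	const unsigned char *bytes;
	size_t avail;
};

/*
 * Hypothetical model of a programmed-I/O read: copies bytes while both
 * the request and the target have bytes left. Returns 0 only if the
 * whole request was satisfied, -1 if the target ran dry first.
 */
static int mock_transfer(struct mock_target *t, unsigned char *data, int *count)
{
	while (*count > 0 && t->avail > 0) {
		*data++ = *t->bytes++;
		t->avail--;
		(*count)--;
	}
	return *count == 0 ? 0 : -1;
}

/* Reselection where the target sends exactly one IDENTIFY byte (0x80). */
int mock_reselect_read(int requested)
{
	static const unsigned char identify[] = { 0x80 };
	struct mock_target t = { identify, sizeof(identify) };
	unsigned char msg[16] = { 0 };
	int len = requested;

	assert(requested >= 1 && requested <= (int)sizeof(msg));
	return mock_transfer(&t, msg, &len);
}
```

Under these assumptions, `mock_reselect_read(1)` succeeds while `mock_reselect_read(16)` fails, which mirrors the objection: asking for `sizeof(msg)` bytes turns every one-byte IDENTIFY into a failed read.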
On Wed, 30 Apr 2025, Rand Deeb wrote:
> spi_print_msg() assumes that the input buffer is large enough to
> contain the full SCSI message, including extended messages which may
> access msg[2], msg[3], msg[7], and beyond based on message type.
>
> NCR5380_reselect() currently allocates a 3-byte buffer for 'msg'
> and reads only a single byte from the SCSI bus before passing it to
> spi_print_msg(), which can result in a potential out-of-bounds read
> if the message is malformed or declares a longer length.
>
> This patch increases the buffer size to 16 bytes and reads up to
> 16 bytes from the SCSI bus. A length check is also added to ensure
> the message is well-formed before passing it to spi_print_msg().
>
> This ensures safe handling of all valid SCSI messages and prevents
> undefined behavior due to malformed or malicious input.
>
> Found by Linux Verification Center (linuxtesting.org) with SVACE.
>

I happen to agree with James that there is no value in trying to defend
against hostile SPI controllers, buses and targets.

But I see a lot of value in static checking so I'm not against removing
theoretical issues from the code if it makes static checking easier.

AFAIK the error path in question doesn't get executed in practice, like
James said. So you could drop the spi_print_msg() call in favour of this:

	shost_printk(KERN_ERR, instance,
		     "expecting IDENTIFY message, got 0x%02x\n", msg[0]);

But it's not clear to me that you can sidestep the API issue that way. Do
the other callers of spi_print_msg() not have the same issue?
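The API question raised above is that a printer taking only a pointer cannot know how many bytes are valid. As a rough sketch of what a length-aware check could look like, the helpers below work out how many bytes a message claims to occupy before anything reads past the first byte. `msg_claimed_len()` and `msg_is_complete()` are hypothetical names, not kernel functions, and the layout assumed here is the usual SCSI message convention: 0x01 introduces an extended message with a length byte (0 meaning 256), 0x20 to 0x2f are two-byte messages, and everything else, including IDENTIFY, is one byte.

```c
#include <assert.h>
#include <stddef.h>

#define MSG_EXTENDED 0x01	/* first byte of an extended message */

/*
 * Hypothetical helper: how many bytes does the message starting at
 * msg[0] claim to occupy? Returns 0 if `avail` bytes are too few to tell.
 */
size_t msg_claimed_len(const unsigned char *msg, size_t avail)
{
	assert(msg != NULL);

	if (avail < 1)
		return 0;
	if (msg[0] == MSG_EXTENDED) {
		if (avail < 2)
			return 0;
		/* An extended length byte of 0 means 256 bytes follow. */
		return 2 + (msg[1] ? msg[1] : 256);
	}
	if (msg[0] >= 0x20 && msg[0] <= 0x2f)
		return 2;	/* two-byte messages */
	return 1;		/* one-byte messages, including IDENTIFY */
}

/* Nonzero if the whole claimed message fits in the `avail` bytes read. */
int msg_is_complete(const unsigned char *msg, size_t avail)
{
	size_t need = msg_claimed_len(msg, avail);

	return need != 0 && need <= avail;
}
```

A caller holding a single received byte would pass `avail = 1`; an extended-message header then reports as incomplete rather than inviting a read of `msg[1]` and beyond.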
On Wed, Apr 30, 2025 at 3:59 PM James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
>
> On Wed, 2025-04-30 at 14:59 +0300, Rand Deeb wrote:
> > spi_print_msg() assumes that the input buffer is large enough to
> > contain the full SCSI message, including extended messages which may
> > access msg[2], msg[3], msg[7], and beyond based on message type.
>
> That's true because it's a generic function designed to work for all
> parallel card. However, this card only a narrow non-HVD low frequency
> one, so it only really speaks a tiny subset of this (in particular it
> would never speak messages over 3 bytes).

Thank you for clarifying this. I wasn’t aware that the NCR5380 is so
strictly limited in terms of message support. I assumed a more generic
scenario when applying defensive checks, without considering the
practical behavior of this specific hardware.

> > NCR5380_reselect() currently allocates a 3-byte buffer for 'msg'
> > and reads only a single byte from the SCSI bus before passing it to
> > spi_print_msg(), which can result in a potential out-of-bounds read
> > if the message is malformed or declares a longer length.

That makes sense. My initial assumption was that, even if unlikely, a
malformed or non-compliant message could theoretically appear. But I now
see that this isn’t realistic for these devices, and no evidence
suggests this has ever occurred in the field.

> The reselect protocol *requires* the next message to be an identify.
> Since these cards and the devices they attack to are all decades old, I
> think if they were going to behave like this we'd have seen it by now.
>
> The bottom line is we don't add this type of thing to a device facing
> interface unless there's evidence of an actual negotiation problem.

Understood and I agree. Defensive programming without a known issue on
hardware-level interfaces introduces unnecessary complexity.

> You didn't test this, did you? The above code instructs the card to
> wait for 16 bytes on reselection and abort if they aren't found ...

You’re absolutely right. I misjudged the effect of changing the read
length.

Given this, I’ll drop the patch entirely, as there’s no actual problem
to fix. The intent was only to silence static analysis tools, but I now
realize this isn’t a valid justification for modifying a stable
hardware-facing path.

Thanks again for the review and your insights.

Best regards,
Rand Deeb
Hi Rand,

kernel test robot noticed the following build errors:

[auto build test ERROR on jejb-scsi/for-next]
[also build test ERROR on mkp-scsi/for-next linus/master v6.15-rc5 next-20250506]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Rand-Deeb/scsi-NCR5380-Prevent-potential-out-of-bounds-read-in-spi_print_msg/20250430-200221
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
patch link:    https://lore.kernel.org/r/20250430115926.6335-1-rand.sec96%40gmail.com
patch subject: [PATCH] scsi: NCR5380: Prevent potential out-of-bounds read in spi_print_msg()
config: alpha-randconfig-r072-20250501 (https://download.01.org/0day-ci/archive/20250507/202505071504.SVF8vs1h-lkp@intel.com/config)
compiler: alpha-linux-gcc (GCC) 11.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250507/202505071504.SVF8vs1h-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202505071504.SVF8vs1h-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from drivers/scsi/g_NCR5380.c:691:
   drivers/scsi/NCR5380.c: In function 'NCR5380_reselect':
>> drivers/scsi/NCR5380.c:2107:51: error: 'len' undeclared (first use in this function); did you mean 'lun'?
    2107 |                 if (msg[0] == EXTENDED_MESSAGE && len >= 3) {
         |                                                   ^~~
         |                                                   lun
   drivers/scsi/NCR5380.c:2107:51: note: each undeclared identifier is reported only once for each function it appears in

vim +2107 drivers/scsi/NCR5380.c

  2099
  2100          if (!(msg[0] & 0x80)) {
  2101                  shost_printk(KERN_ERR, instance, "expecting IDENTIFY message, got ");
  2102
  2103                  /*
  2104                   * Defensive check before calling spi_print_msg():
  2105                   * Avoid buffer overrun if msg claims extended length.
  2106                   */
> 2107                  if (msg[0] == EXTENDED_MESSAGE && len >= 3) {
  2108                          int expected_len = 2 + msg[1];
  2109
  2110                          if (expected_len == 2)
  2111                                  expected_len += 256;
  2112
  2113                          if (len >= expected_len)
  2114                                  spi_print_msg(msg);
  2115                          else
  2116                                  pr_warn("spi_print_msg: skipping malformed extended message (len=%d, expected=%d)\n",
  2117                                          len, expected_len);
  2118                  } else {
  2119                          spi_print_msg(msg);
  2120                  }
  2121
  2122                  printk("\n");
  2123                  do_abort(instance, 0);
  2124                  return;
  2125          }
  2126          lun = msg[0] & 0x07;
  2127
  2128          /*
  2129           * We need to add code for SCSI-II to track which devices have
  2130           * I_T_L_Q nexuses established, and which have simple I_T_L
  2131           * nexuses so we can chose to do additional data transfer.
  2132           */
  2133
  2134          /*
  2135           * Find the command corresponding to the I_T_L or I_T_L_Q nexus we
  2136           * just reestablished, and remove it from the disconnected queue.
  2137           */
  2138
  2139          tmp = NULL;
  2140          list_for_each_entry(ncmd, &hostdata->disconnected, list) {
  2141                  struct scsi_cmnd *cmd = NCR5380_to_scmd(ncmd);
  2142
  2143                  if (target_mask == (1 << scmd_id(cmd)) &&
  2144                      lun == (u8)cmd->device->lun) {
  2145                          list_del(&ncmd->list);
  2146                          tmp = cmd;
  2147                          break;
  2148                  }
  2149          }
  2150
  2151          if (tmp) {
  2152                  dsprintk(NDEBUG_RESELECTION | NDEBUG_QUEUES, instance,
  2153                           "reselect: removed %p from disconnected queue\n", tmp);
  2154          } else {
  2155                  int target = ffs(target_mask) - 1;
  2156
  2157                  shost_printk(KERN_ERR, instance, "target bitmask 0x%02x lun %d not in disconnected queue.\n",
  2158                               target_mask, lun);
  2159                  /*
  2160                   * Since we have an established nexus that we can't do anything
  2161                   * with, we must abort it.
  2162                   */
  2163                  if (do_abort(instance, 0) == 0)
  2164                          hostdata->busy[target] &= ~(1 << lun);
  2165                  return;
  2166          }
  2167
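The build error comes down to block scope: in the patch, `len` is declared inside the braces that follow `#else`, so it no longer exists by the time the check at line 2107 runs. The reduction below is a hedged, stand-alone sketch of that pattern with the declaration hoisted to function scope. `extended_msg_fits()` and `inner_only` are hypothetical names, and the function is an illustration rather than a proposed fix for the driver.

```c
#include <assert.h>

#define MSG_EXTENDED 0x01

/*
 * Hypothetical reduction: returns nonzero if the message in `msg`
 * fits within the `bytes_read` bytes that were actually received.
 */
int extended_msg_fits(const unsigned char *msg, int bytes_read)
{
	int len;	/* function scope, so the later check can see it */

	{
		/* Stand-in for the inner block that performs the read. */
		int inner_only = bytes_read;	/* visible only inside these braces */

		len = inner_only;
	}
	/* Using inner_only here would fail to compile, like 'len' in the report. */

	assert(len >= 1);
	if (msg[0] == MSG_EXTENDED && len >= 2) {
		/* A length byte of 0 encodes 256 following bytes. */
		int expected_len = 2 + (msg[1] ? msg[1] : 256);

		return len >= expected_len;
	}
	/* A lone extended-message header cannot be judged complete. */
	return msg[0] != MSG_EXTENDED;
}
```

Hoisting the declaration only addresses the compile error; it does not change the objection raised earlier in the thread about requesting more than one byte on reselection.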
diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
index 0e10502660de..2d2a1244af62 100644
--- a/drivers/scsi/NCR5380.c
+++ b/drivers/scsi/NCR5380.c
@@ -2026,7 +2026,7 @@ static void NCR5380_reselect(struct Scsi_Host *instance)
 	struct NCR5380_hostdata *hostdata = shost_priv(instance);
 	unsigned char target_mask;
 	unsigned char lun;
-	unsigned char msg[3];
+	unsigned char msg[16];
 	struct NCR5380_cmd *ncmd;
 	struct scsi_cmnd *tmp;
 
@@ -2084,7 +2084,7 @@ static void NCR5380_reselect(struct Scsi_Host *instance)
 	msg[0] = NCR5380_read(CURRENT_SCSI_DATA_REG);
 #else
 	{
-		int len = 1;
+		int len = sizeof(msg);
 		unsigned char *data = msg;
 		unsigned char phase = PHASE_MSGIN;
 
@@ -2099,7 +2099,26 @@ static void NCR5380_reselect(struct Scsi_Host *instance)
 
 	if (!(msg[0] & 0x80)) {
 		shost_printk(KERN_ERR, instance, "expecting IDENTIFY message, got ");
-		spi_print_msg(msg);
+
+		/*
+		 * Defensive check before calling spi_print_msg():
+		 * Avoid buffer overrun if msg claims extended length.
+		 */
+		if (msg[0] == EXTENDED_MESSAGE && len >= 3) {
+			int expected_len = 2 + msg[1];
+
+			if (expected_len == 2)
+				expected_len += 256;
+
+			if (len >= expected_len)
+				spi_print_msg(msg);
+			else
+				pr_warn("spi_print_msg: skipping malformed extended message (len=%d, expected=%d)\n",
+					len, expected_len);
+		} else {
+			spi_print_msg(msg);
+		}
+
 		printk("\n");
 		do_abort(instance, 0);
 		return;
spi_print_msg() assumes that the input buffer is large enough to
contain the full SCSI message, including extended messages which may
access msg[2], msg[3], msg[7], and beyond based on message type.

NCR5380_reselect() currently allocates a 3-byte buffer for 'msg'
and reads only a single byte from the SCSI bus before passing it to
spi_print_msg(), which can result in a potential out-of-bounds read
if the message is malformed or declares a longer length.

This patch increases the buffer size to 16 bytes and reads up to
16 bytes from the SCSI bus. A length check is also added to ensure
the message is well-formed before passing it to spi_print_msg().

This ensures safe handling of all valid SCSI messages and prevents
undefined behavior due to malformed or malicious input.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Rand Deeb <rand.sec96@gmail.com>
---
 drivers/scsi/NCR5380.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)