Message ID | 20240205074650.200304-1-quic_kriskura@quicinc.com |
---|---|
State | New |
Headers | show |
Series | [v3] usb: gadget: ncm: Avoid dropping datagrams of properly parsed NTBs | expand |
Hi everyone, I recently switch to f_ncm with Windows 10 since rndis 's safety issue. (the Windows 10 driver version is 10.0.19041.1 2009/4/21) It seems Windows 10 ncm driver won't send ZLP to let udc properly separate the skbs. On Mon, 5 Feb 2024 13:16:50 +0530 Krishna Kurapati wrote: > According to Windows driver, no ZLP is needed if wBlockLength is non-zero, > because the non-zero wBlockLength has already told the function side the > size of transfer to be expected. However, there are in-market NCM devices > that rely on ZLP as long as the wBlockLength is multiple of wMaxPacketSize. > To deal with such devices, it pads an extra 0 at end so the transfer is no > longer multiple of wMaxPacketSize. I do the iperf3 testing cause gadget constantly report similar error after a litle modification to get more concrete info: [ 174] configfs-gadget.0: to process=512, so go to find second NTH from: 15872 [ 174] FIND NEXT NTH HEAD:000000006c26a12c: 6e 63 6d 68 10 00 86 16 b0 3b 00 00 48 3b 00 00 00 00 52 34 fc 5f 90 fd ca 40 c1 f4 f4 6e 08 00 [ 174] configfs-gadget.0: Wrong NDP SIGN of this ndp index: 15176, skb len: 16384, ureq_len: 16384, this wSeq: 5766 [ 174] NDP HEAD:00000000298f3cab: 2b 12 48 8f 12 ce 3c c8 d7 39 c0 0d 15 cf 86 14 17 4a 91 85 db df ad 87 f0 35 0d 76 ad 4d 4d 74 [ 174] NTH of this NDP HEAD:00000000af9fbfc9: 6e 63 6d 68 10 00 85 16 00 3e 00 00 90 3d 00 00 00 00 52 34 fc 5f 90 fd ca 40 c1 f4 f4 6e 08 00 [ 174] configfs-gadget.0: Wrong NTH SIGN, skblen 14768, last wSequence: 5766, last dgram_num: 11, ureqlen: 16384 [ 174] HEAD:00000000b1a72bfc: 3f 98 a6 8e 17 f8 bb 29 07 b8 da 13 7f 20 80 8e 77 ca 32 07 ac 71 b8 8d 84 03 d7 1b 96 9b c4 fa Lecroy shows the wSequence=5765 have 10 Datagram consisting a 31*512 bytes=15872 bytes OUT Transfer but have no ZLP: OUT Transfer wSequence=5765 NTH32 Datagrams: 1514B * 8 + 580B NDP32 Transfer length: 512B * 31 NO ZLP OUT Transfer wSequence=5766 NTH32 Datagrams: 1514B * 8 NDP32 Transfer length: 512B * 29 + 432 This lead to a result that the first givebacked 16K skb correponding to a usb_request contains two NTH but not complete: USB Request 1 SKB 16384B (NTH32) (Datagrams) (NDP32) | (NTH32) (Datagrams piece of wSequence=5766) USB Request 2 SKB 14768B (Datagrams piece of wSequence=5766) (NDP32) From the context, it seems the first report of Wrong NDP SIGN is caused by out-of-bound accessing, the second report of Wrong NTH SIGN is caused by a wrong beginning of NTB parsing. Do you have any idea how can this be fixed so that the ncm compatibility is better for windows users. Best Regards, Pan
On Thu, Jan 9, 2025 at 11:37 PM Junzhong Pan <panjunzhong@outlook.com> wrote: > > Hi everyone, > > I recently switch to f_ncm with Windows 10 since rndis 's safety issue. > (the Windows 10 driver version is 10.0.19041.1 2009/4/21) > > It seems Windows 10 ncm driver won't send ZLP to let udc properly > separate the skbs. > > On Mon, 5 Feb 2024 13:16:50 +0530 Krishna Kurapati wrote: > > According to Windows driver, no ZLP is needed if wBlockLength is > non-zero, > > because the non-zero wBlockLength has already told the function side the > > size of transfer to be expected. However, there are in-market NCM devices > > that rely on ZLP as long as the wBlockLength is multiple of > wMaxPacketSize. > > To deal with such devices, it pads an extra 0 at end so the transfer > is no > > longer multiple of wMaxPacketSize. > > I do the iperf3 testing cause gadget constantly report similar error after > a litle modification to get more concrete info: > > [ 174] configfs-gadget.0: to process=512, so go to find second NTH > from: 15872 > [ 174] FIND NEXT NTH HEAD:000000006c26a12c: 6e 63 6d 68 10 00 86 16 b0 > 3b 00 00 48 3b 00 00 00 00 52 34 fc 5f 90 fd ca 40 c1 f4 f4 6e 08 00 > [ 174] configfs-gadget.0: Wrong NDP SIGN of this ndp index: 15176, skb > len: 16384, ureq_len: 16384, this wSeq: 5766 > [ 174] NDP HEAD:00000000298f3cab: 2b 12 48 8f 12 ce 3c c8 d7 39 c0 0d > 15 cf 86 14 17 4a 91 85 db df ad 87 f0 35 0d 76 ad 4d 4d 74 > [ 174] NTH of this NDP HEAD:00000000af9fbfc9: 6e 63 6d 68 10 00 85 16 > 00 3e 00 00 90 3d 00 00 00 00 52 34 fc 5f 90 fd ca 40 c1 f4 f4 6e 08 00 > [ 174] configfs-gadget.0: Wrong NTH SIGN, skblen 14768, last wSequence: > 5766, last dgram_num: 11, ureqlen: 16384 > [ 174] HEAD:00000000b1a72bfc: 3f 98 a6 8e 17 f8 bb 29 07 b8 da 13 7f 20 > 80 8e 77 ca 32 07 ac 71 b8 8d 84 03 d7 1b 96 9b c4 fa > > > Lecroy shows the wSequence=5765 have 10 Datagram consisting a 31*512 > bytes=15872 bytes OUT Transfer but have no ZLP: > > OUT Transfer wSequence=5765 > NTH32 Datagrams: 1514B * 8 + 580B NDP32 > Transfer length: 512B * 31 > NO ZLP > OUT Transfer wSequence=5766 > NTH32 Datagrams: 1514B * 8 NDP32 > Transfer length: 512B * 29 + 432 > > This lead to a result that the first givebacked 16K skb correponding to > a usb_request contains two NTH but not complete: > > USB Request 1 SKB 16384B > (NTH32) (Datagrams) (NDP32) | (NTH32) (Datagrams piece of wSequence=5766) > USB Request 2 SKB 14768B > (Datagrams piece of wSequence=5766) (NDP32) > > From the context, it seems the first report of Wrong NDP SIGN is caused > by out-of-bound accessing, the second report of Wrong NTH SIGN is caused > by a wrong beginning of NTB parsing. > > Do you have any idea how can this be fixed so that the ncm compatibility > is better for windows users. > > Best Regards, > Pan Could you clarify which Linux Kernel you're testing against? Either X.Y.Z version or some git kernel sha1 (not including your debug code of course). Could you provide some pcap of the actual usb frames? Or perhaps describe better the problem, because I'm not quite following from your email. (I'm not sure if the problem is what windows is sending, or how Linux is parsing it) I *think* what you're saying is that wSequence=5765 & 5766 are being treated as a single ncm message due to their being a multiple of 512 in the former, not followed by a ZLP? I thought that was precisely when microsoft ncm added an extra zero byte... What's at the end of 5755? Padding? No padding? Is there an NTH32 header in 5766? Should there be? -- Maciej Żenczykowski, Kernel Networking Developer @ Google
Hi Maciej, On 2025/1/13 1:49, Maciej Żenczykowski Wrote:> (a) I think this looks like a bug on the sending Win10 side, rather > than a parsing bug in Linux, > with there being no ZLP, and no short (<512) frame, there's simply no > way for the receiver to split at the right spot. > > Indeed, fixing this on the Linux/parsing side seems non-trivial... > I guess we could try to treat the connection as simply a serial > connection (ie. ignore frame boundaries), but then we might have > issues with other senders... > > I guess the most likely 'correct' hack/fix would be to hold on to the > extra 'N*512' bytes (it doesn't even have to be 1, though likely the N > is odd), if it starts with a NTH header...Make sence, it seems we only need to save the rest data beside dwBlockLength for next unwrap if a hack is acceptable, otherwise I may need to check if a custom host driver for Windows10 user feasible. I didn't look carefully into the 1byte and padding stuff with Windows11 host yet, I will take a look then. > (b) I notice the '512' not '1024', I think this implies a USB2 > connection instead of USB3 > -- could you try replicating this with a USB3 capable data cable (and > USB3 ports), this should result in 1024 block size instead of 512. > > I'm wondering if the win10 stack is avoiding generating N*1024, but > then hitting N*512 with odd N...Yes, I am using USB2.0 connection to better capture the crime scene. Normally the OUT transfer on USB3.0 SuperSpeed connection comes with a bunch of 1024B Data Pakcet along with a Short Packet less than 1024B in the end from the Lecroy trace. It's also reproducible on USB3.0 SuperSpeed connection using dwc3 controller, but it will cost more time and make it difficult to capture the online data (limited tracer HW buffer), I can try using software tracing or custom logs later: [ 5] 26.00-27.00 sec 183 MBytes 1.54 Gbits/sec [ 5] 27.00-28.00 sec 182 MBytes 1.53 Gbits/sec [ 206.123935] configfs.gadget.2: Wrong NDP SIGN [ 206.129785] configfs.gadget.2: Wrong NTH SIGN, skblen 12208 [ 206.136802] HEAD:0000000004f66a88: 80 06 bc f9 c0 a8 24 66 c0 a8 24 65 f7 24 14 51 aa 1a 30 d5 01 f8 01 26 50 10 20 14 27 3d 00 00 [ 5] 28.00-29.00 sec 128 MBytes 1.07 Gbits/sec [ 5] 29.00-30.00 sec 191 MBytes 1.61 Gbits/sec> > Presumably '512' would be '64' with USB1.0/1.1, but I guess finding a > USB1.x port/host to test against is likely to be near impossible... > > I'll try to see if I can find the source of the bug in the Win > driver's sources (though based on it being Win10 only, may need to > search history) > It's great if you can analyze from the host driver. I didn't know if the NCM driver open-sourced on github by M$ is the correspond version. They said that only Win 11 officially support NCM in the issue on github yet they do have a built-in driver in Win10 and 2004 tag there in the repo.
On Mon, Jan 13, 2025 at 5:31 AM Junzhong Pan <panjunzhong@outlook.com> wrote: > > Hi Maciej, > > On 2025/1/13 1:49, Maciej Żenczykowski Wrote:> (a) I think this looks like a bug on the sending Win10 side, rather > > than a parsing bug in Linux, > > with there being no ZLP, and no short (<512) frame, there's simply no > > way for the receiver to split at the right spot. > > > > Indeed, fixing this on the Linux/parsing side seems non-trivial... > > I guess we could try to treat the connection as simply a serial > > connection (ie. ignore frame boundaries), but then we might have > > issues with other senders... > > > > I guess the most likely 'correct' hack/fix would be to hold on to the > > extra 'N*512' bytes (it doesn't even have to be 1, though likely the N > > is odd), if it starts with a NTH header...Make sence, it seems we only need to save the rest data beside > dwBlockLength for next unwrap if a hack is acceptable, otherwise I may > need to check if a custom host driver for Windows10 user feasible. > > I didn't look carefully into the 1byte and padding stuff with Windows11 > host yet, I will take a look then. > > > (b) I notice the '512' not '1024', I think this implies a USB2 > > connection instead of USB3 > > -- could you try replicating this with a USB3 capable data cable (and > > USB3 ports), this should result in 1024 block size instead of 512. > > > > I'm wondering if the win10 stack is avoiding generating N*1024, but > > then hitting N*512 with odd N...Yes, I am using USB2.0 connection to better capture the crime scene. > > Normally the OUT transfer on USB3.0 SuperSpeed connection comes with a bunch > of 1024B Data Pakcet along with a Short Packet less than 1024B in the end from > the Lecroy trace. > > It's also reproducible on USB3.0 SuperSpeed connection using dwc3 controller, > but it will cost more time and make it difficult to capture the online data > (limited tracer HW buffer), I can try using software tracing or custom logs > later: > > [ 5] 26.00-27.00 sec 183 MBytes 1.54 Gbits/sec > [ 5] 27.00-28.00 sec 182 MBytes 1.53 Gbits/sec > [ 206.123935] configfs.gadget.2: Wrong NDP SIGN > [ 206.129785] configfs.gadget.2: Wrong NTH SIGN, skblen 12208 > [ 206.136802] HEAD:0000000004f66a88: 80 06 bc f9 c0 a8 24 66 c0 a8 24 65 f7 24 14 51 aa 1a 30 d5 01 f8 01 26 50 10 20 14 27 3d 00 00 > [ 5] 28.00-29.00 sec 128 MBytes 1.07 Gbits/sec > [ 5] 29.00-30.00 sec 191 MBytes 1.61 Gbits/sec> > > Presumably '512' would be '64' with USB1.0/1.1, but I guess finding a > > USB1.x port/host to test against is likely to be near impossible... > > > > I'll try to see if I can find the source of the bug in the Win > > driver's sources (though based on it being Win10 only, may need to > > search history) > > It's great if you can analyze from the host driver. > I didn't know if the NCM driver open-sourced on github by M$ is the correspond > version. They said that only Win 11 officially support NCM in the issue on github > yet they do have a built-in driver in Win10 and 2004 tag there in the repo. Looking at https://github.com/microsoft/NCM-Driver-for-Windows commit ded4839c5103ab91822bfde1932393bbb14afda3 (tag: windows_10.0.22000, origin/master) Author: Brandon Jiang <jiangyue@microsoft.com> Date: Mon Oct 4 14:30:44 2021 -0700 update NCM to Windows 11 (21H2) release, built with Windows 11 (22000) WDK and DMF v1.1.82 -- vs previous change to host/device.cpp commit 40118f55a0843221f04c8036df8b97fa3512a007 (tag: windows_10.0.19041, origin/release_2004) Author: Brandon Jiang <jiangyue@microsoft.com> Date: Sun Feb 23 19:53:06 2020 -0800 update NCM to 20H1 Windows release, built with 20H1 WDK and DMF v1.1.20 it introduced if (bufferRequest->TransferLength < bufferRequest->BufferLength && bufferRequest->TransferLength % hostDevice->m_DataBulkOutPipeMaximumPacketSize == 0) { //NCM spec is not explicit if a ZLP shall be sent when wBlockLength != 0 and it happens to be //multiple of wMaxPacketSize. Our interpretation is that no ZLP needed if wBlockLength is non-zero, //because the non-zero wBlockLength has already told the function side the size of transfer to be expected. // //However, there are in-market NCM devices rely on ZLP as long as the wBlockLength is multiple of wMaxPacketSize. //To deal with such devices, we pad an extra 0 at end so the transfer is no longer multiple of wMaxPacketSize bufferRequest->Buffer[bufferRequest->TransferLength] = 0; bufferRequest->TransferLength++; } Which I think is literally the fix for this bug you're reporting. That 'fix' is what then caused us to add the patch at the top of this thread. So that fix was present in the very first official Win11 release (build 22000), but is likely not present in any Win10 release... https://en.wikipedia.org/wiki/Windows_10_version_history (2004 - 20H1 - May 2020 Update - 19041 - May 27, 2020) https://en.wikipedia.org/wiki/Windows_11 (first version is 21H2 - Sun Valley - 22000 - October 5, 2021)
Hi Maciej, Thanks for your quick reply. On 2025/1/14 3:22, Maciej Żenczykowski Wrote: > Looking at https://github.com/microsoft/NCM-Driver-for-Windows > > commit ded4839c5103ab91822bfde1932393bbb14afda3 (tag: > windows_10.0.22000, origin/master) > Author: Brandon Jiang <jiangyue@microsoft.com> > Date: Mon Oct 4 14:30:44 2021 -0700 > > update NCM to Windows 11 (21H2) release, built with Windows 11 > (22000) WDK and DMF v1.1.82 > > -- vs previous change to host/device.cpp > > commit 40118f55a0843221f04c8036df8b97fa3512a007 (tag: > windows_10.0.19041, origin/release_2004) > Author: Brandon Jiang <jiangyue@microsoft.com> > Date: Sun Feb 23 19:53:06 2020 -0800 > > update NCM to 20H1 Windows release, built with 20H1 WDK and DMF v1.1.20 > > it introduced > > if (bufferRequest->TransferLength < bufferRequest->BufferLength && > bufferRequest->TransferLength % > hostDevice->m_DataBulkOutPipeMaximumPacketSize == 0) > { > //NCM spec is not explicit if a ZLP shall be sent when > wBlockLength != 0 and it happens to be > //multiple of wMaxPacketSize. Our interpretation is that no > ZLP needed if wBlockLength is non-zero, In NCM10, 3.2.2 dwBlockLength description, it states: > If exactly dwNtbInMaxSize or dwNtbOutMaxSize bytes are sent, and the > size is a multiple of wMaxPacketSize for the given pipe, then no ZLP > shall be sent. I don't know if its a Microsoft's problem or really **not explicit**. Maybe most of the device implementations treat the incoming data as a stream and do contiguous parsing on it. > //because the non-zero wBlockLength has already told the > function side the size of transfer to be expected. > // > //However, there are in-market NCM devices rely on ZLP as long > as the wBlockLength is multiple of wMaxPacketSize. > //To deal with such devices, we pad an extra 0 at end so the > transfer is no longer multiple of wMaxPacketSize > > bufferRequest->Buffer[bufferRequest->TransferLength] = 0; > bufferRequest->TransferLength++; > } > > Which I think is literally the fix for this bug you're reporting. > That 'fix' is what then caused us to add the patch at the top of this thread. > > So that fix was present in the very first official Win11 release > (build 22000), but is likely not present in any Win10 release... As you mentioned before to fix it in the gadget side, it seems very complicated, maybe we need a extra skb with size=ntb_size as an intermediate buffer to move around those ntb data before parsing it, but may (or may not) lead to performance drop. Any other idea? Do you think hacking in the gadget side to fix this compatible issue is a good idea consider that there are still a large number of users using Win10? (Though Win10 will reach end of support on October 14, 2025, but people may still use it for a long time since many PCs in good condition cannot install win11.) Best Regards
On Mon, Jan 13, 2025 at 7:46 PM Junzhong Pan <panjunzhong@outlook.com> wrote: > > Hi Maciej, > > Thanks for your quick reply. > > On 2025/1/14 3:22, Maciej Żenczykowski Wrote: > > Looking at https://github.com/microsoft/NCM-Driver-for-Windows > > > > commit ded4839c5103ab91822bfde1932393bbb14afda3 (tag: > > windows_10.0.22000, origin/master) > > Author: Brandon Jiang <jiangyue@microsoft.com> > > Date: Mon Oct 4 14:30:44 2021 -0700 > > > > update NCM to Windows 11 (21H2) release, built with Windows 11 > > (22000) WDK and DMF v1.1.82 > > > > -- vs previous change to host/device.cpp > > > > commit 40118f55a0843221f04c8036df8b97fa3512a007 (tag: > > windows_10.0.19041, origin/release_2004) > > Author: Brandon Jiang <jiangyue@microsoft.com> > > Date: Sun Feb 23 19:53:06 2020 -0800 > > > > update NCM to 20H1 Windows release, built with 20H1 WDK and DMF v1.1.20 > > > > it introduced > > > > if (bufferRequest->TransferLength < bufferRequest->BufferLength && > > bufferRequest->TransferLength % > > hostDevice->m_DataBulkOutPipeMaximumPacketSize == 0) > > { > > //NCM spec is not explicit if a ZLP shall be sent when > > wBlockLength != 0 and it happens to be > > //multiple of wMaxPacketSize. Our interpretation is that no > > ZLP needed if wBlockLength is non-zero, > > In NCM10, 3.2.2 dwBlockLength description, it states: > > If exactly dwNtbInMaxSize or dwNtbOutMaxSize bytes are sent, and the > > size is a multiple of wMaxPacketSize for the given pipe, then no ZLP > > shall be sent. But the Linux ncm gadget driver sets both of those (dwNtbIn/OutMaxSize) to 16 kiB (I can never remember which one is relevant to which direction, I think in this case it's 'In' cause it's relevant to the gadget/device and thus affects what is sent by Windows and parsed by Linux). So with 15.5 kiB this is not relevant, right? (Please correct me if I'm wrong) Furthermore that 16 kiB is also the size of the preallocated receive buffer it passes to the usb stack, so there won't be a problem without ZLP (post 16 kiB xfer), because the buffer will naturally terminate. > I don't know if its a Microsoft's problem or really **not explicit**. I *think* (in this case) this is very much an MS problem (and the fix in the newer driver confirms it). Short packet / ZLP termination is simply how USB works to transfer packets... Unfortunately MS is not the only one with problems with ZLP. See Linux's drivers/net/usb/cdc_ncm.c 'NO ZLP' vs 'SEND ZLP' and the FLAG_SEND_ZLP. and note it's set on Apple tethering... [I think your quote up above is exactly why the standard requires 'NO ZLP' operation] So there's very clearly ample confusion here... > Maybe most of the device implementations treat the incoming data as a > stream and do contiguous parsing on it. > > > //because the non-zero wBlockLength has already told the > > function side the size of transfer to be expected. > > // > > //However, there are in-market NCM devices rely on ZLP as long > > as the wBlockLength is multiple of wMaxPacketSize. > > //To deal with such devices, we pad an extra 0 at end so the > > transfer is no longer multiple of wMaxPacketSize > > > > bufferRequest->Buffer[bufferRequest->TransferLength] = 0; > > bufferRequest->TransferLength++; > > } > > > > Which I think is literally the fix for this bug you're reporting. > > That 'fix' is what then caused us to add the patch at the top of this thread. > > > > So that fix was present in the very first official Win11 release > > (build 22000), but is likely not present in any Win10 release... > > As you mentioned before to fix it in the gadget side, it seems very > complicated, maybe we need a extra skb with size=ntb_size as an intermediate > buffer to move around those ntb data before parsing it, but may (or may not) > lead to performance drop. Any other idea? It would definitely lead to a fair bit of code complication, and in the particular case of this happening it would involve (at least) an extra copy (to linearize), so definitely be a performance hit. I think we'd have to have a potential extra buffer/offset/length. Initially it would be null/0/0. Whenever we receive a frame and parsing it leaves us with leftover bytes, we'd have to allocate this buffer, and copy the leftover stuff into this temporary area. Try to parse it (note: potentially repeatedly, because we could have 8 2kiB merged pkts...) and swallow the part that parsed, but if the buffer is too short, then hold on to it until we receive more data. If we ever manage to fully parse it - we could potentially deallocate it (or hold on to the memory to avoid multiple alloc/deallocs). When we receive a usb xfer, if the buffer already exists (or is non zero size), the new xfer needs to be appended to it, and parsing repeats. This btw. implies this needs to be a 32 kiB (2*16) buffer... vmalloc would be fine. I think we'd likely need to get rid of the way this stuff abuses skbs for usb anyway. I've wanted to do this anyway (note: not sure I've seen this in the gadget or host side ncm driver). Ugh... > Do you think hacking in the gadget side to fix this compatible issue is > a good idea consider that there are still a large number of users using > Win10? I'm still thinking about it, but I'd far prefer for MS to fix their Win10 driver. This just seems really hard to fix in the gadget. > (Though Win10 will reach end of support on October 14, 2025, Far longer than that. Since there's (purchasable) extended support (+2 years), and non consumer Win10 EOL is further out as well (Enterprise is Jan 2027, business/school Oct 2028, IoT Jan 2032, etc). > but people may still use it for a long time Yeah, Win10 will be around for many many years to come... > since many PCs in good condition cannot install win11.) Or people don't want to, even when they could :-) I personally have a win11 capable PC that I'm refusing to upgrade... (and a pair that are too old) This is only the *second* time Win11 actually has something (beyond what's in Win10) I could potentially maybe actually want (the first being related to WSL, though I've since stopped using it). I miss XP with classic UI (maybe Win7 was better? not sure).
diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c index ca5d5f564998..e2a059cfda2c 100644 --- a/drivers/usb/gadget/function/f_ncm.c +++ b/drivers/usb/gadget/function/f_ncm.c @@ -1338,7 +1338,15 @@ static int ncm_unwrap_ntb(struct gether *port, "Parsed NTB with %d frames\n", dgram_counter); to_process -= block_len; - if (to_process != 0) { + + /* + * Windows NCM driver avoids USB ZLPs by adding a 1-byte + * zero pad as needed. + */ + if (to_process == 1 && + (*(unsigned char *)(ntb_ptr + block_len) == 0x00)) { + to_process--; + } else if (to_process > 0) { ntb_ptr = (unsigned char *)(ntb_ptr + block_len); goto parse_ntb; }
It is observed sometimes when tethering is used over NCM with Windows 11 as host, at some instances, the gadget_giveback has one byte appended at the end of a proper NTB. When the NTB is parsed, unwrap call looks for any leftover bytes in SKB provided by u_ether and if there are any pending bytes, it treats them as a separate NTB and parses it. But in case the second NTB (as per unwrap call) is faulty/corrupt, all the datagrams that were parsed properly in the first NTB and saved in rx_list are dropped. Adding a few custom traces showed the following: [002] d..1 7828.532866: dwc3_gadget_giveback: ep1out: req 000000003868811a length 1025/16384 zsI ==> 0 [002] d..1 7828.532867: ncm_unwrap_ntb: K: ncm_unwrap_ntb toprocess: 1025 [002] d..1 7828.532867: ncm_unwrap_ntb: K: ncm_unwrap_ntb nth: 1751999342 [002] d..1 7828.532868: ncm_unwrap_ntb: K: ncm_unwrap_ntb seq: 0xce67 [002] d..1 7828.532868: ncm_unwrap_ntb: K: ncm_unwrap_ntb blk_len: 0x400 [002] d..1 7828.532868: ncm_unwrap_ntb: K: ncm_unwrap_ntb ndp_len: 0x10 [002] d..1 7828.532869: ncm_unwrap_ntb: K: Parsed NTB with 1 frames In this case, the giveback is of 1025 bytes and block length is 1024. The rest 1 byte (which is 0x00) won't be parsed resulting in drop of all datagrams in rx_list. Same is case with packets of size 2048: [002] d..1 7828.557948: dwc3_gadget_giveback: ep1out: req 0000000011dfd96e length 2049/16384 zsI ==> 0 [002] d..1 7828.557949: ncm_unwrap_ntb: K: ncm_unwrap_ntb nth: 1751999342 [002] d..1 7828.557950: ncm_unwrap_ntb: K: ncm_unwrap_ntb blk_len: 0x800 Lecroy shows one byte coming in extra confirming that the byte is coming in from PC: Transfer 2959 - Bytes Transferred(1025) Timestamp((18.524 843 590) - Transaction 8391 - Data(1025 bytes) Timestamp(18.524 843 590) --- Packet 4063861 Data(1024 bytes) Duration(2.117us) Idle(14.700ns) Timestamp(18.524 843 590) --- Packet 4063863 Data(1 byte) Duration(66.160ns) Time(282.000ns) Timestamp(18.524 845 722) According to Windows driver, no ZLP is needed if wBlockLength is non-zero, because the non-zero wBlockLength has already told the function side the size of transfer to be expected. However, there are in-market NCM devices that rely on ZLP as long as the wBlockLength is multiple of wMaxPacketSize. To deal with such devices, it pads an extra 0 at end so the transfer is no longer multiple of wMaxPacketSize. Cc: <stable@vger.kernel.org> Fixes: 9f6ce4240a2b ("usb: gadget: f_ncm.c added") Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com> --- Link to v2: https://lore.kernel.org/all/20240131150332.1326523-1-quic_kriskura@quicinc.com/ Changes in v2: Added check to see if the padded byte is 0x00. Changes in v3: Removed wMaxPacketSize check from v2. drivers/usb/gadget/function/f_ncm.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)