mbox series

[net,0/5] hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2"

Message ID 20250513000604.1396-1-mhklinux@outlook.com
Headers show
Series hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2" | expand

Message

mhkelley58@gmail.com May 13, 2025, 12:05 a.m. UTC
From: Michael Kelley <mhklinux@outlook.com>

Starting with commit dca5161f9bd0 in the 6.3 kernel, the Linux driver
for Hyper-V synthetic networking (netvsc) occasionally reports
"nvsp_rndis_pkt_complete error status: 2".[1] This error indicates
that Hyper-V has rejected a network packet transmit request from the
guest, and the outgoing network packet is dropped. Higher level
network protocols presumably recover and resend the packet so there is
no functional error, but performance is slightly impacted. Commit
dca5161f9bd0 is not the cause of the error -- it only added reporting
of an error that was already happening without any notice. The error
has presumably been present since the netvsc driver was originally
introduced into Linux.

This patch set fixes the root cause of the problem, which is that the
netvsc driver in Linux may send an incorrectly formatted VMBus message
to Hyper-V when transmitting the network packet. The incorrect
formatting occurs when the rndis header of the VMBus message crosses a
page boundary due to how the Linux skb head memory is aligned. In such
a case, two PFNs are required to describe the location of the rndis
header, even though they are contiguous in guest physical address
(GPA) space. Hyper-V requires that two PFNs be in a single "GPA range"
data struture, but current netvsc code puts each PFN in its own GPA
range, which Hyper-V rejects as an error in the case of the rndis
header.

The incorrect formatting occurs only for larger packets that netvsc
must transmit via a VMBus "GPA Direct" message. There's no problem
when netvsc transmits a smaller packet by copying it into a pre-
allocated send buffer slot because the pre-allocated slots don't have
page crossing issues.

After commit 14ad6ed30a10 in the 6.14 kernel, the error occurs much
more frequently in VMs with 16 or more vCPUs. It may occur every few
seconds, or even more frequently, in a ssh session that outputs a lot
of text. Commit 14ad6ed30a10 subtly changes how skb head memory is
allocated, making it much more likely that the rndis header will cross
a page boundary when the vCPU count is 16 or more.  The changes in
commit 14ad6ed30a10 are perfectly valid -- they just had the side
effect of making the netvsc bug more prominent.

One fix is to check for adjacent PFNs in vmbus_sendpacket_pagebuffer()
and just combine them into a single GPA range. Such a fix is very
contained. But conceptually it is fixing the problem at the wrong
level. So this patch set takes the broader approach of maintaining
the already known grouping of contiguous PFNs at a higher level in
the netvsc driver code, and propagating that grouping down to the
creation of the VMBus message to send to Hyper-V. Maintaining the
grouping fixes this problem, and has the added benefit of allowing
netvsc_dma_map() to make fewer calls to dma_map_single() to do bounce
buffering in CoCo VMs.

Patch 1 is a preparatory change to allow vmbus_sendpacket_mpb_desc()
to specify multiple GPA ranges. In current code
vmbus_sendpacket_mpb_desc() is used only by the storvsc synthetic SCSI
driver, and it always creates a single GPA range.

Patch 2 updates the netvsc driver to use vmbus_sendpacket_mpb_desc()
instead of vmbus_sendpacket_pagebuffer(). Because the higher levels of
netvsc still don't group contiguous PFNs, this patch is functionally
neutral. The VMBus message to Hyper-V still has many GPA ranges, each
with a single PFN. But it lays the groundwork for the next patch.

Patch 3 changes the higher levels of netvsc to preserve the already
known grouping of contiguous PFNs. When the contiguous groupings are
passed to vmbus_sendpacket_mpb_desc(), GPA ranges containing multiple
PFNs are produced, as expected by Hyper-V. This is point at which the
core problem is fixed.

Patches 4 and 5 remove code that is no longer necessary after the
previous patches.

These changes provide a net reduction of about 65 lines of code, which
is an added benefit.

These changes have been tested in normal VMs, in SEV-SNP and TDX CoCo
VMs, and in Dv6-series VMs where the netvsp implementation is in the
OpenHCL paravisor instead of the Hyper-V host.

These changes are built against kernel version 6.15-rc6.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=217503

Michael Kelley (5):
  Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple
    ranges
  hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages
  hv_netvsc: Preserve contiguous PFN grouping in the page buffer array
  hv_netvsc: Remove rmsg_pgcnt
  Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer()

 drivers/hv/channel.c              | 65 ++-----------------------------
 drivers/net/hyperv/hyperv_net.h   | 13 ++++++-
 drivers/net/hyperv/netvsc.c       | 57 ++++++++++++++++++++++-----
 drivers/net/hyperv/netvsc_drv.c   | 62 +++++++----------------------
 drivers/net/hyperv/rndis_filter.c | 24 +++---------
 drivers/scsi/storvsc_drv.c        |  1 +
 include/linux/hyperv.h            |  7 ----
 7 files changed, 83 insertions(+), 146 deletions(-)