Message ID | 1620085579-5646-1-git-send-email-rsanger@wand.net.nz |
---|---|
State | Superseded |
Headers | show |
Series | net: packetmmap: fix only tx timestamp on request | expand |
On Mon, May 3, 2021 at 8:04 PM Richard Sanger <rsanger@wand.net.nz> wrote: > > The packetmmap tx ring should only return timestamps if requested, > as documented. This allows compatibility with non-timestamp aware > user-space code which checks tp_status == TP_STATUS_AVAILABLE; > not expecting additional timestamp flags to be set. This is an established interface. Passing the status goes back to 2013, since commit b9c32fb27170 ("packet: if hw/sw ts enabled in rx/tx ring, report which ts we got"). Passing a timestamp itself in tp_sec/tp_usec goes back to before git, probably to the introduction of the ring. I don't think we can change this now. That will likely break applications that have come to expect current behavior. Is it documented somewhere that the ring works differently? Or are you referring to the general SO_TIMESTAMPING behavior, which is a separate timestamp interface.
Hi Willem, This is to match up with the documented behaviour; see the timestamping section at the bottom of https://www.kernel.org/doc/html/latest/networking/packet_mmap.html If no call to setsockopt(fd, SOL_PACKET, PACKET_TIMESTAMP, ...) is made then the tx path ring should not return timestamps, or timestamp flags set in tp_status. As noted in b9c32fb27170 ("packet: if hw/sw ts enabled in rx/tx ring, report which ts we got") this is to retain backwards compatibility with old code. However, currently, a timestamp can be returned without setting PACKET_TIMESTAMP, in the case that skb->tstamp includes a timestamp. I only noticed this recently due to: aa4e689ed1 (veth: add software timestamping) which means skb->tstamp now includes a timestamp. The issue this bug causes for old/non-timestamp aware code is that tp_status may incorrectly have the TP_STATUS_TS_SOFTWARE flag set, so the documented check (tp_status == TP_STATUS_AVAILABLE) that a frame in the ring is free fails. Causing such code to hang infinitely. This patch corrects the behaviour for the tx path. But, doesn't change the behaviour on the rx path. The rx path still includes a timestamp (hence the patch always sets the SOF_TIMESTAMPING_SOFTWARE flag on rx). Thanks, Richard On Tue, May 4, 2021 at 12:36 PM Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > > On Mon, May 3, 2021 at 8:04 PM Richard Sanger <rsanger@wand.net.nz> wrote: > > > > The packetmmap tx ring should only return timestamps if requested, > > as documented. This allows compatibility with non-timestamp aware > > user-space code which checks tp_status == TP_STATUS_AVAILABLE; > > not expecting additional timestamp flags to be set. > > This is an established interface. > > Passing the status goes back to 2013, since commit b9c32fb27170 > ("packet: if hw/sw ts enabled in rx/tx ring, report which ts we got"). > > Passing a timestamp itself in tp_sec/tp_usec goes back to before git, > probably to the introduction of the ring. > > I don't think we can change this now. That will likely break > applications that have come to expect current behavior. > > Is it documented somewhere that the ring works differently? Or are you > referring to the general SO_TIMESTAMPING behavior, which is a separate > timestamp interface.
On Mon, May 3, 2021 at 9:22 PM Richard Sanger <rsanger@wand.net.nz> wrote: > > Hi Willem, > > This is to match up with the documented behaviour; see the timestamping section > at the bottom of > https://www.kernel.org/doc/html/latest/networking/packet_mmap.html > > If no call to setsockopt(fd, SOL_PACKET, PACKET_TIMESTAMP, ...) is made then > the tx path ring should not return timestamps, or timestamp flags set in > tp_status. > > As noted in b9c32fb27170 > ("packet: if hw/sw ts enabled in rx/tx ring, report which ts we got") > this is to retain backwards compatibility with old code. > > However, currently, a timestamp can be returned without setting > PACKET_TIMESTAMP, in the case that skb->tstamp includes a timestamp. > I only noticed this recently due to: > aa4e689ed1 (veth: add software timestamping) > which means skb->tstamp now includes a timestamp. > > The issue this bug causes for old/non-timestamp aware code is that tp_status > may incorrectly have the TP_STATUS_TS_SOFTWARE flag set, so the documented > check (tp_status == TP_STATUS_AVAILABLE) that a frame in the ring is free fails. > Causing such code to hang infinitely. Then this would need a Fixes: b9c32fb27170 ("packet: if hw/sw ts enabled in rx/tx ring, report which ts we got") I don't fully follow the commit message in that patch for why enabling this unconditionally on Tx is safe: " This should not break anything for the following reasons: [..] ii) in TX ring path, time stamps with PACKET_TIMESTAMP socketoption are not available resp. had no effect except that the application setting this is buggy. Next to TP_STATUS_AVAILABLE, the user also should check for other flags such as TP_STATUS_WRONG_FORMAT to reclaim frames to the application. Thus, in case TX ts are turned off (default case), nothing happens to the application logic " But I think the point is that tx packets are not timestamped unless skb_shinfo(skb)->tx_flags holds a timestamp request. Such as for the software timestamps that veth can now generate: " static inline void skb_tx_timestamp(struct sk_buff *skb) { skb_clone_tx_timestamp(skb); if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP) skb_tstamp_tx(skb, NULL); } " So unless this packet socket has SOF_TIMESTAMPING_TX_SOFTWARE configured, no timestamps should be recorded for its packets, as tx flag SKBTX_SW_TSTAMP is not set. > This patch corrects the behaviour for the tx path. But, doesn't change the > behaviour on the rx path. The rx path still includes a timestamp (hence > the patch always sets the SOF_TIMESTAMPING_SOFTWARE flag on rx). Right, this patch suppresses reporting of any recorded timestamps. But the system should already be suppressing recording of these timestamps. Assuming you discovered this with a real application: does it call setsockopt SOL_SOCKET/SO_TIMESTAMPING at all? It's safe to suppress on the reporting side as extra precaution against spuriously timestamped packets. I just want to understand how these timestamps are even recorded in the first place. Small nit wrt the patch: the comment "/* always timestamp; prefer an existing software timestamp */" states what the code does, but more interesting would be why.
On Wed, May 5, 2021 at 2:45 AM Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > > On Mon, May 3, 2021 at 9:22 PM Richard Sanger <rsanger@wand.net.nz> wrote: > > > > Hi Willem, > > > > This is to match up with the documented behaviour; see the timestamping section > > at the bottom of > > https://www.kernel.org/doc/html/latest/networking/packet_mmap.html [ ... ] > > Then this would need a > > Fixes: b9c32fb27170 ("packet: if hw/sw ts enabled in rx/tx ring, > report which ts we got") ack, I will resubmit the patch with that as the summary line of the commit message. > I don't fully follow the commit message in that patch for why enabling > this unconditionally on Tx is safe: > [...] > > But I think the point is that tx packets are not timestamped unless > skb_shinfo(skb)->tx_flags holds a timestamp request. Such as for > the software timestamps that veth can now generate: > I came to the same understanding, tx timestamping should be disabled unless the code calls setsockopt SOL_SOCKET/SO_TIMESTAMPING. > " > static inline void skb_tx_timestamp(struct sk_buff *skb) > { > skb_clone_tx_timestamp(skb); > if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP) > skb_tstamp_tx(skb, NULL); > } > " > > So unless this packet socket has SOF_TIMESTAMPING_TX_SOFTWARE > configured, no timestamps should be recorded for its packets, as tx flag > SKBTX_SW_TSTAMP is not set. You are right, that check is working correctly, I'm mistaken on the trigger of this behaviour. It doesn't appear related to aa4e689ed1 (veth: add software timestamping). In fact, this bug is present in Linux 4.19 the version before that patch was added, and likely earlier versions too. I've just verified using printk() that after the call to skb_tx_timestamp(skb) in veth_xmit() skb->tstamp == 0 as expected. However, when skb_tx_timestamp() is called within the packetmmap code path skb->tstamp holds a valid time. > > This patch corrects the behaviour for the tx path. But, doesn't change the > > behaviour on the rx path. The rx path still includes a timestamp (hence > > the patch always sets the SOF_TIMESTAMPING_SOFTWARE flag on rx). > > Right, this patch suppresses reporting of any recorded timestamps. But > the system should already be suppressing recording of these > timestamps. > > Assuming you discovered this with a real application: does it call > setsockopt SOL_SOCKET/SO_TIMESTAMPING at all? > Yes, I can confirm my code does not setsockopt SO_TIMESTAMPING Here is the filtered output of strace # strace ./test-live -c 1 ring:veth0 2>&1 | grep sock socket(AF_PACKET, SOCK_RAW, htons(0 /* ETH_P_??? */)) = 3 setsockopt(3, SOL_PACKET, PACKET_VERSION, [1], 4) = 0 setsockopt(3, SOL_PACKET, PACKET_TX_RING, {tp_block_size=1048576, tp_block_nr=1, tp_frame_size=4096, tp_frame_nr=256}, 16) = 0 socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 4 > It's safe to suppress on the reporting side as extra precaution against > spuriously timestamped packets. I just want to understand how these > timestamps are even recorded in the first place. > Agreed, if this isn't expected behaviour, how skb->tstamp is getting filled with a timestamp remains a mystery to me. I'll report back if I find the source. > Small nit wrt the patch: the comment "/* always timestamp; prefer an > existing software timestamp */" states what the code does, but more > interesting would be why. Absolutely, I'll replace it with something along the lines of /* always timestamp; prefer an existing software timestamp taken closer to the time of capture */
> However, when skb_tx_timestamp() is called within the packetmmap code path > skb->tstamp holds a valid time. Sorry, I've confused a function name here, meant to say: However, when ***tpacket_get_timestamp()*** is called within the packetmmap code path skb->tstamp holds a valid time.
On Wed, May 5, 2021 at 7:42 PM Richard Sanger <rsanger@wand.net.nz> wrote: > > On Wed, May 5, 2021 at 2:45 AM Willem de Bruijn > <willemdebruijn.kernel@gmail.com> wrote: > > > > On Mon, May 3, 2021 at 9:22 PM Richard Sanger <rsanger@wand.net.nz> wrote: > > > > > > Hi Willem, > > > > > > This is to match up with the documented behaviour; see the timestamping section > > > at the bottom of > > > https://www.kernel.org/doc/html/latest/networking/packet_mmap.html > [ ... ] > > > > Then this would need a > > > > Fixes: b9c32fb27170 ("packet: if hw/sw ts enabled in rx/tx ring, > > report which ts we got") > > ack, I will resubmit the patch with that as the summary line of the commit > message. The fixes tag is not the summary line. > > I don't fully follow the commit message in that patch for why enabling > > this unconditionally on Tx is safe: > > > [...] > > > > But I think the point is that tx packets are not timestamped unless > > skb_shinfo(skb)->tx_flags holds a timestamp request. Such as for > > the software timestamps that veth can now generate: > > > > I came to the same understanding, tx timestamping should be disabled unless > the code calls setsockopt SOL_SOCKET/SO_TIMESTAMPING. > > > " > > static inline void skb_tx_timestamp(struct sk_buff *skb) > > { > > skb_clone_tx_timestamp(skb); > > if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP) > > skb_tstamp_tx(skb, NULL); > > } > > " > > > > So unless this packet socket has SOF_TIMESTAMPING_TX_SOFTWARE > > configured, no timestamps should be recorded for its packets, as tx flag > > SKBTX_SW_TSTAMP is not set. > > You are right, that check is working correctly, I'm mistaken on the trigger of > this behaviour. It doesn't appear related to aa4e689ed1 > (veth: add software timestamping). In fact, this bug is present in Linux 4.19 > the version before that patch was added, and likely earlier versions too. > > I've just verified using printk() that after the call to skb_tx_timestamp(skb) > in veth_xmit() skb->tstamp == 0 as expected. > > However, when skb_tx_timestamp() is called within the packetmmap code path > skb->tstamp holds a valid time. Interesting. I had expected veth_xmit to trigger skb_orphan, which calls the destructor. But this is no longer true as of commit 9c4c325252c5 ("skbuff: preserve sock reference when scrubbing the skb."). As a result, I suppose the skb can enter the next namespace and be timestamped there if receive timestamps are enabled (this is not per-socket). One way to verify, if you can easily recompile a kernel, is to add a WARN_ON_ONCE(1) to tpacket_destruct_skb to see which path led up to queuing the completion notification. > > > This patch corrects the behaviour for the tx path. But, doesn't change the > > > behaviour on the rx path. The rx path still includes a timestamp (hence > > > the patch always sets the SOF_TIMESTAMPING_SOFTWARE flag on rx). > > > > Right, this patch suppresses reporting of any recorded timestamps. But > > the system should already be suppressing recording of these > > timestamps. > > > > Assuming you discovered this with a real application: does it call > > setsockopt SOL_SOCKET/SO_TIMESTAMPING at all? > > > > Yes, I can confirm my code does not setsockopt SO_TIMESTAMPING > Here is the filtered output of strace > > # strace ./test-live -c 1 ring:veth0 2>&1 | grep sock > socket(AF_PACKET, SOCK_RAW, htons(0 /* ETH_P_??? */)) = 3 > setsockopt(3, SOL_PACKET, PACKET_VERSION, [1], 4) = 0 > setsockopt(3, SOL_PACKET, PACKET_TX_RING, {tp_block_size=1048576, > tp_block_nr=1, tp_frame_size=4096, tp_frame_nr=256}, 16) = 0 > socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 4 > > > It's safe to suppress on the reporting side as extra precaution against > > spuriously timestamped packets. I just want to understand how these > > timestamps are even recorded in the first place. > > > > Agreed, if this isn't expected behaviour, how skb->tstamp is getting filled > with a timestamp remains a mystery to me. I'll report back if I find the > source. I think we need to understand exactly what goes on before we apply a patch. It might just be papering over the problem otherwise.
I've had a chance to look into this further and have found where the timestamp is added. Details are at the end of this message. On Thu, May 6, 2021 at 1:23 PM Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > > On Wed, May 5, 2021 at 7:42 PM Richard Sanger <rsanger@wand.net.nz> wrote: [...] > > > > I've just verified using printk() that after the call to skb_tx_timestamp(skb) > > in veth_xmit() skb->tstamp == 0 as expected. > > > > However, when skb_tx_timestamp() is called within the packetmmap code path > > skb->tstamp holds a valid time. > > Interesting. I had expected veth_xmit to trigger skb_orphan, which > calls the destructor. > > But this is no longer true as of commit 9c4c325252c5 ("skbuff: > preserve sock reference when scrubbing the skb."). > > As a result, I suppose the skb can enter the next namespace and be > timestamped there if receive timestamps are enabled (this is not > per-socket). > > One way to verify, if you can easily recompile a kernel, is to add a > WARN_ON_ONCE(1) to tpacket_destruct_skb to see which path led up to > queuing the completion notification. > Here's the output of putting a WARN_ON_ONCE(1) statement in tpacket_destruct_skb, I don't believe it is related to the problem. [ 37.249629] RIP: 0010:tpacket_destruct_skb+0x24/0x60 [...] [ 37.249659] Call Trace: [ 37.249661] <IRQ> [ 37.249666] skb_release_head_state+0x44/0x90 [ 37.249680] skb_release_all+0x13/0x30 [ 37.249684] kfree_skb+0x2f/0xa0 [ 37.249689] llc_rcv+0x2e/0x360 [llc] [ 37.249698] __netif_receive_skb_one_core+0x8f/0xa0 [ 37.249707] __netif_receive_skb+0x18/0x60 [ 37.249710] process_backlog+0xa9/0x160 [ 37.249714] __napi_poll+0x31/0x140 [ 37.249717] net_rx_action+0xde/0x210 [ 37.249722] __do_softirq+0xe0/0x29b [ 37.249737] do_softirq+0x66/0x80 [ 37.249747] </IRQ> [ 37.249748] __local_bh_enable_ip+0x50/0x60 [ 37.249751] __dev_queue_xmit+0x23a/0x6e0 [ 37.249756] dev_queue_xmit+0x10/0x20 [ 37.249759] packet_sendmsg+0x6b8/0x1c90 [ 37.249763] ? __drain_all_pages+0x150/0x1c0 [ 37.249772] sock_sendmsg+0x65/0x70 [ 37.249778] __sys_sendto+0x113/0x190 [ 37.249783] ? handle_mm_fault+0xda/0x2b0 [ 37.249790] ? exit_to_user_mode_prepare+0x3c/0x1e0 [ 37.249800] ? do_user_addr_fault+0x1d3/0x640 [ 37.249805] __x64_sys_sendto+0x29/0x30 [ 37.249809] do_syscall_64+0x40/0xb0 [ 37.249816] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 37.249820] RIP: 0033:0x7f43950d27ea [...] > I think we need to understand exactly what goes on before we apply a > patch. It might just be papering over the problem otherwise. Okay, so the call path that adds the timestamp looks like this: send() syscall triggers tpacket_snd() which calls the veth_xmit() hander. In drivers/net/veth.c veth_xmit() calls veth_forward_skb() which then calls netif_rx()/netif_rx_internal() in net/core/dev.c. And finally, net_timestamp_check(netdev_tstamp_prequeue, skb) adds the timestamp, netdev_tstamp_prequeue defaults to 1. net_timestamp_check in its current form was added by 588f033075 ("net: use jump_label for netstamp_needed ") In the kernel since 3.3-rc1, so it looks like this issue has been present the entire time. Pre-conditions are netstamp_needed_key and netdev_tstamp_prequeue, so if either is false, timestamping won't happen at this stage in the code. Here's the call trace of where the timestamp is added [ 251.619538] Call Trace: [ 251.619550] netif_rx+0x1b/0x60 [ 251.619556] veth_xmit+0x19d/0x230 [veth] [ 251.619563] netdev_start_xmit+0x4a/0x8b [ 251.619566] dev_hard_start_xmit.cold+0xc8/0x1d5 [ 251.619569] __dev_queue_xmit.cold+0xa3/0x12c [ 251.619572] dev_queue_xmit+0x10/0x20 [ 251.619575] packet_sendmsg+0x6b8/0x1c90 [ 251.619580] ? __drain_all_pages+0x150/0x1c0 [ 251.619588] sock_sendmsg+0x65/0x70 [ 251.619594] __sys_sendto+0x113/0x190 [ 251.619598] ? handle_mm_fault+0xda/0x2b0 [ 251.619604] ? exit_to_user_mode_prepare+0x3c/0x1e0 [ 251.619611] ? do_user_addr_fault+0x1d3/0x640 [ 251.619615] __x64_sys_sendto+0x29/0x30 [ 251.619618] do_syscall_64+0x40/0xb0 [ 251.619623] entry_SYSCALL_64_after_hwframe+0x44/0xae This appears to be reasonable, but I don't know what the expected behaviour is. Should this timestamp still be cleared before returning the sent skb?
On Tue, May 11, 2021 at 6:11 PM Richard Sanger <rsanger@wand.net.nz> wrote: > > I've had a chance to look into this further and have found where the > timestamp is added. Details are at the end of this message. > > On Thu, May 6, 2021 at 1:23 PM Willem de Bruijn > <willemdebruijn.kernel@gmail.com> wrote: > > > > On Wed, May 5, 2021 at 7:42 PM Richard Sanger <rsanger@wand.net.nz> wrote: > [...] > > > > > > I've just verified using printk() that after the call to skb_tx_timestamp(skb) > > > in veth_xmit() skb->tstamp == 0 as expected. > > > > > > However, when skb_tx_timestamp() is called within the packetmmap code path > > > skb->tstamp holds a valid time. > > > > Interesting. I had expected veth_xmit to trigger skb_orphan, which > > calls the destructor. > > > > But this is no longer true as of commit 9c4c325252c5 ("skbuff: > > preserve sock reference when scrubbing the skb."). > > > > As a result, I suppose the skb can enter the next namespace and be > > timestamped there if receive timestamps are enabled (this is not > > per-socket). > > > > One way to verify, if you can easily recompile a kernel, is to add a > > WARN_ON_ONCE(1) to tpacket_destruct_skb to see which path led up to > > queuing the completion notification. > > > > Here's the output of putting a WARN_ON_ONCE(1) statement in > tpacket_destruct_skb, I don't believe it is related to the problem. It might be, if this is indeed an llc (ETH_P_802_2) packet that was inserted with tpacket. veth calls netif_rx, which enqueues the packet to the percpu queue for later processing in softirq context using process_backlog. > > [ 37.249629] RIP: 0010:tpacket_destruct_skb+0x24/0x60 > [...] > [ 37.249659] Call Trace: > [ 37.249661] <IRQ> > [ 37.249666] skb_release_head_state+0x44/0x90 > [ 37.249680] skb_release_all+0x13/0x30 > [ 37.249684] kfree_skb+0x2f/0xa0 > [ 37.249689] llc_rcv+0x2e/0x360 [llc] > [ 37.249698] __netif_receive_skb_one_core+0x8f/0xa0 > [ 37.249707] __netif_receive_skb+0x18/0x60 > [ 37.249710] process_backlog+0xa9/0x160 > [ 37.249714] __napi_poll+0x31/0x140 > [ 37.249717] net_rx_action+0xde/0x210 > [ 37.249722] __do_softirq+0xe0/0x29b > [ 37.249737] do_softirq+0x66/0x80 > [ 37.249747] </IRQ> > [ 37.249748] __local_bh_enable_ip+0x50/0x60 > [ 37.249751] __dev_queue_xmit+0x23a/0x6e0 > [ 37.249756] dev_queue_xmit+0x10/0x20 > [ 37.249759] packet_sendmsg+0x6b8/0x1c90 > [ 37.249763] ? __drain_all_pages+0x150/0x1c0 > [ 37.249772] sock_sendmsg+0x65/0x70 > [ 37.249778] __sys_sendto+0x113/0x190 > [ 37.249783] ? handle_mm_fault+0xda/0x2b0 > [ 37.249790] ? exit_to_user_mode_prepare+0x3c/0x1e0 > [ 37.249800] ? do_user_addr_fault+0x1d3/0x640 > [ 37.249805] __x64_sys_sendto+0x29/0x30 > [ 37.249809] do_syscall_64+0x40/0xb0 > [ 37.249816] entry_SYSCALL_64_after_hwframe+0x44/0xae > [ 37.249820] RIP: 0033:0x7f43950d27ea > > [...] > > > I think we need to understand exactly what goes on before we apply a > > patch. It might just be papering over the problem otherwise. > > Okay, so the call path that adds the timestamp looks like this: > > send() syscall triggers tpacket_snd() which calls the veth_xmit() hander. > In drivers/net/veth.c veth_xmit() calls veth_forward_skb() which then > calls netif_rx()/netif_rx_internal() in net/core/dev.c. > And finally, net_timestamp_check(netdev_tstamp_prequeue, skb) adds > the timestamp, netdev_tstamp_prequeue defaults to 1. > > net_timestamp_check in its current form was added by 588f033075 > ("net: use jump_label for netstamp_needed ") > In the kernel since 3.3-rc1, so it looks like this issue has been present the > entire time. Pre-conditions are netstamp_needed_key and > netdev_tstamp_prequeue, so if either is false, timestamping won't happen > at this stage in the code. Thanks. That's as I expected. We cannot suppress the timestamp assignment by net_timestamp_check, as it may be expected by the receive process. I don't immediately see a way to clear/ignore the field in tpacket_destruct_skb if the field was set from a different network namespace than the one in which the packet socket inserted the packet. So your suggested patch to suppress spurious tx timestamps if software timestamping reporting is not enabled SGTM. > Here's the call trace of where the timestamp is added > > [ 251.619538] Call Trace: > [ 251.619550] netif_rx+0x1b/0x60 > [ 251.619556] veth_xmit+0x19d/0x230 [veth] > [ 251.619563] netdev_start_xmit+0x4a/0x8b > [ 251.619566] dev_hard_start_xmit.cold+0xc8/0x1d5 > [ 251.619569] __dev_queue_xmit.cold+0xa3/0x12c > [ 251.619572] dev_queue_xmit+0x10/0x20 > [ 251.619575] packet_sendmsg+0x6b8/0x1c90 > [ 251.619580] ? __drain_all_pages+0x150/0x1c0 > [ 251.619588] sock_sendmsg+0x65/0x70 > [ 251.619594] __sys_sendto+0x113/0x190 > [ 251.619598] ? handle_mm_fault+0xda/0x2b0 > [ 251.619604] ? exit_to_user_mode_prepare+0x3c/0x1e0 > [ 251.619611] ? do_user_addr_fault+0x1d3/0x640 > [ 251.619615] __x64_sys_sendto+0x29/0x30 > [ 251.619618] do_syscall_64+0x40/0xb0 > [ 251.619623] entry_SYSCALL_64_after_hwframe+0x44/0xae > > This appears to be reasonable, but I don't know what the expected behaviour > is. Should this timestamp still be cleared before returning the sent skb?
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index ba96db1..b69805e 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -422,7 +422,8 @@ static __u32 tpacket_get_timestamp(struct sk_buff *skb, struct timespec64 *ts, ktime_to_timespec64_cond(shhwtstamps->hwtstamp, ts)) return TP_STATUS_TS_RAW_HARDWARE; - if (ktime_to_timespec64_cond(skb->tstamp, ts)) + if ((flags & SOF_TIMESTAMPING_SOFTWARE) && + ktime_to_timespec64_cond(skb->tstamp, ts)) return TP_STATUS_TS_SOFTWARE; return 0; @@ -2340,7 +2341,10 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev, skb_copy_bits(skb, 0, h.raw + macoff, snaplen); - if (!(ts_status = tpacket_get_timestamp(skb, &ts, po->tp_tstamp))) + /* always timestamp; prefer an existing software timestamp */ + ts_status = tpacket_get_timestamp(skb, &ts, + po->tp_tstamp | SOF_TIMESTAMPING_SOFTWARE); + if (!ts_status) ktime_get_real_ts64(&ts); status |= ts_status;
The packetmmap tx ring should only return timestamps if requested, as documented. This allows compatibility with non-timestamp aware user-space code which checks tp_status == TP_STATUS_AVAILABLE; not expecting additional timestamp flags to be set. Signed-off-by: Richard Sanger <rsanger@wand.net.nz> --- net/packet/af_packet.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)