[net-next,v2,2/5] lan743x: sync only the received area of an rx ring buffer

Message ID 20210211161830.17366-3-TheSven73@gmail.com
State New
Series lan743x speed boost

Commit Message

Sven Van Asbroeck Feb. 11, 2021, 4:18 p.m. UTC
From: Sven Van Asbroeck <thesven73@gmail.com>

On cpu architectures w/o dma cache snooping, dma_unmap() is a
very expensive operation, because its resulting sync
needs to invalidate cpu caches.

Increase efficiency/performance by syncing only those sections
of the lan743x's rx ring buffers that are actually in use.

Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
---

To: Bryan Whitehead <bryan.whitehead@microchip.com>
To: UNGLinuxDriver@microchip.com
To: "David S. Miller" <davem@davemloft.net>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Alexey Denisov <rtgbnm@gmail.com>
Cc: Sergej Bauer <sbauer@blackbox.su>
Cc: Tim Harvey <tharvey@gateworks.com>
Cc: Anders Rønningen <anders@ronningen.priv.no>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

 drivers/net/ethernet/microchip/lan743x_main.c | 32 +++++++++++++------
 1 file changed, 23 insertions(+), 9 deletions(-)

Comments

Bryan.Whitehead@microchip.com Feb. 12, 2021, 8:45 p.m. UTC | #1
Hi Sven, see below.

> +       if (buffer_info->dma_ptr) {
> +               /* unmap from dma */
> +               packet_length = RX_DESC_DATA0_FRAME_LENGTH_GET_
> +                               (le32_to_cpu(descriptor->data0));
> +               if (packet_length == 0 ||
> +                   packet_length > buffer_info->buffer_length)
> +                       /* buffer is part of multi-buffer packet: fully used */
> +                       packet_length = buffer_info->buffer_length;

According to the document I have, FRAME_LENGTH is only valid when LS bit is set, and reserved otherwise.
Therefore, I'm not sure you can rely on it being zero when LS is not set, even if your experiments say it is.
Future chip revisions might use those bits differently.

Can you change this so the LS bit is checked.
	If set you can use the smaller of FRAME_LENGTH or buffer length.
	If clear you can just use buffer length. 
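
A minimal sketch of the check suggested above, written as standalone C so it can be compiled outside the kernel. The SKETCH_* constants and the helper name sketch_rx_used_length() are hypothetical placeholders, not the driver's definitions; real code would use the LS and FRAME_LENGTH definitions from the driver's own header.

```c
#include <stdint.h>

/* Placeholder descriptor layout, assumed for this sketch only. */
#define SKETCH_RX_DESC_LS              0x40000000u /* last-segment flag */
#define SKETCH_RX_DESC_FRAME_LEN_MASK  0x3FFF0000u
#define SKETCH_RX_DESC_FRAME_LEN_SHIFT 16

/* Return how many bytes of the rx buffer need to be synced for the cpu.
 * data0 is the descriptor word already converted to cpu byte order.
 */
static unsigned int sketch_rx_used_length(uint32_t data0,
					  unsigned int buffer_length)
{
	unsigned int frame_length;

	/* LS clear: FRAME_LENGTH is reserved, so treat the buffer as full. */
	if (!(data0 & SKETCH_RX_DESC_LS))
		return buffer_length;

	/* LS set: FRAME_LENGTH is valid, but never sync past the buffer. */
	frame_length = (data0 & SKETCH_RX_DESC_FRAME_LEN_MASK) >>
		       SKETCH_RX_DESC_FRAME_LEN_SHIFT;
	return frame_length < buffer_length ? frame_length : buffer_length;
}
```

In the patch, the value returned here would take the place of packet_length in the dma_sync_single_for_cpu() call, while the unmap keeps using the full buffer_length.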

> +               /* sync used part of buffer only */
> +               dma_sync_single_for_cpu(dev, buffer_info->dma_ptr,
> +                                       packet_length,
> +                                       DMA_FROM_DEVICE);
> +               dma_unmap_single_attrs(dev, buffer_info->dma_ptr,
> +                                      buffer_info->buffer_length,
> +                                      DMA_FROM_DEVICE,
> +                                      DMA_ATTR_SKIP_CPU_SYNC);
> +       }

Sven Van Asbroeck Feb. 12, 2021, 10:38 p.m. UTC | #2
Hi Bryan,

On Fri, Feb 12, 2021 at 3:45 PM <Bryan.Whitehead@microchip.com> wrote:
>
> According to the document I have, FRAME_LENGTH is only valid when LS bit is set, and reserved otherwise.
> Therefore, I'm not sure you can rely on it being zero when LS is not set, even if your experiments say it is.
> Future chip revisions might use those bits differently.

That's good to know. I didn't find any documentation related to
multi-buffer frames, so I had to go with what I saw the chip do
experimentally. It's great that you were able to double-check against
the official docs.

>
> Can you change this so the LS bit is checked.
>         If set you can use the smaller of FRAME_LENGTH or buffer length.
>         If clear you can just use buffer length.

Will do. Are you planning to hold off your tests until v3? It
shouldn't take too long.

Bryan.Whitehead@microchip.com Feb. 13, 2021, 7:15 p.m. UTC | #3
> Will do. Are you planning to hold off your tests until v3? It shouldn't take too
> long.

Sure, we will wait for v3

Patch

diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c
index 0c48bb559719..36cc67c72851 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.c
+++ b/drivers/net/ethernet/microchip/lan743x_main.c
@@ -1968,35 +1968,49 @@  static int lan743x_rx_init_ring_element(struct lan743x_rx *rx, int index)
 	struct net_device *netdev = rx->adapter->netdev;
 	struct device *dev = &rx->adapter->pdev->dev;
 	struct lan743x_rx_buffer_info *buffer_info;
+	unsigned int buffer_length, packet_length;
 	struct lan743x_rx_descriptor *descriptor;
 	struct sk_buff *skb;
 	dma_addr_t dma_ptr;
-	int length;
 
-	length = netdev->mtu + ETH_HLEN + 4 + RX_HEAD_PADDING;
+	buffer_length = netdev->mtu + ETH_HLEN + 4 + RX_HEAD_PADDING;
 
 	descriptor = &rx->ring_cpu_ptr[index];
 	buffer_info = &rx->buffer_info[index];
-	skb = __netdev_alloc_skb(netdev, length, GFP_ATOMIC | GFP_DMA);
+	skb = __netdev_alloc_skb(netdev, buffer_length, GFP_ATOMIC | GFP_DMA);
 	if (!skb)
 		return -ENOMEM;
-	dma_ptr = dma_map_single(dev, skb->data, length, DMA_FROM_DEVICE);
+	dma_ptr = dma_map_single(dev, skb->data, buffer_length, DMA_FROM_DEVICE);
 	if (dma_mapping_error(dev, dma_ptr)) {
 		dev_kfree_skb_any(skb);
 		return -ENOMEM;
 	}
-	if (buffer_info->dma_ptr)
-		dma_unmap_single(dev, buffer_info->dma_ptr,
-				 buffer_info->buffer_length, DMA_FROM_DEVICE);
+	if (buffer_info->dma_ptr) {
+		/* unmap from dma */
+		packet_length = RX_DESC_DATA0_FRAME_LENGTH_GET_
+				(le32_to_cpu(descriptor->data0));
+		if (packet_length == 0 ||
+		    packet_length > buffer_info->buffer_length)
+			/* buffer is part of multi-buffer packet: fully used */
+			packet_length = buffer_info->buffer_length;
+		/* sync used part of buffer only */
+		dma_sync_single_for_cpu(dev, buffer_info->dma_ptr,
+					packet_length,
+					DMA_FROM_DEVICE);
+		dma_unmap_single_attrs(dev, buffer_info->dma_ptr,
+				       buffer_info->buffer_length,
+				       DMA_FROM_DEVICE,
+				       DMA_ATTR_SKIP_CPU_SYNC);
+	}
 
 	buffer_info->skb = skb;
 	buffer_info->dma_ptr = dma_ptr;
-	buffer_info->buffer_length = length;
+	buffer_info->buffer_length = buffer_length;
 	descriptor->data1 = cpu_to_le32(DMA_ADDR_LOW32(buffer_info->dma_ptr));
 	descriptor->data2 = cpu_to_le32(DMA_ADDR_HIGH32(buffer_info->dma_ptr));
 	descriptor->data3 = 0;
 	descriptor->data0 = cpu_to_le32((RX_DESC_DATA0_OWN_ |
-			    (length & RX_DESC_DATA0_BUF_LENGTH_MASK_)));
+			    (buffer_length & RX_DESC_DATA0_BUF_LENGTH_MASK_)));
 	lan743x_rx_update_tail(rx, index);
 
 	return 0;