mbox series

[v2,00/50] wilc1000: rework tx path to use sk_buffs throughout

Message ID 20211223011358.4031459-1-davidm@egauge.net
Headers show
Series wilc1000: rework tx path to use sk_buffs throughout | expand

Message

David Mosberger-Tang Dec. 23, 2021, 1:14 a.m. UTC
OK, so I'm nervous about such a large patch series, but it took a lot
of work to break things down into atomic changes.  This should be it
for the transmit path as far as I'm concerned.

- v2:
	- Fix 64-bit architecture compile breakage found by
          kernel build daemon.
	- Fix kernel-doc issues.
	- All patches compie with "make W=1" now and pass scripts/checkpatch.pl
        - Expand series to clean up some locking issues.
        - Expand series to support zero-copy tx transfers for SPI.
        - Rebase to latest wireless-drivers-next

Rework tx path to use sk_buffs throughout and optionally provide
zero-copy transmit.

Based on the earlier discussion (RFC: wilc1000: refactor TX path to
use sk_buff queue), here is the full patch series to clean up and
simplify the TX path.

The biggest patch is 0016, which is the one actually switching the
queue data type, but I worked hard to minimize it to only direct
changes due to the type changes.

There is no dramatic performance difference due to this patch.  I'd
expect the new code to be slightly faster, but my WLAN
test-environment is not sufficiently controlled to be sure of that.

original iperf3 performance (duration 120 seconds):

                TX [Mbps]	RX [Mbps]
  PSM off:	14.8		18.9
  PSM  on:	10.5		17.1

iperf3 performance with this patch-series applied:

		TX [Mbps]	RX [Mbps]
  PSM off:	16.0		19.9
  PSM  on:	11.7		18.0

(PSM == power-save-mode; controlled by iw dev wlan0 set power_save on/off)

David Mosberger-Tang (50):
  wilc1000: don't hold txq_spinlock while initializing AC queue limits
  wilc1000: switch txq_event from completion to waitqueue
  wilc1000: move receive-queue stats from txq to wilc structure
  wilc1000: factor common code in wilc_wlan_cfg_set() and wilc_wlan_cfg_get()
  wilc1000: add wilc_wlan_tx_packet_done() function
  wilc1000: move tx packet drop code into its own function
  wilc1000: increment tx_dropped stat counter on tx packet drop
  wilc1000: fix management packet type inconsistency
  wilc1000: prepare wilc_wlan_tx_packet_done() for sk_buff changes
  wilc1000: factor initialization of tx queue-specific packet fields
  wilc1000: convert tqx_entries from "int" to "atomic_t"
  wilc1000: refactor wilc_wlan_cfg_commit() a bit
  wilc1000: sanitize config packet sequence number management a bit
  wilc1000: if there is no tx packet, don't increment packets-sent counter
  wilc1000: add struct wilc_skb_tx_cb as an alias of struct txq_entry_t
  wilc1000: switch tx queue to normal sk_buff entries
  wilc1000: remove no longer used "vif" argument from init_txq_entry()
  wilc1000: split huge tx handler into subfunctions
  wilc1000: don't tell the chip to go to sleep while copying tx packets
  wilc1000: eliminate "max_size_over" variable in fill_vmm_table
  wilc1000: declare read-only ac_preserve_ratio as static and const
  wilc1000: minor syntax cleanup
  wilc1000: introduce symbolic names for two tx-related control bits
  wilc1000: protect tx_q_limit with a mutex instead of a spinlock
  wilc1000: replace txq_spinlock with ack_filter_lock mutex
  wilc1000: reduce amount of time ack_filter_lock is held
  wilc1000: simplify ac_balance() a bit
  wilc1000: improve send_packets() a bit
  wilc1000: factor header length calculation into a new function
  wilc1000: use more descriptive variable names
  wilc1000: eliminate another magic constant
  wilc1000: introduce vmm_table_entry() helper function
  wilc1000: move ac_desired_ratio calculation to where its needed
  wilc1000: restructure wilc-wlan_handle_txq() for clarity
  wilc1000: introduce copy_and_send_packets() helper function
  wilc1000: introduce transmit path chip queue
  wilc1000: introduce set_header() function
  wilc1000: take advantage of chip queue
  wilc1000: eliminate txq_add_to_head_cs mutex
  wilc1000: introduce schedule_packets() function
  wilc1000: use more descriptive variable name
  wilc1000: simplify code by adding header/padding to skb
  wilc1000: add support for zero-copy transmit of tx packets
  wilc1000: don't allocate tx_buffer when zero-copy is available
  wilc1000: move struct wilc_spi declaration
  wilc1000: remove duplicate CRC calculation code
  wilc1000: factor SPI DMA command initialization code into a function
  wilc1000: introduce function to find and check DMA response
  wilc1000: implement zero-copy transmit support for SPI
  wilc1000: add module parameter "disable_zero_copy_tx" to SPI driver

 .../wireless/microchip/wilc1000/cfg80211.c    |  45 +-
 drivers/net/wireless/microchip/wilc1000/mon.c |  36 +-
 .../net/wireless/microchip/wilc1000/netdev.c  |  46 +-
 .../net/wireless/microchip/wilc1000/netdev.h  |  41 +-
 drivers/net/wireless/microchip/wilc1000/spi.c | 293 ++++-
 .../net/wireless/microchip/wilc1000/wlan.c    | 998 ++++++++++--------
 .../net/wireless/microchip/wilc1000/wlan.h    |  55 +-
 7 files changed, 888 insertions(+), 626 deletions(-)

Comments

Ajay Singh Dec. 23, 2021, 6:16 a.m. UTC | #1
On 23/12/21 06:44, David Mosberger-Tang wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
>
> OK, so I'm nervous about such a large patch series, but it took a lot
> of work to break things down into atomic changes.  This should be it
> for the transmit path as far as I'm concerned.


Thanks David for the efforts to break down the changes. I am still 
reviewing and testing the previous series and found some inconsistent 
results. I am not sure about the cause of the difference. For some 
tests, the throughput is improved(~1Mbps) but for some CI tests, the 
throughput is less compared(~1Mbps in same range) to the previous. 
Though not observed much difference.

Now the new patches are added to the same series so it is difficult to 
review them in one go.

I have a request, incase there are new patches please include them in 
separate series. Breaking down the patch helps to identify the non 
related changes which can go in separate series. The patches(change) may 
be related to TX path flow but can go in separate series.


Regards,
Ajay
Kalle Valo Dec. 23, 2021, 6:41 a.m. UTC | #2
<Ajay.Kathat@microchip.com> writes:

> On 23/12/21 06:44, David Mosberger-Tang wrote:
>> EXTERNAL EMAIL: Do not click links or open attachments unless you
>> know the content is safe
>>
>> OK, so I'm nervous about such a large patch series, but it took a lot
>> of work to break things down into atomic changes.  This should be it
>> for the transmit path as far as I'm concerned.
>
>
> Thanks David for the efforts to break down the changes. I am still 
> reviewing and testing the previous series and found some inconsistent 
> results. I am not sure about the cause of the difference. For some 
> tests, the throughput is improved(~1Mbps) but for some CI tests, the 
> throughput is less compared(~1Mbps in same range) to the previous. 
> Though not observed much difference.
>
> Now the new patches are added to the same series so it is difficult to 
> review them in one go.
>
> I have a request, incase there are new patches please include them in 
> separate series. Breaking down the patch helps to identify the non 
> related changes which can go in separate series. The patches(change) may 
> be related to TX path flow but can go in separate series.

Yeah, a thumb of rule is to have around 10-12 patches per patchset. Then
it's still pretty easy to review them and get them accepted. Of course
it's not a hard rule, for smaller patches (like here) having more than
12 is still doable. An also the opposite, with big patches even 10
patches is too much. But 50 patches is just pure pain for the reviewers :)
David Mosberger-Tang Dec. 23, 2021, 3:23 p.m. UTC | #3
On Thu, 2021-12-23 at 06:16 +0000, Ajay.Kathat@microchip.com wrote:
> On 23/12/21 06:44, David Mosberger-Tang wrote:
> > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> > 
> > OK, so I'm nervous about such a large patch series, but it took a lot
> > of work to break things down into atomic changes.  This should be it
> > for the transmit path as far as I'm concerned.
> 
> Thanks David for the efforts to break down the changes. I am still 
> reviewing and testing the previous series and found some inconsistent 
> results. I am not sure about the cause of the difference. For some 
> tests, the throughput is improved(~1Mbps) but for some CI tests, the 
> throughput is less compared(~1Mbps in same range) to the previous. 
> Though not observed much difference.

There shouldn't be any significant performance regressions.  From my
observations, +/-1Mbps in throughput is quite possible due to cache-
effects.

> Now the new patches are added to the same series so it is difficult to 
> review them in one go.

Ah, OK, sorry about that.  After the automated error reports, I waited
for a day or two and after not seeing any further feedback, I figured
it'd be fine to add to the series.

I take it that as long as a patch shows up in patchworks with a
state==Action required, I should assume the patch is being worked on
(or will be worked on).

> I have a request, incase there are new patches please include them in 
> separate series.

I'm not planning on adding more patches.

  --david