Message ID | 20230709195712.603200-1-martin.blumenstingl@googlemail.com |
---|---|
State | New |
Series | wifi: rtw88: sdio: Honor the host max_req_size in the RX path |
> -----Original Message-----
> From: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> Sent: Monday, July 10, 2023 3:57 AM
> To: linux-wireless@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org; jernej.skrabec@gmail.com; Ping-Ke Shih <pkshih@realtek.com>;
> ulf.hansson@linaro.org; kvalo@kernel.org; tony0620emma@gmail.com; Martin Blumenstingl
> <martin.blumenstingl@googlemail.com>; Lukas F . Hartmann <lukas@mntre.com>
> Subject: [PATCH] wifi: rtw88: sdio: Honor the host max_req_size in the RX path
>
> Lukas reports skb_over_panic errors on his Banana Pi BPI-CM4 which comes
> with an Amlogic A311D (G12B) SoC and an RTL8822CS SDIO wifi/Bluetooth
> combo card. The error he observed is identical to what has been fixed
> in commit e967229ead0e ("wifi: rtw88: sdio: Check the HISR RX_REQUEST
> bit in rtw_sdio_rx_isr()") but that commit didn't fix Lukas' problem.
>
> Lukas found that disabling or limiting RX aggregation fixes the problem
> for him. In the following discussion a few key topics have been
> discussed which have an impact on this problem:
> - The Amlogic A311D (G12B) SoC has a hardware bug in the SDIO controller
>   which prevents DMA transfers. Instead all transfers need to go through
>   the controller SRAM which limits transfers to 1536 bytes
> - rtw88 chips don't split incoming (RX) packets, so if a big packet is
>   received this is forwarded to the host in its original form
> - rtw88 chips can do RX aggregation, meaning multiple incoming
>   packets can be pulled by the host from the card with one MMC/SDIO
>   transfer. This depends on settings in the REG_RXDMA_AGG_PG_TH
>   register (BIT_RXDMA_AGG_PG_TH limits the number of packets that will
>   be aggregated, BIT_DMA_AGG_TO_V1 configures a timeout for aggregation
>   and BIT_EN_PRE_CALC makes the chip honor the limits more effectively)
>
> Use multiple consecutive reads in rtw_sdio_read_port() to limit the
> number of bytes which are copied by the host from the card in one
> MMC/SDIO transfer. This allows receiving a buffer that's larger than
> the host's max_req_size (number of bytes which can be transferred in
> one MMC/SDIO transfer). As a result of this the skb_over_panic error
> is gone as the rtw88 driver is now able to receive more than 1536 bytes
> from the card (either because the incoming packet is larger than that
> or because multiple packets have been aggregated).

I assume your conclusion is correct for all platforms, so I am adding my Reviewed-by.
But I think it would be better if Lukas could help test this patch on his
platform and give a Tested-by tag before this patch gets merged.

>
> Fixes: 65371a3f14e7 ("wifi: rtw88: sdio: Add HCI implementation for SDIO based chipsets")
> Reported-by: Lukas F. Hartmann <lukas@mntre.com>
> Closes: https://lore.kernel.org/linux-wireless/CAFBinCBaXtebixKbjkWKW_WXc5k=NdGNaGUjVE8NCPNxOhsb2g@mail.gmail.com/
> Suggested-by: Ping-Ke Shih <pkshih@realtek.com>
> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>

Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>

[...]
> -----Original Message-----
> From: Lukas F. Hartmann <lukas@mntre.com>
> Sent: Thursday, July 13, 2023 8:49 PM
> To: Ping-Ke Shih <pkshih@realtek.com>; Martin Blumenstingl <martin.blumenstingl@googlemail.com>;
> linux-wireless@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org; jernej.skrabec@gmail.com; ulf.hansson@linaro.org; kvalo@kernel.org;
> tony0620emma@gmail.com
> Subject: RE: [PATCH] wifi: rtw88: sdio: Honor the host max_req_size in the RX path
>
> Hi,
>
> Ping-Ke Shih <pkshih@realtek.com> writes:
>
> > I assume your conclusion is correct for all platforms, so I am adding my Reviewed-by.
> > But I think it would be better if Lukas could help test this patch on his
> > platform and give a Tested-by tag before this patch gets merged.
>
> I have been testing this now more rigorously on my own laptop with
> Kernel 6.4.1 (from Debian experimental) and this patch applied. I first
> had issues with rtw_power_mode_change (and "firmware failed to leave lps
> state"), so I turned off power_save using iw. This made everything
> quiet, but unfortunately after about 1 hour of usage I get
> skb_over_panic again and I believe some memory corruption happens in the
> kernel, as I can do dmesg only once and then another dmesg will hang forever.
> (After WARNING: CPU: 4 PID: 0 at kernel/context_tracking.c:128
> ct_kernel_exit.constprop.0+0xa0/0xa8)
>
> Here are the errors that lead up to this:
> http://dump.mntmn.com/rtw88-failure-1h-dmesg.txt

Hi Martin,

The dmesg shows that
"rtw_8822cs mmc2:0001:1: Failed to read 1536 byte(s) from SDIO port 0x000000d1"

Shouldn't we return an error code (with proper error handling) instead of
just breaking the loop? Because the 'buf' content isn't usable.

I wonder whether the approach of this patch is still not enough for Lukas' platform.

Ping-Ke
Hello Ping-Ke,

On Fri, Jul 14, 2023 at 2:34 AM Ping-Ke Shih <pkshih@realtek.com> wrote:
[...]
> > Here are the errors that lead up to this:
> > http://dump.mntmn.com/rtw88-failure-1h-dmesg.txt
>
> Hi Martin,
>
> The dmesg shows that
> "rtw_8822cs mmc2:0001:1: Failed to read 1536 byte(s) from SDIO port 0x000000d1"
>
> Shouldn't we return an error code (with proper error handling) instead of
> just breaking the loop? Because the 'buf' content isn't usable.

In my opinion we are properly breaking the loop:
"ret" will be non-zero so the error code is returned from
rtw_sdio_read_port() to the caller.
The (only) caller is rtw_sdio_rxfifo_recv() which sees the non-zero
return code and aborts processing.
What do you think?

> I wonder whether the approach of this patch is still not enough for Lukas' platform.

On IRC Lukas wrote:
  funny, i can reproduce skb_panic when opening this page in chromium
  https://embedded.avnet.com/product/msc-sm2s-ryz/
and:
  still getting spurious skb_panics, even after disabling rx aggregation.

I haven't had the time to look into this any further yet.
Unfortunately I don't have any hardware to reproduce this problem
either, which results in this long ping-pong.

Lukas, could you please add two more prints:
- in the rtw_warn with "Failed to read %zu byte(s) from SDIO port":
  please also print the ret variable (with %d) - I'm curious what the
  reported error is (it could be some CRC error which would mean ret is
  -EILSEQ)
- add something like the following at the end of rtw_sdio_read_port()
  (right before "return ret"):

  if (!ret && count > 1000) {
    printk(KERN_INFO "rtw_sdio_read_port() with %zu bytes:", count);
    print_hex_dump(KERN_INFO, "", DUMP_PREFIX_OFFSET, 16, 1, buf, count, false);
  }

  (note: I only compile-tested this)

The very last dump in this (potentially spammy) output will contain
the full buffer that's causing the problem.

Best regards,
Martin
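For readers who want to see what such a dump would look like without building a kernel, here is a minimal user-space sketch. It only approximates one row of print_hex_dump() as called above (offset prefix, 16 bytes per row, one-byte groups, no ASCII column). The names format_hex_row() and dump_large_read() are hypothetical helpers made up for this illustration; they are not kernel or rtw88 functions, and the exact kernel output may differ in details.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format up to 16 bytes as "oooooooo: xx xx ...".
 * Returns the number of characters written (without the terminating NUL),
 * or -1 if the output buffer is too small.
 */
int format_hex_row(char *out, size_t outsz, size_t offset,
		   const unsigned char *buf, size_t len)
{
	size_t used;
	int n;

	assert(out && buf);

	if (len > 16)
		len = 16;

	/* Offset prefix, zero-padded to eight hex digits. */
	n = snprintf(out, outsz, "%08zx:", offset);
	if (n < 0 || (size_t)n >= outsz)
		return -1;
	used = (size_t)n;

	/* One space-separated two-digit hex value per byte. */
	for (size_t i = 0; i < len; i++) {
		n = snprintf(out + used, outsz - used, " %02x", buf[i]);
		if (n < 0 || (size_t)n >= outsz - used)
			return -1;
		used += (size_t)n;
	}

	return (int)used;
}

/*
 * Hypothetical helper: dump a buffer row by row, but only for successful
 * reads of more than 1000 bytes, mirroring the condition suggested above.
 */
void dump_large_read(FILE *fp, int ret, const unsigned char *buf, size_t count)
{
	char row[80];

	if (ret || count <= 1000)
		return;

	fprintf(fp, "read_port with %zu bytes:\n", count);
	for (size_t off = 0; off < count; off += 16) {
		size_t n = count - off < 16 ? count - off : 16;

		if (format_hex_row(row, sizeof(row), off, buf + off, n) > 0)
			fprintf(fp, "%s\n", row);
	}
}
```

Calling dump_large_read(stderr, 0, buf, count) after a successful read writes one header line followed by one row per 16 bytes; failed reads and reads of 1000 bytes or fewer are skipped.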
> -----Original Message-----
> From: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> Sent: Thursday, July 27, 2023 1:38 AM
> To: Ping-Ke Shih <pkshih@realtek.com>
> Cc: Lukas F. Hartmann <lukas@mntre.com>; linux-wireless@vger.kernel.org; linux-kernel@vger.kernel.org;
> jernej.skrabec@gmail.com; ulf.hansson@linaro.org; kvalo@kernel.org; tony0620emma@gmail.com
> Subject: Re: [PATCH] wifi: rtw88: sdio: Honor the host max_req_size in the RX path
>
> Hello Ping-Ke,
>
> On Fri, Jul 14, 2023 at 2:34 AM Ping-Ke Shih <pkshih@realtek.com> wrote:
> [...]
> > > Here are the errors that lead up to this:
> > > http://dump.mntmn.com/rtw88-failure-1h-dmesg.txt
> >
> > Hi Martin,
> >
> > The dmesg shows that
> > "rtw_8822cs mmc2:0001:1: Failed to read 1536 byte(s) from SDIO port 0x000000d1"
> >
> > Shouldn't we return an error code (with proper error handling) instead of
> > just breaking the loop? Because the 'buf' content isn't usable.
> In my opinion we are properly breaking the loop:
> "ret" will be non-zero so the error code is returned from
> rtw_sdio_read_port() to the caller.
> The (only) caller is rtw_sdio_rxfifo_recv() which sees the non-zero
> return code and aborts processing.
> What do you think?

You are correct. I checked the kernel log again. It might try to read twice
for a large packet. The first read is 1536 bytes, but it failed:

[ 4002.096664] rtw_8822cs mmc2:0001:1: Failed to read 1536 byte(s) from SDIO port 0x000000d1

The second read is fewer bytes and it succeeds, but the skb->data content is
incorrect. Then,

[ 4002.100140] rtw_8822cs mmc2:0001:1: unused phy status page (3)
[ 4002.105065] rtw_8822cs mmc2:0001:1: unused phy status page (2)
[ 4002.110862] ------------[ cut here ]------------
[ 4002.110868] Rate marked as a VHT rate but data is invalid: MCS: 0, NSS: 0

So, showing the total size (the 'count' argument) might help to find the cause
or a workaround.

Ping-Ke
Martin Blumenstingl <martin.blumenstingl@googlemail.com> wrote:

> [...]
>
> Fixes: 65371a3f14e7 ("wifi: rtw88: sdio: Add HCI implementation for SDIO based chipsets")
> Reported-by: Lukas F. Hartmann <lukas@mntre.com>
> Closes: https://lore.kernel.org/linux-wireless/CAFBinCBaXtebixKbjkWKW_WXc5k=NdGNaGUjVE8NCPNxOhsb2g@mail.gmail.com/
> Suggested-by: Ping-Ke Shih <pkshih@realtek.com>
> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>

Ping, should I take or drop the patch? It wasn't quite clear to me.
> -----Original Message-----
> From: Kalle Valo <kvalo@kernel.org>
> Sent: Tuesday, August 1, 2023 10:11 PM
> To: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> Cc: linux-wireless@vger.kernel.org; linux-kernel@vger.kernel.org; jernej.skrabec@gmail.com; Ping-Ke Shih
> <pkshih@realtek.com>; ulf.hansson@linaro.org; tony0620emma@gmail.com; Martin Blumenstingl
> <martin.blumenstingl@googlemail.com>; Lukas F . Hartmann <lukas@mntre.com>
> Subject: Re: [PATCH] wifi: rtw88: sdio: Honor the host max_req_size in the RX path
>
> Ping, should I take or drop the patch? It wasn't quite clear to me.

Please drop this patch, because it still does not fix the problem on
Lukas' platform. I gave my Reviewed-by too early. Sorry for that.

Ping-Ke
diff --git a/drivers/net/wireless/realtek/rtw88/sdio.c b/drivers/net/wireless/realtek/rtw88/sdio.c
index 2c1fb2dabd40..b19262ec5d8c 100644
--- a/drivers/net/wireless/realtek/rtw88/sdio.c
+++ b/drivers/net/wireless/realtek/rtw88/sdio.c
@@ -500,19 +500,31 @@ static u32 rtw_sdio_get_tx_addr(struct rtw_dev *rtwdev, size_t size,
 static int rtw_sdio_read_port(struct rtw_dev *rtwdev, u8 *buf, size_t count)
 {
 	struct rtw_sdio *rtwsdio = (struct rtw_sdio *)rtwdev->priv;
+	struct mmc_host *host = rtwsdio->sdio_func->card->host;
 	bool bus_claim = rtw_sdio_bus_claim_needed(rtwsdio);
 	u32 rxaddr = rtwsdio->rx_addr++;
+	size_t bytes;
 	int ret;
 
 	if (bus_claim)
 		sdio_claim_host(rtwsdio->sdio_func);
 
-	ret = sdio_memcpy_fromio(rtwsdio->sdio_func, buf,
-				 RTW_SDIO_ADDR_RX_RX0FF_GEN(rxaddr), count);
-	if (ret)
-		rtw_warn(rtwdev,
-			 "Failed to read %zu byte(s) from SDIO port 0x%08x",
-			 count, rxaddr);
+	while (count > 0) {
+		bytes = min_t(size_t, host->max_req_size, count);
+
+		ret = sdio_memcpy_fromio(rtwsdio->sdio_func, buf,
+					 RTW_SDIO_ADDR_RX_RX0FF_GEN(rxaddr),
+					 bytes);
+		if (ret) {
+			rtw_warn(rtwdev,
+				 "Failed to read %zu byte(s) from SDIO port 0x%08x",
+				 bytes, rxaddr);
+			break;
+		}
+
+		count -= bytes;
+		buf += bytes;
+	}
 
 	if (bus_claim)
 		sdio_release_host(rtwsdio->sdio_func);
Lukas reports skb_over_panic errors on his Banana Pi BPI-CM4 which comes
with an Amlogic A311D (G12B) SoC and an RTL8822CS SDIO wifi/Bluetooth
combo card. The error he observed is identical to what has been fixed
in commit e967229ead0e ("wifi: rtw88: sdio: Check the HISR RX_REQUEST
bit in rtw_sdio_rx_isr()") but that commit didn't fix Lukas' problem.

Lukas found that disabling or limiting RX aggregation fixes the problem
for him. In the following discussion a few key topics have been
discussed which have an impact on this problem:
- The Amlogic A311D (G12B) SoC has a hardware bug in the SDIO controller
  which prevents DMA transfers. Instead all transfers need to go through
  the controller SRAM which limits transfers to 1536 bytes
- rtw88 chips don't split incoming (RX) packets, so if a big packet is
  received this is forwarded to the host in its original form
- rtw88 chips can do RX aggregation, meaning multiple incoming
  packets can be pulled by the host from the card with one MMC/SDIO
  transfer. This depends on settings in the REG_RXDMA_AGG_PG_TH
  register (BIT_RXDMA_AGG_PG_TH limits the number of packets that will
  be aggregated, BIT_DMA_AGG_TO_V1 configures a timeout for aggregation
  and BIT_EN_PRE_CALC makes the chip honor the limits more effectively)

Use multiple consecutive reads in rtw_sdio_read_port() to limit the
number of bytes which are copied by the host from the card in one
MMC/SDIO transfer. This allows receiving a buffer that's larger than
the host's max_req_size (number of bytes which can be transferred in
one MMC/SDIO transfer). As a result of this the skb_over_panic error
is gone as the rtw88 driver is now able to receive more than 1536 bytes
from the card (either because the incoming packet is larger than that
or because multiple packets have been aggregated).

Fixes: 65371a3f14e7 ("wifi: rtw88: sdio: Add HCI implementation for SDIO based chipsets")
Reported-by: Lukas F. Hartmann <lukas@mntre.com>
Closes: https://lore.kernel.org/linux-wireless/CAFBinCBaXtebixKbjkWKW_WXc5k=NdGNaGUjVE8NCPNxOhsb2g@mail.gmail.com/
Suggested-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
---
 drivers/net/wireless/realtek/rtw88/sdio.c | 24 +++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)
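
The chunking idea described in the commit message can be tried outside the kernel. The following is a minimal user-space sketch, not the driver code: struct fake_fifo, fake_memcpy_fromio() and read_port_chunked() are hypothetical names invented for this illustration. The fake transport simply rejects any request above max_req_size, which is a simplifying assumption and not a statement about how sdio_memcpy_fromio() behaves on real hardware.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the card's RX FIFO (not a kernel type). */
struct fake_fifo {
	const unsigned char *data;  /* bytes waiting on the "card" */
	size_t len;                 /* total bytes available */
	size_t pos;                 /* read cursor, advances on every read */
	size_t max_req_size;        /* largest transfer the "host" accepts */
	unsigned int transfers;     /* number of successful transfers */
};

/*
 * Hypothetical stand-in for a single MMC/SDIO transfer: fails for requests
 * larger than max_req_size, otherwise copies the next n bytes from the FIFO.
 */
int fake_memcpy_fromio(struct fake_fifo *f, unsigned char *dst, size_t n)
{
	assert(f && dst);

	if (n > f->max_req_size || n > f->len - f->pos)
		return -EINVAL;

	memcpy(dst, f->data + f->pos, n);
	f->pos += n;
	f->transfers++;
	return 0;
}

/* Read count bytes using transfers of at most max_req_size bytes each. */
int read_port_chunked(struct fake_fifo *f, unsigned char *buf, size_t count)
{
	int ret = 0; /* defined result even when count == 0 */

	if (f->max_req_size == 0)
		return -EINVAL; /* avoid looping forever on a zero limit */

	while (count > 0) {
		size_t bytes = count < f->max_req_size ? count : f->max_req_size;

		ret = fake_memcpy_fromio(f, buf, bytes);
		if (ret)
			break; /* buf is only partially filled, caller must discard it */

		count -= bytes;
		buf += bytes;
	}

	return ret;
}
```

With a 4000-byte FIFO and max_req_size set to 1536, read_port_chunked() issues three transfers (1536, 1536 and 928 bytes), while a single 4000-byte request to fake_memcpy_fromio() is rejected. The sketch initializes ret to 0 so that a zero-length request has a defined result; the diff declares "int ret;" without an initializer.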