mbox series

[v4,00/25] drm/msm/dpu: Add Concurrent Writeback Support for DPU 10.x+

Message ID 20241216-concurrent-wb-v4-0-fe220297a7f0@quicinc.com
Headers show
Series drm/msm/dpu: Add Concurrent Writeback Support for DPU 10.x+ | expand

Message

Jessica Zhang Dec. 17, 2024, 12:43 a.m. UTC
DPU supports a single writeback session running concurrently with primary
display when the CWB mux is configured properly. This series enables
clone mode for DPU driver and adds support for programming the CWB mux
in cases where the hardware has dedicated CWB pingpong blocks. Currently,
the CWB hardware blocks have only been added to the SM8650
hardware catalog and only DSI has been exposed as a possible_clone of WB.

This changes are split into two parts:

The first part of the series will pull in Dmitry's patches to refactor
the DPU resource manager to be based off of CRTC instead of encoder.
This includes some changes (noted in the relevant commits) by me and
Abhinav to fix some issues with getting the global state and refactoring
the CDM allocation to work with Dmitry's changes.

The second part of the series will add support for CWB by doing the
following:

1) Add a DRM helper to detect if the current CRTC state is in clone mode
   and add an "in_clone_mode" entry to the atomic state print
2) Add the CWB mux to the hardware catalog and clarify the pingpong
   block index enum to specifiy which pingpong blocks are dedicated to
   CWB only and which ones are general use pingpong blocks
3) Add CWB as part of the devcoredump
4) Add support for configuring the CWB mux via dpu_hw_cwb ops
5) Add pending flush support for CWB
6) Add support for validating clone mode in the DPU CRTC and setting up
   CWB within the encoder
7) Adjust the encoder trigger flush, trigger start, and kickoff order to
   accomodate clone mode
8) Adjust when the frame done timer is started for clone mode
9) Define the possible clones for DPU encoders so that 

The feature was tested on SM8650 using IGT's kms_writeback test with the
following change [1] and dumping the writeback framebuffer when in clone
mode. I haven't gotten the chance to test it on DP yet, but I've
validated both single and dual LM on DSI.

To test CWB with IGT, you'll need to apply this series [1] and this
driver patch [2]. Run the following command to dump the writeback buffer:

IGT_FRAME_DUMP_PATH=<dump path> FRAME_PNG_FILE_NAME=<file name> \
./build/tests/kms_writeback -d [--run-subtest dump-valid-clones] \

You can also do CRC validation by running this command:

./build/tests/kms_writeback [--run-subtest dump-valid-clones]

[1] https://patchwork.freedesktop.org/series/137933/
[2] https://patchwork.freedesktop.org/series/138284/

---
Changes in v4:
- Rebased onto latest msm-next
- Added kunit tests for framework changes
- Skip valid clone check for encoders that don't have any possible clones set
  (this is to avoid failing kunit tests, specifically the HDMI state helper tests)
- Link to v3: https://lore.kernel.org/r/20241016-concurrent-wb-v3-0-a33cf9b93835@quicinc.com

Changes in v3:
- Dropped support for CWB on DP connectors for now
- Dropped unnecessary PINGPONG array in *_setup_cwb()
- Add a check to make sure CWB and CDM aren't supported simultaneously
  (Dmitry)
- Document cwb_enabled checks in dpu_crtc_get_topology() (Dmitry)
- Moved implementation of drm_crtc_in_clone_mode() to drm_crtc.c (Jani)
- Dropped duplicate error message for reserving CWB resources (Dmitry)
- Added notes in framework changes about posting a separate series to
  add proper KUnit tests (Maxime)
- Added commit message note addressing Sima's comment on handling
  mode_changed (Dmitry)
- Formatting fixes (Dmitry)
- Added proper kerneldocs (Dmitry)
- Renamed dpu_encoder_helper_get_cwb() -> *_get_cwb_mask() (Dmitry)
- Capitalize all instances of "pingpong" in comments (Dmitry)
- Link to v2: https://lore.kernel.org/r/20240924-concurrent-wb-v2-0-7849f900e863@quicinc.com

Changes in v2:
- Moved CWB hardware programming to its own dpu_hw_cwb abstraction
  (Dmitry)
- Reserve and get assigned CWB muxes using RM API and KMS global state
  (Dmitry)
- Dropped requirement to have only one CWB session at a time
- Moved valid clone mode check to DRM framework (Dmitry and Ville)
- Switch to default CWB tap point to LM as the DSPP
- Dropped printing clone mode status in atomic state (Dmitry)
- Call dpu_vbif_clear_errors() before dpu_encoder_kickoff() (Dmitry)
- Squashed setup_input_ctrl() and setup_input_mode() into a single
  dpu_hw_cwb op (Dmitry)
- Moved function comment docs to correct place and fixed wording of
  comments/commit messages (Dmitry)
- Grabbed old CRTC state using proper drm_atomic_state API in
  dpu_crtc_atomic_check() (Dmitry)
- Split HW catalog changes of adding the CWB mux block and changing the
  dedicated CWB pingpong indices into 2 separate commits (Dmitry)
- Moved clearing the dpu_crtc_state.num_mixers to "drm/msm/dpu: fill
  CRTC resources in dpu_crtc.c" (Dmitry)
- Fixed alignment and other formatting issues (Dmitry)
- Link to v1: https://lore.kernel.org/r/20240829-concurrent-wb-v1-0-502b16ae2ebb@quicinc.com

---
Dmitry Baryshkov (4):
      drm/msm/dpu: get rid of struct dpu_rm_requirements
      drm/msm/dpu: switch RM to use crtc_id rather than enc_id for allocation
      drm/msm/dpu: move resource allocation to CRTC
      drm/msm/dpu: fill CRTC resources in dpu_crtc.c

Esha Bharadwaj (3):
      drm/msm/dpu: Add CWB entry to catalog for SM8650
      drm/msm/dpu: add devcoredumps for cwb registers
      drm/msm/dpu: add CWB support to dpu_hw_wb

Jessica Zhang (18):
      drm: add clone mode check for CRTC
      drm/tests: Add test for drm_crtc_in_clone_mode()
      drm: Add valid clones check
      drm/tests: Add test for drm_atomic_helper_check_modeset()
      drm/msm/dpu: Specify dedicated CWB pingpong blocks
      drm/msm/dpu: Add dpu_hw_cwb abstraction for CWB block
      drm/msm/dpu: Add RM support for allocating CWB
      drm/msm/dpu: Add CWB to msm_display_topology
      drm/msm/dpu: Require modeset if clone mode status changes
      drm/msm/dpu: Fail atomic_check if CWB and CDM are enabled
      drm/msm/dpu: Reserve resources for CWB
      drm/msm/dpu: Configure CWB in writeback encoder
      drm/msm/dpu: Support CWB in dpu_hw_ctl
      drm/msm/dpu: Adjust writeback phys encoder setup for CWB
      drm/msm/dpu: Start frame done timer after encoder kickoff
      drm/msm/dpu: Skip trigger flush and start for CWB
      drm/msm/dpu: Reorder encoder kickoff for CWB
      drm/msm/dpu: Set possible clones for all encoders

 drivers/gpu/drm/drm_atomic_helper.c                |  28 ++
 drivers/gpu/drm/drm_crtc.c                         |  20 +
 drivers/gpu/drm/msm/Makefile                       |   1 +
 .../drm/msm/disp/dpu1/catalog/dpu_10_0_sm8650.h    |  29 +-
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h |   4 +-
 .../drm/msm/disp/dpu1/catalog/dpu_8_4_sa8775p.h    |   4 +-
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h |   4 +-
 .../drm/msm/disp/dpu1/catalog/dpu_9_2_x1e80100.h   |   4 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c           | 208 ++++++++-
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c        | 463 ++++++++++++---------
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h        |  14 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h   |   7 +-
 .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c    |  16 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h     |  13 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c         |  30 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h         |  15 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.c         |  73 ++++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.h         |  70 ++++
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h        |  15 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.c          |   4 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c            |  12 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h            |  13 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c             | 361 +++++++++-------
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h             |  13 +-
 drivers/gpu/drm/tests/drm_atomic_state_test.c      | 133 +++++-
 include/drm/drm_crtc.h                             |   2 +-
 26 files changed, 1172 insertions(+), 384 deletions(-)
---
base-commit: 86313a9cd152330c634b25d826a281c6a002eb77
change-id: 20240618-concurrent-wb-97d62387f952
prerequisite-change-id: 20241209-abhinavk-modeset-fix-74864f1de08d:v3
prerequisite-patch-id: a197a0cd4647cb189ea20a96583ea78d0c98b638
prerequisite-patch-id: 112c8f1795cbed989beb02b72561854c0ccd59dd

Best regards,

Comments

Dmitry Baryshkov Dec. 20, 2024, 5:11 a.m. UTC | #1
On Mon, Dec 16, 2024 at 04:43:11PM -0800, Jessica Zhang wrote:
> DPU supports a single writeback session running concurrently with primary
> display when the CWB mux is configured properly. This series enables
> clone mode for DPU driver and adds support for programming the CWB mux
> in cases where the hardware has dedicated CWB pingpong blocks. Currently,
> the CWB hardware blocks have only been added to the SM8650
> hardware catalog and only DSI has been exposed as a possible_clone of WB.
> 
> This changes are split into two parts:
> 
> The first part of the series will pull in Dmitry's patches to refactor
> the DPU resource manager to be based off of CRTC instead of encoder.
> This includes some changes (noted in the relevant commits) by me and
> Abhinav to fix some issues with getting the global state and refactoring
> the CDM allocation to work with Dmitry's changes.

To provide a sensible baseline for both CWB and Quad-Pipe changes I'm
going to pull patches 5-14 (those which refactor the resource allocation
and also those adding support for the CWB hardware block). The core DRM
patches should probably go in through drm-misc-next.

> 
> The second part of the series will add support for CWB by doing the
> following:
> 
> 1) Add a DRM helper to detect if the current CRTC state is in clone mode
>    and add an "in_clone_mode" entry to the atomic state print
> 2) Add the CWB mux to the hardware catalog and clarify the pingpong
>    block index enum to specifiy which pingpong blocks are dedicated to
>    CWB only and which ones are general use pingpong blocks
> 3) Add CWB as part of the devcoredump
> 4) Add support for configuring the CWB mux via dpu_hw_cwb ops
> 5) Add pending flush support for CWB
> 6) Add support for validating clone mode in the DPU CRTC and setting up
>    CWB within the encoder
> 7) Adjust the encoder trigger flush, trigger start, and kickoff order to
>    accomodate clone mode
> 8) Adjust when the frame done timer is started for clone mode
> 9) Define the possible clones for DPU encoders so that 
> 
> The feature was tested on SM8650 using IGT's kms_writeback test with the
> following change [1] and dumping the writeback framebuffer when in clone
> mode. I haven't gotten the chance to test it on DP yet, but I've
> validated both single and dual LM on DSI.
> 
> To test CWB with IGT, you'll need to apply this series [1] and this
> driver patch [2]. Run the following command to dump the writeback buffer:
> 
> IGT_FRAME_DUMP_PATH=<dump path> FRAME_PNG_FILE_NAME=<file name> \
> ./build/tests/kms_writeback -d [--run-subtest dump-valid-clones] \
> 
> You can also do CRC validation by running this command:
> 
> ./build/tests/kms_writeback [--run-subtest dump-valid-clones]
> 
> [1] https://patchwork.freedesktop.org/series/137933/
> [2] https://patchwork.freedesktop.org/series/138284/
> 
> ---
> Changes in v4:
> - Rebased onto latest msm-next
> - Added kunit tests for framework changes
> - Skip valid clone check for encoders that don't have any possible clones set
>   (this is to avoid failing kunit tests, specifically the HDMI state helper tests)
> - Link to v3: https://lore.kernel.org/r/20241016-concurrent-wb-v3-0-a33cf9b93835@quicinc.com
> 
> Changes in v3:
> - Dropped support for CWB on DP connectors for now
> - Dropped unnecessary PINGPONG array in *_setup_cwb()
> - Add a check to make sure CWB and CDM aren't supported simultaneously
>   (Dmitry)
> - Document cwb_enabled checks in dpu_crtc_get_topology() (Dmitry)
> - Moved implementation of drm_crtc_in_clone_mode() to drm_crtc.c (Jani)
> - Dropped duplicate error message for reserving CWB resources (Dmitry)
> - Added notes in framework changes about posting a separate series to
>   add proper KUnit tests (Maxime)
> - Added commit message note addressing Sima's comment on handling
>   mode_changed (Dmitry)
> - Formatting fixes (Dmitry)
> - Added proper kerneldocs (Dmitry)
> - Renamed dpu_encoder_helper_get_cwb() -> *_get_cwb_mask() (Dmitry)
> - Capitalize all instances of "pingpong" in comments (Dmitry)
> - Link to v2: https://lore.kernel.org/r/20240924-concurrent-wb-v2-0-7849f900e863@quicinc.com
> 
> Changes in v2:
> - Moved CWB hardware programming to its own dpu_hw_cwb abstraction
>   (Dmitry)
> - Reserve and get assigned CWB muxes using RM API and KMS global state
>   (Dmitry)
> - Dropped requirement to have only one CWB session at a time
> - Moved valid clone mode check to DRM framework (Dmitry and Ville)
> - Switch to default CWB tap point to LM as the DSPP
> - Dropped printing clone mode status in atomic state (Dmitry)
> - Call dpu_vbif_clear_errors() before dpu_encoder_kickoff() (Dmitry)
> - Squashed setup_input_ctrl() and setup_input_mode() into a single
>   dpu_hw_cwb op (Dmitry)
> - Moved function comment docs to correct place and fixed wording of
>   comments/commit messages (Dmitry)
> - Grabbed old CRTC state using proper drm_atomic_state API in
>   dpu_crtc_atomic_check() (Dmitry)
> - Split HW catalog changes of adding the CWB mux block and changing the
>   dedicated CWB pingpong indices into 2 separate commits (Dmitry)
> - Moved clearing the dpu_crtc_state.num_mixers to "drm/msm/dpu: fill
>   CRTC resources in dpu_crtc.c" (Dmitry)
> - Fixed alignment and other formatting issues (Dmitry)
> - Link to v1: https://lore.kernel.org/r/20240829-concurrent-wb-v1-0-502b16ae2ebb@quicinc.com
> 
> ---
> Dmitry Baryshkov (4):
>       drm/msm/dpu: get rid of struct dpu_rm_requirements
>       drm/msm/dpu: switch RM to use crtc_id rather than enc_id for allocation
>       drm/msm/dpu: move resource allocation to CRTC
>       drm/msm/dpu: fill CRTC resources in dpu_crtc.c
> 
> Esha Bharadwaj (3):
>       drm/msm/dpu: Add CWB entry to catalog for SM8650
>       drm/msm/dpu: add devcoredumps for cwb registers
>       drm/msm/dpu: add CWB support to dpu_hw_wb
> 
> Jessica Zhang (18):
>       drm: add clone mode check for CRTC
>       drm/tests: Add test for drm_crtc_in_clone_mode()
>       drm: Add valid clones check
>       drm/tests: Add test for drm_atomic_helper_check_modeset()
>       drm/msm/dpu: Specify dedicated CWB pingpong blocks
>       drm/msm/dpu: Add dpu_hw_cwb abstraction for CWB block
>       drm/msm/dpu: Add RM support for allocating CWB
>       drm/msm/dpu: Add CWB to msm_display_topology
>       drm/msm/dpu: Require modeset if clone mode status changes
>       drm/msm/dpu: Fail atomic_check if CWB and CDM are enabled
>       drm/msm/dpu: Reserve resources for CWB
>       drm/msm/dpu: Configure CWB in writeback encoder
>       drm/msm/dpu: Support CWB in dpu_hw_ctl
>       drm/msm/dpu: Adjust writeback phys encoder setup for CWB
>       drm/msm/dpu: Start frame done timer after encoder kickoff
>       drm/msm/dpu: Skip trigger flush and start for CWB
>       drm/msm/dpu: Reorder encoder kickoff for CWB
>       drm/msm/dpu: Set possible clones for all encoders
> 
>  drivers/gpu/drm/drm_atomic_helper.c                |  28 ++
>  drivers/gpu/drm/drm_crtc.c                         |  20 +
>  drivers/gpu/drm/msm/Makefile                       |   1 +
>  .../drm/msm/disp/dpu1/catalog/dpu_10_0_sm8650.h    |  29 +-
>  .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h |   4 +-
>  .../drm/msm/disp/dpu1/catalog/dpu_8_4_sa8775p.h    |   4 +-
>  .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h |   4 +-
>  .../drm/msm/disp/dpu1/catalog/dpu_9_2_x1e80100.h   |   4 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c           | 208 ++++++++-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c        | 463 ++++++++++++---------
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h        |  14 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h   |   7 +-
>  .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c    |  16 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h     |  13 +
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c         |  30 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h         |  15 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.c         |  73 ++++
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.h         |  70 ++++
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h        |  15 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.c          |   4 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c            |  12 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h            |  13 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c             | 361 +++++++++-------
>  drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h             |  13 +-
>  drivers/gpu/drm/tests/drm_atomic_state_test.c      | 133 +++++-
>  include/drm/drm_crtc.h                             |   2 +-
>  26 files changed, 1172 insertions(+), 384 deletions(-)
> ---
> base-commit: 86313a9cd152330c634b25d826a281c6a002eb77
> change-id: 20240618-concurrent-wb-97d62387f952
> prerequisite-change-id: 20241209-abhinavk-modeset-fix-74864f1de08d:v3
> prerequisite-patch-id: a197a0cd4647cb189ea20a96583ea78d0c98b638
> prerequisite-patch-id: 112c8f1795cbed989beb02b72561854c0ccd59dd
> 
> Best regards,
> -- 
> Jessica Zhang <quic_jesszhan@quicinc.com>
>
Dmitry Baryshkov Dec. 20, 2024, 5:52 a.m. UTC | #2
On Mon, Dec 16, 2024 at 04:43:29PM -0800, Jessica Zhang wrote:
> Add support for RM to reserve dedicated CWB PINGPONGs and CWB muxes
> 
> For concurrent writeback, even-indexed CWB muxes must be assigned to
> even-indexed LMs and odd-indexed CWB muxes for odd-indexed LMs. The same
> even/odd rule applies for dedicated CWB PINGPONGs.
> 
> Track the CWB muxes in the global state and add a CWB-specific helper to
> reserve the correct CWB muxes and dedicated PINGPONGs following the
> even/odd rule.
> 
> Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 34 ++++++++++--
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h |  2 +
>  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h     |  1 +
>  drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c      | 83 +++++++++++++++++++++++++++++
>  4 files changed, 116 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> index a895d48fe81ccc71d265e089992786e8b6268b1b..a95dc1f0c6a422485c7ba98743e944e1a4f43539 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> @@ -2,7 +2,7 @@
>  /*
>   * Copyright (C) 2013 Red Hat
>   * Copyright (c) 2014-2018, 2020-2021 The Linux Foundation. All rights reserved.
> - * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved.
>   *
>   * Author: Rob Clark <robdclark@gmail.com>
>   */
> @@ -28,6 +28,7 @@
>  #include "dpu_hw_dsc.h"
>  #include "dpu_hw_merge3d.h"
>  #include "dpu_hw_cdm.h"
> +#include "dpu_hw_cwb.h"
>  #include "dpu_formats.h"
>  #include "dpu_encoder_phys.h"
>  #include "dpu_crtc.h"
> @@ -133,6 +134,9 @@ enum dpu_enc_rc_states {
>   * @cur_slave:		As above but for the slave encoder.
>   * @hw_pp:		Handle to the pingpong blocks used for the display. No.
>   *			pingpong blocks can be different than num_phys_encs.
> + * @hw_cwb:		Handle to the CWB muxes used for concurrent writeback
> + *			display. Number of CWB muxes can be different than
> + *			num_phys_encs.
>   * @hw_dsc:		Handle to the DSC blocks used for the display.
>   * @dsc_mask:		Bitmask of used DSC blocks.
>   * @intfs_swapped:	Whether or not the phys_enc interfaces have been swapped
> @@ -177,6 +181,7 @@ struct dpu_encoder_virt {
>  	struct dpu_encoder_phys *cur_master;
>  	struct dpu_encoder_phys *cur_slave;
>  	struct dpu_hw_pingpong *hw_pp[MAX_CHANNELS_PER_ENC];
> +	struct dpu_hw_cwb *hw_cwb[MAX_CHANNELS_PER_ENC];
>  	struct dpu_hw_dsc *hw_dsc[MAX_CHANNELS_PER_ENC];
>  
>  	unsigned int dsc_mask;
> @@ -1138,7 +1143,10 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
>  	struct dpu_hw_blk *hw_pp[MAX_CHANNELS_PER_ENC];
>  	struct dpu_hw_blk *hw_ctl[MAX_CHANNELS_PER_ENC];
>  	struct dpu_hw_blk *hw_dsc[MAX_CHANNELS_PER_ENC];
> +	struct dpu_hw_blk *hw_cwb[MAX_CHANNELS_PER_ENC];
>  	int num_pp, num_dsc, num_ctl;
> +	int num_cwb = 0;
> +	bool is_cwb_encoder;
>  	unsigned int dsc_mask = 0;
>  	int i;
>  
> @@ -1152,6 +1160,8 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
>  
>  	priv = drm_enc->dev->dev_private;
>  	dpu_kms = to_dpu_kms(priv->kms);
> +	is_cwb_encoder = drm_crtc_in_clone_mode(crtc_state) &&
> +			dpu_enc->disp_info.intf_type == INTF_WB;
>  
>  	global_state = dpu_kms_get_existing_global_state(dpu_kms);
>  	if (IS_ERR_OR_NULL(global_state)) {
> @@ -1162,9 +1172,25 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
>  	trace_dpu_enc_mode_set(DRMID(drm_enc));
>  
>  	/* Query resource that have been reserved in atomic check step. */
> -	num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> -		drm_enc->crtc, DPU_HW_BLK_PINGPONG, hw_pp,
> -		ARRAY_SIZE(hw_pp));
> +	if (is_cwb_encoder) {
> +		num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> +						       drm_enc->crtc,
> +						       DPU_HW_BLK_DCWB_PINGPONG,
> +						       hw_pp, ARRAY_SIZE(hw_pp));
> +		num_cwb = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> +						       drm_enc->crtc,
> +						       DPU_HW_BLK_CWB,
> +						       hw_cwb, ARRAY_SIZE(hw_cwb));
> +	} else {
> +		num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> +						       drm_enc->crtc,
> +						       DPU_HW_BLK_PINGPONG, hw_pp,
> +						       ARRAY_SIZE(hw_pp));
> +	}
> +
> +	for (i = 0; i < num_cwb; i++)
> +		dpu_enc->hw_cwb[i] = to_dpu_hw_cwb(hw_cwb[i]);
> +
>  	num_ctl = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
>  			drm_enc->crtc, DPU_HW_BLK_CTL, hw_ctl, ARRAY_SIZE(hw_ctl));
>  
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
> index ba7bb05efe9b8cac01a908e53121117e130f91ec..8d820cd1b5545d247515763039b341184e814e32 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
> @@ -77,12 +77,14 @@ enum dpu_hw_blk_type {
>  	DPU_HW_BLK_LM,
>  	DPU_HW_BLK_CTL,
>  	DPU_HW_BLK_PINGPONG,
> +	DPU_HW_BLK_DCWB_PINGPONG,
>  	DPU_HW_BLK_INTF,
>  	DPU_HW_BLK_WB,
>  	DPU_HW_BLK_DSPP,
>  	DPU_HW_BLK_MERGE_3D,
>  	DPU_HW_BLK_DSC,
>  	DPU_HW_BLK_CDM,
> +	DPU_HW_BLK_CWB,
>  	DPU_HW_BLK_MAX,
>  };
>  
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> index 48d756d8f8c6e4ab94b72bac0418320f7dc8cda8..1fc8abda927fc094b369e0d1efc795b71d6a7fcb 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> @@ -128,6 +128,7 @@ struct dpu_global_state {
>  	uint32_t dspp_to_crtc_id[DSPP_MAX - DSPP_0];
>  	uint32_t dsc_to_crtc_id[DSC_MAX - DSC_0];
>  	uint32_t cdm_to_crtc_id;
> +	uint32_t cwb_to_crtc_id[CWB_MAX - CWB_0];
>  };
>  
>  struct dpu_global_state
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> index 85adaf256b2c705d2d7df378b6ffc0e578f52bc3..ead24bb0ceb5d8ec4705f0d32330294d0b45b216 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> @@ -234,6 +234,55 @@ static int _dpu_rm_get_lm_peer(struct dpu_rm *rm, int primary_idx)
>  	return -EINVAL;
>  }
>  
> +static int _dpu_rm_reserve_cwb_mux_and_pingpongs(struct dpu_rm *rm,
> +						 struct dpu_global_state *global_state,
> +						 uint32_t crtc_id,
> +						 struct msm_display_topology *topology)
> +{
> +	int num_cwb_pp = topology->num_lm, cwb_pp_count = 0;
> +	int cwb_pp_start_idx = PINGPONG_CWB_0 - PINGPONG_0;
> +	int cwb_pp_idx[MAX_BLOCKS];
> +	int cwb_mux_idx[MAX_BLOCKS];
> +
> +	/*
> +	 * Reserve additional dedicated CWB PINGPONG blocks and muxes for each
> +	 * mixer
> +	 *
> +	 * TODO: add support reserving resources for platforms with no
> +	 *       PINGPONG_CWB

What about doing it other way around: allocate CWBs first as required
(even/odd, proper count, etc). Then for each of CWBs allocate a PP block
(I think it's enough to simply make CWB blocks have a corresponding PP
index as a property). This way the driver can handle both legacy and
current platforms.

> +	 */
> +	for (int i = 0; i < ARRAY_SIZE(rm->mixer_blks) &&
> +	     cwb_pp_count < num_cwb_pp; i++) {
> +		for (int j = cwb_pp_start_idx;
> +		     j < ARRAY_SIZE(rm->pingpong_blks); j++) {
> +			/*
> +			 * Odd LMs must be assigned to odd PINGPONGs and even
> +			 * LMs with even PINGPONGs
> +			 */
> +			if (reserved_by_other(global_state->pingpong_to_crtc_id, j, crtc_id) ||
> +			    i % 2 != j % 2)
> +				continue;
> +
> +			cwb_pp_idx[cwb_pp_count] = j;
> +			cwb_mux_idx[cwb_pp_count] = j - cwb_pp_start_idx;
> +			cwb_pp_count++;
> +			break;
> +		}
> +	}
> +
> +	if (cwb_pp_count != num_cwb_pp) {
> +		DPU_ERROR("Unable to reserve all CWB PINGPONGs\n");
> +		return -ENAVAIL;
> +	}
> +
> +	for (int i = 0; i < cwb_pp_count; i++) {
> +		global_state->pingpong_to_crtc_id[cwb_pp_idx[i]] = crtc_id;
> +		global_state->cwb_to_crtc_id[cwb_mux_idx[i]] = crtc_id;
> +	}
> +
> +	return 0;
> +}
> +
>  /**
>   * _dpu_rm_check_lm_and_get_connected_blks - check if proposed layer mixer meets
>   *	proposed use case requirements, incl. hardwired dependent blocks like
> @@ -614,6 +663,12 @@ static int _dpu_rm_make_reservation(
>  		return ret;
>  	}
>  
> +	if (topology->cwb_enabled) {
> +		ret = _dpu_rm_reserve_cwb_mux_and_pingpongs(rm, global_state,
> +							    crtc_id, topology);
> +		if (ret)
> +			return ret;
> +	}
>  
>  	ret = _dpu_rm_reserve_ctls(rm, global_state, crtc_id,
>  			topology);
> @@ -671,6 +726,8 @@ void dpu_rm_release(struct dpu_global_state *global_state,
>  	_dpu_rm_clear_mapping(global_state->dspp_to_crtc_id,
>  			ARRAY_SIZE(global_state->dspp_to_crtc_id), crtc_id);
>  	_dpu_rm_clear_mapping(&global_state->cdm_to_crtc_id, 1, crtc_id);
> +	_dpu_rm_clear_mapping(global_state->cwb_to_crtc_id,
> +			ARRAY_SIZE(global_state->cwb_to_crtc_id), crtc_id);
>  }
>  
>  /**
> @@ -733,6 +790,7 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
>  
>  	switch (type) {
>  	case DPU_HW_BLK_PINGPONG:
> +	case DPU_HW_BLK_DCWB_PINGPONG:
>  		hw_blks = rm->pingpong_blks;
>  		hw_to_crtc_id = global_state->pingpong_to_crtc_id;
>  		max_blks = ARRAY_SIZE(rm->pingpong_blks);
> @@ -762,6 +820,11 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
>  		hw_to_crtc_id = &global_state->cdm_to_crtc_id;
>  		max_blks = 1;
>  		break;
> +	case DPU_HW_BLK_CWB:
> +		hw_blks = rm->cwb_blks;
> +		hw_to_crtc_id = global_state->cwb_to_crtc_id;
> +		max_blks = ARRAY_SIZE(rm->cwb_blks);
> +		break;
>  	default:
>  		DPU_ERROR("blk type %d not managed by rm\n", type);
>  		return 0;
> @@ -772,6 +835,20 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
>  		if (hw_to_crtc_id[i] != crtc_id)
>  			continue;
>  
> +		if (type == DPU_HW_BLK_PINGPONG) {
> +			struct dpu_hw_pingpong *pp = to_dpu_hw_pingpong(hw_blks[i]);
> +
> +			if (pp->idx >= PINGPONG_CWB_0)
> +				continue;
> +		}
> +
> +		if (type == DPU_HW_BLK_DCWB_PINGPONG) {
> +			struct dpu_hw_pingpong *pp = to_dpu_hw_pingpong(hw_blks[i]);
> +
> +			if (pp->idx < PINGPONG_CWB_0)
> +				continue;
> +		}
> +
>  		if (num_blks == blks_size) {
>  			DPU_ERROR("More than %d resources assigned to crtc %d\n",
>  				  blks_size, crtc_id);
> @@ -847,4 +924,10 @@ void dpu_rm_print_state(struct drm_printer *p,
>  	dpu_rm_print_state_helper(p, rm->cdm_blk,
>  				  global_state->cdm_to_crtc_id);
>  	drm_puts(p, "\n");
> +
> +	drm_puts(p, "\tcwb=");
> +	for (i = 0; i < ARRAY_SIZE(global_state->cwb_to_crtc_id); i++)
> +		dpu_rm_print_state_helper(p, rm->cwb_blks[i],
> +					  global_state->cwb_to_crtc_id[i]);
> +	drm_puts(p, "\n");
>  }
> 
> -- 
> 2.34.1
>
Dmitry Baryshkov Dec. 20, 2024, 5:55 a.m. UTC | #3
On Mon, Dec 16, 2024 at 04:43:30PM -0800, Jessica Zhang wrote:
> Cache the CWB block mask in the DPU virtual encoder and configure CWB
> according to the CWB block mask within the writeback phys encoder
> 
> Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c        | 75 +++++++++++++++++++++-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h   |  7 +-
>  .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c    |  4 +-
>  3 files changed, 83 insertions(+), 3 deletions(-)
> 

Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Jessica Zhang Dec. 20, 2024, 6:46 p.m. UTC | #4
On 12/19/2024 9:11 PM, Dmitry Baryshkov wrote:
> On Mon, Dec 16, 2024 at 04:43:11PM -0800, Jessica Zhang wrote:
>> DPU supports a single writeback session running concurrently with primary
>> display when the CWB mux is configured properly. This series enables
>> clone mode for DPU driver and adds support for programming the CWB mux
>> in cases where the hardware has dedicated CWB pingpong blocks. Currently,
>> the CWB hardware blocks have only been added to the SM8650
>> hardware catalog and only DSI has been exposed as a possible_clone of WB.
>>
>> This changes are split into two parts:
>>
>> The first part of the series will pull in Dmitry's patches to refactor
>> the DPU resource manager to be based off of CRTC instead of encoder.
>> This includes some changes (noted in the relevant commits) by me and
>> Abhinav to fix some issues with getting the global state and refactoring
>> the CDM allocation to work with Dmitry's changes.
> 
> To provide a sensible baseline for both CWB and Quad-Pipe changes I'm
> going to pull patches 5-14 (those which refactor the resource allocation
> and also those adding support for the CWB hardware block). The core DRM
> patches should probably go in through drm-misc-next.

Ack, thanks for all the help with reviews!

- Jessica Zhang

> 
>>
>> The second part of the series will add support for CWB by doing the
>> following:
>>
>> 1) Add a DRM helper to detect if the current CRTC state is in clone mode
>>     and add an "in_clone_mode" entry to the atomic state print
>> 2) Add the CWB mux to the hardware catalog and clarify the pingpong
>>     block index enum to specifiy which pingpong blocks are dedicated to
>>     CWB only and which ones are general use pingpong blocks
>> 3) Add CWB as part of the devcoredump
>> 4) Add support for configuring the CWB mux via dpu_hw_cwb ops
>> 5) Add pending flush support for CWB
>> 6) Add support for validating clone mode in the DPU CRTC and setting up
>>     CWB within the encoder
>> 7) Adjust the encoder trigger flush, trigger start, and kickoff order to
>>     accomodate clone mode
>> 8) Adjust when the frame done timer is started for clone mode
>> 9) Define the possible clones for DPU encoders so that
>>
>> The feature was tested on SM8650 using IGT's kms_writeback test with the
>> following change [1] and dumping the writeback framebuffer when in clone
>> mode. I haven't gotten the chance to test it on DP yet, but I've
>> validated both single and dual LM on DSI.
>>
>> To test CWB with IGT, you'll need to apply this series [1] and this
>> driver patch [2]. Run the following command to dump the writeback buffer:
>>
>> IGT_FRAME_DUMP_PATH=<dump path> FRAME_PNG_FILE_NAME=<file name> \
>> ./build/tests/kms_writeback -d [--run-subtest dump-valid-clones] \
>>
>> You can also do CRC validation by running this command:
>>
>> ./build/tests/kms_writeback [--run-subtest dump-valid-clones]
>>
>> [1] https://patchwork.freedesktop.org/series/137933/
>> [2] https://patchwork.freedesktop.org/series/138284/
>>
>> ---
>> Changes in v4:
>> - Rebased onto latest msm-next
>> - Added kunit tests for framework changes
>> - Skip valid clone check for encoders that don't have any possible clones set
>>    (this is to avoid failing kunit tests, specifically the HDMI state helper tests)
>> - Link to v3: https://lore.kernel.org/r/20241016-concurrent-wb-v3-0-a33cf9b93835@quicinc.com
>>
>> Changes in v3:
>> - Dropped support for CWB on DP connectors for now
>> - Dropped unnecessary PINGPONG array in *_setup_cwb()
>> - Add a check to make sure CWB and CDM aren't supported simultaneously
>>    (Dmitry)
>> - Document cwb_enabled checks in dpu_crtc_get_topology() (Dmitry)
>> - Moved implementation of drm_crtc_in_clone_mode() to drm_crtc.c (Jani)
>> - Dropped duplicate error message for reserving CWB resources (Dmitry)
>> - Added notes in framework changes about posting a separate series to
>>    add proper KUnit tests (Maxime)
>> - Added commit message note addressing Sima's comment on handling
>>    mode_changed (Dmitry)
>> - Formatting fixes (Dmitry)
>> - Added proper kerneldocs (Dmitry)
>> - Renamed dpu_encoder_helper_get_cwb() -> *_get_cwb_mask() (Dmitry)
>> - Capitalize all instances of "pingpong" in comments (Dmitry)
>> - Link to v2: https://lore.kernel.org/r/20240924-concurrent-wb-v2-0-7849f900e863@quicinc.com
>>
>> Changes in v2:
>> - Moved CWB hardware programming to its own dpu_hw_cwb abstraction
>>    (Dmitry)
>> - Reserve and get assigned CWB muxes using RM API and KMS global state
>>    (Dmitry)
>> - Dropped requirement to have only one CWB session at a time
>> - Moved valid clone mode check to DRM framework (Dmitry and Ville)
>> - Switch to default CWB tap point to LM as the DSPP
>> - Dropped printing clone mode status in atomic state (Dmitry)
>> - Call dpu_vbif_clear_errors() before dpu_encoder_kickoff() (Dmitry)
>> - Squashed setup_input_ctrl() and setup_input_mode() into a single
>>    dpu_hw_cwb op (Dmitry)
>> - Moved function comment docs to correct place and fixed wording of
>>    comments/commit messages (Dmitry)
>> - Grabbed old CRTC state using proper drm_atomic_state API in
>>    dpu_crtc_atomic_check() (Dmitry)
>> - Split HW catalog changes of adding the CWB mux block and changing the
>>    dedicated CWB pingpong indices into 2 separate commits (Dmitry)
>> - Moved clearing the dpu_crtc_state.num_mixers to "drm/msm/dpu: fill
>>    CRTC resources in dpu_crtc.c" (Dmitry)
>> - Fixed alignment and other formatting issues (Dmitry)
>> - Link to v1: https://lore.kernel.org/r/20240829-concurrent-wb-v1-0-502b16ae2ebb@quicinc.com
>>
>> ---
>> Dmitry Baryshkov (4):
>>        drm/msm/dpu: get rid of struct dpu_rm_requirements
>>        drm/msm/dpu: switch RM to use crtc_id rather than enc_id for allocation
>>        drm/msm/dpu: move resource allocation to CRTC
>>        drm/msm/dpu: fill CRTC resources in dpu_crtc.c
>>
>> Esha Bharadwaj (3):
>>        drm/msm/dpu: Add CWB entry to catalog for SM8650
>>        drm/msm/dpu: add devcoredumps for cwb registers
>>        drm/msm/dpu: add CWB support to dpu_hw_wb
>>
>> Jessica Zhang (18):
>>        drm: add clone mode check for CRTC
>>        drm/tests: Add test for drm_crtc_in_clone_mode()
>>        drm: Add valid clones check
>>        drm/tests: Add test for drm_atomic_helper_check_modeset()
>>        drm/msm/dpu: Specify dedicated CWB pingpong blocks
>>        drm/msm/dpu: Add dpu_hw_cwb abstraction for CWB block
>>        drm/msm/dpu: Add RM support for allocating CWB
>>        drm/msm/dpu: Add CWB to msm_display_topology
>>        drm/msm/dpu: Require modeset if clone mode status changes
>>        drm/msm/dpu: Fail atomic_check if CWB and CDM are enabled
>>        drm/msm/dpu: Reserve resources for CWB
>>        drm/msm/dpu: Configure CWB in writeback encoder
>>        drm/msm/dpu: Support CWB in dpu_hw_ctl
>>        drm/msm/dpu: Adjust writeback phys encoder setup for CWB
>>        drm/msm/dpu: Start frame done timer after encoder kickoff
>>        drm/msm/dpu: Skip trigger flush and start for CWB
>>        drm/msm/dpu: Reorder encoder kickoff for CWB
>>        drm/msm/dpu: Set possible clones for all encoders
>>
>>   drivers/gpu/drm/drm_atomic_helper.c                |  28 ++
>>   drivers/gpu/drm/drm_crtc.c                         |  20 +
>>   drivers/gpu/drm/msm/Makefile                       |   1 +
>>   .../drm/msm/disp/dpu1/catalog/dpu_10_0_sm8650.h    |  29 +-
>>   .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h |   4 +-
>>   .../drm/msm/disp/dpu1/catalog/dpu_8_4_sa8775p.h    |   4 +-
>>   .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h |   4 +-
>>   .../drm/msm/disp/dpu1/catalog/dpu_9_2_x1e80100.h   |   4 +-
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c           | 208 ++++++++-
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c        | 463 ++++++++++++---------
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h        |  14 +-
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h   |   7 +-
>>   .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c    |  16 +-
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h     |  13 +
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c         |  30 +-
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h         |  15 +-
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.c         |  73 ++++
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.h         |  70 ++++
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h        |  15 +-
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.c          |   4 +-
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c            |  12 +-
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h            |  13 +-
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c             | 361 +++++++++-------
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h             |  13 +-
>>   drivers/gpu/drm/tests/drm_atomic_state_test.c      | 133 +++++-
>>   include/drm/drm_crtc.h                             |   2 +-
>>   26 files changed, 1172 insertions(+), 384 deletions(-)
>> ---
>> base-commit: 86313a9cd152330c634b25d826a281c6a002eb77
>> change-id: 20240618-concurrent-wb-97d62387f952
>> prerequisite-change-id: 20241209-abhinavk-modeset-fix-74864f1de08d:v3
>> prerequisite-patch-id: a197a0cd4647cb189ea20a96583ea78d0c98b638
>> prerequisite-patch-id: 112c8f1795cbed989beb02b72561854c0ccd59dd
>>
>> Best regards,
>> -- 
>> Jessica Zhang <quic_jesszhan@quicinc.com>
>>
> 
> -- 
> With best wishes
> Dmitry
Jessica Zhang Dec. 21, 2024, 12:12 a.m. UTC | #5
On 12/19/2024 9:52 PM, Dmitry Baryshkov wrote:
> On Mon, Dec 16, 2024 at 04:43:29PM -0800, Jessica Zhang wrote:
>> Add support for RM to reserve dedicated CWB PINGPONGs and CWB muxes
>>
>> For concurrent writeback, even-indexed CWB muxes must be assigned to
>> even-indexed LMs and odd-indexed CWB muxes for odd-indexed LMs. The same
>> even/odd rule applies for dedicated CWB PINGPONGs.
>>
>> Track the CWB muxes in the global state and add a CWB-specific helper to
>> reserve the correct CWB muxes and dedicated PINGPONGs following the
>> even/odd rule.
>>
>> Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
>> ---
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 34 ++++++++++--
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h |  2 +
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h     |  1 +
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c      | 83 +++++++++++++++++++++++++++++
>>   4 files changed, 116 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>> index a895d48fe81ccc71d265e089992786e8b6268b1b..a95dc1f0c6a422485c7ba98743e944e1a4f43539 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>> @@ -2,7 +2,7 @@
>>   /*
>>    * Copyright (C) 2013 Red Hat
>>    * Copyright (c) 2014-2018, 2020-2021 The Linux Foundation. All rights reserved.
>> - * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
>> + * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved.
>>    *
>>    * Author: Rob Clark <robdclark@gmail.com>
>>    */
>> @@ -28,6 +28,7 @@
>>   #include "dpu_hw_dsc.h"
>>   #include "dpu_hw_merge3d.h"
>>   #include "dpu_hw_cdm.h"
>> +#include "dpu_hw_cwb.h"
>>   #include "dpu_formats.h"
>>   #include "dpu_encoder_phys.h"
>>   #include "dpu_crtc.h"
>> @@ -133,6 +134,9 @@ enum dpu_enc_rc_states {
>>    * @cur_slave:		As above but for the slave encoder.
>>    * @hw_pp:		Handle to the pingpong blocks used for the display. No.
>>    *			pingpong blocks can be different than num_phys_encs.
>> + * @hw_cwb:		Handle to the CWB muxes used for concurrent writeback
>> + *			display. Number of CWB muxes can be different than
>> + *			num_phys_encs.
>>    * @hw_dsc:		Handle to the DSC blocks used for the display.
>>    * @dsc_mask:		Bitmask of used DSC blocks.
>>    * @intfs_swapped:	Whether or not the phys_enc interfaces have been swapped
>> @@ -177,6 +181,7 @@ struct dpu_encoder_virt {
>>   	struct dpu_encoder_phys *cur_master;
>>   	struct dpu_encoder_phys *cur_slave;
>>   	struct dpu_hw_pingpong *hw_pp[MAX_CHANNELS_PER_ENC];
>> +	struct dpu_hw_cwb *hw_cwb[MAX_CHANNELS_PER_ENC];
>>   	struct dpu_hw_dsc *hw_dsc[MAX_CHANNELS_PER_ENC];
>>   
>>   	unsigned int dsc_mask;
>> @@ -1138,7 +1143,10 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
>>   	struct dpu_hw_blk *hw_pp[MAX_CHANNELS_PER_ENC];
>>   	struct dpu_hw_blk *hw_ctl[MAX_CHANNELS_PER_ENC];
>>   	struct dpu_hw_blk *hw_dsc[MAX_CHANNELS_PER_ENC];
>> +	struct dpu_hw_blk *hw_cwb[MAX_CHANNELS_PER_ENC];
>>   	int num_pp, num_dsc, num_ctl;
>> +	int num_cwb = 0;
>> +	bool is_cwb_encoder;
>>   	unsigned int dsc_mask = 0;
>>   	int i;
>>   
>> @@ -1152,6 +1160,8 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
>>   
>>   	priv = drm_enc->dev->dev_private;
>>   	dpu_kms = to_dpu_kms(priv->kms);
>> +	is_cwb_encoder = drm_crtc_in_clone_mode(crtc_state) &&
>> +			dpu_enc->disp_info.intf_type == INTF_WB;
>>   
>>   	global_state = dpu_kms_get_existing_global_state(dpu_kms);
>>   	if (IS_ERR_OR_NULL(global_state)) {
>> @@ -1162,9 +1172,25 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
>>   	trace_dpu_enc_mode_set(DRMID(drm_enc));
>>   
>>   	/* Query resource that have been reserved in atomic check step. */
>> -	num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
>> -		drm_enc->crtc, DPU_HW_BLK_PINGPONG, hw_pp,
>> -		ARRAY_SIZE(hw_pp));
>> +	if (is_cwb_encoder) {
>> +		num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
>> +						       drm_enc->crtc,
>> +						       DPU_HW_BLK_DCWB_PINGPONG,
>> +						       hw_pp, ARRAY_SIZE(hw_pp));
>> +		num_cwb = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
>> +						       drm_enc->crtc,
>> +						       DPU_HW_BLK_CWB,
>> +						       hw_cwb, ARRAY_SIZE(hw_cwb));
>> +	} else {
>> +		num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
>> +						       drm_enc->crtc,
>> +						       DPU_HW_BLK_PINGPONG, hw_pp,
>> +						       ARRAY_SIZE(hw_pp));
>> +	}
>> +
>> +	for (i = 0; i < num_cwb; i++)
>> +		dpu_enc->hw_cwb[i] = to_dpu_hw_cwb(hw_cwb[i]);
>> +
>>   	num_ctl = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
>>   			drm_enc->crtc, DPU_HW_BLK_CTL, hw_ctl, ARRAY_SIZE(hw_ctl));
>>   
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
>> index ba7bb05efe9b8cac01a908e53121117e130f91ec..8d820cd1b5545d247515763039b341184e814e32 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
>> @@ -77,12 +77,14 @@ enum dpu_hw_blk_type {
>>   	DPU_HW_BLK_LM,
>>   	DPU_HW_BLK_CTL,
>>   	DPU_HW_BLK_PINGPONG,
>> +	DPU_HW_BLK_DCWB_PINGPONG,
>>   	DPU_HW_BLK_INTF,
>>   	DPU_HW_BLK_WB,
>>   	DPU_HW_BLK_DSPP,
>>   	DPU_HW_BLK_MERGE_3D,
>>   	DPU_HW_BLK_DSC,
>>   	DPU_HW_BLK_CDM,
>> +	DPU_HW_BLK_CWB,
>>   	DPU_HW_BLK_MAX,
>>   };
>>   
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
>> index 48d756d8f8c6e4ab94b72bac0418320f7dc8cda8..1fc8abda927fc094b369e0d1efc795b71d6a7fcb 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
>> @@ -128,6 +128,7 @@ struct dpu_global_state {
>>   	uint32_t dspp_to_crtc_id[DSPP_MAX - DSPP_0];
>>   	uint32_t dsc_to_crtc_id[DSC_MAX - DSC_0];
>>   	uint32_t cdm_to_crtc_id;
>> +	uint32_t cwb_to_crtc_id[CWB_MAX - CWB_0];
>>   };
>>   
>>   struct dpu_global_state
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>> index 85adaf256b2c705d2d7df378b6ffc0e578f52bc3..ead24bb0ceb5d8ec4705f0d32330294d0b45b216 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>> @@ -234,6 +234,55 @@ static int _dpu_rm_get_lm_peer(struct dpu_rm *rm, int primary_idx)
>>   	return -EINVAL;
>>   }
>>   
>> +static int _dpu_rm_reserve_cwb_mux_and_pingpongs(struct dpu_rm *rm,
>> +						 struct dpu_global_state *global_state,
>> +						 uint32_t crtc_id,
>> +						 struct msm_display_topology *topology)
>> +{
>> +	int num_cwb_pp = topology->num_lm, cwb_pp_count = 0;
>> +	int cwb_pp_start_idx = PINGPONG_CWB_0 - PINGPONG_0;
>> +	int cwb_pp_idx[MAX_BLOCKS];
>> +	int cwb_mux_idx[MAX_BLOCKS];
>> +
>> +	/*
>> +	 * Reserve additional dedicated CWB PINGPONG blocks and muxes for each
>> +	 * mixer
>> +	 *
>> +	 * TODO: add support reserving resources for platforms with no
>> +	 *       PINGPONG_CWB
> 
> What about doing it other way around: allocate CWBs first as required
> (even/odd, proper count, etc). Then for each of CWBs allocate a PP block
> (I think it's enough to simply make CWB blocks have a corresponding PP
> index as a property). This way the driver can handle both legacy and
> current platforms.

Hi Dmitry,

Sorry if I'm misunderstanding your suggestion, but the main change 
needed to support platforms with no dedicated PINGPONG_CWB is where in 
the rm->pingpong_blks list to start assigning pingpong blocks for the 
CWB mux. I'm not sure how changing the order in which CWBs and the 
pingpong blocks are assigned will address that.

(FWIW, the only change necessary to add support for non-dedicated 
PINGPONG_CWBs platforms for this function should just be changing the 
initialization value of cwb_pp_start_idx)

Thanks,

Jessica Zhang

> 
>> +	 */
>> +	for (int i = 0; i < ARRAY_SIZE(rm->mixer_blks) &&
>> +	     cwb_pp_count < num_cwb_pp; i++) {
>> +		for (int j = cwb_pp_start_idx;
>> +		     j < ARRAY_SIZE(rm->pingpong_blks); j++) {
>> +			/*
>> +			 * Odd LMs must be assigned to odd PINGPONGs and even
>> +			 * LMs with even PINGPONGs
>> +			 */
>> +			if (reserved_by_other(global_state->pingpong_to_crtc_id, j, crtc_id) ||
>> +			    i % 2 != j % 2)
>> +				continue;
>> +
>> +			cwb_pp_idx[cwb_pp_count] = j;
>> +			cwb_mux_idx[cwb_pp_count] = j - cwb_pp_start_idx;
>> +			cwb_pp_count++;
>> +			break;
>> +		}
>> +	}
>> +
>> +	if (cwb_pp_count != num_cwb_pp) {
>> +		DPU_ERROR("Unable to reserve all CWB PINGPONGs\n");
>> +		return -ENAVAIL;
>> +	}
>> +
>> +	for (int i = 0; i < cwb_pp_count; i++) {
>> +		global_state->pingpong_to_crtc_id[cwb_pp_idx[i]] = crtc_id;
>> +		global_state->cwb_to_crtc_id[cwb_mux_idx[i]] = crtc_id;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>   /**
>>    * _dpu_rm_check_lm_and_get_connected_blks - check if proposed layer mixer meets
>>    *	proposed use case requirements, incl. hardwired dependent blocks like
>> @@ -614,6 +663,12 @@ static int _dpu_rm_make_reservation(
>>   		return ret;
>>   	}
>>   
>> +	if (topology->cwb_enabled) {
>> +		ret = _dpu_rm_reserve_cwb_mux_and_pingpongs(rm, global_state,
>> +							    crtc_id, topology);
>> +		if (ret)
>> +			return ret;
>> +	}
>>   
>>   	ret = _dpu_rm_reserve_ctls(rm, global_state, crtc_id,
>>   			topology);
>> @@ -671,6 +726,8 @@ void dpu_rm_release(struct dpu_global_state *global_state,
>>   	_dpu_rm_clear_mapping(global_state->dspp_to_crtc_id,
>>   			ARRAY_SIZE(global_state->dspp_to_crtc_id), crtc_id);
>>   	_dpu_rm_clear_mapping(&global_state->cdm_to_crtc_id, 1, crtc_id);
>> +	_dpu_rm_clear_mapping(global_state->cwb_to_crtc_id,
>> +			ARRAY_SIZE(global_state->cwb_to_crtc_id), crtc_id);
>>   }
>>   
>>   /**
>> @@ -733,6 +790,7 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
>>   
>>   	switch (type) {
>>   	case DPU_HW_BLK_PINGPONG:
>> +	case DPU_HW_BLK_DCWB_PINGPONG:
>>   		hw_blks = rm->pingpong_blks;
>>   		hw_to_crtc_id = global_state->pingpong_to_crtc_id;
>>   		max_blks = ARRAY_SIZE(rm->pingpong_blks);
>> @@ -762,6 +820,11 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
>>   		hw_to_crtc_id = &global_state->cdm_to_crtc_id;
>>   		max_blks = 1;
>>   		break;
>> +	case DPU_HW_BLK_CWB:
>> +		hw_blks = rm->cwb_blks;
>> +		hw_to_crtc_id = global_state->cwb_to_crtc_id;
>> +		max_blks = ARRAY_SIZE(rm->cwb_blks);
>> +		break;
>>   	default:
>>   		DPU_ERROR("blk type %d not managed by rm\n", type);
>>   		return 0;
>> @@ -772,6 +835,20 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
>>   		if (hw_to_crtc_id[i] != crtc_id)
>>   			continue;
>>   
>> +		if (type == DPU_HW_BLK_PINGPONG) {
>> +			struct dpu_hw_pingpong *pp = to_dpu_hw_pingpong(hw_blks[i]);
>> +
>> +			if (pp->idx >= PINGPONG_CWB_0)
>> +				continue;
>> +		}
>> +
>> +		if (type == DPU_HW_BLK_DCWB_PINGPONG) {
>> +			struct dpu_hw_pingpong *pp = to_dpu_hw_pingpong(hw_blks[i]);
>> +
>> +			if (pp->idx < PINGPONG_CWB_0)
>> +				continue;
>> +		}
>> +
>>   		if (num_blks == blks_size) {
>>   			DPU_ERROR("More than %d resources assigned to crtc %d\n",
>>   				  blks_size, crtc_id);
>> @@ -847,4 +924,10 @@ void dpu_rm_print_state(struct drm_printer *p,
>>   	dpu_rm_print_state_helper(p, rm->cdm_blk,
>>   				  global_state->cdm_to_crtc_id);
>>   	drm_puts(p, "\n");
>> +
>> +	drm_puts(p, "\tcwb=");
>> +	for (i = 0; i < ARRAY_SIZE(global_state->cwb_to_crtc_id); i++)
>> +		dpu_rm_print_state_helper(p, rm->cwb_blks[i],
>> +					  global_state->cwb_to_crtc_id[i]);
>> +	drm_puts(p, "\n");
>>   }
>>
>> -- 
>> 2.34.1
>>
> 
> -- 
> With best wishes
> Dmitry
Dmitry Baryshkov Dec. 21, 2024, 1:07 a.m. UTC | #6
On Fri, Dec 20, 2024 at 04:12:29PM -0800, Jessica Zhang wrote:
> 
> 
> On 12/19/2024 9:52 PM, Dmitry Baryshkov wrote:
> > On Mon, Dec 16, 2024 at 04:43:29PM -0800, Jessica Zhang wrote:
> > > Add support for RM to reserve dedicated CWB PINGPONGs and CWB muxes
> > > 
> > > For concurrent writeback, even-indexed CWB muxes must be assigned to
> > > even-indexed LMs and odd-indexed CWB muxes for odd-indexed LMs. The same
> > > even/odd rule applies for dedicated CWB PINGPONGs.
> > > 
> > > Track the CWB muxes in the global state and add a CWB-specific helper to
> > > reserve the correct CWB muxes and dedicated PINGPONGs following the
> > > even/odd rule.
> > > 
> > > Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
> > > ---
> > >   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 34 ++++++++++--
> > >   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h |  2 +
> > >   drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h     |  1 +
> > >   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c      | 83 +++++++++++++++++++++++++++++
> > >   4 files changed, 116 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> > > index a895d48fe81ccc71d265e089992786e8b6268b1b..a95dc1f0c6a422485c7ba98743e944e1a4f43539 100644
> > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> > > @@ -2,7 +2,7 @@
> > >   /*
> > >    * Copyright (C) 2013 Red Hat
> > >    * Copyright (c) 2014-2018, 2020-2021 The Linux Foundation. All rights reserved.
> > > - * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> > > + * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved.
> > >    *
> > >    * Author: Rob Clark <robdclark@gmail.com>
> > >    */
> > > @@ -28,6 +28,7 @@
> > >   #include "dpu_hw_dsc.h"
> > >   #include "dpu_hw_merge3d.h"
> > >   #include "dpu_hw_cdm.h"
> > > +#include "dpu_hw_cwb.h"
> > >   #include "dpu_formats.h"
> > >   #include "dpu_encoder_phys.h"
> > >   #include "dpu_crtc.h"
> > > @@ -133,6 +134,9 @@ enum dpu_enc_rc_states {
> > >    * @cur_slave:		As above but for the slave encoder.
> > >    * @hw_pp:		Handle to the pingpong blocks used for the display. No.
> > >    *			pingpong blocks can be different than num_phys_encs.
> > > + * @hw_cwb:		Handle to the CWB muxes used for concurrent writeback
> > > + *			display. Number of CWB muxes can be different than
> > > + *			num_phys_encs.
> > >    * @hw_dsc:		Handle to the DSC blocks used for the display.
> > >    * @dsc_mask:		Bitmask of used DSC blocks.
> > >    * @intfs_swapped:	Whether or not the phys_enc interfaces have been swapped
> > > @@ -177,6 +181,7 @@ struct dpu_encoder_virt {
> > >   	struct dpu_encoder_phys *cur_master;
> > >   	struct dpu_encoder_phys *cur_slave;
> > >   	struct dpu_hw_pingpong *hw_pp[MAX_CHANNELS_PER_ENC];
> > > +	struct dpu_hw_cwb *hw_cwb[MAX_CHANNELS_PER_ENC];
> > >   	struct dpu_hw_dsc *hw_dsc[MAX_CHANNELS_PER_ENC];
> > >   	unsigned int dsc_mask;
> > > @@ -1138,7 +1143,10 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
> > >   	struct dpu_hw_blk *hw_pp[MAX_CHANNELS_PER_ENC];
> > >   	struct dpu_hw_blk *hw_ctl[MAX_CHANNELS_PER_ENC];
> > >   	struct dpu_hw_blk *hw_dsc[MAX_CHANNELS_PER_ENC];
> > > +	struct dpu_hw_blk *hw_cwb[MAX_CHANNELS_PER_ENC];
> > >   	int num_pp, num_dsc, num_ctl;
> > > +	int num_cwb = 0;
> > > +	bool is_cwb_encoder;
> > >   	unsigned int dsc_mask = 0;
> > >   	int i;
> > > @@ -1152,6 +1160,8 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
> > >   	priv = drm_enc->dev->dev_private;
> > >   	dpu_kms = to_dpu_kms(priv->kms);
> > > +	is_cwb_encoder = drm_crtc_in_clone_mode(crtc_state) &&
> > > +			dpu_enc->disp_info.intf_type == INTF_WB;
> > >   	global_state = dpu_kms_get_existing_global_state(dpu_kms);
> > >   	if (IS_ERR_OR_NULL(global_state)) {
> > > @@ -1162,9 +1172,25 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
> > >   	trace_dpu_enc_mode_set(DRMID(drm_enc));
> > >   	/* Query resource that have been reserved in atomic check step. */
> > > -	num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> > > -		drm_enc->crtc, DPU_HW_BLK_PINGPONG, hw_pp,
> > > -		ARRAY_SIZE(hw_pp));
> > > +	if (is_cwb_encoder) {
> > > +		num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> > > +						       drm_enc->crtc,
> > > +						       DPU_HW_BLK_DCWB_PINGPONG,
> > > +						       hw_pp, ARRAY_SIZE(hw_pp));
> > > +		num_cwb = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> > > +						       drm_enc->crtc,
> > > +						       DPU_HW_BLK_CWB,
> > > +						       hw_cwb, ARRAY_SIZE(hw_cwb));
> > > +	} else {
> > > +		num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> > > +						       drm_enc->crtc,
> > > +						       DPU_HW_BLK_PINGPONG, hw_pp,
> > > +						       ARRAY_SIZE(hw_pp));
> > > +	}
> > > +
> > > +	for (i = 0; i < num_cwb; i++)
> > > +		dpu_enc->hw_cwb[i] = to_dpu_hw_cwb(hw_cwb[i]);
> > > +
> > >   	num_ctl = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> > >   			drm_enc->crtc, DPU_HW_BLK_CTL, hw_ctl, ARRAY_SIZE(hw_ctl));
> > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
> > > index ba7bb05efe9b8cac01a908e53121117e130f91ec..8d820cd1b5545d247515763039b341184e814e32 100644
> > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
> > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
> > > @@ -77,12 +77,14 @@ enum dpu_hw_blk_type {
> > >   	DPU_HW_BLK_LM,
> > >   	DPU_HW_BLK_CTL,
> > >   	DPU_HW_BLK_PINGPONG,
> > > +	DPU_HW_BLK_DCWB_PINGPONG,
> > >   	DPU_HW_BLK_INTF,
> > >   	DPU_HW_BLK_WB,
> > >   	DPU_HW_BLK_DSPP,
> > >   	DPU_HW_BLK_MERGE_3D,
> > >   	DPU_HW_BLK_DSC,
> > >   	DPU_HW_BLK_CDM,
> > > +	DPU_HW_BLK_CWB,
> > >   	DPU_HW_BLK_MAX,
> > >   };
> > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > > index 48d756d8f8c6e4ab94b72bac0418320f7dc8cda8..1fc8abda927fc094b369e0d1efc795b71d6a7fcb 100644
> > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > > @@ -128,6 +128,7 @@ struct dpu_global_state {
> > >   	uint32_t dspp_to_crtc_id[DSPP_MAX - DSPP_0];
> > >   	uint32_t dsc_to_crtc_id[DSC_MAX - DSC_0];
> > >   	uint32_t cdm_to_crtc_id;
> > > +	uint32_t cwb_to_crtc_id[CWB_MAX - CWB_0];
> > >   };
> > >   struct dpu_global_state
> > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> > > index 85adaf256b2c705d2d7df378b6ffc0e578f52bc3..ead24bb0ceb5d8ec4705f0d32330294d0b45b216 100644
> > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> > > @@ -234,6 +234,55 @@ static int _dpu_rm_get_lm_peer(struct dpu_rm *rm, int primary_idx)
> > >   	return -EINVAL;
> > >   }
> > > +static int _dpu_rm_reserve_cwb_mux_and_pingpongs(struct dpu_rm *rm,
> > > +						 struct dpu_global_state *global_state,
> > > +						 uint32_t crtc_id,
> > > +						 struct msm_display_topology *topology)
> > > +{
> > > +	int num_cwb_pp = topology->num_lm, cwb_pp_count = 0;
> > > +	int cwb_pp_start_idx = PINGPONG_CWB_0 - PINGPONG_0;
> > > +	int cwb_pp_idx[MAX_BLOCKS];
> > > +	int cwb_mux_idx[MAX_BLOCKS];
> > > +
> > > +	/*
> > > +	 * Reserve additional dedicated CWB PINGPONG blocks and muxes for each
> > > +	 * mixer
> > > +	 *
> > > +	 * TODO: add support reserving resources for platforms with no
> > > +	 *       PINGPONG_CWB
> > 
> > What about doing it other way around: allocate CWBs first as required
> > (even/odd, proper count, etc). Then for each of CWBs allocate a PP block
> > (I think it's enough to simply make CWB blocks have a corresponding PP
> > index as a property). This way the driver can handle both legacy and
> > current platforms.
> 
> Hi Dmitry,
> 
> Sorry if I'm misunderstanding your suggestion, but the main change needed to
> support platforms with no dedicated PINGPONG_CWB is where in the
> rm->pingpong_blks list to start assigning pingpong blocks for the CWB mux.
> I'm not sure how changing the order in which CWBs and the pingpong blocks
> are assigned will address that.
> 
> (FWIW, the only change necessary to add support for non-dedicated
> PINGPONG_CWBs platforms for this function should just be changing the
> initialization value of cwb_pp_start_idx)

If I remember correctly, we have identified several generations of DPU
wrt. CWB handling:
- 8.1+ (or 8.0+?), DCWB, dedicated PP blocks
- 7.2, dedicated PP_1?
- 5.0+, shared PP blocks
- older DPUs, special handling of PP

If the driver allocates PP first and then first it has to allocated PP
(in a platform-specific way) and then go from PINGPONG to CWB (in a
platform-specific way). If CWB is allocated first, then you have only
one platform-specific piece of code that gets PINGPONG for the CWB (and
as this function is called after the CWB allocation, the major part of
the CWB / PP allocation is generic).

> 
> Thanks,
> 
> Jessica Zhang
> 
> > 
> > > +	 */
> > > +	for (int i = 0; i < ARRAY_SIZE(rm->mixer_blks) &&
> > > +	     cwb_pp_count < num_cwb_pp; i++) {
> > > +		for (int j = cwb_pp_start_idx;
> > > +		     j < ARRAY_SIZE(rm->pingpong_blks); j++) {
> > > +			/*
> > > +			 * Odd LMs must be assigned to odd PINGPONGs and even
> > > +			 * LMs with even PINGPONGs
> > > +			 */
> > > +			if (reserved_by_other(global_state->pingpong_to_crtc_id, j, crtc_id) ||
> > > +			    i % 2 != j % 2)
> > > +				continue;
> > > +
> > > +			cwb_pp_idx[cwb_pp_count] = j;
> > > +			cwb_mux_idx[cwb_pp_count] = j - cwb_pp_start_idx;
> > > +			cwb_pp_count++;
> > > +			break;
> > > +		}
> > > +	}
> > > +
> > > +	if (cwb_pp_count != num_cwb_pp) {
> > > +		DPU_ERROR("Unable to reserve all CWB PINGPONGs\n");
> > > +		return -ENAVAIL;
> > > +	}
> > > +
> > > +	for (int i = 0; i < cwb_pp_count; i++) {
> > > +		global_state->pingpong_to_crtc_id[cwb_pp_idx[i]] = crtc_id;
> > > +		global_state->cwb_to_crtc_id[cwb_mux_idx[i]] = crtc_id;
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >   /**
> > >    * _dpu_rm_check_lm_and_get_connected_blks - check if proposed layer mixer meets
> > >    *	proposed use case requirements, incl. hardwired dependent blocks like
> > > @@ -614,6 +663,12 @@ static int _dpu_rm_make_reservation(
> > >   		return ret;
> > >   	}
> > > +	if (topology->cwb_enabled) {
> > > +		ret = _dpu_rm_reserve_cwb_mux_and_pingpongs(rm, global_state,
> > > +							    crtc_id, topology);
> > > +		if (ret)
> > > +			return ret;
> > > +	}
> > >   	ret = _dpu_rm_reserve_ctls(rm, global_state, crtc_id,
> > >   			topology);
> > > @@ -671,6 +726,8 @@ void dpu_rm_release(struct dpu_global_state *global_state,
> > >   	_dpu_rm_clear_mapping(global_state->dspp_to_crtc_id,
> > >   			ARRAY_SIZE(global_state->dspp_to_crtc_id), crtc_id);
> > >   	_dpu_rm_clear_mapping(&global_state->cdm_to_crtc_id, 1, crtc_id);
> > > +	_dpu_rm_clear_mapping(global_state->cwb_to_crtc_id,
> > > +			ARRAY_SIZE(global_state->cwb_to_crtc_id), crtc_id);
> > >   }
> > >   /**
> > > @@ -733,6 +790,7 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
> > >   	switch (type) {
> > >   	case DPU_HW_BLK_PINGPONG:
> > > +	case DPU_HW_BLK_DCWB_PINGPONG:
> > >   		hw_blks = rm->pingpong_blks;
> > >   		hw_to_crtc_id = global_state->pingpong_to_crtc_id;
> > >   		max_blks = ARRAY_SIZE(rm->pingpong_blks);
> > > @@ -762,6 +820,11 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
> > >   		hw_to_crtc_id = &global_state->cdm_to_crtc_id;
> > >   		max_blks = 1;
> > >   		break;
> > > +	case DPU_HW_BLK_CWB:
> > > +		hw_blks = rm->cwb_blks;
> > > +		hw_to_crtc_id = global_state->cwb_to_crtc_id;
> > > +		max_blks = ARRAY_SIZE(rm->cwb_blks);
> > > +		break;
> > >   	default:
> > >   		DPU_ERROR("blk type %d not managed by rm\n", type);
> > >   		return 0;
> > > @@ -772,6 +835,20 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
> > >   		if (hw_to_crtc_id[i] != crtc_id)
> > >   			continue;
> > > +		if (type == DPU_HW_BLK_PINGPONG) {
> > > +			struct dpu_hw_pingpong *pp = to_dpu_hw_pingpong(hw_blks[i]);
> > > +
> > > +			if (pp->idx >= PINGPONG_CWB_0)
> > > +				continue;
> > > +		}
> > > +
> > > +		if (type == DPU_HW_BLK_DCWB_PINGPONG) {
> > > +			struct dpu_hw_pingpong *pp = to_dpu_hw_pingpong(hw_blks[i]);
> > > +
> > > +			if (pp->idx < PINGPONG_CWB_0)
> > > +				continue;
> > > +		}
> > > +
> > >   		if (num_blks == blks_size) {
> > >   			DPU_ERROR("More than %d resources assigned to crtc %d\n",
> > >   				  blks_size, crtc_id);
> > > @@ -847,4 +924,10 @@ void dpu_rm_print_state(struct drm_printer *p,
> > >   	dpu_rm_print_state_helper(p, rm->cdm_blk,
> > >   				  global_state->cdm_to_crtc_id);
> > >   	drm_puts(p, "\n");
> > > +
> > > +	drm_puts(p, "\tcwb=");
> > > +	for (i = 0; i < ARRAY_SIZE(global_state->cwb_to_crtc_id); i++)
> > > +		dpu_rm_print_state_helper(p, rm->cwb_blks[i],
> > > +					  global_state->cwb_to_crtc_id[i]);
> > > +	drm_puts(p, "\n");
> > >   }
> > > 
> > > -- 
> > > 2.34.1
> > > 
> > 
> > -- 
> > With best wishes
> > Dmitry
>
Dmitry Baryshkov Dec. 24, 2024, 5:02 a.m. UTC | #7
On Mon, Dec 16, 2024 at 04:43:18PM -0800, Jessica Zhang wrote:
> From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> 
> All resource allocation is centered around the LMs. Then other blocks
> (except DSCs) are allocated basing on the LMs that was selected, and LM
> powers up the CRTC rather than the encoder.
> 
> Moreover if at some point the driver supports encoder cloning,
> allocating resources from the encoder will be incorrect, as all clones
> will have different encoder IDs, while LMs are to be shared by these
> encoders.
> 
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> [quic_abhinavk@quicinc.com: Refactored resource allocation for CDM]
> Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
> [quic_jesszhan@quicinc.com: Changed to grabbing exising global state]
> Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c    |  86 ++++++++++
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 256 ++++++++++------------------
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h |   8 +
>  3 files changed, 181 insertions(+), 169 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
> index 9f6ffd344693ecfb633095772a31ada5613345dc..186ed84f59f16997716fe216e635b8dce07a63a1 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
> @@ -1182,6 +1182,78 @@ static bool dpu_crtc_needs_dirtyfb(struct drm_crtc_state *cstate)
>  	return false;
>  }
>  
> +#define MAX_HDISPLAY_SPLIT 1080
> +
> +static struct msm_display_topology dpu_crtc_get_topology(
> +		struct drm_crtc *crtc,
> +		struct dpu_kms *dpu_kms,
> +		struct drm_crtc_state *crtc_state)
> +{
> +	struct drm_display_mode *mode = &crtc_state->adjusted_mode;
> +	struct msm_display_topology topology = {0};
> +	struct drm_encoder *drm_enc;
> +
> +	drm_for_each_encoder_mask(drm_enc, crtc->dev, crtc_state->encoder_mask)
> +		dpu_encoder_update_topology(drm_enc, &topology, crtc_state->state,
> +					    &crtc_state->adjusted_mode);
> +
> +	/*
> +	 * Datapath topology selection
> +	 *
> +	 * Dual display
> +	 * 2 LM, 2 INTF ( Split display using 2 interfaces)
> +	 *
> +	 * Single display
> +	 * 1 LM, 1 INTF
> +	 * 2 LM, 1 INTF (stream merge to support high resolution interfaces)
> +	 *
> +	 * Add dspps to the reservation requirements if ctm is requested
> +	 */
> +
> +	if (topology.num_intf == 2)
> +		topology.num_lm = 2;
> +	else if (topology.num_dsc == 2)
> +		topology.num_lm = 2;
> +	else if (dpu_kms->catalog->caps->has_3d_merge)
> +		topology.num_lm = (mode->hdisplay > MAX_HDISPLAY_SPLIT) ? 2 : 1;
> +	else
> +		topology.num_lm = 1;
> +
> +	if (crtc_state->ctm)
> +		topology.num_dspp = topology.num_lm;
> +
> +	return topology;
> +}
> +
> +static int dpu_crtc_assign_resources(struct drm_crtc *crtc, struct drm_crtc_state *crtc_state)
> +{
> +	struct dpu_kms *dpu_kms = _dpu_crtc_get_kms(crtc);
> +	struct dpu_global_state *global_state;
> +	struct msm_display_topology topology;
> +	int ret;
> +
> +	/*
> +	 * Release and Allocate resources on every modeset
> +	 * Dont allocate when enable is false.
> +	 */
> +	global_state = dpu_kms_get_existing_global_state(dpu_kms);
> +	if (IS_ERR(global_state))
> +		return PTR_ERR(global_state);
> +
> +	dpu_rm_release(global_state, crtc);
> +
> +	if (!crtc_state->enable)
> +		return 0;
> +
> +	topology = dpu_crtc_get_topology(crtc, dpu_kms, crtc_state);
> +	ret = dpu_rm_reserve(&dpu_kms->rm, global_state,
> +			     crtc, &topology);
> +	if (ret)
> +		return ret;
> +
> +	return 0;
> +}
> +
>  static int dpu_crtc_atomic_check(struct drm_crtc *crtc,
>  		struct drm_atomic_state *state)
>  {
> @@ -1193,10 +1265,24 @@ static int dpu_crtc_atomic_check(struct drm_crtc *crtc,
>  	const struct drm_plane_state *pstate;
>  	struct drm_plane *plane;
>  
> +	struct drm_encoder *drm_enc;
> +
>  	int rc = 0;
>  
>  	bool needs_dirtyfb = dpu_crtc_needs_dirtyfb(crtc_state);
>  
> +	/* there might be cases where encoder needs a modeset too */
> +	drm_for_each_encoder_mask(drm_enc, crtc->dev, crtc_state->encoder_mask) {
> +		if (dpu_encoder_needs_modeset(drm_enc, crtc_state->state))
> +			crtc_state->mode_changed = true;

I will postpone this patch for a while, pending the review of the
drm_atomic_helper_check_modeset() series

https://lore.kernel.org/dri-devel/20241222-drm-dirty-modeset-v1-0-0e76a53eceb9@linaro.org/

Not to mention that this commit looks broken, see below.

> +	}
> +
> +	if (drm_atomic_crtc_needs_modeset(crtc_state)) {
> +		rc = dpu_crtc_assign_resources(crtc, crtc_state);
> +		if (rc < 0)
> +			return rc;
> +	}
> +
>  	if (!crtc_state->enable || !drm_atomic_crtc_effectively_active(crtc_state)) {
>  		DRM_DEBUG_ATOMIC("crtc%d -> enable %d, active %d, skip atomic_check\n",
>  				crtc->base.id, crtc_state->enable,
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> index e6f930dd34566d01223823de82c922668e6be300..2b999a0558b2a016644ed5d25bf54ab45c38d1d9 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> @@ -58,8 +58,6 @@
>  
>  #define IDLE_SHORT_TIMEOUT	1
>  
> -#define MAX_HDISPLAY_SPLIT 1080
> -
>  /* timeout in frames waiting for frame done */
>  #define DPU_ENCODER_FRAME_DONE_TIMEOUT_FRAMES 5
>  
> @@ -609,206 +607,127 @@ void dpu_encoder_helper_split_config(
>  	}
>  }
>  
> -/**
> - * dpu_encoder_use_dsc_merge - returns true if the encoder uses DSC merge topology.
> - * @drm_enc:    Pointer to previously created drm encoder structure
> - */
> -bool dpu_encoder_use_dsc_merge(struct drm_encoder *drm_enc)
> +void dpu_encoder_update_topology(struct drm_encoder *drm_enc,
> +				 struct msm_display_topology *topology,
> +				 struct drm_atomic_state *state,
> +				 const struct drm_display_mode *adj_mode)
>  {
>  	struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
> -	int i, intf_count = 0, num_dsc = 0;
> +	struct drm_connector *connector;
> +	struct drm_connector_state *conn_state;
> +	struct msm_display_info *disp_info;
> +	struct drm_framebuffer *fb;
> +	struct msm_drm_private *priv;
> +	int i;
>  
>  	for (i = 0; i < MAX_PHYS_ENCODERS_PER_VIRTUAL; i++)
>  		if (dpu_enc->phys_encs[i])
> -			intf_count++;
> +			topology->num_intf++;
>  
> -	/* See dpu_encoder_get_topology, we only support 2:2:1 topology */
> +	/* We only support 2 DSC mode (with 2 LM and 1 INTF) */
>  	if (dpu_enc->dsc)
> -		num_dsc = 2;
> -
> -	return (num_dsc > 0) && (num_dsc > intf_count);
> -}
> -
> -/**
> - * dpu_encoder_get_dsc_config - get DSC config for the DPU encoder
> - *   This helper function is used by physical encoder to get DSC config
> - *   used for this encoder.
> - * @drm_enc: Pointer to encoder structure
> - */
> -struct drm_dsc_config *dpu_encoder_get_dsc_config(struct drm_encoder *drm_enc)
> -{
> -	struct msm_drm_private *priv = drm_enc->dev->dev_private;
> -	struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
> -	int index = dpu_enc->disp_info.h_tile_instance[0];
> +		topology->num_dsc += 2;
>  
> -	if (dpu_enc->disp_info.intf_type == INTF_DSI)
> -		return msm_dsi_get_dsc_config(priv->dsi[index]);
> +	connector = drm_atomic_get_new_connector_for_encoder(state, drm_enc);
> +	if (!connector)
> +		return;
> +	conn_state = drm_atomic_get_new_connector_state(state, connector);
> +	if (!conn_state)
> +		return;
>  
> -	return NULL;
> -}
> +	disp_info = &dpu_enc->disp_info;
>  
> -static struct msm_display_topology dpu_encoder_get_topology(
> -			struct dpu_encoder_virt *dpu_enc,
> -			struct dpu_kms *dpu_kms,
> -			struct drm_display_mode *mode,
> -			struct drm_crtc_state *crtc_state,
> -			struct drm_dsc_config *dsc)
> -{
> -	struct msm_display_topology topology = {0};
> -	int i, intf_count = 0;
> +	priv = drm_enc->dev->dev_private;
>  
> -	for (i = 0; i < MAX_PHYS_ENCODERS_PER_VIRTUAL; i++)
> -		if (dpu_enc->phys_encs[i])
> -			intf_count++;
> -
> -	/* Datapath topology selection
> -	 *
> -	 * Dual display
> -	 * 2 LM, 2 INTF ( Split display using 2 interfaces)
> -	 *
> -	 * Single display
> -	 * 1 LM, 1 INTF
> -	 * 2 LM, 1 INTF (stream merge to support high resolution interfaces)
> -	 *
> -	 * Add dspps to the reservation requirements if ctm is requested
> +	/*
> +	 * Use CDM only for writeback or DP at the moment as other interfaces cannot handle it.
> +	 * If writeback itself cannot handle cdm for some reason it will fail in its atomic_check()
> +	 * earlier.
>  	 */
> -	if (intf_count == 2)
> -		topology.num_lm = 2;
> -	else if (!dpu_kms->catalog->caps->has_3d_merge)
> -		topology.num_lm = 1;
> -	else
> -		topology.num_lm = (mode->hdisplay > MAX_HDISPLAY_SPLIT) ? 2 : 1;
> -
> -	if (crtc_state->ctm)
> -		topology.num_dspp = topology.num_lm;
> -
> -	topology.num_intf = intf_count;
> +	if (disp_info->intf_type == INTF_WB && conn_state->writeback_job) {
> +		fb = conn_state->writeback_job->fb;
>  
> -	if (dsc) {
> -		/*
> -		 * In case of Display Stream Compression (DSC), we would use
> -		 * 2 DSC encoders, 2 layer mixers and 1 interface
> -		 * this is power optimal and can drive up to (including) 4k
> -		 * screens
> -		 */
> -		topology.num_dsc = 2;
> -		topology.num_lm = 2;
> -		topology.num_intf = 1;
> +		if (fb && MSM_FORMAT_IS_YUV(msm_framebuffer_format(fb)))
> +			topology->needs_cdm = true;
> +	} else if (disp_info->intf_type == INTF_DP) {
> +		if (msm_dp_is_yuv_420_enabled(priv->dp[disp_info->h_tile_instance[0]], adj_mode))
> +			topology->needs_cdm = true;
>  	}
> -
> -	return topology;
>  }
>  
> -static void dpu_encoder_assign_crtc_resources(struct dpu_kms *dpu_kms,
> -					      struct drm_encoder *drm_enc,
> -					      struct dpu_global_state *global_state,
> -					      struct drm_crtc_state *crtc_state)
> +static bool dpu_encoder_needs_dsc_merge(struct drm_encoder *drm_enc)
>  {
> -	struct dpu_crtc_state *cstate;
> -	struct dpu_hw_blk *hw_ctl[MAX_CHANNELS_PER_ENC];
> -	struct dpu_hw_blk *hw_lm[MAX_CHANNELS_PER_ENC];
> -	struct dpu_hw_blk *hw_dspp[MAX_CHANNELS_PER_ENC];
> -	int num_lm, num_ctl, num_dspp, i;
> -
> -	cstate = to_dpu_crtc_state(crtc_state);
> -
> -	memset(cstate->mixers, 0, sizeof(cstate->mixers));
> -
> -	num_ctl = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> -		drm_enc->crtc, DPU_HW_BLK_CTL, hw_ctl, ARRAY_SIZE(hw_ctl));
> -	num_lm = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> -		drm_enc->crtc, DPU_HW_BLK_LM, hw_lm, ARRAY_SIZE(hw_lm));
> -	num_dspp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> -		drm_enc->crtc, DPU_HW_BLK_DSPP, hw_dspp,
> -		ARRAY_SIZE(hw_dspp));
> +	struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
> +	u32 num_intf = 0;
> +	u32 num_dsc = 0;
> +	int i;
>  
> -	for (i = 0; i < num_lm; i++) {
> -		int ctl_idx = (i < num_ctl) ? i : (num_ctl-1);
> +	for (i = 0; i < MAX_PHYS_ENCODERS_PER_VIRTUAL; i++)
> +		if (dpu_enc->phys_encs[i])
> +			num_intf++;
>  
> -		cstate->mixers[i].hw_lm = to_dpu_hw_mixer(hw_lm[i]);
> -		cstate->mixers[i].lm_ctl = to_dpu_hw_ctl(hw_ctl[ctl_idx]);
> -		cstate->mixers[i].hw_dspp = i < num_dspp ? to_dpu_hw_dspp(hw_dspp[i]) : NULL;
> -	}
> +	/* We only support 2 DSC mode (with 2 LM and 1 INTF) */
> +	if (dpu_enc->dsc)
> +		num_dsc += 2;
>  
> -	cstate->num_mixers = num_lm;
> +	return (num_dsc > 0) && (num_dsc > num_intf);

Ok, so after all the rebases this commit removes CRTC resource
assignment from dpu_encoder.c, but they are added to dpu_crtc.c only in
the next commit! So after this one the tree is broken. This isn't really
acceptable. After each commit the tree should work, otherwise git-bisect
might return incorrect results.

Historically this patch just moved the allocation to the
dpu_crtc_atomic_check(), while cstate has been maniputated in
dpu_encoder_virt_atomic_mode_set(). Commit 3ae133b0192b ("drm/msm/dpu:
move CRTC resource assignment to dpu_encoder_virt_atomic_check") moved
resource handling to the atomic_check() stage. I think at this point you
need to take one step back, return to the previous commits, but revert
their order: first move cstate manipulation to happen during the
dpu_crtc_atomic_check() function call, leaving dpu_rm_release() /
dpu_rm_reserve() out of dpu_crtc_assign_resources() (in
dpu_encoder_virt_atomic_check() as they are now). Then can come this
patch, which moves topology handling, resource reservation, etc. to
dpu_crtc.c.

>  }
>  
> -static int dpu_encoder_virt_atomic_check(
> -		struct drm_encoder *drm_enc,
> -		struct drm_crtc_state *crtc_state,
> -		struct drm_connector_state *conn_state)
> +bool dpu_encoder_needs_modeset(struct drm_encoder *drm_enc, struct drm_atomic_state *state)
>  {
> -	struct dpu_encoder_virt *dpu_enc;
> -	struct msm_drm_private *priv;
> -	struct dpu_kms *dpu_kms;
> -	struct drm_display_mode *adj_mode;
> -	struct msm_display_topology topology;
> -	struct msm_display_info *disp_info;
> -	struct dpu_global_state *global_state;
> +	struct drm_connector *connector;
> +	struct drm_connector_state *conn_state;
>  	struct drm_framebuffer *fb;
> -	struct drm_dsc_config *dsc;
> -	int ret = 0;
> -
> -	if (!drm_enc || !crtc_state || !conn_state) {
> -		DPU_ERROR("invalid arg(s), drm_enc %d, crtc/conn state %d/%d\n",
> -				drm_enc != NULL, crtc_state != NULL, conn_state != NULL);
> -		return -EINVAL;
> -	}
> -
> -	dpu_enc = to_dpu_encoder_virt(drm_enc);
> -	DPU_DEBUG_ENC(dpu_enc, "\n");
> -
> -	priv = drm_enc->dev->dev_private;
> -	disp_info = &dpu_enc->disp_info;
> -	dpu_kms = to_dpu_kms(priv->kms);
> -	adj_mode = &crtc_state->adjusted_mode;
> -	global_state = dpu_kms_get_global_state(crtc_state->state);
> -	if (IS_ERR(global_state))
> -		return PTR_ERR(global_state);
> +	struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
>  
> -	trace_dpu_enc_atomic_check(DRMID(drm_enc));
> +	if (!drm_enc || !state)
> +		return false;
>  
> -	dsc = dpu_encoder_get_dsc_config(drm_enc);
> +	connector = drm_atomic_get_new_connector_for_encoder(state, drm_enc);
> +	if (!connector)
> +		return false;
>  
> -	topology = dpu_encoder_get_topology(dpu_enc, dpu_kms, adj_mode, crtc_state, dsc);
> +	conn_state = drm_atomic_get_new_connector_state(state, connector);
>  
> -	/*
> -	 * Use CDM only for writeback or DP at the moment as other interfaces cannot handle it.
> -	 * If writeback itself cannot handle cdm for some reason it will fail in its atomic_check()
> -	 * earlier.
> -	 */
> -	if (disp_info->intf_type == INTF_WB && conn_state->writeback_job) {
> +	if (dpu_enc->disp_info.intf_type == INTF_WB && conn_state->writeback_job) {
>  		fb = conn_state->writeback_job->fb;
> -
> -		if (fb && MSM_FORMAT_IS_YUV(msm_framebuffer_format(fb)))
> -			topology.needs_cdm = true;
> -	} else if (disp_info->intf_type == INTF_DP) {
> -		if (msm_dp_is_yuv_420_enabled(priv->dp[disp_info->h_tile_instance[0]], adj_mode))
> -			topology.needs_cdm = true;
> +		if (fb && MSM_FORMAT_IS_YUV(msm_framebuffer_format(fb))) {
> +			if (!dpu_enc->cur_master->hw_cdm)
> +				return true;
> +		} else {
> +			if (dpu_enc->cur_master->hw_cdm)
> +				return true;
> +		}
>  	}
>  
> -	if (topology.needs_cdm && !dpu_enc->cur_master->hw_cdm)
> -		crtc_state->mode_changed = true;
> -	else if (!topology.needs_cdm && dpu_enc->cur_master->hw_cdm)
> -		crtc_state->mode_changed = true;
> -	/*
> -	 * Release and Allocate resources on every modeset
> -	 * Dont allocate when active is false.
> -	 */
> -	if (drm_atomic_crtc_needs_modeset(crtc_state)) {
> -		dpu_rm_release(global_state, crtc_state->crtc);
> +	return false;
> +}
>  
> -		if (!crtc_state->active_changed || crtc_state->enable)
> -			ret = dpu_rm_reserve(&dpu_kms->rm, global_state,
> -					crtc_state->crtc, &topology);
> -		if (!ret)
> -			dpu_encoder_assign_crtc_resources(dpu_kms, drm_enc,
> -							  global_state, crtc_state);
> -	}
> +/**
> + * dpu_encoder_get_dsc_config - get DSC config for the DPU encoder
> + *   This helper function is used by physical encoder to get DSC config
> + *   used for this encoder.
> + * @drm_enc: Pointer to encoder structure
> + */
> +struct drm_dsc_config *dpu_encoder_get_dsc_config(struct drm_encoder *drm_enc)
> +{
> +	struct msm_drm_private *priv = drm_enc->dev->dev_private;
> +	struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
> +	int index = dpu_enc->disp_info.h_tile_instance[0];
>  
> -	trace_dpu_enc_atomic_check_flags(DRMID(drm_enc), adj_mode->flags);
> +	if (dpu_enc->disp_info.intf_type == INTF_DSI)
> +		return msm_dsi_get_dsc_config(priv->dsi[index]);
>  
> -	return ret;
> +	return NULL;
> +}
> +
> +/**
> + * dpu_encoder_use_dsc_merge - returns true if the encoder uses DSC merge topology.
> + * @drm_enc:    Pointer to previously created drm encoder structure
> + */
> +bool dpu_encoder_use_dsc_merge(struct drm_encoder *drm_enc)
> +{
> +	return dpu_encoder_needs_dsc_merge(drm_enc);
>  }
>  
>  static void _dpu_encoder_update_vsync_source(struct dpu_encoder_virt *dpu_enc,
> @@ -2627,7 +2546,6 @@ static const struct drm_encoder_helper_funcs dpu_encoder_helper_funcs = {
>  	.atomic_mode_set = dpu_encoder_virt_atomic_mode_set,
>  	.atomic_disable = dpu_encoder_virt_atomic_disable,
>  	.atomic_enable = dpu_encoder_virt_atomic_enable,
> -	.atomic_check = dpu_encoder_virt_atomic_check,
>  };
>  
>  static const struct drm_encoder_funcs dpu_encoder_funcs = {
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
> index 92b5ee390788d16e85e195a664417896a2bf1cae..3db3ea076c377ad5411ec85006bcf4cd9757eb1d 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
> @@ -11,6 +11,7 @@
>  
>  #include <drm/drm_crtc.h>
>  #include "dpu_hw_mdss.h"
> +#include "dpu_kms.h"
>  
>  #define DPU_ENCODER_FRAME_EVENT_DONE			BIT(0)
>  #define DPU_ENCODER_FRAME_EVENT_ERROR			BIT(1)
> @@ -80,6 +81,13 @@ int dpu_encoder_get_crc(const struct drm_encoder *drm_enc, u32 *crcs, int pos);
>  
>  bool dpu_encoder_use_dsc_merge(struct drm_encoder *drm_enc);
>  
> +void dpu_encoder_update_topology(struct drm_encoder *drm_enc,
> +				 struct msm_display_topology *topology,
> +				 struct drm_atomic_state *state,
> +				 const struct drm_display_mode *adj_mode);
> +
> +bool dpu_encoder_needs_modeset(struct drm_encoder *drm_enc, struct drm_atomic_state *state);
> +
>  void dpu_encoder_prepare_wb_job(struct drm_encoder *drm_enc,
>  		struct drm_writeback_job *job);
>  
> 
> -- 
> 2.34.1
>
Dmitry Baryshkov Dec. 24, 2024, 8:41 p.m. UTC | #8
On Mon, 16 Dec 2024 16:43:11 -0800, Jessica Zhang wrote:
> DPU supports a single writeback session running concurrently with primary
> display when the CWB mux is configured properly. This series enables
> clone mode for DPU driver and adds support for programming the CWB mux
> in cases where the hardware has dedicated CWB pingpong blocks. Currently,
> the CWB hardware blocks have only been added to the SM8650
> hardware catalog and only DSI has been exposed as a possible_clone of WB.
> 
> [...]

Applied, thanks!

[05/25] drm/msm/dpu: get rid of struct dpu_rm_requirements
        https://gitlab.freedesktop.org/lumag/msm/-/commit/835d10620445
[09/25] drm/msm/dpu: Add CWB entry to catalog for SM8650
        https://gitlab.freedesktop.org/lumag/msm/-/commit/989412edae5b
[10/25] drm/msm/dpu: Specify dedicated CWB pingpong blocks
        https://gitlab.freedesktop.org/lumag/msm/-/commit/d1fe88dd53ae
[11/25] drm/msm/dpu: add devcoredumps for cwb registers
        https://gitlab.freedesktop.org/lumag/msm/-/commit/675c1edfa92d
[12/25] drm/msm/dpu: Add dpu_hw_cwb abstraction for CWB block
        https://gitlab.freedesktop.org/lumag/msm/-/commit/aae8736426c6
[13/25] drm/msm/dpu: add CWB support to dpu_hw_wb
        https://gitlab.freedesktop.org/lumag/msm/-/commit/a31a610fd44b
[14/25] drm/msm/dpu: Add RM support for allocating CWB
        https://gitlab.freedesktop.org/lumag/msm/-/commit/a5463629299b

Best regards,
Jessica Zhang Dec. 26, 2024, 6:14 p.m. UTC | #9
On 12/23/2024 9:02 PM, Dmitry Baryshkov wrote:
> On Mon, Dec 16, 2024 at 04:43:18PM -0800, Jessica Zhang wrote:
>> From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>>
>> All resource allocation is centered around the LMs. Then other blocks
>> (except DSCs) are allocated basing on the LMs that was selected, and LM
>> powers up the CRTC rather than the encoder.
>>
>> Moreover if at some point the driver supports encoder cloning,
>> allocating resources from the encoder will be incorrect, as all clones
>> will have different encoder IDs, while LMs are to be shared by these
>> encoders.
>>
>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>> [quic_abhinavk@quicinc.com: Refactored resource allocation for CDM]
>> Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
>> [quic_jesszhan@quicinc.com: Changed to grabbing exising global state]
>> Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
>> ---
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c    |  86 ++++++++++
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 256 ++++++++++------------------
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h |   8 +
>>   3 files changed, 181 insertions(+), 169 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
>> index 9f6ffd344693ecfb633095772a31ada5613345dc..186ed84f59f16997716fe216e635b8dce07a63a1 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
>> @@ -1182,6 +1182,78 @@ static bool dpu_crtc_needs_dirtyfb(struct drm_crtc_state *cstate)
>>   	return false;
>>   }
>>   
>> +#define MAX_HDISPLAY_SPLIT 1080
>> +
>> +static struct msm_display_topology dpu_crtc_get_topology(
>> +		struct drm_crtc *crtc,
>> +		struct dpu_kms *dpu_kms,
>> +		struct drm_crtc_state *crtc_state)
>> +{
>> +	struct drm_display_mode *mode = &crtc_state->adjusted_mode;
>> +	struct msm_display_topology topology = {0};
>> +	struct drm_encoder *drm_enc;
>> +
>> +	drm_for_each_encoder_mask(drm_enc, crtc->dev, crtc_state->encoder_mask)
>> +		dpu_encoder_update_topology(drm_enc, &topology, crtc_state->state,
>> +					    &crtc_state->adjusted_mode);
>> +
>> +	/*
>> +	 * Datapath topology selection
>> +	 *
>> +	 * Dual display
>> +	 * 2 LM, 2 INTF ( Split display using 2 interfaces)
>> +	 *
>> +	 * Single display
>> +	 * 1 LM, 1 INTF
>> +	 * 2 LM, 1 INTF (stream merge to support high resolution interfaces)
>> +	 *
>> +	 * Add dspps to the reservation requirements if ctm is requested
>> +	 */
>> +
>> +	if (topology.num_intf == 2)
>> +		topology.num_lm = 2;
>> +	else if (topology.num_dsc == 2)
>> +		topology.num_lm = 2;
>> +	else if (dpu_kms->catalog->caps->has_3d_merge)
>> +		topology.num_lm = (mode->hdisplay > MAX_HDISPLAY_SPLIT) ? 2 : 1;
>> +	else
>> +		topology.num_lm = 1;
>> +
>> +	if (crtc_state->ctm)
>> +		topology.num_dspp = topology.num_lm;
>> +
>> +	return topology;
>> +}
>> +
>> +static int dpu_crtc_assign_resources(struct drm_crtc *crtc, struct drm_crtc_state *crtc_state)
>> +{
>> +	struct dpu_kms *dpu_kms = _dpu_crtc_get_kms(crtc);
>> +	struct dpu_global_state *global_state;
>> +	struct msm_display_topology topology;
>> +	int ret;
>> +
>> +	/*
>> +	 * Release and Allocate resources on every modeset
>> +	 * Dont allocate when enable is false.
>> +	 */
>> +	global_state = dpu_kms_get_existing_global_state(dpu_kms);
>> +	if (IS_ERR(global_state))
>> +		return PTR_ERR(global_state);
>> +
>> +	dpu_rm_release(global_state, crtc);
>> +
>> +	if (!crtc_state->enable)
>> +		return 0;
>> +
>> +	topology = dpu_crtc_get_topology(crtc, dpu_kms, crtc_state);
>> +	ret = dpu_rm_reserve(&dpu_kms->rm, global_state,
>> +			     crtc, &topology);
>> +	if (ret)
>> +		return ret;
>> +
>> +	return 0;
>> +}
>> +
>>   static int dpu_crtc_atomic_check(struct drm_crtc *crtc,
>>   		struct drm_atomic_state *state)
>>   {
>> @@ -1193,10 +1265,24 @@ static int dpu_crtc_atomic_check(struct drm_crtc *crtc,
>>   	const struct drm_plane_state *pstate;
>>   	struct drm_plane *plane;
>>   
>> +	struct drm_encoder *drm_enc;
>> +
>>   	int rc = 0;
>>   
>>   	bool needs_dirtyfb = dpu_crtc_needs_dirtyfb(crtc_state);
>>   
>> +	/* there might be cases where encoder needs a modeset too */
>> +	drm_for_each_encoder_mask(drm_enc, crtc->dev, crtc_state->encoder_mask) {
>> +		if (dpu_encoder_needs_modeset(drm_enc, crtc_state->state))
>> +			crtc_state->mode_changed = true;
> 
> I will postpone this patch for a while, pending the review of the
> drm_atomic_helper_check_modeset() series
> 
> https://lore.kernel.org/dri-devel/20241222-drm-dirty-modeset-v1-0-0e76a53eceb9@linaro.org/
> 
> Not to mention that this commit looks broken, see below.
> 
>> +	}
>> +
>> +	if (drm_atomic_crtc_needs_modeset(crtc_state)) {
>> +		rc = dpu_crtc_assign_resources(crtc, crtc_state);
>> +		if (rc < 0)
>> +			return rc;
>> +	}
>> +
>>   	if (!crtc_state->enable || !drm_atomic_crtc_effectively_active(crtc_state)) {
>>   		DRM_DEBUG_ATOMIC("crtc%d -> enable %d, active %d, skip atomic_check\n",
>>   				crtc->base.id, crtc_state->enable,
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>> index e6f930dd34566d01223823de82c922668e6be300..2b999a0558b2a016644ed5d25bf54ab45c38d1d9 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>> @@ -58,8 +58,6 @@
>>   
>>   #define IDLE_SHORT_TIMEOUT	1
>>   
>> -#define MAX_HDISPLAY_SPLIT 1080
>> -
>>   /* timeout in frames waiting for frame done */
>>   #define DPU_ENCODER_FRAME_DONE_TIMEOUT_FRAMES 5
>>   
>> @@ -609,206 +607,127 @@ void dpu_encoder_helper_split_config(
>>   	}
>>   }
>>   
>> -/**
>> - * dpu_encoder_use_dsc_merge - returns true if the encoder uses DSC merge topology.
>> - * @drm_enc:    Pointer to previously created drm encoder structure
>> - */
>> -bool dpu_encoder_use_dsc_merge(struct drm_encoder *drm_enc)
>> +void dpu_encoder_update_topology(struct drm_encoder *drm_enc,
>> +				 struct msm_display_topology *topology,
>> +				 struct drm_atomic_state *state,
>> +				 const struct drm_display_mode *adj_mode)
>>   {
>>   	struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
>> -	int i, intf_count = 0, num_dsc = 0;
>> +	struct drm_connector *connector;
>> +	struct drm_connector_state *conn_state;
>> +	struct msm_display_info *disp_info;
>> +	struct drm_framebuffer *fb;
>> +	struct msm_drm_private *priv;
>> +	int i;
>>   
>>   	for (i = 0; i < MAX_PHYS_ENCODERS_PER_VIRTUAL; i++)
>>   		if (dpu_enc->phys_encs[i])
>> -			intf_count++;
>> +			topology->num_intf++;
>>   
>> -	/* See dpu_encoder_get_topology, we only support 2:2:1 topology */
>> +	/* We only support 2 DSC mode (with 2 LM and 1 INTF) */
>>   	if (dpu_enc->dsc)
>> -		num_dsc = 2;
>> -
>> -	return (num_dsc > 0) && (num_dsc > intf_count);
>> -}
>> -
>> -/**
>> - * dpu_encoder_get_dsc_config - get DSC config for the DPU encoder
>> - *   This helper function is used by physical encoder to get DSC config
>> - *   used for this encoder.
>> - * @drm_enc: Pointer to encoder structure
>> - */
>> -struct drm_dsc_config *dpu_encoder_get_dsc_config(struct drm_encoder *drm_enc)
>> -{
>> -	struct msm_drm_private *priv = drm_enc->dev->dev_private;
>> -	struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
>> -	int index = dpu_enc->disp_info.h_tile_instance[0];
>> +		topology->num_dsc += 2;
>>   
>> -	if (dpu_enc->disp_info.intf_type == INTF_DSI)
>> -		return msm_dsi_get_dsc_config(priv->dsi[index]);
>> +	connector = drm_atomic_get_new_connector_for_encoder(state, drm_enc);
>> +	if (!connector)
>> +		return;
>> +	conn_state = drm_atomic_get_new_connector_state(state, connector);
>> +	if (!conn_state)
>> +		return;
>>   
>> -	return NULL;
>> -}
>> +	disp_info = &dpu_enc->disp_info;
>>   
>> -static struct msm_display_topology dpu_encoder_get_topology(
>> -			struct dpu_encoder_virt *dpu_enc,
>> -			struct dpu_kms *dpu_kms,
>> -			struct drm_display_mode *mode,
>> -			struct drm_crtc_state *crtc_state,
>> -			struct drm_dsc_config *dsc)
>> -{
>> -	struct msm_display_topology topology = {0};
>> -	int i, intf_count = 0;
>> +	priv = drm_enc->dev->dev_private;
>>   
>> -	for (i = 0; i < MAX_PHYS_ENCODERS_PER_VIRTUAL; i++)
>> -		if (dpu_enc->phys_encs[i])
>> -			intf_count++;
>> -
>> -	/* Datapath topology selection
>> -	 *
>> -	 * Dual display
>> -	 * 2 LM, 2 INTF ( Split display using 2 interfaces)
>> -	 *
>> -	 * Single display
>> -	 * 1 LM, 1 INTF
>> -	 * 2 LM, 1 INTF (stream merge to support high resolution interfaces)
>> -	 *
>> -	 * Add dspps to the reservation requirements if ctm is requested
>> +	/*
>> +	 * Use CDM only for writeback or DP at the moment as other interfaces cannot handle it.
>> +	 * If writeback itself cannot handle cdm for some reason it will fail in its atomic_check()
>> +	 * earlier.
>>   	 */
>> -	if (intf_count == 2)
>> -		topology.num_lm = 2;
>> -	else if (!dpu_kms->catalog->caps->has_3d_merge)
>> -		topology.num_lm = 1;
>> -	else
>> -		topology.num_lm = (mode->hdisplay > MAX_HDISPLAY_SPLIT) ? 2 : 1;
>> -
>> -	if (crtc_state->ctm)
>> -		topology.num_dspp = topology.num_lm;
>> -
>> -	topology.num_intf = intf_count;
>> +	if (disp_info->intf_type == INTF_WB && conn_state->writeback_job) {
>> +		fb = conn_state->writeback_job->fb;
>>   
>> -	if (dsc) {
>> -		/*
>> -		 * In case of Display Stream Compression (DSC), we would use
>> -		 * 2 DSC encoders, 2 layer mixers and 1 interface
>> -		 * this is power optimal and can drive up to (including) 4k
>> -		 * screens
>> -		 */
>> -		topology.num_dsc = 2;
>> -		topology.num_lm = 2;
>> -		topology.num_intf = 1;
>> +		if (fb && MSM_FORMAT_IS_YUV(msm_framebuffer_format(fb)))
>> +			topology->needs_cdm = true;
>> +	} else if (disp_info->intf_type == INTF_DP) {
>> +		if (msm_dp_is_yuv_420_enabled(priv->dp[disp_info->h_tile_instance[0]], adj_mode))
>> +			topology->needs_cdm = true;
>>   	}
>> -
>> -	return topology;
>>   }
>>   
>> -static void dpu_encoder_assign_crtc_resources(struct dpu_kms *dpu_kms,
>> -					      struct drm_encoder *drm_enc,
>> -					      struct dpu_global_state *global_state,
>> -					      struct drm_crtc_state *crtc_state)
>> +static bool dpu_encoder_needs_dsc_merge(struct drm_encoder *drm_enc)
>>   {
>> -	struct dpu_crtc_state *cstate;
>> -	struct dpu_hw_blk *hw_ctl[MAX_CHANNELS_PER_ENC];
>> -	struct dpu_hw_blk *hw_lm[MAX_CHANNELS_PER_ENC];
>> -	struct dpu_hw_blk *hw_dspp[MAX_CHANNELS_PER_ENC];
>> -	int num_lm, num_ctl, num_dspp, i;
>> -
>> -	cstate = to_dpu_crtc_state(crtc_state);
>> -
>> -	memset(cstate->mixers, 0, sizeof(cstate->mixers));
>> -
>> -	num_ctl = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
>> -		drm_enc->crtc, DPU_HW_BLK_CTL, hw_ctl, ARRAY_SIZE(hw_ctl));
>> -	num_lm = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
>> -		drm_enc->crtc, DPU_HW_BLK_LM, hw_lm, ARRAY_SIZE(hw_lm));
>> -	num_dspp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
>> -		drm_enc->crtc, DPU_HW_BLK_DSPP, hw_dspp,
>> -		ARRAY_SIZE(hw_dspp));
>> +	struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
>> +	u32 num_intf = 0;
>> +	u32 num_dsc = 0;
>> +	int i;
>>   
>> -	for (i = 0; i < num_lm; i++) {
>> -		int ctl_idx = (i < num_ctl) ? i : (num_ctl-1);
>> +	for (i = 0; i < MAX_PHYS_ENCODERS_PER_VIRTUAL; i++)
>> +		if (dpu_enc->phys_encs[i])
>> +			num_intf++;
>>   
>> -		cstate->mixers[i].hw_lm = to_dpu_hw_mixer(hw_lm[i]);
>> -		cstate->mixers[i].lm_ctl = to_dpu_hw_ctl(hw_ctl[ctl_idx]);
>> -		cstate->mixers[i].hw_dspp = i < num_dspp ? to_dpu_hw_dspp(hw_dspp[i]) : NULL;
>> -	}
>> +	/* We only support 2 DSC mode (with 2 LM and 1 INTF) */
>> +	if (dpu_enc->dsc)
>> +		num_dsc += 2;
>>   
>> -	cstate->num_mixers = num_lm;
>> +	return (num_dsc > 0) && (num_dsc > num_intf);
> 
> Ok, so after all the rebases this commit removes CRTC resource
> assignment from dpu_encoder.c, but they are added to dpu_crtc.c only in
> the next commit! So after this one the tree is broken. This isn't really
> acceptable. After each commit the tree should work, otherwise git-bisect
> might return incorrect results.
> 
> Historically this patch just moved the allocation to the
> dpu_crtc_atomic_check(), while cstate has been maniputated in
> dpu_encoder_virt_atomic_mode_set(). Commit 3ae133b0192b ("drm/msm/dpu:
> move CRTC resource assignment to dpu_encoder_virt_atomic_check") moved
> resource handling to the atomic_check() stage. I think at this point you
> need to take one step back, return to the previous commits, but revert
> their order: first move cstate manipulation to happen during the
> dpu_crtc_atomic_check() function call, leaving dpu_rm_release() /
> dpu_rm_reserve() out of dpu_crtc_assign_resources() (in
> dpu_encoder_virt_atomic_check() as they are now). Then can come this
> patch, which moves topology handling, resource reservation, etc. to
> dpu_crtc.c.

Hey Dmitry,

Thanks for the heads up! I'll get this fixed.

- Jessica Zhang

> 
>>   }
>>   
>> -static int dpu_encoder_virt_atomic_check(
>> -		struct drm_encoder *drm_enc,
>> -		struct drm_crtc_state *crtc_state,
>> -		struct drm_connector_state *conn_state)
>> +bool dpu_encoder_needs_modeset(struct drm_encoder *drm_enc, struct drm_atomic_state *state)
>>   {
>> -	struct dpu_encoder_virt *dpu_enc;
>> -	struct msm_drm_private *priv;
>> -	struct dpu_kms *dpu_kms;
>> -	struct drm_display_mode *adj_mode;
>> -	struct msm_display_topology topology;
>> -	struct msm_display_info *disp_info;
>> -	struct dpu_global_state *global_state;
>> +	struct drm_connector *connector;
>> +	struct drm_connector_state *conn_state;
>>   	struct drm_framebuffer *fb;
>> -	struct drm_dsc_config *dsc;
>> -	int ret = 0;
>> -
>> -	if (!drm_enc || !crtc_state || !conn_state) {
>> -		DPU_ERROR("invalid arg(s), drm_enc %d, crtc/conn state %d/%d\n",
>> -				drm_enc != NULL, crtc_state != NULL, conn_state != NULL);
>> -		return -EINVAL;
>> -	}
>> -
>> -	dpu_enc = to_dpu_encoder_virt(drm_enc);
>> -	DPU_DEBUG_ENC(dpu_enc, "\n");
>> -
>> -	priv = drm_enc->dev->dev_private;
>> -	disp_info = &dpu_enc->disp_info;
>> -	dpu_kms = to_dpu_kms(priv->kms);
>> -	adj_mode = &crtc_state->adjusted_mode;
>> -	global_state = dpu_kms_get_global_state(crtc_state->state);
>> -	if (IS_ERR(global_state))
>> -		return PTR_ERR(global_state);
>> +	struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
>>   
>> -	trace_dpu_enc_atomic_check(DRMID(drm_enc));
>> +	if (!drm_enc || !state)
>> +		return false;
>>   
>> -	dsc = dpu_encoder_get_dsc_config(drm_enc);
>> +	connector = drm_atomic_get_new_connector_for_encoder(state, drm_enc);
>> +	if (!connector)
>> +		return false;
>>   
>> -	topology = dpu_encoder_get_topology(dpu_enc, dpu_kms, adj_mode, crtc_state, dsc);
>> +	conn_state = drm_atomic_get_new_connector_state(state, connector);
>>   
>> -	/*
>> -	 * Use CDM only for writeback or DP at the moment as other interfaces cannot handle it.
>> -	 * If writeback itself cannot handle cdm for some reason it will fail in its atomic_check()
>> -	 * earlier.
>> -	 */
>> -	if (disp_info->intf_type == INTF_WB && conn_state->writeback_job) {
>> +	if (dpu_enc->disp_info.intf_type == INTF_WB && conn_state->writeback_job) {
>>   		fb = conn_state->writeback_job->fb;
>> -
>> -		if (fb && MSM_FORMAT_IS_YUV(msm_framebuffer_format(fb)))
>> -			topology.needs_cdm = true;
>> -	} else if (disp_info->intf_type == INTF_DP) {
>> -		if (msm_dp_is_yuv_420_enabled(priv->dp[disp_info->h_tile_instance[0]], adj_mode))
>> -			topology.needs_cdm = true;
>> +		if (fb && MSM_FORMAT_IS_YUV(msm_framebuffer_format(fb))) {
>> +			if (!dpu_enc->cur_master->hw_cdm)
>> +				return true;
>> +		} else {
>> +			if (dpu_enc->cur_master->hw_cdm)
>> +				return true;
>> +		}
>>   	}
>>   
>> -	if (topology.needs_cdm && !dpu_enc->cur_master->hw_cdm)
>> -		crtc_state->mode_changed = true;
>> -	else if (!topology.needs_cdm && dpu_enc->cur_master->hw_cdm)
>> -		crtc_state->mode_changed = true;
>> -	/*
>> -	 * Release and Allocate resources on every modeset
>> -	 * Dont allocate when active is false.
>> -	 */
>> -	if (drm_atomic_crtc_needs_modeset(crtc_state)) {
>> -		dpu_rm_release(global_state, crtc_state->crtc);
>> +	return false;
>> +}
>>   
>> -		if (!crtc_state->active_changed || crtc_state->enable)
>> -			ret = dpu_rm_reserve(&dpu_kms->rm, global_state,
>> -					crtc_state->crtc, &topology);
>> -		if (!ret)
>> -			dpu_encoder_assign_crtc_resources(dpu_kms, drm_enc,
>> -							  global_state, crtc_state);
>> -	}
>> +/**
>> + * dpu_encoder_get_dsc_config - get DSC config for the DPU encoder
>> + *   This helper function is used by physical encoder to get DSC config
>> + *   used for this encoder.
>> + * @drm_enc: Pointer to encoder structure
>> + */
>> +struct drm_dsc_config *dpu_encoder_get_dsc_config(struct drm_encoder *drm_enc)
>> +{
>> +	struct msm_drm_private *priv = drm_enc->dev->dev_private;
>> +	struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(drm_enc);
>> +	int index = dpu_enc->disp_info.h_tile_instance[0];
>>   
>> -	trace_dpu_enc_atomic_check_flags(DRMID(drm_enc), adj_mode->flags);
>> +	if (dpu_enc->disp_info.intf_type == INTF_DSI)
>> +		return msm_dsi_get_dsc_config(priv->dsi[index]);
>>   
>> -	return ret;
>> +	return NULL;
>> +}
>> +
>> +/**
>> + * dpu_encoder_use_dsc_merge - returns true if the encoder uses DSC merge topology.
>> + * @drm_enc:    Pointer to previously created drm encoder structure
>> + */
>> +bool dpu_encoder_use_dsc_merge(struct drm_encoder *drm_enc)
>> +{
>> +	return dpu_encoder_needs_dsc_merge(drm_enc);
>>   }
>>   
>>   static void _dpu_encoder_update_vsync_source(struct dpu_encoder_virt *dpu_enc,
>> @@ -2627,7 +2546,6 @@ static const struct drm_encoder_helper_funcs dpu_encoder_helper_funcs = {
>>   	.atomic_mode_set = dpu_encoder_virt_atomic_mode_set,
>>   	.atomic_disable = dpu_encoder_virt_atomic_disable,
>>   	.atomic_enable = dpu_encoder_virt_atomic_enable,
>> -	.atomic_check = dpu_encoder_virt_atomic_check,
>>   };
>>   
>>   static const struct drm_encoder_funcs dpu_encoder_funcs = {
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
>> index 92b5ee390788d16e85e195a664417896a2bf1cae..3db3ea076c377ad5411ec85006bcf4cd9757eb1d 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
>> @@ -11,6 +11,7 @@
>>   
>>   #include <drm/drm_crtc.h>
>>   #include "dpu_hw_mdss.h"
>> +#include "dpu_kms.h"
>>   
>>   #define DPU_ENCODER_FRAME_EVENT_DONE			BIT(0)
>>   #define DPU_ENCODER_FRAME_EVENT_ERROR			BIT(1)
>> @@ -80,6 +81,13 @@ int dpu_encoder_get_crc(const struct drm_encoder *drm_enc, u32 *crcs, int pos);
>>   
>>   bool dpu_encoder_use_dsc_merge(struct drm_encoder *drm_enc);
>>   
>> +void dpu_encoder_update_topology(struct drm_encoder *drm_enc,
>> +				 struct msm_display_topology *topology,
>> +				 struct drm_atomic_state *state,
>> +				 const struct drm_display_mode *adj_mode);
>> +
>> +bool dpu_encoder_needs_modeset(struct drm_encoder *drm_enc, struct drm_atomic_state *state);
>> +
>>   void dpu_encoder_prepare_wb_job(struct drm_encoder *drm_enc,
>>   		struct drm_writeback_job *job);
>>   
>>
>> -- 
>> 2.34.1
>>
> 
> -- 
> With best wishes
> Dmitry
Jessica Zhang Dec. 26, 2024, 10:49 p.m. UTC | #10
On 12/20/2024 5:07 PM, Dmitry Baryshkov wrote:
> On Fri, Dec 20, 2024 at 04:12:29PM -0800, Jessica Zhang wrote:
>>
>>
>> On 12/19/2024 9:52 PM, Dmitry Baryshkov wrote:
>>> On Mon, Dec 16, 2024 at 04:43:29PM -0800, Jessica Zhang wrote:
>>>> Add support for RM to reserve dedicated CWB PINGPONGs and CWB muxes
>>>>
>>>> For concurrent writeback, even-indexed CWB muxes must be assigned to
>>>> even-indexed LMs and odd-indexed CWB muxes for odd-indexed LMs. The same
>>>> even/odd rule applies for dedicated CWB PINGPONGs.
>>>>
>>>> Track the CWB muxes in the global state and add a CWB-specific helper to
>>>> reserve the correct CWB muxes and dedicated PINGPONGs following the
>>>> even/odd rule.
>>>>
>>>> Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
>>>> ---
>>>>    drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 34 ++++++++++--
>>>>    drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h |  2 +
>>>>    drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h     |  1 +
>>>>    drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c      | 83 +++++++++++++++++++++++++++++
>>>>    4 files changed, 116 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>>>> index a895d48fe81ccc71d265e089992786e8b6268b1b..a95dc1f0c6a422485c7ba98743e944e1a4f43539 100644
>>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>>>> @@ -2,7 +2,7 @@
>>>>    /*
>>>>     * Copyright (C) 2013 Red Hat
>>>>     * Copyright (c) 2014-2018, 2020-2021 The Linux Foundation. All rights reserved.
>>>> - * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
>>>> + * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved.
>>>>     *
>>>>     * Author: Rob Clark <robdclark@gmail.com>
>>>>     */
>>>> @@ -28,6 +28,7 @@
>>>>    #include "dpu_hw_dsc.h"
>>>>    #include "dpu_hw_merge3d.h"
>>>>    #include "dpu_hw_cdm.h"
>>>> +#include "dpu_hw_cwb.h"
>>>>    #include "dpu_formats.h"
>>>>    #include "dpu_encoder_phys.h"
>>>>    #include "dpu_crtc.h"
>>>> @@ -133,6 +134,9 @@ enum dpu_enc_rc_states {
>>>>     * @cur_slave:		As above but for the slave encoder.
>>>>     * @hw_pp:		Handle to the pingpong blocks used for the display. No.
>>>>     *			pingpong blocks can be different than num_phys_encs.
>>>> + * @hw_cwb:		Handle to the CWB muxes used for concurrent writeback
>>>> + *			display. Number of CWB muxes can be different than
>>>> + *			num_phys_encs.
>>>>     * @hw_dsc:		Handle to the DSC blocks used for the display.
>>>>     * @dsc_mask:		Bitmask of used DSC blocks.
>>>>     * @intfs_swapped:	Whether or not the phys_enc interfaces have been swapped
>>>> @@ -177,6 +181,7 @@ struct dpu_encoder_virt {
>>>>    	struct dpu_encoder_phys *cur_master;
>>>>    	struct dpu_encoder_phys *cur_slave;
>>>>    	struct dpu_hw_pingpong *hw_pp[MAX_CHANNELS_PER_ENC];
>>>> +	struct dpu_hw_cwb *hw_cwb[MAX_CHANNELS_PER_ENC];
>>>>    	struct dpu_hw_dsc *hw_dsc[MAX_CHANNELS_PER_ENC];
>>>>    	unsigned int dsc_mask;
>>>> @@ -1138,7 +1143,10 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
>>>>    	struct dpu_hw_blk *hw_pp[MAX_CHANNELS_PER_ENC];
>>>>    	struct dpu_hw_blk *hw_ctl[MAX_CHANNELS_PER_ENC];
>>>>    	struct dpu_hw_blk *hw_dsc[MAX_CHANNELS_PER_ENC];
>>>> +	struct dpu_hw_blk *hw_cwb[MAX_CHANNELS_PER_ENC];
>>>>    	int num_pp, num_dsc, num_ctl;
>>>> +	int num_cwb = 0;
>>>> +	bool is_cwb_encoder;
>>>>    	unsigned int dsc_mask = 0;
>>>>    	int i;
>>>> @@ -1152,6 +1160,8 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
>>>>    	priv = drm_enc->dev->dev_private;
>>>>    	dpu_kms = to_dpu_kms(priv->kms);
>>>> +	is_cwb_encoder = drm_crtc_in_clone_mode(crtc_state) &&
>>>> +			dpu_enc->disp_info.intf_type == INTF_WB;
>>>>    	global_state = dpu_kms_get_existing_global_state(dpu_kms);
>>>>    	if (IS_ERR_OR_NULL(global_state)) {
>>>> @@ -1162,9 +1172,25 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
>>>>    	trace_dpu_enc_mode_set(DRMID(drm_enc));
>>>>    	/* Query resource that have been reserved in atomic check step. */
>>>> -	num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
>>>> -		drm_enc->crtc, DPU_HW_BLK_PINGPONG, hw_pp,
>>>> -		ARRAY_SIZE(hw_pp));
>>>> +	if (is_cwb_encoder) {
>>>> +		num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
>>>> +						       drm_enc->crtc,
>>>> +						       DPU_HW_BLK_DCWB_PINGPONG,
>>>> +						       hw_pp, ARRAY_SIZE(hw_pp));
>>>> +		num_cwb = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
>>>> +						       drm_enc->crtc,
>>>> +						       DPU_HW_BLK_CWB,
>>>> +						       hw_cwb, ARRAY_SIZE(hw_cwb));
>>>> +	} else {
>>>> +		num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
>>>> +						       drm_enc->crtc,
>>>> +						       DPU_HW_BLK_PINGPONG, hw_pp,
>>>> +						       ARRAY_SIZE(hw_pp));
>>>> +	}
>>>> +
>>>> +	for (i = 0; i < num_cwb; i++)
>>>> +		dpu_enc->hw_cwb[i] = to_dpu_hw_cwb(hw_cwb[i]);
>>>> +
>>>>    	num_ctl = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
>>>>    			drm_enc->crtc, DPU_HW_BLK_CTL, hw_ctl, ARRAY_SIZE(hw_ctl));
>>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
>>>> index ba7bb05efe9b8cac01a908e53121117e130f91ec..8d820cd1b5545d247515763039b341184e814e32 100644
>>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
>>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
>>>> @@ -77,12 +77,14 @@ enum dpu_hw_blk_type {
>>>>    	DPU_HW_BLK_LM,
>>>>    	DPU_HW_BLK_CTL,
>>>>    	DPU_HW_BLK_PINGPONG,
>>>> +	DPU_HW_BLK_DCWB_PINGPONG,
>>>>    	DPU_HW_BLK_INTF,
>>>>    	DPU_HW_BLK_WB,
>>>>    	DPU_HW_BLK_DSPP,
>>>>    	DPU_HW_BLK_MERGE_3D,
>>>>    	DPU_HW_BLK_DSC,
>>>>    	DPU_HW_BLK_CDM,
>>>> +	DPU_HW_BLK_CWB,
>>>>    	DPU_HW_BLK_MAX,
>>>>    };
>>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
>>>> index 48d756d8f8c6e4ab94b72bac0418320f7dc8cda8..1fc8abda927fc094b369e0d1efc795b71d6a7fcb 100644
>>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
>>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
>>>> @@ -128,6 +128,7 @@ struct dpu_global_state {
>>>>    	uint32_t dspp_to_crtc_id[DSPP_MAX - DSPP_0];
>>>>    	uint32_t dsc_to_crtc_id[DSC_MAX - DSC_0];
>>>>    	uint32_t cdm_to_crtc_id;
>>>> +	uint32_t cwb_to_crtc_id[CWB_MAX - CWB_0];
>>>>    };
>>>>    struct dpu_global_state
>>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>>>> index 85adaf256b2c705d2d7df378b6ffc0e578f52bc3..ead24bb0ceb5d8ec4705f0d32330294d0b45b216 100644
>>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
>>>> @@ -234,6 +234,55 @@ static int _dpu_rm_get_lm_peer(struct dpu_rm *rm, int primary_idx)
>>>>    	return -EINVAL;
>>>>    }
>>>> +static int _dpu_rm_reserve_cwb_mux_and_pingpongs(struct dpu_rm *rm,
>>>> +						 struct dpu_global_state *global_state,
>>>> +						 uint32_t crtc_id,
>>>> +						 struct msm_display_topology *topology)
>>>> +{
>>>> +	int num_cwb_pp = topology->num_lm, cwb_pp_count = 0;
>>>> +	int cwb_pp_start_idx = PINGPONG_CWB_0 - PINGPONG_0;
>>>> +	int cwb_pp_idx[MAX_BLOCKS];
>>>> +	int cwb_mux_idx[MAX_BLOCKS];
>>>> +
>>>> +	/*
>>>> +	 * Reserve additional dedicated CWB PINGPONG blocks and muxes for each
>>>> +	 * mixer
>>>> +	 *
>>>> +	 * TODO: add support reserving resources for platforms with no
>>>> +	 *       PINGPONG_CWB
>>>
>>> What about doing it other way around: allocate CWBs first as required
>>> (even/odd, proper count, etc). Then for each of CWBs allocate a PP block
>>> (I think it's enough to simply make CWB blocks have a corresponding PP
>>> index as a property). This way the driver can handle both legacy and
>>> current platforms.
>>
>> Hi Dmitry,
>>
>> Sorry if I'm misunderstanding your suggestion, but the main change needed to
>> support platforms with no dedicated PINGPONG_CWB is where in the
>> rm->pingpong_blks list to start assigning pingpong blocks for the CWB mux.
>> I'm not sure how changing the order in which CWBs and the pingpong blocks
>> are assigned will address that.
>>
>> (FWIW, the only change necessary to add support for non-dedicated
>> PINGPONG_CWBs platforms for this function should just be changing the
>> initialization value of cwb_pp_start_idx)
> 
> If I remember correctly, we have identified several generations of DPU
> wrt. CWB handling:
> - 8.1+ (or 8.0+?), DCWB, dedicated PP blocks
> - 7.2, dedicated PP_1?
> - 5.0+, shared PP blocks
> - older DPUs, special handling of PP
> 
> If the driver allocates PP first and then first it has to allocated PP
> (in a platform-specific way) and then go from PINGPONG to CWB (in a
> platform-specific way). If CWB is allocated first, then you have only
> one platform-specific piece of code that gets PINGPONG for the CWB (and
> as this function is called after the CWB allocation, the major part of
> the CWB / PP allocation is generic).

The issue with breaking this into separate helpers/functions is that the 
CWB mux and PPB indices are dependent on each other. But I agree that we 
can reserve CWB mux and the PPBs in 2 separate loops within this helper 
to minimize the special platform-specific handling.

Also wanted to note that the comment doc on the PPB odd/even rule is 
inaccurate -- technically the odd/even rule applies specifically to the 
CWB mux as odd/even LMs are hardwired to their respective CWB muxes. 
Will correct the comment doc to be more accurate.

Thanks,

Jessica Zhang

> 
>>
>> Thanks,
>>
>> Jessica Zhang
>>
>>>
>>>> +	 */
>>>> +	for (int i = 0; i < ARRAY_SIZE(rm->mixer_blks) &&
>>>> +	     cwb_pp_count < num_cwb_pp; i++) {
>>>> +		for (int j = cwb_pp_start_idx;
>>>> +		     j < ARRAY_SIZE(rm->pingpong_blks); j++) {
>>>> +			/*
>>>> +			 * Odd LMs must be assigned to odd PINGPONGs and even
>>>> +			 * LMs with even PINGPONGs
>>>> +			 */
>>>> +			if (reserved_by_other(global_state->pingpong_to_crtc_id, j, crtc_id) ||
>>>> +			    i % 2 != j % 2)
>>>> +				continue;
>>>> +
>>>> +			cwb_pp_idx[cwb_pp_count] = j;
>>>> +			cwb_mux_idx[cwb_pp_count] = j - cwb_pp_start_idx;
>>>> +			cwb_pp_count++;
>>>> +			break;
>>>> +		}
>>>> +	}
>>>> +
>>>> +	if (cwb_pp_count != num_cwb_pp) {
>>>> +		DPU_ERROR("Unable to reserve all CWB PINGPONGs\n");
>>>> +		return -ENAVAIL;
>>>> +	}
>>>> +
>>>> +	for (int i = 0; i < cwb_pp_count; i++) {
>>>> +		global_state->pingpong_to_crtc_id[cwb_pp_idx[i]] = crtc_id;
>>>> +		global_state->cwb_to_crtc_id[cwb_mux_idx[i]] = crtc_id;
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>>    /**
>>>>     * _dpu_rm_check_lm_and_get_connected_blks - check if proposed layer mixer meets
>>>>     *	proposed use case requirements, incl. hardwired dependent blocks like
>>>> @@ -614,6 +663,12 @@ static int _dpu_rm_make_reservation(
>>>>    		return ret;
>>>>    	}
>>>> +	if (topology->cwb_enabled) {
>>>> +		ret = _dpu_rm_reserve_cwb_mux_and_pingpongs(rm, global_state,
>>>> +							    crtc_id, topology);
>>>> +		if (ret)
>>>> +			return ret;
>>>> +	}
>>>>    	ret = _dpu_rm_reserve_ctls(rm, global_state, crtc_id,
>>>>    			topology);
>>>> @@ -671,6 +726,8 @@ void dpu_rm_release(struct dpu_global_state *global_state,
>>>>    	_dpu_rm_clear_mapping(global_state->dspp_to_crtc_id,
>>>>    			ARRAY_SIZE(global_state->dspp_to_crtc_id), crtc_id);
>>>>    	_dpu_rm_clear_mapping(&global_state->cdm_to_crtc_id, 1, crtc_id);
>>>> +	_dpu_rm_clear_mapping(global_state->cwb_to_crtc_id,
>>>> +			ARRAY_SIZE(global_state->cwb_to_crtc_id), crtc_id);
>>>>    }
>>>>    /**
>>>> @@ -733,6 +790,7 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
>>>>    	switch (type) {
>>>>    	case DPU_HW_BLK_PINGPONG:
>>>> +	case DPU_HW_BLK_DCWB_PINGPONG:
>>>>    		hw_blks = rm->pingpong_blks;
>>>>    		hw_to_crtc_id = global_state->pingpong_to_crtc_id;
>>>>    		max_blks = ARRAY_SIZE(rm->pingpong_blks);
>>>> @@ -762,6 +820,11 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
>>>>    		hw_to_crtc_id = &global_state->cdm_to_crtc_id;
>>>>    		max_blks = 1;
>>>>    		break;
>>>> +	case DPU_HW_BLK_CWB:
>>>> +		hw_blks = rm->cwb_blks;
>>>> +		hw_to_crtc_id = global_state->cwb_to_crtc_id;
>>>> +		max_blks = ARRAY_SIZE(rm->cwb_blks);
>>>> +		break;
>>>>    	default:
>>>>    		DPU_ERROR("blk type %d not managed by rm\n", type);
>>>>    		return 0;
>>>> @@ -772,6 +835,20 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
>>>>    		if (hw_to_crtc_id[i] != crtc_id)
>>>>    			continue;
>>>> +		if (type == DPU_HW_BLK_PINGPONG) {
>>>> +			struct dpu_hw_pingpong *pp = to_dpu_hw_pingpong(hw_blks[i]);
>>>> +
>>>> +			if (pp->idx >= PINGPONG_CWB_0)
>>>> +				continue;
>>>> +		}
>>>> +
>>>> +		if (type == DPU_HW_BLK_DCWB_PINGPONG) {
>>>> +			struct dpu_hw_pingpong *pp = to_dpu_hw_pingpong(hw_blks[i]);
>>>> +
>>>> +			if (pp->idx < PINGPONG_CWB_0)
>>>> +				continue;
>>>> +		}
>>>> +
>>>>    		if (num_blks == blks_size) {
>>>>    			DPU_ERROR("More than %d resources assigned to crtc %d\n",
>>>>    				  blks_size, crtc_id);
>>>> @@ -847,4 +924,10 @@ void dpu_rm_print_state(struct drm_printer *p,
>>>>    	dpu_rm_print_state_helper(p, rm->cdm_blk,
>>>>    				  global_state->cdm_to_crtc_id);
>>>>    	drm_puts(p, "\n");
>>>> +
>>>> +	drm_puts(p, "\tcwb=");
>>>> +	for (i = 0; i < ARRAY_SIZE(global_state->cwb_to_crtc_id); i++)
>>>> +		dpu_rm_print_state_helper(p, rm->cwb_blks[i],
>>>> +					  global_state->cwb_to_crtc_id[i]);
>>>> +	drm_puts(p, "\n");
>>>>    }
>>>>
>>>> -- 
>>>> 2.34.1
>>>>
>>>
>>> -- 
>>> With best wishes
>>> Dmitry
>>
> 
> -- 
> With best wishes
> Dmitry
Dmitry Baryshkov Dec. 29, 2024, 4:47 a.m. UTC | #11
On Thu, Dec 26, 2024 at 02:49:28PM -0800, Jessica Zhang wrote:
> 
> 
> On 12/20/2024 5:07 PM, Dmitry Baryshkov wrote:
> > On Fri, Dec 20, 2024 at 04:12:29PM -0800, Jessica Zhang wrote:
> > > 
> > > 
> > > On 12/19/2024 9:52 PM, Dmitry Baryshkov wrote:
> > > > On Mon, Dec 16, 2024 at 04:43:29PM -0800, Jessica Zhang wrote:
> > > > > Add support for RM to reserve dedicated CWB PINGPONGs and CWB muxes
> > > > > 
> > > > > For concurrent writeback, even-indexed CWB muxes must be assigned to
> > > > > even-indexed LMs and odd-indexed CWB muxes for odd-indexed LMs. The same
> > > > > even/odd rule applies for dedicated CWB PINGPONGs.
> > > > > 
> > > > > Track the CWB muxes in the global state and add a CWB-specific helper to
> > > > > reserve the correct CWB muxes and dedicated PINGPONGs following the
> > > > > even/odd rule.
> > > > > 
> > > > > Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
> > > > > ---
> > > > >    drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 34 ++++++++++--
> > > > >    drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h |  2 +
> > > > >    drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h     |  1 +
> > > > >    drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c      | 83 +++++++++++++++++++++++++++++
> > > > >    4 files changed, 116 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> > > > > index a895d48fe81ccc71d265e089992786e8b6268b1b..a95dc1f0c6a422485c7ba98743e944e1a4f43539 100644
> > > > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> > > > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> > > > > @@ -2,7 +2,7 @@
> > > > >    /*
> > > > >     * Copyright (C) 2013 Red Hat
> > > > >     * Copyright (c) 2014-2018, 2020-2021 The Linux Foundation. All rights reserved.
> > > > > - * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> > > > > + * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved.
> > > > >     *
> > > > >     * Author: Rob Clark <robdclark@gmail.com>
> > > > >     */
> > > > > @@ -28,6 +28,7 @@
> > > > >    #include "dpu_hw_dsc.h"
> > > > >    #include "dpu_hw_merge3d.h"
> > > > >    #include "dpu_hw_cdm.h"
> > > > > +#include "dpu_hw_cwb.h"
> > > > >    #include "dpu_formats.h"
> > > > >    #include "dpu_encoder_phys.h"
> > > > >    #include "dpu_crtc.h"
> > > > > @@ -133,6 +134,9 @@ enum dpu_enc_rc_states {
> > > > >     * @cur_slave:		As above but for the slave encoder.
> > > > >     * @hw_pp:		Handle to the pingpong blocks used for the display. No.
> > > > >     *			pingpong blocks can be different than num_phys_encs.
> > > > > + * @hw_cwb:		Handle to the CWB muxes used for concurrent writeback
> > > > > + *			display. Number of CWB muxes can be different than
> > > > > + *			num_phys_encs.
> > > > >     * @hw_dsc:		Handle to the DSC blocks used for the display.
> > > > >     * @dsc_mask:		Bitmask of used DSC blocks.
> > > > >     * @intfs_swapped:	Whether or not the phys_enc interfaces have been swapped
> > > > > @@ -177,6 +181,7 @@ struct dpu_encoder_virt {
> > > > >    	struct dpu_encoder_phys *cur_master;
> > > > >    	struct dpu_encoder_phys *cur_slave;
> > > > >    	struct dpu_hw_pingpong *hw_pp[MAX_CHANNELS_PER_ENC];
> > > > > +	struct dpu_hw_cwb *hw_cwb[MAX_CHANNELS_PER_ENC];
> > > > >    	struct dpu_hw_dsc *hw_dsc[MAX_CHANNELS_PER_ENC];
> > > > >    	unsigned int dsc_mask;
> > > > > @@ -1138,7 +1143,10 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
> > > > >    	struct dpu_hw_blk *hw_pp[MAX_CHANNELS_PER_ENC];
> > > > >    	struct dpu_hw_blk *hw_ctl[MAX_CHANNELS_PER_ENC];
> > > > >    	struct dpu_hw_blk *hw_dsc[MAX_CHANNELS_PER_ENC];
> > > > > +	struct dpu_hw_blk *hw_cwb[MAX_CHANNELS_PER_ENC];
> > > > >    	int num_pp, num_dsc, num_ctl;
> > > > > +	int num_cwb = 0;
> > > > > +	bool is_cwb_encoder;
> > > > >    	unsigned int dsc_mask = 0;
> > > > >    	int i;
> > > > > @@ -1152,6 +1160,8 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
> > > > >    	priv = drm_enc->dev->dev_private;
> > > > >    	dpu_kms = to_dpu_kms(priv->kms);
> > > > > +	is_cwb_encoder = drm_crtc_in_clone_mode(crtc_state) &&
> > > > > +			dpu_enc->disp_info.intf_type == INTF_WB;
> > > > >    	global_state = dpu_kms_get_existing_global_state(dpu_kms);
> > > > >    	if (IS_ERR_OR_NULL(global_state)) {
> > > > > @@ -1162,9 +1172,25 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
> > > > >    	trace_dpu_enc_mode_set(DRMID(drm_enc));
> > > > >    	/* Query resource that have been reserved in atomic check step. */
> > > > > -	num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> > > > > -		drm_enc->crtc, DPU_HW_BLK_PINGPONG, hw_pp,
> > > > > -		ARRAY_SIZE(hw_pp));
> > > > > +	if (is_cwb_encoder) {
> > > > > +		num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> > > > > +						       drm_enc->crtc,
> > > > > +						       DPU_HW_BLK_DCWB_PINGPONG,
> > > > > +						       hw_pp, ARRAY_SIZE(hw_pp));
> > > > > +		num_cwb = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> > > > > +						       drm_enc->crtc,
> > > > > +						       DPU_HW_BLK_CWB,
> > > > > +						       hw_cwb, ARRAY_SIZE(hw_cwb));
> > > > > +	} else {
> > > > > +		num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> > > > > +						       drm_enc->crtc,
> > > > > +						       DPU_HW_BLK_PINGPONG, hw_pp,
> > > > > +						       ARRAY_SIZE(hw_pp));
> > > > > +	}
> > > > > +
> > > > > +	for (i = 0; i < num_cwb; i++)
> > > > > +		dpu_enc->hw_cwb[i] = to_dpu_hw_cwb(hw_cwb[i]);
> > > > > +
> > > > >    	num_ctl = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state,
> > > > >    			drm_enc->crtc, DPU_HW_BLK_CTL, hw_ctl, ARRAY_SIZE(hw_ctl));
> > > > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
> > > > > index ba7bb05efe9b8cac01a908e53121117e130f91ec..8d820cd1b5545d247515763039b341184e814e32 100644
> > > > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
> > > > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
> > > > > @@ -77,12 +77,14 @@ enum dpu_hw_blk_type {
> > > > >    	DPU_HW_BLK_LM,
> > > > >    	DPU_HW_BLK_CTL,
> > > > >    	DPU_HW_BLK_PINGPONG,
> > > > > +	DPU_HW_BLK_DCWB_PINGPONG,
> > > > >    	DPU_HW_BLK_INTF,
> > > > >    	DPU_HW_BLK_WB,
> > > > >    	DPU_HW_BLK_DSPP,
> > > > >    	DPU_HW_BLK_MERGE_3D,
> > > > >    	DPU_HW_BLK_DSC,
> > > > >    	DPU_HW_BLK_CDM,
> > > > > +	DPU_HW_BLK_CWB,
> > > > >    	DPU_HW_BLK_MAX,
> > > > >    };
> > > > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > > > > index 48d756d8f8c6e4ab94b72bac0418320f7dc8cda8..1fc8abda927fc094b369e0d1efc795b71d6a7fcb 100644
> > > > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > > > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > > > > @@ -128,6 +128,7 @@ struct dpu_global_state {
> > > > >    	uint32_t dspp_to_crtc_id[DSPP_MAX - DSPP_0];
> > > > >    	uint32_t dsc_to_crtc_id[DSC_MAX - DSC_0];
> > > > >    	uint32_t cdm_to_crtc_id;
> > > > > +	uint32_t cwb_to_crtc_id[CWB_MAX - CWB_0];
> > > > >    };
> > > > >    struct dpu_global_state
> > > > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> > > > > index 85adaf256b2c705d2d7df378b6ffc0e578f52bc3..ead24bb0ceb5d8ec4705f0d32330294d0b45b216 100644
> > > > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> > > > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> > > > > @@ -234,6 +234,55 @@ static int _dpu_rm_get_lm_peer(struct dpu_rm *rm, int primary_idx)
> > > > >    	return -EINVAL;
> > > > >    }
> > > > > +static int _dpu_rm_reserve_cwb_mux_and_pingpongs(struct dpu_rm *rm,
> > > > > +						 struct dpu_global_state *global_state,
> > > > > +						 uint32_t crtc_id,
> > > > > +						 struct msm_display_topology *topology)
> > > > > +{
> > > > > +	int num_cwb_pp = topology->num_lm, cwb_pp_count = 0;
> > > > > +	int cwb_pp_start_idx = PINGPONG_CWB_0 - PINGPONG_0;
> > > > > +	int cwb_pp_idx[MAX_BLOCKS];
> > > > > +	int cwb_mux_idx[MAX_BLOCKS];
> > > > > +
> > > > > +	/*
> > > > > +	 * Reserve additional dedicated CWB PINGPONG blocks and muxes for each
> > > > > +	 * mixer
> > > > > +	 *
> > > > > +	 * TODO: add support reserving resources for platforms with no
> > > > > +	 *       PINGPONG_CWB
> > > > 
> > > > What about doing it other way around: allocate CWBs first as required
> > > > (even/odd, proper count, etc). Then for each of CWBs allocate a PP block
> > > > (I think it's enough to simply make CWB blocks have a corresponding PP
> > > > index as a property). This way the driver can handle both legacy and
> > > > current platforms.
> > > 
> > > Hi Dmitry,
> > > 
> > > Sorry if I'm misunderstanding your suggestion, but the main change needed to
> > > support platforms with no dedicated PINGPONG_CWB is where in the
> > > rm->pingpong_blks list to start assigning pingpong blocks for the CWB mux.
> > > I'm not sure how changing the order in which CWBs and the pingpong blocks
> > > are assigned will address that.
> > > 
> > > (FWIW, the only change necessary to add support for non-dedicated
> > > PINGPONG_CWBs platforms for this function should just be changing the
> > > initialization value of cwb_pp_start_idx)
> > 
> > If I remember correctly, we have identified several generations of DPU
> > wrt. CWB handling:
> > - 8.1+ (or 8.0+?), DCWB, dedicated PP blocks
> > - 7.2, dedicated PP_1?
> > - 5.0+, shared PP blocks
> > - older DPUs, special handling of PP
> > 
> > If the driver allocates PP first and then first it has to allocated PP
> > (in a platform-specific way) and then go from PINGPONG to CWB (in a
> > platform-specific way). If CWB is allocated first, then you have only
> > one platform-specific piece of code that gets PINGPONG for the CWB (and
> > as this function is called after the CWB allocation, the major part of
> > the CWB / PP allocation is generic).
> 
> The issue with breaking this into separate helpers/functions is that the CWB
> mux and PPB indices are dependent on each other. But I agree that we can
> reserve CWB mux and the PPBs in 2 separate loops within this helper to
> minimize the special platform-specific handling.

Doesn't it just PPB depend on CWB?


> 
> Also wanted to note that the comment doc on the PPB odd/even rule is
> inaccurate -- technically the odd/even rule applies specifically to the CWB
> mux as odd/even LMs are hardwired to their respective CWB muxes. Will
> correct the comment doc to be more accurate.

Yes, please fix that.

> 
> Thanks,
> 
> Jessica Zhang
>