Message ID | 20241216-concurrent-wb-v4-0-fe220297a7f0@quicinc.com |
---|---|
Series | drm/msm/dpu: Add Concurrent Writeback Support for DPU 10.x+ |
On Mon, Dec 16, 2024 at 04:43:11PM -0800, Jessica Zhang wrote:
> DPU supports a single writeback session running concurrently with primary
> display when the CWB mux is configured properly. This series enables
> clone mode for DPU driver and adds support for programming the CWB mux
> in cases where the hardware has dedicated CWB pingpong blocks. Currently,
> the CWB hardware blocks have only been added to the SM8650
> hardware catalog and only DSI has been exposed as a possible_clone of WB.
>
> These changes are split into two parts:
>
> The first part of the series will pull in Dmitry's patches to refactor
> the DPU resource manager to be based off of CRTC instead of encoder.
> This includes some changes (noted in the relevant commits) by me and
> Abhinav to fix some issues with getting the global state and refactoring
> the CDM allocation to work with Dmitry's changes.

To provide a sensible baseline for both CWB and Quad-Pipe changes I'm
going to pull patches 5-14 (those which refactor the resource allocation
and also those adding support for the CWB hardware block). The core DRM
patches should probably go in through drm-misc-next.
>
> The second part of the series will add support for CWB by doing the
> following:
>
> 1) Add a DRM helper to detect if the current CRTC state is in clone mode
>    and add an "in_clone_mode" entry to the atomic state print
> 2) Add the CWB mux to the hardware catalog and clarify the pingpong
>    block index enum to specify which pingpong blocks are dedicated to
>    CWB only and which ones are general use pingpong blocks
> 3) Add CWB as part of the devcoredump
> 4) Add support for configuring the CWB mux via dpu_hw_cwb ops
> 5) Add pending flush support for CWB
> 6) Add support for validating clone mode in the DPU CRTC and setting up
>    CWB within the encoder
> 7) Adjust the encoder trigger flush, trigger start, and kickoff order to
>    accommodate clone mode
> 8) Adjust when the frame done timer is started for clone mode
> 9) Define the possible clones for DPU encoders so that
>
> The feature was tested on SM8650 using IGT's kms_writeback test with the
> following change [1] and dumping the writeback framebuffer when in clone
> mode. I haven't gotten the chance to test it on DP yet, but I've
> validated both single and dual LM on DSI.
>
> To test CWB with IGT, you'll need to apply this series [1] and this
> driver patch [2].
Run the following command to dump the writeback buffer: > > IGT_FRAME_DUMP_PATH=<dump path> FRAME_PNG_FILE_NAME=<file name> \ > ./build/tests/kms_writeback -d [--run-subtest dump-valid-clones] \ > > You can also do CRC validation by running this command: > > ./build/tests/kms_writeback [--run-subtest dump-valid-clones] > > [1] https://patchwork.freedesktop.org/series/137933/ > [2] https://patchwork.freedesktop.org/series/138284/ > > --- > Changes in v4: > - Rebased onto latest msm-next > - Added kunit tests for framework changes > - Skip valid clone check for encoders that don't have any possible clones set > (this is to avoid failing kunit tests, specifically the HDMI state helper tests) > - Link to v3: https://lore.kernel.org/r/20241016-concurrent-wb-v3-0-a33cf9b93835@quicinc.com > > Changes in v3: > - Dropped support for CWB on DP connectors for now > - Dropped unnecessary PINGPONG array in *_setup_cwb() > - Add a check to make sure CWB and CDM aren't supported simultaneously > (Dmitry) > - Document cwb_enabled checks in dpu_crtc_get_topology() (Dmitry) > - Moved implementation of drm_crtc_in_clone_mode() to drm_crtc.c (Jani) > - Dropped duplicate error message for reserving CWB resources (Dmitry) > - Added notes in framework changes about posting a separate series to > add proper KUnit tests (Maxime) > - Added commit message note addressing Sima's comment on handling > mode_changed (Dmitry) > - Formatting fixes (Dmitry) > - Added proper kerneldocs (Dmitry) > - Renamed dpu_encoder_helper_get_cwb() -> *_get_cwb_mask() (Dmitry) > - Capitalize all instances of "pingpong" in comments (Dmitry) > - Link to v2: https://lore.kernel.org/r/20240924-concurrent-wb-v2-0-7849f900e863@quicinc.com > > Changes in v2: > - Moved CWB hardware programming to its own dpu_hw_cwb abstraction > (Dmitry) > - Reserve and get assigned CWB muxes using RM API and KMS global state > (Dmitry) > - Dropped requirement to have only one CWB session at a time > - Moved valid clone mode check to DRM 
framework (Dmitry and Ville) > - Switch to default CWB tap point to LM as the DSPP > - Dropped printing clone mode status in atomic state (Dmitry) > - Call dpu_vbif_clear_errors() before dpu_encoder_kickoff() (Dmitry) > - Squashed setup_input_ctrl() and setup_input_mode() into a single > dpu_hw_cwb op (Dmitry) > - Moved function comment docs to correct place and fixed wording of > comments/commit messages (Dmitry) > - Grabbed old CRTC state using proper drm_atomic_state API in > dpu_crtc_atomic_check() (Dmitry) > - Split HW catalog changes of adding the CWB mux block and changing the > dedicated CWB pingpong indices into 2 separate commits (Dmitry) > - Moved clearing the dpu_crtc_state.num_mixers to "drm/msm/dpu: fill > CRTC resources in dpu_crtc.c" (Dmitry) > - Fixed alignment and other formatting issues (Dmitry) > - Link to v1: https://lore.kernel.org/r/20240829-concurrent-wb-v1-0-502b16ae2ebb@quicinc.com > > --- > Dmitry Baryshkov (4): > drm/msm/dpu: get rid of struct dpu_rm_requirements > drm/msm/dpu: switch RM to use crtc_id rather than enc_id for allocation > drm/msm/dpu: move resource allocation to CRTC > drm/msm/dpu: fill CRTC resources in dpu_crtc.c > > Esha Bharadwaj (3): > drm/msm/dpu: Add CWB entry to catalog for SM8650 > drm/msm/dpu: add devcoredumps for cwb registers > drm/msm/dpu: add CWB support to dpu_hw_wb > > Jessica Zhang (18): > drm: add clone mode check for CRTC > drm/tests: Add test for drm_crtc_in_clone_mode() > drm: Add valid clones check > drm/tests: Add test for drm_atomic_helper_check_modeset() > drm/msm/dpu: Specify dedicated CWB pingpong blocks > drm/msm/dpu: Add dpu_hw_cwb abstraction for CWB block > drm/msm/dpu: Add RM support for allocating CWB > drm/msm/dpu: Add CWB to msm_display_topology > drm/msm/dpu: Require modeset if clone mode status changes > drm/msm/dpu: Fail atomic_check if CWB and CDM are enabled > drm/msm/dpu: Reserve resources for CWB > drm/msm/dpu: Configure CWB in writeback encoder > drm/msm/dpu: Support CWB in 
dpu_hw_ctl > drm/msm/dpu: Adjust writeback phys encoder setup for CWB > drm/msm/dpu: Start frame done timer after encoder kickoff > drm/msm/dpu: Skip trigger flush and start for CWB > drm/msm/dpu: Reorder encoder kickoff for CWB > drm/msm/dpu: Set possible clones for all encoders > > drivers/gpu/drm/drm_atomic_helper.c | 28 ++ > drivers/gpu/drm/drm_crtc.c | 20 + > drivers/gpu/drm/msm/Makefile | 1 + > .../drm/msm/disp/dpu1/catalog/dpu_10_0_sm8650.h | 29 +- > .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h | 4 +- > .../drm/msm/disp/dpu1/catalog/dpu_8_4_sa8775p.h | 4 +- > .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h | 4 +- > .../drm/msm/disp/dpu1/catalog/dpu_9_2_x1e80100.h | 4 +- > drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 208 ++++++++- > drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 463 ++++++++++++--------- > drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h | 14 +- > drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h | 7 +- > .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 16 +- > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 13 + > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 30 +- > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h | 15 +- > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.c | 73 ++++ > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_cwb.h | 70 ++++ > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h | 15 +- > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_wb.c | 4 +- > drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 12 +- > drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h | 13 +- > drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 361 +++++++++------- > drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h | 13 +- > drivers/gpu/drm/tests/drm_atomic_state_test.c | 133 +++++- > include/drm/drm_crtc.h | 2 +- > 26 files changed, 1172 insertions(+), 384 deletions(-) > --- > base-commit: 86313a9cd152330c634b25d826a281c6a002eb77 > change-id: 20240618-concurrent-wb-97d62387f952 > prerequisite-change-id: 20241209-abhinavk-modeset-fix-74864f1de08d:v3 > prerequisite-patch-id: a197a0cd4647cb189ea20a96583ea78d0c98b638 > 
prerequisite-patch-id: 112c8f1795cbed989beb02b72561854c0ccd59dd > > Best regards, > -- > Jessica Zhang <quic_jesszhan@quicinc.com> >
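For readers following item 1 of the cover letter, here is a minimal userspace sketch of what a clone-mode check could look like. It assumes that "clone mode" means more than one encoder is attached to the CRTC and that the attached encoders are tracked as a bitmask. The struct `crtc_state_sketch`, the helper `popcount32()` and the function `crtc_in_clone_mode_sketch()` are hypothetical stand-ins, not the helper added by the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the encoder bitmask kept in a CRTC state. */
struct crtc_state_sketch {
	uint32_t encoder_mask; /* bit n set => encoder n feeds this CRTC */
};

/* Count set bits; a portable substitute for a hardware popcount helper. */
static unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Assumption of this sketch: a CRTC counts as being in clone mode when
 * more than one encoder is attached to it.
 */
static bool crtc_in_clone_mode_sketch(const struct crtc_state_sketch *state)
{
	assert(state);
	return popcount32(state->encoder_mask) > 1;
}
```

With this definition, a CRTC driving only a DSI encoder is not in clone mode, while a CRTC driving DSI and WB together is.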
On Mon, Dec 16, 2024 at 04:43:29PM -0800, Jessica Zhang wrote: > Add support for RM to reserve dedicated CWB PINGPONGs and CWB muxes > > For concurrent writeback, even-indexed CWB muxes must be assigned to > even-indexed LMs and odd-indexed CWB muxes for odd-indexed LMs. The same > even/odd rule applies for dedicated CWB PINGPONGs. > > Track the CWB muxes in the global state and add a CWB-specific helper to > reserve the correct CWB muxes and dedicated PINGPONGs following the > even/odd rule. > > Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com> > --- > drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 34 ++++++++++-- > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h | 2 + > drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h | 1 + > drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 83 +++++++++++++++++++++++++++++ > 4 files changed, 116 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c > index a895d48fe81ccc71d265e089992786e8b6268b1b..a95dc1f0c6a422485c7ba98743e944e1a4f43539 100644 > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c > @@ -2,7 +2,7 @@ > /* > * Copyright (C) 2013 Red Hat > * Copyright (c) 2014-2018, 2020-2021 The Linux Foundation. All rights reserved. > - * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved. > + * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved. > * > * Author: Rob Clark <robdclark@gmail.com> > */ > @@ -28,6 +28,7 @@ > #include "dpu_hw_dsc.h" > #include "dpu_hw_merge3d.h" > #include "dpu_hw_cdm.h" > +#include "dpu_hw_cwb.h" > #include "dpu_formats.h" > #include "dpu_encoder_phys.h" > #include "dpu_crtc.h" > @@ -133,6 +134,9 @@ enum dpu_enc_rc_states { > * @cur_slave: As above but for the slave encoder. > * @hw_pp: Handle to the pingpong blocks used for the display. No. > * pingpong blocks can be different than num_phys_encs. 
> + * @hw_cwb: Handle to the CWB muxes used for concurrent writeback > + * display. Number of CWB muxes can be different than > + * num_phys_encs. > * @hw_dsc: Handle to the DSC blocks used for the display. > * @dsc_mask: Bitmask of used DSC blocks. > * @intfs_swapped: Whether or not the phys_enc interfaces have been swapped > @@ -177,6 +181,7 @@ struct dpu_encoder_virt { > struct dpu_encoder_phys *cur_master; > struct dpu_encoder_phys *cur_slave; > struct dpu_hw_pingpong *hw_pp[MAX_CHANNELS_PER_ENC]; > + struct dpu_hw_cwb *hw_cwb[MAX_CHANNELS_PER_ENC]; > struct dpu_hw_dsc *hw_dsc[MAX_CHANNELS_PER_ENC]; > > unsigned int dsc_mask; > @@ -1138,7 +1143,10 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc, > struct dpu_hw_blk *hw_pp[MAX_CHANNELS_PER_ENC]; > struct dpu_hw_blk *hw_ctl[MAX_CHANNELS_PER_ENC]; > struct dpu_hw_blk *hw_dsc[MAX_CHANNELS_PER_ENC]; > + struct dpu_hw_blk *hw_cwb[MAX_CHANNELS_PER_ENC]; > int num_pp, num_dsc, num_ctl; > + int num_cwb = 0; > + bool is_cwb_encoder; > unsigned int dsc_mask = 0; > int i; > > @@ -1152,6 +1160,8 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc, > > priv = drm_enc->dev->dev_private; > dpu_kms = to_dpu_kms(priv->kms); > + is_cwb_encoder = drm_crtc_in_clone_mode(crtc_state) && > + dpu_enc->disp_info.intf_type == INTF_WB; > > global_state = dpu_kms_get_existing_global_state(dpu_kms); > if (IS_ERR_OR_NULL(global_state)) { > @@ -1162,9 +1172,25 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc, > trace_dpu_enc_mode_set(DRMID(drm_enc)); > > /* Query resource that have been reserved in atomic check step. 
*/ > - num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, > - drm_enc->crtc, DPU_HW_BLK_PINGPONG, hw_pp, > - ARRAY_SIZE(hw_pp)); > + if (is_cwb_encoder) { > + num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, > + drm_enc->crtc, > + DPU_HW_BLK_DCWB_PINGPONG, > + hw_pp, ARRAY_SIZE(hw_pp)); > + num_cwb = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, > + drm_enc->crtc, > + DPU_HW_BLK_CWB, > + hw_cwb, ARRAY_SIZE(hw_cwb)); > + } else { > + num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, > + drm_enc->crtc, > + DPU_HW_BLK_PINGPONG, hw_pp, > + ARRAY_SIZE(hw_pp)); > + } > + > + for (i = 0; i < num_cwb; i++) > + dpu_enc->hw_cwb[i] = to_dpu_hw_cwb(hw_cwb[i]); > + > num_ctl = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, > drm_enc->crtc, DPU_HW_BLK_CTL, hw_ctl, ARRAY_SIZE(hw_ctl)); > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h > index ba7bb05efe9b8cac01a908e53121117e130f91ec..8d820cd1b5545d247515763039b341184e814e32 100644 > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h > @@ -77,12 +77,14 @@ enum dpu_hw_blk_type { > DPU_HW_BLK_LM, > DPU_HW_BLK_CTL, > DPU_HW_BLK_PINGPONG, > + DPU_HW_BLK_DCWB_PINGPONG, > DPU_HW_BLK_INTF, > DPU_HW_BLK_WB, > DPU_HW_BLK_DSPP, > DPU_HW_BLK_MERGE_3D, > DPU_HW_BLK_DSC, > DPU_HW_BLK_CDM, > + DPU_HW_BLK_CWB, > DPU_HW_BLK_MAX, > }; > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h > index 48d756d8f8c6e4ab94b72bac0418320f7dc8cda8..1fc8abda927fc094b369e0d1efc795b71d6a7fcb 100644 > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h > @@ -128,6 +128,7 @@ struct dpu_global_state { > uint32_t dspp_to_crtc_id[DSPP_MAX - DSPP_0]; > uint32_t dsc_to_crtc_id[DSC_MAX - DSC_0]; > uint32_t cdm_to_crtc_id; > + uint32_t cwb_to_crtc_id[CWB_MAX - CWB_0]; > }; > > struct dpu_global_state > diff 
--git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c > index 85adaf256b2c705d2d7df378b6ffc0e578f52bc3..ead24bb0ceb5d8ec4705f0d32330294d0b45b216 100644 > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c > @@ -234,6 +234,55 @@ static int _dpu_rm_get_lm_peer(struct dpu_rm *rm, int primary_idx) > return -EINVAL; > } > > +static int _dpu_rm_reserve_cwb_mux_and_pingpongs(struct dpu_rm *rm, > + struct dpu_global_state *global_state, > + uint32_t crtc_id, > + struct msm_display_topology *topology) > +{ > + int num_cwb_pp = topology->num_lm, cwb_pp_count = 0; > + int cwb_pp_start_idx = PINGPONG_CWB_0 - PINGPONG_0; > + int cwb_pp_idx[MAX_BLOCKS]; > + int cwb_mux_idx[MAX_BLOCKS]; > + > + /* > + * Reserve additional dedicated CWB PINGPONG blocks and muxes for each > + * mixer > + * > + * TODO: add support reserving resources for platforms with no > + * PINGPONG_CWB What about doing it other way around: allocate CWBs first as required (even/odd, proper count, etc). Then for each of CWBs allocate a PP block (I think it's enough to simply make CWB blocks have a corresponding PP index as a property). This way the driver can handle both legacy and current platforms. 
> + */ > + for (int i = 0; i < ARRAY_SIZE(rm->mixer_blks) && > + cwb_pp_count < num_cwb_pp; i++) { > + for (int j = cwb_pp_start_idx; > + j < ARRAY_SIZE(rm->pingpong_blks); j++) { > + /* > + * Odd LMs must be assigned to odd PINGPONGs and even > + * LMs with even PINGPONGs > + */ > + if (reserved_by_other(global_state->pingpong_to_crtc_id, j, crtc_id) || > + i % 2 != j % 2) > + continue; > + > + cwb_pp_idx[cwb_pp_count] = j; > + cwb_mux_idx[cwb_pp_count] = j - cwb_pp_start_idx; > + cwb_pp_count++; > + break; > + } > + } > + > + if (cwb_pp_count != num_cwb_pp) { > + DPU_ERROR("Unable to reserve all CWB PINGPONGs\n"); > + return -ENAVAIL; > + } > + > + for (int i = 0; i < cwb_pp_count; i++) { > + global_state->pingpong_to_crtc_id[cwb_pp_idx[i]] = crtc_id; > + global_state->cwb_to_crtc_id[cwb_mux_idx[i]] = crtc_id; > + } > + > + return 0; > +} > + > /** > * _dpu_rm_check_lm_and_get_connected_blks - check if proposed layer mixer meets > * proposed use case requirements, incl. hardwired dependent blocks like > @@ -614,6 +663,12 @@ static int _dpu_rm_make_reservation( > return ret; > } > > + if (topology->cwb_enabled) { > + ret = _dpu_rm_reserve_cwb_mux_and_pingpongs(rm, global_state, > + crtc_id, topology); > + if (ret) > + return ret; > + } > > ret = _dpu_rm_reserve_ctls(rm, global_state, crtc_id, > topology); > @@ -671,6 +726,8 @@ void dpu_rm_release(struct dpu_global_state *global_state, > _dpu_rm_clear_mapping(global_state->dspp_to_crtc_id, > ARRAY_SIZE(global_state->dspp_to_crtc_id), crtc_id); > _dpu_rm_clear_mapping(&global_state->cdm_to_crtc_id, 1, crtc_id); > + _dpu_rm_clear_mapping(global_state->cwb_to_crtc_id, > + ARRAY_SIZE(global_state->cwb_to_crtc_id), crtc_id); > } > > /** > @@ -733,6 +790,7 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm, > > switch (type) { > case DPU_HW_BLK_PINGPONG: > + case DPU_HW_BLK_DCWB_PINGPONG: > hw_blks = rm->pingpong_blks; > hw_to_crtc_id = global_state->pingpong_to_crtc_id; > max_blks = ARRAY_SIZE(rm->pingpong_blks); > 
@@ -762,6 +820,11 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm, > hw_to_crtc_id = &global_state->cdm_to_crtc_id; > max_blks = 1; > break; > + case DPU_HW_BLK_CWB: > + hw_blks = rm->cwb_blks; > + hw_to_crtc_id = global_state->cwb_to_crtc_id; > + max_blks = ARRAY_SIZE(rm->cwb_blks); > + break; > default: > DPU_ERROR("blk type %d not managed by rm\n", type); > return 0; > @@ -772,6 +835,20 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm, > if (hw_to_crtc_id[i] != crtc_id) > continue; > > + if (type == DPU_HW_BLK_PINGPONG) { > + struct dpu_hw_pingpong *pp = to_dpu_hw_pingpong(hw_blks[i]); > + > + if (pp->idx >= PINGPONG_CWB_0) > + continue; > + } > + > + if (type == DPU_HW_BLK_DCWB_PINGPONG) { > + struct dpu_hw_pingpong *pp = to_dpu_hw_pingpong(hw_blks[i]); > + > + if (pp->idx < PINGPONG_CWB_0) > + continue; > + } > + > if (num_blks == blks_size) { > DPU_ERROR("More than %d resources assigned to crtc %d\n", > blks_size, crtc_id); > @@ -847,4 +924,10 @@ void dpu_rm_print_state(struct drm_printer *p, > dpu_rm_print_state_helper(p, rm->cdm_blk, > global_state->cdm_to_crtc_id); > drm_puts(p, "\n"); > + > + drm_puts(p, "\tcwb="); > + for (i = 0; i < ARRAY_SIZE(global_state->cwb_to_crtc_id); i++) > + dpu_rm_print_state_helper(p, rm->cwb_blks[i], > + global_state->cwb_to_crtc_id[i]); > + drm_puts(p, "\n"); > } > > -- > 2.34.1 >
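The even/odd reservation rule described in the commit message can be tried out in isolation with the userspace sketch below. It is only an illustration under stated assumptions: the block counts (`NUM_MIXERS`, `NUM_PINGPONGS`, `CWB_PP_START`), the struct `rm_state_sketch` and both function names are hypothetical, and the sketch additionally remembers slots already picked in the current pass in a scratch array, which is a detail of the sketch and not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_MIXERS    6  /* hypothetical number of layer mixers */
#define NUM_PINGPONGS 10 /* hypothetical total number of PINGPONG slots */
#define CWB_PP_START  6  /* hypothetical index of the first dedicated CWB PINGPONG */
#define NUM_CWB_MUX   (NUM_PINGPONGS - CWB_PP_START)

/* Hypothetical, reduced version of the per-block "owner CRTC" tables. */
struct rm_state_sketch {
	uint32_t pp_to_crtc[NUM_PINGPONGS];
	uint32_t cwb_to_crtc[NUM_CWB_MUX];
};

/* A slot is unavailable if it is owned by a different CRTC (0 means free). */
static int slot_reserved_by_other(const uint32_t *map, int idx, uint32_t crtc_id)
{
	return map[idx] && map[idx] != crtc_id;
}

/*
 * Pick one dedicated CWB PINGPONG (and the matching CWB mux) per mixer,
 * pairing even mixers with even slots and odd mixers with odd slots.
 * Nothing is written to the state unless all num_lm slots were found.
 */
static int reserve_cwb_sketch(struct rm_state_sketch *st, uint32_t crtc_id,
			      int num_lm)
{
	int pp_idx[NUM_CWB_MUX];
	uint8_t picked[NUM_PINGPONGS] = { 0 };
	int count = 0;

	assert(num_lm >= 0 && num_lm <= NUM_CWB_MUX);

	for (int i = 0; i < NUM_MIXERS && count < num_lm; i++) {
		for (int j = CWB_PP_START; j < NUM_PINGPONGS; j++) {
			if (picked[j] || i % 2 != j % 2 ||
			    slot_reserved_by_other(st->pp_to_crtc, j, crtc_id))
				continue;

			picked[j] = 1;
			pp_idx[count++] = j;
			break;
		}
	}

	if (count != num_lm)
		return -1; /* not enough matching slots */

	for (int i = 0; i < count; i++) {
		st->pp_to_crtc[pp_idx[i]] = crtc_id;
		st->cwb_to_crtc[pp_idx[i] - CWB_PP_START] = crtc_id;
	}

	return 0;
}
```

Committing the choices only after the search has succeeded keeps the state untouched on failure, which matches the two-phase shape of the helper in the patch.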
On Mon, Dec 16, 2024 at 04:43:30PM -0800, Jessica Zhang wrote: > Cache the CWB block mask in the DPU virtual encoder and configure CWB > according to the CWB block mask within the writeback phys encoder > > Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com> > --- > drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 75 +++++++++++++++++++++- > drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h | 7 +- > .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 4 +- > 3 files changed, 83 insertions(+), 3 deletions(-) > Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
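The diff for this patch is not quoted in the reply, so the following is only a guess at what "caching the CWB block mask" might involve: folding the indices of the assigned CWB blocks into a bitmask. All names (`cwb_id_sketch`, `hw_cwb_sketch`, `cwb_mask_sketch()`) and the assumption that block IDs start at 1 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block IDs; the sketch assumes they start at 1. */
enum cwb_id_sketch { CWB_ID_0 = 1, CWB_ID_1, CWB_ID_2, CWB_ID_3, CWB_ID_MAX };

/* Hypothetical handle for one CWB mux. */
struct hw_cwb_sketch {
	enum cwb_id_sketch idx;
};

/*
 * Build a bitmask with one bit per assigned CWB mux, where bit 0
 * corresponds to CWB_ID_0. Unused (NULL) slots are skipped.
 */
static uint32_t cwb_mask_sketch(struct hw_cwb_sketch *const *blks, size_t n)
{
	uint32_t mask = 0;

	for (size_t i = 0; i < n; i++) {
		if (!blks[i])
			continue;

		assert(blks[i]->idx >= CWB_ID_0 && blks[i]->idx < CWB_ID_MAX);
		mask |= UINT32_C(1) << (blks[i]->idx - CWB_ID_0);
	}

	return mask;
}
```

A mask like this lets later stages iterate over the active muxes without holding on to the handle array itself.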
On Fri, Dec 20, 2024 at 04:12:29PM -0800, Jessica Zhang wrote: > > > On 12/19/2024 9:52 PM, Dmitry Baryshkov wrote: > > On Mon, Dec 16, 2024 at 04:43:29PM -0800, Jessica Zhang wrote: > > > Add support for RM to reserve dedicated CWB PINGPONGs and CWB muxes > > > > > > For concurrent writeback, even-indexed CWB muxes must be assigned to > > > even-indexed LMs and odd-indexed CWB muxes for odd-indexed LMs. The same > > > even/odd rule applies for dedicated CWB PINGPONGs. > > > > > > Track the CWB muxes in the global state and add a CWB-specific helper to > > > reserve the correct CWB muxes and dedicated PINGPONGs following the > > > even/odd rule. > > > > > > Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com> > > > --- > > > drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 34 ++++++++++-- > > > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h | 2 + > > > drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h | 1 + > > > drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 83 +++++++++++++++++++++++++++++ > > > 4 files changed, 116 insertions(+), 4 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c > > > index a895d48fe81ccc71d265e089992786e8b6268b1b..a95dc1f0c6a422485c7ba98743e944e1a4f43539 100644 > > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c > > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c > > > @@ -2,7 +2,7 @@ > > > /* > > > * Copyright (C) 2013 Red Hat > > > * Copyright (c) 2014-2018, 2020-2021 The Linux Foundation. All rights reserved. > > > - * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved. > > > + * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved. 
> > > * > > > * Author: Rob Clark <robdclark@gmail.com> > > > */ > > > @@ -28,6 +28,7 @@ > > > #include "dpu_hw_dsc.h" > > > #include "dpu_hw_merge3d.h" > > > #include "dpu_hw_cdm.h" > > > +#include "dpu_hw_cwb.h" > > > #include "dpu_formats.h" > > > #include "dpu_encoder_phys.h" > > > #include "dpu_crtc.h" > > > @@ -133,6 +134,9 @@ enum dpu_enc_rc_states { > > > * @cur_slave: As above but for the slave encoder. > > > * @hw_pp: Handle to the pingpong blocks used for the display. No. > > > * pingpong blocks can be different than num_phys_encs. > > > + * @hw_cwb: Handle to the CWB muxes used for concurrent writeback > > > + * display. Number of CWB muxes can be different than > > > + * num_phys_encs. > > > * @hw_dsc: Handle to the DSC blocks used for the display. > > > * @dsc_mask: Bitmask of used DSC blocks. > > > * @intfs_swapped: Whether or not the phys_enc interfaces have been swapped > > > @@ -177,6 +181,7 @@ struct dpu_encoder_virt { > > > struct dpu_encoder_phys *cur_master; > > > struct dpu_encoder_phys *cur_slave; > > > struct dpu_hw_pingpong *hw_pp[MAX_CHANNELS_PER_ENC]; > > > + struct dpu_hw_cwb *hw_cwb[MAX_CHANNELS_PER_ENC]; > > > struct dpu_hw_dsc *hw_dsc[MAX_CHANNELS_PER_ENC]; > > > unsigned int dsc_mask; > > > @@ -1138,7 +1143,10 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc, > > > struct dpu_hw_blk *hw_pp[MAX_CHANNELS_PER_ENC]; > > > struct dpu_hw_blk *hw_ctl[MAX_CHANNELS_PER_ENC]; > > > struct dpu_hw_blk *hw_dsc[MAX_CHANNELS_PER_ENC]; > > > + struct dpu_hw_blk *hw_cwb[MAX_CHANNELS_PER_ENC]; > > > int num_pp, num_dsc, num_ctl; > > > + int num_cwb = 0; > > > + bool is_cwb_encoder; > > > unsigned int dsc_mask = 0; > > > int i; > > > @@ -1152,6 +1160,8 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc, > > > priv = drm_enc->dev->dev_private; > > > dpu_kms = to_dpu_kms(priv->kms); > > > + is_cwb_encoder = drm_crtc_in_clone_mode(crtc_state) && > > > + dpu_enc->disp_info.intf_type == INTF_WB; > > > 
global_state = dpu_kms_get_existing_global_state(dpu_kms); > > > if (IS_ERR_OR_NULL(global_state)) { > > > @@ -1162,9 +1172,25 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc, > > > trace_dpu_enc_mode_set(DRMID(drm_enc)); > > > /* Query resource that have been reserved in atomic check step. */ > > > - num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, > > > - drm_enc->crtc, DPU_HW_BLK_PINGPONG, hw_pp, > > > - ARRAY_SIZE(hw_pp)); > > > + if (is_cwb_encoder) { > > > + num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, > > > + drm_enc->crtc, > > > + DPU_HW_BLK_DCWB_PINGPONG, > > > + hw_pp, ARRAY_SIZE(hw_pp)); > > > + num_cwb = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, > > > + drm_enc->crtc, > > > + DPU_HW_BLK_CWB, > > > + hw_cwb, ARRAY_SIZE(hw_cwb)); > > > + } else { > > > + num_pp = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, > > > + drm_enc->crtc, > > > + DPU_HW_BLK_PINGPONG, hw_pp, > > > + ARRAY_SIZE(hw_pp)); > > > + } > > > + > > > + for (i = 0; i < num_cwb; i++) > > > + dpu_enc->hw_cwb[i] = to_dpu_hw_cwb(hw_cwb[i]); > > > + > > > num_ctl = dpu_rm_get_assigned_resources(&dpu_kms->rm, global_state, > > > drm_enc->crtc, DPU_HW_BLK_CTL, hw_ctl, ARRAY_SIZE(hw_ctl)); > > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h > > > index ba7bb05efe9b8cac01a908e53121117e130f91ec..8d820cd1b5545d247515763039b341184e814e32 100644 > > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h > > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h > > > @@ -77,12 +77,14 @@ enum dpu_hw_blk_type { > > > DPU_HW_BLK_LM, > > > DPU_HW_BLK_CTL, > > > DPU_HW_BLK_PINGPONG, > > > + DPU_HW_BLK_DCWB_PINGPONG, > > > DPU_HW_BLK_INTF, > > > DPU_HW_BLK_WB, > > > DPU_HW_BLK_DSPP, > > > DPU_HW_BLK_MERGE_3D, > > > DPU_HW_BLK_DSC, > > > DPU_HW_BLK_CDM, > > > + DPU_HW_BLK_CWB, > > > DPU_HW_BLK_MAX, > > > }; > > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h > > > index 48d756d8f8c6e4ab94b72bac0418320f7dc8cda8..1fc8abda927fc094b369e0d1efc795b71d6a7fcb 100644 > > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h > > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h > > > @@ -128,6 +128,7 @@ struct dpu_global_state { > > > uint32_t dspp_to_crtc_id[DSPP_MAX - DSPP_0]; > > > uint32_t dsc_to_crtc_id[DSC_MAX - DSC_0]; > > > uint32_t cdm_to_crtc_id; > > > + uint32_t cwb_to_crtc_id[CWB_MAX - CWB_0]; > > > }; > > > struct dpu_global_state > > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c > > > index 85adaf256b2c705d2d7df378b6ffc0e578f52bc3..ead24bb0ceb5d8ec4705f0d32330294d0b45b216 100644 > > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c > > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c > > > @@ -234,6 +234,55 @@ static int _dpu_rm_get_lm_peer(struct dpu_rm *rm, int primary_idx) > > > return -EINVAL; > > > } > > > +static int _dpu_rm_reserve_cwb_mux_and_pingpongs(struct dpu_rm *rm, > > > + struct dpu_global_state *global_state, > > > + uint32_t crtc_id, > > > + struct msm_display_topology *topology) > > > +{ > > > + int num_cwb_pp = topology->num_lm, cwb_pp_count = 0; > > > + int cwb_pp_start_idx = PINGPONG_CWB_0 - PINGPONG_0; > > > + int cwb_pp_idx[MAX_BLOCKS]; > > > + int cwb_mux_idx[MAX_BLOCKS]; > > > + > > > + /* > > > + * Reserve additional dedicated CWB PINGPONG blocks and muxes for each > > > + * mixer > > > + * > > > + * TODO: add support reserving resources for platforms with no > > > + * PINGPONG_CWB > > > > What about doing it other way around: allocate CWBs first as required > > (even/odd, proper count, etc). Then for each of CWBs allocate a PP block > > (I think it's enough to simply make CWB blocks have a corresponding PP > > index as a property). This way the driver can handle both legacy and > > current platforms. 
>
> Hi Dmitry,
>
> Sorry if I'm misunderstanding your suggestion, but the main change needed to
> support platforms with no dedicated PINGPONG_CWB is where in the
> rm->pingpong_blks list to start assigning pingpong blocks for the CWB mux.
> I'm not sure how changing the order in which CWBs and the pingpong blocks
> are assigned will address that.
>
> (FWIW, the only change necessary to add support for non-dedicated
> PINGPONG_CWBs platforms for this function should just be changing the
> initialization value of cwb_pp_start_idx)

If I remember correctly, we have identified several generations of DPU
wrt. CWB handling:
- 8.1+ (or 8.0+?), DCWB, dedicated PP blocks
- 7.2, dedicated PP_1?
- 5.0+, shared PP blocks
- older DPUs, special handling of PP

If the driver allocates PP first, then it first has to allocate PP (in a
platform-specific way) and then go from PINGPONG to CWB (in a
platform-specific way). If CWB is allocated first, then you have only
one platform-specific piece of code that gets PINGPONG for the CWB (and
as this function is called after the CWB allocation, the major part of
the CWB / PP allocation is generic).
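To make the "CWB first, then PP" ordering concrete, here is a rough userspace sketch of one way it could be structured. It assumes each CWB mux carries the index of its PINGPONG as a catalog property, so the only platform-specific input is that table. Every name (`cwb_desc_sketch`, `cwb_catalog_sketch`, `alloc_state_sketch`, `reserve_cwb_first_sketch()`) and every index value is hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CWB 4  /* hypothetical number of CWB muxes */
#define NUM_PP  10 /* hypothetical number of PINGPONG slots */

/* Hypothetical catalog entry: which PINGPONG slot belongs to a CWB mux. */
struct cwb_desc_sketch {
	int pp_idx;
};

/* Example table for a platform with dedicated CWB PINGPONGs at slots 6..9. */
static const struct cwb_desc_sketch cwb_catalog_sketch[NUM_CWB] = {
	{ .pp_idx = 6 }, { .pp_idx = 7 }, { .pp_idx = 8 }, { .pp_idx = 9 },
};

struct alloc_state_sketch {
	uint32_t cwb_to_crtc[NUM_CWB];
	uint32_t pp_to_crtc[NUM_PP];
};

static int taken_by_other(const uint32_t *map, int idx, uint32_t crtc_id)
{
	return map[idx] && map[idx] != crtc_id;
}

/*
 * Generic part: choose a free CWB mux whose parity matches each mixer.
 * Platform-specific part: look up the PINGPONG via the catalog table.
 * The state is only modified once every mixer has a mux.
 */
static int reserve_cwb_first_sketch(struct alloc_state_sketch *st,
				    const struct cwb_desc_sketch *catalog,
				    uint32_t crtc_id,
				    const int *lm_idx, int num_lm)
{
	int chosen[NUM_CWB];
	uint8_t picked[NUM_CWB] = { 0 };

	assert(num_lm >= 0 && num_lm <= NUM_CWB);

	for (int k = 0; k < num_lm; k++) {
		int found = -1;

		for (int c = 0; c < NUM_CWB; c++) {
			int pp = catalog[c].pp_idx;

			if (picked[c] || c % 2 != lm_idx[k] % 2)
				continue;
			if (taken_by_other(st->cwb_to_crtc, c, crtc_id) ||
			    taken_by_other(st->pp_to_crtc, pp, crtc_id))
				continue;

			found = c;
			break;
		}

		if (found < 0)
			return -1;

		picked[found] = 1;
		chosen[k] = found;
	}

	for (int k = 0; k < num_lm; k++) {
		st->cwb_to_crtc[chosen[k]] = crtc_id;
		st->pp_to_crtc[catalog[chosen[k]].pp_idx] = crtc_id;
	}

	return 0;
}
```

In this shape, a platform with shared PP blocks would only need a different table (for example one mapping the muxes to slots 0..3), while the parity matching and the commit step stay the same.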
> > Thanks, > > Jessica Zhang > > > > > > + */ > > > + for (int i = 0; i < ARRAY_SIZE(rm->mixer_blks) && > > > + cwb_pp_count < num_cwb_pp; i++) { > > > + for (int j = cwb_pp_start_idx; > > > + j < ARRAY_SIZE(rm->pingpong_blks); j++) { > > > + /* > > > + * Odd LMs must be assigned to odd PINGPONGs and even > > > + * LMs with even PINGPONGs > > > + */ > > > + if (reserved_by_other(global_state->pingpong_to_crtc_id, j, crtc_id) || > > > + i % 2 != j % 2) > > > + continue; > > > + > > > + cwb_pp_idx[cwb_pp_count] = j; > > > + cwb_mux_idx[cwb_pp_count] = j - cwb_pp_start_idx; > > > + cwb_pp_count++; > > > + break; > > > + } > > > + } > > > + > > > + if (cwb_pp_count != num_cwb_pp) { > > > + DPU_ERROR("Unable to reserve all CWB PINGPONGs\n"); > > > + return -ENAVAIL; > > > + } > > > + > > > + for (int i = 0; i < cwb_pp_count; i++) { > > > + global_state->pingpong_to_crtc_id[cwb_pp_idx[i]] = crtc_id; > > > + global_state->cwb_to_crtc_id[cwb_mux_idx[i]] = crtc_id; > > > + } > > > + > > > + return 0; > > > +} > > > + > > > /** > > > * _dpu_rm_check_lm_and_get_connected_blks - check if proposed layer mixer meets > > > * proposed use case requirements, incl. 
hardwired dependent blocks like > > > @@ -614,6 +663,12 @@ static int _dpu_rm_make_reservation( > > > return ret; > > > } > > > + if (topology->cwb_enabled) { > > > + ret = _dpu_rm_reserve_cwb_mux_and_pingpongs(rm, global_state, > > > + crtc_id, topology); > > > + if (ret) > > > + return ret; > > > + } > > > ret = _dpu_rm_reserve_ctls(rm, global_state, crtc_id, > > > topology); > > > @@ -671,6 +726,8 @@ void dpu_rm_release(struct dpu_global_state *global_state, > > > _dpu_rm_clear_mapping(global_state->dspp_to_crtc_id, > > > ARRAY_SIZE(global_state->dspp_to_crtc_id), crtc_id); > > > _dpu_rm_clear_mapping(&global_state->cdm_to_crtc_id, 1, crtc_id); > > > + _dpu_rm_clear_mapping(global_state->cwb_to_crtc_id, > > > + ARRAY_SIZE(global_state->cwb_to_crtc_id), crtc_id); > > > } > > > /** > > > @@ -733,6 +790,7 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm, > > > switch (type) { > > > case DPU_HW_BLK_PINGPONG: > > > + case DPU_HW_BLK_DCWB_PINGPONG: > > > hw_blks = rm->pingpong_blks; > > > hw_to_crtc_id = global_state->pingpong_to_crtc_id; > > > max_blks = ARRAY_SIZE(rm->pingpong_blks); > > > @@ -762,6 +820,11 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm, > > > hw_to_crtc_id = &global_state->cdm_to_crtc_id; > > > max_blks = 1; > > > break; > > > + case DPU_HW_BLK_CWB: > > > + hw_blks = rm->cwb_blks; > > > + hw_to_crtc_id = global_state->cwb_to_crtc_id; > > > + max_blks = ARRAY_SIZE(rm->cwb_blks); > > > + break; > > > default: > > > DPU_ERROR("blk type %d not managed by rm\n", type); > > > return 0; > > > @@ -772,6 +835,20 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm, > > > if (hw_to_crtc_id[i] != crtc_id) > > > continue; > > > + if (type == DPU_HW_BLK_PINGPONG) { > > > + struct dpu_hw_pingpong *pp = to_dpu_hw_pingpong(hw_blks[i]); > > > + > > > + if (pp->idx >= PINGPONG_CWB_0) > > > + continue; > > > + } > > > + > > > + if (type == DPU_HW_BLK_DCWB_PINGPONG) { > > > + struct dpu_hw_pingpong *pp = to_dpu_hw_pingpong(hw_blks[i]); > > > + > > > + 
if (pp->idx < PINGPONG_CWB_0) > > > + continue; > > > + } > > > + > > > if (num_blks == blks_size) { > > > DPU_ERROR("More than %d resources assigned to crtc %d\n", > > > blks_size, crtc_id); > > > @@ -847,4 +924,10 @@ void dpu_rm_print_state(struct drm_printer *p, > > > dpu_rm_print_state_helper(p, rm->cdm_blk, > > > global_state->cdm_to_crtc_id); > > > drm_puts(p, "\n"); > > > + > > > + drm_puts(p, "\tcwb="); > > > + for (i = 0; i < ARRAY_SIZE(global_state->cwb_to_crtc_id); i++) > > > + dpu_rm_print_state_helper(p, rm->cwb_blks[i], > > > + global_state->cwb_to_crtc_id[i]); > > > + drm_puts(p, "\n"); > > > } > > > > > > -- > > > 2.34.1 > > > > > > > -- > > With best wishes > > Dmitry >