mbox series

[0/3] drm/msm: Further expose UBWC tiling parameters

Message ID 20240702-msm-tiling-config-v1-0-adaa6a6e4523@gmail.com
Headers show
Series drm/msm: Further expose UBWC tiling parameters | expand

Message

Connor Abbott July 2, 2024, 12:56 p.m. UTC
After testing, there are more parameters that we're programming which
affect how UBWC tiles are laid out in memory and therefore affect
the Mesa implementation of VK_EXT_host_image_copy [1], which includes a
CPU implementation of tiling and detiling images. In particular we have:

1. ubwc_mode, which is used to enable level 1 bank swizzling to go back
   to UBWC 1.0 when the implementation supports UBWC 2.0. a610 sets
   this.
2. macrotile_mode, which we previously left as default but according to
   downstream we shouldn't for a680.
3. level2_swizzling_dis, which according to downstream has to be set
   differently for a663.

I want as much as possible to avoid problems from people trying to
upstream Mesa/kernel support not knowing what they're doing and blindly
copying things, so let's make this very explicit that you must set the
correct parameters in the kernel and then make sure that Mesa always
gets the right parameters from the "source of truth" in the kernel by
adding two new UAPI parameters. The Mesa MR has already been updated to
use this if available.

A secondary goal is to make the adreno settings look more like the MDSS
settings, by combining ubwc_mode and level2_swizzling_dis into a single
ubwc_swizzle parameter that matches the MDSS one. This will help with
creating a single source of truth for all drivers later. The UAPI also
matches this, and it makes the Mesa tiling and detiling implementation
simpler/more straightforward.

For more information on what all these parameters mean, see the comments
I've added in the first commit and the giant comment in
src/freedreno/fdl/fd6_tiled_memcpy.c I've added in [1].

Testing of the Mesa MR both with and without this series is appreciated,
there are many different SoCs out there with different UBWC
configurations and I cannot test them all.

[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26578

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
---
Connor Abbott (3):
      drm/msm: Update a6xx register XML
      drm/msm: Expand UBWC config setting
      drm/msm: Expose expanded UBWC config uapi

 drivers/gpu/drm/msm/adreno/a5xx_gpu.c         |    4 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c         |   36 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c       |    6 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h       |    3 +-
 drivers/gpu/drm/msm/registers/adreno/a6xx.xml | 1617 ++++++++++++++++++++++++-
 include/uapi/drm/msm_drm.h                    |    2 +
 6 files changed, 1644 insertions(+), 24 deletions(-)
---
base-commit: c39e6f4f08c40710c15a372f5a29de7b84f47fd9
change-id: 20240701-msm-tiling-config-c5f222f5db1c

Best regards,

Comments

Rob Clark July 2, 2024, 2:31 p.m. UTC | #1
On Tue, Jul 2, 2024 at 5:56 AM Connor Abbott <cwabbott0@gmail.com> wrote:
>
> According to downstream we should be setting RBBM_NC_MODE_CNTL to a
> non-default value on a663 and a680, we don't support a663 and on a680
> we're leaving it at the wrong (suboptimal) value. Just set it on all
> GPUs. Similarly, plumb through level2_swizzling_dis which will be
> necessary on a663.
>
> ubwc_mode is expanded and renamed to ubwc_swizzle to match the name on
> the display side. Similarly macrotile_mode should match the display
> side.
>
> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> ---
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c   |  4 ++++
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 36 ++++++++++++++++++++++++---------
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  3 ++-
>  3 files changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index c003f970189b..33b0f607f913 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -1788,5 +1788,9 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
>         else
>                 adreno_gpu->ubwc_config.highest_bank_bit = 14;
>
> +       /* a5xx only supports UBWC 1.0, these are not configurable */
> +       adreno_gpu->ubwc_config.macrotile_mode = 0;
> +       adreno_gpu->ubwc_config.ubwc_swizzle = 0x7;
> +
>         return gpu;
>  }
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c98cdb1e9326..7a3564dd7941 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -499,8 +499,17 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu *gpu)
>         gpu->ubwc_config.uavflagprd_inv = 0;
>         /* Whether the minimum access length is 64 bits */
>         gpu->ubwc_config.min_acc_len = 0;
> -       /* Entirely magic, per-GPU-gen value */
> -       gpu->ubwc_config.ubwc_mode = 0;
> +       /* Whether to enable level 1, 2 & 3 bank swizzling.
> +        * UBWC 1.0 always enables all three levels.
> +        * UBWC 2.0 removes level 1 bank swizzling, leaving levels 2 & 3.
> +        * UBWC 4.0 adds the optional ability to disable levels 2 & 3.

I guess this is a bitmask for BIT(level_n)?

> +        */
> +       gpu->ubwc_config.ubwc_swizzle = 0x6;
> +       /* Whether to use 4-channel macrotiling mode or the newer 8-channel
> +        * macrotiling mode introduced in UBWC 3.1. 0 is 4-channel and 1 is
> +        * 8-channel.
> +        */

Can we add these comments as kerneldoc comments in the ubwc_config
struct?  That would be a more natural place for eventually moving ubwc
config to a central systemwide table (and perhaps finally properly
dealing with the setting differences for DDR vs LPDDR)

BR,
-R

> +       gpu->ubwc_config.macrotile_mode = 0;
>         /*
>          * The Highest Bank Bit value represents the bit of the highest DDR bank.
>          * This should ideally use DRAM type detection.
> @@ -510,7 +519,7 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu *gpu)
>         if (adreno_is_a610(gpu)) {
>                 gpu->ubwc_config.highest_bank_bit = 13;
>                 gpu->ubwc_config.min_acc_len = 1;
> -               gpu->ubwc_config.ubwc_mode = 1;
> +               gpu->ubwc_config.ubwc_swizzle = 0x7;
>         }
>
>         if (adreno_is_a618(gpu))
> @@ -536,6 +545,7 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu *gpu)
>                 gpu->ubwc_config.amsbc = 1;
>                 gpu->ubwc_config.rgb565_predicator = 1;
>                 gpu->ubwc_config.uavflagprd_inv = 2;
> +               gpu->ubwc_config.macrotile_mode = 1;
>         }
>
>         if (adreno_is_7c3(gpu)) {
> @@ -543,12 +553,12 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu *gpu)
>                 gpu->ubwc_config.amsbc = 1;
>                 gpu->ubwc_config.rgb565_predicator = 1;
>                 gpu->ubwc_config.uavflagprd_inv = 2;
> +               gpu->ubwc_config.macrotile_mode = 1;
>         }
>
>         if (adreno_is_a702(gpu)) {
>                 gpu->ubwc_config.highest_bank_bit = 14;
>                 gpu->ubwc_config.min_acc_len = 1;
> -               gpu->ubwc_config.ubwc_mode = 0;
>         }
>  }
>
> @@ -564,21 +574,26 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>         u32 hbb = adreno_gpu->ubwc_config.highest_bank_bit - 13;
>         u32 hbb_hi = hbb >> 2;
>         u32 hbb_lo = hbb & 3;
> +       u32 ubwc_mode = adreno_gpu->ubwc_config.ubwc_swizzle & 1;
> +       u32 level2_swizzling_dis = !(adreno_gpu->ubwc_config.ubwc_swizzle & 2);
>
>         gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
> +                 level2_swizzling_dis << 12 |
>                   adreno_gpu->ubwc_config.rgb565_predicator << 11 |
>                   hbb_hi << 10 | adreno_gpu->ubwc_config.amsbc << 4 |
>                   adreno_gpu->ubwc_config.min_acc_len << 3 |
> -                 hbb_lo << 1 | adreno_gpu->ubwc_config.ubwc_mode);
> +                 hbb_lo << 1 | ubwc_mode);
>
> -       gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
> +       gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL,
> +                 level2_swizzling_dis << 6 | hbb_hi << 4 |
>                   adreno_gpu->ubwc_config.min_acc_len << 3 |
> -                 hbb_lo << 1 | adreno_gpu->ubwc_config.ubwc_mode);
> +                 hbb_lo << 1 | ubwc_mode);
>
> -       gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |
> +       gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
> +                 level2_swizzling_dis << 12 | hbb_hi << 10 |
>                   adreno_gpu->ubwc_config.uavflagprd_inv << 4 |
>                   adreno_gpu->ubwc_config.min_acc_len << 3 |
> -                 hbb_lo << 1 | adreno_gpu->ubwc_config.ubwc_mode);
> +                 hbb_lo << 1 | ubwc_mode);
>
>         if (adreno_is_a7xx(adreno_gpu))
>                 gpu_write(gpu, REG_A7XX_GRAS_NC_MODE_CNTL,
> @@ -586,6 +601,9 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>
>         gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL,
>                   adreno_gpu->ubwc_config.min_acc_len << 23 | hbb_lo << 21);
> +
> +       gpu_write(gpu, REG_A6XX_RBBM_NC_MODE_CNTL,
> +                 adreno_gpu->ubwc_config.macrotile_mode);
>  }
>
>  static int a6xx_cp_init(struct msm_gpu *gpu)
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> index cff8ce541d2c..b2da660c10c7 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> @@ -194,9 +194,10 @@ struct adreno_gpu {
>                 u32 rgb565_predicator;
>                 u32 uavflagprd_inv;
>                 u32 min_acc_len;
> -               u32 ubwc_mode;
> +               u32 ubwc_swizzle;
>                 u32 highest_bank_bit;
>                 u32 amsbc;
> +               u32 macrotile_mode;
>         } ubwc_config;
>
>         /*
>
> --
> 2.31.1
>