[6/8] drm/msm/dpu: use dpu_perf_cfg in DPU core_perf code

Message ID 20230620000846.946925-7-dmitry.baryshkov@linaro.org
State Superseded
Series [1/8] drm/msm/dpu: drop enum dpu_core_perf_data_bus_id

Commit Message

Dmitry Baryshkov June 20, 2023, 12:08 a.m. UTC
Simplify dpu_core_perf code by using only dpu_perf_cfg instead of using
full-featured catalog data.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c | 52 ++++++++-----------
 drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h |  8 +--
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c       |  2 +-
 3 files changed, 27 insertions(+), 35 deletions(-)

Comments

Abhinav Kumar July 4, 2023, 12:46 a.m. UTC | #1
On 6/20/2023 4:31 AM, Konrad Dybcio wrote:
> On 20.06.2023 13:18, Dmitry Baryshkov wrote:
>> On 20/06/2023 13:55, Konrad Dybcio wrote:
>>> On 20.06.2023 02:08, Dmitry Baryshkov wrote:
>>>> Simplify dpu_core_perf code by using only dpu_perf_cfg instead of using
>>>> full-featured catalog data.
>>>>
>>>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>>>> ---
>>> Acked-by: Konrad Dybcio <konrad.dybcio@linaro.org>
>>>
>>> Check below.
>>>
>>>>    drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c | 52 ++++++++-----------
>>>>    drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h |  8 +--
>>>>    drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c       |  2 +-
>>>>    3 files changed, 27 insertions(+), 35 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
>>>> index 773e641eab28..78a7e3ea27a4 100644
>>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
>>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
>>>> @@ -19,11 +19,11 @@
>>>>
>>>>    /**
>>>>     * _dpu_core_perf_calc_bw() - to calculate BW per crtc
>>>> - * @kms:  pointer to the dpu_kms
>>>> + * @perf_cfg: performance configuration
>>>>     * @crtc: pointer to a crtc
>>>>     * Return: returns aggregated BW for all planes in crtc.
>>>>     */
>>>> -static u64 _dpu_core_perf_calc_bw(struct dpu_kms *kms,
>>>> +static u64 _dpu_core_perf_calc_bw(const struct dpu_perf_cfg *perf_cfg,
>>>>               struct drm_crtc *crtc)
>>>>    {
>>>>       struct drm_plane *plane;
>>>> @@ -39,7 +39,7 @@ static u64 _dpu_core_perf_calc_bw(struct dpu_kms *kms,
>>>>               crtc_plane_bw += pstate->plane_fetch_bw;
>>>>       }
>>>>
>>>> -    bw_factor = kms->catalog->perf->bw_inefficiency_factor;
>>>> +    bw_factor = perf_cfg->bw_inefficiency_factor;
>>> It's set to 120 for all SoCs... and it sounds very much like some kind
>>> of a hack.
>>>
>>> The 105 on the other inefficiency factor is easy to spot:
>>>
>>> (1024/1000)^2 = 1.048576 =~= 1.05 = 105%
>>>
>>> It comes from a MiB-MB-MHz conversion that Qcom splattered all over
>>> downstream due to ancient tragic design decisions in msmbus
>>> (which leak into the downstream interconnect a bit):
>>
>> This doesn't explain why msm8226 and msm8974 had a qcom,mdss-clk-factor
>> of 5/4, while 8084 got 1.05 as usual. I can only suppose that MDSS 1.0
>> (8974 v1) and 1.1 (8226) had some internal inefficiency issues.
>>
>> Also, this 1.05 is a clock inefficiency factor, so it should not be
>> related to the msm bus client code.
> Right. Maybe Abhinav could shed some light on this.
> 
> Konrad
>>

I will need to check with someone else about this, as msm8974 and msm8226 
are too old for me to remember.

That being said, I really don't think the explanation behind the number 
is going to be spelled out in detail here even if I did ask.

The name of the variable "clk_inefficiency_factor" says pretty much what 
has to be said for the purposes of this patch. I don't know if we will be 
able to dig further into how that number came about.

Coming to this patch itself, it's not a major gain or a major loss from my 
perspective.

Sure, we don't need to pass the full catalog today, so we can just pass 
the perf_cfg. But I cannot guarantee we won't need the full catalog later.


>>>
>>> The logic needs to get some input that corresponds to the clock rate
>>> of a bus clock (19.2, 200, 300 MHz, etc.), but the APIs expect a kBps
>>> value. So at one point they invented a MHZ_TO_MBPS macro which did this
>>> conversion the other way around and probably had to account for it.
>>>
>>> I think they tried to make it make more sense, but it ended up being
>>> even more spaghetti :/
>>>
>>> Not yet sure how it's done on RPMh icc, but with SMD RPM, passing e.g.
>>>
>>> opp-peak-kBps = <(200 * 8 * 1000)>; # 200 MHz * 8 bytes wide * MBps-to-kBps
>>>
>>> results in a "correct" end rate.
>>>
>>> Konrad
>>>>       if (bw_factor) {
>>>>               crtc_plane_bw *= bw_factor;
>>>>               do_div(crtc_plane_bw, 100);
>>
>>
>> --
>> With best wishes
>> Dmitry
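
To make the factor arithmetic in the thread above concrete, here is a minimal
standalone sketch (not kernel code and not part of the patch; the clock and
bandwidth inputs are hypothetical) of the "value * factor / 100" scaling that
_dpu_core_perf_calc_bw() and _dpu_core_perf_calc_clk() perform:

/* factor_sketch.c - percentage scaling as discussed in the thread above */
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

static uint64_t apply_factor(uint64_t val, uint32_t factor)
{
	/* mirrors "val *= factor; do_div(val, 100);" in dpu_core_perf.c */
	if (factor)
		val = val * factor / 100;
	return val;
}

int main(void)
{
	uint64_t crtc_clk = 300000000ULL;  /* 300 MHz, hypothetical input */
	uint64_t crtc_bw = 2000000000ULL;  /* 2 GB/s, hypothetical input */

	/* clk_inefficiency_factor = 105: (1024.0/1000)^2 = 1.048576 =~ 1.05 */
	printf("clk: %" PRIu64 " -> %" PRIu64 "\n",
	       crtc_clk, apply_factor(crtc_clk, 105));

	/* bw_inefficiency_factor = 120: origin unclear, per the thread above */
	printf("bw:  %" PRIu64 " -> %" PRIu64 "\n",
	       crtc_bw, apply_factor(crtc_bw, 120));

	return 0;
}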
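
Similarly, a sketch of the MHz-to-kBps conversion Konrad describes; the bus
clock and bus width here are example values, not taken from any particular SoC:

/* kbps_sketch.c - the opp-peak-kBps arithmetic from the thread above */
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void)
{
	uint64_t bus_mhz = 200;   /* example bus clock in MHz */
	uint64_t bus_bytes = 8;   /* example bus width in bytes (64-bit) */

	/* MHz * bytes-wide = MB/s; multiplying by 1000 converts MB/s to kB/s,
	 * which is the unit opp-peak-kBps (and the icc API) expects
	 */
	uint64_t peak_kbps = bus_mhz * bus_bytes * 1000;

	printf("opp-peak-kBps = %" PRIu64 "\n", peak_kbps); /* prints 1600000 */

	return 0;
}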

Patch

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
index 773e641eab28..78a7e3ea27a4 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c
@@ -19,11 +19,11 @@
 
 /**
  * _dpu_core_perf_calc_bw() - to calculate BW per crtc
- * @kms:  pointer to the dpu_kms
+ * @perf_cfg: performance configuration
  * @crtc: pointer to a crtc
  * Return: returns aggregated BW for all planes in crtc.
  */
-static u64 _dpu_core_perf_calc_bw(struct dpu_kms *kms,
+static u64 _dpu_core_perf_calc_bw(const struct dpu_perf_cfg *perf_cfg,
 		struct drm_crtc *crtc)
 {
 	struct drm_plane *plane;
@@ -39,7 +39,7 @@ static u64 _dpu_core_perf_calc_bw(struct dpu_kms *kms,
 		crtc_plane_bw += pstate->plane_fetch_bw;
 	}
 
-	bw_factor = kms->catalog->perf->bw_inefficiency_factor;
+	bw_factor = perf_cfg->bw_inefficiency_factor;
 	if (bw_factor) {
 		crtc_plane_bw *= bw_factor;
 		do_div(crtc_plane_bw, 100);
@@ -50,12 +50,12 @@ static u64 _dpu_core_perf_calc_bw(struct dpu_kms *kms,
 
 /**
  * _dpu_core_perf_calc_clk() - to calculate clock per crtc
- * @kms:  pointer to the dpu_kms
+ * @perf_cfg: performance configuration
  * @crtc: pointer to a crtc
  * @state: pointer to a crtc state
  * Return: returns max clk for all planes in crtc.
  */
-static u64 _dpu_core_perf_calc_clk(struct dpu_kms *kms,
+static u64 _dpu_core_perf_calc_clk(const struct dpu_perf_cfg *perf_cfg,
 		struct drm_crtc *crtc, struct drm_crtc_state *state)
 {
 	struct drm_plane *plane;
@@ -76,7 +76,7 @@ static u64 _dpu_core_perf_calc_clk(struct dpu_kms *kms,
 		crtc_clk = max(pstate->plane_clk, crtc_clk);
 	}
 
-	clk_factor = kms->catalog->perf->clk_inefficiency_factor;
+	clk_factor = perf_cfg->clk_inefficiency_factor;
 	if (clk_factor) {
 		crtc_clk *= clk_factor;
 		do_div(crtc_clk, 100);
@@ -92,20 +92,20 @@ static struct dpu_kms *_dpu_crtc_get_kms(struct drm_crtc *crtc)
 	return to_dpu_kms(priv->kms);
 }
 
-static void _dpu_core_perf_calc_crtc(struct dpu_kms *kms,
+static void _dpu_core_perf_calc_crtc(const struct dpu_perf_cfg *perf_cfg,
 		struct drm_crtc *crtc,
 		struct drm_crtc_state *state,
 		struct dpu_core_perf_params *perf)
 {
-	if (!kms || !kms->catalog || !crtc || !state || !perf) {
+	if (!perf_cfg || !crtc || !state || !perf) {
 		DPU_ERROR("invalid parameters\n");
 		return;
 	}
 
 	memset(perf, 0, sizeof(struct dpu_core_perf_params));
 
-	perf->bw_ctl = _dpu_core_perf_calc_bw(kms, crtc);
-	perf->core_clk_rate = _dpu_core_perf_calc_clk(kms, crtc, state);
+	perf->bw_ctl = _dpu_core_perf_calc_bw(perf_cfg, crtc);
+	perf->core_clk_rate = _dpu_core_perf_calc_clk(perf_cfg, crtc, state);
 
 	DRM_DEBUG_ATOMIC(
 		"crtc=%d clk_rate=%llu core_ab=%llu\n",
@@ -122,6 +122,7 @@ int dpu_core_perf_crtc_check(struct drm_crtc *crtc,
 	struct dpu_crtc_state *dpu_cstate;
 	struct drm_crtc *tmp_crtc;
 	struct dpu_kms *kms;
+	const struct dpu_perf_cfg *perf_cfg;
 
 	if (!crtc || !state) {
 		DPU_ERROR("invalid crtc\n");
@@ -129,10 +130,7 @@ int dpu_core_perf_crtc_check(struct drm_crtc *crtc,
 	}
 
 	kms = _dpu_crtc_get_kms(crtc);
-	if (!kms->catalog) {
-		DPU_ERROR("invalid parameters\n");
-		return 0;
-	}
+	perf_cfg = kms->perf.perf_cfg;
 
 	/* we only need bandwidth check on real-time clients (interfaces) */
 	if (dpu_crtc_get_client_type(crtc) == NRT_CLIENT)
@@ -141,7 +139,7 @@ int dpu_core_perf_crtc_check(struct drm_crtc *crtc,
 	dpu_cstate = to_dpu_crtc_state(state);
 
 	/* obtain new values */
-	_dpu_core_perf_calc_crtc(kms, crtc, state, &dpu_cstate->new_perf);
+	_dpu_core_perf_calc_crtc(perf_cfg, crtc, state, &dpu_cstate->new_perf);
 
 	bw_sum_of_intfs = dpu_cstate->new_perf.bw_ctl;
 	curr_client_type = dpu_crtc_get_client_type(crtc);
@@ -164,7 +162,7 @@ int dpu_core_perf_crtc_check(struct drm_crtc *crtc,
 		bw = DIV_ROUND_UP_ULL(bw_sum_of_intfs, 1000);
 		DRM_DEBUG_ATOMIC("calculated bandwidth=%uk\n", bw);
 
-		threshold = kms->catalog->perf->max_bw_high;
+		threshold = perf_cfg->max_bw_high;
 
 		DRM_DEBUG_ATOMIC("final threshold bw limit = %d\n", threshold);
 
@@ -212,7 +210,7 @@ static int _dpu_core_perf_crtc_update_bus(struct dpu_kms *kms,
 	do_div(avg_bw, (kms->num_paths * 1000)); /*Bps_to_icc*/
 
 	for (i = 0; i < kms->num_paths; i++)
-		icc_set_bw(kms->path[i], avg_bw, kms->catalog->perf->min_dram_ib);
+		icc_set_bw(kms->path[i], avg_bw, kms->perf.perf_cfg->min_dram_ib);
 
 	return ret;
 }
@@ -293,10 +291,6 @@ int dpu_core_perf_crtc_update(struct drm_crtc *crtc,
 	}
 
 	kms = _dpu_crtc_get_kms(crtc);
-	if (!kms->catalog) {
-		DPU_ERROR("invalid kms\n");
-		return -EINVAL;
-	}
 
 	dpu_crtc = to_dpu_crtc(crtc);
 	dpu_cstate = to_dpu_crtc_state(crtc->state);
@@ -375,7 +369,6 @@ int dpu_core_perf_crtc_update(struct drm_crtc *crtc,
 int dpu_core_perf_debugfs_init(struct dpu_kms *dpu_kms, struct dentry *parent)
 {
 	struct dpu_core_perf *perf = &dpu_kms->perf;
-	const struct dpu_mdss_cfg *catalog = perf->catalog;
 	struct dentry *entry;
 
 	entry = debugfs_create_dir("core_perf", parent);
@@ -387,15 +380,15 @@ int dpu_core_perf_debugfs_init(struct dpu_kms *dpu_kms, struct dentry *parent)
 	debugfs_create_u32("enable_bw_release", 0600, entry,
 			(u32 *)&perf->enable_bw_release);
 	debugfs_create_u32("threshold_low", 0600, entry,
-			(u32 *)&catalog->perf->max_bw_low);
+			(u32 *)&perf->perf_cfg->max_bw_low);
 	debugfs_create_u32("threshold_high", 0600, entry,
-			(u32 *)&catalog->perf->max_bw_high);
+			(u32 *)&perf->perf_cfg->max_bw_high);
 	debugfs_create_u32("min_core_ib", 0600, entry,
-			(u32 *)&catalog->perf->min_core_ib);
+			(u32 *)&perf->perf_cfg->min_core_ib);
 	debugfs_create_u32("min_llcc_ib", 0600, entry,
-			(u32 *)&catalog->perf->min_llcc_ib);
+			(u32 *)&perf->perf_cfg->min_llcc_ib);
 	debugfs_create_u32("min_dram_ib", 0600, entry,
-			(u32 *)&catalog->perf->min_dram_ib);
+			(u32 *)&perf->perf_cfg->min_dram_ib);
 
 	return 0;
 }
@@ -410,17 +403,16 @@ void dpu_core_perf_destroy(struct dpu_core_perf *perf)
 
 	perf->max_core_clk_rate = 0;
 	perf->core_clk = NULL;
-	perf->catalog = NULL;
 	perf->dev = NULL;
 }
 
 int dpu_core_perf_init(struct dpu_core_perf *perf,
 		struct drm_device *dev,
-		const struct dpu_mdss_cfg *catalog,
+		const struct dpu_perf_cfg *perf_cfg,
 		struct clk *core_clk)
 {
 	perf->dev = dev;
-	perf->catalog = catalog;
+	perf->perf_cfg = perf_cfg;
 	perf->core_clk = core_clk;
 
 	perf->max_core_clk_rate = clk_get_rate(core_clk);
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h
index c29ec72984b8..e8a7916b6f71 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.h
@@ -28,7 +28,7 @@ struct dpu_core_perf_params {
  * struct dpu_core_perf - definition of core performance context
  * @dev: Pointer to drm device
  * @debugfs_root: top level debug folder
- * @catalog: Pointer to catalog configuration
+ * @perf_cfg: Platform-specific performance configuration
  * @core_clk: Pointer to the core clock
  * @core_clk_rate: current core clock rate
  * @max_core_clk_rate: maximum allowable core clock rate
@@ -38,7 +38,7 @@ struct dpu_core_perf_params {
 struct dpu_core_perf {
 	struct drm_device *dev;
 	struct dentry *debugfs_root;
-	const struct dpu_mdss_cfg *catalog;
+	const struct dpu_perf_cfg *perf_cfg;
 	struct clk *core_clk;
 	u64 core_clk_rate;
 	u64 max_core_clk_rate;
@@ -79,12 +79,12 @@ void dpu_core_perf_destroy(struct dpu_core_perf *perf);
  * dpu_core_perf_init - initialize the given core performance context
  * @perf: Pointer to core performance context
  * @dev: Pointer to drm device
- * @catalog: Pointer to catalog
+ * @perf_cfg: Pointer to platform performance configuration
  * @core_clk: pointer to core clock
  */
 int dpu_core_perf_init(struct dpu_core_perf *perf,
 		struct drm_device *dev,
-		const struct dpu_mdss_cfg *catalog,
+		const struct dpu_perf_cfg *perf_cfg,
 		struct clk *core_clk);
 
 struct dpu_kms;
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index aa8499de1b9f..6e62606e32de 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -1115,7 +1115,7 @@ static int dpu_kms_hw_init(struct msm_kms *kms)
 		dpu_kms->hw_vbif[vbif->id] = hw;
 	}
 
-	rc = dpu_core_perf_init(&dpu_kms->perf, dev, dpu_kms->catalog,
+	rc = dpu_core_perf_init(&dpu_kms->perf, dev, dpu_kms->catalog->perf,
 			msm_clk_bulk_get_clock(dpu_kms->clocks, dpu_kms->num_clocks, "core"));
 	if (rc) {
 		DPU_ERROR("failed to init perf %d\n", rc);