From patchwork Tue Apr 28 03:22:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Francisco Jerez X-Patchwork-Id: 212211 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7E7CBC83008 for ; Tue, 28 Apr 2020 03:27:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 516AB21D79 for ; Tue, 28 Apr 2020 03:27:52 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=riseup.net header.i=@riseup.net header.b="KaTt9WBW" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726047AbgD1D1v (ORCPT ); Mon, 27 Apr 2020 23:27:51 -0400 Received: from mx1.riseup.net ([198.252.153.129]:48900 "EHLO mx1.riseup.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726284AbgD1D1v (ORCPT ); Mon, 27 Apr 2020 23:27:51 -0400 Received: from bell.riseup.net (unknown [10.0.1.178]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client CN "*.riseup.net", Issuer "Sectigo RSA Domain Validation Secure Server CA" (not verified)) by mx1.riseup.net (Postfix) with ESMTPS id 49B6Xk3rQpzFfDl; Mon, 27 Apr 2020 20:27:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=riseup.net; s=squak; t=1588044470; bh=jYQGDs775oeWpnjVU5KQQtQdTslfIGrc2wZmpUY24xE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=KaTt9WBWuX3+YC46+E95lTIX0T/t2a6+fxKbnsEzRqtMpEJBazUav9JDvnAsnubqh 
cUjZClvRPFdSgFSO0wuPxA8kvfuarmQ2RdV3facq/0ScLZCvcAGaLflMt/hP8mwwgx UYqQYyrPM4G9ggTFsjFMQgufNTe4S37yH2o8HkOw= X-Riseup-User-ID: FFF0C73852467EC62002E944EFC9FFF994A3B9DDD677902FBB2E2AED2180165B Received: from [127.0.0.1] (localhost [127.0.0.1]) by bell.riseup.net (Postfix) with ESMTPSA id 49B6Xk1jXJzJqbk; Mon, 27 Apr 2020 20:27:50 -0700 (PDT) From: Francisco Jerez To: "Rafael J. Wysocki" , "Pandruvada\, Srinivas" Cc: linux-pm@vger.kernel.org, intel-gfx@lists.freedesktop.org, chris.p.wilson@intel.com, "Vivi\, Rodrigo" , Peter Zijlstra Subject: [PATCHv2.99 02/11] drm/i915: Adjust PM QoS scaling response frequency based on GPU load. Date: Mon, 27 Apr 2020 20:22:49 -0700 Message-Id: <20200428032258.2518-3-currojerez@riseup.net> In-Reply-To: <20200428032258.2518-1-currojerez@riseup.net> References: <20200428032258.2518-1-currojerez@riseup.net> MIME-Version: 1.0 Sender: linux-pm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org This allows CPUFREQ governors to realize when the system becomes non-CPU-bound due to GPU rendering activity, and cause them to respond more conservatively to the workload by limiting their response frequency: CPU energy usage will be reduced when there isn't a good chance for system performance to scale with CPU frequency due to the GPU bottleneck. This leaves additional TDP budget available for the GPU to reach higher frequencies, which may be translated into an improvement in graphics performance to the extent that the workload remains TDP-limited. If the workload isn't (anymore) TDP-limited performance should stay roughly constant, but energy usage will be divided by a similar factor. 

The metric used by this patch to determine whether the GPU is likely
to be a bottleneck may not be particularly obvious: we only specify a
reduced PM QoS response frequency target while both execlists ELSP
ports are simultaneously active, since in that case we know that the
completion of one context will lead to the immediate execution of
another.  That means the GPU can be kept busy without the execlists
submission code rushing to submit a new request, so CPU latency
shouldn't be a concern.

This might miss some workloads that could theoretically benefit from
this optimization, since some applications are unable to keep both
ELSP ports active for a significant fraction of time even though they
are GPU-bound.  However, using single-ELSP utilization as the metric
would neglect the CPU latency-sensitivity of the execlists submission
code, which would lead to large regressions in x11perf and
jxrendermark.  For that reason this patch takes the rather
conservative approach of restricting the optimization to workloads
that effectively utilize both ELSP ports, which indicates that command
submission latency is unlikely to be an issue.

Note that this is currently only enabled for execlists submission.  It
might be beneficial to do the same thing in combination with GuC
submission, but the metric would be slightly different since we
wouldn't need to care about multiple ELSP ports being in use.

To further prevent regressions, the optimization is enabled with a
delay, which avoids degrading the performance of applications that
quickly switch back and forth between being GPU-bound and CPU-bound.
A reduced PM QoS scaling response frequency target will only be
specified if the GPU has been continuously utilized for a long enough
period of time.

v3: Assorted clean-ups (Chris).
    Improved documentation (Tvrtko).
    Fix interaction with single-ELSP preemption (Chris).
    Move overload signalling to process_csb() to reduce bias due to
    interrupt latency (Francisco).
    Rename CPU_RESPONSE_FREQUENCY to CPU_SCALING_RESPONSE (Rafael).
    Adjust heuristic parameters to avoid regressions from other v3
    governor changes (Francisco).

Signed-off-by: Francisco Jerez
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c    |   1 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  11 ++
 drivers/gpu/drm/i915/gt/intel_gt_pm.c        | 107 +++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_gt_pm.h        |   3 +
 drivers/gpu/drm/i915/gt/intel_gt_types.h     |  49 +++++++++
 drivers/gpu/drm/i915/gt/intel_lrc.c          |  17 +++
 6 files changed, 188 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index b1f8527f02c8..6b08a9ad2de1 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -517,6 +517,7 @@ void intel_engine_init_execlists(struct intel_engine_cs *engine)

 	execlists->queue_priority_hint = INT_MIN;
 	execlists->queue = RB_ROOT_CACHED;
+	atomic_set(&execlists->overload, 0);
 }

 static void cleanup_status_page(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index bf395227c99f..9bdb3958dbb7 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -275,6 +275,17 @@ struct intel_engine_execlists {
 	 */
 	u8 csb_head;

+	/**
+	 * @overload: whether at least two execlist ports are
+	 * currently submitted to the hardware, indicating that CPU
+	 * latency isn't critical in order to maintain the GPU busy.
+	 * We use that to trigger a more energy-efficient response
+	 * mode of CPU power management, since performance degradation
+	 * is unlikely under those conditions, and GPU throughput may
+	 * benefit from the increased TDP budget.
+	 */
+	atomic_t overload;
+
 	I915_SELFTEST_DECLARE(struct st_preempt_hang preempt_hang;)
 };

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index 6bdb74892a1e..0d44ef3a07ad 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -107,6 +107,102 @@ void intel_gt_pm_init_early(struct intel_gt *gt)
 	intel_wakeref_init(&gt->wakeref, gt->uncore->rpm, &wf_ops);
 }

+/**
+ * Time increment until the most immediate PM QoS scaling response
+ * frequency update.
+ *
+ * May be in the future (return value > 0) if the GPU is currently
+ * active but we haven't updated the PM QoS request to reflect a
+ * bottleneck yet.  May be in the past (return value < 0) if the GPU
+ * isn't fully utilized and we've already reset the PM QoS request to
+ * the default value.  May be zero if a PM QoS request update is due.
+ *
+ * The time increment returned by this function decreases linearly
+ * with time until it reaches either zero or a configurable limit.
+ */
+static int32_t time_to_sf_qos_update_ns(struct intel_gt *gt)
+{
+	const uint64_t t1 = ktime_get_ns();
+	const uint64_t dt1 = gt->sf_qos.delay_max_ns;
+
+	if (atomic_read_acquire(&gt->sf_qos.active_count)) {
+		const uint64_t t0 = atomic64_read(&gt->sf_qos.time_set_ns);
+
+		return min(dt1, t0 <= t1 ? 0 : t0 - t1);
+	} else {
+		const uint64_t t0 = atomic64_read(&gt->sf_qos.time_clear_ns);
+		const unsigned int shift = gt->sf_qos.delay_slope_shift;
+
+		return -(int32_t)(t1 <= t0 ? 1 :
+				  min(dt1, (t1 - t0) << shift));
+	}
+}
+
+/**
+ * Perform a delayed PM QoS scaling response frequency update.
+ */
+static void intel_gt_sf_qos_update(struct intel_gt *gt)
+{
+	const uint32_t dt = max(0, time_to_sf_qos_update_ns(gt));
+
+	timer_reduce(&gt->sf_qos.timer, jiffies + nsecs_to_jiffies(dt));
+}
+
+/**
+ * Timer that fires once the delay used to switch the PM QoS scaling
+ * response frequency request has elapsed.
+ */
+static void intel_gt_sf_qos_timeout(struct timer_list *timer)
+{
+	struct intel_gt *gt = container_of(timer, struct intel_gt,
+					   sf_qos.timer);
+	const int32_t dt = time_to_sf_qos_update_ns(gt);
+
+	if (dt == 0)
+		cpu_scaling_response_qos_update_request(
+			&gt->sf_qos.req, gt->sf_qos.target_hz);
+	else
+		cpu_scaling_response_qos_update_request(
+			&gt->sf_qos.req, PM_QOS_DEFAULT_VALUE);
+
+	if (dt > 0)
+		intel_gt_sf_qos_update(gt);
+}
+
+/**
+ * Report the beginning of a period of GPU utilization to PM.
+ *
+ * May trigger a more energy-efficient response mode in CPU PM, but
+ * only after a certain delay has elapsed so we don't have a negative
+ * impact on the CPU ramp-up latency except after the GPU has been
+ * continuously utilized for a long enough period of time.
+ */
+void intel_gt_pm_active_begin(struct intel_gt *gt)
+{
+	const uint32_t dt = abs(time_to_sf_qos_update_ns(gt));
+
+	atomic64_set(&gt->sf_qos.time_set_ns, ktime_get_ns() + dt);
+
+	if (!atomic_fetch_inc_release(&gt->sf_qos.active_count))
+		intel_gt_sf_qos_update(gt);
+}
+
+/**
+ * Report the end of a period of GPU utilization to PM.
+ *
+ * Must be called once after each call to intel_gt_pm_active_begin().
+ */
+void intel_gt_pm_active_end(struct intel_gt *gt)
+{
+	const uint32_t dt = abs(time_to_sf_qos_update_ns(gt));
+	const unsigned int shift = gt->sf_qos.delay_slope_shift;
+
+	atomic64_set(&gt->sf_qos.time_clear_ns, ktime_get_ns() - (dt >> shift));
+
+	if (!atomic_dec_return_release(&gt->sf_qos.active_count))
+		intel_gt_sf_qos_update(gt);
+}
+
 void intel_gt_pm_init(struct intel_gt *gt)
 {
 	/*
@@ -116,6 +212,14 @@ void intel_gt_pm_init(struct intel_gt *gt)
 	 */
 	intel_rc6_init(&gt->rc6);
 	intel_rps_init(&gt->rps);
+
+	cpu_scaling_response_qos_add_request(&gt->sf_qos.req,
+					     PM_QOS_DEFAULT_VALUE);
+
+	gt->sf_qos.delay_max_ns = 10000000;
+	gt->sf_qos.delay_slope_shift = 1;
+	gt->sf_qos.target_hz = 2;
+	timer_setup(&gt->sf_qos.timer, intel_gt_sf_qos_timeout, 0);
 }

 static bool reset_engines(struct intel_gt *gt)
@@ -174,6 +278,9 @@ static void gt_sanitize(struct intel_gt *gt, bool force)

 void intel_gt_pm_fini(struct intel_gt *gt)
 {
+	del_timer_sync(&gt->sf_qos.timer);
+	cpu_scaling_response_qos_remove_request(&gt->sf_qos.req);
+
 	intel_rc6_fini(&gt->rc6);
 }

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
index 60f0e2fbe55c..43f1d45fb0db 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
@@ -58,6 +58,9 @@ int intel_gt_resume(struct intel_gt *gt);
 void intel_gt_runtime_suspend(struct intel_gt *gt);
 int intel_gt_runtime_resume(struct intel_gt *gt);

+void intel_gt_pm_active_begin(struct intel_gt *gt);
+void intel_gt_pm_active_end(struct intel_gt *gt);
+
 static inline bool is_mock_gt(const struct intel_gt *gt)
 {
 	return I915_SELFTEST_ONLY(gt->awake == -ENODEV);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 96890dd12b5f..8aaeb2450d05 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 #include
 #include

@@ -97,6 +98,54 @@ struct intel_gt {
 	 * Reserved for exclusive use by the kernel.
 	 */
 	struct i915_address_space *vm;
+
+	/**
+	 * CPU response frequency QoS tracking.
+	 */
+	struct {
+		/** PM QoS request of this device. */
+		struct pm_qos_request req;
+
+		/** Timer used for delayed update of the PM QoS request. */
+		struct timer_list timer;
+
+		/** Response frequency target to use in GPU-bound conditions. */
+		uint32_t target_hz;
+
+		/**
+		 * Maximum delay before the PM QoS request is updated
+		 * after we become GPU-bound.
+		 */
+		uint32_t delay_max_ns;
+
+		/**
+		 * Exponent of delay slope used when the workload
+		 * becomes non-GPU-bound, used to provide greater
+		 * sensitivity to periods of GPU inactivity which may
+		 * indicate that the workload is latency-bound.
+		 */
+		uint32_t delay_slope_shift;
+
+		/**
+		 * Last time intel_gt_pm_active_begin() was called to
+		 * indicate that the GPU is a bottleneck.
+		 */
+		atomic64_t time_set_ns;
+
+		/**
+		 * Last time intel_gt_pm_active_end() was called to
+		 * indicate that the GPU is no longer a bottleneck.
+		 */
+		atomic64_t time_clear_ns;
+
+		/**
+		 * Number of times intel_gt_pm_active_begin() was
+		 * called without a matching intel_gt_pm_active_end().
+		 * Will be greater than zero if the GPU is currently
+		 * considered to be a bottleneck.
+		 */
+		atomic_t active_count;
+	} sf_qos;
 };

 enum intel_gt_scratch_field {
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index be5d6b71b6b0..767fa88f4d20 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2365,6 +2365,12 @@ cancel_port_requests(struct intel_engine_execlists * const execlists)
 	smp_wmb(); /* complete the seqlock for execlists_active() */
 	WRITE_ONCE(execlists->active, execlists->inflight);
+
+	if (atomic_xchg(&execlists->overload, 0)) {
+		struct intel_engine_cs *engine =
+			container_of(execlists, typeof(*engine), execlists);
+		intel_gt_pm_active_end(engine->gt);
+	}
 }

 static inline void
@@ -2533,12 +2539,23 @@ static void process_csb(struct intel_engine_cs *engine)

 			WRITE_ONCE(execlists->active, execlists->inflight);
 			WRITE_ONCE(execlists->pending[0], NULL);
+
+			if (execlists->inflight[1]) {
+				if (!atomic_xchg(&execlists->overload, 1))
+					intel_gt_pm_active_begin(engine->gt);
+			} else {
+				if (atomic_xchg(&execlists->overload, 0))
+					intel_gt_pm_active_end(engine->gt);
+			}
 		} else {
 			GEM_BUG_ON(!*execlists->active);

 			/* port0 completed, advanced to port1 */
 			trace_ports(execlists, "completed", execlists->active);

+			if (atomic_xchg(&execlists->overload, 0))
+				intel_gt_pm_active_end(engine->gt);
+
 			/*
 			 * We rely on the hardware being strongly
 			 * ordered, that the breadcrumb write is

From patchwork Tue Apr 28 03:22:50 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Francisco Jerez
X-Patchwork-Id: 212212
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-4.0 required=3.0 tests=DKIM_SIGNED,DKIM_VALID,
 DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,
 MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS,
 UNWANTED_LANGUAGE_BODY, URIBL_BLOCKED autolearn=ham autolearn_force=no
 version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 52868C83006 for ; Tue, 28 Apr 2020 03:27:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 32030206D6 for ; Tue, 28 Apr 2020 03:27:52 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=riseup.net header.i=@riseup.net header.b="R0UJBEHE" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726318AbgD1D1v (ORCPT ); Mon, 27 Apr 2020 23:27:51 -0400 Received: from mx1.riseup.net ([198.252.153.129]:48912 "EHLO mx1.riseup.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726047AbgD1D1v (ORCPT ); Mon, 27 Apr 2020 23:27:51 -0400 Received: from bell.riseup.net (unknown [10.0.1.178]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client CN "*.riseup.net", Issuer "Sectigo RSA Domain Validation Secure Server CA" (not verified)) by mx1.riseup.net (Postfix) with ESMTPS id 49B6Xk5sdXzFcxH; Mon, 27 Apr 2020 20:27:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=riseup.net; s=squak; t=1588044470; bh=fyUj+NdMXFbPK+znZDmYjsevx14GaZlzNI5x1mEyTMk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=R0UJBEHEzcOV37W9NEj/eu/VeNbgoDZGPx8HabVbjiiNUpq1XPuWu2BGLPm8E61L8 IzEfJa+2bPuZqwR/BBvM2YeifezguQEKmbj6HUHw0BK0dwacaxs2woQQj5Ta2KaPdh NPZstrEpAdDnUxUdLJFAFB3nRwQGfhG8KMJYYbj0= X-Riseup-User-ID: 4352315D553855E8DA00A4C141746641FBDF3A8945C526146A0F7AC13450415C Received: from [127.0.0.1] (localhost [127.0.0.1]) by bell.riseup.net (Postfix) with ESMTPSA id 49B6Xk3fkqzJqbv; Mon, 27 Apr 2020 20:27:50 -0700 (PDT) From: Francisco Jerez To: "Rafael J. Wysocki" , "Pandruvada\, Srinivas" Cc: linux-pm@vger.kernel.org, intel-gfx@lists.freedesktop.org, chris.p.wilson@intel.com, "Vivi\, Rodrigo" , Peter Zijlstra Subject: [PATCHv2.99 03/11] OPTIONAL: drm/i915: Expose PM QoS control parameters via debugfs. 
Date: Mon, 27 Apr 2020 20:22:50 -0700
Message-Id: <20200428032258.2518-4-currojerez@riseup.net>
In-Reply-To: <20200428032258.2518-1-currojerez@riseup.net>
References: <20200428032258.2518-1-currojerez@riseup.net>
MIME-Version: 1.0
Sender: linux-pm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pm@vger.kernel.org

v3: Rename CPU_RESPONSE_FREQUENCY to CPU_SCALING_RESPONSE (Rafael).

Signed-off-by: Francisco Jerez
---
 drivers/gpu/drm/i915/i915_debugfs.c | 69 +++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index aa35a59f1c7d..16a45fd2c376 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1247,6 +1247,72 @@ static int i915_llc(struct seq_file *m, void *data)
 	return 0;
 }

+static int
+i915_sf_qos_delay_max_ns_set(void *data, u64 val)
+{
+	struct drm_i915_private *dev_priv = data;
+
+	WRITE_ONCE(dev_priv->gt.sf_qos.delay_max_ns, val);
+	return 0;
+}
+
+static int
+i915_sf_qos_delay_max_ns_get(void *data, u64 *val)
+{
+	struct drm_i915_private *dev_priv = data;
+
+	*val = READ_ONCE(dev_priv->gt.sf_qos.delay_max_ns);
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_sf_qos_delay_max_ns_fops,
+			i915_sf_qos_delay_max_ns_get,
+			i915_sf_qos_delay_max_ns_set, "%llu\n");
+
+static int
+i915_sf_qos_delay_slope_shift_set(void *data, u64 val)
+{
+	struct drm_i915_private *dev_priv = data;
+
+	WRITE_ONCE(dev_priv->gt.sf_qos.delay_slope_shift, val);
+	return 0;
+}
+
+static int
+i915_sf_qos_delay_slope_shift_get(void *data, u64 *val)
+{
+	struct drm_i915_private *dev_priv = data;
+
+	*val = READ_ONCE(dev_priv->gt.sf_qos.delay_slope_shift);
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_sf_qos_delay_slope_shift_fops,
+			i915_sf_qos_delay_slope_shift_get,
+			i915_sf_qos_delay_slope_shift_set, "%llu\n");
+
+static int
+i915_sf_qos_target_hz_set(void *data, u64 val)
+{
+	struct drm_i915_private *dev_priv = data;
+
+	WRITE_ONCE(dev_priv->gt.sf_qos.target_hz, val);
+	return 0;
+}
+
+static int
+i915_sf_qos_target_hz_get(void *data, u64 *val)
+{
+	struct drm_i915_private *dev_priv = data;
+
+	*val = READ_ONCE(dev_priv->gt.sf_qos.target_hz);
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_sf_qos_target_hz_fops,
+			i915_sf_qos_target_hz_get,
+			i915_sf_qos_target_hz_set, "%llu\n");
+
 static int i915_runtime_pm_status(struct seq_file *m, void *unused)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
@@ -1882,6 +1948,9 @@ static const struct i915_debugfs_files {
 	{"i915_error_state", &i915_error_state_fops},
 	{"i915_gpu_info", &i915_gpu_info_fops},
 #endif
+	{"i915_sf_qos_delay_max_ns", &i915_sf_qos_delay_max_ns_fops},
+	{"i915_sf_qos_delay_slope_shift", &i915_sf_qos_delay_slope_shift_fops},
+	{"i915_sf_qos_target_hz", &i915_sf_qos_target_hz_fops}
 };

 void i915_debugfs_register(struct drm_i915_private *dev_priv)

From patchwork Tue Apr 28 03:22:53 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Francisco Jerez
X-Patchwork-Id: 212210
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-6.8 required=3.0 tests=DKIM_SIGNED,DKIM_VALID,
 DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,
 MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED
 autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 499DFC83009 for ;
 Tue, 28 Apr 2020 03:27:53 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by mail.kernel.org (Postfix) with ESMTP id 2012021D7A for ;
 Tue, 28 Apr 2020 03:27:53 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key)
 header.d=riseup.net header.i=@riseup.net header.b="QqZvh+Cz"
Received: (majordomo@vger.kernel.org) by
vger.kernel.org via listexpand id S1726335AbgD1D1w (ORCPT ); Mon, 27 Apr 2020 23:27:52 -0400 Received: from mx1.riseup.net ([198.252.153.129]:48912 "EHLO mx1.riseup.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726312AbgD1D1v (ORCPT ); Mon, 27 Apr 2020 23:27:51 -0400 Received: from bell.riseup.net (unknown [10.0.1.178]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (Client CN "*.riseup.net", Issuer "Sectigo RSA Domain Validation Secure Server CA" (not verified)) by mx1.riseup.net (Postfix) with ESMTPS id 49B6Xl3gxvzFfHC; Mon, 27 Apr 2020 20:27:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=riseup.net; s=squak; t=1588044471; bh=sGuXA/+5WIjTQ087osfcebpmWG4DqRI3IB3ZQpHAqWY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=QqZvh+CzSLblaHHShm99+PvMpQ8+JZCqOMTUHTAgLsF7Xs9oDMgRehNSEnHOirThS +5osZwRoFKBtKTk0GC3RlxBwGSMshPRBkO7yr/gti+jNBsQuqHgsv9rJ51Gxd63SWz iEi71SnS29/2WSKgkL4IYCNf+VcgJGt6eWNbPY0c= X-Riseup-User-ID: 114744E557F2DD3C88A19DFBDF3999A048F0F63B1DAC11BE228326A912507F03 Received: from [127.0.0.1] (localhost [127.0.0.1]) by bell.riseup.net (Postfix) with ESMTPSA id 49B6Xl1ms9zJqbk; Mon, 27 Apr 2020 20:27:51 -0700 (PDT) From: Francisco Jerez To: "Rafael J. Wysocki" , "Pandruvada\, Srinivas" Cc: linux-pm@vger.kernel.org, intel-gfx@lists.freedesktop.org, chris.p.wilson@intel.com, "Vivi\, Rodrigo" , Peter Zijlstra Subject: [PATCHv2.99 06/11] cpufreq: intel_pstate: Call intel_pstate_set_update_util_hook() once from the setpolicy hook. 
Date: Mon, 27 Apr 2020 20:22:53 -0700
Message-Id: <20200428032258.2518-7-currojerez@riseup.net>
In-Reply-To: <20200428032258.2518-1-currojerez@riseup.net>
References: <20200428032258.2518-1-currojerez@riseup.net>
MIME-Version: 1.0
Sender: linux-pm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pm@vger.kernel.org

And let it figure out whether an update_util hook is needed, and what
the appropriate function pointer is based on the CPUFREQ policy of the
current CPU.

Signed-off-by: Francisco Jerez
---
 drivers/cpufreq/intel_pstate.c | 22 +++++++---------------
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 49401cfe9858..fd7eee57c05c 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -2016,10 +2016,11 @@ static void intel_pstate_set_update_util_hook(unsigned int cpu_num)
 {
 	struct cpudata *cpu = all_cpu_data[cpu_num];

-	if (hwp_active && !hwp_boost)
-		return;
-
 	if (cpu->update_util_set)
+		intel_pstate_clear_update_util_hook(cpu_num);
+
+	if (cpu->policy == CPUFREQ_POLICY_PERFORMANCE ||
+	    (hwp_active && !hwp_boost))
 		return;

 	/* Prevent intel_pstate_update_util() from using stale data. */
@@ -2117,27 +2118,18 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy)

 	intel_pstate_update_perf_limits(cpu, policy->min, policy->max);

+	intel_pstate_set_update_util_hook(policy->cpu);
+
 	if (cpu->policy == CPUFREQ_POLICY_PERFORMANCE) {
 		/*
 		 * NOHZ_FULL CPUs need this as the governor callback may not
 		 * be invoked on them.
 		 */
-		intel_pstate_clear_update_util_hook(policy->cpu);
 		intel_pstate_max_within_limits(cpu);
-	} else {
-		intel_pstate_set_update_util_hook(policy->cpu);
 	}

-	if (hwp_active) {
-		/*
-		 * When hwp_boost was active before and dynamically it
-		 * was turned off, in that case we need to clear the
-		 * update util hook.
-		 */
-		if (!hwp_boost)
-			intel_pstate_clear_update_util_hook(policy->cpu);
+	if (hwp_active)
 		intel_pstate_hwp_set(policy->cpu);
-	}

 	mutex_unlock(&intel_pstate_limits_lock);

From patchwork Tue Apr 28 03:22:56 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Francisco Jerez
X-Patchwork-Id: 212209
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-6.8 required=3.0 tests=DKIM_SIGNED,DKIM_VALID,
 DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,
 MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED
 autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 810A6C8300F for ;
 Tue, 28 Apr 2020 03:27:54 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by mail.kernel.org (Postfix) with ESMTP id 601B2206E2 for ;
 Tue, 28 Apr 2020 03:27:54 +0000 (UTC)
Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key)
 header.d=riseup.net header.i=@riseup.net header.b="hHdNdXfF"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1726336AbgD1D1x (ORCPT ); Mon, 27 Apr 2020 23:27:53 -0400
Received: from mx1.riseup.net ([198.252.153.129]:48974 "EHLO mx1.riseup.net"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1726312AbgD1D1w (ORCPT ); Mon, 27 Apr 2020 23:27:52 -0400
Received: from bell.riseup.net (unknown [10.0.1.178])
 (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
 (Client CN "*.riseup.net", Issuer "Sectigo RSA Domain Validation Secure
 Server CA" (not verified)) by mx1.riseup.net (Postfix) with ESMTPS
 id 49B6Xm20WwzFfHL; Mon, 27 Apr 2020 20:27:52 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=riseup.net; s=squak;
 t=1588044472; bh=GhQO0oy9LAPGQy0LObLUFpBtrz+I8TRlVVbokzyOGiQ=;
h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=hHdNdXfFQPqHMPO6pZ8Zy0D91XhkD/xqb7DjqPCcinGlXFjgwGkbrartYENIRFbY+ qiodW6KctBCHq72UHPLwFNNI6QXkIhTxxP1+TSk8QabrzAIS00nk82Hk+/xgVaCeU5 dm73IV/F9ktyUnmarMgViWAI5QhfSlE0dKZ0N5bg= X-Riseup-User-ID: 5D73CF9300B288F55B9E4B03BA5BB836FBAF64C9D4888C1CA669A2E6FC068A36 Received: from [127.0.0.1] (localhost [127.0.0.1]) by bell.riseup.net (Postfix) with ESMTPSA id 49B6Xm0922zJqbw; Mon, 27 Apr 2020 20:27:52 -0700 (PDT) From: Francisco Jerez To: "Rafael J. Wysocki" , "Pandruvada\, Srinivas" Cc: linux-pm@vger.kernel.org, intel-gfx@lists.freedesktop.org, chris.p.wilson@intel.com, "Vivi\, Rodrigo" , Peter Zijlstra Subject: [PATCHv2.99 09/11] cpufreq: intel_pstate: Enable VLP controller based on ACPI FADT profile and CPUID. Date: Mon, 27 Apr 2020 20:22:56 -0700 Message-Id: <20200428032258.2518-10-currojerez@riseup.net> In-Reply-To: <20200428032258.2518-1-currojerez@riseup.net> References: <20200428032258.2518-1-currojerez@riseup.net> MIME-Version: 1.0 Sender: linux-pm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org For the moment the VLP controller is only enabled on ICL platforms other than server FADT profiles in order to reduce the validation effort of the initial submission. It should work on any other processors that support HWP though (and soon enough on non-HWP too): In order to override the default behavior (e.g. to test on other platforms) the VLP controller can be forcefully enabled or disabled by selecting the "adaptive" or "powersave" CPUFREQ governors respectively via sysfs. v2: Handle HWP VLP controller. v3: Define generic CPUFREQ policy to control VLP governor (Rafael). 

Signed-off-by: Francisco Jerez
---
 drivers/cpufreq/intel_pstate.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 0a315f681c43..2458a821195f 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -396,6 +396,7 @@ static int hwp_active __read_mostly;
 static int hwp_mode_bdw __read_mostly;
 static bool per_cpu_limits __read_mostly;
 static bool hwp_boost __read_mostly;
+static bool vlp __read_mostly;

 static struct cpufreq_driver *intel_pstate_driver __read_mostly;

@@ -2724,7 +2725,8 @@ static int intel_pstate_cpu_init(struct cpufreq_policy *policy)
 	 * Set the policy to powersave to provide a valid fallback value in case
 	 * the default cpufreq governor is neither powersave nor performance.
 	 */
-	policy->policy = CPUFREQ_POLICY_POWERSAVE;
+	policy->policy = (vlp ? CPUFREQ_POLICY_ADAPTIVE :
+			  CPUFREQ_POLICY_POWERSAVE);

 	return 0;
 }
@@ -3209,6 +3211,16 @@ static const struct x86_cpu_id hwp_support_ids[] __initconst = {
 	{}
 };

+#define X86_MATCH_VLP(model)						\
+	X86_MATCH_VENDOR_FAM_MODEL_FEATURE(INTEL, 6, INTEL_FAM6_##model, \
+					   X86_FEATURE_APERFMPERF, 0)
+
+static const struct x86_cpu_id vlp_default_ids[] __initconst = {
+	X86_MATCH_VLP(ICELAKE),
+	X86_MATCH_VLP(ICELAKE_L),
+	{}
+};
+
 static int __init intel_pstate_init(void)
 {
 	const struct x86_cpu_id *id;
@@ -3247,6 +3259,10 @@ static int __init intel_pstate_init(void)
 	default_driver = &intel_cpufreq;

 hwp_cpu_matched:
+	/* Enable VLP controller by default. */
+	vlp = !intel_pstate_acpi_pm_profile_server() &&
+	      x86_match_cpu(vlp_default_ids) && hwp_active;
+
 	/*
 	 * The Intel pstate driver will be ignored if the platform
 	 * firmware has its own power management modes.
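
Reviewer note (not part of the patch): the default-enable condition in
the last hunk above can be read as a pure predicate over three inputs.
The following is a minimal user-space sketch of that reading.  The
struct, the enum and both function names are hypothetical stand-ins
introduced only for illustration; the booleans model the results of
intel_pstate_acpi_pm_profile_server(), x86_match_cpu(vlp_default_ids)
and hwp_active, and the enum values model CPUFREQ_POLICY_ADAPTIVE and
CPUFREQ_POLICY_POWERSAVE without reproducing their real definitions.

```c
#include <stdbool.h>

/* Hypothetical model of the three kernel-side checks. */
struct vlp_platform {
	bool server_fadt_profile;	/* ACPI FADT reports a server profile */
	bool icl_cpuid_match;		/* CPU model is in the default-enable list */
	bool hwp_active;		/* HWP is in use on this system */
};

/* Hypothetical stand-ins for the two CPUFREQ policies involved. */
enum demo_policy {
	DEMO_POLICY_POWERSAVE,
	DEMO_POLICY_ADAPTIVE,
};

/* VLP is on by default only for non-server ICL parts running with HWP. */
bool vlp_enabled_by_default(const struct vlp_platform *p)
{
	return !p->server_fadt_profile && p->icl_cpuid_match && p->hwp_active;
}

/* Fallback policy chosen at CPU init time, mirroring the ternary above. */
enum demo_policy default_policy(bool vlp)
{
	return vlp ? DEMO_POLICY_ADAPTIVE : DEMO_POLICY_POWERSAVE;
}
```

Under this model a server-profile Icelake system, or a client system
without HWP, keeps the powersave fallback unless the governor is
changed explicitly via sysfs as described in the commit message.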
From patchwork Tue Apr 28 03:22:58 2020
X-Patchwork-Submitter: Francisco Jerez
X-Patchwork-Id: 212208
From: Francisco Jerez
To: "Rafael J. Wysocki", "Pandruvada\, Srinivas"
Cc: linux-pm@vger.kernel.org, intel-gfx@lists.freedesktop.org,
 chris.p.wilson@intel.com, "Vivi\, Rodrigo", Peter Zijlstra,
 Fengguang Wu, Julia Lawall
Subject: [PATCHv2.99 11/11] OPTIONAL: cpufreq: intel_pstate: Expose VLP
 controller parameters via debugfs.
Date: Mon, 27 Apr 2020 20:22:58 -0700
Message-Id: <20200428032258.2518-12-currojerez@riseup.net>
In-Reply-To: <20200428032258.2518-1-currojerez@riseup.net>
References: <20200428032258.2518-1-currojerez@riseup.net>

This is not required for the controller to work, but it has proven
very useful for debugging and for testing alternative heuristic
parameters, which may offer a better trade-off between energy
efficiency and latency. A warning is printed on every parameter
update, which should taint the kernel so that the non-standard
calibration of the heuristic is obvious in bug reports.

v2: Use DEFINE_DEBUGFS_ATTRIBUTE rather than DEFINE_SIMPLE_ATTRIBUTE
    for debugfs files (Julia). Add realtime statistic threshold and
    averaging frequency parameters.

v3: Define generic CPUFREQ policy to control VLP governor (Rafael).

Signed-off-by: Francisco Jerez
Signed-off-by: Fengguang Wu
Signed-off-by: Julia Lawall
---
 drivers/cpufreq/intel_pstate.c | 86 ++++++++++++++++++++++++++++++++++
 1 file changed, 86 insertions(+)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index dd86505d7855..ab0334a99039 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -1031,6 +1031,88 @@ static void intel_pstate_update_limits(unsigned int cpu)
 	mutex_unlock(&intel_pstate_driver_lock);
 }
 
+/************************** debugfs begin ************************/
+static void intel_pstate_reset_vlp(struct cpudata *cpu);
+
+static int vlp_param_set(void *data, u64 val)
+{
+	unsigned int cpu;
+
+	*(u32 *)data = val;
+	for_each_possible_cpu(cpu) {
+		if (all_cpu_data[cpu])
+			intel_pstate_reset_vlp(all_cpu_data[cpu]);
+	}
+
+	WARN_ONCE(1, "Unsupported P-state VLP parameter update via debugging interface");
+
+	return 0;
+}
+
+static int vlp_param_get(void *data, u64 *val)
+{
+	*val = *(u32 *)data;
+	return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(fops_vlp_param, vlp_param_get, vlp_param_set,
+			 "%llu\n");
+
+static struct dentry *debugfs_parent;
+
+struct vlp_param {
+	char *name;
+	void *value;
+	struct dentry *dentry;
+};
+
+static struct vlp_param vlp_files[] = {
+	{"vlp_sample_interval_ms", &vlp_params.sample_interval_ms, },
+	{"vlp_setpoint_0_pml", &vlp_params.setpoint_0_pml, },
+	{"vlp_setpoint_aggr_pml", &vlp_params.setpoint_aggr_pml, },
+	{"vlp_avg_hz", &vlp_params.avg_hz, },
+	{"vlp_realtime_gain_pml", &vlp_params.realtime_gain_pml, },
+	{"vlp_debug", &vlp_params.debug, },
+	{NULL, NULL, }
+};
+
+static void intel_pstate_debug_expose_params(void)
+{
+	int i;
+
+	debugfs_parent = debugfs_create_dir("pstate_snb", NULL);
+	if (IS_ERR_OR_NULL(debugfs_parent))
+		return;
+
+	for (i = 0; vlp_files[i].name; i++) {
+		struct dentry *dentry;
+
+		dentry = debugfs_create_file_unsafe(vlp_files[i].name, 0660,
+						    debugfs_parent,
+						    vlp_files[i].value,
+						    &fops_vlp_param);
+		if (!IS_ERR(dentry))
+			vlp_files[i].dentry = dentry;
+	}
+}
+
+static void intel_pstate_debug_hide_params(void)
+{
+	int i;
+
+	if (IS_ERR_OR_NULL(debugfs_parent))
+		return;
+
+	for (i = 0; vlp_files[i].name; i++) {
+		debugfs_remove(vlp_files[i].dentry);
+		vlp_files[i].dentry = NULL;
+	}
+
+	debugfs_remove(debugfs_parent);
+	debugfs_parent = NULL;
+}
+
+/************************** debugfs end ************************/
+
 /************************** sysfs begin ************************/
 #define show_one(file_name, object) \
 static ssize_t show_##file_name \
@@ -2977,6 +3059,8 @@ static int intel_pstate_register_driver(struct cpufreq_driver *driver)
 
 	global.min_perf_pct = min_perf_pct_min();
 
+	intel_pstate_debug_expose_params();
+
 	return 0;
 }
 
@@ -2985,6 +3069,8 @@ static int intel_pstate_unregister_driver(void)
 	if (hwp_active)
 		return -EBUSY;
 
+	intel_pstate_debug_hide_params();
+
 	cpufreq_unregister_driver(intel_pstate_driver);
 	intel_pstate_driver_cleanup();
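
For testing, the parameter files created by this patch could be
driven from userspace with something like the sketch below. It is not
part of the patch. The helper names vlp_param_read() and
vlp_param_write() are hypothetical, and it assumes debugfs is mounted
at the conventional /sys/kernel/debug location, so that the files
appear under /sys/kernel/debug/pstate_snb/. The values are formatted
as plain decimal to match the "%llu\n" format used by fops_vlp_param.

```c
/* Userspace sketch (not kernel code): read and write one VLP tunable. */
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: parse the decimal value exposed by a parameter file. */
static int vlp_param_read(const char *path, uint32_t *val)
{
	unsigned long long tmp;
	FILE *f;
	int ok;

	assert(path && val);

	f = fopen(path, "r");
	if (!f)
		return -1;

	ok = (fscanf(f, "%llu", &tmp) == 1);
	fclose(f);

	/* The kernel side stores the parameters as u32. */
	if (!ok || tmp > UINT32_MAX)
		return -1;

	*val = (uint32_t)tmp;
	return 0;
}

/* Hypothetical helper: write a new decimal value to a parameter file. */
static int vlp_param_write(const char *path, uint32_t val)
{
	FILE *f;
	int ok;

	assert(path);

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = (fprintf(f, "%" PRIu32 "\n", val) > 0);

	/* fclose() flushes the buffer, so a rejected write surfaces here. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A call such as
vlp_param_write("/sys/kernel/debug/pstate_snb/vlp_sample_interval_ms", 10)
would exercise vlp_param_set() above (the value 10 is only an
illustration). Per the patch, any such write resets the VLP state of
every CPU with allocated cpudata and triggers the WARN_ONCE. The files
are created with mode 0660, so this normally requires root.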