diff mbox series

[01/14] cpufreq/amd-pstate: Show a warning when a CPU fails to setup

Message ID 20250206215659.3350066-2-superm1@kernel.org
State New
Headers show
Series amd-pstate cleanups | expand

Commit Message

Mario Limonciello Feb. 6, 2025, 9:56 p.m. UTC
From: Mario Limonciello <mario.limonciello@amd.com>

I came across a system that MSR_AMD_CPPC_CAP1 for some CPUs isn't
populated.  This is an unexpected behavior that is most likely a
BIOS bug. In the event it happens I'd like users to report bugs
to properly root cause and get this fixed.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Gautham R. Shenoy Feb. 10, 2025, 1:50 p.m. UTC | #1
On Mon, Feb 10, 2025 at 05:29:24PM +0530, Dhananjay Ugwekar wrote:
> On 2/7/2025 3:26 AM, Mario Limonciello wrote:
> > From: Mario Limonciello <mario.limonciello@amd.com>
> > 
> > I came across a system that MSR_AMD_CPPC_CAP1 for some CPUs isn't
> > populated.  This is an unexpected behavior that is most likely a
> > BIOS bug. In the event it happens I'd like users to report bugs
> > to properly root cause and get this fixed.
> 
> I'm okay with this patch, but I see a similar pr_debug in caller cpufreq_online(),
> so not sure if this is strictly necessary.
> 
> 1402                 /*
> 1403                  * Call driver. From then on the cpufreq must be able
> 1404                  * to accept all calls to ->verify and ->setpolicy for this CPU.
> 1405                  */
> 1406                 ret = cpufreq_driver->init(policy);
> 1407                 if (ret) {
> 1408                         pr_debug("%s: %d: initialization failed\n", __func__,
> 1409                                  __LINE__);
> 1410                         goto out_free_policy;
> 1411                  
>

Well, the pr_debug() doesn't always get printed unless the loglevel is
set to debug, which is usually done by the developers and not the end
users.

However you have a point that since the code jumps to free_cpudata1 on
failures from amd_pstate_init_perf(), amd_pstate_init_freq(),
amd_pstate_init_boost_support(), freq_qos_add_request(). So the
pr_warn() doesn't indicate that the failure is due to
MSR_AMD_CPPC_CAP1 not being populated.
Mario Limonciello Feb. 10, 2025, 3:13 p.m. UTC | #2
On 2/10/2025 07:50, Gautham R. Shenoy wrote:
> On Mon, Feb 10, 2025 at 05:29:24PM +0530, Dhananjay Ugwekar wrote:
>> On 2/7/2025 3:26 AM, Mario Limonciello wrote:
>>> From: Mario Limonciello <mario.limonciello@amd.com>
>>>
>>> I came across a system that MSR_AMD_CPPC_CAP1 for some CPUs isn't
>>> populated.  This is an unexpected behavior that is most likely a
>>> BIOS bug. In the event it happens I'd like users to report bugs
>>> to properly root cause and get this fixed.
>>
>> I'm okay with this patch, but I see a similar pr_debug in caller cpufreq_online(),
>> so not sure if this is strictly necessary.
>>
>> 1402                 /*
>> 1403                  * Call driver. From then on the cpufreq must be able
>> 1404                  * to accept all calls to ->verify and ->setpolicy for this CPU.
>> 1405                  */
>> 1406                 ret = cpufreq_driver->init(policy);
>> 1407                 if (ret) {
>> 1408                         pr_debug("%s: %d: initialization failed\n", __func__,
>> 1409                                  __LINE__);
>> 1410                         goto out_free_policy;
>> 1411
>>
> 
> Well, the pr_debug() doesn't always get printed unless the loglevel is
> set to debug, which is usually done by the developers and not the end
> users.
> 
> However you have a point that since the code jumps to free_cpudata1 on
> failures from amd_pstate_init_perf(), amd_pstate_init_freq(),
> amd_pstate_init_boost_support(), freq_qos_add_request(). So the
> pr_warn() doesn't indicate that the failure is due to
> MSR_AMD_CPPC_CAP1 not being populated.
> 

Right; my point is that without the warning no one knows there is a problem.

I don't expect we can anticipate all the potential causes, and I want 
anyone who hits this to raise a bug and we can ask them to turn on 
dynamic debug / ftrace and then triage it.
diff mbox series

Patch

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index f425fb7ec77d7..573643654e8d6 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -1034,6 +1034,7 @@  static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
 free_cpudata2:
 	freq_qos_remove_request(&cpudata->req[0]);
 free_cpudata1:
+	pr_warn("Failed to initialize CPU %d: %d\n", policy->cpu, ret);
 	kfree(cpudata);
 	return ret;
 }
@@ -1527,6 +1528,7 @@  static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
 	return 0;
 
 free_cpudata1:
+	pr_warn("Failed to initialize CPU %d: %d\n", policy->cpu, ret);
 	kfree(cpudata);
 	return ret;
 }