Message ID | 20250206215659.3350066-2-superm1@kernel.org |
---|---|
State | New |
Headers | show |
Series | amd-pstate cleanups | expand |
On Mon, Feb 10, 2025 at 05:29:24PM +0530, Dhananjay Ugwekar wrote: > On 2/7/2025 3:26 AM, Mario Limonciello wrote: > > From: Mario Limonciello <mario.limonciello@amd.com> > > > > I came across a system that MSR_AMD_CPPC_CAP1 for some CPUs isn't > > populated. This is an unexpected behavior that is most likely a > > BIOS bug. In the event it happens I'd like users to report bugs > > to properly root cause and get this fixed. > > I'm okay with this patch, but I see a similar pr_debug in caller cpufreq_online(), > so not sure if this is strictly necessary. > > 1402 /* > 1403 * Call driver. From then on the cpufreq must be able > 1404 * to accept all calls to ->verify and ->setpolicy for this CPU. > 1405 */ > 1406 ret = cpufreq_driver->init(policy); > 1407 if (ret) { > 1408 pr_debug("%s: %d: initialization failed\n", __func__, > 1409 __LINE__); > 1410 goto out_free_policy; > 1411 > Well, the pr_debug() doesn't always get printed unless the loglevel is set to debug, which is usually done by the developers and not the end users. However you have a point that since the code jumps to free_cpudata1 on failures from amd_pstate_init_perf(), amd_pstate_init_freq(), amd_pstate_init_boost_support(), freq_qos_add_request(). So the pr_warn() doesn't indicate that the failure is due to MSR_AMD_CPPC_CAP1 not being populated.
On 2/10/2025 07:50, Gautham R. Shenoy wrote: > On Mon, Feb 10, 2025 at 05:29:24PM +0530, Dhananjay Ugwekar wrote: >> On 2/7/2025 3:26 AM, Mario Limonciello wrote: >>> From: Mario Limonciello <mario.limonciello@amd.com> >>> >>> I came across a system that MSR_AMD_CPPC_CAP1 for some CPUs isn't >>> populated. This is an unexpected behavior that is most likely a >>> BIOS bug. In the event it happens I'd like users to report bugs >>> to properly root cause and get this fixed. >> >> I'm okay with this patch, but I see a similar pr_debug in caller cpufreq_online(), >> so not sure if this is strictly necessary. >> >> 1402 /* >> 1403 * Call driver. From then on the cpufreq must be able >> 1404 * to accept all calls to ->verify and ->setpolicy for this CPU. >> 1405 */ >> 1406 ret = cpufreq_driver->init(policy); >> 1407 if (ret) { >> 1408 pr_debug("%s: %d: initialization failed\n", __func__, >> 1409 __LINE__); >> 1410 goto out_free_policy; >> 1411 >> > > Well, the pr_debug() doesn't always get printed unless the loglevel is > set to debug, which is usually done by the developers and not the end > users. > > However you have a point that since the code jumps to free_cpudata1 on > failures from amd_pstate_init_perf(), amd_pstate_init_freq(), > amd_pstate_init_boost_support(), freq_qos_add_request(). So the > pr_warn() doesn't indicate that the failure is due to > MSR_AMD_CPPC_CAP1 not being populated. > Right; my point is that without the warning no one knows there is a problem. I don't expect we can anticipate all the potential causes, and I want anyone who hits this to raise a bug and we can ask them to turn on dynamic debug / ftrace and then triage it.
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c index f425fb7ec77d7..573643654e8d6 100644 --- a/drivers/cpufreq/amd-pstate.c +++ b/drivers/cpufreq/amd-pstate.c @@ -1034,6 +1034,7 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy) free_cpudata2: freq_qos_remove_request(&cpudata->req[0]); free_cpudata1: + pr_warn("Failed to initialize CPU %d: %d\n", policy->cpu, ret); kfree(cpudata); return ret; } @@ -1527,6 +1528,7 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy) return 0; free_cpudata1: + pr_warn("Failed to initialize CPU %d: %d\n", policy->cpu, ret); kfree(cpudata); return ret; }