summaryrefslogtreecommitdiffstats
path: root/drivers/cpufreq/acpi-cpufreq.c
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2016-05-16 19:17:22 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2016-05-16 19:17:22 -0700
commitd57d39431924d1628ac9b93a2de7f806fc80680a (patch)
tree8d630b5b22333a6368beb3531f20ae5c5eb72229 /drivers/cpufreq/acpi-cpufreq.c
parent3e21e5dda4907ecb21a124517ab0eb1d176e5231 (diff)
parent27c4a1c5ef61b6d4a9aeae68b24419b4319b97ed (diff)
downloadlinux-d57d39431924d1628ac9b93a2de7f806fc80680a.tar.bz2
Merge tag 'pm-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki: "The majority of changes go into the cpufreq subsystem this time. To me, quite obviously, the biggest ticket item is the new "schedutil" governor. Interestingly enough, it's the first new cpufreq governor since the beginning of the git era (except for some out-of-the-tree ones). There are two main differences between it and the existing governors. First, it uses the information provided by the scheduler directly for making its decisions, so it doesn't have to track anything by itself. Second, it can invoke drivers (supporting that feature) to adjust CPU performance right away without having to spawn work items to be executed in process context or similar. Currently, the acpi-cpufreq driver is the only one supporting that mode of operation, but then it is used on a large number of systems. The "schedutil" governor as included here is very simple and mostly regarded as a foundation for future work on the integration of the scheduler with CPU power management (in fact, there is work in progress on top of it already). Nevertheless it works and the preliminary results obtained with it are encouraging. There also is some consolidation of CPU frequency management for ARM platforms that can add their machine IDs the the new stub dt-platdev driver now and that will take care of creating the requisite platform device for cpufreq-dt, so it is not necessary to do that in platform code any more. Several ARM platforms are switched over to using this generic mechanism. In addition to that, the intel_pstate driver is now going to respect CPU frequency limits set by the platform firmware (or a BMC) and provided via the ACPI _PPC object. The devfreq subsystem is getting a new "passive" governor for SoCs subsystems that will depend on somebody else to manage their voltage rails and its support for Samsung Exynos SoCs is consolidated. The rest is support for new hardware (Intel Broxton support in intel_idle for one example), bug fixes, optimizations and cleanups in a number of places. Specifics: - New cpufreq "schedutil" governor (making decisions based on CPU utilization information provided by the scheduler and capable of switching CPU frequencies right away if the underlying driver supports that) and support for fast frequency switching in the acpi-cpufreq driver (Rafael Wysocki) - Consolidation of CPU frequency management on ARM platforms allowing them to get rid of some platform-specific boilerplate code if they are going to use the cpufreq-dt driver (Viresh Kumar, Finley Xiao, Marc Gonzalez) - Support for ACPI _PPC and CPU frequency limits in the intel_pstate driver (Srinivas Pandruvada) - Fixes and cleanups in the cpufreq core and generic governor code (Rafael Wysocki, Sai Gurrappadi) - intel_pstate driver optimizations and cleanups (Rafael Wysocki, Philippe Longepe, Chen Yu, Joe Perches) - cpufreq powernv driver fixes and cleanups (Akshay Adiga, Shilpasri Bhat) - cpufreq qoriq driver fixes and cleanups (Jia Hongtao) - ACPI cpufreq driver cleanups (Viresh Kumar) - Assorted cpufreq driver updates (Ashwin Chaugule, Geliang Tang, Javier Martinez Canillas, Paul Gortmaker, Sudeep Holla) - Assorted cpufreq fixes and cleanups (Joe Perches, Arnd Bergmann) - Fixes and cleanups in the OPP (Operating Performance Points) framework, mostly related to OPP sharing, and reorganization of OF-dependent code in it (Viresh Kumar, Arnd Bergmann, Sudeep Holla) - New "passive" governor for devfreq (for SoC subsystems that will rely on someone else for the management of their power resources) and consolidation of devfreq support for Exynos platforms, coding style and typo fixes for devfreq (Chanwoo Choi, MyungJoo Ham) - PM core fixes and cleanups, mostly to make it work better with the generic power domains (genpd) framework, and updates for that framework (Ulf Hansson, Thierry Reding, Colin Ian King) - Intel Broxton support for the intel_idle driver (Len Brown) - cpuidle core optimization and fix (Daniel Lezcano, Dave Gerlach) - ARM cpuidle cleanups (Jisheng Zhang) - Intel Kabylake support for the RAPL power capping driver (Jacob Pan) - AVS (Adaptive Voltage Switching) rockchip-io driver update (Heiko Stuebner) - Updates for the cpupower tool (Arjun Sreedharan, Colin Ian King, Mattia Dongili, Thomas Renninger)" * tag 'pm-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (112 commits) intel_pstate: Clean up get_target_pstate_use_performance() intel_pstate: Use sample.core_avg_perf in get_avg_pstate() intel_pstate: Clarify average performance computation intel_pstate: Avoid unnecessary synchronize_sched() during initialization cpufreq: schedutil: Make default depend on CONFIG_SMP cpufreq: powernv: del_timer_sync when global and local pstate are equal cpufreq: powernv: Move smp_call_function_any() out of irq safe block intel_pstate: Clean up intel_pstate_get() cpufreq: schedutil: Make it depend on CONFIG_SMP cpufreq: governor: Fix handling of special cases in dbs_update() PM / OPP: Move CONFIG_OF dependent code in a separate file cpufreq: intel_pstate: Ignore _PPC processing under HWP cpufreq: arm_big_little: use generic OPP functions for {init, free}_opp_table PM / OPP: add non-OF versions of dev_pm_opp_{cpumask_, }remove_table cpufreq: tango: Use generic platdev driver PM / OPP: pass cpumask by reference cpufreq: Fix GOV_LIMITS handling for the userspace governor cpupower: fix potential memory leak PM / devfreq: style/typo fixes PM / devfreq: exynos: Add the detailed correlation for Exynos5422 bus ..
Diffstat (limited to 'drivers/cpufreq/acpi-cpufreq.c')
-rw-r--r--drivers/cpufreq/acpi-cpufreq.c129
1 files changed, 85 insertions, 44 deletions
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index fb5712141040..32a15052f363 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -25,6 +25,8 @@
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/init.h>
@@ -50,8 +52,6 @@ MODULE_AUTHOR("Paul Diefenbaugh, Dominik Brodowski");
MODULE_DESCRIPTION("ACPI Processor P-States Driver");
MODULE_LICENSE("GPL");
-#define PFX "acpi-cpufreq: "
-
enum {
UNDEFINED_CAPABLE = 0,
SYSTEM_INTEL_MSR_CAPABLE,
@@ -65,7 +65,6 @@ enum {
#define MSR_K7_HWCR_CPB_DIS (1ULL << 25)
struct acpi_cpufreq_data {
- struct cpufreq_frequency_table *freq_table;
unsigned int resume;
unsigned int cpu_feature;
unsigned int acpi_perf_cpu;
@@ -200,8 +199,9 @@ static int check_amd_hwpstate_cpu(unsigned int cpuid)
return cpu_has(cpu, X86_FEATURE_HW_PSTATE);
}
-static unsigned extract_io(u32 value, struct acpi_cpufreq_data *data)
+static unsigned extract_io(struct cpufreq_policy *policy, u32 value)
{
+ struct acpi_cpufreq_data *data = policy->driver_data;
struct acpi_processor_performance *perf;
int i;
@@ -209,13 +209,14 @@ static unsigned extract_io(u32 value, struct acpi_cpufreq_data *data)
for (i = 0; i < perf->state_count; i++) {
if (value == perf->states[i].status)
- return data->freq_table[i].frequency;
+ return policy->freq_table[i].frequency;
}
return 0;
}
-static unsigned extract_msr(u32 msr, struct acpi_cpufreq_data *data)
+static unsigned extract_msr(struct cpufreq_policy *policy, u32 msr)
{
+ struct acpi_cpufreq_data *data = policy->driver_data;
struct cpufreq_frequency_table *pos;
struct acpi_processor_performance *perf;
@@ -226,20 +227,22 @@ static unsigned extract_msr(u32 msr, struct acpi_cpufreq_data *data)
perf = to_perf_data(data);
- cpufreq_for_each_entry(pos, data->freq_table)
+ cpufreq_for_each_entry(pos, policy->freq_table)
if (msr == perf->states[pos->driver_data].status)
return pos->frequency;
- return data->freq_table[0].frequency;
+ return policy->freq_table[0].frequency;
}
-static unsigned extract_freq(u32 val, struct acpi_cpufreq_data *data)
+static unsigned extract_freq(struct cpufreq_policy *policy, u32 val)
{
+ struct acpi_cpufreq_data *data = policy->driver_data;
+
switch (data->cpu_feature) {
case SYSTEM_INTEL_MSR_CAPABLE:
case SYSTEM_AMD_MSR_CAPABLE:
- return extract_msr(val, data);
+ return extract_msr(policy, val);
case SYSTEM_IO_CAPABLE:
- return extract_io(val, data);
+ return extract_io(policy, val);
default:
return 0;
}
@@ -374,11 +377,11 @@ static unsigned int get_cur_freq_on_cpu(unsigned int cpu)
return 0;
data = policy->driver_data;
- if (unlikely(!data || !data->freq_table))
+ if (unlikely(!data || !policy->freq_table))
return 0;
- cached_freq = data->freq_table[to_perf_data(data)->state].frequency;
- freq = extract_freq(get_cur_val(cpumask_of(cpu), data), data);
+ cached_freq = policy->freq_table[to_perf_data(data)->state].frequency;
+ freq = extract_freq(policy, get_cur_val(cpumask_of(cpu), data));
if (freq != cached_freq) {
/*
* The dreaded BIOS frequency change behind our back.
@@ -392,14 +395,15 @@ static unsigned int get_cur_freq_on_cpu(unsigned int cpu)
return freq;
}
-static unsigned int check_freqs(const struct cpumask *mask, unsigned int freq,
- struct acpi_cpufreq_data *data)
+static unsigned int check_freqs(struct cpufreq_policy *policy,
+ const struct cpumask *mask, unsigned int freq)
{
+ struct acpi_cpufreq_data *data = policy->driver_data;
unsigned int cur_freq;
unsigned int i;
for (i = 0; i < 100; i++) {
- cur_freq = extract_freq(get_cur_val(mask, data), data);
+ cur_freq = extract_freq(policy, get_cur_val(mask, data));
if (cur_freq == freq)
return 1;
udelay(10);
@@ -416,12 +420,12 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy,
unsigned int next_perf_state = 0; /* Index into perf table */
int result = 0;
- if (unlikely(data == NULL || data->freq_table == NULL)) {
+ if (unlikely(!data)) {
return -ENODEV;
}
perf = to_perf_data(data);
- next_perf_state = data->freq_table[index].driver_data;
+ next_perf_state = policy->freq_table[index].driver_data;
if (perf->state == next_perf_state) {
if (unlikely(data->resume)) {
pr_debug("Called after resume, resetting to P%d\n",
@@ -444,8 +448,8 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy,
drv_write(data, mask, perf->states[next_perf_state].control);
if (acpi_pstate_strict) {
- if (!check_freqs(mask, data->freq_table[index].frequency,
- data)) {
+ if (!check_freqs(policy, mask,
+ policy->freq_table[index].frequency)) {
pr_debug("acpi_cpufreq_target failed (%d)\n",
policy->cpu);
result = -EAGAIN;
@@ -458,6 +462,43 @@ static int acpi_cpufreq_target(struct cpufreq_policy *policy,
return result;
}
+unsigned int acpi_cpufreq_fast_switch(struct cpufreq_policy *policy,
+ unsigned int target_freq)
+{
+ struct acpi_cpufreq_data *data = policy->driver_data;
+ struct acpi_processor_performance *perf;
+ struct cpufreq_frequency_table *entry;
+ unsigned int next_perf_state, next_freq, freq;
+
+ /*
+ * Find the closest frequency above target_freq.
+ *
+ * The table is sorted in the reverse order with respect to the
+ * frequency and all of the entries are valid (see the initialization).
+ */
+ entry = policy->freq_table;
+ do {
+ entry++;
+ freq = entry->frequency;
+ } while (freq >= target_freq && freq != CPUFREQ_TABLE_END);
+ entry--;
+ next_freq = entry->frequency;
+ next_perf_state = entry->driver_data;
+
+ perf = to_perf_data(data);
+ if (perf->state == next_perf_state) {
+ if (unlikely(data->resume))
+ data->resume = 0;
+ else
+ return next_freq;
+ }
+
+ data->cpu_freq_write(&perf->control_register,
+ perf->states[next_perf_state].control);
+ perf->state = next_perf_state;
+ return next_freq;
+}
+
static unsigned long
acpi_cpufreq_guess_freq(struct acpi_cpufreq_data *data, unsigned int cpu)
{
@@ -611,10 +652,7 @@ static int acpi_cpufreq_blacklist(struct cpuinfo_x86 *c)
if ((c->x86 == 15) &&
(c->x86_model == 6) &&
(c->x86_mask == 8)) {
- printk(KERN_INFO "acpi-cpufreq: Intel(R) "
- "Xeon(R) 7100 Errata AL30, processors may "
- "lock up on frequency changes: disabling "
- "acpi-cpufreq.\n");
+ pr_info("Intel(R) Xeon(R) 7100 Errata AL30, processors may lock up on frequency changes: disabling acpi-cpufreq\n");
return -ENODEV;
}
}
@@ -631,6 +669,7 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
unsigned int result = 0;
struct cpuinfo_x86 *c = &cpu_data(policy->cpu);
struct acpi_processor_performance *perf;
+ struct cpufreq_frequency_table *freq_table;
#ifdef CONFIG_SMP
static int blacklisted;
#endif
@@ -690,7 +729,7 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
cpumask_copy(data->freqdomain_cpus,
topology_sibling_cpumask(cpu));
policy->shared_type = CPUFREQ_SHARED_TYPE_HW;
- pr_info_once(PFX "overriding BIOS provided _PSD data\n");
+ pr_info_once("overriding BIOS provided _PSD data\n");
}
#endif
@@ -742,9 +781,9 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
goto err_unreg;
}
- data->freq_table = kzalloc(sizeof(*data->freq_table) *
+ freq_table = kzalloc(sizeof(*freq_table) *
(perf->state_count+1), GFP_KERNEL);
- if (!data->freq_table) {
+ if (!freq_table) {
result = -ENOMEM;
goto err_unreg;
}
@@ -762,30 +801,29 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
if (perf->control_register.space_id == ACPI_ADR_SPACE_FIXED_HARDWARE &&
policy->cpuinfo.transition_latency > 20 * 1000) {
policy->cpuinfo.transition_latency = 20 * 1000;
- printk_once(KERN_INFO
- "P-state transition latency capped at 20 uS\n");
+ pr_info_once("P-state transition latency capped at 20 uS\n");
}
/* table init */
for (i = 0; i < perf->state_count; i++) {
if (i > 0 && perf->states[i].core_frequency >=
- data->freq_table[valid_states-1].frequency / 1000)
+ freq_table[valid_states-1].frequency / 1000)
continue;
- data->freq_table[valid_states].driver_data = i;
- data->freq_table[valid_states].frequency =
+ freq_table[valid_states].driver_data = i;
+ freq_table[valid_states].frequency =
perf->states[i].core_frequency * 1000;
valid_states++;
}
- data->freq_table[valid_states].frequency = CPUFREQ_TABLE_END;
+ freq_table[valid_states].frequency = CPUFREQ_TABLE_END;
perf->state = 0;
- result = cpufreq_table_validate_and_show(policy, data->freq_table);
+ result = cpufreq_table_validate_and_show(policy, freq_table);
if (result)
goto err_freqfree;
if (perf->states[0].core_frequency * 1000 != policy->cpuinfo.max_freq)
- printk(KERN_WARNING FW_WARN "P-state 0 is not max freq\n");
+ pr_warn(FW_WARN "P-state 0 is not max freq\n");
switch (perf->control_register.space_id) {
case ACPI_ADR_SPACE_SYSTEM_IO:
@@ -821,10 +859,13 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
*/
data->resume = 1;
+ policy->fast_switch_possible = !acpi_pstate_strict &&
+ !(policy_is_shared(policy) && policy->shared_type != CPUFREQ_SHARED_TYPE_ANY);
+
return result;
err_freqfree:
- kfree(data->freq_table);
+ kfree(freq_table);
err_unreg:
acpi_processor_unregister_performance(cpu);
err_free_mask:
@@ -842,13 +883,12 @@ static int acpi_cpufreq_cpu_exit(struct cpufreq_policy *policy)
pr_debug("acpi_cpufreq_cpu_exit\n");
- if (data) {
- policy->driver_data = NULL;
- acpi_processor_unregister_performance(data->acpi_perf_cpu);
- free_cpumask_var(data->freqdomain_cpus);
- kfree(data->freq_table);
- kfree(data);
- }
+ policy->fast_switch_possible = false;
+ policy->driver_data = NULL;
+ acpi_processor_unregister_performance(data->acpi_perf_cpu);
+ free_cpumask_var(data->freqdomain_cpus);
+ kfree(policy->freq_table);
+ kfree(data);
return 0;
}
@@ -876,6 +916,7 @@ static struct freq_attr *acpi_cpufreq_attr[] = {
static struct cpufreq_driver acpi_cpufreq_driver = {
.verify = cpufreq_generic_frequency_table_verify,
.target_index = acpi_cpufreq_target,
+ .fast_switch = acpi_cpufreq_fast_switch,
.bios_limit = acpi_processor_get_bios_limit,
.init = acpi_cpufreq_cpu_init,
.exit = acpi_cpufreq_cpu_exit,