summaryrefslogtreecommitdiffstats
path: root/drivers/cpufreq
AgeCommit message (Collapse)AuthorFilesLines
2018-06-08cpufreq: intel_pstate: enable boost for Skylake XeonSrinivas Pandruvada1-0/+10
Enable HWP boost on Skylake server and workstations. Reported-by: Mel Gorman <mgorman@techsingularity.net> Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-06-06cpufreq: intel_pstate: New sysfs entry to control HWP boostSrinivas Pandruvada1-0/+30
A new attribute is added to intel_pstate sysfs to enable/disable HWP dynamic performance boost. Reported-by: Mel Gorman <mgorman@techsingularity.net> Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-06-06cpufreq: intel_pstate: HWP boost performance on IO wakeupSrinivas Pandruvada1-0/+39
This change uses SCHED_CPUFREQ_IOWAIT flag to boost HWP performance. Since SCHED_CPUFREQ_IOWAIT flag is set frequently, we don't start boosting steps unless we see two consecutive flags in two ticks. This avoids boosting due to IO because of regular system activities. To avoid synchronization issues, the actual processing of the flag is done on the local CPU callback. Reported-by: Mel Gorman <mgorman@techsingularity.net> Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-06-06cpufreq: intel_pstate: Add HWP boost utility and sched util hooksSrinivas Pandruvada1-3/+97
Added two utility functions to HWP boost up gradually and boost down to the default cached HWP request values. Boost up: Boost up updates HWP request minimum value in steps. This minimum value can reach upto at HWP request maximum values depends on how frequently, this boost up function is called. At max, boost up will take three steps to reach the maximum, depending on the current HWP request levels and HWP capabilities. For example, if the current settings are: If P0 (Turbo max) = P1 (Guaranteed max) = min No boost at all. If P0 (Turbo max) > P1 (Guaranteed max) = min Should result in one level boost only for P0. If P0 (Turbo max) = P1 (Guaranteed max) > min Should result in two level boost: (min + p1)/2 and P1. If P0 (Turbo max) > P1 (Guaranteed max) > min Should result in three level boost: (min + p1)/2, P1 and P0. We don't set any level between P0 and P1 as there is no guarantee that they will be honored. Boost down: After the system is idle for hold time of 3ms, the HWP request is reset to the default value from HWP init or user modified one via sysfs. Caching of HWP Request and Capabilities Store the HWP request value last set using MSR_HWP_REQUEST and read MSR_HWP_CAPABILITIES. This avoid reading of MSRs in the boost utility functions. These boost utility functions calculated limits are based on the latest HWP request value, which can be modified by setpolicy() callback. So if user space modifies the minimum perf value, that will be accounted for every time the boost up is called. There will be case when there can be contention with the user modified minimum perf, in that case user value will gain precedence. For example just before HWP_REQUEST MSR is updated from setpolicy() callback, the boost up function is called via scheduler tick callback. Here the cached MSR value is already the latest and limits are updated based on the latest user limits, but on return the MSR write callback called from setpolicy() callback will update the HWP_REQUEST value. This will be used till next time the boost up function is called. In addition add a variable to control HWP dynamic boosting. When HWP dynamic boost is active then set the HWP specific update util hook. The contents in the utility hooks will be filled in the subsequent patches. Reported-by: Mel Gorman <mgorman@techsingularity.net> Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-06-06cpufreq: ti-cpufreq: Use devres managed API in probe()Suman Anna1-5/+2
The ti_cpufreq_probe() function uses regular kzalloc to allocate the ti_cpufreq_data structure and kfree for freeing this memory on failures. Simplify this code by using the devres managed API. Signed-off-by: Suman Anna <s-anna@ti.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-06-06cpufreq: ti-cpufreq: Fix an incorrect error return valueSuman Anna1-1/+1
Commit 05829d9431df (cpufreq: ti-cpufreq: kfree opp_data when failure) has fixed a memory leak in the failure path, however the patch returned a positive value on get_cpu_device() failure instead of the previous negative value. Fix this incorrect error return value properly. Fixes: 05829d9431df (cpufreq: ti-cpufreq: kfree opp_data when failure) Cc: 4.14+ <stable@vger.kernel.org> # v4.14+ Signed-off-by: Suman Anna <s-anna@ti.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-06-06cpufreq: ACPI: make function acpi_cpufreq_fast_switch() staticColin Ian King1-2/+2
The acpi_cpufreq_fast_switch() function is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warning: drivers/cpufreq/acpi-cpufreq.c:468:14: warning: symbol 'acpi_cpufreq_fast_switch' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-06-06cpufreq: kryo: allow building as a loadable moduleArnd Bergmann1-1/+1
Building the kryo cpufreq driver while QCOM_SMEM is a loadable module results in a link error: drivers/cpufreq/qcom-cpufreq-kryo.o: In function `qcom_cpufreq_kryo_probe': qcom-cpufreq-kryo.c:(.text+0xbc): undefined reference to `qcom_smem_get' The problem is that Kconfig ignores interprets the dependency as met when the dependent symbol is a 'bool' one. By making it 'tristate', it will be forced to be a module here, which builds successfully. Fixes: 46e2856b8e18 (cpufreq: Add Kryo CPU scaling driver) Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-06-04Merge branches 'pm-cpufreq-sched' and 'pm-cpuidle'Rafael J. Wysocki1-1/+1
* pm-cpufreq-sched: cpufreq: schedutil: Avoid missing updates for one-CPU policies schedutil: Allow cpufreq requests to be made even when kthread kicked cpufreq: Rename cpufreq_can_do_remote_dvfs() cpufreq: schedutil: Cleanup and document iowait boost cpufreq: schedutil: Fix iowait boost reset cpufreq: schedutil: Don't set next_freq to UINT_MAX Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily" * pm-cpuidle: cpuidle: governors: Consolidate PM QoS handling cpuidle: governors: Drop redundant checks related to PM QoS
2018-05-30cpufreq: Add Kryo CPU scaling driverIlia Lin4-0/+227
In Certain QCOM SoCs like apq8096 and msm8996 that have KRYO processors, the CPU frequency subset and voltage value of each OPP varies based on the silicon variant in use. Qualcomm Process Voltage Scaling Tables defines the voltage and frequency value based on the msm-id in SMEM and speedbin blown in the efuse combination. The qcom-cpufreq-kryo driver reads the msm-id and efuse value from the SoC to provide the OPP framework with required information. This is used to determine the voltage and frequency value for each OPP of operating-points-v2 table when it is parsed by the OPP framework. Signed-off-by: Ilia Lin <ilialin@codeaurora.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Tested-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-30cpufreq: Use static SRCU initializerSebastian Andrzej Siewior1-12/+1
Use the static SRCU initializer for `cpufreq_transition_notifier_list'. This avoids the init_cpufreq_transition_notifier_list() initcall. Its only purpose is to initialize the SRCU notifier once during boot and set another variable which is used as an indicator whether the init was perfromed before cpufreq_register_notifier() was used. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-30cpufreq: Fix new policy initialization during limits updates via sysfsTao Wang1-0/+2
If the policy limits are updated via cpufreq_update_policy() and subsequently via sysfs, the limits stored in user_policy may be set incorrectly. For example, if both min and max are set via sysfs to the maximum available frequency, user_policy.min and user_policy.max will also be the maximum. If a policy notifier triggered by cpufreq_update_policy() lowers both the min and the max at this point, that change is not reflected by the user_policy limits, so if the max is updated again via sysfs to the same lower value, then user_policy.max will be lower than user_policy.min which shouldn't happen. In particular, if one of the policy CPUs is then taken offline and back online, cpufreq_set_policy() will fail for it due to a failing limits check. To prevent that from happening, initialize the min and max fields of the new_policy object to the ones stored in user_policy that were previously set via sysfs. Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Subject & changelog ] Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-23cpufreq: Rename cpufreq_can_do_remote_dvfs()Viresh Kumar1-1/+1
This routine checks if the CPU running this code belongs to the policy of the target CPU or if not, can it do remote DVFS for it remotely. But the current name of it implies as if it is only about doing remote updates. Rename it to make it more relevant. Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-21cpufreq: tegra20: Wrap cpufreq into platform driverDmitry Osipenko1-59/+86
Currently tegra20-cpufreq kernel module isn't getting autoloaded because there is no device associated with the module, this is one of two patches that resolves the module autoloading. This patch adds a module alias that will associate the tegra20-cpufreq kernel module with the platform device, other patch will instantiate the actual platform device. And now it makes sense to wrap cpufreq driver into a platform driver for consistency. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-21cpufreq: tegra20: Allow cpufreq driver to be built as loadable moduleDmitry Osipenko1-1/+1
Nothing prevents Tegra20 CPUFreq module to be unloaded, hence allow it to be built as a non-builtin kernel module. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-21cpufreq: tegra20: Check if this is Tegra20 machineDmitry Osipenko1-0/+4
Don't even try to request the clocks during of module initialization on non-Tegra20 machines (this is the case for a multi-platform kernel) for consistency. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-21cpufreq: tegra20: Remove unneeded variable initializationDmitry Osipenko1-1/+1
Remove unneeded variable initialization solely for consistency. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-21cpufreq: tegra20: Remove unnecessary parenthesesDmitry Osipenko1-1/+1
Remove unnecessary parentheses as suggested by the checkpatch script. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Thierry Reding <treding@nvidia.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-21cpufreq: tegra20: Remove unneeded check in tegra_cpu_initDmitry Osipenko1-5/+0
Remove checking of the CPU number for consistency as it won't ever fail unless there is a severe bug in the cpufreq core. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-21cpufreq: tegra20: Release clocks properlyDmitry Osipenko1-5/+26
Properly put requested clocks in the module init/exit code. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-21cpufreq: tegra20: Remove EMC clock usageDmitry Osipenko1-22/+0
The EMC driver has been gone 4 years ago, since the commit a7cbe92cef27 ("ARM: tegra: remove tegra EMC scaling driver"). Remove the EMC clock usage as it does nothing. We may consider re-implementing the EMC scaling later, probably using PM Memory Bandwidth QoS API. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-21cpufreq: tegra20: Clean up included headersDmitry Osipenko1-8/+4
Remove unused/unneeded headers and sort them in the alphabet order. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Thierry Reding <treding@nvidia.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-21cpufreq: tegra20: Clean up whitespaces in the codeDmitry Osipenko1-2/+1
Remove unneeded blank line and replace whitespaces with a tab in the code for consistency. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-21cpufreq: tegra20: Change module descriptionDmitry Osipenko1-1/+1
Change module description to be in line with the other Tegra drivers, just for consistency. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-18Merge back cpufreq material for v4.18.Rafael J. Wysocki8-52/+178
2018-05-15Revert "cpufreq: rcar: Add support for R8A7795 SoC"Simon Horman1-1/+0
This reverts commit 034def597bb73cbf29ffade7d8aec8408af8c743. This is no longer needed since the following commit and to the best of my knowledge is not relied on by any upstream DTS: edeec420de24 (cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2) Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-15Revert "cpufreq: dt: Add r8a7796 support to to use generic cpufreq driver"Simon Horman1-1/+0
This reverts commit bea2ebca6b917e46d0c585f416f1326fdf41e69b. This is no longer needed since the following commit and to the best of my knowledge is not relied on by any upstream DTS: edeec420de24 (cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2) Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-14cpufreq: intel_pstate: allow trace in passive modeDoug Smythies1-2/+44
Allow use of the trace_pstate_sample trace function when the intel_pstate driver is in passive mode. Since the core_busy and scaled_busy fields are not used, and it might be desirable to know which path through the driver was used, either intel_cpufreq_target or intel_cpufreq_fast_switch, re-task the core_busy field as a flag indicator. The user can then use the intel_pstate_tracer.py utility to summarize and plot the trace. Note: The core_busy feild still goes by that name in include/trace/events/power.h and within the intel_pstate_tracer.py script and csv file headers, but it is graphed as "performance", and called core_avg_perf now in the intel_pstate driver. Sometimes, in passive mode, the driver is not called for many tens or even hundreds of seconds. The user needs to understand, and not be confused by, this limitation. Signed-off-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-14cpufreq: armada-37xx: driver relies on cpufreq-dtMiquel Raynal1-1/+1
Armada-37xx driver registers a cpufreq-dt driver. Not having CONFIG_CPUFREQ_DT selected leads to a silent abort during the probe. Prevent that situation by having the former depending on the latter. Fixes: 92ce45fb875d7 (cpufreq: Add DVFS support for Armada 37xx) Cc: 4.16+ <stable@vger.kernel.org> # 4.16+ Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-13cpufreq: optimize cpufreq_notify_transition()Viresh Kumar1-31/+32
cpufreq_notify_transition() calls __cpufreq_notify_transition() for each CPU of a policy. There is a lot of code in __cpufreq_notify_transition() though which isn't required to be executed for each CPU, like checking about disabled cpufreq or irqs, adjusting jiffies, updating cpufreq stats and some debug print messages. This commit merges __cpufreq_notify_transition() into cpufreq_notify_transition() and modifies cpufreq_notify_transition() to execute minimum amount of code for each CPU. Also fix the kerneldoc for cpufreq_notify_transition() while at it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-10cpufreq: s3c2440: fix spelling mistake: "divsiors" -> "divisors"Colin Ian King1-1/+1
Trivial fix to spelling mistake in s3c_freq_dbg debug message text. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-10cpufreq: speedstep: fix speedstep_detect_processor()'s return typeLuc Van Oostenryck1-1/+1
speedstep_detect_processor() is declared as returing an 'enum speedstep_processor' but use an 'int' in its definition. Fix this by using 'enum speedstep_processor' in its definition too. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-10cpufreq: add suspend/resume support in Armada 37xx DVFS driverMiquel Raynal1-2/+65
Add suspend/resume hooks in Armada 37xx DVFS driver to handle S2RAM operations. As there is currently no 'driver' structure, create one to store both the regmap and the register values during suspend operation. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-10cpufreq: armada: Free resources on error pathsViresh Kumar1-11/+22
The resources weren't freed on failures, free them properly. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-10cpufreq: dt: Allow platform specific suspend/resume callbacksViresh Kumar2-2/+13
Platforms may need to implement platform specific suspend/resume hooks. Update cpufreq-dt driver's platform data to contain those for such platforms. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-30cpufreq / CPPC: Set platform specific transition_delay_usPrashanth Prakash1-2/+44
Add support to specify platform specific transition_delay_us instead of using the transition delay derived from PCC. With commit 3d41386d556d (cpufreq: CPPC: Use transition_delay_us depending transition_latency) we are setting transition_delay_us directly and not applying the LATENCY_MULTIPLIER. Because of that, on Qualcomm Centriq we can end up with a very high rate of frequency change requests when using the schedutil governor (default rate_limit_us=10 compared to an earlier value of 10000). The PCC subspace describes the rate at which the platform can accept commands on the CPPC's PCC channel. This includes read and write command on the PCC channel that can be used for reasons other than frequency transitions. Moreover the same PCC subspace can be used by multiple freq domains and deriving transition_delay_us from it as we do now can be sub-optimal. Moreover if a platform does not use PCC for desired_perf register then there is no way to compute the transition latency or the delay_us. CPPC does not have a standard defined mechanism to get the transition rate or the latency at the moment. Given the above limitations, it is simpler to have a platform specific transition_delay_us and rely on PCC derived value only if a platform specific value is not available. Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org> Cc: 4.14+ <stable@vger.kernel.org> # 4.14+ Fixes: 3d41386d556d (cpufreq: CPPC: Use transition_delay_us depending transition_latency) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-28Merge tag 'powerpc-4.17-4' of ↵Linus Torvalds1-3/+11
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "A bunch of fixes, mostly for existing code and going to stable. Our memory hot-unplug path wasn't flushing the cache before removing memory. That is a problem now that we are doing memory hotplug on bare metal. Three fixes for the NPU code that supports devices connected via NVLink (ie. GPUs). The main one tweaks the TLB flush algorithm to avoid soft lockups for large flushes. A fix for our memory error handling where we would loop infinitely, returning back to the bad access and hard lockup the CPU. Fixes for the OPAL RTC driver, which wasn't handling some error cases correctly. A fix for a hardlockup in the powernv cpufreq driver. And finally two fixes to our smp_send_stop(), required due to a recent change to use it on shutdown. Thanks to: Alistair Popple, Balbir Singh, Laurentiu Tudor, Mahesh Salgaonkar, Mark Hairgrove, Nicholas Piggin, Rashmica Gupta, Shilpasri G Bhat" * tag 'powerpc-4.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/kvm/booke: Fix altivec related build break powerpc: Fix deadlock with multiple calls to smp_send_stop cpufreq: powernv: Fix hardlockup due to synchronous smp_call in timer interrupt powerpc: Fix smp_send_stop NMI IPI handling rtc: opal: Fix OPAL RTC driver OPAL_BUSY loops powerpc/mce: Fix a bug where mce loops on memory UE. powerpc/powernv/npu: Do a PID GPU TLB flush when invalidating a large address range powerpc/powernv/npu: Prevent overwriting of pnv_npu2_init_contex() callback parameters powerpc/powernv/npu: Add lock to prevent race in concurrent context init/destroy powerpc/powernv/memtrace: Let the arch hotunplug code flush cache powerpc/mm: Flush cache on memory hot(un)plug
2018-04-27cpufreq: powernv: Fix hardlockup due to synchronous smp_call in timer interruptShilpasri G Bhat1-3/+11
gpstate_timer_handler() uses synchronous smp_call to set the pstate on the requested core. This causes the below hard lockup: smp_call_function_single+0x110/0x180 (unreliable) smp_call_function_any+0x180/0x250 gpstate_timer_handler+0x1e8/0x580 call_timer_fn+0x50/0x1c0 expire_timers+0x138/0x1f0 run_timer_softirq+0x1e8/0x270 __do_softirq+0x158/0x3e4 irq_exit+0xe8/0x120 timer_interrupt+0x9c/0xe0 decrementer_common+0x114/0x120 -- interrupt: 901 at doorbell_global_ipi+0x34/0x50 LR = arch_send_call_function_ipi_mask+0x120/0x130 arch_send_call_function_ipi_mask+0x4c/0x130 smp_call_function_many+0x340/0x450 pmdp_invalidate+0x98/0xe0 change_huge_pmd+0xe0/0x270 change_protection_range+0xb88/0xe40 mprotect_fixup+0x140/0x340 SyS_mprotect+0x1b4/0x350 system_call+0x58/0x6c One way to avoid this is removing the smp-call. We can ensure that the timer always runs on one of the policy-cpus. If the timer gets migrated to a cpu outside the policy then re-queue it back on the policy->cpus. This way we can get rid of the smp-call which was being used to set the pstate on the policy->cpus. Fixes: 7bc54b652f13 ("timers, cpufreq/powernv: Initialize the gpstate timer as pinned") Cc: stable@vger.kernel.org # v4.8+ Reported-by: Nicholas Piggin <npiggin@gmail.com> Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com> Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-24cpufreq: brcmstb-avs-cpufreq: remove development debug supportMarkus Mayer2-332/+1
This debug code was helpful while developing the driver, but it isn't being used for anything anymore. Signed-off-by: Markus Mayer <mmayer@broadcom.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-10cpufreq: Drop cpufreq_table_validate_and_show()Viresh Kumar1-14/+0
This isn't used anymore. Remove the helper and update documentation accordingly. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-10cpufreq: SCMI: Don't validate the frequency table twiceViresh Kumar1-9/+1
The cpufreq core is already validating the CPU frequency table after calling the ->init() callback of the cpufreq drivers and the drivers don't need to do the same anymore. Though they need to set the policy->freq_table field directly from the ->init() callback now. Stop validating the frequency table from SCMI driver. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-10cpufreq: CPPC: Initialize shared perf capabilities of CPUsShunyong Yang1-2/+12
When multiple CPUs are related in one cpufreq policy, the first online CPU will be chosen by default to handle cpufreq operations. Let's take cpu0 and cpu1 as an example. When cpu0 is offline, policy->cpu will be shifted to cpu1. cpu1's perf capabilities should be initialized. Otherwise, perf capabilities are 0s and speed change can not take effect. This patch copies perf capabilities of the first online CPU to other shared CPUs when policy shared type is CPUFREQ_SHARED_TYPE_ANY. Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Shunyong Yang <shunyong.yang@hxt-semitech.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-10cpufreq: armada-37xx: Fix clock leakGregory CLEMENT1-0/+2
There was no clk_put() balancing the clk_get(). This commit fixes it. Fixes: 92ce45fb875d (cpufreq: Add DVFS support for Armada 37xx) Cc: 4.16+ <stable@vger.kernel.org> # 4.16+ Reported-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-10cpufreq: CPPC: Don't set transition_latencyViresh Kumar1-1/+0
Now that the driver has started to set transition_delay_us directly, there is no need to set transition_latency along with it, as it is not used by the cpufreq core. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-10cpufreq: ti-cpufreq: Use builtin_platform_driver()Viresh Kumar1-1/+1
This driver can not be built as a module and there is no need of the platform driver unregister part. Use builtin_platform_driver() instead of module_platform_driver(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-10cpufreq: intel_pstate: Do not include debugfs.hRafael J. Wysocki1-1/+0
The intel_pstate driver doesn't use debugfs any more, so drop linux/debugfs.h from the list of included headers in it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2018-04-05Merge tag 'armsoc-drivers' of ↵Linus Torvalds3-0/+277
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC driver updates from Arnd Bergmann: "The main addition this time around is the new ARM "SCMI" framework, which is the latest in a series of standards coming from ARM to do power management in a platform independent way. This has been through many review cycles, and it relies on a rather interesting way of using the mailbox subsystem, but in the end I agreed that Sudeep's version was the best we could do after all. Other changes include: - the ARM CCN driver is moved out of drivers/bus into drivers/perf, which makes more sense. Similarly, the performance monitoring portion of the CCI driver are moved the same way and cleaned up a little more. - a series of updates to the SCPI framework - support for the Mediatek mt7623a SoC in drivers/soc - support for additional NVIDIA Tegra hardware in drivers/soc - a new reset driver for Socionext Uniphier - lesser bug fixes in drivers/soc, drivers/tee, drivers/memory, and drivers/firmware and drivers/reset across platforms" * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (87 commits) reset: uniphier: add ethernet reset control support for PXs3 reset: stm32mp1: Enable stm32mp1 reset driver dt-bindings: reset: add STM32MP1 resets reset: uniphier: add Pro4/Pro5/PXs2 audio systems reset control reset: imx7: add 'depends on HAS_IOMEM' to fix unmet dependency reset: modify the way reset lookup works for board files reset: add support for non-DT systems clk: scmi: use devm_of_clk_add_hw_provider() API and drop scmi_clocks_remove firmware: arm_scmi: prevent accessing rate_discrete uninitialized hwmon: (scmi) return -EINVAL when sensor information is unavailable amlogic: meson-gx-socinfo: Update soc ids soc/tegra: pmc: Use the new reset APIs to manage reset controllers soc: mediatek: update power domain data of MT2712 dt-bindings: soc: update MT2712 power dt-bindings cpufreq: scmi: add thermal dependency soc: mediatek: fix the mistaken pointer accessed when subdomains are added soc: mediatek: add SCPSYS power domain driver for MediaTek MT7623A SoC soc: mediatek: avoid hardcoded value with bus_prot_mask dt-bindings: soc: add header files required for MT7623A SCPSYS dt-binding dt-bindings: soc: add SCPSYS binding for MT7623 and MT7623A SoC ...
2018-04-03Merge tag 'pm-4.17-rc1' of ↵Linus Torvalds34-168/+123
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These update the cpuidle poll state definition to reduce excessive energy usage related to it, add new CPU ID to the RAPL power capping driver, update the ACPI system suspend code to handle some special cases better, extend the PM core's device links code slightly, add new sysfs attribute for better suspend-to-idle diagnostics and easier hibernation handling, update power management tools and clean up cpufreq quite a bit. Specifics: - Modify the cpuidle poll state implementation to prevent CPUs from staying in the loop in there for excessive times (Rafael Wysocki). - Add Intel Cannon Lake chips support to the RAPL power capping driver (Joe Konno). - Add reference counting to the device links handling code in the PM core (Lukas Wunner). - Avoid reconfiguring GPEs on suspend-to-idle in the ACPI system suspend code (Rafael Wysocki). - Allow devices to be put into deeper low-power states via ACPI if both _SxD and _SxW are missing (Daniel Drake). - Reorganize the core ACPI suspend-to-idle wakeup code to avoid a keyboard wakeup issue on Asus UX331UA (Chris Chiu). - Prevent the PCMCIA library code from aborting suspend-to-idle due to noirq suspend failures resulting from incorrect assumptions (Rafael Wysocki). - Add coupled cpuidle supprt to the Exynos3250 platform (Marek Szyprowski). - Add new sysfs file to make it easier to specify the image storage location during hibernation (Mario Limonciello). - Add sysfs files for collecting suspend-to-idle usage and time statistics for CPU idle states (Rafael Wysocki). - Update the pm-graph utilities (Todd Brandt). - Reduce the kernel log noise related to reporting Low-power Idle constraings by the ACPI system suspend code (Rafael Wysocki). - Make it easier to distinguish dedicated wakeup IRQs in the /proc/interrupts output (Tony Lindgren). - Add the frequency table validation in cpufreq to the core and drop it from a number of cpufreq drivers (Viresh Kumar). - Drop "cooling-{min|max}-level" for CPU nodes from a couple of DT bindings (Viresh Kumar). - Clean up the CPU online error code path in the cpufreq core (Viresh Kumar). - Fix assorted issues in the SCPI, CPPC, mediatek and tegra186 cpufreq drivers (Arnd Bergmann, Chunyu Hu, George Cherian, Viresh Kumar). - Drop memory allocation error messages from a few places in cpufreq and cpuildle drivers (Markus Elfring)" * tag 'pm-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (56 commits) ACPI / PM: Fix keyboard wakeup from suspend-to-idle on ASUS UX331UA cpufreq: CPPC: Use transition_delay_us depending transition_latency PM / hibernate: Change message when writing to /sys/power/resume PM / hibernate: Make passing hibernate offsets more friendly cpuidle: poll_state: Avoid invoking local_clock() too often PM: cpuidle/suspend: Add s2idle usage and time state attributes cpuidle: Enable coupled cpuidle support on Exynos3250 platform cpuidle: poll_state: Add time limit to poll_idle() cpufreq: tegra186: Don't validate the frequency table twice cpufreq: speedstep: Don't validate the frequency table twice cpufreq: sparc: Don't validate the frequency table twice cpufreq: sh: Don't validate the frequency table twice cpufreq: sfi: Don't validate the frequency table twice cpufreq: scpi: Don't validate the frequency table twice cpufreq: sc520: Don't validate the frequency table twice cpufreq: s3c24xx: Don't validate the frequency table twice cpufreq: qoirq: Don't validate the frequency table twice cpufreq: pxa: Don't validate the frequency table twice cpufreq: ppc_cbe: Don't validate the frequency table twice cpufreq: powernow: Don't validate the frequency table twice ...
2018-03-30cpufreq: CPPC: Use transition_delay_us depending transition_latencyGeorge Cherian1-0/+3
With commit e948bc8fbee0 (cpufreq: Cap the default transition delay value to 10 ms) the cpufreq was not honouring the delay passed via ACPI (PCCT). Due to which on ARM based platforms using CPPC the cpufreq governor tries to change the frequency of CPUs faster than expected. This leads to continuous error messages like the following. " ACPI CPPC: PCC check channel failed. Status=0 " Earlier (without above commit) the default transition delay was taken form the value passed from PCCT. Use the same value provided by PCCT to set the transition_delay_us. Fixes: e948bc8fbee0 (cpufreq: Cap the default transition delay value to 10 ms) Signed-off-by: George Cherian <george.cherian@cavium.com> Cc: 4.14+ <stable@vger.kernel.org> # 4.14+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-03-26cpufreq: remove cris specific driversArnd Bergmann3-187/+0
The cris architecture is getting removed, including the artpec3 and etraxfs SoCs, so these cpufreq drivers are now unused. Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>