summaryrefslogtreecommitdiffstats
path: root/drivers/thermal/cpufreq_cooling.c
AgeCommit message (Collapse)AuthorFilesLines
2021-06-17thermal/cpufreq_cooling: Update offline CPUs per-cpu thermal_pressureLukasz Luba1-1/+1
The thermal pressure signal gives information to the scheduler about reduced CPU capacity due to thermal. It is based on a value stored in a per-cpu 'thermal_pressure' variable. The online CPUs will get the new value there, while the offline won't. Unfortunately, when the CPU is back online, the value read from per-cpu variable might be wrong (stale data). This might affect the scheduler decisions, since it sees the CPU capacity differently than what is actually available. Fix it by making sure that all online+offline CPUs would get the proper value in their per-cpu variable when thermal framework sets capping. Fixes: f12e4f66ab6a3 ("thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping") Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20210614191030.22241-1-lukasz.luba@arm.com
2021-04-15thermal/drivers/cpufreq_cooling: Fix slab OOB issuebrian-sy yang1-1/+1
Slab OOB issue is scanned by KASAN in cpu_power_to_freq(). If power is limited below the power of OPP0 in EM table, it will cause slab out-of-bound issue with negative array index. Return the lowest frequency if limited power cannot found a suitable OPP in EM table to fix this issue. Backtrace: [<ffffffd02d2a37f0>] die+0x104/0x5ac [<ffffffd02d2a5630>] bug_handler+0x64/0xd0 [<ffffffd02d288ce4>] brk_handler+0x160/0x258 [<ffffffd02d281e5c>] do_debug_exception+0x248/0x3f0 [<ffffffd02d284488>] el1_dbg+0x14/0xbc [<ffffffd02d75d1d4>] __kasan_report+0x1dc/0x1e0 [<ffffffd02d75c2e0>] kasan_report+0x10/0x20 [<ffffffd02d75def8>] __asan_report_load8_noabort+0x18/0x28 [<ffffffd02e6fce5c>] cpufreq_power2state+0x180/0x43c [<ffffffd02e6ead80>] power_actor_set_power+0x114/0x1d4 [<ffffffd02e6fac24>] allocate_power+0xaec/0xde0 [<ffffffd02e6f9f80>] power_allocator_throttle+0x3ec/0x5a4 [<ffffffd02e6ea888>] handle_thermal_trip+0x160/0x294 [<ffffffd02e6edd08>] thermal_zone_device_check+0xe4/0x154 [<ffffffd02d351cb4>] process_one_work+0x5e4/0xe28 [<ffffffd02d352f44>] worker_thread+0xa4c/0xfac [<ffffffd02d360124>] kthread+0x33c/0x358 [<ffffffd02d289940>] ret_from_fork+0xc/0x18 Fixes: 371a3bc79c11b ("thermal/drivers/cpufreq_cooling: Fix wrong frequency converted from power") Signed-off-by: brian-sy yang <brian-sy.yang@mediatek.com> Signed-off-by: Michael Kao <michael.kao@mediatek.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Cc: stable@vger.kernel.org #v5.7 Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201229050831.19493-1-michael.kao@mediatek.com
2021-03-15thermal/drivers/cpufreq_cooling: Remove unused listDaniel Lezcano1-13/+0
There is a list with the purpose of grouping the cpufreq cooling device together as described in the comments but actually it is unused, the code evolved since 2012 and the list was no longer needed. Delete the remaining unused list related code. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20210314111333.16551-5-daniel.lezcano@linaro.org
2021-03-15thermal/drivers/cpufreq_cooling: Use device name instead of auto-numberingDaniel Lezcano1-22/+12
Currently the naming of a cooling device is just a cooling technique followed by a number. When there are multiple cooling devices using the same technique, it is impossible to clearly identify the related device as this one is just a number. For instance: thermal-cpufreq-0 thermal-cpufreq-1 etc ... The 'thermal' prefix is redundant with the subsystem namespace. This patch removes the 'thermal' prefix and changes the number by the device name. So the naming above becomes: cpufreq-cpu0 cpufreq-cpu4 etc ... Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20210314111333.16551-2-daniel.lezcano@linaro.org
2021-02-22Merge tag 'thermal-v5.12-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux Pull thermal updates from Daniel Lezcano: - Use the newly introduced 'hot' and 'critical' ops for the acpi thermal driver (Daniel Lezcano) - Remove the notify ops as it is no longer used (Daniel Lezcano) - Remove the 'forced passive' option and the unused bind/unbind functions (Daniel Lezcano) - Remove the THERMAL_TRIPS_NONE and the code cleanup around this macro (Daniel Lezcano) - Rework the delays to make them pre-computed instead of computing them again and again at each polling interval (Daniel Lezcano) - Remove the pointless 'thermal_zone_device_reset' function (Daniel Lezcano) - Use the critical and hot ops to prevent an unexpected system shutdown on int340x (Kai-Heng Feng) - Make the cooling device state private to the thermal subsystem (Daniel Lezcano) - Prevent to use not-power-aware actor devices with the power allocator governor (Lukasz Luba) - Remove 'zx' and 'tango' support along with the corresponding platforms (Arnd Bergman) - Fix several issues on the Omap thermal driver (Tony Lindgren) - Add support for adc-tm5 PMIC thermal monitor for Qcom platforms (Dmitry Baryshkov) - Fix an initialization loop in the adc-tm5 (Colin Ian King) - Fix a return error check in the cpufreq cooling device (Viresh Kumar) * tag 'thermal-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (26 commits) thermal: cpufreq_cooling: freq_qos_update_request() returns < 0 on error thermal: qcom: Fix comparison with uninitialized variable channels_available thermal: qcom: add support for adc-tm5 PMIC thermal monitor dt-bindings: thermal: qcom: add adc-thermal monitor bindings thermal: ti-soc-thermal: Use non-inverted define for omap4 thermal: ti-soc-thermal: Simplify polling with iopoll thermal: ti-soc-thermal: Fix stuck sensor with continuous mode for 4430 thermal: ti-soc-thermal: Skip pointless register access for dra7 thermal/drivers/zx: Remove zx driver thermal/drivers/tango: Remove tango driver thermal: power allocator: fail binding for non-power actor devices thermal/core: Make cooling device state change private thermal: intel: pch: Fix unexpected shutdown at critical temperature thermal: int340x: Fix unexpected shutdown at critical temperature thermal/core: Remove pointless thermal_zone_device_reset() function thermal/core: Remove ms based delay fields thermal/core: Use precomputed jiffies for the polling thermal/core: Precompute the delays from msecs to jiffies thermal/core: Remove unused macro THERMAL_TRIPS_NONE thermal/core: Remove THERMAL_TRIPS_NONE test ...
2021-02-17thermal: cpufreq_cooling: freq_qos_update_request() returns < 0 on errorViresh Kumar1-1/+1
freq_qos_update_request() returns 1 if the effective constraint value has changed, 0 if the effective constraint value has not changed, or a negative error code on failures. The frequency constraints for CPUs can be set by different parts of the kernel. If the maximum frequency constraint set by other parts of the kernel are set at a lower value than the one corresponding to cooling state 0, then we will never be able to cool down the system as freq_qos_update_request() will keep on returning 0 and we will skip updating cpufreq_state and thermal pressure. Fix that by doing the updates even in the case where freq_qos_update_request() returns 0, as we have effectively set the constraint to a new value even if the consolidated value of the actual constraint is unchanged because of external factors. Cc: v5.7+ <stable@vger.kernel.org> # v5.7+ Reported-by: Thara Gopinath <thara.gopinath@linaro.org> Fixes: f12e4f66ab6a ("thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Thara Gopinath<thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/b2b7e84944937390256669df5a48ce5abba0c1ef.1613540713.git.viresh.kumar@linaro.org
2021-01-14thermal: cpufreq_cooling: Reuse sched_cpu_util() for SMP platformsViresh Kumar1-14/+55
Several parts of the kernel are already using the effective CPU utilization (as seen by the scheduler) to get the current load on the CPU, do the same here instead of depending on the idle time of the CPU, which isn't that accurate comparatively. This is also the right thing to do as it makes the cpufreq governor (schedutil) align better with the cpufreq_cooling driver, as the power requested by cpufreq_cooling governor will exactly match the next frequency requested by the schedutil governor since they are both using the same metric to calculate load. This was tested on ARM Hikey6220 platform with hackbench, sysbench and schbench. None of them showed any regression or significant improvements. Schbench is the most important ones out of these as it creates the scenario where the utilization numbers provide a better estimate of the future. Scenario 1: The CPUs were mostly idle in the previous polling window of the IPA governor as the tasks were sleeping and here are the details from traces (load is in %): Old: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000 total_load=203 load={{0x35,0x1,0x0,0x31,0x0,0x0,0x64,0x0}} dynamic_power=1339 New: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000 total_load=600 load={{0x60,0x46,0x45,0x45,0x48,0x3b,0x61,0x44}} dynamic_power=3960 Here, the "Old" line gives the load and requested_power (dynamic_power here) numbers calculated using the idle time based implementation, while "New" is based on the CPU utilization from scheduler. As can be clearly seen, the load and requested_power numbers are simply incorrect in the idle time based approach and the numbers collected from CPU's utilization are much closer to the reality. Scenario 2: The CPUs were busy in the previous polling window of the IPA governor: Old: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000 total_load=800 load={{0x64,0x64,0x64,0x64,0x64,0x64,0x64,0x64}} dynamic_power=5280 New: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000 total_load=708 load={{0x4d,0x5c,0x5c,0x5b,0x5c,0x5c,0x51,0x5b}} dynamic_power=4672 As can be seen, the idle time based load is 100% for all the CPUs as it took only the last window into account, but in reality the CPUs aren't that loaded as shown by the utilization numbers. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lkml.kernel.org/r/9c255c83d78d58451abc06848001faef94c87a12.1607400596.git.viresh.kumar@linaro.org
2020-11-12thermal/drivers/cpufreq_cooling: Update cpufreq_state only if state has changedZhuguangqing1-3/+1
If state has not changed successfully and we updated cpufreq_state, next time when the new state is equal to cpufreq_state (not changed successfully last time), we will return directly and miss a freq_qos_update_request() that should have been. Fixes: 5130802ddbb1 ("thermal: cpu_cooling: Switch to QoS requests for freq limits") Cc: v5.4+ <stable@vger.kernel.org> # v5.4+ Signed-off-by: Zhuguangqing <zhuguangqing@xiaomi.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201106092243.15574-1-zhuguangqing83@gmail.com
2020-10-12thermal: cooling: Remove unused variable *tzzhuguangqing1-7/+1
1. devfreq_cooling.c: The variable *tz is not used in devfreq_cooling_get_requested_power(), devfreq_cooling_state2power() and devfreq_cooling_power2state(). 2. cpufreq_cooling.c: After 84fe2cab48590, the variable *tz is not used anymore in cpufreq_get_requested_power(), cpufreq_state2power() and cpufreq_power2state(). Remove the variable *tz. Signed-off-by: zhuguangqing <zhuguangqing@xiaomi.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200914071101.13575-1-zhuguangqing83@gmail.com
2020-08-03Merge branches 'pm-em' and 'pm-core'Rafael J. Wysocki1-6/+6
* pm-em: OPP: refactor dev_pm_opp_of_register_em() and update related drivers Documentation: power: update Energy Model description PM / EM: change name of em_pd_energy to em_cpu_energy PM / EM: remove em_register_perf_domain PM / EM: add support for other devices than CPUs in Energy Model PM / EM: update callback structure and add device pointer PM / EM: introduce em_dev_register_perf_domain function PM / EM: change naming convention from 'capacity' to 'performance' * pm-core: mmc: jz4740: Use pm_ptr() macro PM: Make *_DEV_PM_OPS macros use __maybe_unused PM: core: introduce pm_ptr() macro
2020-06-29thermal/drivers/cpufreq_cooling: Fix wrong frequency converted from powerFinley Xiao1-3/+3
The function cpu_power_to_freq is used to find a frequency and set the cooling device to consume at most the power to be converted. For example, if the power to be converted is 80mW, and the em table is as follow. struct em_cap_state table[] = { /* KHz mW */ { 1008000, 36, 0 }, { 1200000, 49, 0 }, { 1296000, 59, 0 }, { 1416000, 72, 0 }, { 1512000, 86, 0 }, }; The target frequency should be 1416000KHz, not 1512000KHz. Fixes: 349d39dc5739 ("thermal: cpu_cooling: merge frequency and power tables") Cc: <stable@vger.kernel.org> # v4.13+ Signed-off-by: Finley Xiao <finley.xiao@rock-chips.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200619090825.32747-1-finley.xiao@rock-chips.com
2020-06-24PM / EM: change naming convention from 'capacity' to 'performance'Lukasz Luba1-6/+6
The Energy Model uses concept of performance domain and capacity states in order to calculate power used by CPUs. Change naming convention from capacity to performance state would enable wider usage in future, e.g. upcoming support for other devices other than CPUs. Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Quentin Perret <qperret@google.com> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-22thermal/drivers/cpufreq_cooling: Replace module.h with export.hAmit Kucheria1-1/+1
cpufreq_cooling cannot be modular, remove the unnecessary module.h include and replace with export.h to handle EXPORT_SYMBOL family of macros. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/7a439e41e91d8bc5ff99207f99723fcf04ca36eb.1589199124.git.amit.kucheria@linaro.org
2020-05-22thermal/drivers/cpufreq_cooling: Sort headers alphabeticallyAmit Kucheria1-4/+4
Sort headers to make it easier to read and find duplicate headers. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/4231f5dfe758b9bf716981be71cadf9642c83528.1589199124.git.amit.kucheria@linaro.org
2020-04-07Merge tag 'thermal-v5.7-rc1' of ↵Linus Torvalds1-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux Pull thermal updates from Daniel Lezcano: - Convert tsens configuration DT binding to yaml (Rajeshwari) - Add interrupt support on the rcar sensor (Niklas Söderlund) - Add a new Spreadtrum thermal driver (Baolin Wang) - Add thermal binding for the fsl scu board, a new API to retrieve the sensor id bound to the thermal zone and i.MX system controller sensor (Anson Huang)) - Remove warning log when a deferred probe is requested on Exynos (Marek Szyprowski) - Add the thermal monitoring unit support for imx8mm with its DT bindings (Anson Huang) - Rephrase the Kconfig text for clarity (Linus Walleij) - Use the gpio descriptor for the ti-soc-thermal (Linus Walleij) - Align msg structure to 4 bytes for i.MX SC, fix the Kconfig dependency, add the __may_be unused annotation for PM functions and the COMPILE_TEST option for imx8mm (Anson Huang) - Fix a dependency on regmap in Kconfig for qoriq (Yuantian Tang) - Add DT binding and support for the rcar gen3 r8a77961 and improve the error path on the rcar init function (Niklas Söderlund) - Cleanup and improvements for the tsens Qcom sensor (Amit Kucheria) - Improve code by removing lock and caching values in the rcar thermal sensor (Niklas Söderlund) - Cleanup in the qoriq drivers and add a call to imx_thermal_unregister_legacy_cooling in the removal function (Anson Huang) - Remove redundant 'maxItems' in tsens and sprd DT bindings (Rob Herring) - Change the thermal DT bindings by making the cooling-maps optional (Yuantian Tang) - Add Tiger Lake support (Sumeet Pawnikar) - Use scnprintf() for avoiding potential buffer overflow (Takashi Iwai) - Make pkg_temp_lock a raw_spinlock_t(Clark Williams) - Fix incorrect data types by changing them to signed on i.MX SC (Anson Huang) - Replace zero-length array with flexible-array member (Gustavo A. R. Silva) - Add support for i.MX8MP in the driver and in the DT bindings (Anson Huang) - Fix return value of the cpufreq_set_cur_state() function (Willy Wolff) - Remove abusing and scary WARN_ON in the cpufreq cooling device (Daniel Lezcano) - Fix build warning of incorrect argument type reported by sparse on imx8mm (Anson Huang) - Fix stub for the devfreq cooling device (Martin Blumenstingl) - Fix cpu idle cooling documentation (Sergey Vidishev) * tag 'thermal-v5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (52 commits) Documentation: cpu-idle-cooling: Fix diagram for 33% duty cycle thermal: devfreq_cooling: inline all stubs for CONFIG_DEVFREQ_THERMAL=n thermal: imx8mm: Fix build warning of incorrect argument type thermal/drivers/cpufreq_cooling: Remove abusing WARN_ON thermal/drivers/cpufreq_cooling: Fix return of cpufreq_set_cur_state thermal: imx8mm: Add i.MX8MP support dt-bindings: thermal: imx8mm-thermal: Add support for i.MX8MP thermal: qcom: tsens.h: Replace zero-length array with flexible-array member thermal: imx_sc_thermal: Fix incorrect data type thermal: int340x_thermal: Use scnprintf() for avoiding potential buffer overflow thermal: int340x: processor_thermal: Add Tiger Lake support thermal/x86_pkg_temp: Make pkg_temp_lock a raw_spinlock_t dt-bindings: thermal: make cooling-maps property optional dt-bindings: thermal: qcom-tsens: Remove redundant 'maxItems' dt-bindings: thermal: sprd: Remove redundant 'maxItems' thermal: imx: Calling imx_thermal_unregister_legacy_cooling() in .remove thermal: qoriq: Sort includes alphabetically thermal: qoriq: Use devm_add_action_or_reset() to handle all cleanups thermal: rcar_thermal: Remove lock in rcar_thermal_get_current_temp() thermal: rcar_thermal: Do not store ctemp in rcar_thermal_priv ...
2020-03-23thermal/drivers/cpufreq_cooling: Remove abusing WARN_ONDaniel Lezcano1-2/+2
The WARN_ON macros are used at the entry functions state2power() and set_cur_state(). state2power() is called with the max_state retrieved from get_max_state which returns cpufreq_cdev->max_level, then it check if max_state is > cpufreq_cdev->max_level. The test does not really makes sense but let's assume we want to make sure to catch an error if the code evolves. However the WARN_ON is overkill. set_cur_state() is also called from userspace if we write to the sysfs. It is easy to see a stack dumped by just writing to sysfs /sys/class/thermal/cooling_device0/cur_state a value greater than "max_level". A bit scary. Returing -EINVAL is enough. Remove these WARN_ON. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20200321193107.21590-1-daniel.lezcano@linaro.org
2020-03-23thermal/drivers/cpufreq_cooling: Fix return of cpufreq_set_cur_stateWilly Wolff1-2/+4
When setting the cooling device current state from userspace via sysfs, the operation fails by returning an -EINVAL. It appears the recent changes with the per-policy frequency QoS introduced a regression as reported by: https://lkml.org/lkml/2020/3/20/599 The function freq_qos_update_request returns 0 or 1 describing update effectiveness, and a negative error code on failure. However, cpufreq_set_cur_state returns 0 on success or an error code otherwise. Consider the QoS update as successful if the function does not return an error. Fixes: 3000ce3c52f8b ("cpufreq: Use per-policy frequency QoS") Signed-off-by: Willy Wolff <willy.mh.wolff.ml@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200321092740.7vvwfxsebcrznydh@macmini.local
2020-03-06thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency ↵Thara Gopinath1-2/+17
capping Thermal governors can request for a CPU's maximum supported frequency to be capped in case of an overheat event. This in turn means that the maximum capacity available for tasks to run on the particular CPU is reduced. Delta between the original maximum capacity and capped maximum capacity is known as thermal pressure. Enable cpufreq cooling device to update the thermal pressure in event of a capped maximum frequency. Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20200222005213.3873-9-thara.gopinath@linaro.org
2020-01-27thermal/drivers/cpu_cooling: Rename to cpufreq_coolingDaniel Lezcano1-0/+670
As we introduced the idle injection cooling device called cpuidle_cooling, let's be consistent and rename the cpu_cooling to cpufreq_cooling as this one mitigates with OPPs changes. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Link: https://lore.kernel.org/r/20191219225317.17158-3-daniel.lezcano@linaro.org