summaryrefslogtreecommitdiffstats
path: root/drivers/thermal
AgeCommit message (Collapse)AuthorFilesLines
2021-04-20thermal/drivers/intel: Introduce tcc cooling driverZhang Rui3-0/+141
On Intel processors, the core frequency can be reduced below OS request, when the current temperature reaches the TCC (Thermal Control Circuit) activation temperature. The default TCC activation temperature is specified by MSR_IA32_TEMPERATURE_TARGET. However, it can be adjusted by specifying an offset in degrees C, using the TCC Offset bits in the same MSR register. This patch introduces a cooling devices driver that utilizes the TCC Offset feature. The bigger the current cooling state is, the lower the effective TCC activation temperature is, so that the processors can be throttled earlier before system critical overheats. Note that, on different platforms, the behavior might be different on how fast the setting takes effect, and how much the CPU frequency is reduced. This patch has been tested on a KabyLake mobile platform from me, and also on a CometLake platform from Doug. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Tested by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210412125901.12549-1-rui.zhang@intel.com
2021-04-20thermal/drivers/bcm2835: Remove redundant dev_err call in ↵Ruiqi Gong1-1/+0
bcm2835_thermal_probe() There is a error message within devm_ioremap_resource already, so remove the dev_err call to avoid redundant error message. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ruiqi Gong <gongruiqi1@huawei.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210408100329.7585-1-gongruiqi1@huawei.com
2021-04-20thermal/drivers/thermal_mmio: Remove redundant dev_err call in ↵Ruiqi Gong1-4/+1
thermal_mmio_probe() There is a error message within devm_ioremap_resource already, so remove the dev_err call to avoid redundant error message. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ruiqi Gong <gongruiqi1@huawei.com> Acked-by: Talel Shenhar <talel@amazon.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210408100144.7494-1-gongruiqi1@huawei.com
2021-04-20thermal/drivers/qcom/tsens-v0_1: Add support for MDM9607Konrad Dybcio3-2/+101
MDM9607 TSENS IP is very similar to the one of MSM8916, with minor adjustments to various tuning values. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Acked-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210319220802.198215-2-konrad.dybcio@somainline.org
2021-04-15thermal/drivers/tsens: Fix missing put_device errorGuangqing Zhu1-2/+4
Fixes coccicheck error: drivers/thermal/qcom/tsens.c:759:4-10: ERROR: missing put_device; call of_find_device_by_node on line 715, but without a corresponding object release within this function. Fixes: a7ff82976122 ("drivers: thermal: tsens: Merge tsens-common.c into tsens.c") Signed-off-by: Guangqing Zhu <zhuguangqing83@gmail.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210404125431.12208-1-zhuguangqing83@gmail.com
2021-04-15thermal/drivers/qcom-spmi-temp-alarm: Add support for GEN2 rev 1 PMIC ↵David Collins1-30/+61
peripherals Add support for TEMP_ALARM GEN2 PMIC peripherals with digital major revision 1. This revision utilizes a different temperature threshold mapping than earlier revisions. Signed-off-by: David Collins <collinsd@codeaurora.org> Signed-off-by: Guru Das Srinagesh <gurus@codeaurora.org> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/69c90a004b3f5b7ae282f5ec5ca2920a48f23e02.1596040416.git.gurus@codeaurora.org
2021-04-15thermal/drivers/cpufreq_cooling: Fix slab OOB issuebrian-sy yang1-1/+1
Slab OOB issue is scanned by KASAN in cpu_power_to_freq(). If power is limited below the power of OPP0 in EM table, it will cause slab out-of-bound issue with negative array index. Return the lowest frequency if limited power cannot found a suitable OPP in EM table to fix this issue. Backtrace: [<ffffffd02d2a37f0>] die+0x104/0x5ac [<ffffffd02d2a5630>] bug_handler+0x64/0xd0 [<ffffffd02d288ce4>] brk_handler+0x160/0x258 [<ffffffd02d281e5c>] do_debug_exception+0x248/0x3f0 [<ffffffd02d284488>] el1_dbg+0x14/0xbc [<ffffffd02d75d1d4>] __kasan_report+0x1dc/0x1e0 [<ffffffd02d75c2e0>] kasan_report+0x10/0x20 [<ffffffd02d75def8>] __asan_report_load8_noabort+0x18/0x28 [<ffffffd02e6fce5c>] cpufreq_power2state+0x180/0x43c [<ffffffd02e6ead80>] power_actor_set_power+0x114/0x1d4 [<ffffffd02e6fac24>] allocate_power+0xaec/0xde0 [<ffffffd02e6f9f80>] power_allocator_throttle+0x3ec/0x5a4 [<ffffffd02e6ea888>] handle_thermal_trip+0x160/0x294 [<ffffffd02e6edd08>] thermal_zone_device_check+0xe4/0x154 [<ffffffd02d351cb4>] process_one_work+0x5e4/0xe28 [<ffffffd02d352f44>] worker_thread+0xa4c/0xfac [<ffffffd02d360124>] kthread+0x33c/0x358 [<ffffffd02d289940>] ret_from_fork+0xc/0x18 Fixes: 371a3bc79c11b ("thermal/drivers/cpufreq_cooling: Fix wrong frequency converted from power") Signed-off-by: brian-sy yang <brian-sy.yang@mediatek.com> Signed-off-by: Michael Kao <michael.kao@mediatek.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Cc: stable@vger.kernel.org #v5.7 Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201229050831.19493-1-michael.kao@mediatek.com
2021-04-15thermal/drivers/hisi: Use the correct HiSilicon copyrightHao Fang1-3/+3
s/Hisilicon/HiSilicon/g. It should use capital S, according to https://www.hisilicon.com/en/terms-of-use. Signed-off-by: Hao Fang <fanghao11@huawei.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1617086733-2705-1-git-send-email-fanghao11@huawei.com
2021-04-15thermal/drivers/cpuidle_cooling: Fix use after errorDaniel Lezcano1-3/+5
When the function successfully finishes it logs an information about the registration of the cooling device and use its name to build the message. Unfortunately it was freed right before: drivers/thermal/cpuidle_cooling.c:218 __cpuidle_cooling_register() warn: 'name' was already freed. Fix this by freeing after the message happened. Fixes: 6fd1b186d900 ("thermal/drivers/cpuidle_cooling: Use device name instead of auto-numbering") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20210319202522.891061-1-daniel.lezcano@linaro.org
2021-04-15thermal/drivers/devfreq_cooling: Fix wrong return on error pathDaniel Lezcano1-1/+1
The following error is reported by kbuild: smatch warnings: drivers/thermal/devfreq_cooling.c:433 of_devfreq_cooling_register_power() warn: passing zero to 'ERR_PTR' Fix the error code by the setting the 'err' variable instead of 'cdev'. Fixes: f8d354e821b2 ("thermal/drivers/devfreq_cooling: Use device name instead of auto-numbering") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210319202424.890968-1-daniel.lezcano@linaro.org
2021-04-15thermal/core: Fix memory leak in the error pathDaniel Lezcano1-0/+1
Fix the following error: smatch warnings: drivers/thermal/thermal_core.c:1020 __thermal_cooling_device_register() warn: possible memory leak of 'cdev' by freeing the cdev when exiting the function in the error path. Fixes: 584837618100 ("thermal/drivers/core: Use a char pointer for the cooling device name") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210319202257.890848-1-daniel.lezcano@linaro.org
2021-03-17thermal/drivers/qcom/tsens_v1: Enable sensor 3 on MSM8976Konrad Dybcio1-2/+2
The sensor *is* in fact used and does report temperature. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Acked-by: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210225213119.116550-1-konrad.dybcio@somainline.org
2021-03-17thermal/core: Add NULL pointer check before using cooling device statsManaf Meethalavalappu Pallikunhi1-0/+3
There is a possible chance that some cooling device stats buffer allocation fails due to very high cooling device max state value. Later cooling device update sysfs can try to access stats data for the same cooling device. It will lead to NULL pointer dereference issue. Add a NULL pointer check before accessing thermal cooling device stats data. It fixes the following bug [ 26.812833] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000004 [ 27.122960] Call trace: [ 27.122963] do_raw_spin_lock+0x18/0xe8 [ 27.122966] _raw_spin_lock+0x24/0x30 [ 27.128157] thermal_cooling_device_stats_update+0x24/0x98 [ 27.128162] cur_state_store+0x88/0xb8 [ 27.128166] dev_attr_store+0x40/0x58 [ 27.128169] sysfs_kf_write+0x50/0x68 [ 27.133358] kernfs_fop_write+0x12c/0x1c8 [ 27.133362] __vfs_write+0x54/0x160 [ 27.152297] vfs_write+0xcc/0x188 [ 27.157132] ksys_write+0x78/0x108 [ 27.162050] ksys_write+0xf8/0x108 [ 27.166968] __arm_smccc_hvc+0x158/0x4b0 [ 27.166973] __arm_smccc_hvc+0x9c/0x4b0 [ 27.186005] el0_svc+0x8/0xc Signed-off-by: Manaf Meethalavalappu Pallikunhi <manafm@codeaurora.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1607367181-24589-1-git-send-email-manafm@codeaurora.org
2021-03-16thermal/core/power_allocator: Using round the division when re-divvying up powerjeson.gao1-3/+5
The division is used directly in re-divvying up power, the decimal part will be discarded, devices will get less than the extra_actor_power - 1. if using round the division to make the calculation more accurate. For example: actor0 received more than its max_power, it has the extra_power 759 actor1 received less than its max_power, it require extra_actor_power 395 actor2 received less than its max_power, it require extra_actor_power 365 actor1 and actor2 require the total capped_extra_power 760 using division in re-divvying up power actor1 would actually get the extra_actor_power 394 actor2 would actually get the extra_actor_power 364 if using round the division in re-divvying up power actor1 would actually get the extra_actor_power 394 actor2 would actually get the extra_actor_power 365 Signed-off-by: Jeson Gao <jeson.gao@unisoc.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/1615796737-4688-1-git-send-email-gao.yunxiao6@gmail.com
2021-03-15thermal/drivers/cpufreq_cooling: Remove unused listDaniel Lezcano1-13/+0
There is a list with the purpose of grouping the cpufreq cooling device together as described in the comments but actually it is unused, the code evolved since 2012 and the list was no longer needed. Delete the remaining unused list related code. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20210314111333.16551-5-daniel.lezcano@linaro.org
2021-03-15thermal/drivers/cpuidle_cooling: Use device name instead of auto-numberingDaniel Lezcano1-17/+16
Currently the naming of a cooling device is just a cooling technique followed by a number. When there are multiple cooling devices using the same technique, it is impossible to clearly identify the related device as this one is just a number. For instance: thermal-idle-0 thermal-idle-1 thermal-idle-2 thermal-idle-3 etc ... The 'thermal' prefix is redundant with the subsystem namespace. This patch removes the 'thermal prefix and changes the number by the device name. So the naming above becomes: idle-cpu0 idle-cpu1 idle-cpu2 idle-cpu3 etc ... Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20210314111333.16551-4-daniel.lezcano@linaro.org
2021-03-15thermal/drivers/devfreq_cooling: Use device name instead of auto-numberingDaniel Lezcano1-17/+8
Currently the naming of a cooling device is just a cooling technique followed by a number. When there are multiple cooling devices using the same technique, it is impossible to clearly identify the related device as this one is just a number. For instance: thermal-devfreq-0 thermal-devfreq-1 etc ... The 'thermal' prefix is redundant with the subsystem namespace. This patch removes the 'thermal' prefix and changes the number by the device name. So the naming above becomes: devfreq-5000000.gpu devfreq-1d84000.ufshc etc ... Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20210314111333.16551-3-daniel.lezcano@linaro.org
2021-03-15thermal/drivers/cpufreq_cooling: Use device name instead of auto-numberingDaniel Lezcano1-22/+12
Currently the naming of a cooling device is just a cooling technique followed by a number. When there are multiple cooling devices using the same technique, it is impossible to clearly identify the related device as this one is just a number. For instance: thermal-cpufreq-0 thermal-cpufreq-1 etc ... The 'thermal' prefix is redundant with the subsystem namespace. This patch removes the 'thermal' prefix and changes the number by the device name. So the naming above becomes: cpufreq-cpu0 cpufreq-cpu4 etc ... Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20210314111333.16551-2-daniel.lezcano@linaro.org
2021-03-15thermal/drivers/core: Use a char pointer for the cooling device nameDaniel Lezcano1-16/+22
We want to have any kind of name for the cooling devices as we do no longer want to rely on auto-numbering. Let's replace the cooling device's fixed array by a char pointer to be allocated dynamically when registering the cooling device, so we don't limit the length of the name. Rework the error path at the same time as we have to rollback the allocations in case of error. Tested with a dummy device having the name: "Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch" A village on the island of Anglesey (Wales), known to have the longest name in Europe. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20210314111333.16551-1-daniel.lezcano@linaro.org
2021-03-10thermal: thermal_of: Fix error return code of thermal_of_populate_bind_params()Jia-Ju Bai1-2/+5
When kcalloc() returns NULL to __tcbp or of_count_phandle_with_args() returns zero or -ENOENT to count, no error return code of thermal_of_populate_bind_params() is assigned. To fix these bugs, ret is assigned with -ENOMEM and -ENOENT in these cases, respectively. Fixes: a92bab8919e3 ("of: thermal: Allow multiple devices to share cooling map") Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210310122423.3266-1-baijiaju1990@gmail.com
2021-03-10thermal:ti-soc-thermal: Remove duplicate include in ti-bandgapZhang Yunkai1-1/+0
'of_device.h' included in 'ti-bandgap.c' is duplicated. It is also included in the 25th line. Signed-off-by: Zhang Yunkai <zhang.yunkai@zte.com.cn> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210306123415.219441-1-zhang.yunkai@zte.com.cn
2021-03-10thermal: rcar_gen3_thermal: Add support for up to five TSC nodesNiklas Söderlund1-1/+2
Add support for up to five TSC nodes. The new THCODE values are taken from the example in the datasheet. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210309162419.2621359-1-niklas.soderlund+renesas@ragnatech.se
2021-03-10thermal: Fix couple of spellos in the file sun8i_thermal.cBhaskar Chowdhury1-2/+2
s/calibartion/calibration/ s/undocummented/undocumented/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210305014348.17412-1-unixbhaskar@gmail.com
2021-03-10thermal: Fix a typo in the file soctherm.cBhaskar Chowdhury1-1/+1
s/calibaration/calibration/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210305015320.7614-1-unixbhaskar@gmail.com
2021-03-10thermal: amlogic: Omit superfluous error message in amlogic_thermal_probe()Tang Bin1-3/+1
The function devm_platform_ioremap_resource has already contains error message, so remove the redundant dev_err here. Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210222061105.6008-1-tangbin@cmss.chinamobile.com
2021-02-22Merge tag 'thermal-v5.12-rc1' of ↵Linus Torvalds23-598/+752
git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux Pull thermal updates from Daniel Lezcano: - Use the newly introduced 'hot' and 'critical' ops for the acpi thermal driver (Daniel Lezcano) - Remove the notify ops as it is no longer used (Daniel Lezcano) - Remove the 'forced passive' option and the unused bind/unbind functions (Daniel Lezcano) - Remove the THERMAL_TRIPS_NONE and the code cleanup around this macro (Daniel Lezcano) - Rework the delays to make them pre-computed instead of computing them again and again at each polling interval (Daniel Lezcano) - Remove the pointless 'thermal_zone_device_reset' function (Daniel Lezcano) - Use the critical and hot ops to prevent an unexpected system shutdown on int340x (Kai-Heng Feng) - Make the cooling device state private to the thermal subsystem (Daniel Lezcano) - Prevent to use not-power-aware actor devices with the power allocator governor (Lukasz Luba) - Remove 'zx' and 'tango' support along with the corresponding platforms (Arnd Bergman) - Fix several issues on the Omap thermal driver (Tony Lindgren) - Add support for adc-tm5 PMIC thermal monitor for Qcom platforms (Dmitry Baryshkov) - Fix an initialization loop in the adc-tm5 (Colin Ian King) - Fix a return error check in the cpufreq cooling device (Viresh Kumar) * tag 'thermal-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (26 commits) thermal: cpufreq_cooling: freq_qos_update_request() returns < 0 on error thermal: qcom: Fix comparison with uninitialized variable channels_available thermal: qcom: add support for adc-tm5 PMIC thermal monitor dt-bindings: thermal: qcom: add adc-thermal monitor bindings thermal: ti-soc-thermal: Use non-inverted define for omap4 thermal: ti-soc-thermal: Simplify polling with iopoll thermal: ti-soc-thermal: Fix stuck sensor with continuous mode for 4430 thermal: ti-soc-thermal: Skip pointless register access for dra7 thermal/drivers/zx: Remove zx driver thermal/drivers/tango: Remove tango driver thermal: power allocator: fail binding for non-power actor devices thermal/core: Make cooling device state change private thermal: intel: pch: Fix unexpected shutdown at critical temperature thermal: int340x: Fix unexpected shutdown at critical temperature thermal/core: Remove pointless thermal_zone_device_reset() function thermal/core: Remove ms based delay fields thermal/core: Use precomputed jiffies for the polling thermal/core: Precompute the delays from msecs to jiffies thermal/core: Remove unused macro THERMAL_TRIPS_NONE thermal/core: Remove THERMAL_TRIPS_NONE test ...
2021-02-21Merge tag 'sched-core-2021-02-17' of ↵Linus Torvalds1-14/+55
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Core scheduler updates: - Add CONFIG_PREEMPT_DYNAMIC: this in its current form adds the preempt=none/voluntary/full boot options (default: full), to allow distros to build a PREEMPT kernel but fall back to close to PREEMPT_VOLUNTARY (or PREEMPT_NONE) runtime scheduling behavior via a boot time selection. There's also the /debug/sched_debug switch to do this runtime. This feature is implemented via runtime patching (a new variant of static calls). The scope of the runtime patching can be best reviewed by looking at the sched_dynamic_update() function in kernel/sched/core.c. ( Note that the dynamic none/voluntary mode isn't 100% identical, for example preempt-RCU is available in all cases, plus the preempt count is maintained in all models, which has runtime overhead even with the code patching. ) The PREEMPT_VOLUNTARY/PREEMPT_NONE models, used by the vast majority of distributions, are supposed to be unaffected. - Fix ignored rescheduling after rcu_eqs_enter(). This is a bug that was found via rcutorture triggering a hang. The bug is that rcu_idle_enter() may wake up a NOCB kthread, but this happens after the last generic need_resched() check. Some cpuidle drivers fix it by chance but many others don't. In true 2020 fashion the original bug fix has grown into a 5-patch scheduler/RCU fix series plus another 16 RCU patches to address the underlying issue of missed preemption events. These are the initial fixes that should fix current incarnations of the bug. - Clean up rbtree usage in the scheduler, by providing & using the following consistent set of rbtree APIs: partial-order; less() based: - rb_add(): add a new entry to the rbtree - rb_add_cached(): like rb_add(), but for a rb_root_cached total-order; cmp() based: - rb_find(): find an entry in an rbtree - rb_find_add(): find an entry, and add if not found - rb_find_first(): find the first (leftmost) matching entry - rb_next_match(): continue from rb_find_first() - rb_for_each(): iterate a sub-tree using the previous two - Improve the SMP/NUMA load-balancer: scan for an idle sibling in a single pass. This is a 4-commit series where each commit improves one aspect of the idle sibling scan logic. - Improve the cpufreq cooling driver by getting the effective CPU utilization metrics from the scheduler - Improve the fair scheduler's active load-balancing logic by reducing the number of active LB attempts & lengthen the load-balancing interval. This improves stress-ng mmapfork performance. - Fix CFS's estimated utilization (util_est) calculation bug that can result in too high utilization values Misc updates & fixes: - Fix the HRTICK reprogramming & optimization feature - Fix SCHED_SOFTIRQ raising race & warning in the CPU offlining code - Reduce dl_add_task_root_domain() overhead - Fix uprobes refcount bug - Process pending softirqs in flush_smp_call_function_from_idle() - Clean up task priority related defines, remove *USER_*PRIO and USER_PRIO() - Simplify the sched_init_numa() deduplication sort - Documentation updates - Fix EAS bug in update_misfit_status(), which degraded the quality of energy-balancing - Smaller cleanups" * tag 'sched-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) sched,x86: Allow !PREEMPT_DYNAMIC entry/kvm: Explicitly flush pending rcuog wakeup before last rescheduling point entry: Explicitly flush pending rcuog wakeup before last rescheduling point rcu/nocb: Trigger self-IPI on late deferred wake up before user resume rcu/nocb: Perform deferred wake up before last idle's need_resched() check rcu: Pull deferred rcuog wake up to rcu_eqs_enter() callers sched/features: Distinguish between NORMAL and DEADLINE hrtick sched/features: Fix hrtick reprogramming sched/deadline: Reduce rq lock contention in dl_add_task_root_domain() uprobes: (Re)add missing get_uprobe() in __find_uprobe() smp: Process pending softirqs in flush_smp_call_function_from_idle() sched: Harden PREEMPT_DYNAMIC static_call: Allow module use without exposing static_call_key sched: Add /debug/sched_preempt preempt/dynamic: Support dynamic preempt with preempt= boot option preempt/dynamic: Provide irqentry_exit_cond_resched() static call preempt/dynamic: Provide preempt_schedule[_notrace]() static calls preempt/dynamic: Provide cond_resched() and might_resched() static calls preempt: Introduce CONFIG_PREEMPT_DYNAMIC static_call: Provide DEFINE_STATIC_CALL_RET0() ...
2021-02-17thermal: cpufreq_cooling: freq_qos_update_request() returns < 0 on errorViresh Kumar1-1/+1
freq_qos_update_request() returns 1 if the effective constraint value has changed, 0 if the effective constraint value has not changed, or a negative error code on failures. The frequency constraints for CPUs can be set by different parts of the kernel. If the maximum frequency constraint set by other parts of the kernel are set at a lower value than the one corresponding to cooling state 0, then we will never be able to cool down the system as freq_qos_update_request() will keep on returning 0 and we will skip updating cpufreq_state and thermal pressure. Fix that by doing the updates even in the case where freq_qos_update_request() returns 0, as we have effectively set the constraint to a new value even if the consolidated value of the actual constraint is unchanged because of external factors. Cc: v5.7+ <stable@vger.kernel.org> # v5.7+ Reported-by: Thara Gopinath <thara.gopinath@linaro.org> Fixes: f12e4f66ab6a ("thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Thara Gopinath<thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/b2b7e84944937390256669df5a48ce5abba0c1ef.1613540713.git.viresh.kumar@linaro.org
2021-02-16thermal: qcom: Fix comparison with uninitialized variable channels_availableColin Ian King1-7/+7
Currently the check of chip->channels[i].channel is against an the uninitialized variable channels_available. I believe the variable channels_available needs to be fetched first by the call to adc_tm5_read before the channels check. Fix the issue swapping the order of the channels check loop with the call to adc_tm5_read. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: ca66dca5eda6 ("thermal: qcom: add support for adc-tm5 PMIC thermal monitor") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210216151626.162996-1-colin.king@canonical.com
2021-02-15thermal: qcom: add support for adc-tm5 PMIC thermal monitorDmitry Baryshkov3-0/+635
Add support for Thermal Monitoring part of PMIC5. This part is closely coupled with ADC, using it's channels directly. ADC-TM support generating interrupts on ADC value crossing low or high voltage bounds, which is used to support thermal trip points. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210205000118.493610-3-dmitry.baryshkov@linaro.org
2021-02-15thermal: ti-soc-thermal: Use non-inverted define for omap4Tony Lindgren2-4/+4
When we set bit 10 high we use continuous mode and not single mode. Let's correct this to avoid confusion. No functional changes here, the code does the right thing with bit 10. Cc: Adam Ford <aford173@gmail.com> Cc: Carl Philipp Klemm <philipp@uvos.xyz> Cc: Eduardo Valentin <edubezval@gmail.com> Cc: H. Nikolaus Schaller <hns@goldelico.com> Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com> Cc: Sebastian Reichel <sre@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Tested-by: Adam Ford <aford173@gmail.com> #logicpd-torpedo-37xx-devkit Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210205134534.49200-5-tony@atomide.com
2021-02-15thermal: ti-soc-thermal: Simplify polling with iopollTony Lindgren1-16/+14
We can use iopoll for checking the EOCZ (end of conversion) bit. And with this we now also want to handle the timeout errors properly. For omap3, we need about 1.2ms for the single mode sampling to wait for EOCZ down, so let's use 1.5ms timeout there. Waiting for sampling to start is faster and we can use 1ms timeout. Cc: Adam Ford <aford173@gmail.com> Cc: Carl Philipp Klemm <philipp@uvos.xyz> Cc: Eduardo Valentin <edubezval@gmail.com> Cc: H. Nikolaus Schaller <hns@goldelico.com> Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com> Cc: Sebastian Reichel <sre@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Tested-by: Adam Ford <aford173@gmail.com> #logicpd-torpedo-37xx-devkit Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210205134534.49200-4-tony@atomide.com
2021-02-15thermal: ti-soc-thermal: Fix stuck sensor with continuous mode for 4430Tony Lindgren3-4/+14
At least for 4430, trying to use the single conversion mode eventually hangs the thermal sensor. This can be quite easily seen with errors: thermal thermal_zone0: failed to read out thermal zone (-5) Also, trying to read the temperature shows a stuck value with: $ while true; do cat /sys/class/thermal/thermal_zone0/temp; done Where the temperature is not rising at all with the busy loop. Additionally, the EOCZ (end of conversion) bit is not rising on 4430 in single conversion mode while it works fine in continuous conversion mode. It is also possible that the hung temperature sensor can affect the thermal shutdown alert too. Let's fix the issue by adding TI_BANDGAP_FEATURE_CONT_MODE_ONLY flag and use it for 4430. Note that we also need to add udelay to for the EOCZ (end of conversion) bit polling as otherwise we have it time out too early on 4430. We'll be changing the loop to use iopoll in the following clean-up patch. Cc: Adam Ford <aford173@gmail.com> Cc: Carl Philipp Klemm <philipp@uvos.xyz> Cc: Eduardo Valentin <edubezval@gmail.com> Cc: H. Nikolaus Schaller <hns@goldelico.com> Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com> Cc: Sebastian Reichel <sre@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Tested-by: Adam Ford <aford173@gmail.com> #logicpd-torpedo-37xx-devkit Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210205134534.49200-3-tony@atomide.com
2021-02-15thermal: ti-soc-thermal: Skip pointless register access for dra7Tony Lindgren1-14/+15
On dra7, there is no Start of Conversion (SOC) register bit and we have an empty bgap_soc_mask in the configuration for the thermal driver. Let's not do pointless reads and writes with the empty mask. There's also no point waiting for End of Conversion bit (EOCZ) to go high on dra7. We only care about it going down, and are now mostly timing out waiting for EOCZ high while it has already gone down. When we add checking for the timeout errors in a later patch, waiting for EOCZ high would cause bogus time out errors. Cc: Adam Ford <aford173@gmail.com> Cc: Carl Philipp Klemm <philipp@uvos.xyz> Cc: Eduardo Valentin <edubezval@gmail.com> Cc: H. Nikolaus Schaller <hns@goldelico.com> Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com> Cc: Sebastian Reichel <sre@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Tested-by: Adam Ford <aford173@gmail.com> #logicpd-torpedo-37xx-devkit Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210205134534.49200-2-tony@atomide.com
2021-02-08thermal: Move therm_throt there from x86/mceBorislav Petkov5-1/+741
This functionality has nothing to do with MCE, move it to the thermal framework and untangle it from MCE. Requested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lkml.kernel.org/r/20210202121003.GD18075@zn.tnic
2021-02-03thermal/drivers/zx: Remove zx driverArnd Bergmann3-265/+0
The zte zx platform is getting removed, so this driver is no longer needed. Cc: Jun Nie <jun.nie@linaro.org> Cc: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210120162400.4115366-3-arnd@kernel.org
2021-02-03thermal/drivers/tango: Remove tango driverArnd Bergmann3-136/+0
The tango platform is getting removed, so the driver is no longer needed. Cc: Marc Gonzalez <marc.w.gonzalez@free.fr> Cc: Mans Rullgard <mans@mansr.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Mans Rullgard <mans@mansr.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210120162400.4115366-2-arnd@kernel.org
2021-01-19thermal: power allocator: fail binding for non-power actor devicesLukasz Luba1-1/+34
The thermal zone can have cooling devices which are missing power actor API. This could be due to missing Energy Model for devfreq or cpufreq cooling device. In this case it is safe to fail the binding rather than trying to workaround and control the temperature in such thermal zone. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210119114126.19480-1-lukasz.luba@arm.com
2021-01-19thermal/core: Make cooling device state change privateDaniel Lezcano2-1/+2
The change of the cooling device state should be used by the governor or at least by the core code, not by the drivers themselves. Remove the API usage and move the function declaration to the internal headers. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Zhang Rui <rui.zhang@intel.com> Link: https://lore.kernel.org/r/20210118173824.9970-1-daniel.lezcano@linaro.org
2021-01-19thermal: intel: pch: Fix unexpected shutdown at critical temperatureKai-Heng Feng1-0/+6
Like previous patch, the intel_pch_thermal device is not in ACPI ThermalZone namespace, so a critical trip doesn't mean shutdown. Override the default .critical callback to prevent surprising thermal shutdoown. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201221172345.36976-2-kai.heng.feng@canonical.com
2021-01-19thermal: int340x: Fix unexpected shutdown at critical temperatureKai-Heng Feng1-0/+6
We are seeing thermal shutdown on Intel based mobile workstations, the shutdown happens during the first trip handle in thermal_zone_device_register(): kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down However, we shouldn't do a thermal shutdown here, since 1) We may want to use a dedicated daemon, Intel's thermald in this case, to handle thermal shutdown. 2) For ACPI based system, _CRT doesn't mean shutdown unless it's inside ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical Temperature): "... If this object it present under a device, the device’s driver evaluates this object to determine the device’s critical cooling temperature trip point. This value may then be used by the device’s driver to program an internal device temperature sensor trip point." So a "critical trip" here merely means we should take a more aggressive cooling method. As int340x device isn't present under ACPI ThermalZone, override the default .critical callback to prevent surprising thermal shutdown. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201221172345.36976-1-kai.heng.feng@canonical.com
2021-01-19thermal/core: Remove pointless thermal_zone_device_reset() functionDaniel Lezcano1-7/+1
The function thermal_zone_device_reset() is called in the thermal_zone_device_register() which allocates and initialize the structure. The passive field is already zero-ed by the allocation, the function is useless. Call directly thermal_zone_device_init() instead and thermal_zone_device_reset(). Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201222181110.1231977-1-daniel.lezcano@linaro.org
2021-01-19thermal/core: Remove ms based delay fieldsDaniel Lezcano4-8/+8
The code does no longer use the ms unit based fields to set the delays as they are replaced by the jiffies. Remove them and replace their user to use the jiffies version instead. Cc: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Peter Kästle <peter@piie.net> Acked-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20201216220337.839878-3-daniel.lezcano@linaro.org
2021-01-19thermal/core: Use precomputed jiffies for the pollingDaniel Lezcano1-10/+5
The delays are also stored in jiffies based unit. Use them instead of the ms. Cc: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Link: https://lore.kernel.org/r/20201216220337.839878-2-daniel.lezcano@linaro.org
2021-01-19thermal/core: Precompute the delays from msecs to jiffiesDaniel Lezcano3-0/+11
The delays are stored in ms units and when the polling function is called this delay is converted into jiffies at each call. Instead of doing the conversion again and again, compute the jiffies at init time and use the value directly when setting the polling. Cc: Thara Gopinath <thara.gopinath@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Link: https://lore.kernel.org/r/20201216220337.839878-1-daniel.lezcano@linaro.org
2021-01-19thermal/core: Remove THERMAL_TRIPS_NONE testDaniel Lezcano1-1/+1
The last site calling the thermal_zone_bind_cooling_device() function with the THERMAL_TRIPS_NONE parameter was removed. We can get rid of this test as no user of this function is calling this function with this parameter. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Link: https://lore.kernel.org/r/20201214233811.485669-5-daniel.lezcano@linaro.org
2021-01-19thermal/core: Remove pointless test with the THERMAL_TRIPS_NONE macroDaniel Lezcano1-4/+1
The THERMAL_TRIPS_NONE is equal to -1, it is pointless to do a conversion in this function. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Link: https://lore.kernel.org/r/20201214233811.485669-3-daniel.lezcano@linaro.org
2021-01-19thermal/core: Remove unused functions rebind/unbind exceptionDaniel Lezcano2-41/+0
The functions thermal_zone_device_rebind_exception and thermal_zone_device_unbind_exception are not used from anywhere. Remove that code. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Link: https://lore.kernel.org/r/20201214233811.485669-2-daniel.lezcano@linaro.org
2021-01-19thermal/core: Remove the 'forced_passive' optionDaniel Lezcano2-91/+3
The code was reorganized in 2012 with the commit 0c01ebbfd3caf1. The main change is a loop on the trip points array and a unconditional call to the throttle() ops of the governors for each of them even if the trip temperature is not reached yet. With this change, the 'forced_passive' is no longer checked in the thermal_zone_device_update() function but in the step wise governor's throttle() callback. As the force_passive does no belong to the trip point array, the thermal_zone_device_update() can not compare with the specified passive temperature, thus does not detect the passive limit has been crossed. Consequently, throttle() is never called and the 'forced_passive' branch is unreached. In addition, the default processor cooling device is not automatically bound to the thermal zone if there is not passive trip point, thus the 'forced_passive' can not operate. If there is an active trip point, then the throttle function will be called to mitigate at this temperature and the 'forced_passive' will override the mitigation of the active trip point in this case but with the default cooling device bound to the thermal zone, so usually a fan, and that is not a passive cooling effect. Given the regression exists since more than 8 years, nobody complained and at the best of my knowledge there is no bug open in https://bugzilla.kernel.org, it is reasonable to say it is unused. Remove the 'forced_passive' related code. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org> Link: https://lore.kernel.org/r/20201214233811.485669-1-daniel.lezcano@linaro.org
2021-01-14thermal: cpufreq_cooling: Reuse sched_cpu_util() for SMP platformsViresh Kumar1-14/+55
Several parts of the kernel are already using the effective CPU utilization (as seen by the scheduler) to get the current load on the CPU, do the same here instead of depending on the idle time of the CPU, which isn't that accurate comparatively. This is also the right thing to do as it makes the cpufreq governor (schedutil) align better with the cpufreq_cooling driver, as the power requested by cpufreq_cooling governor will exactly match the next frequency requested by the schedutil governor since they are both using the same metric to calculate load. This was tested on ARM Hikey6220 platform with hackbench, sysbench and schbench. None of them showed any regression or significant improvements. Schbench is the most important ones out of these as it creates the scenario where the utilization numbers provide a better estimate of the future. Scenario 1: The CPUs were mostly idle in the previous polling window of the IPA governor as the tasks were sleeping and here are the details from traces (load is in %): Old: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000 total_load=203 load={{0x35,0x1,0x0,0x31,0x0,0x0,0x64,0x0}} dynamic_power=1339 New: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000 total_load=600 load={{0x60,0x46,0x45,0x45,0x48,0x3b,0x61,0x44}} dynamic_power=3960 Here, the "Old" line gives the load and requested_power (dynamic_power here) numbers calculated using the idle time based implementation, while "New" is based on the CPU utilization from scheduler. As can be clearly seen, the load and requested_power numbers are simply incorrect in the idle time based approach and the numbers collected from CPU's utilization are much closer to the reality. Scenario 2: The CPUs were busy in the previous polling window of the IPA governor: Old: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000 total_load=800 load={{0x64,0x64,0x64,0x64,0x64,0x64,0x64,0x64}} dynamic_power=5280 New: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000 total_load=708 load={{0x4d,0x5c,0x5c,0x5b,0x5c,0x5c,0x51,0x5b}} dynamic_power=4672 As can be seen, the idle time based load is 100% for all the CPUs as it took only the last window into account, but in reality the CPUs aren't that loaded as shown by the utilization numbers. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lkml.kernel.org/r/9c255c83d78d58451abc06848001faef94c87a12.1607400596.git.viresh.kumar@linaro.org