summaryrefslogtreecommitdiffstats
path: root/kernel/power
AgeCommit message (Collapse)AuthorFilesLines
2021-06-11PM: sleep: remove trailing spaces and tabsZhen Lei2-7/+7
Run the following command to find and remove the trailing spaces and tabs: $ find kernel/power/ -type f | xargs sed -r -i 's/[ \t]+$//' Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-05-24PM: hibernate: fix spelling mistakesZhen Lei2-5/+5
Fix some spelling mistakes in comments: corresonds ==> corresponds alocated ==> allocated unitialized ==> uninitialized Deompression ==> Decompression Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-08PM: sleep: fix typos in commentsLu Jialin3-3/+3
Change "occured" to "occurred" in kernel/power/autosleep.c. Change "consiting" to "consisting" in kernel/power/snapshot.c. Change "avaiable" to "available" in kernel/power/swap.c. No functionality changed. Signed-off-by: Lu Jialin <lujialin4@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-23PM: EM: postpone creating the debugfs dir till fs_initcallLukasz Luba1-1/+1
The debugfs directory '/sys/kernel/debug/energy_model' is needed before the Energy Model registration can happen. With the recent change in debugfs subsystem it's not allowed to create this directory at early stage (core_initcall). Thus creating this directory would fail. Postpone the creation of the EM debug dir to later stage: fs_initcall. It should be safe since all clients: CPUFreq drivers, Devfreq drivers will be initialized in later stages. The custom debug log below prints the time of creation the EM debug dir at fs_initcall and successful registration of EMs at later stages. [ 1.505717] energy_model: creating rootdir [ 3.698307] cpu cpu0: EM: created perf domain [ 3.709022] cpu cpu1: EM: created perf domain Fixes: 56348560d495 ("debugfs: do not attempt to create a new file before the filesystem is initalized") Reported-by: Ionela Voinescu <ionela.voinescu@arm.com> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-15Merge branches 'powercap' and 'pm-misc'Rafael J. Wysocki1-8/+4
* powercap: powercap: intel_rapl: Use topology interface in rapl_init_domains() powercap: intel_rapl: Use topology interface in rapl_add_package() powercap/intel_rapl: add support for AlderLake Mobile powercap/drivers/dtpm: Fix size of object being allocated powercap/drivers/dtpm: Fix an IS_ERR() vs NULL check powercap/drivers/dtpm: Fix some missing unlock bugs powercap/drivers/dtpm: Fix a double shift bug powercap/drivers/dtpm: Fix __udivdi3 and __aeabi_uldivmod unresolved symbols powercap/drivers/dtpm: Add CPU energy model based support powercap/drivers/dtpm: Add API for dynamic thermal power management Documentation/powercap/dtpm: Add documentation for dtpm units: Add Watt units * pm-misc: PM: Kconfig: remove unneeded "default n" options PM: EM: update Kconfig description and drop "default n" option
2021-02-12PM: sleep: Constify static struct attribute_groupRikard Falkeborn1-1/+1
The only usage of suspend_attr_group is to put its address in an array of pointers to const attribute_group structs. Make it const to allow the compiler to put it into read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-12PM: Kconfig: remove unneeded "default n" optionsLukasz Luba1-3/+0
Remove "default n" options. If the "default" line is removed, it defaults to 'n'. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-02-12PM: EM: update Kconfig description and drop "default n" optionLukasz Luba1-5/+4
Energy Model supports now other devices like GPUs, DSPs, not only CPUs. Thus, update the description in the config option. Remove also unneeded "default n". If the "default" line is removed, it defaults to 'n'. Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-01-27PM: sleep: No need to check PF_WQ_WORKER in thaw_kernel_threads()Zqiang1-1/+1
Because PF_KTHREAD is set for all wq worker threads, it is not necessary to check PF_WQ_WORKER in addition to it in thaw_kernel_threads(), so stop doing that. Signed-off-by: Zqiang <qiang.zhang@windriver.com> [ rjw: Subject and changelog rewrite ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-01-25PM: hibernate: flush swap writer after markingLaurent Badel1-1/+1
Flush the swap writer after, not before, marking the files, to ensure the signature is properly written. Fixes: 6f612af57821 ("PM / Hibernate: Group swap ops") Signed-off-by: Laurent Badel <laurentbadel@eaton.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-12-15Merge tag 'pm-5.11-rc1' of ↵Linus Torvalds2-2/+26
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These update cpufreq (core and drivers), cpuidle (polling state implementation and the PSCI driver), the OPP (operating performance points) framework, devfreq (core and drivers), the power capping RAPL (Running Average Power Limit) driver, the Energy Model support, the generic power domains (genpd) framework, the ACPI device power management, the core system-wide suspend code and power management utilities. Specifics: - Use local_clock() instead of jiffies in the cpufreq statistics to improve accuracy (Viresh Kumar). - Fix up OPP usage in the cpufreq-dt and qcom-cpufreq-nvmem cpufreq drivers (Viresh Kumar). - Clean up the cpufreq core, the intel_pstate driver and the schedutil cpufreq governor (Rafael Wysocki). - Fix up error code paths in the sti-cpufreq and mediatek cpufreq drivers (Yangtao Li, Qinglang Miao). - Fix cpufreq_online() to return error codes instead of success (0) in all cases when it fails (Wang ShaoBo). - Add mt8167 support to the mediatek cpufreq driver and blacklist mt8516 in the cpufreq-dt-platdev driver (Fabien Parent). - Modify the tegra194 cpufreq driver to always return values from the frequency table as the current frequency and clean up that driver (Sumit Gupta, Jon Hunter). - Modify the arm_scmi cpufreq driver to allow it to discover the power scale present in the performance protocol and provide this information to the Energy Model (Lukasz Luba). - Add missing MODULE_DEVICE_TABLE to several cpufreq drivers (Pali Rohár). - Clean up the CPPC cpufreq driver (Ionela Voinescu). - Fix NVMEM_IMX_OCOTP dependency in the imx cpufreq driver (Arnd Bergmann). - Rework the poling interval selection for the polling state in cpuidle (Mel Gorman). - Enable suspend-to-idle for PSCI OSI mode in the PSCI cpuidle driver (Ulf Hansson). - Modify the OPP framework to support empty (node-less) OPP tables in DT for passing dependency information (Nicola Mazzucato). - Fix potential lockdep issue in the OPP core and clean up the OPP core (Viresh Kumar). - Modify dev_pm_opp_put_regulators() to accept a NULL argument and update its users accordingly (Viresh Kumar). - Add frequency changes tracepoint to devfreq (Matthias Kaehlcke). - Add support for governor feature flags to devfreq, make devfreq sysfs file permissions depend on the governor and clean up the devfreq core (Chanwoo Choi). - Clean up the tegra20 devfreq driver and deprecate it to allow another driver based on EMC_STAT to be used instead of it (Dmitry Osipenko). - Add interconnect support to the tegra30 devfreq driver, allow it to take the interconnect and OPP information from DT and clean it up (Dmitry Osipenko). - Add interconnect support to the exynos-bus devfreq driver along with interconnect properties documentation (Sylwester Nawrocki). - Add suport for AMD Fam17h and Fam19h processors to the RAPL power capping driver (Victor Ding, Kim Phillips). - Fix handling of overly long constraint names in the powercap framework (Lukasz Luba). - Fix the wakeup configuration handling for bridges in the ACPI device power management core (Rafael Wysocki). - Add support for using an abstract scale for power units in the Energy Model (EM) and document it (Lukasz Luba). - Add em_cpu_energy() micro-optimization to the EM (Pavankumar Kondeti). - Modify the generic power domains (genpd) framwework to support suspend-to-idle (Ulf Hansson). - Fix creation of debugfs nodes in genpd (Thierry Strudel). - Clean up genpd (Lina Iyer). - Clean up the core system-wide suspend code and make it print driver flags for devices with debug enabled (Alex Shi, Patrice Chotard, Chen Yu). - Modify the ACPI system reboot code to make it prepare for system power off to avoid confusing the platform firmware (Kai-Heng Feng). - Update the pm-graph (multiple changes, mostly usability-related) and cpupower (online and offline CPU information support) PM utilities (Todd Brandt, Brahadambal Srinivasan)" * tag 'pm-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (86 commits) cpufreq: Fix cpufreq_online() return value on errors cpufreq: Fix up several kerneldoc comments cpufreq: stats: Use local_clock() instead of jiffies cpufreq: schedutil: Simplify sugov_update_next_freq() cpufreq: intel_pstate: Simplify intel_cpufreq_update_pstate() PM: domains: create debugfs nodes when adding power domains opp: of: Allow empty opp-table with opp-shared dt-bindings: opp: Allow empty OPP tables media: venus: dev_pm_opp_put_*() accepts NULL argument drm/panfrost: dev_pm_opp_put_*() accepts NULL argument drm/lima: dev_pm_opp_put_*() accepts NULL argument PM / devfreq: exynos: dev_pm_opp_put_*() accepts NULL argument cpufreq: qcom-cpufreq-nvmem: dev_pm_opp_put_*() accepts NULL argument cpufreq: dt: dev_pm_opp_put_regulators() accepts NULL argument opp: Allow dev_pm_opp_put_*() APIs to accept NULL opp_table opp: Don't create an OPP table from dev_pm_opp_get_opp_table() cpufreq: dt: Don't (ab)use dev_pm_opp_get_opp_table() to create OPP table opp: Reduce the size of critical section in _opp_kref_release() PM / EM: Micro optimization in em_cpu_energy cpufreq: arm_scmi: Discover the power scale in performance protocol ...
2020-12-15kernel/power: allow hibernation with page_poison sanity checkingVlastimil Babka3-5/+13
Page poisoning used to be incompatible with hibernation, as the state of poisoned pages was lost after resume, thus enabling CONFIG_HIBERNATION forces CONFIG_PAGE_POISONING_NO_SANITY. For the same reason, the poisoning with zeroes variant CONFIG_PAGE_POISONING_ZERO used to disable hibernation. The latter restriction was removed by commit 1ad1410f632d ("PM / Hibernate: allow hibernation with PAGE_POISONING_ZERO") and similarly for init_on_free by commit 18451f9f9e58 ("PM: hibernate: fix crashes with init_on_free=1") by making sure free pages are cleared after resume. We can use the same mechanism to instead poison free pages with PAGE_POISON after resume. This covers both zero and 0xAA patterns. Thus we can remove the Kconfig restriction that disables page poison sanity checking when hibernation is enabled. Link: https://lkml.kernel.org/r/20201113104033.22907-4-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [hibernation] Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Laura Abbott <labbott@kernel.org> Cc: Mateusz Nosek <mateusznosek0@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15PM: hibernate: make direct map manipulations more explicitMike Rapoport1-2/+36
When DEBUG_PAGEALLOC or ARCH_HAS_SET_DIRECT_MAP is enabled a page may be not present in the direct map and has to be explicitly mapped before it could be copied. Introduce hibernate_map_page() and hibernation_unmap_page() that will explicitly use set_direct_map_{default,invalid}_noflush() for ARCH_HAS_SET_DIRECT_MAP case and debug_pagealloc_{map,unmap}_pages() for DEBUG_PAGEALLOC case. The remapping of the pages in safe_copy_page() presumes that it only changes protection bits in an existing PTE and so it is safe to ignore return value of set_direct_map_{default,invalid}_noflush(). Still, add a pr_warn() so that future changes in set_memory APIs will not silently break hibernation. Link: https://lkml.kernel.org/r/20201109192128.960-3-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Len Brown <len.brown@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15Merge branches 'pm-sleep', 'pm-acpi', 'pm-domains' and 'powercap'Rafael J. Wysocki1-0/+2
* pm-sleep: PM: sleep: Add dev_wakeup_path() helper PM / suspend: fix kernel-doc markup PM: sleep: Print driver flags for all devices during suspend/resume * pm-acpi: PM: ACPI: Refresh wakeup device power configuration every time PM: ACPI: PCI: Drop acpi_pm_set_bridge_wakeup() PM: ACPI: reboot: Use S5 for reboot * pm-domains: PM: domains: create debugfs nodes when adding power domains PM: domains: replace -ENOTSUPP with -EOPNOTSUPP * powercap: powercap: Adjust printing the constraint name with new line powercap: RAPL: Add AMD Fam19h RAPL support powercap: Add AMD Fam17h RAPL support powercap/intel_rapl_msr: Convert rapl_msr_priv into pointer x86/msr-index: sort AMD RAPL MSRs by address
2020-11-23PM / suspend: fix kernel-doc markupAlex Shi1-0/+2
Add parameter explanation to fix kernel-doc marks: kernel/power/suspend.c:233: warning: Function parameter or member 'state' not described in 'suspend_valid_only_mem' kernel/power/suspend.c:344: warning: Function parameter or member 'state' not described in 'suspend_prepare' Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> [ rjw: Change the proposed parameter descriptions. ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-11-10PM: EM: update the comments related to power scaleLukasz Luba1-1/+1
The Energy Model supports power values expressed in milli-Watts or in an 'abstract scale'. Update the related comments is the code to reflect that state. Reviewed-by: Quentin Perret <qperret@google.com> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-11-10PM: EM: Add a flag indicating units of power values in Energy ModelLukasz Luba1-1/+23
There are different platforms and devices which might use different scale for the power values. Kernel sub-systems might need to check if all Energy Model (EM) devices are using the same scale. Address that issue and store the information inside EM for each device. Thanks to that they can be easily compared and proper action triggered. Suggested-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Quentin Perret <qperret@google.com> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-10-27PM: sleep: fix typo in kernel/power/process.cJackie Zamow1-1/+1
Fix a typo in a comment in freeze_processes(). Signed-off-by: Jackie Zamow <jackie.zamow@gmail.com> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-10-16kernel/: fix repeated words in commentsRandy Dunlap1-1/+1
Fix multiple occurrences of duplicated words in kernel/. Fix one typo/spello on the same line as a duplicate word. Change one instance of "the the" to "that the". Otherwise just drop one of the repeated words. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/98202fa6-8919-ef63-9efe-c0fad5ca7af1@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-14Merge tag 'pm-5.10-rc1' of ↵Linus Torvalds2-11/+15
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These rework the collection of cpufreq statistics to allow it to take place if fast frequency switching is enabled in the governor, rework the frequency invariance handling in the cpufreq core and drivers, add new hardware support to a couple of cpufreq drivers, fix a number of assorted issues and clean up the code all over. Specifics: - Rework cpufreq statistics collection to allow it to take place when fast frequency switching is enabled in the governor (Viresh Kumar). - Make the cpufreq core set the frequency scale on behalf of the driver and update several cpufreq drivers accordingly (Ionela Voinescu, Valentin Schneider). - Add new hardware support to the STI and qcom cpufreq drivers and improve them (Alain Volmat, Manivannan Sadhasivam). - Fix multiple assorted issues in cpufreq drivers (Jon Hunter, Krzysztof Kozlowski, Matthias Kaehlcke, Pali Rohár, Stephan Gerhold, Viresh Kumar). - Fix several assorted issues in the operating performance points (OPP) framework (Stephan Gerhold, Viresh Kumar). - Allow devfreq drivers to fetch devfreq instances by DT enumeration instead of using explicit phandles and modify the devfreq core code to support driver-specific devfreq DT bindings (Leonard Crestez, Chanwoo Choi). - Improve initial hardware resetting in the tegra30 devfreq driver and clean up the tegra cpuidle driver (Dmitry Osipenko). - Update the cpuidle core to collect state entry rejection statistics and expose them via sysfs (Lina Iyer). - Improve the ACPI _CST code handling diagnostics (Chen Yu). - Update the PSCI cpuidle driver to allow the PM domain initialization to occur in the OSI mode as well as in the PC mode (Ulf Hansson). - Rework the generic power domains (genpd) core code to allow domain power off transition to be aborted in the absence of the "power off" domain callback (Ulf Hansson). - Fix two suspend-to-idle issues in the ACPI EC driver (Rafael Wysocki). - Fix the handling of timer_expires in the PM-runtime framework on 32-bit systems and the handling of device links in it (Grygorii Strashko, Xiang Chen). - Add IO requests batching support to the hibernate image saving and reading code and drop a bogus get_gendisk() from there (Xiaoyi Chen, Christoph Hellwig). - Allow PCIe ports to be put into the D3cold power state if they are power-manageable via ACPI (Lukas Wunner). - Add missing header file include to a power capping driver (Pujin Shi). - Clean up the qcom-cpr AVS driver a bit (Liu Shixin). - Kevin Hilman steps down as designated reviwer of adaptive voltage scaling (AVS) drivers (Kevin Hilman)" * tag 'pm-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (65 commits) cpufreq: stats: Fix string format specifier mismatch arm: disable frequency invariance for CONFIG_BL_SWITCHER cpufreq,arm,arm64: restructure definitions of arch_set_freq_scale() cpufreq: stats: Add memory barrier to store_reset() cpufreq: schedutil: Simplify sugov_fast_switch() ACPI: EC: PM: Drop ec_no_wakeup check from acpi_ec_dispatch_gpe() ACPI: EC: PM: Flush EC work unconditionally after wakeup PCI/ACPI: Whitelist hotplug ports for D3 if power managed by ACPI PM: hibernate: remove the bogus call to get_gendisk() in software_resume() cpufreq: Move traces and update to policy->cur to cpufreq core cpufreq: stats: Enable stats for fast-switch as well cpufreq: stats: Mark few conditionals with unlikely() cpufreq: stats: Remove locking cpufreq: stats: Defer stats update to cpufreq_stats_record_transition() PM: domains: Allow to abort power off when no ->power_off() callback PM: domains: Rename power state enums for genpd PM / devfreq: tegra30: Improve initial hardware resetting PM / devfreq: event: Change prototype of devfreq_event_get_edev_by_phandle function PM / devfreq: Change prototype of devfreq_get_devfreq_by_phandle function PM / devfreq: Add devfreq_get_devfreq_by_node function ...
2020-10-13Merge tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-blockLinus Torvalds2-29/+18
Pull block updates from Jens Axboe: - Series of merge handling cleanups (Baolin, Christoph) - Series of blk-throttle fixes and cleanups (Baolin) - Series cleaning up BDI, seperating the block device from the backing_dev_info (Christoph) - Removal of bdget() as a generic API (Christoph) - Removal of blkdev_get() as a generic API (Christoph) - Cleanup of is-partition checks (Christoph) - Series reworking disk revalidation (Christoph) - Series cleaning up bio flags (Christoph) - bio crypt fixes (Eric) - IO stats inflight tweak (Gabriel) - blk-mq tags fixes (Hannes) - Buffer invalidation fixes (Jan) - Allow soft limits for zone append (Johannes) - Shared tag set improvements (John, Kashyap) - Allow IOPRIO_CLASS_RT for CAP_SYS_NICE (Khazhismel) - DM no-wait support (Mike, Konstantin) - Request allocation improvements (Ming) - Allow md/dm/bcache to use IO stat helpers (Song) - Series improving blk-iocost (Tejun) - Various cleanups (Geert, Damien, Danny, Julia, Tetsuo, Tian, Wang, Xianting, Yang, Yufen, yangerkun) * tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (191 commits) block: fix uapi blkzoned.h comments blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue blk-mq: get rid of the dead flush handle code path block: get rid of unnecessary local variable block: fix comment and add lockdep assert blk-mq: use helper function to test hw stopped block: use helper function to test queue register block: remove redundant mq check block: invoke blk_mq_exit_sched no matter whether have .exit_sched percpu_ref: don't refer to ref->data if it isn't allocated block: ratelimit handle_bad_sector() message blk-throttle: Re-use the throtl_set_slice_end() blk-throttle: Open code __throtl_de/enqueue_tg() blk-throttle: Move service tree validation out of the throtl_rb_first() blk-throttle: Move the list operation after list validation blk-throttle: Fix IO hang for a corner case blk-throttle: Avoid tracking latency if low limit is invalid blk-throttle: Avoid getting the current time if tg->last_finish_time is 0 blk-throttle: Remove a meaningless parameter for throtl_downgrade_state() block: Remove redundant 'return' statement ...
2020-10-05PM: hibernate: remove the bogus call to get_gendisk() in software_resume()Christoph Hellwig1-11/+0
get_gendisk grabs a reference on the disk and file operation, so this code will leak both of them while having absolutely no use for the gendisk itself. This effectively reverts commit 2df83fa4bce421f ("PM / Hibernate: Use get_gendisk to verify partition if resume_file is integer format") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-28PM: hibernate: Batch hibernate and resume IO requestsXiaoyi Chen1-0/+15
Hibernate and resume process submits individual IO requests for each page of the data, so use blk_plug to improve the batching of these requests. Testing this change with hibernate and resumes consistently shows merging of the IO requests and more than an order of magnitude improvement in hibernate and resume speed is observed. One hibernate and resume cycle for 16GB RAM out of 32GB in use takes around 21 minutes before the change, and 1 minutes after the change on a system with limited storage IOPS. Signed-off-by: Xiaoyi Chen <cxiaoyi@amazon.com> Co-Developed-by: Anchal Agarwal <anchalag@amazon.com> Signed-off-by: Anchal Agarwal <anchalag@amazon.com> [ rjw: Subject and changelog edits, white space damage fixes ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-23PM: mm: cleanup swsusp_swap_checkChristoph Hellwig1-6/+4
Use blkdev_get_by_dev instead of bdget + blkdev_get. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-23mm: split swap_type_ofChristoph Hellwig2-21/+12
swap_type_of is used for two entirely different purposes: (1) check what swap type a given device/offset corresponds to (2) find the first available swap device that can be written to Mixing both in a single function creates an unreadable mess. Create two separate functions instead, and switch both to pass a dev_t instead of a struct block_device to further simplify the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-23PM: rewrite is_hibernate_resume_dev to not require an inodeChristoph Hellwig1-6/+6
Just check the dev_t to help simplifying the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01notifier: Fix broken error handling patternPeter Zijlstra5-45/+33
The current notifiers have the following error handling pattern all over the place: int err, nr; err = __foo_notifier_call_chain(&chain, val_up, v, -1, &nr); if (err & NOTIFIER_STOP_MASK) __foo_notifier_call_chain(&chain, val_down, v, nr-1, NULL) And aside from the endless repetition thereof, it is broken. Consider blocking notifiers; both calls take and drop the rwsem, this means that the notifier list can change in between the two calls, making @nr meaningless. Fix this by replacing all the __foo_notifier_call_chain() functions with foo_notifier_call_chain_robust() that embeds the above pattern, but ensures it is inside a single lock region. Note: I switched atomic_notifier_call_chain_robust() to use the spinlock, since RCU cannot provide the guarantee required for the recovery. Note: software_resume() error handling was broken afaict. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20200818135804.325626653@infradead.org
2020-08-23treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva2-3/+3
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-11Merge tag 'libnvdimm-for-5.9' of ↵Linus Torvalds1-0/+97
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updayes from Vishal Verma: "You'd normally receive this pull request from Dan Williams, but he's busy watching a newborn (Congrats Dan!), so I'm watching libnvdimm this cycle. This adds a new feature in libnvdimm - 'Runtime Firmware Activation', and a few small cleanups and fixes in libnvdimm and DAX. I'd originally intended to make separate topic-based pull requests - one for libnvdimm, and one for DAX, but some of the DAX material fell out since it wasn't quite ready. Summary: - add 'Runtime Firmware Activation' support for NVDIMMs that advertise the relevant capability - misc libnvdimm and DAX cleanups" * tag 'libnvdimm-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm/security: ensure sysfs poll thread woke up and fetch updated attr libnvdimm/security: the 'security' attr never show 'overwrite' state libnvdimm/security: fix a typo ACPI: NFIT: Fix ARS zero-sized allocation dax: Fix incorrect argument passed to xas_set_err() ACPI: NFIT: Add runtime firmware activate support PM, libnvdimm: Add runtime firmware activation support libnvdimm: Convert to DEVICE_ATTR_ADMIN_RO() drivers/dax: Expand lock scope to cover the use of addresses fs/dax: Remove unused size parameter dax: print error message by pr_info() in __generic_fsdax_supported() driver-core: Introduce DEVICE_ATTR_ADMIN_{RO,RW} tools/testing/nvdimm: Emulate firmware activation commands tools/testing/nvdimm: Prepare nfit_ctl_test() for ND_CMD_CALL emulation tools/testing/nvdimm: Add command debug messages tools/testing/nvdimm: Cleanup dimm index passing ACPI: NFIT: Define runtime firmware activation commands ACPI: NFIT: Move bus_dsm_mask out of generic nvdimm_bus_descriptor libnvdimm: Validate command family indices
2020-08-07mm: memcg: convert vmstat slab counters to bytesRoman Gushchin1-1/+1
In order to prepare for per-object slab memory accounting, convert NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE vmstat items to bytes. To make it obvious, rename them to NR_SLAB_RECLAIMABLE_B and NR_SLAB_UNRECLAIMABLE_B (similar to NR_KERNEL_STACK_KB). Internally global and per-node counters are stored in pages, however memcg and lruvec counters are stored in bytes. This scheme may look weird, but only for now. As soon as slab pages will be shared between multiple cgroups, global and node counters will reflect the total number of slab pages. However memcg and lruvec counters will be used for per-memcg slab memory tracking, which will take separate kernel objects in the account. Keeping global and node counters in pages helps to avoid additional overhead. The size of slab memory shouldn't exceed 4Gb on 32-bit machines, so it will fit into atomic_long_t we use for vmstats. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/20200623174037.3951353-4-guro@fb.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-03Merge branches 'pm-sleep', 'pm-domains', 'powercap' and 'pm-tools'Rafael J. Wysocki3-6/+6
* pm-sleep: PM: sleep: spread "const char *" correctness PM: hibernate: fix white space in a few places freezer: Add unsafe version of freezable_schedule_timeout_interruptible() for NFS PM: sleep: core: Emit changed uevent on wakeup_sysfs_add/remove * pm-domains: PM: domains: Restore comment indentation for generic_pm_domain.child_links PM: domains: Fix up terminology with parent/child * powercap: powercap: Add Power Limit4 support powercap: idle_inject: Replace play_idle() with play_idle_precise() in comments powercap: intel_rapl: add support for Sapphire Rapids * pm-tools: pm-graph v5.7 - important s2idle fixes cpupower: Replace HTTP links with HTTPS ones cpupower: Fix NULL but dereferenced coccicheck errors cpupower: Fix comparing pointer to 0 coccicheck warns
2020-07-28PM, libnvdimm: Add runtime firmware activation supportDan Williams1-0/+97
Abstract platform specific mechanics for nvdimm firmware activation behind a handful of generic ops. At the bus level ->activate_state() indicates the unified state (idle, busy, armed) of all DIMMs on the bus, and ->capability() indicates the system state expectations for activate. At the DIMM level ->activate_state() indicates the per-DIMM state, ->activate_result() indicates the outcome of the last activation attempt, and ->arm() attempts to transition the DIMM from 'idle' to 'armed'. A new hibernate_quiet_exec() facility is added to support firmware activation in an OS defined system quiesce state. It leverages the fact that the hibernate-freeze state wants to assert that a memory hibernation snapshot can be taken. This is in contrast to a platform firmware defined quiesce state that may forcefully quiet the memory controller independent of whether an individual device-driver properly supports hibernate-freeze. The libnvdimm sysfs interface is extended to support detection of a firmware activate capability. The mechanism supports enumeration and triggering of firmware activate, optionally in the hibernate_quiet_exec() context. [rafael: hibernate_quiet_exec() proposal] [vishal: fix up sparse warning, grammar in Documentation/] Cc: Pavel Machek <pavel@ucw.cz> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Reported-by: kernel test robot <lkp@intel.com> Co-developed-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Signed-off-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-07-14PM: sleep: spread "const char *" correctnessAlexey Dobriyan2-3/+3
Fixed string literals can be referred to as "const char *". Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> [ rjw: Minor subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-14PM: hibernate: fix white space in a few placesXiang Chen1-3/+3
In hibernate.c, some places lack of spaces while some places have redundant spaces. So fix them. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-06-24PM / EM: remove em_register_perf_domainLukasz Luba1-25/+0
Remove old function em_register_perf_domain which is no longer needed. There is em_dev_register_perf_domain that covers old use cases and new as well. Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Quentin Perret <qperret@google.com> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-06-24PM / EM: add support for other devices than CPUs in Energy ModelLukasz Luba1-77/+167
Add support for other devices than CPUs. The registration function does not require a valid cpumask pointer and is ready to handle new devices. Some of the internal structures has been reorganized in order to keep consistent view (like removing per_cpu pd pointers). Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-06-24PM / EM: update callback structure and add device pointerLukasz Luba1-4/+5
The Energy Model framework is going to support devices other that CPUs. In order to make this happen change the callback function and add pointer to a device as an argument. Update the related users to use new function and new callback from the Energy Model. Acked-by: Quentin Perret <qperret@google.com> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-06-24PM / EM: introduce em_dev_register_perf_domain functionLukasz Luba1-6/+34
Add now function in the Energy Model framework which is going to support new devices. This function will help in transition and make it smoother. For now it still checks if the cpumask is a valid pointer, which will be removed later when the new structures and infrastructure will be ready. Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Quentin Perret <qperret@google.com> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-06-24PM / EM: change naming convention from 'capacity' to 'performance'Lukasz Luba1-22/+22
The Energy Model uses concept of performance domain and capacity states in order to calculate power used by CPUs. Change naming convention from capacity to performance state would enable wider usage in future, e.g. upcoming support for other devices other than CPUs. Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Quentin Perret <qperret@google.com> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-06-14treewide: replace '---help---' in Kconfig files with 'help'Masahiro Yamada1-13/+13
Since commit 84af7a6194e4 ("checkpatch: kconfig: prefer 'help' over '---help---'"), the number of '---help---' has been gradually decreasing, but there are still more than 2400 instances. This commit finishes the conversion. While I touched the lines, I also fixed the indentation. There are a variety of indentation styles found. a) 4 spaces + '---help---' b) 7 spaces + '---help---' c) 8 spaces + '---help---' d) 1 space + 1 tab + '---help---' e) 1 tab + '---help---' (correct indentation) f) 1 tab + 1 space + '---help---' g) 1 tab + 2 spaces + '---help---' In order to convert all of them to 1 tab + 'help', I ran the following commend: $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/' Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-06-10Merge tag 'pm-5.8-rc1-2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These are operating performance points (OPP) framework updates mostly, including support for interconnect bandwidth in the OPP core, plus a few cpufreq changes, including boost support in the CPPC cpufreq driver, an ACPI device power management fix and a hibernation code cleanup. Specifics: - Add support for interconnect bandwidth to the OPP core (Georgi Djakov, Saravana Kannan, Sibi Sankar, Viresh Kumar). - Add support for regulator enable/disable to the OPP core (Kamil Konieczny). - Add boost support to the CPPC cpufreq driver (Xiongfeng Wang). - Make the tegra186 cpufreq driver set the CPUFREQ_NEED_INITIAL_FREQ_CHECK flag (Mian Yousaf Kaukab). - Prevent the ACPI power management from using power resources with devices where the list of power resources for power state D0 (full power) is missing (Rafael Wysocki). - Annotate a hibernation-related function with __init (Christophe JAILLET)" * tag 'pm-5.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: PM: Avoid using power resources if there are none for D0 cpufreq: CPPC: add SW BOOST support cpufreq: change '.set_boost' to act on one policy PM: hibernate: Add __init annotation to swsusp_header_init() opp: Don't parse icc paths unnecessarily opp: Remove bandwidth votes when target_freq is zero opp: core: add regulators enable and disable opp: Reorder the code for !target_freq case opp: Expose bandwidth information via debugfs cpufreq: dt: Add support for interconnect bandwidth scaling opp: Update the bandwidth on OPP frequency changes opp: Add sanity checks in _read_opp_key() opp: Add support for parsing interconnect bandwidth cpufreq: tegra186: add CPUFREQ_NEED_INITIAL_FREQ_CHECK flag OPP: Add helpers for reading the binding properties dt-bindings: opp: Introduce opp-peak-kBps and opp-avg-kBps bindings
2020-06-09mm: don't include asm/pgtable.h if linux/mm.h is already includedMike Rapoport1-1/+0
Patch series "mm: consolidate definitions of page table accessors", v2. The low level page table accessors (pXY_index(), pXY_offset()) are duplicated across all architectures and sometimes more than once. For instance, we have 31 definition of pgd_offset() for 25 supported architectures. Most of these definitions are actually identical and typically it boils down to, e.g. static inline unsigned long pmd_index(unsigned long address) { return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1); } static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address) { return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address); } These definitions can be shared among 90% of the arches provided XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined. For architectures that really need a custom version there is always possibility to override the generic version with the usual ifdefs magic. These patches introduce include/linux/pgtable.h that replaces include/asm-generic/pgtable.h and add the definitions of the page table accessors to the new header. This patch (of 12): The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the functions involving page table manipulations, e.g. pte_alloc() and pmd_alloc(). So, there is no point to explicitly include <asm/pgtable.h> in the files that include <linux/mm.h>. The include statements in such cases are remove with a simple loop: for f in $(git grep -l "include <linux/mm.h>") ; do sed -i -e '/include <asm\/pgtable.h>/ d' $f done Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-07Merge tag 'tty-5.8-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver updates from Greg KH: "Here is the tty and serial driver updates for 5.8-rc1 Nothing huge at all, just a lot of little serial driver fixes, updates for new devices and features, and other small things. Full details are in the shortlog. All of these have been in linux-next with no issues for a while" * tag 'tty-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (67 commits) tty: serial: qcom_geni_serial: Add 51.2MHz frequency support tty: serial: imx: clear Ageing Timer Interrupt in handler serial: 8250_fintek: Add F81966 Support sc16is7xx: Add flag to activate IrDA mode dt-bindings: sc16is7xx: Add flag to activate IrDA mode serial: 8250: Support rs485 bus termination GPIO serial: 8520_port: Fix function param documentation dt-bindings: serial: Add binding for rs485 bus termination GPIO vt: keyboard: avoid signed integer overflow in k_ascii serial: 8250: Enable 16550A variants by default on non-x86 tty: hvc_console, fix crashes on parallel open/close serial: imx: Initialize lock for non-registered console sc16is7xx: Read the LSR register for basic device presence check sc16is7xx: Allow sharing the IRQ line sc16is7xx: Use threaded IRQ sc16is7xx: Always use falling edge IRQ tty: n_gsm: Fix bogus i++ in gsm_data_kick tty: n_gsm: Remove unnecessary test in gsm_print_packet() serial: stm32: add no_console_suspend support tty: serial: fsl_lpuart: Use __maybe_unused instead of #if CONFIG_PM_SLEEP ...
2020-06-05PM: hibernate: Add __init annotation to swsusp_header_init()Christophe JAILLET1-1/+1
'swsusp_header_init()' is only called via 'core_initcall'. It can be marked as __init to save a few bytes of memory. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> [ rjw: Subject ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-27PM: hibernate: Restrict writes to the resume deviceDomenico Andreoli1-1/+13
Hibernation via snapshot device requires write permission to the swap block device, the one that more often (but not necessarily) is used to store the hibernation image. With this patch, such permissions are granted iff: 1) snapshot device config option is enabled 2) swap partition is used as resume device In other circumstances the swap device is not writable from userspace. In order to achieve this, every write attempt to a swap device is checked against the device configured as part of the uswsusp API [0] using a pointer to the inode struct in memory. If the swap device being written was not configured for resuming, the write request is denied. NOTE: this implementation works only for swap block devices, where the inode configured by swapon (which sets S_SWAPFILE) is the same used by SNAPSHOT_SET_SWAP_AREA. In case of swap file, SNAPSHOT_SET_SWAP_AREA indeed receives the inode of the block device containing the filesystem where the swap file is located (+ offset in it) which is never passed to swapon and then has not set S_SWAPFILE. As result, the swap file itself (as a file) has never an option to be written from userspace. Instead it remains writable if accessed directly from the containing block device, which is always writeable from root. [0] Documentation/power/userland-swsusp.rst v2: - rename is_hibernate_snapshot_dev() to is_hibernate_resume_dev() - fix description so to correctly refer to the resume device Signed-off-by: Domenico Andreoli <domenico.andreoli@linux.com> Acked-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-19PM: hibernate: Split off snapshot dev optionDomenico Andreoli2-1/+14
Make it possible to reduce the attack surface in case the snapshot device is not to be used from userspace. Signed-off-by: Domenico Andreoli <domenico.andreoli@linux.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-19PM: hibernate: Incorporate concurrency handlingDomenico Andreoli3-12/+22
Hibernation concurrency handling is currently delegated to user.c, where it's also used for regulating the access to the snapshot device. In the prospective of making user.c a separate configuration option, such mutual exclusion is brought into hibernate.c and made available through accessor helpers hereby introduced. Signed-off-by: Domenico Andreoli <domenico.andreoli@linux.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-15kernel/power: constify sysrq_key_opEmil Velikov1-1/+1
With earlier commits, the API no longer discards the const-ness of the sysrq_key_op. As such we can add the notation. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Cc: linux-kernel@vger.kernel.org Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <len.brown@intel.com> Cc: linux-pm@vger.kernel.org Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Link: https://lore.kernel.org/r/20200513214351.2138580-10-emil.l.velikov@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-27PM: hibernate: Freeze kernel threads in software_resume()Dexuan Cui1-0/+7
Currently the kernel threads are not frozen in software_resume(), so between dpm_suspend_start(PMSG_QUIESCE) and resume_target_kernel(), system_freezable_power_efficient_wq can still try to submit SCSI commands and this can cause a panic since the low level SCSI driver (e.g. hv_storvsc) has quiesced the SCSI adapter and can not accept any SCSI commands: https://lkml.org/lkml/2020/4/10/47 At first I posted a fix (https://lkml.org/lkml/2020/4/21/1318) trying to resolve the issue from hv_storvsc, but with the help of Bart Van Assche, I realized it's better to fix software_resume(), since this looks like a generic issue, not only pertaining to SCSI. Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-04-06PM / sleep: handle the compat case in snapshot_set_swap_area()Christoph Hellwig1-32/+22
Use in_compat_syscall to copy directly from the 32-bit ABI structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>